PP-838 Support for logging via Syslog in PBS


#1

I have written down use cases for the logging via syslog functionality present in PBS. We are reverse engineering the EDD for logging via syslog.
Please review the use cases here: Logging via syslog

Please review the design here: Logging via syslog

Please review the Test Scenarios here: Logging via syslog


#2

Hey Tejas,
Thanks for writing this up. Syslog tends to be one of the more forgotten pieces of functionality in PBS. Spending some time on it is a good thing.

My comments:
Use case 1: You mention logging via all daemons and then include objects like jobs and reservations. The objects are not daemons. What did you mean by referring to daemons and objects together like that?
Use case 2: PBSEVENT_DEBUG/PBSEVENT_DEBUG2 are not syslog log levels (they’re PBS log levels). Syslog has its own log levels. Some examples are LOG_DEBUG or LOG_WARNING.
Use case 4: I don’t see how the user will be setting up syslog in a failover environment. I think this is something only the admin does.
Use case 6: This seems a little weird. Part of this is use case 5 (turning on both local and syslog together). The other part talks about running tracejob. How do these meet together? From reading it as it is, I think you’re saying you want to be able to run tracejob on either a local log or syslog. If you are truly just reverse engineering the syslog functionality and not adding anything new, then you currently can’t run tracejob on syslog logs. This might be a good enhancement to tracejob, but it can’t do it now.
Use case 7: I’m not sure we need this use case. This sounds like something completely in syslog’s corner. We don’t get into the business of how long logs are kept around.
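
To illustrate the point in use case 2: the syslog levels are the standard severities from syslog(3), which Python's stdlib `syslog` module also exposes (Unix only). Below is a minimal sketch, not PBS or PTL code; the helper name `passes_mask` is made up for illustration. It shows that a lower number means a more severe message and how a LOG_UPTO mask would filter out LOG_DEBUG messages.

```python
import syslog

# syslog severities are small integers; a lower value means a more severe message.
levels = {
    "LOG_WARNING": syslog.LOG_WARNING,
    "LOG_DEBUG": syslog.LOG_DEBUG,
}


def passes_mask(priority, upto):
    """Return True if a message at `priority` survives a LOG_UPTO(upto) mask.

    Hypothetical helper for illustration only.
    """
    return bool(syslog.LOG_MASK(priority) & syslog.LOG_UPTO(upto))


if __name__ == "__main__":
    for name, value in levels.items():
        kept = passes_mask(value, syslog.LOG_WARNING)
        print(name, value, "kept" if kept else "filtered")
```

None of this maps PBSEVENT_* values to syslog severities; how PBS does that mapping would have to come from the design itself.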


#3

Hi Bhroam,
Thanks a lot for the review.
I have edited use case 1 to remove the combined reference to PBS objects and daemons, as it was confusing.
I have also updated use cases 2 and 4 according to the comments.
For 6, I added it because tracejob support would be a nice enhancement to have in the future. I have now added a note saying this functionality is currently not supported. This note should be documented so that the community knows that tracejob does not work with only syslog enabled.
I have removed use case 7.


#4

Hey Tejas,
Thanks for making the changes. I am happy with the use cases now. They cover the PBS syslog functionality.

Bhroam


#5

I have edited the first comment on this topic to have the link to design.
Here is the link
Please review design here: logging via syslog


#6
  1. Check if the default datetime format logged in the syslog file is the same on every Linux distribution.
    Check if we can read the datetime format from the syslogconf file and use that when reading logs.

  2. If we can directly read the format from syslogconf file, the design should change to include using the datetime format according to what is set in the config file.
    Check if by default all Linux distributions store the syslog file in the same place under the same name.
    Check if we can read the syslog file path from the syslogconf file.

  3. If we can directly read from syslogconf file, the design should change to include using the date format according to what is set in the config file.
    Research if we can use https://pypi.python.org/pypi/syslog-rfc5424-parser to read the syslog file. This parser uses the pyparsing module

  4. I have to verify if this parser can make it easier for us to read from syslog file

  5. Check for other Open source syslog parsers

  6. Analyze if changes are needed in pbs_loganalyzer file in PTL.

  7. Describe the method for reading compressed files in more detail: design whether to decompress first or read directly from the .bz, .gz file

  8. Expand the design to include multiple combinations of PBS_SYSLOG and PBS_LOCALLOG settings

  9. Expand the design to include an attribute to log_match() for specifying whether to read from syslog / local_logs
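
For the compressed-file and datetime items above, here is a rough sketch of what reading directly from a compressed file could look like in Python. It is only an illustration under assumptions: the function names `open_log` and `parse_traditional_stamp` are hypothetical, and the timestamp parser assumes the traditional `Mmm dd HH:MM:SS` prefix, which is exactly the thing that still needs to be verified per distribution. The traditional stamp carries no year, so the caller has to supply one, and `%b` depends on the locale.

```python
import bz2
import gzip
from datetime import datetime


def open_log(path):
    """Open a plain, .gz, or .bz/.bz2 log file in text mode.

    The compressed variants are read directly, without decompressing to disk.
    """
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    if path.endswith((".bz2", ".bz")):
        return bz2.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def parse_traditional_stamp(line, year):
    """Parse the leading 'Mmm dd HH:MM:SS' stamp of a syslog line.

    Assumes the traditional format (first 15 characters of the line).
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
```

If the format turns out to be configurable, the format string would have to come from the syslog configuration instead of being hard-coded like this.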


#7

Hi Tejas,

Thanks for making updates on the EDD at https://pbspro.atlassian.net/wiki/spaces/PD/pages/52789715/PP-838+Support+for+logging+via+Syslog+in+PBS.

I have the following comments:

  1. Please refer to https://pbspro.atlassian.net/wiki/spaces/DG/pages/293076995/PBS+Pro+Design+Document+Guidelines. Some sections mentioned in the guidelines, like “Overview” and “Glossary”, would be good to add here.
  2. Objective: If the scope of work now is only to address the design for syslog support in PTL, we should mention that clearly. Are we addressing other PBS enhancements related to syslog now?
  3. Design: 1) Is syslog_match a new function? Where is it used, and by whom? Please be clear.
  4. Typo: “Things to be taken intoo consideration”
  5. “Wiki page for syslog. This can be referred to for understanding basics of syslog”. Could this be placed in the “Glossary” section, maybe?
  6. log_config_values(): In the table, be consistent with “True/False” or “0 and 1”. The function above the title seems to be setting the syslog attribute to “True/False”.
  7. log_config_values(): What error will be thrown? Would it not be better to stay silent and do nothing?
  8. Should the name of the function be something like “get_log_config_values”?
  9. I feel it will be useful if you say at what point this function is called and updated with the syslog config information.
  10. I could not quite understand the return value mentioned in the table. Should it not be the return value (1, 2 or 3)?
  11. The font styles (bold and italics) are not consistent.
  12. log_syslog_lines(): Again, should the name of the function start with “get”?
  13. log_syslog_lines(): Item 1 in Description. Should it be “get the type of syslog utility”?
  14. log_syslog_lines(): “list of priorites[]”. Should it have underscores?
  15. log_syslog_lines(): “list_of_syslog_files[]” is the list of paths to the files, right?
  16. log_syslog_lines(): “if there is no list of files returned, throw PTL error) get_rsyslog_files(), get_syslogng_priorites()”. Here, should it be get_syslogng_files(), not get_syslogng_priorites()?
  17. get_rsyslog_priorites: Input: Should it be severity and “facility” (priority is mentioned)?
  18. get_syslogng_priorites: Same as above.
  19. Please explicitly mention that the functions in bold are returning these values.
  20. If we are not supporting syslogng now, please mention it somewhere.
  21. get_syslogng_priorites: No description of the function. Was it consciously left out? Are we using syslog-specific methods inside?
  22. get_syslog_lines(): The description mentions a parameter “name”, but the prototype of the function does not have it.
  23. get_syslog_lines(): Summary should be “… last n blocks of lines”. The n is missing here.
  24. Do we have an upper limit on the number of lines? Will there be any performance hit for huge numbers, as there seems to be sorting and combining of text? You may want to unit test this with a huge number of lines and check for performance hits.
  25. log_match(): The heading “Execution steps” is not consistent with other sections. The Input and Return titles are not added explicitly.
  26. log_match(): “If syslog fails there is no need to check in local lo”. Please rephrase this. I guess you are trying to say that if the match is found in syslog, there is no need to check the local logs. Correct?
  27. Test Scenarios: You may want to check the performance when matching log messages from a huge file.
  28. “Changes to existing methods are made in the following files:-”. Please also add an explicit heading for the new functions, for better consistency and clarity.
  29. convert_date_time: fmt=syslog_format. What is this format and how do you get it?
  30. Test Scenarios: “Test that logging via sys logging can be set for only a particular daemon.” How can we do this? In pbs.conf?
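
On the performance concern in comment 24, here is one possible way to keep the "last n lines" step cheap when several files are combined. This is only a sketch, not what the EDD's get_syslog_lines() does: the function name `last_n_merged` and its parameters are hypothetical, and it assumes each input stream is already sorted by the same key (for example, by timestamp). It merges lazily and keeps at most n lines in memory instead of sorting everything.

```python
import heapq
from collections import deque


def last_n_merged(streams, n, key):
    """Merge already-sorted line streams and return only the last n lines.

    `streams` is a list of iterables, each sorted by `key`.
    heapq.merge walks them lazily; deque(maxlen=n) discards older lines,
    so memory use stays bounded by n rather than by total log size.
    """
    merged = heapq.merge(*streams, key=key)
    return list(deque(merged, maxlen=n))
```

Every line is still read once, so a test with a very large file would still be worthwhile to see how the read time itself behaves.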