I was recently tasked with a small problem: how to get the time difference between two log entries.
Problem
Consider the following log, where we are interested in the number of seconds the demoengine process takes until it is up and running:
Two things which are important:
starting demoengine indicates that the process is starting.
demoengine is running indicates that the process has started to run.
The result should be displayed in the following format:
How to solve?
I want to solve this using bash and default tools which you usually have available on Linux. So first we are interested in the respective lines, as mentioned above. This can be easily achieved with grep.
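A minimal sketch of this first filtering step. The sample log lines are hypothetical (a syslog-like layout is assumed here); only the two marker strings come from the problem statement.

```shell
#!/usr/bin/env bash
# Hypothetical sample log; the layout of the real log is an assumption.
log='Jan 12 10:00:01 host demo[1]: starting demoengine
Jan 12 10:00:05 host demo[1]: some unrelated message
Jan 12 10:00:09 host demo[1]: demoengine is running
Jan 12 12:00:00 host demo[3]: starting demoengine'

# Keep only the lines that mark a start or a running state.
relevant=$(printf '%s\n' "$log" | grep -E 'starting demoengine|demoengine is running')
printf '%s\n' "$relevant"
```

The unrelated line is dropped, so three lines remain.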
Ultimately we only care about the time stamps, so another grep -o extracts them:
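As a sketch, assuming each line starts with a time stamp such as "Jan 12 10:00:01" (a hypothetical format; the regular expression has to be adapted to the real log):

```shell
#!/usr/bin/env bash
# Hypothetical, already filtered log lines.
relevant='Jan 12 10:00:01 host demo[1]: starting demoengine
Jan 12 10:00:09 host demo[1]: demoengine is running'

# -o prints only the matching part of each line, here the leading time stamp.
stamps=$(printf '%s\n' "$relevant" |
  grep -oE '^[A-Z][a-z]{2} +[0-9]{1,2} [0-9]{2}:[0-9]{2}:[0-9]{2}')
printf '%s\n' "$stamps"
```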
Now that we have the respective information, we can use awk to process the remaining log statements. One could probably write the whole thing in awk without using grep, but my solution still works fine. We have to differentiate two cases:
1. the app is started and running, i.e. both statements are present
2. the app is started but not yet running, i.e. only the starting statement is present, but not the running one (see also the last line of the example log)
For case 2, since the input to awk contains only the lines we need, we can simply say that when the total number of lines is odd, the last line cannot have a duration:
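The odd-line check could look roughly like this. The time stamps and the wording of the message are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical time stamps: one complete pair plus one unmatched start.
stamps='Jan 12 10:00:01
Jan 12 10:00:09
Jan 12 12:00:00'

summary=$(printf '%s\n' "$stamps" | awk '
  { last = $0 }   # remember the most recent line
  END {
    # An odd line count means the final start has no matching running line.
    if (NR % 2 == 1) print "started at " last ", not running yet"
  }')
printf '%s\n' "$summary"
```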
The END block is executed at the end, when all input is exhausted. awk processes its input line by line, using statements enclosed in { }. We care about each odd line and the line that follows it. Using $0, which holds the whole current line, we can show the starting and running times as follows:
Which would output
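One way to sketch this pairing step is to remember every odd line and print it together with the even line that follows. The time stamps and the "start:" / "running:" labels are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical time stamps: two complete start/running pairs.
stamps='Jan 12 10:00:01
Jan 12 10:00:09
Jan 12 11:30:00
Jan 12 11:31:15'

pairs=$(printf '%s\n' "$stamps" | awk '
  NR % 2 == 1 { start = $0; next }            # odd line: a start time
  { print "start: " start "  running: " $0 }  # even line: the matching running time
')
printf '%s\n' "$pairs"
# start: Jan 12 10:00:01  running: Jan 12 10:00:09
# start: Jan 12 11:30:00  running: Jan 12 11:31:15
```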
However, what we care about is the difference between the starting and running times, so we need to calculate a time difference. awk has time functions built in, but none of them is really suitable here. They work with timestamps, that is, seconds since Jan 01 1970, whereas we have something like MM DD HH:MM:SS, which first has to be converted into a format we can calculate with. We can use the date command, but how? Looking at the awk documentation, we find this:
system(cmd-line)
Execute the command cmd-line, and return the exit status. (This may not be available on non-POSIX systems.)
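A small sketch of why system() is awkward for this purpose: the command's output goes straight to standard output, and awk only gets the exit status back. The echo command is just a stand-in.

```shell
#!/usr/bin/env bash
out=$(awk 'BEGIN {
  rc = system("echo 42")   # the output of echo goes straight to stdout
  print "rc=" rc           # rc holds only the exit status, here 0
}')
printf '%s\n' "$out"
# 42
# rc=0
```

The value 42 never ends up in an awk variable, so it cannot be used in a calculation.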
I had a look, but I found it very hard to use for my purpose, so I looked further and found this Stack Overflow post, which relates to the awk documentation:
So I can use a bash function that handles the calculation of the time difference, assuming we pass both times:
The function transforms each date/time to seconds, subtracts startTime from endTime, and then converts the result to HH:MM:SS. The export -f statement is also important, so that the function is available to awk. We can then use the output of the function getStartupTime() using the pipe and getline, as mentioned above:
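A minimal sketch of such a function, not necessarily the original one. It assumes GNU date (for the -d option) and time stamps that date can parse, such as the hypothetical "Jan 12 10:00:01"; the local variable names are made up.

```shell
#!/usr/bin/env bash
# Prints the time between two stamps as HH:MM:SS.
# Assumes GNU date; -u avoids daylight saving jumps in the arithmetic.
getStartupTime() {
  local startSec endSec diff
  startSec=$(date -u -d "$1" +%s) || return 1
  endSec=$(date -u -d "$2" +%s) || return 1
  diff=$(( endSec - startSec ))
  printf '%02d:%02d:%02d\n' "$(( diff / 3600 ))" "$(( diff % 3600 / 60 ))" "$(( diff % 60 ))"
}
export -f getStartupTime   # exported so that child bash processes can call it

# Direct call in the current shell.
direct=$(getStartupTime "Jan 12 11:30:00" "Jan 12 11:31:15")
# Call from a child bash process, which only works because of export -f.
child=$(bash -c 'getStartupTime "Jan 12 10:00:01" "Jan 12 10:00:09"')
printf '%s %s\n' "$direct" "$child"
```

One caveat: awk runs piped commands through /bin/sh, and whether an exported bash function survives that step seems to depend on which shell /bin/sh is, so this is worth checking on the target system.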
Final solution
Now that we have the bits and pieces, we have to bring this all together in a script get-startup-time.sh as follows:
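The whole pipeline can be sketched roughly as follows. This is a variant, not the original script: instead of calling an exported bash function, it pipes GNU date directly into getline from inside awk. The sample log, the output format and the helper name epoch are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sample log written to a temporary file.
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
Jan 12 10:00:01 host demo[1]: starting demoengine
Jan 12 10:00:09 host demo[1]: demoengine is running
Jan 12 11:30:00 host demo[2]: starting demoengine
Jan 12 11:31:15 host demo[2]: demoengine is running
Jan 12 12:00:00 host demo[3]: starting demoengine
EOF

report=$(grep -E 'starting demoengine|demoengine is running' "$logfile" |
  grep -oE '^[A-Z][a-z]{2} +[0-9]{1,2} [0-9]{2}:[0-9]{2}:[0-9]{2}' |
  awk '
    # Convert a time stamp to epoch seconds by piping GNU date into getline.
    function epoch(stamp,    cmd, sec) {
      cmd = "date -u -d \"" stamp "\" +%s"
      cmd | getline sec
      close(cmd)
      return sec
    }
    NR % 2 == 1 { start = $0; next }   # odd line: remember the start time
    {
      d = epoch($0) - epoch(start)     # even line: duration in seconds
      printf "%s -> %s  %02d:%02d:%02d\n", start, $0, int(d / 3600), int(d % 3600 / 60), d % 60
    }
    END { if (NR % 2 == 1) print start " -> not running yet" }
  ')
rm -f "$logfile"
printf '%s\n' "$report"
```

With the sample log above, this prints one line per completed start and a final line for the start that has no running entry yet.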
Running the script against our example log file will output what we initially expected: