Commit 072196e5 authored by Thomas Cadeau, committed by Danny Auble

Sometimes, generally with several jobs on the same node or calling many sstat...

Sometimes, generally with several jobs on the same node or when sstat is called many times for the job, the pipe is not ready to be read.
In this case, the function reading the pipe returns an error and the consumed energy values are set to NO_VAL.
From this point on, the values are never read again because the process "knows" there is no value to read.
Thus, after a single error, NO_VAL is saved in the database and no consumed energy information is stored.

To avoid this, we wrote the attached patch.

For the first read of the pipe, if the pipe doesn't exist, the function retries "NBFIRSTREAD = 3" times with a waiting time of 1 second.
Then, during the job run and for the final read, if the pipe doesn't exist, the values are not updated.
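
The behaviour described above could be sketched roughly as follows. This is an illustration only, not the code of the patch: apart from NBFIRSTREAD, every name here (update_energy, energy_read_fn, fake_pipe, fake_read) is hypothetical, the wait is passed as a parameter so the sketch can be exercised without sleeping, and it assumes there is no wait after the last failed attempt.

```c
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

#define NBFIRSTREAD 3   /* retry count named in the commit message */

/* Hypothetical reader callback: returns true and fills *joules on success. */
typedef bool (*energy_read_fn)(void *ctx, uint64_t *joules);

/*
 * Update *consumed from the pipe.
 * First read: try up to NBFIRSTREAD times, waiting wait_s seconds between tries.
 * Later reads: try once.
 * On failure *consumed is left untouched, so the previous value survives
 * instead of being overwritten with an error marker.
 */
static bool update_energy(uint64_t *consumed, bool first_read,
                          unsigned int wait_s,
                          energy_read_fn read_pipe, void *ctx)
{
    int attempts = first_read ? NBFIRSTREAD : 1;
    uint64_t fresh;

    for (int i = 0; i < attempts; i++) {
        if (read_pipe(ctx, &fresh)) {
            *consumed = fresh;
            return true;
        }
        if (i + 1 < attempts)
            sleep(wait_s);   /* assumption: no wait after the last try */
    }
    return false;            /* keep the last known value */
}

/* Hypothetical stand-in for the pipe: fails a given number of times, then succeeds. */
struct fake_pipe {
    int failures_left;
    uint64_t value;
};

static bool fake_read(void *ctx, uint64_t *joules)
{
    struct fake_pipe *p = ctx;

    if (p->failures_left > 0) {
        p->failures_left--;
        return false;
    }
    *joules = p->value;
    return true;
}
```

With a pipe that fails twice, a first read (three attempts) still succeeds, while a later read that fails once returns false and leaves the stored value as it was.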

The first time, the pipe is read if the writer thread is running.
If sstat fails to read the pipe, the value is not updated and the last value is printed.
But if there is a problem during the last read:

    if there were sstat calls, the value exists but we miss all changes between the last sstat and the end of the step.
    if not, the value is just "0" (no update since the beginning).
parent bea11a6d