Improve how we log errors
Last one for today from !395 (merged).
I noticed this some months ago, but only now had time to look at the code carefully. It might be due to our custom Log class, or something else, but our exception stack traces are not showing everything. See this error, for example.
Configuration files OK
Traceback (most recent call last):
File "/home/kinow/Development/python/workspace/autosubmit/autosubmit/autosubmit.py", line 2814, in recovery
raise AutosubmitCritical(
log.log.AutosubmitCritical:
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/kinow/Development/python/workspace/autosubmit/bin/autosubmit", line 37, in main
return_value = Autosubmit.parse_args()
File "/home/kinow/Development/python/workspace/autosubmit/autosubmit/autosubmit.py", line 685, in parse_args
return Autosubmit.recovery(args.expid, args.noplot, args.save, args.all, args.hide, args.group_by,
File "/home/kinow/Development/python/workspace/autosubmit/autosubmit/autosubmit.py", line 2819, in recovery
raise AutosubmitCritical(
log.log.AutosubmitCritical:
[ERROR] Trace:
[CRITICAL] Couldn't restore the job_list or packages, check if the filesystem is having issues [eCode=7040]
More info at https://autosubmit.readthedocs.io/en/master/troubleshooting/error-codes.html
The user sees the last message, "Couldn't restore the job_list or packages, check if the filesystem is having issues [eCode=7040]". The traceback is, however, hiding a more useful message, created where the initial exception is raised.
In the image above, you should be able to see the original error: "Experiment can't be recovered due being 1 active jobs in your experiment, If you want to recover the experiment, please use the flag -f and all active jobs will be cancelled".
But that error message is never displayed to the user.
This means users will open an issue in Autosubmit, a developer will open the code, find that message by following the traceback to where it originated, and then tell the user something that could have been printed in the first place.
This is not very efficient (for both users and devs), and I think we can do better at printing the complete traceback for our users, empowering them and allowing them to troubleshoot issues much faster (i.e. a better user experience for Autosubmit users).
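One possible direction, as a rough sketch only: walk the exception chain (`__cause__`, falling back to `__context__`) and print every message, so the original error is shown alongside the final one. `CriticalError`, `collect_messages`, the shortened messages, and the inner error code 7000 are all hypothetical placeholders for illustration; this is not Autosubmit's actual `AutosubmitCritical` or Log code.

```python
class CriticalError(Exception):
    """Hypothetical stand-in; NOT Autosubmit's real AutosubmitCritical."""

    def __init__(self, message, code):
        super().__init__(message)
        self.message = message
        self.code = code


def collect_messages(exc):
    """Return messages from the outermost exception down to the root cause."""
    messages = []
    seen = set()  # guards against cycles in the chain
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        text = getattr(exc, "message", None) or str(exc)
        code = getattr(exc, "code", None)
        messages.append(f"{text} [eCode={code}]" if code is not None else text)
        # Prefer the explicit cause (raise ... from ...), else the implicit context.
        exc = exc.__cause__ if exc.__cause__ is not None else exc.__context__
    return messages


def recovery():
    # Mimics the pattern in the traceback above: one critical error is
    # re-raised as another, more generic one. Code 7000 is a placeholder.
    try:
        raise CriticalError("Experiment can't be recovered: 1 active job", 7000)
    except CriticalError as error:
        raise CriticalError("Couldn't restore the job_list or packages", 7040) from error


try:
    recovery()
except CriticalError as error:
    chain = collect_messages(error)
    for line in chain:
        print(line)
```

With something like this in the top-level handler, the user would see both the generic 7040 message and the underlying reason, instead of only the last message.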