Commit af7b4531 authored by Danny Auble, committed by Alejandro Sanchez

Handle situation where a slurmctld tries to communicate with slurmdbd more than once at the same time.

The slurmdbd/slurmctld connection can become hung.  If the slurmctld is
then restarted, a new connection is made alongside the old one.  When
the old connection becomes unwedged, it clears out the registration of
the slurmctld, so no updates are sent to that slurmctld.

This change checks for old connections when a registration message
comes in.  If we find one, we print an error, set its rem_port = 0, and
remove it from the list.  When the old connection is later unwedged, we
just close the socket instead of removing the registration.
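
The logic described above can be sketched as a small simulation.  This
is not the slurmdbd code (which is written in C); it is a minimal
Python model, and the names Conn, Dbd, register and on_close, as well
as the port value, are hypothetical and chosen only for illustration.
Only rem_port comes from the commit message.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare connections by identity, not by field values
class Conn:
    """One slurmctld connection as seen by the simulated slurmdbd."""
    cluster: str
    rem_port: int      # nonzero while this connection owns the registration
    is_open: bool = True


@dataclass
class Dbd:
    """Hypothetical stand-in for slurmdbd's list of registered connections."""
    registered: list = field(default_factory=list)
    errors: list = field(default_factory=list)

    def register(self, new_conn):
        # On a registration message, look for an older connection
        # from the same cluster and retire it.
        for old in list(self.registered):
            if old is not new_conn and old.cluster == new_conn.cluster:
                self.errors.append(
                    f"cluster {old.cluster} registered again, dropping old connection")
                old.rem_port = 0             # mark the old connection as stale
                self.registered.remove(old)  # it no longer receives updates
        self.registered.append(new_conn)

    def on_close(self, conn):
        # Called when a connection finally finishes (e.g. gets unwedged).
        conn.is_open = False
        if conn.rem_port == 0:
            return                           # stale: only close the socket
        if conn in self.registered:
            self.registered.remove(conn)     # normal case: clear registration


dbd = Dbd()
old = Conn("cluster1", 6817)   # arbitrary illustrative port
dbd.register(old)
new = Conn("cluster1", 6817)   # slurmctld restarted, old connection still hung
dbd.register(new)
dbd.on_close(old)              # old connection gets unwedged later
print(len(dbd.registered), dbd.registered[0] is new)
```

In this model, closing the stale connection leaves the new
registration in place, so updates would still reach the restarted
slurmctld.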

Bug 5213
parent d0729247