- 17 May, 2017 40 commits
-
Brian Christiansen authored
For scancel -u <user> to cancel a user's jobs that exist only on other clusters in the federation, scancel needs to load all of them.
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
Running jobs can happen out of order.
-
Brian Christiansen authored
-
Brian Christiansen authored
The original list was being freed before the copy was.
-
Isaac Hartung authored
sbatch|srun --test-only Bug 3740
-
Isaac Hartung authored
Bug 3802
-
Isaac Hartung authored
Bug 3641
-
Isaac Hartung authored
and steps Bug 3700
-
Isaac Hartung authored
Bug 3699
-
Isaac Hartung authored
Bug 3698
-
Isaac Hartung authored
Bug 3697
-
Isaac Hartung authored
Bug 3667
-
Isaac Hartung authored
to validate scontrol --local and --sibling options Bug 3662
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
Only used in fed_mgr.c
-
Brian Christiansen authored
This will make it easier to add new proto types without having to modify protocol_defs.[ch]. Leaving job_lock and job_unlock to be handled by slurmctld_req since they aren't a "queued" type.
-
Brian Christiansen authored
Prevent a possible memory leak.
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
All handled in _proc_multi_msg except for sib_job_[un]lock.
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
This prevents deadlocks when fed_job_list_mutex is locked higher up and job_completion_logger is called while the mutex is held.
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
Don't need to have the fed_write_lock when destroying the persist_conn_server.
-
Brian Christiansen authored
-
Brian Christiansen authored
Since federated submissions are now asynchronous and because the working_cluster_rec can be multithreaded, it's better to have the federated will_runs in the client. This prevents the deadlocks and the stalled persistent connections that could happen in the previous model.
-
Brian Christiansen authored
-
Brian Christiansen authored
With the change to the asynchronous model, it's better to have the cluster always get the lock from the origin cluster. Previously, the origin cluster would try to pick the one cluster that could start the job the soonest, and the scenario where there would be only one sibling was more common. Now that sibling jobs are sent to all clusters, this is less common.
-
Brian Christiansen authored
Queue up the fed job completions.
-