• Brian Christiansen's avatar
    Optmized queries with additional db indexes. · f1ed6616
    Brian Christiansen authored
    Bug 1819
    
    Composite indexes search left to right. E.g. an index of (inx1, inx2, inx3) will search from left to right. inx2 can't be used in a where statement by itself, it requires inx1 to be present (inx3 can be optional). For the rollup index having time_end first speeds up the below query. The actual rollup queries still benefit from the original rollup index.
    
    sacct -S 07/22-09:41:36 -E 07/22-09:42:37 -i 1-4 -ojobid,start,end,nnodes,nodelist -n -a:
    mysql> explain select t1.account, t1.array_max_tasks, t1.array_task_str, t1.cpus_alloc, t1.cpus_req, t1.derived_ec, t1.derived_es, t1.exit_code, t1.id_array_job, t1.id_array_task, t1.id_assoc, t1.id_block, t1.id_group, t1.id_job, t1.id_qos, t1.id_resv, t3.resv_name, t1.id_user, t1.id_wckey, t1.job_db_inx, t1.job_name, t1.kill_requid, t1.mem_req, t1.node_inx, t1.nodelist, t1.nodes_alloc, t1.partition, t1.priority, t1.state, t1.time_eligible, t1.time_end, t1.time_start, t1.time_submit, t1.time_suspended, t1.timelimit, t1.track_steps, t1.wckey, t1.gres_alloc, t1.gres_req, t1.gres_used, t2.acct, t2.lft, t2.user from compy_job_table as t1 left join compy_assoc_table as t2 on t1.id_assoc=t2.id_assoc left join compy_resv_table as t3  on t1.id_resv=t3.id_resv  where ((t1.nodes_alloc between 1 and 4)) && ((t1.time_eligible < 1437550957 && (t1.time_end >= 1437550896 || t1.time_end = 0))) group by id_job, time_submit desc;
    +----+-------------+-------+--------+---------------+---------+---------+------------------------+--------+----------------------------------------------+
    | id | select_type | table | type   | possible_keys | key     | key_len | ref                    | rows   | Extra                                        |
    +----+-------------+-------+--------+---------------+---------+---------+------------------------+--------+----------------------------------------------+
    |  1 | SIMPLE      | t1    | ALL    | id_job,rollup | NULL    | NULL    | NULL                   | 120953 | Using where; Using temporary; Using filesort |
    |  1 | SIMPLE      | t2    | eq_ref | PRIMARY       | PRIMARY | 4       | slurm_1412.t1.id_assoc |      1 | Using where                                  |
    |  1 | SIMPLE      | t3    | ref    | PRIMARY       | PRIMARY | 4       | slurm_1412.t1.id_resv  |      1 | NULL                                         |
    +----+-------------+-------+--------+---------------+---------+---------+------------------------+--------+----------------------------------------------+
    3 rows in set (0.00 sec)
    
    mysql> explain select t1.account, t1.array_max_tasks, t1.array_task_str, t1.cpus_alloc, t1.cpus_req, t1.derived_ec, t1.derived_es, t1.exit_code, t1.id_array_job, t1.id_array_task, t1.id_assoc, t1.id_block, t1.id_group, t1.id_job, t1.id_qos, t1.id_resv, t3.resv_name, t1.id_user, t1.id_wckey, t1.job_db_inx, t1.job_name, t1.kill_requid, t1.mem_req, t1.node_inx, t1.nodelist, t1.nodes_alloc, t1.partition, t1.priority, t1.state, t1.time_eligible, t1.time_end, t1.time_start, t1.time_submit, t1.time_suspended, t1.timelimit, t1.track_steps, t1.wckey, t1.gres_alloc, t1.gres_req, t1.gres_used, t2.acct, t2.lft, t2.user from compy_job_table as t1 left join compy_assoc_table as t2 on t1.id_assoc=t2.id_assoc left join compy_resv_table as t3  on t1.id_resv=t3.id_resv  where ((t1.nodes_alloc between 1 and 4)) && ((t1.time_eligible < 1437550957 && (t1.time_end >= 1437550896 || t1.time_end = 0))) group by id_job, time_submit desc;
    +----+-------------+-------+--------+-----------------------+---------+---------+------------------------+------+---------------------------------------------------------------------+
    | id | select_type | table | type   | possible_keys         | key     | key_len | ref                    | rows | Extra                                                               |
    +----+-------------+-------+--------+-----------------------+---------+---------+------------------------+------+---------------------------------------------------------------------+
    |  1 | SIMPLE      | t1    | range  | id_job,rollup,rollup2 | rollup2 | 8       | NULL                   |    6 | Using index condition; Using where; Using temporary; Using filesort |
    |  1 | SIMPLE      | t2    | eq_ref | PRIMARY               | PRIMARY | 4       | slurm_1412.t1.id_assoc |    1 | Using where                                                         |
    |  1 | SIMPLE      | t3    | ref    | PRIMARY               | PRIMARY | 4       | slurm_1412.t1.id_resv  |    1 | NULL                                                                |
    +----+-------------+-------+--------+-----------------------+---------+---------+------------------------+------+---------------------------------------------------------------------+
    3 rows in set (0.00 sec)
    
    rollup:
    mysql> explain select job.job_db_inx, job.id_job, job.id_assoc, job.id_wckey, job.array_task_pending, job.time_eligible, job.time_start, job.time_end, job.time_suspended, job.cpus_alloc, job.cpus_req, job.id_resv, SUM(step.consumed_energy) from compy_job_table as job left outer join compy_step_table as step on job.job_db_inx=step.job_db_inx and (step.id_step>=0) where (job.time_eligible < 1420102800 && (job.time_end >= 1420099200 || job.time_end = 0)) group by job.job_db_inx order by job.id_assoc, job.time_eligible;
    +----+-------------+-------+-------+--------------------------------------------------------------------------------+---------+---------+---------------------------+------+--------------------------------------------------------+
    | id | select_type | table | type  | possible_keys                                                                  | key     | key_len | ref                       | rows | Extra                                                  |
    +----+-------------+-------+-------+--------------------------------------------------------------------------------+---------+---------+---------------------------+------+--------------------------------------------------------+
    |  1 | SIMPLE      | job   | range | PRIMARY,id_job,rollup,rollup2,wckey,qos,association,array_job,reserv,sacct_def | rollup  | 4       | NULL                      |    1 | Using index condition; Using temporary; Using filesort |
    |  1 | SIMPLE      | step  | ref   | PRIMARY                                                                        | PRIMARY | 4       | slurm_1412.job.job_db_inx |    1 | Using where                                            |
    +----+-------------+-------+-------+--------------------------------------------------------------------------------+---------+---------+---------------------------+------+--------------------------------------------------------+
    2 rows in set (0.01 sec)
    
    A plain sacct is sped up by moving time_end into the middle of the index (ex. id_user, time_end, time_eligible). sacct_def is for sacct calls with a state specified, sacct_def2 is for a plain sacct call.
    
    plain sacct:
    mysql> explain select t1.account, t1.array_max_tasks, t1.array_task_str, t1.cpus_alloc, t1.cpus_req, t1.derived_ec, t1.derived_es, t1.exit_code, t1.id_array_job, t1.id_array_task, t1.id_assoc, t1.id_block, t1.id_group, t1.id_job, t1.id_qos, t1.id_resv, t3.resv_name, t1.id_user, t1.id_wckey, t1.job_db_inx, t1.job_name, t1.kill_requid, t1.mem_req, t1.node_inx, t1.nodelist, t1.nodes_alloc, t1.partition, t1.priority, t1.state, t1.time_eligible, t1.time_end, t1.time_start, t1.time_submit, t1.time_suspended, t1.timelimit, t1.track_steps, t1.wckey, t1.gres_alloc, t1.gres_req, t1.gres_used, t2.acct, t2.lft, t2.user from compy_job_table as t1 left join compy_assoc_table as t2 on t1.id_assoc=t2.id_assoc left join compy_resv_table as t3  on t1.id_resv=t3.id_resv  where (t1.id_user='1003') && ((t1.time_end >= 1437548400 || t1.time_end = 0)) group by id_job, time_submit desc;
    +----+-------------+-------+--------+------------------+-----------+---------+------------------------+-------+---------------------------------------------------------------------+
    | id | select_type | table | type   | possible_keys    | key       | key_len | ref                    | rows  | Extra                                                               |
    +----+-------------+-------+--------+------------------+-----------+---------+------------------------+-------+---------------------------------------------------------------------+
    |  1 | SIMPLE      | t1    | ref    | id_job,sacct_def | sacct_def | 4       | const                  | 60476 | Using index condition; Using where; Using temporary; Using filesort |
    |  1 | SIMPLE      | t2    | eq_ref | PRIMARY          | PRIMARY   | 4       | slurm_1412.t1.id_assoc |     1 | Using where                                                         |
    |  1 | SIMPLE      | t3    | ref    | PRIMARY          | PRIMARY   | 4       | slurm_1412.t1.id_resv  |     1 | NULL                                                                |
    +----+-------------+-------+--------+------------------+-----------+---------+------------------------+-------+---------------------------------------------------------------------+
    3 rows in set (0.00 sec)o
    
    mysql> explain select t1.account, t1.array_max_tasks, t1.array_task_str, t1.cpus_alloc, t1.cpus_req, t1.derived_ec, t1.derived_es, t1.exit_code, t1.id_array_job, t1.id_array_task, t1.id_assoc, t1.id_block, t1.id_group, t1.id_job, t1.id_qos, t1.id_resv, t3.resv_name, t1.id_user, t1.id_wckey, t1.job_db_inx, t1.job_name, t1.kill_requid, t1.mem_req, t1.node_inx, t1.nodelist, t1.nodes_alloc, t1.partition, t1.priority, t1.state, t1.time_eligible, t1.time_end, t1.time_start, t1.time_submit, t1.time_suspended, t1.timelimit, t1.track_steps, t1.wckey, t1.gres_alloc, t1.gres_req, t1.gres_used, t2.acct, t2.lft, t2.user from compy_job_table as t1 left join compy_assoc_table as t2 on t1.id_assoc=t2.id_assoc left join compy_resv_table as t3  on t1.id_resv=t3.id_resv  where (t1.id_user='1003') && ((t1.time_end >= 1437548400 || t1.time_end = 0)) group by id_job, time_submit desc;
    +----+-------------+-------+--------+-------------------------------------+------------+---------+------------------------+------+--------------------------------------------------------+
    | id | select_type | table | type   | possible_keys                       | key        | key_len | ref                    | rows | Extra                                                  |
    +----+-------------+-------+--------+-------------------------------------+------------+---------+------------------------+------+--------------------------------------------------------+
    |  1 | SIMPLE      | t1    | range  | id_job,rollup2,sacct_def,sacct_def2 | sacct_def2 | 8       | NULL                   |   68 | Using index condition; Using temporary; Using filesort |
    |  1 | SIMPLE      | t2    | eq_ref | PRIMARY                             | PRIMARY    | 4       | slurm_1412.t1.id_assoc |    1 | Using where                                            |
    |  1 | SIMPLE      | t3    | ref    | PRIMARY                             | PRIMARY    | 4       | slurm_1412.t1.id_resv  |    1 | NULL                                                   |
    +----+-------------+-------+--------+-------------------------------------+------------+---------+------------------------+------+--------------------------------------------------------+
    3 rows in set (0.00 sec)
    
    Adding the sacct_def2 index order didn't affect other queries:
    
    sacct -s CA,CD,F,R:
    mysql> explain select t1.account, t1.array_max_tasks, t1.array_task_str, t1.cpus_alloc, t1.cpus_req, t1.derived_ec, t1.derived_es, t1.exit_code, t1.id_array_job, t1.id_array_task, t1.id_assoc, t1.id_block, t1.id_group, t1.id_job, t1.id_qos, t1.id_resv, t3.resv_name, t1.id_user, t1.id_wckey, t1.job_db_inx, t1.job_name, t1.kill_requid, t1.mem_req, t1.node_inx, t1.nodelist, t1.nodes_alloc, t1.partition, t1.priority, t1.state, t1.time_eligible, t1.time_end, t1.time_start, t1.time_submit, t1.time_suspended, t1.timelimit, t1.track_steps, t1.wckey, t1.gres_alloc, t1.gres_req, t1.gres_used, t2.acct, t2.lft, t2.user from compy_job_table as t1 left join compy_assoc_table as t2 on t1.id_assoc=t2.id_assoc left join compy_resv_table as t3  on t1.id_resv=t3.id_resv  where (t1.id_user='1003') && ((t1.state='4' && (t1.time_end && (t1.time_end >= 1438028802))) || (t1.state='3' && (t1.time_end && (t1.time_end >= 1438028802))) || (t1.state='5' && (t1.time_end && (t1.time_end >= 1438028802))) || (t1.time_start && ((!t1.time_end && t1.state=1) || (1438028802 between t1.time_start and t1.time_end)))) group by id_job, time_submit desc;
    +----+-------------+-------+--------+-------------------------------------+-----------+---------+------------------------+-------+---------------------------------------------------------------------+
    | id | select_type | table | type   | possible_keys                       | key       | key_len | ref                    | rows  | Extra                                                               |
    +----+-------------+-------+--------+-------------------------------------+-----------+---------+------------------------+-------+---------------------------------------------------------------------+
    |  1 | SIMPLE      | t1    | ref    | id_job,rollup2,sacct_def,sacct_def2 | sacct_def | 4       | const                  | 60513 | Using index condition; Using where; Using temporary; Using filesort |
    |  1 | SIMPLE      | t2    | eq_ref | PRIMARY                             | PRIMARY   | 4       | slurm_1412.t1.id_assoc |     1 | Using where                                                         |
    |  1 | SIMPLE      | t3    | ref    | PRIMARY                             | PRIMARY   | 4       | slurm_1412.t1.id_resv  |     1 | NULL                                                                |
    +----+-------------+-------+--------+-------------------------------------+-----------+---------+------------------------+-------+---------------------------------------------------------------------+
    3 rows in set (0.00 sec)
    
    Adding nodes_alloc index speeds up quries with queries like: sacct -i 2-10000
    
    mysql> EXPLAIN SELECT t1.account, t1.array_max_tasks, t1.array_task_str, t1.cpus_req, t1.derived_ec, t1.derived_es, t1.exit_code, t1.id_array_job, t1.id_array_task, t1.id_assoc, t1.id_block, t1.id_group, t1.id_job, t1.id_qos, t1.id_resv, t3.resv_name, t1.id_user, t1.id_wckey, t1.job_db_inx, t1.job_name, t1.kill_requid, t1.mem_req, t1.node_inx, t1.nodelist, t1.nodes_alloc, t1.partition, t1.priority, t1.state, t1.time_eligible, t1.time_end, t1.time_start, t1.time_submit, t1.time_suspended, t1.timelimit, t1.track_steps, t1.wckey, t1.gres_alloc, t1.gres_req, t1.gres_used, t2.acct, t2.lft, t2.user FROM compy_job_table AS t1 LEFT JOIN compy_assoc_table AS t2 ON t1.id_assoc = t2.id_assoc LEFT JOIN compy_resv_table AS t3 ON t1.id_resv = t3.id_resv WHERE ((t1.nodes_alloc between 2 and 10000)) && ((t1.time_start && ((1434384740 BETWEEN t1.time_start AND t1.time_end) || (t1.time_start BETWEEN 1434384740 AND 1434384741)))) GROUP BY id_job , time_submit DESC;
    +----+-------------+-------+--------+----------------+---------+---------+-------------------------+--------+----------------------------------------------+
    | id | select_type | table | type   | possible_keys  | key     | key_len | ref                     | rows   | Extra                                        |
    +----+-------------+-------+--------+----------------+---------+---------+-------------------------+--------+----------------------------------------------+
    |  1 | SIMPLE      | t1    | ALL    | id_job,rollup2 | NULL    | NULL    | NULL                    | 117549 | Using where; Using temporary; Using filesort |
    |  1 | SIMPLE      | t2    | eq_ref | PRIMARY        | PRIMARY | 4       | 1411_master.t1.id_assoc |      1 | NULL                                         |
    |  1 | SIMPLE      | t3    | ref    | PRIMARY        | PRIMARY | 4       | 1411_master.t1.id_resv  |      1 | NULL                                         |
    +----+-------------+-------+--------+----------------+---------+---------+-------------------------+--------+----------------------------------------------+
    3 rows in set (0.00 sec)
    
    mysql> EXPLAIN SELECT t1.account, t1.array_max_tasks, t1.array_task_str, t1.cpus_req, t1.derived_ec, t1.derived_es, t1.exit_code, t1.id_array_job, t1.id_array_task, t1.id_assoc, t1.id_block, t1.id_group, t1.id_job, t1.id_qos, t1.id_resv, t3.resv_name, t1.id_user, t1.id_wckey, t1.job_db_inx, t1.job_name, t1.kill_requid, t1.mem_req, t1.node_inx, t1.nodelist, t1.nodes_alloc, t1.partition, t1.priority, t1.state, t1.time_eligible, t1.time_end, t1.time_start, t1.time_submit, t1.time_suspended, t1.timelimit, t1.track_steps, t1.wckey, t1.gres_alloc, t1.gres_req, t1.gres_used, t2.acct, t2.lft, t2.user FROM compy_job_table AS t1 LEFT JOIN compy_assoc_table AS t2 ON t1.id_assoc = t2.id_assoc LEFT JOIN compy_resv_table AS t3 ON t1.id_resv = t3.id_resv WHERE ((t1.nodes_alloc between 2 and 10000)) && ((t1.time_start && ((1434384740 BETWEEN t1.time_start AND t1.time_end) || (t1.time_start BETWEEN 1434384740 AND 1434384741)))) GROUP BY id_job , time_submit DESC;
    +----+-------------+-------+--------+----------------------------+-------------+---------+-------------------------+------+---------------------------------------------------------------------+
    | id | select_type | table | type   | possible_keys              | key         | key_len | ref                     | rows | Extra                                                               |
    +----+-------------+-------+--------+----------------------------+-------------+---------+-------------------------+------+---------------------------------------------------------------------+
    |  1 | SIMPLE      | t1    | range  | id_job,rollup2,nodes_alloc | nodes_alloc | 4       | NULL                    |  720 | Using index condition; Using where; Using temporary; Using filesort |
    |  1 | SIMPLE      | t2    | eq_ref | PRIMARY                    | PRIMARY     | 4       | 1411_master.t1.id_assoc |    1 | NULL                                                                |
    |  1 | SIMPLE      | t3    | ref    | PRIMARY                    | PRIMARY     | 4       | 1411_master.t1.id_resv  |    1 | NULL                                                                |
    +----+-------------+-------+--------+----------------------------+-------------+---------+-------------------------+------+---------------------------------------------------------------------+
    3 rows in set (0.00 sec)
    f1ed6616
To find the state of this project's repository at the time of any of these versions, check out the tags.