Commit 6c927b3f authored by Moe Jette

select/cray: fix error in 'is_gemini' logic

The is_gemini logic is too simple: as just observed on a SeaStar system, it can
be fooled into the wrong result if more than 1 row has NULL coordinates. 

This case happens if a blade has been powered down completely, so that the SeaStar
network chip is also powered off. The routing system recognizes this case and 
routes around the powered-down node in the torus. It is plausible that in such a
case the torus coordinates are NULL, since the node(s) are no longer part of the
torus. 

(It is also possible to set all nodes on a blade down but leave the power switched
 on. The SeaStar chip, which is independent of the motherboard, then continues to
 provide routing connectivity, i.e. the torus coordinates are all non-NULL, but
 the node can do no computing; its ALPS state is "ROUTING".)

Here is the example which revealed this behaviour: one blade, nodes 804-807,
had been powered down after a system failure.

mysql> select COUNT(*), COUNT(DISTINCT x_coord,y_coord,z_coord) FROM processor;
+----------+-----------------------------------------+
| COUNT(*) | COUNT(DISTINCT x_coord,y_coord,z_coord) |
+----------+-----------------------------------------+
|     1882 |                                    1878 | 
+----------+-----------------------------------------+

==> There are 4 more node IDs than there are distinct coordinates.

mysql> select processor_id,x_coord,y_coord,z_coord from processor\
       WHERE x_coord IS NULL OR y_coord IS NULL OR z_coord IS NULL;
+--------------+---------+---------+---------+
| processor_id | x_coord | y_coord | z_coord |
+--------------+---------+---------+---------+
|          804 |    NULL |    NULL |    NULL | 
|          805 |    NULL |    NULL |    NULL | 
|          806 |    NULL |    NULL |    NULL | 
|          807 |    NULL |    NULL |    NULL | 
+--------------+---------+---------+---------+

==> The corrected query, which excludes rows with NULL coordinates, gives the correct result (equality):
mysql> select COUNT(*), COUNT(DISTINCT x_coord,y_coord,z_coord) FROM processor\
       WHERE x_coord IS NOT NULL AND y_coord IS NOT NULL AND z_coord IS NOT NULL;
+----------+-----------------------------------------+
| COUNT(*) | COUNT(DISTINCT x_coord,y_coord,z_coord) |
+----------+-----------------------------------------+
|     1878 |                                    1878 | 
+----------+-----------------------------------------+
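The effect of the NULL rows on the two counts can be modelled outside the database. The following is a minimal Python sketch, not the select/cray code itself: the function names are hypothetical, the node data is synthetic, and the decision rule (more rows than distinct coordinates is taken to mean Gemini) is an assumption inferred from the queries above. It relies on the MySQL behaviour that COUNT(DISTINCT ...) ignores any row in which one of the expressions is NULL.

```python
from typing import Iterable, Optional, Tuple

# One (x_coord, y_coord, z_coord) triple per row of the processor table.
Coord = Tuple[Optional[int], Optional[int], Optional[int]]


def count_rows_and_distinct(rows: Iterable[Coord],
                            skip_null_rows: bool) -> Tuple[int, int]:
    """Model COUNT(*) and COUNT(DISTINCT x_coord,y_coord,z_coord)."""
    rows = list(rows)
    if skip_null_rows:
        # Models the WHERE ... IS NOT NULL clause of the corrected query.
        rows = [r for r in rows if None not in r]
    # COUNT(DISTINCT ...) ignores rows where any expression is NULL.
    distinct = {r for r in rows if None not in r}
    return len(rows), len(distinct)


def looks_like_gemini(rows: Iterable[Coord], skip_null_rows: bool) -> bool:
    """Assumed rule (hypothetical): a surplus of rows means Gemini."""
    total, distinct = count_rows_and_distinct(rows, skip_null_rows)
    return total > distinct


# Synthetic SeaStar-like data: 8 nodes with unique coordinates,
# plus one powered-down blade of 4 nodes with NULL coordinates.
seastar_nodes = [(x, y, 0) for x in range(4) for y in range(2)]
powered_down_blade = [(None, None, None)] * 4
rows = seastar_nodes + powered_down_blade

print(count_rows_and_distinct(rows, skip_null_rows=False))  # (12, 8)
print(looks_like_gemini(rows, skip_null_rows=False))        # True (wrong)
print(count_rows_and_distinct(rows, skip_null_rows=True))   # (8, 8)
print(looks_like_gemini(rows, skip_null_rows=True))         # False
```

As in the transcripts above, the unfiltered counts differ by exactly the number of powered-down nodes, and filtering the NULL rows restores equality.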
parent 3d96a32e