Add the unit test for merge_across_dims_narm = F

1 job for develop-merge_inner_dim_with_diff_lengths ran in 65 minutes and 57 seconds, using 0 compute credits, and was queued for 3 seconds