diff --git a/.gitmodules b/.gitmodules new file mode 100644 index 0000000000000000000000000000000000000000..ff1e283a938ab8e12939d38c10a8e4e79be37a3d --- /dev/null +++ b/.gitmodules @@ -0,0 +1,4 @@ +[submodule "src/magiccut"] + path = src/magiccut + url = https://earth.bsc.es/gitlab/ces/magiccut.git + branch = numpy_version diff --git a/INSTALL.md b/INSTALL.md deleted file mode 100644 index 5fedce4316060fe8b13739af7b72050263d39e4f..0000000000000000000000000000000000000000 --- a/INSTALL.md +++ /dev/null @@ -1,28 +0,0 @@ -# Installation - -There is no installation required. Just copy the content of this folder to your -preferred location and add the directory to the PATH environment variable. - -## Prerequisites - -Basicanalysis requires Python 3 and relies on -*paramedir* and *Dimemas* being installed and available -through the PATH environment variable. - -* *paramedir* available at https://tools.bsc.es/paraver -* *Dimemas* available at https://tools.bsc.es/dimemas - -If not already done, install both tools and add them to the PATH environment -variable with: - -``` -export PATH=/bin:$PATH -export PARAVER_HOME= -export PATH=/bin:$PATH -export DIMEMAS_HOME= - -``` - -Additionally, plotting relies on the according SciPy(>= 0.17.0), -NumPy, pandas, searborn and matplotlib (>= 3.x) modules for Python 3. -Furthermore, the gnuplot output requires gnuplot version 5.0 or higher. diff --git a/README.md b/README.md index 388dea6c5a97b1eb8186cce39fce07a4d2eff920..ade9efd8110ef83f0c120ee3d458fd9ebbe5db2e 100644 --- a/README.md +++ b/README.md @@ -1,32 +1,42 @@ # Nemo modelfactors +This project was developed and tested with NEMO version 4.2. -Script used to get important metrics from NEMO +This project computes important performance metrics for a NEMO run. + +The statistics produced focus on the timestep loop, so these numbers do not cover the initialization and finalization parts. # Installation and requirements -This script requires the following BSCTOOLS to be installed, loaded and available through the PATH environment variable. +This script relies on the following tools: * *Extrae (4.0.0 or above)* * *Paraver* * *Dimemas (the latest release has a bug, use 5.4.2-devel instead)* * *Basicanalysis* -*Tools available at https://tools.bsc.es/downloads* +*They can be downloaded at https://tools.bsc.es/downloads* and need to be installed, loaded and made available through the PATH environment variable. -Also the different modules needed to compile and execute NEMO should be loaded before the script execution. +The following modules need to be loaded before executing the script: -* Perl interpreter +* Perl interpreter. * Fortran compiler (ifort, gfortran, pgfortran, ftn, …), * Message Passing Interface (MPI) implementation (e.g. OpenMPI or MPICH). -* Network Common Data Form (NetCDF) library with its underlying Hierarchical Data Form (HDF) +* Network Common Data Form (NetCDF) library with its underlying Hierarchical Data Form (HDF). # Usage -* Copy all the content of this folder into the folder with the input data for NEMO -* Edit the file config.bash adding the information required. -* Execute script.sh +* Clone this repository wherever you please. +* Don't move the content of the repository elsewhere before the submodules are loaded; otherwise they won't load and the script will fail. If you want, you can initialize the submodules manually with `git submodule update --init` (see the sketch after this diff) and then move the content.
+* ***Edit the file perf_metrics.config and replace the parameter values with the information that suits your setup.*** +* ***MINIMUM CHANGES in perf_metrics.config:*** + * Nemo_path, change the value to the path where NEMO is installed on your machine. + * Nemo_input_data, change the value to the path where the input data for the configuration is downloaded. + * Compilation_arch, replace the value with the name of the arch file that you use to compile NEMO. + * Modules, change the value to the names of the modules you need to load. + * Jobs_scheduler, replace the value with the name of the scheduler installed on your machine (slurm, lsf and torque are currently supported). +* Execute perf_metrics.bash ``` -./script.sh +./perf_metrics.bash ``` -* If the script executed without problems the data will be ready at the Metrics folder. +* If the script executes without problems, the data will by default be ready inside the ../Output/Metrics folder. The Output dir path can be changed in perf_metrics.config. diff --git a/TODO.md b/TODO.md deleted file mode 100644 index b8b4b814722a596ec8ec4350ce98de1070389b46..0000000000000000000000000000000000000000 --- a/TODO.md +++ /dev/null @@ -1,14 +0,0 @@ -# Open Issues and Future Features - -* Add some sanity checks: - * Test if simulated time is less or equal to original time to detect Dimemas errors. - * Correctly track return values of external calls and handle errors. - -* Add support for compressed Paraver traces. - -* Add basic threading support to run some analyses in parallel. - -* Add support PyCompSs - * Encapsulate main routines in functions - * Provide stable switches for systems without PyCompSs - diff --git a/config.bash b/config.bash deleted file mode 100644 index 8d0f4c8745195a3c66db23ef6ea14f20be83ee99..0000000000000000000000000000000000000000 --- a/config.bash +++ /dev/null @@ -1,45 +0,0 @@ - -# Nemo_path: Relative path to nemo installation folder containing the cfgs and arch dirs -# Nemo_cores: List of nºcores used for executing Nemo, ( 4 48 ) makes the script execute and -# get Nemo traces with 4 and 48 cores. 2 different nºcores are needed to obtain scalability data. - -Nemo_path="../NEMO" -Nemo_cores=( 4 24 48 ) - -# Jobs_n_cores: nºcores used for executing other scripts. More than 4 is not optimal -# Jobs_scheduler: Available (slurm/lsf) -# Jobs_time: Max duration of the job in min -# Jobs_queue: Queue used - -Jobs_n_cores=4 -Jobs_scheduler="slurm" -Jobs_time="60" -Jobs_queue=debug - -# Compilation_compile: When false only compiles NEMO if arch file lacks the needed flags, when true always compiles NEMO. -# Compilation_ref: Reference configuration -# Compilation_arch: Architecture used (without the -arch sufix and the .fcm) -# Compilation_name: Name of the new configutation -# Compilation_sub: Add or remove subcomponents - - -Compilation_compile="false" -Compilation_ref="ORCA2_ICE_PISCES" -Compilation_arch="X64_MN4" -Compilation_name="ORCA2_EXTRAE" -Compilation_sub="OCE del_key 'key_si3 key_top'" - -# List of modules loaded -# Required: -# - Perl interpreter -# - Fortran compiler (ifort, gfortran, pgfortran, ftn, …) -# - Message Passing Interface (MPI) implementation (e.g. OpenMPI or MPICH).
-# - Network Common Data Form (NetCDF) library with its underlying Hierarchical Data Form (HDF) -# - Extrae -# - Paraver -# - Dimemas 4.2 -devel -# - Python3 -# - gnuplot -# EXTRAE BASICANALYSIS gcc/7.2.0 intel/2017.4 impi/2018.4 netcdf/4.4.1.1 hdf5/1.8.19 DIMEMAS/5.4.2-devel - -Modules="EXTRAE BASICANALYSIS gcc intel/2018.3 impi/2018.4 netcdf/4.4.1.1 hdf5/1.8.19 DIMEMAS/5.4.2-devel perl" diff --git a/magiccut/README.md b/magiccut/README.md deleted file mode 100644 index 3015d293d89a54637c0bec7b5bc2429fd69fc19c..0000000000000000000000000000000000000000 --- a/magiccut/README.md +++ /dev/null @@ -1,4 +0,0 @@ -# Magic Cut - -A tool created to automatically cut a single iteration from a **extrae trace** using function events (60000019) as a reference. -Created using NEMO traces, not tested with other codes. \ No newline at end of file diff --git a/magiccut/bin/TraceCutter.py b/magiccut/bin/TraceCutter.py deleted file mode 100755 index 4b26cd12677b6079b2f324c6cce66e399c9631da..0000000000000000000000000000000000000000 --- a/magiccut/bin/TraceCutter.py +++ /dev/null @@ -1,266 +0,0 @@ -import numpy as np -import xml.etree.ElementTree as ET -import time as clock_time -from os.path import exists, realpath - - -def get_command_line_arguments(): - """ - Returns a list of files that have been provided as a command line argument - :return: list of files - """ - - # Parse and assert command line options - import argparse - - parser = argparse.ArgumentParser() - parser.add_argument("-i", "--input-data", required=True, type=PathType(exists=True, type='file'), - help="File with filtered info on cores ID, function ID, and timings") - parser.add_argument("--ts-number", default=None, type=int, - help="Number of time-steps present in the trace") - parser.add_argument("-t", "--tool-path", default=None, type=PathType(exists=True, type='dir'), - help="Path to Magicut templates") - parser.add_argument("--function-file", required=True, type=PathType(exists=True, type='file'), - help="File with names corresponding to function IDs") - - args = parser.parse_args() - - if args.input_data is None: - parser.error("Input data are needed") - else: - if not exists(args.input_data): - parser.error("The specified input does not exists") - if args.function_file is not None and not exists(args.function_file): - parser.error("The specified file does not exists") - if args.ts_number is None: - print("Warning: magicCut will try to guess the number of time-steps") - if args.tool_path is None: - args.tool_path = "/".join(realpath(__file__).split("/")[:-2]) + "/templates/" - if not exists(args.tool_path): - parser.error("The specified path to tools does not exist: " + args.tool_path) - - return args - - -class PathType(object): - def __init__(self, exists=True, type='file', dash_ok=True): - '''exists: - True: a path that does exist - False: a path that does not exist, in a valid parent directory - None: don't care - type: file, dir, symlink, None, or a function returning True for valid paths - None: don't care - dash_ok: whether to allow "-" as stdin/stdout''' - - assert exists in (True, False, None) - assert type in ('file', 'dir', 'symlink', None) or hasattr(type, '__call__') - - self._exists = exists - self._type = type - self._dash_ok = dash_ok - - def __call__(self, string): - from argparse import ArgumentTypeError as err - import os - if string == '-': - # the special argument "-" means sys.std{in,out} - if self._type == 'dir': - raise err('standard input/output (-) not allowed as directory path') - elif self._type == 'symlink': 
- raise err('standard input/output (-) not allowed as symlink path') - elif not self._dash_ok: - raise err('standard input/output (-) not allowed') - else: - e = os.path.exists(string) - if self._exists == True: - if not e: - raise err("path does not exist: '%s'" % string) - - if self._type is None: - pass - elif self._type == 'file': - if not os.path.isfile(string): - raise err("path is not a file: '%s'" % string) - elif self._type == 'symlink': - if not os.path.symlink(string): - raise err("path is not a symlink: '%s'" % string) - elif self._type == 'dir': - if not os.path.isdir(string): - raise err("path is not a directory: '%s'" % string) - elif not self._type(string): - raise err("path not valid: '%s'" % string) - else: - if self._exists == False and e: - raise err("path exists: '%s'" % string) - - p = os.path.dirname(os.path.normpath(string)) or '.' - if not os.path.isdir(p): - raise err("parent path is not a directory: '%s'" % p) - elif not os.path.exists(p): - raise err("parent directory does not exist: '%s'" % p) - - return string - - -def get_function_name(functions_name_file, function_ids): - # Opens file with format #ID FUNCTION_NAME - with open(functions_name_file) as f: - ids_name_list = f.read() - id = [l.split()[0] for l in ids_name_list.splitlines()] - names = [l.split()[1] for l in ids_name_list.splitlines()] - function_names = [] - for ID in function_ids: - index = id.index(str(ID)) - function_names.append(names[index]) - return function_names - - -def save_function_names(function_ids, functions_name_file): - # Opens file with format #ID FUNCTION_NAME - with open(functions_name_file) as f: - ids_name_list = f.read() - id = [l.split()[0] for l in ids_name_list.splitlines()] - names = [l.split()[1] for l in ids_name_list.splitlines()] - with open('functions.txt', 'w') as f: - for ID in function_ids: - index = id.index(str(ID)) - f.write(id[index] + " " + names[index] + "\n") - print("Functions that are called once per time-step saved to functions.txt") - - -def cut_time(tool_path, input_data, function_name_file, time_steps=None): - # Path to the cutter template - template_path = tool_path + "cutter_template.xml" - print("Starting the magic") - start = clock_time.time() - - # Read trace and load function calls - cpu_ids, time, function_ids = np.genfromtxt(input_data, dtype='int', unpack='True') - - print("for opening the input_data: ") - end = clock_time.time() - print(end - start) - - # Find set of different routines, and number of cores - unique_function_ids, call_counts = np.unique(function_ids, return_counts=True) - cpu_number = len(np.unique(cpu_ids)) - - function_names = get_function_name(function_name_file, unique_function_ids) - - if time_steps is None: - # Remember that everything is multiplied by N cores - # Creates an histogram with the number of how many functions are called n times - counts = np.bincount(call_counts) - # Suppose that the most common value is the number of ts - time_steps = np.argmax(counts) - # Find functions that are called once per time-step - functions_called_once_per_step = unique_function_ids[call_counts == time_steps] - else: - # We just search for that functions that appears once per core per time-step - functions_called_once_per_step = unique_function_ids[call_counts / cpu_number == time_steps] - - if not len(functions_called_once_per_step): - raise Exception("No function has been found which is called once per timestep") - - # Find which routine is called in first place - magic_sentinel = [n for n in function_names if 
n.count("magiccut")] - if magic_sentinel: - # An artificial first routine was introduces - index = function_names.index(magic_sentinel[0]) - first_routine = unique_function_ids[index] - else: - # see which functions is first - first_routine = functions_called_once_per_step[0] - - print("First called routine: ", function_names[unique_function_ids.tolist().index(first_routine)]) - # Find number of nemo_proc_number - nemo_proc = cpu_ids[function_ids == first_routine] - nemo_proc = np.unique(nemo_proc) - nemo_proc_number = nemo_proc.shape[0] - - nemo_proc_min = min(nemo_proc) - nemo_proc_max = max(nemo_proc) - - print("Actual n_proc", nemo_proc_number, "with ", time_steps, "time steps") - - # Find the index of the fastest step - time_step = np.ones([nemo_proc_number, time_steps]) - - ts_min_index = np.zeros(nemo_proc_number, dtype='int') - ts_max_index = np.zeros(nemo_proc_number, dtype='int') - - ts_time = np.zeros([nemo_proc_number, time_steps]) - start = clock_time.time() - # For each processor - for index, proc in enumerate(nemo_proc): - - # Get the starting time of all time-step - ts_time[index] = time[(cpu_ids == proc) & (function_ids == first_routine)] - - # Compute the duration of each ts - ts_duration = np.diff(ts_time[index]) - - # Index of the shortest ts - ts_min_index[index] = np.argmin(ts_duration) - - # Index of the largest ts - ts_max_index[index] = np.argmax(ts_duration[1:-1]) + 1 - - # Evaluate the most common index for best ts - counts = np.bincount(ts_min_index) - best_ts_index = np.argmax(counts) - - # Evaluate the most common index for worst ts - counts = np.bincount(ts_max_index) - worst_ts_index = np.argmax(counts) - - print("for finding the index of the slowest / fastest step: ", worst_ts_index, "/", best_ts_index) - end = clock_time.time() - print(end - start) - - # Find the start and the end of the best step - best_ts_start = min(ts_time[:, best_ts_index]) - best_ts_end = max(ts_time[:, best_ts_index + 1]) - - # Find the start and the end of the best step - worst_ts_start = min(ts_time[:, worst_ts_index]) - worst_ts_end = max(ts_time[:, worst_ts_index + 1]) - - print("Worst / best time step's duration: ", worst_ts_end - worst_ts_start, best_ts_end - best_ts_start) - - # Open the xml template - tree = ET.parse(template_path) - - cutter = tree.find('cutter') - for tasks in tree.iter('tasks'): - tasks.text = str(nemo_proc_min) + "-" + str(nemo_proc_max) - for minimum_time in tree.iter('minimum_time'): - minimum_time.text = str(best_ts_start - 1000) - for maximum_time in tree.iter('maximum_time'): - maximum_time.text = str(best_ts_end + 1000) - - # Create paramedir cutter file - tree.write('best_time_cutter.xml') - for minimum_time in tree.iter('minimum_time'): - minimum_time.text = str(worst_ts_start - 1000) - for maximum_time in tree.iter('maximum_time'): - maximum_time.text = str(worst_ts_end + 1000) - - # Create paramedir cutter file - tree.write('worst_time_cutter.xml') - - return unique_function_ids.tolist() - - -if __name__ == "__main__": - # Get files from command line - args = get_command_line_arguments() - - tool_path = args.tool_path - input_data = args.input_data - function_file = args.function_file - ts_number = args.ts_number - - unique_functions = cut_time(tool_path, input_data, function_file, ts_number) - # unique_functions = [6, 33] - save_function_names(unique_functions, function_file) diff --git a/magiccut/magicCut b/magiccut/magicCut deleted file mode 100755 index 9d0b6aa906193857b2e39648a73699265daaa51c..0000000000000000000000000000000000000000 --- 
a/magiccut/magicCut +++ /dev/null @@ -1,64 +0,0 @@ -#!/bin/bash - -# Get inputs arg -if [[ $# -lt 2 ]] -then - echo $0 "expects trace and number of timesteps as input" - echo "abort" - exit 1 -fi - -# trace file to cut -trace_file=$1 -# Number of timesteps -time_steps=$2 - -if [[ ! -e ${trace_file} ]] -then - echo "The specified trace does not exist: "${trace_file} - echo "Abort" - exit 1 -fi -trace_file=`readlink -f ${trace_file}` -pcf_trace_file=${trace_file/.prv/.pcf} -trace_folder=${trace_file%/*} - -tool_path=`readlink -f $0` -tool_path=${tool_path//'/magicCut'} - - -trace_base_name=${trace_file//.exe.prv.gz} -trace_base_name=${trace_base_name//.exe.prv} -trace_base_name=${trace_base_name//.prv.gz} -trace_base_name=${trace_base_name//.prv} - -# grep through prv file if functions are there -func_check=`grep -m 1 60000019 ${pcf_trace_file} | wc -l` -# Stores the list of function and extrae ids -begin_line_number=`awk '/60000019 User function/ {print FNR}' ${pcf_trace_file}` -begin_line_number=$((begin_line_number+2)) -end_line_number=`tail -n +${begin_line_number} ${pcf_trace_file} | grep -nm 1 EVENT_TYPE | awk -F: '{print $1}'` -end_line_number=$((begin_line_number+end_line_number-2)) - -# removes also the blank lines at the end -sed -n "${begin_line_number},${end_line_number}p" ${pcf_trace_file}| awk '{print $1, $2}' | awk NF > FUNCTION_ID_NAMES.txt - -if [[ ${func_check} -gt 0 ]] -then - CPU_T_ID=${trace_folder}/CPU_T_ID.txt - cat /dev/null > ${CPU_T_ID} - - # Retrieve function's ID - echo "Retrieve function's ID" - grep ':60000019:' ${trace_file} |\ - grep -v ':0:' |\ - awk -F : '{print $2, $6, $8}' > ${CPU_T_ID} - # Finds best time step - python ${tool_path}/bin/TraceCutter.py --input-data ${CPU_T_ID} --ts-number ${time_steps} --function-file FUNCTION_ID_NAMES.txt - #rm ${CPU_T_ID} - echo "start paramedir cutter" - time paramedir -c best_time_cutter.xml ${trace_file} -o ${trace_base_name}.best_cut.prv -else - echo -e "Functions must be present in the trace for the script to work.\nAborting ... 
" - exit 1 -fi diff --git a/magiccut/templates/cutter_template.xml b/magiccut/templates/cutter_template.xml deleted file mode 100755 index 3394112ad74c7f1c429259bb2821daa4416b350a..0000000000000000000000000000000000000000 --- a/magiccut/templates/cutter_template.xml +++ /dev/null @@ -1,23 +0,0 @@ - - - - - - - - - - - 1-256 - 0 - 1 - %START% - %END% - 0 - 100 - 0 - 1 - 0 - 0 - - diff --git a/perf_metrics.bash b/perf_metrics.bash new file mode 100755 index 0000000000000000000000000000000000000000..9470d4aa1eb019a172abd76f4ff476c7fcb61483 --- /dev/null +++ b/perf_metrics.bash @@ -0,0 +1,56 @@ +#!/bin/bash + +main() +{ + +if [ $# -gt 0 ]; then + + echo "This script does not accept arguments, parameters need to be added to perf_metrics.config Aborting" + exit 1 + +fi + + +#Load submodules +git submodule update --init +#Get script directory +dir=$(pwd) + +echo +echo "Using the following configuration:" +echo + +#Load functions from file +source "$dir"/src/functions.bash + +#Load parameters from file +source "$dir"/perf_metrics.config + +# print parameters +grep -o '^[^#]*' perf_metrics.config +echo + +#Init variables +Init + +#Test if parameters are valid +Test_arguments + +cd "${Gprof_path}"||(echo "Error ${Gprof_path} folder doesn't exists"; exit 1) + +#Create the list of important functions from NEMO +Gprof_functions & + +cd "${Run_path}"||(echo "Error ${Run_path} folder doesn't exists"; exit 1) +#Get the traces of the executions and cut 1 timestep +Get_trace + +cd "${Metrics_path}"||(echo "Error ${Metrics_path} folder doesn't exists"; exit 1) +#Generate the performance metrics +Create_metrics + +} + +main "$@"; exit + + diff --git a/perf_metrics.config b/perf_metrics.config new file mode 100644 index 0000000000000000000000000000000000000000..59ee9cc73f1840e88c294ee1848a6d561a17e729 --- /dev/null +++ b/perf_metrics.config @@ -0,0 +1,60 @@ +# Respect the bash format, no spaces between variables, the = symbol and the respective value. +# Arrays need a space after the open and before the closing parenthesism, the elements are separated by spaces. + +################################################################################# + +# Output (string): Path where the Output dir, containing all the output files, will be created. + +Output=".." + +# Nemo_path (string) : Path to nemo installation folder containing the cfgs and arch dirs. +# Nemo_input_data (string): Path to the input data needed to run the nemo cfg. +# Nemo_run (string): Path where the folder Run_NEMO will be created. +# Nemo_cores (array): List of nºcores used for executing Nemo, ( 4 48 ) makes the script execute and +# get Nemo traces with 4 and 48 cores. 2 different nºcores are needed to obtain scalability data. + +Nemo_path="NEMO_INSTALLATION_PATH" +Nemo_input_data="NEMO_INPUT_DATA_PATH" +Nemo_cores=( 4 24 48 96 192) + +# Jobs_n_cores (integer): nºcores used for other jobs like compiling nemo. +# Jobs_cores_per_node (integer): define the number of cores per node. +# Jobs_scheduler (string): Available (slurm/lsf/torque). +# Jobs_time (integer): Max duration of the job in min. +# Jobs_queue (string): Queue used. + +Jobs_n_cores=4 +Jobs_cores_per_node= +Jobs_scheduler="slurm" +Jobs_time=60 +Jobs_queue= + +# Compilation_compile (boolean): When false only compiles NEMO if arch file lacks the needed flags, when true always compiles NEMO. +# Compilation_ref (string): Reference configuration. +# Compilation_arch (string): Architecture used (without the -arch suffix and the .fcm). 
+# Compilation_name (string): Name of the new configuration (Important to not be an existing one). +# Compilation_sub (string): Add or remove sub-components. + + +Compilation_compile=false +Compilation_ref="ORCA2_ICE_PISCES" +Compilation_arch="YOUR_ARCH_FILE" +Compilation_name="ORCA2_EXTRAE" +Compilation_sub="OCE del_key 'key_si3 key_top'" + +# Clean (boolean): If true, at the end of the script, all residual files from NEMO executions (data, outputs, executable, folders) are deleted. + +Clean=true + +# Modules (string): List of modules loaded. +# Required: +# - Perl interpreter +# - Fortran compiler (ifort, gfortran, pgfortran, ftn, …) +# - Message Passing Interface (MPI) implementation (e.g. OpenMPI or MPICH). +# - Network Common Data Form (NetCDF) library with its underlying Hierarchical Data Form (HDF) +# - Extrae +# - Paraver +# - Dimemas 4.2 -devel +# - Python3 + +Modules="EXTRAE BASICANALYSIS gcc intel/2018.3 impi/2018.4 netcdf/4.4.1.1 hdf5/1.8.19 DIMEMAS/5.4.2-devel perl" diff --git a/script.sh b/script.sh deleted file mode 100755 index 4a97f5cb0e6b87abaec8fce898748b9f99e82296..0000000000000000000000000000000000000000 --- a/script.sh +++ /dev/null @@ -1,381 +0,0 @@ -#!/bin/bash - -main() -{ - -if [ $# -gt 0 ]; then - - echo "This script does not accept arguments, parameters need to be added to config.bash Aborting" - exit 1 - -fi - -#Get script directory -dir=$(pwd) - -echo -echo "Using the following configuration:" -echo -source "$dir"/config.bash -grep -o '^[^#]*' config.bash -echo - -Init -Test_arguments -Test_Comp -Create_metrics - -} - - -Job_completed() -{ - if [ "$Jobs_scheduler" == "slurm" ]; then - local id1=${1##* } - sleep 5 - if ! scontrol show job $id1 | grep -q 'JobState=COMPLETED'; then - Completed=false - else - Completed=true - fi - - else - local id1=${head -n1 1 | cut -d'<' -f2 | cut -d'>' -f1} - sleep 5 - if ! bjobs -l $id | grep -q 'Status '; then - Completed=false - else - Completed=true - fi - fi - -} - -Init() -{ - #Init variables with default values in case of missing - Nemo_path="${Nemo_path:-"."}" - Nemo_cores="${Nemo_cores:-( 48 )}" - Jobs_n_cores="${Jobs_n_cores:-48}" - Jobs_scheduler="${Jobs_scheduler:-"slurm"}" - time="${Jobs_time:-"0"}" - queue="${Jobs_queue:-""}" - compile="${Compilation_compile:-"false"}" - cfg="${Compilation_ref:-"ORCA2_ICE_PISCES"}" - arch="${Compilation_arch:-"X64_MN4"}" - name_cfg="${Compilation_name:-"ORCA2_EXTRAE"}" - comp_cfg="${Compilation_sub:-""}" - Modules="${Modules:-""}" -} - - -# Checks if the paths given are correct. - -Test_arguments() -{ - # Nemo path correct? - if ! test -d "${Nemo_path}"; then - echo "Nemo relative path: ${Nemo_path} is not found" - echo - exit 1 - fi - #Nemo_cores is array? - if [[ ! "$(declare -p Nemo_cores)" =~ "declare -a" ]]; then - echo "Error Nemo_cores has to be a bash array like ( 4 24 48 )" - echo - exit 1 - fi - # cfg exists? - if ! test -d "${Nemo_path}/cfgs/${cfg}"; then - echo "configuration: ${cfg} doesn't exists in ${Nemo_path}/cfgs dir" - echo - exit 1 - fi - # arch exists? - if ! test -f "${Nemo_path}/arch/arch-${arch}.fcm"; then - echo "architecture: arch-${arch}.fcm doesn't exists in ${Nemo_path}/arch dir" - echo - exit 1 - fi - # scheduler correct? - if [ "$Jobs_scheduler" != "slurm" ] && [ "$Jobs_scheduler" != "lsf" ]; then - echo "$Jobs_scheduler is not a valid scheduler" - echo - exit 1 - fi - #Nemo_cores is array? - if [[ ! 
"$(declare -p Nemo_cores)" =~ "declare -a" ]]; then - echo "Error, variable Nemo_cores has to be an array" - echo - exit 1 - fi - #Modules available - if ! module load $Modules;then - echo "Error loading modules aborting" - echo - exit 1 - fi - echo - #$EXTRAE_HOME loaded ? - if ! test -d "${EXTRAE_HOME}"; then - echo "Extrae relative path: ${EXTRAE_HOME} is not found" - echo - exit 1 - else - sed -i 's|home=.*|home="'"$EXTRAE_HOME"'"|g' extrae.xml - fi - - # Adding -d to variable if not empty - if [ -n "$comp_cfg" ]; then - comp_cfg="-d $comp_cfg" - fi - - # Creating auxiliar vars for submiting jobs - if [ "$Jobs_scheduler" == "slurm" ]; then - job="sbatch" - else - job="bsub" - fi - - -} - - - -Test_Comp() -{ - - # Checking if compilation is needed - - if [ "$compile" == true ]; then - echo 'compile parameter is inicialized true' - fi - - # Get text lines corresponding to the flags - line=$(sed -n '/^%FCFLAGS /p' "$Nemo_path"/arch/arch-"${arch}".fcm) - line2=$(sed -n '/^%FPPFLAGS /p' "$Nemo_path"/arch/arch-"${arch}".fcm) - line3=$(sed -n '/^%LDFLAGS /p' "$Nemo_path"/arch/arch-"${arch}".fcm) - - # If -g is not there, recompilation is requiered and -g added - if ! echo "${line}"|grep -q "\-g\b"; then - echo "-g flag not found in arch-${arch}.fcm: editing arch-${arch}.fcm " - sed -i '/^%FCFLAGS/ s/$/ -g /' "${Nemo_path}"/arch/arch-"${arch}".fcm - compile=true - fi - - # If finstrument-functions is not there recompilation is requiered and -finstrument-functions added - if ! echo "${line}"|grep -q "\-finstrument-functions\b"; then - echo "-finstrument-functions flag not found in arch-${arch}.fcm: editing arch-${arch}.fcm " - sed -i '/^%FCFLAGS/ s/$/ -finstrument-functions/' "${Nemo_path}"/arch/arch-"${arch}".fcm - compile=true - fi - - # If nemo executable is not on the run file compile - - if ! test -f "${Nemo_path}/cfgs/${name_cfg}/EXP00/nemo"; then - echo "nemo executable not found in cfg" - compile=true - fi - - # If -pg is not there recompilation is requiered and -pg added - - if ! echo "${line}"|grep -q "\-pg\b"; then - echo "-pg flag not found in FCFLAGS arch-${arch}.fcm: editing arch-${arch}.fcm " - sed -i '/^%FCFLAGS/ s/$/ -pg/' "${Nemo_path}"/arch/arch-"${arch}".fcm - compile=true - fi - - if ! echo "${line2}"|grep -q "\-pg\b"; then - echo "-pg flag not found in FPPFLAGS arch-${arch}.fcm : editing arch-${arch}.fcm " - sed -i '/^%FPPFLAGS/ s/$/ -pg/' "${Nemo_path}"/arch/arch-"${arch}".fcm - compile=true - fi - - if ! echo "${line3}"|grep -q "\-pg\b"; then - echo "-pg flag not found in LDFLAGS arch-${arch}.fcm: editing arch-${arch}.fcm " - sed -i '/^%LDFLAGS/ s/$/ -pg/' "${Nemo_path}"/arch/arch-"${arch}".fcm - compile=true - fi - - - # If -rdynamic is not there recompilation is requiered and -rdynamic added - - if ! echo "${line}"|grep -q "\-rdynamic\b"; then - echo "-rdynamic flag not found in FCFLAGS arch-${arch}.fcm: editing arch-${arch}.fcm " - sed -i '/^%FCFLAGS/ s/$/ -rdynamic/' "${Nemo_path}"/arch/arch-"${arch}".fcm - compile=true - fi - - if ! 
echo "${line3}"|grep -q "\-export-dynamic\b"; then - echo "-export-dynamic flag not found in LDFLAGS arch-${arch}.fcm: editing arch-${arch}.fcm " - sed -i '/^%LDFLAGS/ s/$/ -export-dynamic/' "${Nemo_path}"/arch/arch-"${arch}".fcm - compile=true - fi - - - - #Compile the program if needed - - if [ $compile == true ]; then - echo "Compiling Nemo, expected duration 35m" - echo "Output of the compilation in compile.err and compile.out" - - printf -v workload1 "cd ${Nemo_path}\n./makenemo -r ${cfg} -n ${name_cfg} -m ${arch} -j$Jobs_n_cores $comp_cfg" - python3 Job_Creator.py -f "compile" -j "compile" --set-core "${Jobs_n_cores}" -s "$Jobs_scheduler" --set-time "$time" --set-queue "$queue" -w "${workload1}" - - state1=$("$job" --wait compile."$Jobs_scheduler") - echo - Job_completed "$state1" - if [ $Completed == false ]; then - echo "Nemo compilation failed, remember to load all the needed modules. Check the details in compile.err" - echo - exit 1 - else - echo "Nemo compilation successful" - echo - fi - - else - echo "Compilation not needed" - echo - fi - - - #Rename the namelist_cfg if exists in order to not overwrite it - if test -f "namelist_cfg"; then - mv namelist_cfg namelist_cfg_old - cd "$dir" || echo "Error original dir doesn't exist" exit - cp "${Nemo_path}"/cfgs/"${name_cfg}"/EXP00/* . - rm namelist_cfg - mv namelist_cfg_old namelist_cfg - else - cd "$dir" || echo "Error original dir doesn't exist" exit - cp "${Nemo_path}"/cfgs/"${name_cfg}"/EXP00/* . - fi - - if [[ $comp_cfg == "-d OCE del_key 'key_si3 key_top'" ]]; then - sed -i '/_def_nemo-ice.xml\|def_nemo-pisces.xml/d' context_nemo.xml #DELETE ICE AND PISCES CONTEXT (NOT USED) - fi - - #Solving NEMO 4.2 Errors - sed -i 's|ln_zdfiwm * = .true.|ln_zdfiwm = .false.|g' namelist_cfg #CHANGE DUE TO NON EXISTING FILES - if test -f "weights_core_orca2_bicubic_noc.nc"; then - mv weights_core_orca2_bicubic_noc.nc weights_core2_orca2_bicub.nc #RENAME WRONG NAMED FILES - fi - if test -f "weights_core_orca2_bilinear_noc.nc"; then - mv weights_core_orca2_bilinear_noc.nc weights_core2_orca2_bilin.nc #RENAME WRONG NAMED FILES - fi - -} - -Create_metrics() -{ - - - #Changing iterations, big traces generate problems. - sed -i 's|nn_itend * =.*|nn_itend = 12 ! last time step (std 5475)|g' namelist_cfg - - #Generating function list in case of missing - - if ! test -f "extrae_functions_for_xml.txt"; then - - rm gmon* 2> /dev/null - echo "Runing Nemo with 2 cores to obtain function data..." - echo - python3 Job_Creator.py -f "run" -j "run" --set-core 4 -s "$Jobs_scheduler" --set-time "$time" --set-queue "$queue" -w "mpirun -np 2 ./nemo" - - state2=$("$job" --wait run."$Jobs_scheduler") - Job_completed "$state2" - if [ $Completed == false ]; then - echo "Nemo execution failed no gprof files generated look at run.err for more info" - echo - exit 1 - else - echo "Gprof files generated " - echo - fi - echo "Gthrottling functions ..." 
- echo - python3 Job_Creator.py -f "gthrottling" -j "gthrottling" --set-core 4 -s "$Jobs_scheduler" --set-time "$time" --set-queue "$queue" -w "./gthrottling.sh nemo" - state3=$("$job" --wait gthrottling."$Jobs_scheduler") - Job_completed "$state3" - if [ $Completed == false ]; then - echo "Error listing functions, look at gthrottling.err for more info" - echo - exit 1 - else - echo "Functions listed correctly" - echo - fi - - ./extraf.sh nemo extrae_functions.txt - - else - echo "Functions already listed, file extrae_functions_for_xml.txt does exist" - echo - fi - - sed -i "s|list=.*|list=\"${dir}/extrae_functions_for_xml.txt\" exclude-automatic-functions=\"yes\">|g" extrae.xml - - # Run nemo with extrae - - for core in "${Nemo_cores[@]}" - - do - - echo "Creating trace with $core cores..." - echo - python3 Job_Creator.py -f "run_extrae" -j "run_extrae" --set-core "$core" -s "$Jobs_scheduler" --set-time "$time" --set-queue "$queue" -w "mpirun -np $core ./trace.sh ./nemo" - - state4=$("$job" --wait run_extrae."$Jobs_scheduler") - Job_completed "$state4" - if [ $Completed == false ]; then - echo "Nemo execution failed no traces files generated more info inside run_extrae.err" - echo - exit 1 - fi - mv nemo.prv nemo_"$core".prv - mv nemo.pcf nemo_"$core".pcf - mv nemo.row nemo_"$core".row - echo "Cutting best iteration" - echo - magiccut/./magicCut nemo_"${core}".prv 12 > cut_"$core".out 2>&1 - if ! ls nemo_"$core".best_cut.prv; then - echo "Cut failed, aborting." - echo - exit 1 - fi - echo - # Creating folder - if ! test -d "Metrics"; then - mkdir Metrics - fi - - cp nemo_"$core".best_cut.* Metrics - done - - # Create performance metrics - - - echo "Creating metrics and storing theme in Metrics folder" - echo - python3 Job_Creator.py -f "analysis" -j "analysis" --set-core "${Jobs_n_cores}" -s "$Jobs_scheduler" --set-time "$time" --set-queue "$queue" -w ".././modelfactors.py *" - mv analysis."$Jobs_scheduler" Metrics - cd Metrics||(echo "Error Metrics folder doesn't exists"; exit 1) - state5=$("$job" --wait analysis."$Jobs_scheduler") - Job_completed "$state5" - if [ $Completed == false ]; then - echo "Error, metrics have not generated check Metrics/analysis.err to get more details" - echo - exit 1 - - fi - - echo "------------------------------------------------------------------------------" - echo "------------------------- Script Completed -----------------------------------" - echo "----------------------- Data in Metrics folder -------------------------------" - echo "------------------------------------------------------------------------------" - echo "------------------------------------------------------------------------------" - echo -} -main "$@"; exit diff --git a/Job_Creator.py b/src/Job_Creator.py similarity index 76% rename from Job_Creator.py rename to src/Job_Creator.py index 15330c8a56aa2f52607069c340a84bf719947006..94180fd2b4632166050c7cfcf5d957158de80939 100644 --- a/Job_Creator.py +++ b/src/Job_Creator.py @@ -22,7 +22,7 @@ def get_command_line_arguments(): help="Set slurm time, in minutes") parser.add_argument("--set-core", default=1, type=int, help="Set number of cores to be used for the job") - parser.add_argument("--set-core-per-node", default=48, type=int, + parser.add_argument("--set-core-per-node", default=0, type=int, help="Set number of cores to be used for the job") parser.add_argument("-j", "--job-name", default=None, help="Name of the job you want to create or modify") @@ -65,9 +65,9 @@ def create_job_slurm(args): if cores is not None: file.append("#SBATCH 
--ntasks " + str(cores) + "\n") - if cores_per_node is not None: + if cores_per_node is not None and cores_per_node != 0: file.append("#SBATCH --ntasks-per-node " + str(cores_per_node) + "\n") - if time is not None and not 0: + if time is not None and time!= 0: file.append("#SBATCH --time " + str(time) + "\n") if name is not None: file.append("#SBATCH -J " + name + "\n") @@ -100,7 +100,7 @@ def create_job_lsf(args): if cores is not None: file.append("#BSUB -n " + str(cores) + "\n") - if time is not None and not 0: + if time is not None and time!= 0: file.append("#BSUB -W " + str(time) + "\n") if name is not None: file.append("#BSUB-J " + name + "\n") @@ -117,6 +117,45 @@ def create_job_lsf(args): lsf_job.write(line) +def create_job_torque(args): + file_name = args.file_name + time = args.set_time + cores = args.set_core + name = args.job_name + cores_per_node = args.set_core_per_node + queue = args.set_queue + workload = args.set_workload + if cores_per_node is not None: + nodes = (cores//cores_per_node)+1 + + if time is not None: + hours = time // 60 + minutes = time % 60 + + file = ["#!/bin/bash \n", + "############################################################################### \n", + "#PBS -o "+str(name)+".out \n#PBS -e "+str(name)+".err \n" + "#PBS --constraint=perfparanoid \n"] + + if cores is not None: + file.append("#PBS -l nodes" + str(nodes) + ":ppn=" + str(cores) + "\n") + if time is not None and time != 0: + file.append("#PBS -l cput=" + str(hours) + ":"+str(minutes)+ ":00\n") + if name is not None: + file.append("#PBS -N " + name + "\n") + if queue is not None and not len(queue) == 0: + file.append("#PBS -q " + queue + "\n") + + if workload is not None: + file.append("\n") + for work in workload: + file.append(str(work) + "") + + with open(file_name, "w") as torque_job: + for line in file: + torque_job.write(line) + + def modify_job(args): file_name = args.file_name time = args.set_time @@ -148,6 +187,8 @@ if __name__ == "__main__": args.file_name = args.file_name if args.file_name.count(".slurm") else args.file_name + ".slurm" elif args.scheduler == "lsf": args.file_name = args.file_name if args.file_name.count(".lsf") else args.file_name + ".lsf" + elif args.scheduler == "torque": + args.file_name = args.file_name if args.file_name.count(".torque") else args.file_name + ".torque" if os.path.exists(str(args.file_name)): os.remove(str(args.file_name)) @@ -156,4 +197,5 @@ if __name__ == "__main__": create_job_slurm(args) elif args.scheduler == "lsf": create_job_lsf(args) - + elif args.scheduler == "torque": + create_job_torque(args) diff --git a/src/__pycache__/hybridmetrics.cpython-34.pyc b/src/__pycache__/hybridmetrics.cpython-34.pyc new file mode 100644 index 0000000000000000000000000000000000000000..f76b6cfc79bc957778ae9e53f58a837b70f5df2f Binary files /dev/null and b/src/__pycache__/hybridmetrics.cpython-34.pyc differ diff --git a/src/__pycache__/plots.cpython-34.pyc b/src/__pycache__/plots.cpython-34.pyc new file mode 100644 index 0000000000000000000000000000000000000000..abe82a5e12d4bd2a5e0240d1eef1f09ea9bb2d0d Binary files /dev/null and b/src/__pycache__/plots.cpython-34.pyc differ diff --git a/src/__pycache__/rawdata.cpython-34.pyc b/src/__pycache__/rawdata.cpython-34.pyc new file mode 100644 index 0000000000000000000000000000000000000000..a162884e93f99c5949fd3930dcc77b99f3481a10 Binary files /dev/null and b/src/__pycache__/rawdata.cpython-34.pyc differ diff --git a/src/__pycache__/simplemetrics.cpython-34.pyc b/src/__pycache__/simplemetrics.cpython-34.pyc new file 
mode 100644 index 0000000000000000000000000000000000000000..c3167444d4830e6ef0aa4b4d65305549e2f8301a Binary files /dev/null and b/src/__pycache__/simplemetrics.cpython-34.pyc differ diff --git a/src/__pycache__/tracemetadata.cpython-34.pyc b/src/__pycache__/tracemetadata.cpython-34.pyc new file mode 100644 index 0000000000000000000000000000000000000000..7fb3916000088f2ebeae698e667ba613cb113840 Binary files /dev/null and b/src/__pycache__/tracemetadata.cpython-34.pyc differ diff --git a/src/__pycache__/utils.cpython-34.pyc b/src/__pycache__/utils.cpython-34.pyc new file mode 100644 index 0000000000000000000000000000000000000000..fb7791e446442b12d07adc3d6c5081f6688e4a99 Binary files /dev/null and b/src/__pycache__/utils.cpython-34.pyc differ diff --git a/cfgs/.directory b/src/cfgs/.directory similarity index 100% rename from cfgs/.directory rename to src/cfgs/.directory diff --git a/cfgs/2dh_BurstEfficiency.cfg b/src/cfgs/2dh_BurstEfficiency.cfg similarity index 100% rename from cfgs/2dh_BurstEfficiency.cfg rename to src/cfgs/2dh_BurstEfficiency.cfg diff --git a/cfgs/barrier-syncr-time.cfg b/src/cfgs/barrier-syncr-time.cfg similarity index 100% rename from cfgs/barrier-syncr-time.cfg rename to src/cfgs/barrier-syncr-time.cfg diff --git a/cfgs/burst_duration.cfg b/src/cfgs/burst_duration.cfg similarity index 100% rename from cfgs/burst_duration.cfg rename to src/cfgs/burst_duration.cfg diff --git a/cfgs/burst_useful.cfg b/src/cfgs/burst_useful.cfg similarity index 100% rename from cfgs/burst_useful.cfg rename to src/cfgs/burst_useful.cfg diff --git a/cfgs/cycles.cfg b/src/cfgs/cycles.cfg similarity index 100% rename from cfgs/cycles.cfg rename to src/cfgs/cycles.cfg diff --git a/cfgs/dimemas.collectives b/src/cfgs/dimemas.collectives similarity index 100% rename from cfgs/dimemas.collectives rename to src/cfgs/dimemas.collectives diff --git a/cfgs/dimemas_ideal.cfg b/src/cfgs/dimemas_ideal.cfg similarity index 100% rename from cfgs/dimemas_ideal.cfg rename to src/cfgs/dimemas_ideal.cfg diff --git a/cfgs/efficiency_table-global.gp b/src/cfgs/efficiency_table-global.gp similarity index 100% rename from cfgs/efficiency_table-global.gp rename to src/cfgs/efficiency_table-global.gp diff --git a/cfgs/efficiency_table-hybrid.gp b/src/cfgs/efficiency_table-hybrid.gp similarity index 100% rename from cfgs/efficiency_table-hybrid.gp rename to src/cfgs/efficiency_table-hybrid.gp diff --git a/cfgs/efficiency_table.gp b/src/cfgs/efficiency_table.gp similarity index 100% rename from cfgs/efficiency_table.gp rename to src/cfgs/efficiency_table.gp diff --git a/cfgs/flushing-cycles.cfg b/src/cfgs/flushing-cycles.cfg similarity index 100% rename from cfgs/flushing-cycles.cfg rename to src/cfgs/flushing-cycles.cfg diff --git a/cfgs/flushing-inst.cfg b/src/cfgs/flushing-inst.cfg similarity index 100% rename from cfgs/flushing-inst.cfg rename to src/cfgs/flushing-inst.cfg diff --git a/cfgs/flushing.cfg b/src/cfgs/flushing.cfg similarity index 100% rename from cfgs/flushing.cfg rename to src/cfgs/flushing.cfg diff --git a/cfgs/instructions.cfg b/src/cfgs/instructions.cfg similarity index 100% rename from cfgs/instructions.cfg rename to src/cfgs/instructions.cfg diff --git a/cfgs/io-call-cycles.cfg b/src/cfgs/io-call-cycles.cfg similarity index 100% rename from cfgs/io-call-cycles.cfg rename to src/cfgs/io-call-cycles.cfg diff --git a/cfgs/io-call-instructions.cfg b/src/cfgs/io-call-instructions.cfg similarity index 100% rename from cfgs/io-call-instructions.cfg rename to src/cfgs/io-call-instructions.cfg diff 
--git a/cfgs/io-call-reverse.cfg b/src/cfgs/io-call-reverse.cfg similarity index 100% rename from cfgs/io-call-reverse.cfg rename to src/cfgs/io-call-reverse.cfg diff --git a/cfgs/modelfactors-all.gp b/src/cfgs/modelfactors-all.gp similarity index 100% rename from cfgs/modelfactors-all.gp rename to src/cfgs/modelfactors-all.gp diff --git a/cfgs/modelfactors-comm.gp b/src/cfgs/modelfactors-comm.gp similarity index 100% rename from cfgs/modelfactors-comm.gp rename to src/cfgs/modelfactors-comm.gp diff --git a/cfgs/modelfactors-hybrid.gp b/src/cfgs/modelfactors-hybrid.gp similarity index 100% rename from cfgs/modelfactors-hybrid.gp rename to src/cfgs/modelfactors-hybrid.gp diff --git a/cfgs/modelfactors-mpi-hybrid.gp b/src/cfgs/modelfactors-mpi-hybrid.gp similarity index 100% rename from cfgs/modelfactors-mpi-hybrid.gp rename to src/cfgs/modelfactors-mpi-hybrid.gp diff --git a/cfgs/modelfactors-onlydata.gp b/src/cfgs/modelfactors-onlydata.gp similarity index 100% rename from cfgs/modelfactors-onlydata.gp rename to src/cfgs/modelfactors-onlydata.gp diff --git a/cfgs/modelfactors-scale.gp b/src/cfgs/modelfactors-scale.gp similarity index 100% rename from cfgs/modelfactors-scale.gp rename to src/cfgs/modelfactors-scale.gp diff --git a/cfgs/mpi-call-outside.cfg b/src/cfgs/mpi-call-outside.cfg similarity index 100% rename from cfgs/mpi-call-outside.cfg rename to src/cfgs/mpi-call-outside.cfg diff --git a/cfgs/mpi-io-cycles.cfg b/src/cfgs/mpi-io-cycles.cfg similarity index 100% rename from cfgs/mpi-io-cycles.cfg rename to src/cfgs/mpi-io-cycles.cfg diff --git a/cfgs/mpi-io-instructions.cfg b/src/cfgs/mpi-io-instructions.cfg similarity index 100% rename from cfgs/mpi-io-instructions.cfg rename to src/cfgs/mpi-io-instructions.cfg diff --git a/cfgs/mpi-io-reverse.cfg b/src/cfgs/mpi-io-reverse.cfg similarity index 100% rename from cfgs/mpi-io-reverse.cfg rename to src/cfgs/mpi-io-reverse.cfg diff --git a/cfgs/mpi-io.cfg b/src/cfgs/mpi-io.cfg similarity index 100% rename from cfgs/mpi-io.cfg rename to src/cfgs/mpi-io.cfg diff --git a/cfgs/mpi-master-thread.cfg b/src/cfgs/mpi-master-thread.cfg similarity index 100% rename from cfgs/mpi-master-thread.cfg rename to src/cfgs/mpi-master-thread.cfg diff --git a/cfgs/runtime.cfg b/src/cfgs/runtime.cfg similarity index 100% rename from cfgs/runtime.cfg rename to src/cfgs/runtime.cfg diff --git a/cfgs/runtime_app.cfg b/src/cfgs/runtime_app.cfg similarity index 100% rename from cfgs/runtime_app.cfg rename to src/cfgs/runtime_app.cfg diff --git a/cfgs/time_computing.cfg b/src/cfgs/time_computing.cfg similarity index 100% rename from cfgs/time_computing.cfg rename to src/cfgs/time_computing.cfg diff --git a/cfgs/timings.cfg b/src/cfgs/timings.cfg similarity index 100% rename from cfgs/timings.cfg rename to src/cfgs/timings.cfg diff --git a/extrae.xml b/src/extrae.xml similarity index 100% rename from extrae.xml rename to src/extrae.xml diff --git a/extraf.sh b/src/extraf.sh similarity index 100% rename from extraf.sh rename to src/extraf.sh diff --git a/src/functions.bash b/src/functions.bash new file mode 100644 index 0000000000000000000000000000000000000000..fd3de97879ae390995b6746eafce93f2e44de893 --- /dev/null +++ b/src/functions.bash @@ -0,0 +1,597 @@ + + +# Functions + +#Checks if the job submission ended correctly. +Job_completed() +{ + if [ "$Jobs_scheduler" == "slurm" ]; then + local id1 + id1=${1##* } + until ! scontrol show job "$id1" | grep -q 'JobState=COMPLETING' + do + sleep 1 + done + if ! 
scontrol show job "$id1" | grep -q 'JobState=COMPLETED'; then + Completed=false + else + Completed=true + fi + + elif [ "$Jobs_scheduler" == "lsf" ]; then + local id2 + id2=$(head -n1 "$1" | cut -d'<' -f2 | cut -d'>' -f1) + if ! bjobs -l "$id2" | grep -q 'Status '; then + Completed=false + else + Completed=true + fi + elif [ "$Jobs_scheduler" == "torque" ]; then + local id3 + id3=$(head -n1 "$1" | awk '{ print $3 }') + if ! qstat f "$id3" | grep -q 'exit_status = 0'; then + Completed=false + else + Completed=true + fi + fi + + +} + +# Check if nemo is compiled for using extrae + +Compile_extrae() +{ + trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT + + #Get flag lines + + line=$(sed -n '/^%FCFLAGS /p' "$Nemo_path"/arch/arch-"${arch}".fcm) + line2=$(sed -n '/^%FPPFLAGS /p' "$Nemo_path"/arch/arch-"${arch}".fcm) + line3=$(sed -n '/^%LDFLAGS /p' "$Nemo_path"/arch/arch-"${arch}".fcm) + + + # If -rdynamic is not there recompilation is requiered and -rdynamic added + + if ! echo "${line}"|grep -q "\-rdynamic\b"; then + echo "-rdynamic flag not found in FCFLAGS arch-${arch}.fcm: editing arch-${arch}.fcm " + sed -i '/^%FCFLAGS/ s/$/ -rdynamic/' "${Nemo_path}"/arch/arch-"${arch}".fcm + compile_ext=true + fi + + if ! echo "${line3}"|grep -q "\-export-dynamic\b"; then + echo "-export-dynamic flag not found in LDFLAGS arch-${arch}.fcm: editing arch-${arch}.fcm " + sed -i '/^%LDFLAGS/ s/$/ -export-dynamic/' "${Nemo_path}"/arch/arch-"${arch}".fcm + compile_ext=true + fi + + # If finstrument-functions is not there recompilation is requiered and -finstrument-functions added + if ! echo "${line}"|grep -q "\-finstrument-functions\b"; then + echo "-finstrument-functions flag not found in arch-${arch}.fcm: editing arch-${arch}.fcm " + sed -i '/^%FCFLAGS/ s/$/ -finstrument-functions/' "${Nemo_path}"/arch/arch-"${arch}".fcm + compile_ext=true + fi + + # If -g is not there, recompilation is requiered and -g added + if ! echo "${line}"|grep -q "\-g\b"; then + echo "-g flag not found in arch-${arch}.fcm: editing arch-${arch}.fcm " + sed -i '/^%FCFLAGS/ s/$/ -g /' "${Nemo_path}"/arch/arch-"${arch}".fcm + compile_ext=true + fi + + + if [ "$compile" == true ]; then + echo 'compile parameter is inicialized true' + fi + + # If -pg is there recompilation is requiered and -pg removed + + sed -i 's/-pg//g' "${Nemo_path}"/arch/arch-"${arch}".fcm + + if echo "${line}"|grep -q "\-pg\b"; then + echo "-pg flag found in FCFLAGS arch-${arch}.fcm: editing arch-${arch}.fcm " + compile_ext=true + fi + + if echo "${line2}"|grep -q "\-pg\b"; then + echo "-pg flag found in FPPFLAGS arch-${arch}.fcm : editing arch-${arch}.fcm " + compile_ext=true + fi + + if echo "${line3}"|grep -q "\-pg\b"; then + echo "-pg flag found in LDFLAGS arch-${arch}.fcm: editing arch-${arch}.fcm " + compile_ext=true + fi + + # If nemo executable is not on the run file compile + + if ! test -f "${Nemo_path}/cfgs/${name_cfg}/EXP00/nemo"; then + echo "nemo executable not found in ${name_cfg}" + compile_ext=true + fi + + + if [ "$compile" == true ] || [ "$compile_ext" == true ]; then + + echo "Compiling Nemo for EXTRAE" + + + printf -v workload1 "cd ${Nemo_path}; ./makenemo -r ${cfg} -n ${name_cfg} -m ${arch} -j$Jobs_n_cores $comp_cfg; cd ${Run_path}" + python3 "$dir"/src/./Job_Creator.py -f "compile_extrae" -j "compile_extrae" --set-core "${Jobs_n_cores}" --set-core-per-node "$Jobs_cores_per_node" -s "$Jobs_scheduler" --set-time "$time" --set-queue "$queue" -w "${workload1}" + + + if ! 
state1=$("$job" "$wait" compile_extrae."$Jobs_scheduler"); then + exit 1: + fi + + echo + Job_completed "$state1" + mv compile_extrae.* "${logs_path}" + if [ $Completed == false ]; then + echo "Nemo compilation failed, remember to load all the needed modules. Check the details in ${logs_path}/compile_extrae.err" + echo + exit 1 + else + echo "Nemo Extrae compilation successful" + echo + + fi + + + else + echo "Compilation not needed" + echo + fi + + + +} + +#Check if Nemo is compiled for using Gprof + +Compile_gprof() +{ + trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT + + if [ "$compile" == true ]; then + echo 'compile parameter is inicialized true' + fi + + # Checking if Gprof_arch file is present + if ! test -f "${Nemo_path}/arch/arch-${arch}_GPROF.fcm"; then + cp "${Nemo_path}"/arch/arch-"${arch}".fcm "${Nemo_path}"/arch/arch-"${arch}"_GPROF.fcm + fi + + # Get text lines corresponding to the flags + line=$(sed -n '/^%FCFLAGS /p' "$Nemo_path"/arch/arch-"${arch}"_GPROF.fcm) + line2=$(sed -n '/^%FPPFLAGS /p' "$Nemo_path"/arch/arch-"${arch}"_GPROF.fcm) + line3=$(sed -n '/^%LDFLAGS /p' "$Nemo_path"/arch/arch-"${arch}"_GPROF.fcm) + + # If -g is not there, recompilation is requiered and -g added + if ! echo "${line}"|grep -q "\-g\b"; then + echo "-g flag not found in arch-${arch}_GPROF.fcm: editing arch-${arch}_GPROF.fcm " + sed -i '/^%FCFLAGS/ s/$/ -g /' "${Nemo_path}"/arch/arch-"${arch}"_GPROF.fcm + comp_gprof=true + fi + + # If -pg is not there recompilation is requiered and -pg added + + if ! echo "${line}"|grep -q "\-pg\b"; then + echo "-pg flag not found in FCFLAGS arch-${arch}_GPROF.fcm: editing arch-${arch}_GPROF.fcm " + sed -i '/^%FCFLAGS/ s/$/ -pg/' "${Nemo_path}"/arch/arch-"${arch}"_GPROF.fcm + comp_gprof=true + fi + + if ! echo "${line2}"|grep -q "\-pg\b"; then + echo "-pg flag not found in FPPFLAGS arch-${arch}_GPROF.fcm : editing arch-${arch}_GPROF.fcm " + sed -i '/^%FPPFLAGS/ s/$/ -pg/' "${Nemo_path}"/arch/arch-"${arch}"_GPROF.fcm + comp_gprof=true + fi + + if ! echo "${line3}"|grep -q "\-pg\b"; then + echo "-pg flag not found in LDFLAGS arch-${arch}_GPROF.fcm: editing arch-${arch}_GPROF.fcm " + sed -i '/^%LDFLAGS/ s/$/ -pg/' "${Nemo_path}"/arch/arch-"${arch}"_GPROF.fcm + comp_gprof=true + fi + + # If nemo executable is not on the run file compile + + if ! test -f "${Nemo_path}/cfgs/${name_cfg}_GPROF/EXP00/nemo"; then + echo "nemo executable not found in ${name_cfg}_GPROF" + comp_gprof=true + fi + + if [ "$compile" == true ] || [ "$comp_gprof" == true ]; then + echo "Compiling Nemo for GPROF" + printf -v workload1 "cd ${Nemo_path}; ./makenemo -r ${cfg} -n ${name_cfg}_GPROF -m ${arch}_GPROF -j$Jobs_n_cores $comp_cfg; cd ${Gprof_path};" + python3 "$dir"/src/./Job_Creator.py -f "compile_gprof" -j "compile_gprof" --set-core "${Jobs_n_cores}" --set-core-per-node "$Jobs_cores_per_node" -s "$Jobs_scheduler" --set-time "$time" --set-queue "$queue" -w "${workload1}" + + if ! state1=$("$job" "$wait" compile_gprof."$Jobs_scheduler"); then + exit 1 + fi + + echo + Job_completed "$state1" + mv compile_gprof.* "${logs_path}" + if [ "$Completed" == false ]; then + echo "Nemo compilation failed, remember to load all the needed modules. Check the details in ${logs_path}/compile_gprof.err" + echo + exit 1 + else + echo "Nemo Gprof compilation successful" + echo + + fi + + + else + echo "Compilation not needed" + echo + fi + + #Copy all the EXP00 data in Gprof folder but don't overwrite the namelist, just the executable. 
+ + cp -n "${Nemo_path}"/cfgs/"${name_cfg}"_GPROF/EXP00/* "${Gprof_path}" + cp "${Nemo_path}"/cfgs/"${name_cfg}"_GPROF/EXP00/nemo "${Gprof_path}" + + + if [[ $comp_cfg == "-d OCE del_key 'key_si3 key_top'" ]]; then + sed -i '/_def_nemo-ice.xml\|def_nemo-pisces.xml/d' "${Gprof_path}"/context_nemo.xml #DELETE ICE AND PISCES CONTEXT (NOT USED) + fi + + sed -i '/ /dev/null + echo "Running Nemo with 2 cores to obtain function data..." + echo + python3 "$dir"/src/./Job_Creator.py -f "run" -j "run" --set-core 4 --set-core-per-node "$Jobs_cores_per_node" -s "$Jobs_scheduler" --set-time "$time" --set-queue "$queue" -w "mpirun -np 2 ./nemo" + + if ! state2=$("$job" "$wait" run."$Jobs_scheduler"); then + exit 1 + fi + Job_completed "$state2" + mv run.* "${logs_path}" + if [ $Completed == false ]; then + echo "Nemo execution failed, look at ${logs_path}/run.err and ${Gprof_path}/ocean.output for more info" + echo "Remember that the namelist files copied are the default ones, change them to match the input files in the dir " + echo + exit 1 + else + echo "Gprof files generated " + echo + fi + + + echo "Gthrottling functions ..." + echo + python3 "$dir"/src/./Job_Creator.py -f "gthrottling" -j "gthrottling" --set-core 4 --set-core-per-node "$Jobs_cores_per_node" -s "$Jobs_scheduler" --set-time "$time" --set-queue "$queue" -w "$dir/src/./gthrottling.sh nemo" + if ! state3=$("$job" "$wait" gthrottling."$Jobs_scheduler"); then + exit 1 + fi + Job_completed "$state3" + mv gthrottling.* "${logs_path}" + if [ $Completed == false ]; then + echo "Error listing functions, look at ${logs_path}/gthrottling.err for more info" + echo + exit 1 + else + mv extrae_functions.txt extrae_functions_"${name_cfg}".txt + echo "Functions listed correctly" + echo + + fi + + else + echo "Functions already listed, file ${Gprof_path}/extrae_functions_${name_cfg}.txt exists." + echo + + fi +} + + +#Gets a trace from Nemo and cuts it to obtain a single timestep + +Get_trace() +{ + trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT + Compile_extrae + + wait + "$dir"/src/./extraf.sh "${Nemo_path}"/cfgs/"${name_cfg}"/EXP00/nemo "${Gprof_path}"/extrae_functions_"${name_cfg}".txt > /dev/null + sed -i "s|list=.*|list=\"${Run_path}/extrae_functions_for_xml.txt\" exclude-automatic-functions=\"yes\">|g" "$dir"/src/extrae.xml + + # Change iterations + sed -i "s|nn_itend * =.*|nn_itend = $Nemo_iterations ! last time step (std 5475)|g" "${Nemo_path}"/cfgs/"${name_cfg}"/EXP00/namelist_cfg + if [[ $comp_cfg == "-d OCE del_key 'key_si3 key_top'" ]]; then + sed -i '/_def_nemo-ice.xml\|def_nemo-pisces.xml/d' "${Nemo_path}"/cfgs/"${name_cfg}"/EXP00/context_nemo.xml #DELETE ICE AND PISCES CONTEXT (NOT USED) + fi + sed -i '/ "${logs_path}"/cut_"$core".out 2>&1 + + if ! ls nemo_"$core".best_cut.prv; then + echo "Cut failed, look at ${logs_path}/cut_$core.out for more info." + kill 0 + exit 1 + fi + echo + cp nemo_"$core".best_cut.* "${Metrics_path}" + cd "$Run_path" + )& + + done + + wait +} + +Create_metrics() +{ + + # Create performance metrics + echo "Creating metrics and storing them in ${Metrics_path} folder" + echo + python3 "$dir"/src/./Job_Creator.py -f "analysis" -j "analysis" --set-core "${Jobs_n_cores}" --set-core-per-node "$Jobs_cores_per_node" -s "$Jobs_scheduler" --set-time "$time" --set-queue "$queue" -w "$dir/src/./modelfactors.py -ms 100000 *" + + + if !
state5=$("$job" "$wait" analysis."$Jobs_scheduler"); then + exit 1 + fi + + Job_completed "$state5" + cp analysis.out overview.txt + mv analysis.* "${logs_path}" + if [ $Completed == false ]; then + echo "Error, metrics have not generated check ${logs_path}/analysis.err to get more details" + echo + exit 1 + + fi + + # Removing run folders + if [ $Clean == true ]; then + rm -r -f "$Run_path" + mv "${Gprof_path}"/extrae_functions* "$dir" + rm -r -f "$Gprof_path"/* + mv "${dir}"/extrae_functions* "${Gprof_path}" + fi + + echo "------------------------------------------------------------------------------" + echo "------------------------- Script Completed -----------------------------------" + echo "--- Data in ${Metrics_path} folder ---" + echo "------------------------------------------------------------------------------" + echo "------------------------------------------------------------------------------" + echo +} diff --git a/gthrottling.sh b/src/gthrottling.sh similarity index 61% rename from gthrottling.sh rename to src/gthrottling.sh index 27c4f503350f4b036d7b7095bfd46d4f047b5065..e91d785dd1a72a0a75921b131d2af3f2c734957a 100755 --- a/gthrottling.sh +++ b/src/gthrottling.sh @@ -3,27 +3,26 @@ # Usage: ./extract_gprof.sh path/to/executable/executable # Output file: extrae_functions.txt -rm gprof_functions -rm suspected_functions_names_only -rm suspected_functions -rm extrae_functions.txt # nm tool lists the symbols from objects and we select the ones with type T|t which the type is in the text section nm $1 | grep -i " T " | awk '{print $3}' > function_names.txt echo "See the function names from the binary in the file function_names.txt" dir=$(dirname $1) -for i in `ls $dir/gmon*`; -do +analyze_gmon() + + { + local n=$(echo "$2" | sed 's/[^0-9]*//g') + local i=$2 echo -e "Analyzing "$i"\n" - gprof $1 $i >gprof_temp + gprof $1 $i >gprof_temp_"$n" #We extract from each gprof file only the part about the functions, number of calls and durations, the call-paths ar enot needed - cat gprof_temp | grep -v ":" | awk 'BEGIN{k=0}{if($1=="%") {k=k+1};if(k>0 && k<2 && $1==$1+0) print $0}' > temp + cat gprof_temp_"$n" | grep -v ":" | awk 'BEGIN{k=0}{if($1=="%") {k=k+1};if(k>0 && k<2 && $1==$1+0) print $0}' > temp_"$n" #We save the name of the functions - cat temp | awk '{if($7~/^$/) print $4;else print $7}' > gprof_functions + cat temp_"$n" | awk '{if($7~/^$/) print $4;else print $7}' > gprof_functions_"$n" #From the initial list we save only the ones that gprof files include - cat function_names.txt | grep -w -f gprof_functions > extrae_new_list + cat function_names.txt | grep -w -f gprof_functions_"$n" > extrae_new_list_"$n" #We apply the throttling rule: # 1) If there is no information about each call of a function is suspected if the total duration is less than 0.1% of the total execution time @@ -31,24 +30,34 @@ do # 2.1) If its duration is less or equal to 5% ($1<=5, you can change them according to your application) of the total execution time # 2.2) If the duration of each call is less than 0.001s, then exclude it # 3) If the total execution time of this function is 0.0%, then remove it - cat temp | awk '{if($7~/^$/ && $1<0.1) print $4" "$1; else if(NF==7 && $4>10000 && (($1<=5 || $5<=0.001)) || $1==0.0) print $7" "$4}' >> suspected_functions - awk '{print $1}' suspected_functions >> suspected_functions_names_only + cat temp_"$n" | awk '{if($7~/^$/ && $1<0.1) print $4" "$1; else if(NF==7 && $4>10000 && (($1<=5 || $5<=0.001)) || $1==0.0) print $7" "$4}' >> suspected_functions_"$n" + awk 
'{print $1}' suspected_functions_"$n" >> suspected_functions_names_only_"$n" # Sort and remove any double functions from the list with suspected functions - cat suspected_functions_names_only | sort | uniq > temp_file - mv temp_file suspected_functions_names_only + cat suspected_functions_names_only_"$n" | sort | uniq > temp_file_"$n" + mv temp_file_"$n" suspected_functions_names_only_"$n" # Create a new fucntion list with the non suspected functions - cat extrae_new_list | grep -w -v -f suspected_functions_names_only >> extrae_functions.txt -done + cat extrae_new_list_"$n" | grep -w -v -f suspected_functions_names_only_"$n" >> extrae_functions.txt + + rm extrae_new_list_"$n" + rm suspected_functions_names_only_"$n" + rm suspected_functions_"$n" + rm gprof_temp_"$n" + rm temp_"$n" + rm gprof_functions_"$n" + +} +for i in `ls $dir/gmon*`; do analyze_gmon "$1" "$i" & done + +wait #Sort and uniw the useful functions because are called from many processors and can by include twice cat extrae_functions.txt | sort | uniq > temp2 mv temp2 extrae_functions.txt -rm temp echo -e "Input function list: "function_names.txt" "`wc -l function_names.txt | awk '{print $1}'`" functions" echo -e "New function list: extrae_functions.txt "`wc -l extrae_functions.txt | awk '{print $1}'`" functions" - +rm function_names.txt exit diff --git a/hybridmetrics.py b/src/hybridmetrics.py similarity index 100% rename from hybridmetrics.py rename to src/hybridmetrics.py diff --git a/src/magiccut b/src/magiccut new file mode 160000 index 0000000000000000000000000000000000000000..13b714f08a265150dc3b5cc05c5303fa0a650448 --- /dev/null +++ b/src/magiccut @@ -0,0 +1 @@ +Subproject commit 13b714f08a265150dc3b5cc05c5303fa0a650448 diff --git a/modelfactors.py b/src/modelfactors.py similarity index 100% rename from modelfactors.py rename to src/modelfactors.py diff --git a/plots.py b/src/plots.py similarity index 100% rename from plots.py rename to src/plots.py diff --git a/rawdata.py b/src/rawdata.py similarity index 100% rename from rawdata.py rename to src/rawdata.py diff --git a/simplemetrics.py b/src/simplemetrics.py similarity index 100% rename from simplemetrics.py rename to src/simplemetrics.py diff --git a/trace.sh b/src/trace.sh similarity index 69% rename from trace.sh rename to src/trace.sh index fa295aeeab18107eeddb888666afe68212285569..f81f6d535f2ae30316b1989898c063a45fc5a7b8 100755 --- a/trace.sh +++ b/src/trace.sh @@ -1,9 +1,10 @@ #!/bin/sh # Configure Extrae -export EXTRAE_CONFIG_FILE=./extrae.xml +export EXTRAE_CONFIG_FILE=./../src/extrae.xml # Load the tracing library (choose C/Fortran) #export LD_PRELOAD=${EXTRAE_HOME}/lib/libmpitrace.so +export EXTRAE_SKIP_AUTO_LIBRARY_INITIALIZE=1 export LD_PRELOAD=${EXTRAE_HOME}/lib/libmpitracecf.so # Run the program $* diff --git a/tracemetadata.py b/src/tracemetadata.py similarity index 100% rename from tracemetadata.py rename to src/tracemetadata.py diff --git a/utils.py b/src/utils.py similarity index 100% rename from utils.py rename to src/utils.py
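A few sketches to make the pieces above concrete. First, the submodule warning in the README: src/magiccut is fetched into the working tree by git, so the clone-then-init order matters. A minimal sketch of the intended workflow (the clone URL is a placeholder, not the real repository address):

```
# <repository-url> is hypothetical; substitute the address you cloned this repo from.
git clone <repository-url> nemo-modelfactors
cd nemo-modelfactors
git submodule update --init   # fetches src/magiccut (branch numpy_version) per .gitmodules
# Only after this step is it safe to move the content elsewhere.
```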
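The perf_metrics.config format rules (no spaces around =, spaces just inside array parentheses) are easiest to see filled in. A minimal sketch; every value below is a placeholder that depends on your machine:

```
Nemo_path="$HOME/NEMO"              # must contain the cfgs and arch dirs
Nemo_input_data="$HOME/ORCA2_INPUT" # placeholder input-data path
Nemo_cores=( 4 24 48 )              # note the spaces inside the parentheses
Jobs_n_cores=4
Jobs_cores_per_node=48              # placeholder, machine dependent
Jobs_scheduler="slurm"              # slurm, lsf or torque
Jobs_time=60
Compilation_arch="X64_MN4"          # resolves to arch/arch-X64_MN4.fcm
```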
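Every batch submission in the scripts goes through src/Job_Creator.py, which writes a scheduler-specific job file that is then submitted with sbatch, bsub or qsub. A usage sketch mirroring the calls in functions.bash (the workload string and queue name are examples):

```
# Writes analysis.slurm with the #SBATCH header lines plus the workload at the end.
python3 src/Job_Creator.py -f "analysis" -j "analysis" --set-core 4 \
        -s "slurm" --set-time 60 --set-queue "debug" -w "./modelfactors.py *"
sbatch --wait analysis.slurm
```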
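For the new torque backend, create_job_torque would emit a header roughly like the following for --set-core 8, --set-core-per-node 4, --set-time 90 and job name run (a sketch of the generated file, assuming the ceiling-division node count used above; the perfparanoid constraint line is copied from the code and is site specific):

```
#!/bin/bash
#PBS -o run.out
#PBS -e run.err
#PBS --constraint=perfparanoid
#PBS -l nodes=2:ppn=4
#PBS -l cput=1:30:00
#PBS -N run
```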
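The throttling rule in src/gthrottling.sh is easiest to follow on a concrete gprof line. Using the script's own awk filter, a routine called two million times at well under 0.001 s per call is marked as suspected and excluded from the Extrae function list (the routine name and numbers are made up):

```
# Fields: %time, cumulative s, self s, calls, self s/call, total s/call, name
echo "0.3 12.1 0.04 2000000 0.0000004 0.0000004 fake_routine" | \
awk '{if($7~/^$/ && $1<0.1) print $4" "$1; else if(NF==7 && $4>10000 && (($1<=5 || $5<=0.001)) || $1==0.0) print $7" "$4}'
# Prints "fake_routine 2000000": more than 10000 calls and <=0.001 s per call, so it is throttled.
```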
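Finally, src/trace.sh exists so that Extrae's MPI tracing library is preloaded into every NEMO rank: it points EXTRAE_CONFIG_FILE at src/extrae.xml and LD_PRELOADs libmpitracecf.so before handing control to the real binary. A sketch of the invocation pattern used when generating the traces (the core count is an example):

```
# Run from the working directory that sits next to src/, so ./../src/extrae.xml resolves.
mpirun -np 48 ./trace.sh ./nemo
```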