diff --git a/README.md b/README.md
index e4edea346fd3face14854d2753c3b57725496c69..a00c378d806ae83fd6d4884f06511ca45896d2ba 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # DestinE Analytics
 
-A comprehensive analytics toolkit for data analysis, visualization, and log processing, specifically designed for AutoSubmit (AS) workflow analysis and file system exploration.
+A comprehensive analytics toolkit for data analysis, visualization, and log processing, specifically designed for [AutoSubmit (AS)](https://autosubmit.readthedocs.io/) workflow analysis and file system exploration.
 
 ## Overview
 
@@ -55,17 +55,22 @@ DestinE Analytics provides powerful tools for:
    source .venv/bin/activate # On Windows: .venv\Scripts\activate
    ```
 
-3. **Install dependencies**:
+3. **Upgrade pip**
+   ```bash
+   pip install --upgrade pip
+   ```
+
+4. **Install dependencies**:
    ```bash
    pip install -r requirements.txt
    ```
 
-4. **Install the package**:
+5. **Install the package**:
    ```bash
    pip install -e .
    ```
 
-5. **Configure environment** (optional):
+6. **Configure environment** (optional):
    ```bash
    cp .env.example .env
    # Edit .env with your configuration
@@ -92,19 +97,25 @@ draw-filetree /path/to/directory --figure treemap --interactive
 draw-filetree tools/ls_filetree_Documents.txt --figure sunburst
 
 # Remote directory analysis
-draw-filetree /remote/path --ssh-command "ssh user@host" --figure icicle
+draw-filetree /path/to/remote/directory --ssh-command "ssh user@host" --figure icicle
 
 # Save output
 draw-filetree /path/to/directory --figure all --output figures/
 ```
 
 **Options**:
-- `--figure`: Chart type (`treemap`, `sunburst`, `icicle`, `all`)
-- `--interactive`: Show interactive plot
-- `--only-directories`: Show only directories (for large trees)
-- `--root`: Specify root directory for subtree analysis
-- `--cache`: Enable caching for performance
-- `--output`: Output directory for saved figures
+
+| Command | Description |
+|----------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `-h`, `--help` | Show help |
+| `-e`, `--ssh-command` | For remote path scoping: ssh command to execute in order to access to remote shell. |
+| `-E EXPID`, `--expid EXPID` | Expid. Only for title & filename purposes |
+| `-r ROOT`, `--root ROOT` | |
+| `-d`, `--only-directories` | Only show directories in the figure. Needed for excessively large file trees. |
+| `-f {treemap,sunburst,icicle,all,None}`, `--figure {treemap,sunburst,icicle,all,None}` | Figure type to generate. Must be one of the listed options. `all` is not allowed for interactive mode. |
+| `-i`, `--interactive` | Whether to interactively show the figure after plotting. |
+| `-c`, `--cache` | Read and write to `~/.cache`. Results are temporarily stored based on CLI arguments. |
+| `-o [OUTPUT]`, `--output [OUTPUT]` | Output directory (or full path) or file for saving figures. If no argument is passed, defaults to `~/Figures` . If flag is not used, figure is not saved. |
 
 #### 2. Gantt Chart Generation (`create-ganttchart`)
 
@@ -137,14 +148,31 @@ create-ganttchart o005 --output figures/ --dpi 300
 ```
 
 **Options**:
-- `-e, --ssh-command`: Trigger remote log transfer. Uses `.env` credentials by default, or a provided SSH command.
-- `--chunks`: Specify chunk range to analyze
-- `--ignore`: Ignore jobs with specific prefixes
-- `--color`: Color scheme (`job`, `chunk`, `status`)
-- `--starttype`: Use `started` or `submitted` timestamps
-- `--add-information`: Add extra info (`available`, `stats`)
-- `--durations`: Show job durations
-- `--pivot`: Organize by chunk in y-axis
+
+| Command | Description |
+|-------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `-h`, `--help` | Show help |
+| `-e`, `--ssh-command` | Trigger remote log transfer. If an SSH command is provided as an argument, it will be used. Otherwise, credentials from the `.env` file will be used. |
+| `--chunks START END` | First and last chunks to plot. |
+| `--ignore [IGNORE ...]` | Ignore jobs starting with these arguments. |
+| `--starttype {started,submitted}` | Whether the start of the bars is at AS 'started' or 'submitted' status. |
+| `--cmap CMAP` | |
+| `--interval INTERVAL` | Interval for time ticks in x axis. -1 (default) sets it to 'automatic'. Default: -1 |
+| `--pivot` | All bars are classified by chunk in the y axis, regardless of whether splits overlap. |
+| `--no-pivot` | Prevent bars from being classified by chunk in the y axis, even if splits overlap. |
+| `--by-split` | Gantt Chart entries are separated by split. |
+| `--by-member` | Gantt Chart entries are separated by member. |
+| `--durations` | Show the summed durations of all the jobs of each entry |
+| `--no-edge` | Removes the black edge from the bars in the Gantt chart. |
+| `--dpi DPI` | Dots per inch (DPI) for the saved figure. |
+| `--after AFTER` | Date after which jobs are read. Format: _YYYYMMDD_ or _YYYYMMDDhhmmss_ |
+| `--y-pos Y_POSITIONS [Y_POSITIONS ...]` | Relative positions of line info |
+| `--y-bar BAR_HEIGHT` | Height of the bars. Default: 0.4 |
+| `--y-factor HEIGHT_FACTOR` | Width/height factor for the plot. Default: 0.4 |
+| `-A {available,stats,None}`, `--add-information {available,stats,None}` | Extra info to add to the Gantt chart. Not available in interactive mode |
+| `-v`, `--verbose` | Enable verbose output, especially for remote operations. |
+| `-i`, `--interactive` | Show the Gantt Chart in an interactive window. |
+| `-o [OUTPUT]`, `--output [OUTPUT]` | Output directory (or full path) or file for saving figures. If no argument is passed, defaults to `~/Figures`. If flag is not used, figure is not saved. |
 
 ### Shell Tools
 
@@ -237,11 +265,10 @@ cp .env.example .env
 
 The `.env` file should contain the following variables:
 
-```bash
-# Log directory path
-LOG_DIR=/path/to/logs
+- Machine 1 corresponds to _ProxyJump_ machine.
+- Machine 2 corresponds to virtual machine
 
-# SSH configuration for remote access
+```bash
 file1=/path/to/ssh/key1
 file2=/path/to/ssh/key2
 user1=username1
@@ -291,7 +318,7 @@ destine_analytics/
 - **pytest**: Testing framework
 - **black**: Code formatting
 - **isort**: Import sorting
-- **flake8**: Code linting
+- **pylint**: Code linting
 
 ## Development
 
@@ -394,31 +421,3 @@ draw-filetree /remote/path \
 - Filter large datasets with `--chunks` or `--root`
 - Use `--only-directories` for very large file trees
 - Consider using remote analysis for large remote directories
-
-## Contributing
-
-1. Fork the repository
-2. Create a feature branch
-3. Make your changes
-4. Add tests for new functionality
-5. Ensure code passes linting and formatting
-6. Submit a pull request
-
-## License
-
-[Add your license information here]
-
-## Support
-
-For issues and questions:
-- Create an issue in the repository
-- Check the troubleshooting section
-- Review the examples for usage patterns
-
-## Roadmap
-
-- [ ] Add support for more visualization types
-- [ ] Implement real-time log monitoring
-- [ ] Add web-based dashboard
-- [ ] Support for more log formats
-- [ ] Enhanced caching and performance optimizations
diff --git a/pyproject.toml b/pyproject.toml
index 2e152c3fc174a9d7e08ac3a8a0f10cf1e45e289f..c4fe199b85163f4d9c901aff4fab67c0add28b0e 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -24,7 +24,7 @@ dev = [
     "pytest>=7.0.0",
     "black>=23.0.0",
     "isort>=5.0.0",
-    "flake8>=6.0.0",
+    "pylint>=3.0.0",
 ]
 
 [tool.black]