feature/new-status #442

Merged
merged 19 commits into from
Oct 9, 2023
36 changes: 36 additions & 0 deletions CHANGELOG.md
@@ -6,12 +6,48 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [unreleased]
### Fixed
- Cyclical imports and config imports that could easily cause CI issues

### Added
- New file `merlin/study/status.py` dedicated to work relating to the status command
  - Contains the Status and DetailedStatus classes
- New file `merlin/common/dumper.py` containing a Dumper object to help dump output to outfiles
- Study name and parameter info now stored in the DAG and MerlinStep objects
- Added functions to `merlin/display.py` that help display status information:
  - `display_status_summary` handles the display for the `merlin status` command
  - `display_progress_bar` generates and displays a progress bar
- Added new methods to the MerlinSpec class:
  - get_tasks_per_step()
- Added methods to the MerlinStepRecord class to mark status changes for tasks as they run (mostly follows Maestro's StepRecord format)
- Added methods to the Step class:
  - name_no_params()
- Added a property parameter_labels to the MerlinStudy class
- Added two new utility functions:
  - dict_deep_merge() that deep merges two dicts into one
  - ws_time_to_dt() that converts a workspace timestring (YYYYMMDD-HHMMSS) to a datetime object
- A new celery task `condense_status_files` to be called when sets of samples finish
- Added a celery config setting `worker_cancel_long_running_tasks_on_connection_loss` since this functionality is about to change in the next version of celery
- Tests for the Status class
  - This required adding a number of test files to help with the tests; these can be found under the tests/unit/study/status_test_files directory
- The *.conf regex for the recursive-include of the merlin server directory so that pip will add it to the wheel
- A note to the docs for how to fix an issue where the `merlin server start` command hangs
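
The two utility functions listed above could look roughly like the sketch below. This is an illustration only, not the code from this PR: the signatures, the in-place merge, and the rule that the second dict wins on a non-dict conflict are all assumptions. Only the function names and the YYYYMMDD-HHMMSS format come from the changelog.

```python
from datetime import datetime


def ws_time_to_dt(ws_time: str) -> datetime:
    """Convert a workspace timestring (YYYYMMDD-HHMMSS) to a datetime object."""
    return datetime.strptime(ws_time, "%Y%m%d-%H%M%S")


def dict_deep_merge(dict_a: dict, dict_b: dict) -> dict:
    """
    Recursively merge dict_b into dict_a (in place) and return dict_a.
    Assumption for this sketch: on a non-dict conflict, dict_b's value wins.
    """
    for key, value in dict_b.items():
        if isinstance(value, dict) and isinstance(dict_a.get(key), dict):
            # Both sides hold a dict for this key, so merge one level deeper
            dict_deep_merge(dict_a[key], value)
        else:
            dict_a[key] = value
    return dict_a


# The timestamp suffix of a workspace such as hello_20230228-111920/
print(ws_time_to_dt("20230228-111920"))

merged = dict_deep_merge({"step": {"a": 1}}, {"step": {"b": 2}, "other": 3})
print(merged)
```

A shallow `dict.update` would replace the whole nested `"step"` entry; the recursive version keeps keys from both sides.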

### Changed
- Reformatted the entire "merlin status" command
  - Now accepts both spec files and workspace directories as arguments
    - e.g. "merlin status hello.yaml" and "merlin status hello_20230228-111920/" both work
  - Removed the --steps flag
  - Replaced the --csv flag with the --dump flag
    - This will make it easier in the future to adopt more file types to dump to
  - New functionality:
    - Shows step-by-step progress bar for tasks
    - Displays a summary of task statuses below the progress bar
- Split the `add_chains_to_chord` function in `merlin/common/tasks.py` into two functions:
  - `get_1d_chain` which converts a 2D list of chains into a 1D list
  - `launch_chain` which launches the 1D chain
- Pulled the needs_merlin_expansion() method out of the Step class and made it a function instead
- Removed `tabulate_info` function; replaced with tabulate from the tabulate library
- Moved `verify_filepath` and `verify_dirpath` from `merlin/main.py` to `merlin/utils.py`
- Bump certifi from 2022.12.7 to 2023.7.22 in /docs
- Bump pygments from 2.13.0 to 2.15.0 in /docs
- Bump requests from 2.28.1 to 2.31.0 in /docs
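
The 2D-to-1D conversion described for `get_1d_chain` is, at its core, a list flattening. The sketch below shows only that idea with plain Python values. The helper name `flatten_chains` is hypothetical, and the real function in `merlin/common/tasks.py` works with celery chains and may differ in its details.

```python
from itertools import chain
from typing import Any, List


def flatten_chains(chains_2d: List[List[Any]]) -> List[Any]:
    """Flatten a 2D list into a 1D list, preserving the original order."""
    return list(chain.from_iterable(chains_2d))


# Strings stand in for task chains in this illustration
print(flatten_chains([["step_a", "step_b"], ["step_c"]]))
```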
151 changes: 151 additions & 0 deletions merlin/common/dumper.py
@@ -0,0 +1,151 @@
###############################################################################
# Copyright (c) 2022, Lawrence Livermore National Security, LLC.
# Produced at the Lawrence Livermore National Laboratory
# Written by the Merlin dev team, listed in the CONTRIBUTORS file.
# <merlin@llnl.gov>
#
# LLNL-CODE-797170
# All rights reserved.
# This file is part of Merlin, Version: 1.10.0
#
# For details, see https://github.com/LLNL/merlin.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
###############################################################################
"""This file is meant to help dump information to files"""

import csv
import json
import logging
import os
from typing import Dict, List


LOG = logging.getLogger(__name__)


# TODO When we add more public methods we can get rid of this pylint disable
class Dumper:  # pylint: disable=R0903
    """
    The dumper class is intended to help write information to files.
    Currently, the supported file types to dump to are csv and json.

    Example csv usage:
        dumper = Dumper("populations.csv")
        # Eugene, OR has a population of 175096
        # Livermore, CA has a population of 86803
        population_data = {
            "City": ["Eugene", "Livermore"],
            "State": ["OR", "CA"],
            "Population": [175096, 86803]
        }
        dumper.write(population_data, "w")
        |---> Output will be written to populations.csv

    Example json usage:
        dumper = Dumper("populations.json")
        population_data = {
            "OR": {"Eugene": 175096, "Portland": 641162},
            "CA": {"Livermore": 86803, "San Francisco": 815201}
        }
        dumper.write(population_data, "w")
        |---> Output will be written to populations.json
    """

    def __init__(self, file_name):
        """
        Initialize the class and ensure the file is of a supported type.
        :param `file_name`: The name of the file to dump to eventually
        """
        supported_types = ["csv", "json"]

        valid_file = False
        for stype in supported_types:
            if file_name.endswith(stype):
                valid_file = True
                self.file_type = stype

        if not valid_file:
            raise ValueError(f"Invalid file type for {file_name}. Supported file types are: {supported_types}.")

        self.file_name = file_name

    def write(self, info_to_write: Dict, fmode: str):
        """
        Write information to an outfile.
        :param `info_to_write`: The information you want to write to the output file
        :param `fmode`: The file write mode ("w", "a", etc.)
        """
        if self.file_type == "csv":
            self._csv_write(info_to_write, fmode)
        elif self.file_type == "json":
            self._json_write(info_to_write, fmode)

    def _csv_write(self, csv_to_dump: Dict[str, List], fmode: str):
        """
        Write information to a csv file.
        :param `csv_to_dump`: The information to write to the csv file.
                              Dict keys will be the column headers and values will be the column values.
        :param `fmode`: The file write mode ("w", "a", etc.)
        """
        # If we have statuses to write, create a csv writer object and write to the csv file
        with open(self.file_name, fmode) as outfile:
            csv_writer = csv.writer(outfile)
            if fmode == "w":
                csv_writer.writerow(csv_to_dump.keys())
            csv_writer.writerows(zip(*csv_to_dump.values()))

    def _json_write(self, json_to_dump: Dict[str, Dict], fmode: str):
        """
        Write information to a json file.
        :param `json_to_dump`: The information to write to the json file.
        :param `fmode`: The file write mode ("w", "a", etc.)
        """
        # Appending to json requires file mode to be r+ for json.load
        if fmode == "a":
            fmode = "r+"

        with open(self.file_name, fmode) as outfile:
            # If we're appending, read in the existing file data
            if fmode == "r+":
                file_data = json.load(outfile)
                json_to_dump.update(file_data)
                outfile.seek(0)
            # Write to the outfile
            json.dump(json_to_dump, outfile)


def dump_handler(dump_file: str, dump_info: Dict):
    """
    Help handle the process of creating a Dumper object and writing
    to an output file.

    :param `dump_file`: A filepath to the file we're dumping to
    :param `dump_info`: A dict of information that we'll be dumping to `dump_file`
    """
    # Create a dumper object to help us write to dump_file
    dumper = Dumper(dump_file)

    # Get the correct file write mode and log message
    fmode = "a" if os.path.exists(dump_file) else "w"
    write_type = "Writing" if fmode == "w" else "Appending"
    LOG.info(f"{write_type} to {dump_file}...")

    # Write the output
    dumper.write(dump_info, fmode)
    LOG.info(f"{write_type} complete.")
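
For readers following the diff, here is a standalone sketch of the two write patterns used in this file. It uses only the standard library and does not import Merlin. The `append_json` helper name, the in-memory buffer, and the temporary directory are assumptions made so the example can run on its own. The population data is taken from the class docstring.

```python
import csv
import io
import json
import os
import tempfile

population_data = {
    "City": ["Eugene", "Livermore"],
    "State": ["OR", "CA"],
    "Population": [175096, 86803],
}

# CSV: the dict keys become the header row, and zip(*values) transposes the
# column lists into one tuple per row.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(population_data.keys())
writer.writerows(zip(*population_data.values()))
csv_text = buffer.getvalue()
print(csv_text)


def append_json(path: str, new_data: dict) -> None:
    """Hypothetical helper mirroring the r+ / load / update / seek(0) / dump steps."""
    with open(path, "r+") as outfile:
        existing = json.load(outfile)
        # update() is applied to the new data, so existing keys win on a collision
        new_data.update(existing)
        outfile.seek(0)
        json.dump(new_data, outfile)


with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "populations.json")
    with open(json_path, "w") as outfile:
        json.dump({"OR": {"Eugene": 175096}}, outfile)
    append_json(json_path, {"CA": {"Livermore": 86803}})
    with open(json_path) as infile:
        merged = json.load(infile)

print(merged)
```

Two details stand out. JSON has no native append, so "appending" means reading the whole file, merging, and rewriting from offset 0. The rewrite does not truncate the file, which is safe here because the merged dict contains every key of the existing data and so is not expected to serialize shorter than what was already on disk.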