Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] MERLIN_SPEC variables are not working properly #423

Closed
dylancliche opened this issue May 18, 2023 · 11 comments · Fixed by #425
Closed

[BUG] MERLIN_SPEC variables are not working properly #423

dylancliche opened this issue May 18, 2023 · 11 comments · Fixed by #425
Labels
bug Something isn't working

Comments

@dylancliche
Copy link

dylancliche commented May 18, 2023

Bug Report

Describe the bug
$(MERLIN_SPEC_ORIGINAL_TEMPLATE), $(MERLIN_SPEC_EXECUTED_RUN), and $(MERLIN_SPEC_ARCHIVED_COPY) do not rename the specification's filename with extension .orig.yaml, .partial.yaml, or .expanded.yaml respectively. Instead, it is using the name: given in the description: block within the specification file.

To Reproduce
Steps to reproduce the behavior:
In any workflow change the name in the description block to not be the filename of the specification file and type echo <any of the Merlin variables given above> in any of the workflow's steps.

Expected behavior
In the output file for the step that contained the echo command, it should contain <spec filename>.<orig or partial or expanded>.yaml.

Please answer these questions to help us pinpoint the problem

  • Does the problem occur in merlin run --local mode, distributed or neither? Both
  • On what machines/architectures are you running merlin? Is this bug on a specific machine or can you reproduce it elsewhere? Haven't tried

Please run merlin info and paste the results here:

Merlin Configuration

config_file | /g/g15/cliche1/.merlin/app.yaml
is_debug | False
merlin_home | /g/g15/cliche1/.merlin
merlin_home_exists | True
broker server | amqps://cliche1:@rzrabbit.llnl.gov:5671/cliche1
broker ssl | True
results server | redis://mlsi:
@rzrabbit.llnl.gov:6379/0
results ssl | False

Checking server connections:

broker server connection: OK
results server connection: OK

@dylancliche dylancliche added the bug Something isn't working label May 18, 2023
@dylancliche
Copy link
Author

@bgunnar5, @lucpeterson, @koning: Any thoughts?

@bgunnar5
Copy link
Member

Sorry for the delayed reply, I've been on vacation since last Wednesday. I just had a chance to look into this and I believe this could be changed in the merlin/study/study.py file in the init method of the MerlinStudy class. Right now we have something that looks like:

def __init__(  # pylint: disable=R0913
        self,
        filepath,
        override_vars=None,
        restart_dir=None,
        samples_file=None,
        dry_run=False,
        no_errors=False,
        pgen_file=None,
        pargs=None,
    ):
        self.original_spec = MerlinSpec.load_specification(filepath)

        self.special_vars = {
            "MERLIN_SPEC_ORIGINAL_TEMPLATE": os.path.join(
                self.info,
                self.original_spec.description["name"].replace(" ", "_") + ".orig.yaml",
            ),
            "MERLIN_SPEC_EXECUTED_RUN": os.path.join(
                self.info,
                self.original_spec.description["name"].replace(" ", "_") + ".partial.yaml",
            ),
            "MERLIN_SPEC_ARCHIVED_COPY": os.path.join(
                self.info,
                self.original_spec.description["name"].replace(" ", "_") + ".expanded.yaml",
            ),
        }

Since we pass in the filepath already, we could add a new special variable here called "ORIGINAL_FILEPATH" or something similar like so:

self.special_vars = {
            "MERLIN_SPEC_ORIGINAL_TEMPLATE": os.path.join(
                self.info,
                self.original_spec.description["name"].replace(" ", "_") + ".orig.yaml",
            ),
            "MERLIN_SPEC_EXECUTED_RUN": os.path.join(
                self.info,
                self.original_spec.description["name"].replace(" ", "_") + ".partial.yaml",
            ),
            "MERLIN_SPEC_ARCHIVED_COPY": os.path.join(
                self.info,
                self.original_spec.description["name"].replace(" ", "_") + ".expanded.yaml",
            ),
            "ORIGINAL_FILEPATH": filepath,
        }

@bgunnar5
Copy link
Member

This filepath wouldn't have any of the variables expanded, it would just strictly point to the absolute path (I think it'd be the absolute path at least) of the file that was provided by the user

@koning
Copy link
Member

koning commented May 23, 2023

I can see how that might present as a bug, but that is not the choice they made when implementing this feature. I think Dylan is asking for this:

base_name = os.path.basename(filepath)
base_name = base_name.replace(".yaml", "")
self.special_vars = {
            "MERLIN_SPEC_ORIGINAL_TEMPLATE": os.path.join(
                self.info,
                base_name + ".orig.yaml",
            ),
            "MERLIN_SPEC_EXECUTED_RUN": os.path.join(
                self.info,
                base_name + ".partial.yaml",
            ),
            "MERLIN_SPEC_ARCHIVED_COPY": os.path.join(
                self.info,
                base_name + ".expanded.yaml",
            ),
        }

But it will need to be changed in the code, and the basename may not be available:

expanded_name = result.description["name"].replace(" ", "_") + ".expanded.yaml"

@dylancliche
Copy link
Author

@bgunnar5: No problem. It doesn't even have to be the full filepath (just as @koning pointed out). Especially since $(SPECROOT) already shows the path. It would be nice to have the filename: something along the lines of $(SPECFILE) to be analogous to $(SPECROOT).

@koning: I presented it as a bug because I originally presented it as a new feature, but then @lucpeterson stated it may be a bug so I decided to report it here as well. I wasn't sure where the scripts for the MERLIN_SPEC variables were located, but having @bgunnar5 post them I can see that the original choice for implementing this feature was not what I was looking for. If that is the case, instead of changing the MERLIN_SPEC variables as you suggested (unless that's what everyone thinks is best) we can definitely go the other route and just add a new variable such as SPECFILE to get what I am looking for.

@lucpeterson
Copy link
Member

I think it’s a bug because you could have multiple yaml files with different base names but the same name in the yaml file. The new file would have that as its name. It’d have the correct contents but you wouldn’t know which file it came from

@dylancliche
Copy link
Author

As a piggy back. That was the main reason to why I originally proposed the new feature (and why Luc thinks it is a bug) is because, as Luc mentioned above, different specification files can have the same name in the description block and therefore in the merlin_info directory you wouldn't know which workflow the .orig, .partial, and .expanded files came from without investigating the contents. I'd rather just get the filename.

@dylancliche
Copy link
Author

@bgunnar5: Here's an idea (no matter if we make a new variable or change the MERLIN_SPEC variables) based on @koning's idea. What if we add spec.specfile = os.path.basename(filepath) in the load_specification class method within the merlin/spec/specification.py file. Then either make a special variable in merlin/study/study.py that sets it equal to self.original_spec.specfile (if we want a different variable), or use it to define the MERLIN_SPEC variables such as @koning's suggestion.

@koning
Copy link
Member

koning commented May 23, 2023

These files are only for one specific study, so they are unique within a study even though they might not be unique across multiple studies. The study directory is unique.

@bgunnar5
Copy link
Member

Even with the files being unique to a study I do think it makes more sense for these files to be named based on the original filename rather than the name of the study itself

@koning
Copy link
Member

koning commented May 23, 2023

Yes I agree, I'm just clarifying the bug versus feature discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants