Refactor music box configuration file options #220

Open
K20shores opened this issue Aug 29, 2024 · 13 comments
@K20shores
Collaborator

K20shores commented Aug 29, 2024

There was a comment/question from Rebecca Buchholz at the Demo yesterday; she was sitting next to me, and you may not have heard it. She asked about using initial_conditions.csv to set the initial species concentrations instead of my_config.json.

Initialization of concentrations involves a large number of floating-point numbers, and I believe that specification would be technically better in tabular form (CSV) instead of JSON. We have a JSON mechanism to include a CSV file, rather than inserting many values into the JSON configuration. The CSV file would be easier for our users to manage and review.

I know that we are retaining compatibility with the Fortran MusicBox. I recommend that we move forward soon with changes that will increase CSV usage through our product suite.

Originally posted by @carl-drews in #214 (review)

Acceptance criteria

  • All existing configuration files in our examples use the new format
  • The new format is read correctly
  • Species configurations are only read from the initial conditions section
    • the initial conditions section can either be a path to a file
    • or the initial conditions section can be an in-place csv file, specified as a set of lists of lists. This is useful for music box interactive since we wouldn't need to write out csv files
  • Evolving conditions can be read from a file
    • or given in place as CSV-style data, like the initial conditions. Again, useful for music box interactive

Ideas

  • Consider supporting configuration files of these formats
{
    "box model options": {
        "grid": "box",
        "chemistry time step [sec]": 1.0,
        "output time step [sec]": 1.0,
        "simulation length [sec]": 3600.0
    },
    "evolving conditions": {
        "filepath": "evolving_conditions.csv"
    },
    "initial conditions": {
        "filepath": "initial_conditions.csv"
    },
    "mechanism configuration": {
        "relative file path": "camp_data/config.json"
    }
}

or, with in-place conditions

{
    "box model options": {
        "grid": "box",
        "chemistry time step [sec]": 1.0,
        "output time step [sec]": 1.0,
        "simulation length [sec]": 3600.0
    },
    "evolving conditions": {
        "data": [
            ["ENV.pressure [Pa]", "ENV.temperature [K]", "PHOTO.O2_1 [s-1]"],
            [1000, 200, 0.5]
        ]
    },
    "initial conditions": {
        "data": [
            ["ENV.temperature", "ENV.pressure", "CONC.H2O", "CONC.CH4"],
            [1000, 200, 0.5, 1e-4]
        ]
    },
    "mechanism configuration": {
        "relative file path": "camp_data/config.json"
    }
}

  • Print a warning when we detect an old configuration. Tell the user that it's old and provide a new command line option, maybe --update-config or --fix, so that when you run music_box -c my_config.json --fix, we overwrite the old one, and maybe copy the original to my_config.json.old
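
A minimal sketch of the detect-and-back-up idea, in Python. It assumes a configuration counts as "old" when it still has a top-level "chemical species" section; the actual detection rule is not decided in this issue. The names `LEGACY_KEYS`, `is_old_config`, and `backup_and_rewrite` are hypothetical and are not part of music box.

```python
import json
import shutil
from pathlib import Path

# Assumption: legacy configs are recognised by these top-level keys.
LEGACY_KEYS = ("chemical species",)


def is_old_config(config: dict) -> bool:
    """Return True if the parsed config contains any legacy top-level key."""
    return any(key in config for key in LEGACY_KEYS)


def backup_and_rewrite(path: Path, new_config: dict) -> Path:
    """Copy `path` to `<name>.old`, then overwrite `path` with `new_config`."""
    backup = path.with_name(path.name + ".old")
    shutil.copy2(path, backup)  # keep the original next to the rewritten file
    path.write_text(json.dumps(new_config, indent=4) + "\n")
    return backup
```

Copying before overwriting means a failed rewrite still leaves the original recoverable as my_config.json.old.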
@K20shores
Collaborator Author

K20shores commented Aug 29, 2024

Actually, we might consider storing all of the mechanism information in this file as well. That would remove the need to write any files out. We would then have to update the musica API to take json objects, but that is already a goal we have anyways so that we push all of the parsing down to musica. @mattldawson would this be a good time to do that or do you think it's too soon?

@mattldawson
Collaborator

Actually, we might consider storing all of the mechanism information in this file as well. That would remove the need to write any files out. We would then have to update the musica API to take json objects, but that is already a goal we have anyways so that we push all of the parsing down to musica. @mattldawson would this be a good time to do that or do you think it's too soon?

I like that idea. Do we need to make any changes to the open-atmos format for this to work?

@K20shores
Collaborator Author

@mattldawson I don't believe so. open-atmos already assumes everything is in the same file. All we have to do is strip out whatever information is in the mechanism and pass it along to musica

@carl-drews
Collaborator

Doing just CSV vs. JSON in this ticket; leave choice of solver (Rosenbrock and others) to another issue.

@K20shores
Collaborator Author

This is also related to #260, but only indirectly

@carl-drews
Collaborator

The code base has changed substantially since October. I will re-synch my obsolete dev branch with the latest main.

@carl-drews
Collaborator

Here is the in-progress branch for this issue:
https://github.com/NCAR/music-box/tree/220-config-file-csv

@carl-drews
Collaborator

carl-drews commented Dec 12, 2024

My initial conditions for dev/test look like this in the JSON:

  "initial conditions": {
    "filepath": "initial_conditions.csv",
    "data": [
      ["ENV.temperature [K]", "ENV.pressure [Pa]", "CONC.A [mol m-3]", "CONC.B [mol m-3]"],
      [200, 70000, 0.67, 2.3e-9]
    ]
  },
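
One way such a section could be read, sketched in Python under assumptions: both "filepath" and "data" are optional, each source holds one header row and one value row, and in-place data overrides values from the file. The override rule and the names `rows_to_record` and `load_initial_conditions` are hypothetical, not taken from the branch.

```python
import csv
from pathlib import Path


def rows_to_record(rows):
    """Turn [header_row, value_row] into a {header: float} dict."""
    header, values = rows[0], rows[1]
    if len(header) != len(values):
        raise ValueError("header and value rows differ in length")
    return {name: float(value) for name, value in zip(header, values)}


def load_initial_conditions(section, base_dir="."):
    """Read one 'initial conditions' section from a file and/or in-place data."""
    conditions = {}
    if "filepath" in section:
        with open(Path(base_dir) / section["filepath"], newline="") as handle:
            conditions.update(rows_to_record(list(csv.reader(handle))))
    if "data" in section:
        # Assumption: in-place values win over values read from the file.
        conditions.update(rows_to_record(section["data"]))
    return conditions
```

Because the in-place form is already a list of lists, it goes through the same `rows_to_record` helper as the parsed CSV rows, so music box interactive would not need to write a file first.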

@carl-drews
Collaborator

Since "Species configurations are only read from the initial conditions section", then it follows that the chemical species section of my_config.json will be removed and no longer supported. Delete:

  "chemical species": {
    "A": {
      "initial value [mol m-3]": 1
    },
    "B": {
      "initial value [mol m-3]": 0
    },
    "C": {
      "initial value [mol m-3]": 0
    }
  },
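
For existing examples, the removed section could be converted mechanically into the in-place "data" form. The sketch below assumes each species carries a key of the form "initial value [unit]" and that the new column name is "CONC.<species> [unit]", matching the headers shown earlier in this thread; `species_to_data` is a hypothetical helper.

```python
def species_to_data(chemical_species):
    """Convert a legacy 'chemical species' mapping to [header_row, value_row]."""
    prefix = "initial value "
    header, values = [], []
    for name, options in chemical_species.items():
        for key, value in options.items():
            if key.startswith(prefix):
                unit = key[len(prefix):]  # e.g. "[mol m-3]"
                header.append(f"CONC.{name} {unit}")
                values.append(value)
    return [header, values]
```

Species without an "initial value" key are skipped rather than given a default of zero, since the thread does not say what the default should be.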

@carl-drews
Collaborator

@mattldawson and @K20shores :
For the carbon_bond_5 example we have these initial conditions:

    "initial conditions": {
        "initial_reaction_rates.csv": {}
    },

Those are reaction rates, not initial species concentrations. That CSV file looks like this:

EMIS.NO,EMIS.NO2_emis,EMIS.CO,EMIS.SO2,EMIS.FORM,EMIS.MEOH,EMIS.ALD2_emis,EMIS.PAR,EMIS.ETH,EMIS.OLE,EMIS.IOLE,EMIS.TOL,EMIS.XYL,EMIS.ISOP,PHOTO.NO2,PHOTO.O3->O1D,PHOTO.O3->O3P,PHOTO.NO3->NO2,PHOTO.NO3->NO,PHOTO.HONO,PHOTO.H2O2,PHOTO.PNA,PHOTO.HNO3,PHOTO.NTR,PHOTO.ROOH,PHOTO.MEPX,PHOTO.FORM->HO2,PHOTO.FORM->CO,PHOTO.ALD2,PHOTO.PACD,PHOTO.ALDX,PHOTO.OPEN,PHOTO.MGLY,PHOTO.ISPD
1.44e-10,7.56e-12,1.96e-09,1.06e-09,1.02e-11,5.92e-13,4.25e-12,4.27e-10,4.62e-11,1.49e-11,1.49e-11,1.53e-11,1.4e-11,6.03e-12,0.00477,2.26e-06,0.000253,0.117,0.0144,0.000918,2.59e-06,1.89e-06,8.61e-08,4.77e-07,1.81e-06,1.81e-06,7.93e-06,2.2e-05,2.2e-06,1.81e-06,2.2e-06,0.000645,7.64e-05,1.98e-09

Do you intend for us to read the same file twice, one time looking for prefixes like ENV and CONC, and then read it a second time for prefixes like EMIS and PHOTO? We could do that, but I'm thinking it would be easier to support two JSON keywords instead, like:
"initial concentrations":
"initial rates":

@carl-drews
Collaborator

The flow tube example also uses "initial conditions" for the reaction rates:

  "initial conditions": {
    "filepath": "initial_reaction_rates.csv"
  },

The reaction rates look like this:
modeling2:/home/drews/MusicBox/music-box-220-config-file-csv/src/acom_music_box/examples/configs/flow_tube> cat initial_reaction_rates.csv
LOSS.SOA1 wall loss.s-1,LOSS.SOA2 wall loss.s-1
0.01,0.05

@mattldawson
Collaborator

I would lean toward allowing any initial conditions (rates, concentrations, temperature, pressure, etc.) to be in one file, or multiple files if a user prefers. Something like:

"initial conditions": {
  "filepaths": [ "all_conditions.csv" ]
}

or

"initial conditions": {
  "filepaths": [
    "some_concentrations.csv",
    "other_concentrations.csv",
    "rates_and_environmental.csv"
  ]
}

What do you think?
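
A sketch of how a "filepaths" list might be merged, in Python. It assumes each file holds one header row and one value row, and it treats a column that appears in more than one file as an error; that conflict rule is an assumption, not something decided here. `merge_condition_files` is a hypothetical name.

```python
import csv
from pathlib import Path


def merge_condition_files(filepaths, base_dir="."):
    """Merge one-row CSV files into a single {header: float} dict."""
    merged = {}
    for filepath in filepaths:
        with open(Path(base_dir) / filepath, newline="") as handle:
            rows = list(csv.reader(handle))
        header, values = rows[0], rows[1]
        for name, value in zip(header, values):
            if name in merged:
                # Assumption: a column repeated across files is a mistake.
                raise ValueError(f"{name!r} appears in more than one file")
            merged[name] = float(value)
    return merged
```

Raising on duplicates keeps the result independent of the order in which the files are listed.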

@carl-drews
Collaborator

carl-drews commented Dec 30, 2024

Thanks @mattldawson. The Python code will not attach any significance to the filenames, and we could have filepaths like:
concentrations_CSU.csv
BoulderReservoir-20241230.csv
fromCESM-Pinatubo05.csv
Mars-STP.csv

I do like the idea of not tossing any and all initial conditions into the same CSV file, and letting the researcher pick her own filenaming conventions to reflect her current topic of research. So:

  1. The Python code will read all the filepaths specified in initial conditions.
  2. The Python code may take multiple passes through the set of files, and collect the relevant information from each one.
  3. Python will collect the relevant information by header prefix. Probably ENV and CONC will refer to concentrations, and EMIS and PHOTO will refer to reaction rates.

Multiple passes would make the code more modular but slower.
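
As an alternative to multiple passes, the columns can be grouped by header prefix in a single pass once all files are loaded into one mapping. The sketch below assumes the prefix is everything before the first "." in the column name; `group_by_prefix` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_prefix(conditions):
    """Split {'PREFIX.name [unit]': value} into {PREFIX: {'name [unit]': value}}."""
    groups = defaultdict(dict)
    for column, value in conditions.items():
        # Split on the first "." only, so names like "O3->O1D" or
        # "SOA1 wall loss.s-1" stay intact after the prefix.
        prefix, _, rest = column.partition(".")
        groups[prefix][rest] = value
    return dict(groups)
```

Each consumer then looks up only its own prefix, for example `groups["CONC"]` for concentrations or `groups["EMIS"]` and `groups["PHOTO"]` for rates, without re-reading any file.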
