Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi line notes in Chemkin with mixed white space after comment symbols cause issues with write_yaml #1695

Closed
corykinney opened this issue May 15, 2024 · 12 comments · Fixed by #1740

Comments

@corykinney
Copy link
Contributor

Problem description

Sorry for the title gore, but recently I've been working with the NUIGMech 1.2 and have been having issues performing mechanism reductions and writing the reduced mechanisms to YAML files. The issue seems to stem from the Chemkin source files having comments that have a mixture of whitespace after the exclamation point within the same comment. I'd be happy to help contribute to this, but would need advice on the recommended way to fix it that wouldn't break other behaviors.

Steps to reproduce

Run ck2yaml with a Chemkin file with the following reaction, note that the original Chemkin file has a space after the ! in the first line of the comment and no space in the second line of the comment:

! A.W. Jasper, S.J. Klippenstein, L.B. Harding, J. Phys. Chem. A 111 (2007) 8699-8707.

!DONG 0406
CH2+CH2=>C2H2+H+H                     7.0E13   0.0022      8 !  

Produces the following output, which Cantera loads correctly:

- equation: CH2 + CH2 => C2H2 + H + H  # Reaction 90
  rate-constant: {A: 7.0e+13, b: 2.2e-03, Ea: 8.0}
  note: |2-
     A.W. Jasper, S.J. Klippenstein, L.B. Harding, J. Phys. Chem. A 111 (2007) 8699-8707.
    DONG 0406

The |2- keeps the indentation consistent with the first line keeping its extra space when loaded. After loading the mechanism, creating a new ct.Solution object that has that species in it and writing it using ck2yaml gives the following output:

  - equation: 2 CH2 => C2H2 + 2 H
    rate-constant: {A: 7.0e+10, b: 2.2e-03, Ea: 3.3472e+04}
    note: |
       A.W. Jasper, S.J. Klippenstein, L.B. Harding, J. Phys. Chem. A 111 (2007) 8699-8707.
      DONG 0406

When it gets written, the 2- is now gone but the initial space gets written out, so this mechanism can't be loaded after the round trip through ct.Solution and then write_yaml as either a illegal map value or end of map not found error is thrown. If everything else is behaving correctly, maybe write_yaml just needs some handling for leading whitespace on multi line notes that the ck2yaml already has? Otherwise, I don't know what the expected behavior is, so I'd appreciate advice on how to go about fixing it. Manually, I've been fixing it by adding the 2- back to reactions that throw errors, but that is a tedious process to do reaction by reaction for such large mechanisms.

As a side note, I wish all mechanisms were developed in Cantera in the first place 😜

System information

  • Cantera version: 3.0.0 (806842d)
  • OS: Windows 11
  • Python: 3.10
@bryanwweber
Copy link
Member

Thanks @corykinney! If I had to guess, this line in ck2yaml ensures the output has the correct indentation marker. The YAML writer from Solution looks like its done in C++ though, and I haven't dug into that at all to know where a fix might come in.

@bryanwweber
Copy link
Member

It looks like this is maybe something that yaml-cpp doesn't support yet? jbeder/yaml-cpp#1191

@corykinney
Copy link
Contributor Author

It looks like this is maybe something that yaml-cpp doesn't support yet? jbeder/yaml-cpp#1191

If that notation can't be added automatically for the write_yaml function with the current yaml writer, is there any harm in just removing leading whitespace in multi-line notes? Or if it always writes yaml files with indentation of 2 spaces, is there a naive way to force the |2- on every multi-line note, even when not necessary? I don't know the file formats super well, just throwing out ideas.

@bryanwweber
Copy link
Member

is there any harm in just removing leading whitespace in multi-line notes?

The only harm I can think of is that you won't be able to round-trip a chemkin file (meaning ck->yaml->ck) and get identical output. I'm not sure how useful that is, and this seems like the easiest way to fix this particular problem, and having a working YAML file seems more important.

@ischoegl
Copy link
Member

I'm 👍 for removing the leading whitespace. Preserving this leading whitespace appears to create issues while doing nothing in terms of problem solving. In other words, it's a 'sacrifice' that is inconsequential unless any of us can think of a scenario where 'formatting' chemkin comments is warranted?

@speth
Copy link
Member

speth commented May 20, 2024

I'm unclear on whether this is a bug in yaml-cpp, or just how we're using it. Regardless of whether it supports certain options for how to format strings, I wouldn't think that yaml-cpp should output YAML that it can't then parse.

@ischoegl
Copy link
Member

ischoegl commented Jun 26, 2024

Here's a Cantera MWE with the actual error:

In [1]: import cantera as ct
   ...: gas = ct.Solution("gri30.yaml")

In [2]: yaml = """
   ...: equation: 2 CH2 => C2H2 + 2 H
   ...: rate-constant: {A: 7.0e+10, b: 2.2e-03, Ea: 3.3472e+04}
   ...: note: |
   ...:    A.W. Jasper, S.J. Klippenstein, L.B. Harding, J. Phys. Chem. A 111 (2007) 8699-8707.
   ...:   DONG 0406
   ...: foo: bar  # adding a dummy line
   ...: """

In [3]: ct.Reaction.from_yaml(yaml, gas)
[...]
CanteraError:
*******************************************************************************
InputFileError thrown by AnyMap::fromYamlString:
Error on line 6 of input string:
end of map not found
|  Line |
|     1 |
|     2 | equation: 2 CH2 => C2H2 + 2 H
|     3 | rate-constant: {A: 7.0e+10, b: 2.2e-03, Ea: 3.3472e+04}
|     4 | note: |
|     5 |    A.W. Jasper, S.J. Klippenstein, L.B. Harding, J. Phys. Chem. A 111 (2007) 8699-8707.
>     6 >   DONG 0406
            ^
|     7 | foo: bar  # adding a dummy line
*******************************************************************************

with the error being triggered here:

YAML::Node node = YAML::Load(yaml);

It looks like yaml-cpp is indeed writing YAML that can't be read ... see WriteLiteralString method (which does not take additional arguments)

@bryanwweber
Copy link
Member

I just came across https://github.com/biojppm/rapidyaml, throwing it out there as a potential alternative if we want to go that route. I have no idea if it fixes this bug.

@ischoegl
Copy link
Member

Performance of that library looks really impressive. At the same time, it’s a rather drastic approach to address this issue 🤣

@baperry2
Copy link

Any updates on resolving this issue?

It appears to be the underlying cause of an issue we are seeing in PelePhysics (AMReX-Combustion/PelePhysics#514), where our workflow for pre-processing some mechanisms for use in the Pele reacting flow solvers involves a round trip through ct.Solution, write_yaml, and then ct.Solution on the new file.

@ischoegl
Copy link
Member

@baperry2 ... no updates at the moment. It appears that the actual bug is upstream from Cantera in yaml-cpp. To us, this issue is a fairly inconsequential paper cut that can be easily addressed by 'fixing' the input (it has been our opinion that we don't necessarily support everything that CHEMKIN allows). The best we could do is to apply a band-aid, which, however, would break true round-trip conversions. It would be simple - PR's are certainly welcome.

@baperry2
Copy link

Thanks @ischoegl, that makes sense. I agree, once you know what's going on it's not too bad to manually resolve the issue when it comes up, which appears to be fairly rare.

speth added a commit to speth/cantera that referenced this issue Jul 26, 2024
Preserve indentation of lines within multiline comment blocks,
except for the first line which cannot be indented due to limited
support for multiline strings from yaml-cpp.

Fixes Cantera#1695
pjsingal pushed a commit to pjsingal/cantera that referenced this issue Sep 4, 2024
Preserve indentation of lines within multiline comment blocks,
except for the first line which cannot be indented due to limited
support for multiline strings from yaml-cpp.

Fixes Cantera#1695
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants