Adds read support for Ion 1.1 system symbols #944

Merged (5 commits) on Sep 27, 2024

Conversation

popematt
Contributor

Issue #, if available:

None.

Description of changes:

  • Adds support for reading system symbols in FlexSyms and using the EE op code.
  • Changes many tests to use 60 instead of A0 for "unknown symbol", and 75 instead of 90 for the empty symbol.
  • Removes the @Disabled annotation from disabled tests in IonManagedWriter_1_1_Test
  • Adds the byte value to the string representation of IonTypeID (i.e., what was SYMBOL(1) is now 0xEE(SYMBOL,1)). That was useful to me while debugging.
  • Limitations:
    • I disabled the entirety of EncodingDirectiveCompilationTest. I couldn't work out how these tests were supposed to work in the first place, so I was unable to update them for my changes. Because almost all of the other tests pass, I am fairly confident that the problem lies in the tests rather than in the changes I've made. I will need to figure this out and update, replace, or remove this test class, but I didn't want to block this PR on it.
    • System symbols that are encoded as tagless FlexSyms are handled in a somewhat hacky way. There's a rather long FIXME note about it in the code. This works for now, but should probably be fixed.
    • The reader can't yet recognize encoding directives/symbol tables that use user-space equivalents of the system symbols. This MUST be fixed eventually.
    • I disabled the Ion 1.1 round-trip tests that transfer values using the system reader, because the system reader doesn't yet translate the symbol tables correctly and/or can't read them because of the aforementioned issue.
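
For reviewers skimming the list above, the IonTypeID string change can be illustrated with a minimal sketch. `TypeIdLabelSketch`, `oldLabel`, and `newLabel` are hypothetical names made up for illustration and are not the code in IonTypeID; only the before and after formats come from the description.

```java
// Hypothetical helper for illustration; not the actual IonTypeID implementation.
final class TypeIdLabelSketch {
    // Old style quoted in the description: TYPE(length), e.g. SYMBOL(1).
    static String oldLabel(String type, int length) {
        return String.format("%s(%d)", type, length);
    }

    // New style quoted in the description: 0xNN(TYPE,length), e.g. 0xEE(SYMBOL,1).
    static String newLabel(int opcode, String type, int length) {
        return String.format("0x%02X(%s,%d)", opcode & 0xFF, type, length);
    }

    public static void main(String[] args) {
        System.out.println(oldLabel("SYMBOL", 1));
        System.out.println(newLabel(0xEE, "SYMBOL", 1));
    }
}
```

Having the opcode byte in the label makes it possible to tell apart type IDs that share a type and length but come from different opcodes, which is presumably why it helped during debugging.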

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.


codecov bot commented Sep 24, 2024

Codecov Report

Attention: Patch coverage is 75.18797% with 33 lines in your changes missing coverage. Please review.

Please upload report for BASE (ion-11-encoding@4b373e5). Learn more about missing BASE report.

Files with missing lines | Patch % | Lines
...on/impl/IonReaderContinuableApplicationBinary.java | 67.39% | 9 Missing and 6 partials ⚠️
...mazon/ion/impl/IonReaderContinuableCoreBinary.java | 75.00% | 4 Missing and 5 partials ⚠️
...main/java/com/amazon/ion/impl/IonCursorBinary.java | 81.81% | 1 Missing and 5 partials ⚠️
src/main/java/com/amazon/ion/impl/IonTypeID.java | 80.00% | 1 Missing ⚠️
...a/com/amazon/ion/impl/LocalSymbolTableImports.java | 80.00% | 0 Missing and 1 partial ⚠️
...va/com/amazon/ion/impl/bin/IonManagedWriter_1_1.kt | 0.00% | 1 Missing ⚠️
Additional details and impacted files
@@                Coverage Diff                 @@
##             ion-11-encoding     #944   +/-   ##
==================================================
  Coverage                   ?   70.05%           
  Complexity                 ?     6870           
==================================================
  Files                      ?      196           
  Lines                      ?    27070           
  Branches                   ?     4899           
==================================================
  Hits                       ?    18964           
  Misses                     ?     6593           
  Partials                   ?     1513           


Contributor

@tgregg tgregg left a comment

I will look into:
TODO: Also check if it is an interned user-symbol with the same text as the expected System Symbol

Comment on lines 1502 to 1503
// TODO: We could pretend $0 is a system symbol and consolidate some of the branches here. Is it worth it?
if (nextByte == FLEX_SYM_SYSTEM_SYMBOL_OFFSET) {
Contributor

I'd be fine with that, if you can get it to work. However, we return 0 from this branch and -1 from the system symbol branch, so there may need to be the same amount of branching either way.
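
To make the trade-off concrete, here is a rough sketch of what the consolidated version might look like. It is not the code in IonCursorBinary. The class and field names are hypothetical, the offset value 0x60 is an assumption based on the description's use of 60 for the unknown symbol, and deriving the system symbol ID by subtracting the offset is likewise an assumption.

```java
// Hypothetical sketch for discussion; not the code in IonCursorBinary.
final class FlexSymEscapeSketch {
    // Assumed value: the description switches tests to 0x60 for the unknown symbol ($0).
    static final int FLEX_SYM_SYSTEM_SYMBOL_OFFSET = 0x60;

    // Set as a side effect when a system symbol is decoded; -1 when none was.
    int systemSymbolId = -1;

    // Treats $0 as "system symbol 0" so both cases share the subtraction,
    // yet still has to branch to pick the return value (0 vs. -1).
    // Range validation of nextByte is deliberately omitted.
    int readEscape(int nextByte) {
        int id = (nextByte & 0xFF) - FLEX_SYM_SYSTEM_SYMBOL_OFFSET; // assumed mapping
        if (id == 0) {
            systemSymbolId = -1;
            return 0;   // $0: the caller sees symbol ID 0
        }
        systemSymbolId = id;
        return -1;      // system symbol: the ID is reported out of band
    }
}
```

Under these assumptions the subtraction is shared, but the branch on the return value remains, so the consolidation saves little.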

src/main/java/com/amazon/ion/impl/IonCursorBinary.java (outdated, resolved)
src/main/java/com/amazon/ion/impl/IonCursorBinary.java (outdated, resolved)
Comment on lines +536 to +537
// FIXME: This should take into account the version at the point in the stream.
resetImports(1, 0);
Contributor

What's stopping us from passing in the real major and minor version here?

Contributor Author

With the current implementation, the version of the symbol table is determined by the system symbol table that it imports. In Ion 1.1, the local symbol table does not necessarily import any system table. I think there's a lot more work that will need to be done in order to get Ion 1.1 and the current SymbolTable API to work together.
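
A small, purely illustrative model of the problem follows. `VersionFromImportsSketch`, `Import`, and `inferVersion` are hypothetical and are not the ion-java SymbolTable API. The sketch only assumes that a version is read off a leading system import named `$ion`, and shows that nothing can be inferred when that import is absent.

```java
import java.util.List;
import java.util.OptionalInt;

// Hypothetical model for illustration; not the ion-java SymbolTable API.
final class VersionFromImportsSketch {
    // A stand-in for an imported table: a name plus a version number.
    static final class Import {
        final String name;
        final int version;

        Import(String name, int version) {
            this.name = name;
            this.version = version;
        }
    }

    // Infers a version from the leading system import named "$ion", if present.
    static OptionalInt inferVersion(List<Import> imports) {
        if (!imports.isEmpty() && "$ion".equals(imports.get(0).name)) {
            return OptionalInt.of(imports.get(0).version);
        }
        // No system import to infer from, as can happen with an Ion 1.1 local symbol table.
        return OptionalInt.empty();
    }
}
```

In this model an empty import list yields no version at all, which is the situation the hard-coded `resetImports(1, 0)` works around for now.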

} else {
id = readFixedUInt_1_1(valueMarker.startIndex, valueMarker.endIndex);

// FIXME: This is a hack that works as long as our system symbol table doesn't grow to
Contributor

I'm fine tracking this in an issue. This should work for the foreseeable future.

Contributor Author

I will create an issue with relevant perma-links to the code once this PR is merged.

@popematt popematt merged commit c5562bf into amazon-ion:ion-11-encoding Sep 27, 2024
21 of 35 checks passed
@popematt popematt deleted the read-system-symbols branch September 27, 2024 19:58