Add limits when deserializing BigDecimal and BigInteger #2510

Merged
5 commits merged into google:main on Oct 23, 2023

Conversation

Marcono1234 (Collaborator)

Purpose

Adds limits when deserializing BigDecimal and BigInteger
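
(As a rough illustration of the kind of guard being added: a minimal sketch of limit checks applied where a number string is converted, loosely following the NumberLimits.parseBigDecimal(String) helper mentioned later in this review. The class layout and the 10_000 thresholds are assumptions for illustration, not necessarily what the merged code uses.)

import java.math.BigDecimal;
import java.math.BigInteger;

// Sketch only; names and threshold values are illustrative
final class NumberLimitsSketch {
  private static final int MAX_NUMBER_STRING_LENGTH = 10_000;
  private static final int MAX_SCALE = 10_000;

  static BigDecimal parseBigDecimal(String value) {
    // Bound the textual length first; extremely long number strings are already suspicious
    if (value.length() > MAX_NUMBER_STRING_LENGTH) {
      throw new NumberFormatException("Number string too large");
    }
    BigDecimal decimal = new BigDecimal(value);
    // A huge scale (e.g. from "1e100000000") makes later operations such as
    // toBigInteger() or toPlainString() extremely expensive, so reject it up front
    if (Math.abs(decimal.scale()) >= MAX_SCALE) {
      throw new NumberFormatException("Number has unsupported scale: " + decimal.scale());
    }
    return decimal;
  }

  static BigInteger parseBigInteger(String value) {
    // BigInteger(String) cost grows with the digit count, so bound the input length
    if (value.length() > MAX_NUMBER_STRING_LENGTH) {
      throw new NumberFormatException("Number string too large");
    }
    return new BigInteger(value);
  }
}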

Checklist

  • New code follows the Google Java Style Guide
  • If necessary, new public API validates arguments, for example rejects null
  • New public API has Javadoc
    • Javadoc uses @since $next-version$
      ($next-version$ is a special placeholder which is automatically replaced during release)
  • If necessary, new unit tests have been added
    • Assertions in unit tests use Truth, see existing tests
    • No JUnit 3 features are used (such as extending class TestCase)
    • If this pull request fixes a bug, a new test was added for a situation which failed previously and is now fixed
  • mvn clean verify javadoc:jar passes without errors

eamonnmcmanus (Member) left a comment:

Thanks for taking this on!

@@ -721,6 +722,7 @@ private int peekNumber() throws IOException {
long value = 0; // Negative to accommodate Long.MIN_VALUE more easily.
boolean negative = false;
boolean fitsInLong = true;
int exponentDigitsCount = 0;

eamonnmcmanus (Member) commented:

What would you think of instead recording the index of the first exponent digit? Then you could still easily reject a number if it has too many exponent digits. Or you could parse the digits and compare against a limit, which would fix the leading-zero problem.

Marcono1234 (Collaborator, Author) replied:

What would you think of instead recording the index of the first exponent digit?

I guess that would be possible, but it would then rely on the buffer not being refilled (and the indices not changing) during parsing, which happens to work because that is currently the case. From that perspective, tracking the count seems a bit more reliable to me.

Is your concern performance (I assume not)? Or that tracking the index might lead to cleaner or easier-to-understand code?

Or you could parse the digits and compare against a limit, which would fix the leading-zero problem.

I mainly omitted the leading 0s check for simplicity (but nonetheless added a comment and test to document the behavior). Fixing it might also be possible by slightly adjusting this method to explicitly check for 0. For example:

 ...
 } else if (last == NUMBER_CHAR_EXP_E || last == NUMBER_CHAR_EXP_SIGN) {
   last = NUMBER_CHAR_EXP_DIGIT;
+  if (c != '0') {
     exponentDigitsCount++;
+  }
 } else if (last == NUMBER_CHAR_EXP_DIGIT) {
+  if (exponentDigitsCount > 0 || c != '0') {
     exponentDigitsCount++;
 
     // Similar to the scale limit enforced by NumberLimits.parseBigDecimal(String)
     // This also considers leading 0s (e.g. '1e000001'), but probably not worth the effort ignoring them
     if (exponentDigitsCount > 4) {
       throw new MalformedJsonException("Too many number exponent digits" + locationString());
     }
+  }
}

I just wasn't sure whether numbers with that many leading 0s are really a common use case worth supporting.
Should I address this with the diff shown above (and some comments), though?

eamonnmcmanus (Member) replied:

The parsing logic is fairly complicated, but it seems to me that the way the i variable works is quite simple, starting from 0 and increasing. Even if the buffer is refilled and pos changes, i is still a valid offset. So I think saving its value and using it at the end is reasonably robust. And I feel the resulting code would be slightly simpler. It also seems very slightly better to check for a specific maximum exponent rather than a number of exponent digits.

I still have the question of whether we need to change JsonReader at all, if we're also checking for large exponents in TypeAdapters.
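
(To visualize the "specific maximum exponent" suggestion above: a minimal self-contained sketch that parses the exponent value and compares it against a concrete limit, so leading zeros no longer matter. The method name and the 9999 limit are assumptions for illustration, not actual JsonReader code.)

// Sketch only; checks a JSON number string's exponent value against an assumed maximum
final class ExponentLimitSketch {
  // Mirrors the "more than 4 exponent digits" idea from the diff above, but as a value limit
  private static final long MAX_EXPONENT = 9999;

  static boolean exceedsExponentLimit(String number) {
    int e = number.indexOf('e');
    if (e == -1) {
      e = number.indexOf('E');
    }
    if (e == -1) {
      return false; // no exponent part
    }
    String exponent = number.substring(e + 1);
    if (exponent.startsWith("+") || exponent.startsWith("-")) {
      exponent = exponent.substring(1);
    }
    try {
      // "1e000005" parses to exponent value 5, so leading zeros no longer trigger the limit
      return Long.parseLong(exponent) > MAX_EXPONENT;
    } catch (NumberFormatException tooManyDigits) {
      return true; // exponent has too many digits to even fit in a long
    }
  }
}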

Marcono1234 (Collaborator, Author) replied:

It also seems very slightly better to check for a specific maximum exponent rather than a number of exponent digits.

Ok, I will give it a try.

I still have the question of whether we need to change JsonReader at all, if we're also checking for large exponents in TypeAdapters.

Changing JsonReader is mainly for the cases where users directly obtain the number from the JsonReader using nextString() and then parse the number themselves. Do you think this should not be done? In that case, is it ok if I change the documentation of nextString() to tell users to validate the number string, if necessary?
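
(For illustration, roughly the caller-side validation such documentation could point to. The JsonReader/JsonToken usage is the existing Gson API; the concrete limits below are one assumed policy, not something Gson prescribes.)

import com.google.gson.stream.JsonReader;
import com.google.gson.stream.JsonToken;
import java.io.IOException;
import java.io.StringReader;
import java.math.BigDecimal;

public class NextStringValidationExample {
  public static void main(String[] args) throws IOException {
    JsonReader reader = new JsonReader(new StringReader("[1.23e4]"));
    reader.beginArray();
    if (reader.peek() == JsonToken.NUMBER) {
      // nextString() returns the raw number text; the caller parses it itself
      String number = reader.nextString();
      // Assumed policy: bound the string length and the resulting scale so that
      // contrived values like "1e100000000" are rejected before any expensive work
      if (number.length() > 10_000) {
        throw new IOException("Number string too large");
      }
      BigDecimal value = new BigDecimal(number);
      if (Math.abs(value.scale()) >= 10_000) {
        throw new IOException("Unsupported number scale: " + value.scale());
      }
      System.out.println(value); // prints 1.23E+4
    }
    reader.endArray();
  }
}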

Marcono1234 (Collaborator, Author) commented:

Obsolete, I have reverted the changes to JsonReader, see #2510 (comment)

@@ -1031,20 +1036,6 @@ public void testValueVeryCloseToZeroIsZero() {
assertThat(gson.fromJson("122.08e-2132", double.class)).isEqualTo(0.0);
}

@Test
public void testDeserializingBigDecimalAsFloat() {
String json = "-122.08e-2132332";

eamonnmcmanus (Member) commented on this change:

This is unfortunate. Do we need the logic in JsonReader or would it be enough to just have the new logic in TypeAdapters at the point where the number is converted to a BigDecimal or BigInteger?

Marcono1234 (Collaborator, Author) replied on Oct 11, 2023:

I mainly removed these tests because something similar is already covered a few lines above:

assertThat(gson.fromJson("-122.08e-2132", float.class)).isEqualTo(-0.0f);
assertThat(gson.fromJson("-122.08e-2132", double.class)).isEqualTo(-0.0);
assertThat(gson.fromJson("122.08e-2132", float.class)).isEqualTo(0.0f);
assertThat(gson.fromJson("122.08e-2132", double.class)).isEqualTo(0.0);

To me, ...e-2132332 also does not look like any specific number, just an arbitrary value with a large number of exponent digits.
For comparison, Double.MIN_VALUE is 4.9e-324.

I assume it would be possible to adjust the logic in JsonReader and defer the check until nextString() is called, but that might make it a bit more complicated, possibly having to check the string there, or adding a new field indicating whether the parsed number exceeded the limits.
And because at least the numbers here are so contrived, I am not sure that is worth it.

eamonnmcmanus (Member) replied:

Oh, I misunderstood. I thought these tests were removed because these values were now rejected. If they still produce the same results but are redundant then it's fine to remove them.

Marcono1234 (Collaborator, Author) replied:

Sorry, you did understand it correctly; they are rejected now.

But I removed them instead of adjusting them (e.g. to use ...e-9999) because in my opinion they did not add much value compared to the other tests: 122.08e-2132 is already parsed as 0.0, so testing -122.08e-2132332 seemed a bit pointless to me.

Do you think this should be supported though? I am not sure if these are really realistic values, or just dummy values for testing.

Marcono1234 (Collaborator, Author) commented:

I have removed the limit checks in JsonReader now because they would also be ineffective when nextString() is called but the JSON data contains a JSON string (for which no restrictions exist) instead of a JSON number, and the user did not explicitly use peek() to verify that the value is a JSON number.

However, the parsing logic behind Double.parseDouble (which JsonReader.nextDouble calls) and Float.parseFloat is quite complex, see https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/math/FloatingDecimal.java
If it had performance problems for certain number strings, that would affect all other parts of Gson which parse doubles or floats as well, not just this code path.

eamonnmcmanus (Member) replied:

OK, so if I understand correctly, these tests would now pass, but they are redundant given testValueVeryCloseToZeroIsZero just above. The name of the method is testDeserializingBigDecimalAsFloat but actually it has nothing to do with BigDecimal. Is that right?

Marcono1234 (Collaborator, Author) replied on Oct 22, 2023:

Yes, that is right.

The name of the method is testDeserializingBigDecimalAsFloat but actually it has nothing to do with BigDecimal.

No, it doesn't look like it. These tests lead to Gson.doubleAdapter or Gson.floatAdapter being used, which call JsonReader.nextDouble. Neither BigDecimal nor its type adapter seems to be involved.

Marcono1234 (Collaborator, Author) commented:

@eamonnmcmanus, are the changes like this ok?

eamonnmcmanus (Member) commented:

Thanks again!

eamonnmcmanus merged commit 6a9d240 into google:main on Oct 23, 2023. 9 checks passed.
Marcono1234 deleted the marcono1234/number-limits branch on Oct 23, 2023 at 20:57.
tibor-universe pushed a commit to getuniverse/gson that referenced this pull request Sep 14, 2024

* Add limits when deserializing `BigDecimal` and `BigInteger`

* Use assertThrows

* Don't check number limits in JsonReader