Incompatibility with Avro >=1.9.0 (upgrade to Avro 1.11.3) #167
Comments
Hmmh. So the problem is that the API of the Apache Avro library is incompatible between 1.8.x and 1.9 -- one exposes Jackson 1.x types. Thank you for reporting the issue: I think it is good that the Apache Avro library has upgraded to Jackson 2.x, but it is a bit problematic in the short term, as the types are exposed in the API.
As of Apache Avro 1.9.0 they updated to depend on Jackson 2.9.7, but the problem is actually in the com.fasterxml.jackson.dataformat.avro.schema.RecordVisitor class, which depends on "org.apache.avro.util.internal.JacksonUtils", which used to reference the old org.codehaus.jackson.JsonNode.
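To make the failure mode described above concrete: a NoSuchMethodError shows up when code compiled against one method signature runs against a class whose method now takes or returns a different type. Below is a minimal sketch of a startup probe that reports which classes and return types are actually on the classpath. `ApiProbe` and its methods are hypothetical names introduced here for illustration; they are not part of Jackson or Avro, and the sketch only uses JDK reflection.

```java
import java.lang.reflect.Method;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Jackson or Avro. */
public final class ApiProbe {
    private ApiProbe() {}

    /** Returns true if the named class can be loaded from the current classpath. */
    public static boolean isClassPresent(String className) {
        try {
            // 'false' skips static initialization; we only want to know it exists.
            Class.forName(className, false, ApiProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Reports the return type of a public method with the given name, or empty
     * if the class or method is missing. With overloads, which one is reported
     * is unspecified, so this is only meant for unambiguous method names.
     */
    public static Optional<String> returnTypeOf(String className, String methodName) {
        try {
            Class<?> type = Class.forName(className, false, ApiProbe.class.getClassLoader());
            for (Method m : type.getMethods()) {
                if (m.getName().equals(methodName)) {
                    return Optional.of(m.getReturnType().getName());
                }
            }
            return Optional.empty();
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }
}
```

Pointing `isClassPresent` at `org.codehaus.jackson.JsonNode` and `com.fasterxml.jackson.databind.JsonNode` would show which Jackson generations are available at runtime, which can give a clearer error than a NoSuchMethodError deep inside schema generation.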
@salt-tony yes, I understand the mechanism pretty well and attempted to summarize above. But I am not sure of the point you are trying to make here? My interest is not in figuring out what to blame (I agree that relying on internal parts of another library is not a good idea) but to figure out how to try to contain further version incompatibilities. Since Jackson versions up to 2.9.x will be bound to dependencies of Avro lib 1.8.x, it is not possible to upgrade to later Avro library there; and anything using Jackson 2.9.x Avro module can not use apache Avro 1.9.x or above. Note, too, that Avro module is only relying on Jackson 1.x types because Avro 1.8.x exposes default values using 1.x |
After investigating this for a bit, I am now much less hopeful about finding a working way to upgrade the Avro module to work with Apache Avro lib 1.9.x. Part of the problem is that, because of the complicated way some of the unit tests work, it is difficult to see what actually breaks: I can get the code to compile fine, but many things do not seem to work the way they used to in the Avro lib. So there is a distinct possibility that the Avro module will require a pre-1.9 version of the Apache Avro lib, at least for 2.10.
Ohhhh man Avro has gone and created a right nasty mess. I've done a bit of digging, and there's some good news and some bad news:
The unit tests catch this pretty handily (all of the "Apache(1.9.X) -> Jackson w/ Apache(1.9.X) schema" round-trip tests start failing, while their counterparts do not), and once fixed reveal a handful of other edge cases that need to be looked at. It appears that most of the other failing unit tests are similarly related to this issue, in the form of Enums or Unions not resolving correctly, because we had relied upon Avro helpers to resolve these situations in a compatible manner; it looks like they will have to be rewritten and kept in sync instead. In general, I recommend staying away from 1.9.X unless you're prepared to upgrade your whole ecosystem (or at least every consumer) all at once, because the schema format is not forwards compatible. I think these can all be worked around/fixed/patched so that Jackson can read from both versions and write from both versions without any major overhauls/work, but we do need to clone&own some pieces we were leveraging from Avro.
@baharclerode Thank you for digging a bit deeper! Your comments make more sense to me, and I think that at least for 2.10 we'd better stay on 1.8.x, even if it means we can't get rid of the Jackson 1.x dependency. I would like to see the minor cleanup wrt replacing use of
I'll see if I can get two PRs together:
The irony is I'm pretty sure this behavior was changed because the
package example;
public class Foo {
    public static class Bar {
        public static class Baz {}
    }
}
will still have a name of
Heh. Oh boy. But yes, +1 for those PRs. I'll also need to find time to work on other submitted PRs, bugs against Avro module, to get things shipshape for 2.10. |
Hi cowtowncoder, Avro 1.8.x has vulnerabilities and we need to update Avro or remove jackson-dataformats-binary from our projects.
No, I do not plan to work on this. Avro 1.9.x seems fundamentally problematic wrt its dependencies. |
@DanielCharczynski We fixed this vulnerability by using |
Could you check if this is still the case with Apache Avro 1.9.2? Would love to help here. |
Hello! A couple of quick points:
+1 to helping get this working for jackson! |
@Fokko I can confirm this is still an issue with Avro 1.9.2 and Jackson versions 2.9.8, 2.9.10, and 2.10.2 |
@Sage-Pierce thank you for verifying this. |
It would be great! ... Except the reference implementation doesn't do this for record, enum, or fixed types. There's a lot of tooling and infra that depends heavily upon the reference implementation of Avro, so breaking compatibility with it is a dangerous game. IMHO the better solution would have been for the reference implementation to switch to preferring java-class if it exists, and always including java-class (even for record/enum/fixed types) whenever the java type isn't $namespace.$className. Honestly that'd probably be a good direction to take Jackson-Avro as it gives us more options on how to handle data correctly in a mixed-version ecosystem.
The binary side of things is indeed fully compatible, but the binary can't be read without a schema, so I consider the schema part of compatibility. If a 1.8 client can't read 1.9 data written with a 1.9 schema and map it correctly, backwards compatibility was broken. Even though forward compatibility was not broken, there are some relatively simple/common usage patterns that make upgrading a herculean effort. (Essentially, all consumers have to be upgraded first, and only then can you upgrade the producers. But what do you do for an application that is both a consumer and producer? Load different versions of the library on different classloaders?)
Hmmh. I must say that I find possible incompatibility between 1.8 and 1.9 tooling deeply troubling, especially as binary format should itself be compatible. And even from basic version standpoint it is... not great. While I do not necessarily understand full complexity here, it seems to me that it may be necessary to sort of work Jackson Avro format module into 2 implementations; existing older one for 1.8 and prior, and then something else for 1.9 and beyond? Naming of this thing is probably ... interesting problem, but both could co-exist and aside from question of what nominal "format name" (property that Jackson format implementations declare they support), might work. I assume new module would be something like |
Ultimately, I don’t think we need to maintain two different backends. Jackson is flexible enough that we can upgrade to avro 1.9, fix the deserialization/serialization code to handle 1.8 or 1.9 schemas automatically, and add a mapper feature to select if you want 1.8 or 1.9 schemas to be generated (defaulting to 1.8 for compatibility). Switching an existing ecosystem from 1.8 to 1.9 will still be painful, but Jackson won’t be the bottleneck/blocker. I’m mostly just frustrated at the reference implementation because unlike Jackson, I can’t have full 1.8 and 1.9 support side-by-side in the same app without shading or classloader magic. |
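A rough sketch of what "handle 1.8 or 1.9 schemas automatically" could look like on the reading side, for the nested-class naming difference discussed in this thread. It assumes that 1.8-style schemas carry a namespace ending in '$' and that 1.9-style schemas write nesting with dots. `NestedNameResolver` is a hypothetical class made up for this illustration; it is not the module's actual resolution code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch; not the actual resolution code of the Avro module. */
public final class NestedNameResolver {
    private NestedNameResolver() {}

    /** Candidate JVM binary names for an Avro full name written in either style. */
    public static List<String> candidates(String namespace, String name) {
        List<String> result = new ArrayList<>();
        if (namespace == null || namespace.isEmpty()) {
            result.add(name);
            return result;
        }
        if (namespace.endsWith("$")) {
            // Assumed 1.8 style: namespace is the enclosing class name plus '$'.
            result.add(namespace + name);
            return result;
        }
        // Assumed 1.9 style: a dot may be a package separator or a nesting level.
        // Try the plain name first, then turn dots into '$' from right to left.
        StringBuilder full = new StringBuilder(namespace).append('.').append(name);
        result.add(full.toString());
        for (int i = full.lastIndexOf("."); i > 0; i = full.lastIndexOf(".", i - 1)) {
            full.setCharAt(i, '$');
            result.add(full.toString());
        }
        return result;
    }

    /** Loads the first candidate that exists on the classpath, if any. */
    public static Optional<Class<?>> resolve(String namespace, String name) {
        for (String candidate : candidates(namespace, name)) {
            try {
                return Optional.of(Class.forName(candidate, false,
                        NestedNameResolver.class.getClassLoader()));
            } catch (ClassNotFoundException | LinkageError e) {
                // This spelling does not exist; try the next one.
            }
        }
        return Optional.empty();
    }
}
```

With this approach both `resolve("java.util.Map$", "Entry")` and `resolve("java.util.Map", "Entry")` end up at the same nested class, so a reader does not need to know which Avro version produced the schema. The cost is a few failed class lookups per unknown name, which a real implementation would probably cache.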
To give some background, Avro 1.8.x was still on the old codehaus Jackson. Avro 1.8.2 was released back in June 2017, quite a while ago. Since then we've updated a lot of outdated dependencies, including moving from Jackson 1.x to 2.x and deprecating Joda-time and such. Since back then there were some Jackson objects in the public API, we decided to remove these, as we had to break the API anyway (sad, I know). We've analyzed the APIs that we broke; this is mostly Jackson and Joda stuff: http://people.apache.org/~busbey/avro/1.9.0-RC4/1.8.2_to_1.9.0RC4_compat_report.html
@Fokko removal of Jackson 1.x (and especially removal of the exposure via Avro API) is great, and no complaints there. Replacements are (as far as I understand) fine and not problematic in themselves. But it seems that some changes outside this particular dependency are bigger issues.
@baharclerode if that can be done, great. I wonder if this would fit in the timeframe for 2.11 (quite imminent), or 2.12? Or, put another way: are there things to be done in 2.11, with limited compatibility problems, but that would lead towards breaking changes in 2.12?
🍿 😮 🍿 Are there Avro JIRAs to file/watch for complaining about the "minor" semantic number introducing backwards incompatible changes?
For info: @Cricket007 https://issues.apache.org/jira/browse/AVRO-2687 and a link to the last discussion on the mailing list. Avro has not followed literal semantic versioning, which is often unexpected. Your (tactful and gentle) perspective would be very welcome 😄
@baharclerode: I do agree with you about the schema being an important part of backwards/forwards compatibility. The current behaviour was definitely a fix for cross-platform compatibility in 1.9.x... it's unfortunate nobody foresaw code relying on the invalid names from 1.8.x! I think I had a basic misunderstanding of what Jackson-Avro needed to work -- I'm taking a deeper look. Avro is planning a 1.10.x release in May (which should not be considered a minor semantic version bump). It might be worthwhile working together and targeting that to have the least-painful / most-correct fixes in both projects -- maybe we can restore the "better solution" ☝️ behaviour for ReflectData at that point?
Which (other) changes do you refer to concretely @cowtowncoder ? |
@iemejia I wish I remembered the exact details from my attempts to upgrade, but I think there were actual behavior changes between Apache Avro 1.8 and 1.9, so that although I was able to make the Jackson Avro module compile with changes to use new methods in 1.9, the existing test suite had failures, possibly related to changes in schema name generation. As to Avro 1.10.x, it sounds like this might be a good opportunity to resolve the upgrade challenges.
@baharclerode I'll have a look at that PR now.
Looks like version 1.11.x released: https://mvnrepository.com/artifact/org.apache.avro/avro if anyone had time and interest to see whether we could possibly upgrade past 1.8.x for Jackson 2.16. |
There is a CVE in Avro up to 1.11.3 - see https://lists.apache.org/thread/wcj1747hvyl7qjhrfr6d6j1l62hvpr5l
#401 shows the test issues.
it gives the reason of schema incompatibility when the schema is expected to be compatible. It can help investigation on FasterXML#167 Signed-off-by: Aurélien Pupier <apupier@redhat.com>
Same number of test errors with Avro 1.9.2 and 1.11.3:
RecordEvolutionTest during investigation upgrading Avro to 1.9+, the test was failing. the code has enforced the rule from the specification that `when a default value is specified for a record field whose type is a union, the type of the default value must match the first element of the union. Thus, for unions containing "null", the "null" is usually listed first, since the default value of such unions is typically null.` it was part of specification in 1.8 already https://avro.apache.org/docs/1.8.0/spec.html#Unions , just not enforced in the codebase. relates to FasterXML#167 Signed-off-by: Aurélien Pupier <apupier@redhat.com>
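The rule quoted in the commit message above can be illustrated with a small stand-alone check. This is only a sketch under simplifying assumptions: it covers primitive branches matched by exact Java type and ignores records, enums, arrays and numeric promotion. `UnionDefaultCheck` is a hypothetical name and is not Avro's own validation code.

```java
import java.util.List;

/** Hypothetical sketch of the quoted rule; not Avro's own validation code. */
public final class UnionDefaultCheck {
    private UnionDefaultCheck() {}

    /** Maps a Java default value to the Avro primitive type name it matches here. */
    static String avroTypeOf(Object value) {
        if (value == null) return "null";
        if (value instanceof Boolean) return "boolean";
        if (value instanceof Integer) return "int";
        if (value instanceof Long) return "long";
        if (value instanceof Float) return "float";
        if (value instanceof Double) return "double";
        if (value instanceof CharSequence) return "string";
        throw new IllegalArgumentException("Unsupported default: " + value.getClass());
    }

    /** True only when the default value matches the FIRST branch of the union. */
    public static boolean isValidDefault(List<String> unionBranches, Object defaultValue) {
        return !unionBranches.isEmpty()
                && unionBranches.get(0).equals(avroTypeOf(defaultValue));
    }
}
```

Under this check a field typed `["null", "string"]` may default to null, while `["string", "null"]` with a null default is rejected, which is the kind of schema that used to pass unnoticed when the rule was not enforced.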
The test com.fasterxml.jackson.dataformat.avro.SerializeGeneratedTest.testWriteGeneratedEvent() is failing with:
I noticed one change in the Avro API which is related. The Consequently, the
@apupier
avroMapper.registerModule(new SimpleModule() {
    @Override
    public void setupModule(SetupContext context) {
        context.addBeanSerializerModifier(new BeanSerializerModifier() {
            @Override
            public List<BeanPropertyWriter> changeProperties(
                    SerializationConfig config,
                    BeanDescription beanDesc,
                    List<BeanPropertyWriter> beanProperties) {
                // Drop the "specificData" property so that it is not serialized.
                beanProperties.removeIf(writer -> writer.getName().equals("specificData"));
                return super.changeProperties(config, beanDesc, beanProperties);
            }
        });
        super.setupModule(context);
    }
});
@pjfanning sure, do you know any ETA for the overall fix? |
No ETA - in fact, it seems unlikely anything will be done here. It has been open for a long time. 21 broken tests with many root causes. |
Yeah, I think it's ok. |
I am not sure I'd recommend "use something else" as the blanket answer. But it is true that right now there is no one working on resolving this issue. We would be in a better position to get the upgrade into Jackson 3.0, however, for what that is worth. But the compatibility issue still needs to be resolved. Thanks to your patch (via @pjfanning's PR) there is at least one less unit test failure with 2.18. Small step but still... :) Put another way: a solution here would be highly valuable, but Help Needed. I can help with small details in getting things reviewed, merged and so on.
Namespace for nested classes no longer ends with '$'. This is how avro library generates schema since version 1.9. See: AVRO-2143 Please note that resolution of nested classes without '$' was implemented long ago in c570549. Fixes FasterXML#167
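For readers who have not seen the two naming styles side by side, here is a small sketch of how the namespace of a nested class could be derived in each. The '$'-terminated form follows the description in the commit message above. The dotted form is an assumption about what 1.9+ generates and is not taken from Avro's source. `NestedNamespaceSketch` and its methods are hypothetical names.

```java
/** Hypothetical sketch of the two naming styles; not Avro's or Jackson's code. */
public final class NestedNamespaceSketch {
    private NestedNamespaceSketch() {}

    /** Sample nested types used only to demonstrate the naming. */
    public static class Outer {
        public static class Inner {}
    }

    /** Pre-1.9 style as described above: enclosing binary name plus a trailing '$'. */
    public static String legacyNamespace(Class<?> type) {
        Class<?> enclosing = type.getEnclosingClass();
        if (enclosing == null) {
            return packageOf(type);
        }
        return enclosing.getName() + "$";
    }

    /** 1.9+ style (assumption): no trailing '$', nesting levels written with dots. */
    public static String dottedNamespace(Class<?> type) {
        Class<?> enclosing = type.getEnclosingClass();
        if (enclosing == null) {
            return packageOf(type);
        }
        return enclosing.getName().replace('$', '.');
    }

    private static String packageOf(Class<?> type) {
        Package p = type.getPackage();
        return p == null ? "" : p.getName();
    }
}
```

For `Outer.Inner` the first method yields a namespace ending in `Outer$` and the second one ending in `.Outer`, while top-level classes get their plain package name from both.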
I've made a PR (#511) that successfully passes all tests with Avro 1.11.3, but I'm unsure about backward compatibility. |
As per my comment on PR, what I think matters most is the over-the-wire compatibility. I also think we would probably want PR against 2.18 branch instead of |
Avro version 1.9.0 was released last month (https://mvnrepository.com/artifact/org.apache.avro/avro) and my team has run into compatibility issues between jackson-dataformat-avro and this latest release. In particular, it looks to be the case that Avro has switched from Codehaus Jackson to FasterXML Jackson, which results in a NoSuchMethodError being thrown when accessing utility methods.
The following test exposes the issue:
This test passes on v1.8.2 of Avro, but fails with v1.9.0 on the following:
Note that we are using com.fasterxml.jackson.dataformat:jackson-dataformat-avro:2.9.9