Contributor

@nastra nastra commented Jun 25, 2025

#13191 introduced an API breakage by adding an abstract method to BaseHTTPClient. We can avoid breaking the API the same way we have historically in BaseHTTPClient / RESTClient: make the new method non-abstract and throw an UnsupportedOperationException (UOE) when the new parameter is actually set. Since subclasses already override/implement this new method, everything should work as expected.

import org.apache.hadoop.util.Preconditions;

-class ParserContext {
+public class ParserContext {
Contributor Author

the class is being used in public API methods, therefore it should be public

Contributor

@amogh-jahagirdar amogh-jahagirdar Jun 25, 2025

Yeah, technically anyone can use the packaged HttpClient independently to make a request on their own and pass through their own ParserContext. So from an API design perspective I agree: we should just make this public so people can build their own if they want. The Iceberg implementation can take an internal, opinionated approach for whatever client it initializes internally, but users can always specify their own context if they so choose.

Contributor Author

nastra commented Jun 25, 2025

@RussellSpitzer @singhpk234 @amogh-jahagirdar can you guys please review this one? CI should be fixed once #13385 is in

Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

Thanks for catching and fixing this, @nastra. In hindsight I agree there's really no reason to break the execute API, and we have a simple way to avoid it in this case.

Comment on lines +158 to +162
if (null != parserContext) {
throw new UnsupportedOperationException("Parser context is not supported");
}

return execute(request, responseType, errorHandler, responseHeaders);
Contributor

Nice, this is a more elegant way to completely avoid the API breakage.

Contributor

@singhpk234 singhpk234 Jun 25, 2025

Isn't the execute without parserContext still abstract, apart from the parser context being package-private?

  protected abstract <T extends RESTResponse> T execute(
      HTTPRequest request,
      Class<T> responseType,
      Consumer<ErrorResponse> errorHandler,
      Consumer<Map<String, String>> responseHeaders);

Contributor

@singhpk234 singhpk234 Jun 25, 2025

Is the intention to have just one abstract method for execute? I agree with making ParserContext public.

Contributor

@amogh-jahagirdar amogh-jahagirdar Jun 25, 2025

> is the intention to just have one abstract method for execute

@singhpk234 I think it's more a general principle: when there's an opportunity to avoid forcing public classes to implement something on upgrade, we should try to do it that way, since we can establish an opinionated default implementation. It's not always possible, but here I think it is.

Specifically, prior to this change client implementations were already forced to implement execute without the parser context, and after #13191 they would have had to implement an additional execute. Since there's an easy way to avoid that with a non-abstract implementation that simply fails at runtime, it seems worth it to spare existing client implementations from having to override it themselves.
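
To make the trade-off above concrete, here is a minimal sketch of the pattern. It does not use the real Iceberg types: `BaseClient`, `LegacyClient`, `ContextAwareClient`, and the `Map<String, Object>` context are hypothetical stand-ins for BaseHTTPClient, its subclasses, and ParserContext. The assumption is only that a subclass written before the new overload existed implements the older abstract method and nothing else.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for the base client; not the Iceberg class.
abstract class BaseClient {
  // Pre-existing abstract method: every subclass already implements it.
  protected abstract String execute(String request);

  // Newer overload, deliberately non-abstract so older subclasses still compile.
  // It only fails if a caller actually supplies a context.
  protected String execute(String request, Map<String, Object> context) {
    if (null != context) {
      throw new UnsupportedOperationException("Parser context is not supported");
    }
    return execute(request);
  }
}

// Written before the overload existed; compiles unchanged after the upgrade.
class LegacyClient extends BaseClient {
  @Override
  protected String execute(String request) {
    return "legacy:" + request;
  }
}

// A client that understands the context overrides both methods.
class ContextAwareClient extends BaseClient {
  @Override
  protected String execute(String request) {
    return execute(request, null);
  }

  @Override
  protected String execute(String request, Map<String, Object> context) {
    int entries = context == null ? 0 : context.size();
    return "aware:" + request + ":" + entries;
  }
}

class ApiCompatSketch {
  public static void main(String[] args) {
    Map<String, Object> context = Collections.singletonMap("key", "value");

    BaseClient legacy = new LegacyClient();
    System.out.println(legacy.execute("ping", null)); // delegates to the old method
    try {
      legacy.execute("ping", context);
    } catch (UnsupportedOperationException e) {
      System.out.println("legacy client rejected context: " + e.getMessage());
    }

    System.out.println(new ContextAwareClient().execute("ping", context));
  }
}
```

The sketch also shows the cost of the approach: a subclass that never overrides the new overload keeps compiling, but fails at runtime rather than at compile time as soon as a non-null context is passed in.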

Contributor

That makes sense! Thank you for the explanation, @amogh-jahagirdar!

Consumer<Map<String, String>> responseHeaders);

-protected abstract <T extends RESTResponse> T execute(
+protected <T extends RESTResponse> T execute(
Member

Will this actually stop a user from being broken on upgrade? If you extend this class and execute is called without the parserContext it still breaks but now you don't know that until runtime?

I may misunderstand the full range of possibilities here, but isn't the internal Iceberg SDK passing through the ParserContext here?

@nastra nastra force-pushed the fix-api-breakage branch from ac51a7a to bf311ce Compare June 25, 2025 14:59

 static class Builder {
-private Map<String, Object> data;
+private final Map<String, Object> data;
Contributor

@singhpk234 singhpk234 Jun 25, 2025

Wouldn't we be doing a put on a final object then, as part of the add method? I intentionally removed the final here.

Contributor Author

The final applies to the field's reference to the map instance and doesn't imply that the map is immutable. Looking at the code again, I don't think the builder's add() method currently works at all, because data is an empty immutable map. @singhpk234 shouldn't the data map just be an empty mutable map?

-    private final Map<String, Object> data;
-
-    private Builder() {
-      this.data = Collections.emptyMap();
-    }
+    private final Map<String, Object> data = Maps.newHashMap();
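
The distinction between a `final` field and an immutable map can be checked with a small standalone sketch. `BrokenBuilder` and `FixedBuilder` are hypothetical names, not the Iceberg builder, and the sketch uses a plain `java.util.HashMap` where the suggestion above uses `Maps.newHashMap()`. It assumes only the documented behavior of `Collections.emptyMap()`, which returns an immutable map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class FinalMapSketch {
  // Mirrors the original shape: a final field pointing at an immutable map.
  static class BrokenBuilder {
    private final Map<String, Object> data = Collections.emptyMap();

    BrokenBuilder add(String key, Object value) {
      // The final modifier is irrelevant here; the map itself rejects writes.
      data.put(key, value);
      return this;
    }
  }

  // Mirrors the suggested fix: still final, but the map is mutable.
  static class FixedBuilder {
    private final Map<String, Object> data = new HashMap<>();

    FixedBuilder add(String key, Object value) {
      data.put(key, value);
      return this;
    }

    Map<String, Object> build() {
      // Hand out a read-only copy so callers cannot mutate builder state.
      return Collections.unmodifiableMap(new HashMap<>(data));
    }
  }

  public static void main(String[] args) {
    try {
      new BrokenBuilder().add("key", "value");
    } catch (UnsupportedOperationException e) {
      System.out.println("immutable map rejected put");
    }
    System.out.println(new FixedBuilder().add("key", "value").build());
  }
}
```

In both builders the field cannot be reassigned. Only the choice of map implementation decides whether `add` can succeed.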

Contributor

Yeah, good eye. I think we would've caught this with tests when implementing the actual client-side changes for scan planning, since none of the expected context would have been passed through, but it's good to catch this now.

Contributor

@singhpk234 singhpk234 left a comment

LGTM as well, thanks for catching and fixing it, @nastra!
I just have one minor comment on making the map final.

@nastra nastra force-pushed the fix-api-breakage branch from bf311ce to 9a083f2 Compare June 25, 2025 16:06
Contributor

@singhpk234 singhpk234 left a comment

Thanks @nastra, really appreciate it!

Member

@RussellSpitzer RussellSpitzer left a comment

I approve of this, I just was a little curious about the change.

@stevenzwu stevenzwu added this to the Iceberg 1.10.0 milestone Jun 25, 2025
@amogh-jahagirdar
Contributor

Thanks @nastra , thanks @singhpk234 @RussellSpitzer @stevenzwu for reviewing. I'll go ahead and merge

@amogh-jahagirdar amogh-jahagirdar merged commit 4278e91 into apache:main Jun 25, 2025
43 checks passed
@nastra nastra deleted the fix-api-breakage branch June 26, 2025 06:11