Skip to content

Conversation

@SinghAsDev
Copy link
Contributor

As of now, Parquet does not provide builders for Maps and Lists. This leaves margin for user errors. Having Map and List builders will make it easier for users to build these types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add docs for

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, should add docs for all type parameters.

@rdblue
Copy link
Contributor

rdblue commented Apr 28, 2015

Looks like the test failures are caused by incompatible changes flagged by semver. I don't think they are actually incompatible because you moved the methods to a superclass, but we'll need to add a temporary exception here: https://github.com/apache/parquet-mr/blob/master/pom.xml#L240

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think we still need a couple of methods that were overlooked: key(Type) so you can make more complex types than are supported by the builders. Then we will also need value(Type), requiredValue(Type), and optionalValue(Type) on the key builders that set the value type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. Adding them.

@rdblue
Copy link
Contributor

rdblue commented Apr 30, 2015

Super close. I think we just need to add a couple of methods and remove some stuff. Still need to add the methods that take Type objects as I noted in comments. I also think we should remove the nested maps within lists.

I didn't quite understand the key builders within List either, and what made them distinct from the mapElement builder. I don't think they are needed either.

One last thing: can you rebase on master? Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: style is off for these methods: shouldn't have a space before the open paren.

@rdblue
Copy link
Contributor

rdblue commented May 15, 2015

Okay, I've done one more look through. There are two things still holding it up:

  1. element(Type) methods are missing
  2. I think we should remove the keyElement code and not provide convenience methods for lists inside map keys. This ended up being too messy and exposed extra methods on the list builder.

I also noted a few other things that I haven't seen before. Not blockers, but we may as well fix them:

  • All of the inner builder classes have names like MapGroupKeyBuilder. That's okay, but there's no need to prefix the name with Map because there is no other Key. So I think it should be just GroupKeyBuilder. That also fixes the problem of doubled names, like MapMapValueBuilder or ListListElementBuilder.
  • The key, value, and element method names are singular. I noticed this when trying out the element builder: Types.requiredList().requiredElement(INT32).named("myList"). I think the plural, requiredElements would be easier. It's up to you whether we change it or not. Let's just make sure it is consistent between key, value, and element.
  • Class and method ordering could be more consistent for readability (very minor)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just occurred to me that repetition is ignored because the Type includes a repetition. Maybe we don't need requiredElement and optionalElement after all. Same with the key and value methods that take a Type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks again for catching this.

On Sun, May 17, 2015 at 2:07 PM, Ryan Blue notifications@github.com wrote:

In parquet-column/src/main/java/org/apache/parquet/schema/Types.java
#148 (comment):

  •  if (parent != null) {
    
  •    return new ListElementBuilder<P, Q>(this).repetition(repetition);
    
  •  } else {
    
  •    return new ListElementBuilder<P, Q>(this, returnType).repetition(repetition);
    
  •  }
    
  • }
  • public ListElementBuilder<P, Q> requiredListElement() {
  •  return listElement(Type.Repetition.REQUIRED);
    
  • }
  • public ListElementBuilder<P, Q> optionalListElement() {
  •  return listElement(Type.Repetition.OPTIONAL);
    
  • }
  • public BaseListBuilder<P, Q> element(Type type, Type.Repetition repetition) {

It just occurred to me that repetition is ignored because the Type
includes a repetition. Maybe we don't need requiredElement and
optionalElement after all. Same with the key and value methods that take
a Type.


Reply to this email directly or view it on GitHub
https://github.com/apache/parquet-mr/pull/148/files#r30474900.

Regards,
Ashish

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of non-functional changes to the white space that look non-standard.

@asfgit asfgit closed this in dd92a9d May 26, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants