Logical type int{8,16,32} don't work #43

Closed
mia-0032 opened this issue May 7, 2020 · 3 comments · Fixed by #45 or #46
Labels
bug Something isn't working

mia-0032 commented May 7, 2020

Hi,

I found that using the logical types int{8,16,32} causes a java.lang.IllegalStateException: INTEGER(32,true) can only annotate INT32 error.
I think this plugin sets the primitive type to INT64 where it should set it to INT32.

https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/schema/Types.java#L520-L527
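
As a rough illustration of the check that fails here, the sketch below models the rule in plain Java with no dependency on parquet-mr. The class name IntLogicalTypeCheck, the Primitive enum, and both methods are hypothetical and are not the plugin's or parquet-mr's code. Only the rule itself (INT(8), INT(16), and INT(32) annotate INT32, while INT(64) annotates INT64) and the message text are taken from Parquet and from the trace in this report.

```java
public class IntLogicalTypeCheck {
    // Hypothetical stand-in for Parquet's physical (primitive) integer types.
    enum Primitive { INT32, INT64 }

    // Parquet rule: 8-, 16-, and 32-bit integer annotations sit on INT32;
    // only the 64-bit annotation sits on INT64.
    static Primitive requiredPrimitive(int bitWidth) {
        switch (bitWidth) {
            case 8:
            case 16:
            case 32:
                return Primitive.INT32;
            case 64:
                return Primitive.INT64;
            default:
                throw new IllegalArgumentException("Unsupported bit width: " + bitWidth);
        }
    }

    // Mirrors the shape of the failing check: reject an annotation placed
    // on the wrong primitive type, using the message format seen in the trace.
    static void check(int bitWidth, boolean signed, Primitive actual) {
        Primitive required = requiredPrimitive(bitWidth);
        if (actual != required) {
            throw new IllegalStateException(
                "INTEGER(" + bitWidth + "," + signed + ") can only annotate " + required);
        }
    }

    public static void main(String[] args) {
        // Passes: a 32-bit annotation on INT32.
        check(32, true, Primitive.INT32);

        // Fails: a 32-bit annotation on INT64, which is what the report suspects
        // happens when an Embulk long column is given logical_type: int32.
        try {
            check(32, true, Primitive.INT64);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model, the fix would be to pick the primitive type from the requested bit width rather than from the Embulk column type alone.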

My sample config is below:

out:
  type: s3_parquet
  bucket: example-bucket
  path_prefix: embulk/1/hoge.
  file_ext: snappy.parquet
  compression_codec: snappy
  default_timestamp_format: "%Y-%m-%d %H:%M:%S.%3N%z"
  default_timezone: JST
  column_options:
    id:
      logical_type: int32
  type_options:
    timestamp:
      logical_type: timestamp-millis
  canned_acl: private
  auth_method: profile
  region: ap-northeast-1

Error detail:

org.embulk.exec.PartialExecutionException: java.lang.IllegalStateException: INTEGER(32,true) can only annotate INT32
        at org.embulk.exec.BulkLoader$LoaderState.buildPartialExecuteException(BulkLoader.java:340)
        at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:566)
        at org.embulk.exec.BulkLoader.access$000(BulkLoader.java:35)
        at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:353)
        at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:350)
        at org.embulk.spi.Exec.doWith(Exec.java:22)
        at org.embulk.exec.BulkLoader.run(BulkLoader.java:350)
        at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:242)
        at org.embulk.EmbulkRunner.runInternal(EmbulkRunner.java:291)
        at org.embulk.EmbulkRunner.run(EmbulkRunner.java:155)
        at org.embulk.cli.EmbulkRun.runSubcommand(EmbulkRun.java:431)
        at org.embulk.cli.EmbulkRun.run(EmbulkRun.java:90)
        at org.embulk.cli.Main.main(Main.java:64)
Caused by: java.lang.IllegalStateException: INTEGER(32,true) can only annotate INT32
        at org.apache.parquet.Preconditions.checkState(Preconditions.java:89)
        at org.apache.parquet.schema.Types$BasePrimitiveBuilder$1.checkInt32PrimitiveType(Types.java:567)
        at org.apache.parquet.schema.Types$BasePrimitiveBuilder$1.visit(Types.java:526)
        at org.apache.parquet.schema.LogicalTypeAnnotation$IntLogicalTypeAnnotation.accept(LogicalTypeAnnotation.java:739)
        at org.apache.parquet.schema.Types$BasePrimitiveBuilder.build(Types.java:445)
        at org.apache.parquet.schema.Types$BasePrimitiveBuilder.build(Types.java:336)
        at org.apache.parquet.schema.Types$Builder.named(Types.java:314)
        at org.embulk.output.s3_parquet.parquet.IntLogicalTypeHandler.newSchemaFieldType(LogicalTypeHandler.scala:35)
        at org.embulk.output.s3_parquet.parquet.EmbulkMessageType$EmbulkMessageTypeColumnVisitor.addTypeByLogicalTypeHandlerOrDefault(EmbulkMessageType.scala:58)
        at org.embulk.output.s3_parquet.parquet.EmbulkMessageType$EmbulkMessageTypeColumnVisitor.longColumn(EmbulkMessageType.scala:76)
        at org.embulk.spi.Column.visit(Column.java:48)
        at org.embulk.spi.Schema.visitColumns(Schema.java:68)
        at org.embulk.output.s3_parquet.parquet.EmbulkMessageType$Builder.build(EmbulkMessageType.scala:38)
        at org.embulk.output.s3_parquet.parquet.ParquetFileWriteSupport.init(ParquetFileWriteSupport.scala:25)
        at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:277)
        at org.apache.parquet.hadoop.ParquetWriter$Builder.build(ParquetWriter.java:569)
        at org.embulk.output.s3_parquet.S3ParquetOutputPlugin.$anonfun$open$1(S3ParquetOutputPlugin.scala:331)
        at org.embulk.output.s3_parquet.ContextClassLoaderSwapper$.using(ContextClassLoaderSwapper.scala:11)
        at org.embulk.output.s3_parquet.ContextClassLoaderSwapper$.usingPluginClass(ContextClassLoaderSwapper.scala:16)
        at org.embulk.output.s3_parquet.S3ParquetOutputPlugin.open(S3ParquetOutputPlugin.scala:332)
        at org.embulk.spi.util.Executors.process(Executors.java:51)
        at org.embulk.spi.util.Executors.process(Executors.java:38)
        at org.embulk.exec.LocalExecutorPlugin$DirectExecutor$1.call(LocalExecutorPlugin.java:170)
        at org.embulk.exec.LocalExecutorPlugin$DirectExecutor$1.call(LocalExecutorPlugin.java:167)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
        Suppressed: java.lang.IllegalStateException: INTEGER(32,true) can only annotate INT32
                ... 28 more

Error: java.lang.IllegalStateException: INTEGER(32,true) can only annotate INT32

civitaspo (Owner) commented

Thanks for the report. I'll get it fixed soon, or a pull request from you would be appreciated :)

@civitaspo civitaspo added the bug Something isn't working label May 9, 2020
@civitaspo civitaspo mentioned this issue May 25, 2020

civitaspo (Owner) commented

@mia-0032 I fixed the issue in v0.5.0. Would you mind checking it out?

mia-0032 (Author) commented

@civitaspo I tried it and it worked well. Thank you so much.
