[QUESTION] Unable to create Hudi table #5167
Describe the bug
When I create a table on Hudi, I get the error shown in the engine log below.
I didn't set hive.metastore.uris in my Spark conf, so I wonder whether a Hive environment is necessary.

Affects Version(s)
1.7.1/1.7.0

Kyuubi Server Log Output
2023-08-15 16:53:28.098 INFO org.apache.kyuubi.operation.ExecuteStatement: Processing hudi's query[6df1e6ac-e4d3-4383-82d2-1c6691c91a8d]: PENDING_STATE -> RUNNING_STATE, statement:
CREATE TABLE hudi_cow_nonpcf_tbl (
uuid INT,
name STRING,
price DOUBLE
) USING HUDI
2023-08-15 16:53:33.106 INFO org.apache.kyuubi.operation.ExecuteStatement: Query[6df1e6ac-e4d3-4383-82d2-1c6691c91a8d] in RUNNING_STATE
2023-08-15 16:53:38.113 INFO org.apache.kyuubi.operation.ExecuteStatement: Query[6df1e6ac-e4d3-4383-82d2-1c6691c91a8d] in RUNNING_STATE
2023-08-15 16:53:43.119 INFO org.apache.kyuubi.operation.ExecuteStatement: Query[6df1e6ac-e4d3-4383-82d2-1c6691c91a8d] in RUNNING_STATE
2023-08-15 16:53:43.122 INFO org.apache.kyuubi.operation.ExecuteStatement: Processing hudi's query[6df1e6ac-e4d3-4383-82d2-1c6691c91a8d]: RUNNING_STATE -> CANCELED_STATE, time taken: 15.024 seconds
2023-08-15 16:53:48.124 INFO org.apache.kyuubi.operation.ExecuteStatement: Query[6df1e6ac-e4d3-4383-82d2-1c6691c91a8d] in RUNNING_STATE
2023-08-15 16:53:48.135 INFO org.apache.kyuubi.client.KyuubiSyncThriftClient: TCancelOperationReq(operationHandle:TOperationHandle(operationId:THandleIdentifier(guid:60 7D A3 0F DD C1 40 BD B9 CD A8 8E C9 76 99 8C, secret:C2 EE 5B 97 3E A0 41 FC AC 16 9B D7 08 ED 8F 38), operationType:EXECUTE_STATEMENT, hasResultSet:true)) succeed on engine side
2023-08-15 16:53:48.143 INFO org.apache.kyuubi.operation.ExecuteStatement: Query[6df1e6ac-e4d3-4383-82d2-1c6691c91a8d] in CANCELED_STATE
2023-08-15 16:53:48.153 WARN org.apache.kyuubi.operation.ExecuteStatement: Ignore exception in terminal state with 6df1e6ac-e4d3-4383-82d2-1c6691c91a8d

Kyuubi Engine Log Output
Caused by: ERROR XJ040: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@2222ea4, see the next exception for details.
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source)
... 133 more
Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /opt/apache-kyuubi-1.7.1-bin/work/hudi/metastore_db.
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.privGetJBMSLockOnDB(Unknown Source)
at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.getJBMSLockOnDB(Unknown Source)
at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
at org.apache.derby.impl.services.monitor.FileMonitor.startModule(Unknown Source)
at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
at org.apache.derby.impl.store.raw.RawStore$6.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.store.raw.RawStore.bootServiceModule(Unknown Source)
at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
at org.apache.derby.impl.services.monitor.FileMonitor.startModule(Unknown Source)
at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
at org.apache.derby.impl.store.access.RAMAccessManager$5.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.store.access.RAMAccessManager.bootServiceModule(Unknown Source)
at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
at org.apache.derby.impl.services.monitor.FileMonitor.startModule(Unknown Source)
at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
at org.apache.derby.impl.db.BasicDatabase$5.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.db.BasicDatabase.bootServiceModule(Unknown Source)
at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown Source)
at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedConnection$4.run(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedConnection$4.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.jdbc.EmbedConnection.startPersistentService(Unknown Source)
... 130 more

Kyuubi Server Configurations
kyuubi.engine.single.spark.session=true
spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog
spark.hoodie.schema.on.read.enable=true
spark.hoodie.datasource.write.reconcile.schema=true
spark.hoodie.datasource.write.schema.allow.auto.evolution.column.drop=true

Kyuubi Engine Configurations
No response

Additional context
Spark version 3.2.4-bin-hadoop3.2

Replies: 3 comments 1 reply

Hello @RrazzmatazZ,

The error message indicates that you are using the embedded Hive Metastore backed by a Derby database. That works when you use vanilla Spark to access Hive tables, since there is only one HiveClient instance in the Spark driver process. With Hudi or Iceberg, however, multiple HiveClient instances may be created, and embedded Derby does not support that: only one instance can boot the database at a time, which is what the XSDB6 error reports. Please set up a dedicated Hive Metastore service instead of using the embedded one.
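
As a rough sketch of what pointing the engine at a dedicated metastore might look like, the lines below could be added to the Spark configuration (for example spark-defaults.conf or kyuubi-defaults.conf). The host name metastore-host is a placeholder and not a value from this thread, and 9083 is only the conventional default port, so both need to be adjusted to the actual deployment.

```
# Assumption: a standalone Hive Metastore service is already running and reachable.
# "metastore-host" is a hypothetical host name; replace it and the port as needed.
spark.sql.catalogImplementation=hive
spark.hadoop.hive.metastore.uris=thrift://metastore-host:9083
```

With a remote metastore configured this way, the engine should no longer try to boot the local metastore_db Derby directory.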

Try spark.sql.catalogImplementation=in-memory
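
If a metastore service is not needed, this suggestion might be applied as sketched here. It assumes the setting goes into the Spark configuration before the engine starts, because spark.sql.catalogImplementation is a static configuration and cannot be changed in a running session. Keep in mind that the in-memory catalog does not persist table metadata once the engine stops.

```
# Assumption: set before engine launch, next to the existing Hudi settings.
# With the in-memory catalog, Spark should not initialize the embedded Derby metastore.
spark.sql.catalogImplementation=in-memory
```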