
Trino access to Aliyun OSS results in error: 'failed: No factory for location: oss://bucket-name/path' #23740

XiangyuFan17 opened this issue Oct 10, 2024 · 6 comments


@XiangyuFan17

Hello there, I'm trying to use Trino to access a data warehouse built on Aliyun OSS.
Trino is deployed on Kubernetes, and the hive.properties part of catalog.yaml is configured as below:

connector.name=hive
hive.metastore.uri=thrift://${thrift-server-host}:30083
fs.native-s3.enabled=true
fs.hadoop.enabled=false
s3.endpoint=http://oss-cn-shanghai.aliyuncs.com
s3.region=oss-cn-shanghai
s3.aws-access-key=${my key}
s3.aws-secret-key=${my key}

Executing SHOW TABLES works fine, but when I try to read data from a specific table, something goes wrong:

Query 20241010_103922_00003_dt8h9 failed: No factory for location:


As far as I understand, Trino should interact with OSS the same way it does with AWS S3. Can anyone please help? Thanks a lot.

@rvishureddy

rvishureddy commented Oct 21, 2024

I have the same issue on Trino version 462, but with S3:

Caused by: java.lang.IllegalArgumentException: No factory for location: s3a://Bucket NAME/metadata/13665-29244414-57ea-433d-a4c2-76d6fe0c48a2.metadata.json

catalogs:
  hive: |
    connector.name=hive
    fs.native-s3.enabled=true
    hive.metastore.uri=thrift://hive.spark:9083
    hive.non-managed-table-writes-enabled=true
    hive.max-partitions-per-writers=500
    hive.orc.time-zone=UTC
    hive.parquet.time-zone=UTC
    hive.rcfile.time-zone=UTC
    hive.orc.bloom-filters.enabled=true
    hive.metastore.thrift.client.connect-timeout=2000s
    hive.metastore.thrift.client.read-timeout=2000s

Error Type : INTERNAL_ERROR
Error Code : GENERIC_INTERNAL_ERROR (65536)
Stack Trace :

 io.trino.spi.TrinoException: Error processing metadata for table <tablename>
	at io.trino.plugin.iceberg.IcebergExceptions.translateMetadataException(IcebergExceptions.java:54)
        at io.trino.plugin.iceberg.catalog.AbstractIcebergTableOperations.refreshFromMetadataLocation(AbstractIcebergTableOperations.java:272)
	at io.trino.plugin.iceberg.catalog.AbstractIcebergTableOperations.refreshFromMetadataLocation(AbstractIcebergTableOperations.java:239)
	at io.trino.plugin.iceberg.catalog.AbstractIcebergTableOperations.refresh(AbstractIcebergTableOperations.java:140)
	at io.trino.plugin.iceberg.catalog.AbstractIcebergTableOperations.current(AbstractIcebergTableOperations.java:123)
	at io.trino.plugin.iceberg.catalog.hms.TrinoHiveCatalog.lambda$loadTable$11(TrinoHiveCatalog.java:448)
	at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4903)
	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3574)
	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2316)
	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2189)
	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2079)
	at com.google.common.cache.LocalCache.get(LocalCache.java:4017)
	at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4898)
	at io.trino.cache.EvictableCache.get(EvictableCache.java:118)
	at io.trino.cache.CacheUtils.uncheckedCacheGet(CacheUtils.java:39)
	at io.trino.plugin.iceberg.catalog.hms.TrinoHiveCatalog.loadTable(TrinoHiveCatalog.java:445)
	at io.trino.plugin.iceberg.IcebergMetadata.getTableHandle(IcebergMetadata.java:472)
	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorMetadata.getTableHandle(ClassLoaderSafeConnectorMetadata.java:1237)
	at io.trino.tracing.TracingConnectorMetadata.getTableHandle(TracingConnectorMetadata.java:142)
	at io.trino.metadata.MetadataManager.lambda$getTableHandle$5(MetadataManager.java:293)
	at java.base/java.util.Optional.flatMap(Optional.java:289)
	at io.trino.metadata.MetadataManager.getTableHandle(MetadataManager.java:284)
	at io.trino.metadata.MetadataManager.getRedirectionAwareTableHandle(MetadataManager.java:1947)
	at io.trino.metadata.MetadataManager.getRedirectionAwareTableHandle(MetadataManager.java:1939)
	at io.trino.tracing.TracingMetadata.getRedirectionAwareTableHandle(TracingMetadata.java:1494)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.getTableHandle(StatementAnalyzer.java:5842)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitTable(StatementAnalyzer.java:2291)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitTable(StatementAnalyzer.java:520)
	at io.trino.sql.tree.Table.accept(Table.java:60)
	at io.trino.sql.tree.AstVisitor.process(AstVisitor.java:27)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:539)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.analyzeFrom(StatementAnalyzer.java:4891)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:3091)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:520)
	at io.trino.sql.tree.QuerySpecification.accept(QuerySpecification.java:155)
	at io.trino.sql.tree.AstVisitor.process(AstVisitor.java:27)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:539)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:547)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:1562)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:520)
	at io.trino.sql.tree.Query.accept(Query.java:119)
	at io.trino.sql.tree.AstVisitor.process(AstVisitor.java:27)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:539)
	at io.trino.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:499)
	at io.trino.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:488)
	at io.trino.sql.analyzer.Analyzer.analyze(Analyzer.java:98)
	at io.trino.sql.analyzer.Analyzer.analyze(Analyzer.java:87)
	at io.trino.execution.SqlQueryExecution.analyze(SqlQueryExecution.java:289)
	at io.trino.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:222)
	at io.trino.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:892)
	at io.trino.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:153)
	at io.trino.$gen.Trino_462____20241021_184613_2.call(Unknown Source)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1575)
Caused by: java.lang.IllegalArgumentException: No factory for location: s3a://My Bucket NAME/metadata/13665-29244414-57ea-433d-a4c2-76d6fe0c48a2.metadata.json
	at io.trino.filesystem.manager.FileSystemModule.lambda$createFileSystemFactory$2(FileSystemModule.java:149)
	at java.base/java.util.Optional.orElseThrow(Optional.java:403)
	at io.trino.filesystem.manager.FileSystemModule.lambda$createFileSystemFactory$3(FileSystemModule.java:149)
	at io.trino.filesystem.switching.SwitchingFileSystem.fileSystem(SwitchingFileSystem.java:194)
	at io.trino.filesystem.switching.SwitchingFileSystem.newInputFile(SwitchingFileSystem.java:60)
	at io.trino.filesystem.tracing.TracingFileSystem.newInputFile(TracingFileSystem.java:51)
	at io.trino.filesystem.cache.CacheFileSystem.newInputFile(CacheFileSystem.java:49)
	at io.trino.plugin.iceberg.fileio.ForwardingFileIo.newInputFile(ForwardingFileIo.java:60)
	at io.trino.plugin.iceberg.catalog.AbstractIcebergTableOperations.lambda$refreshFromMetadataLocation$1(AbstractIcebergTableOperations.java:241)
	at io.trino.plugin.iceberg.catalog.AbstractIcebergTableOperations.lambda$refreshFromMetadataLocation$3(AbstractIcebergTableOperations.java:266)
	at dev.failsafe.Functions.lambda$toCtxSupplier$11(Functions.java:243)
	at dev.failsafe.Functions.lambda$get$0(Functions.java:46)
	at dev.failsafe.internal.RetryPolicyExecutor.lambda$apply$0(RetryPolicyExecutor.java:74)
	at dev.failsafe.SyncExecutionImpl.executeSync(SyncExecutionImpl.java:187)
	at dev.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:376)
	at dev.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:112)
	at io.trino.plugin.iceberg.catalog.AbstractIcebergTableOperations.refreshFromMetadataLocation(AbstractIcebergTableOperations.java:266)


@rvishureddy

Got this working
@XiangyuFan17

At a minimum, each Delta Lake, Hive or Hudi object storage catalog file must set the hive.metastore configuration property to define the type of metastore to use. Iceberg catalogs instead use the iceberg.catalog.type configuration property to define the type of metastore to use.

Go through this carefully: https://trino.io/docs/current/object-storage/metastores.html#hive-thrift-metastore

For Hive:

  hive: |
    connector.name=hive
    hive.metastore=thrift
    fs.native-s3.enabled=true
    hive.metastore.uri=thrift://hive.spark:9083

For me, the Iceberg configuration below worked:

  iceberg: |
    connector.name=iceberg
    fs.native-s3.enabled=true
    iceberg.catalog.type=hive_metastore
    hive.metastore.uri=thrift://hive.spark:9083
    iceberg.file-format=orc
    iceberg.compression-codec=zstd
    hive.orc.bloom-filters.enabled=true
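
After restarting Trino with the updated catalog, a quick check along these lines confirms the catalog can actually read table data (the schema and table names below are placeholders):

    -- list tables through the Iceberg catalog, then read a few rows
    SHOW TABLES FROM iceberg.default;
    SELECT * FROM iceberg.default.my_table LIMIT 10;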

@rvishureddy

rvishureddy commented Oct 21, 2024

If you are using the Thrift protocol, configure it as shown above.

But if you are using HTTP or HTTPS, read the respective sections in the link provided above.

@sar009

sar009 commented Oct 25, 2024

I see a similar error with the Iceberg connector using both the REST and Glue catalogs. I used the following config for Glue:

connector.name=iceberg
iceberg.catalog.type=glue
iceberg.file-format=parquet
hive.metastore.glue.region=us-east-1
hive.metastore.glue.default-warehouse-dir=s3://mybucket/test/
hive.metastore.glue.aws-access-key=abcd
hive.metastore.glue.aws-secret-key=abcd

and the following for REST:

connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=http://rest:8181/

The error is:

No factory for location: s3://mybucket/test/taxis-bec5eb7e34844c76ad34d2c87558813f

I tried various versions and found that the problem started in version 458.

@hendoxc

hendoxc commented Dec 3, 2024

Using local MinIO + the Iceberg REST catalog:

connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=http://rest:8181
iceberg.rest-catalog.warehouse=s3://demo-iceberg/
iceberg.file-format=PARQUET
fs.native-s3.enabled=true
s3.endpoint=http://minio:9000
s3.path-style-access=true

This works for me. Using s3.endpoint=http://minio:9000 alone gave me errors in Trino until I added fs.native-s3.enabled=true.

@hashhar
Member

hashhar commented Dec 3, 2024

This is only relevant to the comments from @hendoxc and @sar009:

Please read the docs and release notes. This behavior changed in 458; see the breaking changes notes in https://trino.io/docs/current/release/release-458.html#iceberg-connector

Also see the docs at https://trino.io/docs/current/object-storage.html#configuration
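
For example (a sketch only, based on @sar009's Glue config above and not verified against that setup): since 458 an object storage catalog has to enable a file system explicitly, so the Glue catalog would look something like this:

connector.name=iceberg
iceberg.catalog.type=glue
iceberg.file-format=parquet
# enable the native S3 file system explicitly (required since 458)
fs.native-s3.enabled=true
hive.metastore.glue.region=us-east-1
hive.metastore.glue.default-warehouse-dir=s3://mybucket/test/
hive.metastore.glue.aws-access-key=abcd
hive.metastore.glue.aws-secret-key=abcd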

For the original reporter: it seems your metastore has the locations stored as oss://<path>, and if that's indeed the case it won't work, since Trino doesn't ship with a file system implementation for Aliyun OSS. If OSS provides an S3-compatible API, you can try a configuration similar to the one shared by @hendoxc in the comment above, pointing s3.endpoint at the Aliyun OSS endpoint. However, note that it may or may not work depending on how complete the Aliyun OSS implementation of the S3 API is.
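
One way to confirm what the metastore has stored (the table name below is a placeholder; for Hive external tables the location shows up as the external_location table property):

-- if the location here (or in the error message) starts with oss:// rather than s3://,
-- none of the shipped file system factories will match it
SHOW CREATE TABLE hive.default.my_table;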
