
[BUG] srcUri is decoded and not encoded back for rename API (ADLS Gen2) #8761

Closed
3 tasks
snleee opened this issue Mar 5, 2020 · 4 comments · Fixed by #8887
Assignees
Labels
Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. Data Lake Storage Gen2 Storage Storage Service (Queues, Blobs, Files)

Comments


snleee commented Mar 5, 2020

Describe the bug
Let's go over an example: there is a file named segment1 %, and I want to rename it to segment1 on ADLS Gen2 using the Java SDK.

Because the Java SDK usually takes the URL-encoded path, I was calling the following:

...
// provide the url encoded file name
DataLakeFileClient fileClient = fileSystemClient.getFileClient("a/segment_1%20%25");

// I was able to fetch the corresponding path properties
fileClient.getProperties();  

// try to rename
fileClient.rename(null, "a/segment1");

Exception or Stack Trace

com.azure.storage.file.datalake.models.DataLakeStorageException: Status code 400, "{"error":{"code":"InvalidSourceUri","message":"The source URI is invalid.\nRequestId:013d66ea-501f-00b8-2dc3-f203d8000000\nTime:2020-03-05T07:55:15.1649477Z"}}"

	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at com.azure.core.http.rest.RestProxy.instantiateUnexpectedException(RestProxy.java:357)
	at com.azure.core.http.rest.RestProxy.lambda$ensureExpectedStatus$3(RestProxy.java:400)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:118)
	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1592)
	at reactor.core.publisher.MonoProcessor.onNext(MonoProcessor.java:317)
	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1592)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:144)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:121)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:121)
	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1592)
	at reactor.core.publisher.MonoCollect$CollectSubscriber.onComplete(MonoCollect.java:145)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onComplete(FluxMapFuseable.java:144)
	at reactor.core.publisher.FluxReplay$UnboundedReplayBuffer.replayNormal(FluxReplay.java:551)
	at reactor.core.publisher.FluxReplay$UnboundedReplayBuffer.replay(FluxReplay.java:654)
	at reactor.core.publisher.FluxReplay.subscribeOrReturn(FluxReplay.java:1096)
	at reactor.core.publisher.FluxReplay.subscribe(FluxReplay.java:1064)
	at reactor.core.publisher.FluxAutoConnectFuseable.subscribe(FluxAutoConnectFuseable.java:60)
	at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55)
	at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
	at reactor.core.publisher.Mono.subscribe(Mono.java:4087)
	at reactor.core.publisher.MonoProcessor.add(MonoProcessor.java:457)
	at reactor.core.publisher.MonoProcessor.subscribe(MonoProcessor.java:370)
	at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:150)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:121)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:121)
	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1592)
	at reactor.core.publisher.MonoCollect$CollectSubscriber.onComplete(MonoCollect.java:145)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onComplete(FluxMapFuseable.java:144)
	at reactor.core.publisher.FluxReplay$UnboundedReplayBuffer.replayNormal(FluxReplay.java:551)
	at reactor.core.publisher.FluxReplay$UnboundedReplayBuffer.replay(FluxReplay.java:654)
	at reactor.core.publisher.FluxReplay$ReplaySubscriber.onComplete(FluxReplay.java:1218)
	at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:136)
	at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onComplete(FluxDoFinally.java:138)
	at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:136)
	at reactor.netty.channel.FluxReceive.terminateReceiver(FluxReceive.java:397)
	at reactor.netty.channel.FluxReceive.drainReceiver(FluxReceive.java:197)
	at reactor.netty.channel.FluxReceive.onInboundComplete(FluxReceive.java:345)
	at reactor.netty.channel.ChannelOperations.onInboundComplete(ChannelOperations.java:363)
	at reactor.netty.channel.ChannelOperations.terminate(ChannelOperations.java:412)
	at reactor.netty.http.client.HttpClientOperations.onInboundNext(HttpClientOperations.java:556)
	at reactor.netty.channel.ChannelOperationsHandler.channelRead(ChannelOperationsHandler.java:91)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:438)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:328)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:302)
	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:253)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1475)
	at io.netty.handler.ssl.SslHandler.decodeNonJdkCompatible(SslHandler.java:1236)
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1273)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:505)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:283)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)
	at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:745)
	Suppressed: java.lang.Exception: #block terminated with an error
		at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:93)
		at reactor.core.publisher.Mono.block(Mono.java:1663)
		at com.azure.storage.common.implementation.StorageImplUtils.blockWithOptionalTimeout(StorageImplUtils.java:99)
		at com.azure.storage.file.datalake.DataLakeFileClient.renameWithResponse(DataLakeFileClient.java:367)
		at com.azure.storage.file.datalake.DataLakeFileClient.rename(DataLakeFileClient.java:335)
		at org.apache.pinot.plugin.filesystem.test.AzureGen2PinotFSTest.testF2S(AzureGen2PinotFSTest.java:197)
		at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
		at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
		at java.lang.reflect.Method.invoke(Method.java:498)
		at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:108)
		at org.testng.internal.Invoker.invokeMethod(Invoker.java:661)
		at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:869)
		at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1193)
		at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:126)
		at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
		at org.testng.TestRunner.privateRun(TestRunner.java:744)
		at org.testng.TestRunner.run(TestRunner.java:602)
		at org.testng.SuiteRunner.runTest(SuiteRunner.java:380)
		at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:375)
		at org.testng.SuiteRunner.privateRun(SuiteRunner.java:340)
		at org.testng.SuiteRunner.run(SuiteRunner.java:289)
		at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
		at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
		at org.testng.TestNG.runSuitesSequentially(TestNG.java:1301)
		at org.testng.TestNG.runSuitesLocally(TestNG.java:1226)
		at org.testng.TestNG.runSuites(TestNG.java:1144)
		at org.testng.TestNG.run(TestNG.java:1115)
		at org.testng.IDEARemoteTestNG.run(IDEARemoteTestNG.java:73)
		at org.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:123)

To Reproduce
Steps to reproduce the behavior:

  1. Create the file segment1 % under the directory /a using Storage Explorer. (/a/segment1 %)
  2. Run the following code:
...
DataLakeServiceClient serviceClient = builder.buildClient();
DataLakeFileSystemClient fileSystemClient = serviceClient.getFileSystemClient("<file_system_name>");
DataLakeFileClient fileClient = fileSystemClient.getFileClient("a/segment_1%20%25");
fileClient.getProperties();
fileClient.rename(null, "a/segment1");

Code Snippet

In the rename code path, the source path is rebuilt from dataLakePathAsyncClient.getObjectPath(), which returns the decoded path. So eventually a/segment_1 % is passed to the API, which ends up throwing the InvalidSourceUri exception.
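The decode step can be reproduced with the JDK's own codec classes. This standalone sketch (plain java.net.URLDecoder, not the SDK's internal helpers) shows what the decoded path looks like, and why passing it on unchanged produces an invalid URI component:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Resolve percent-escapes the way the decoded object path ends up resolved.
    public static String decode(String encodedPath) {
        return URLDecoder.decode(encodedPath, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = "a/segment_1%20%25";
        String decoded = decode(encoded);
        // The decoded form contains a raw space and a raw '%'. Sent as the
        // rename source without re-encoding, this is not a valid URI and the
        // service rejects it with InvalidSourceUri.
        System.out.println(decoded); // a/segment_1 %
    }
}
```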

Expected behavior
The rename should succeed. The line above needs to be changed from:

dataLakePathAsyncClient.getObjectPath()

to:

Utility.urlEncode(dataLakePathAsyncClient.getObjectPath())
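A path-safe re-encoding step can be sketched with a hand-rolled percent-encoder. Note that java.net.URLEncoder is not a drop-in replacement here, because it form-encodes spaces as '+'. This is an illustration only; Utility.urlEncode is the SDK's internal helper, and its exact implementation is not shown in this issue:

```java
import java.nio.charset.StandardCharsets;

public class PathEncodeDemo {
    // Minimal percent-encoder for illustration: escapes every byte except
    // RFC 3986 unreserved characters and the path separator '/'.
    public static String encodePath(String rawPath) {
        StringBuilder sb = new StringBuilder();
        for (byte b : rawPath.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xff);
            boolean unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
            if (unreserved || c == '/') {
                sb.append(c);
            } else {
                sb.append(String.format("%%%02X", b & 0xff)); // e.g. ' ' -> %20, '%' -> %25
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Re-encoding the decoded object path restores a valid rename source.
        System.out.println(encodePath("a/segment_1 %")); // a/segment_1%20%25
    }
}
```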


Setup (please complete the following information):

  • OS: [e.g. iOS]
  • IDE : [e.g. IntelliJ]
  • Version of the Library used

Additional context
Add any other context about the problem here.

Information Checklist
Please make sure that you have added all of the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.

  • Bug Description Added
  • Repro Steps Added
  • Setup information Added
@gapra-msft
Member

Hi @snleee

Thank you for reporting this issue. I'll take a look and see what could be causing it.

@gapra-msft gapra-msft self-assigned this Mar 5, 2020
@gapra-msft gapra-msft added Client This issue points to a problem in the data-plane of the library. Data Lake Storage Gen2 Storage Storage Service (Queues, Blobs, Files) labels Mar 5, 2020
@triage-new-issues triage-new-issues bot removed the triage label Mar 5, 2020
@gapra-msft gapra-msft added the customer-reported Issues that are reported by GitHub users external to the Azure organization. label Mar 5, 2020
@rickle-msft
Contributor

For the record, there's a fairly specific system for this in blobs, and I think we even had to fix a bug around customers wanting % in the blob name, so we should check what patterns/history we have there.

@snleee
Author

snleee commented Mar 6, 2020

@gapra-msft @rickle-msft Thank you for the update. Could you give us an idea of the timeline for fixing the issue and making the new library version publicly available?

@rickle-msft
Contributor

My best estimate is that we can have this ready for the April release, which would be about this time next month, though I can't guarantee we will hit that target.

snleee pushed a commit to snleee/pinot that referenced this issue Mar 12, 2020
1. Testing has been done by attaching ADLS Gen2 to the local deployment.
2. move() is implemented as copy & delete because of the Azure SDK issue with the rename() API.
   Azure/azure-sdk-for-java#8761
snleee pushed a commit to apache/pinot that referenced this issue Mar 17, 2020
* Add Azure Data Lake Gen2 connector for PinotFS

1. Testing has been done by attaching ADLS Gen2 to the local deployment.
2. move() is implemented as copy & delete because of the Azure SDK issue with the rename() API.
   Azure/azure-sdk-for-java#8761

* Addressing comments
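The copy-and-delete workaround mentioned in these commits can be sketched against a hypothetical in-memory file store. FileStore and its method names are illustrative stand-ins, not Pinot's actual AzureGen2PinotFS or the Azure SDK API:

```java
import java.util.HashMap;
import java.util.Map;

public class CopyDeleteMove {
    // Hypothetical minimal store; real code would wrap DataLakeFileClient calls.
    public static class FileStore {
        private final Map<String, byte[]> files = new HashMap<>();
        public byte[] read(String path) { return files.get(path); }
        public void write(String path, byte[] data) { files.put(path, data); }
        public void delete(String path) { files.remove(path); }
        public boolean exists(String path) { return files.containsKey(path); }
    }

    // move() implemented as copy + delete, sidestepping the broken rename() API.
    public static void move(FileStore store, String src, String dst) {
        store.write(dst, store.read(src)); // copy first so a failure never loses data
        store.delete(src);                 // remove the source only after the copy
    }

    public static void main(String[] args) {
        FileStore store = new FileStore();
        store.write("a/segment_1 %", new byte[]{1, 2, 3});
        move(store, "a/segment_1 %", "a/segment1");
        System.out.println(store.exists("a/segment1")); // true
    }
}
```

Copy-then-delete is not atomic like a true rename, but it avoids ever sending the problematic decoded path as a rename source.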
@github-actions github-actions bot locked and limited conversation to collaborators Apr 12, 2023