We get the following exception when we call the Databricks C# client to list scopes. #27

Closed
vinothganesh opened this issue Jan 28, 2020 · 6 comments

@vinothganesh

The operation was canceled.
System.IO.IOException: Unable to read data from the transport connection: Operation canceled.
System.Net.Sockets.SocketException (125): Operation canceled
--- End of inner exception stack trace ---
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.GetResult(Int16 token)
at System.Net.Security.SslStream.<FillBufferAsync>g__InternalFillBufferAsync|215_0[TReadAdapter](TReadAdapter adap, ValueTask`1 task, Int32 min, Int32 initial)
at System.Net.Security.SslStream.ReadAsyncInternal[TReadAdapter](TReadAdapter adapter, Memory`1 buffer)
at System.Net.Http.HttpConnection.FillAsync()
at System.Net.Http.HttpConnection.ReadNextResponseHeaderLineAsync(Boolean foldedHeadersAllowed)
at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithNtConnectionAuthAsync(HttpConnection connection, HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.DecompressionHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at Microsoft.Azure.Databricks.Client.TimeoutHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
at Microsoft.Azure.Databricks.Client.ApiClient.HttpGet[T](HttpClient httpClient, String requestUri)
at Microsoft.Azure.Databricks.Client.SecretsApiClient.ListScopes()

@michael-damatov

We see a similar exception. According to our logs, it is thrown after waiting 30 seconds for a response:

System.Threading.Tasks.TaskCanceledException: The operation was canceled.
---> System.IO.IOException: Unable to read data from the transport connection: Operation canceled.
---> System.Net.Sockets.SocketException (125): Operation canceled
--- End of inner exception stack trace ---
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.GetResult(Int16 token)
at System.Net.Security.SslStream.<FillBufferAsync>g__InternalFillBufferAsync|215_0[TReadAdapter](TReadAdapter adap, ValueTask`1 task, Int32 min, Int32 initial)
at System.Net.Security.SslStream.ReadAsyncInternal[TReadAdapter](TReadAdapter adapter, Memory`1 buffer)
at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithNtConnectionAuthAsync(HttpConnection connection, HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.DecompressionHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.DiagnosticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at Microsoft.Azure.Databricks.Client.TimeoutHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
at Microsoft.Azure.Databricks.Client.ApiClient.HttpGet[T](HttpClient httpClient, String requestUri)
at Microsoft.Azure.Databricks.Client.JobsApiClient.RunsGet(Int64 runId)
...

@vinothganesh
Author

@michael-damatov, I spoke with the Databricks team. They suggest increasing the timeout, which currently defaults to 30 seconds:

public static DatabricksClient CreateClient(string baseUrl, string token, long timeoutSeconds = 30)

Try increasing the timeout from 30 seconds to 1 minute and see how it performs.

@memoryz
Contributor

memoryz commented Feb 5, 2020

@michael-damatov, I debugged this with @vinothganesh for a while last week, and I strongly believe the error is caused by an HTTP timeout. When an HTTP timeout happens, the HttpClient class throws TaskCanceledException. The Databricks client tries to handle the timeout by converting that exception into a TimeoutException, but the implementation is incomplete. After some thought, I decided that the Databricks client should NOT change the exception type to TimeoutException and should instead let the application layer handle the exception on its own. However, I left the current implementation in place to avoid a breaking change in behavior. I may remove the timeout-handling logic in a later release.
The bottom line: I believe the exception you see is caused by a timeout, and increasing the default timeout and/or implementing retries should solve your problem.
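To illustrate the behavior described above, here is a minimal sketch that uses only the .NET base class library and no Databricks code. `DelayingHandler`, `TimeoutDemo`, and `ProbeAsync` are hypothetical names invented for this example; the handler stands in for a server that is slow to answer. The sketch catches `OperationCanceledException`, the base class of `TaskCanceledException`, so that it does not depend on the exact exception type a given runtime version surfaces.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a slow server: waits `delay` before answering.
sealed class DelayingHandler : HttpMessageHandler
{
    private readonly TimeSpan _delay;

    public DelayingHandler(TimeSpan delay) => _delay = delay;

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // HttpClient cancels this token when its Timeout elapses.
        await Task.Delay(_delay, cancellationToken);
        return new HttpResponseMessage(HttpStatusCode.OK);
    }
}

static class TimeoutDemo
{
    // Returns "completed" if the fake server answers in time, "timed out" otherwise.
    public static async Task<string> ProbeAsync(TimeSpan serverDelay, TimeSpan clientTimeout)
    {
        using var http = new HttpClient(new DelayingHandler(serverDelay))
        {
            Timeout = clientTimeout
        };

        try
        {
            // The URL is never resolved; the handler above answers every request.
            using var response = await http.GetAsync("https://example.invalid/");
            return "completed";
        }
        catch (OperationCanceledException)
        {
            // A client-side timeout arrives here as a cancellation
            // (TaskCanceledException derives from OperationCanceledException).
            return "timed out";
        }
    }
}
```

Calling `TimeoutDemo.ProbeAsync(TimeSpan.FromSeconds(5), TimeSpan.FromMilliseconds(100))` takes the `catch` branch, which is the same shape of failure an application would see when a request outlives the configured timeout.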

@michael-damatov

Thanks, this approach (increase the timeout + retry) seems reasonable.

However, this means that Databricks just "eats" the request without returning any response, so I'm curious why this happens.

@memoryz
Contributor

memoryz commented Feb 5, 2020

> Thanks, this approach (increase the timeout + retry) seems reasonable.
>
> However, this means that Databricks just "eats" the request without returning any response, so I'm curious why this happens.

The problem may not be on the Databricks side (I'm not saying it isn't, but I don't work for Databricks or the Azure Databricks team, so I cannot say for sure); it could be at any intermediate node in the network.

In our own client application, we also ran into all sorts of transient errors. After we used polly.net to handle and retry these errors, the application's resiliency improved greatly.
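As a rough sketch of the retry idea, the helper below retries an operation with exponential backoff using only the .NET base class library. It is not Polly's API and not part of the Databricks client; `Retry`, `WithBackoffAsync`, and `IsTransient` are hypothetical names, and the set of exceptions treated as transient is an assumption drawn from the stack traces in this thread.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

static class Retry
{
    // Runs `operation`, retrying up to `maxAttempts` times in total when the
    // failure looks transient. The wait doubles after each failed attempt.
    public static async Task<T> WithBackoffAsync<T>(
        Func<Task<T>> operation, int maxAttempts = 3, TimeSpan? baseDelay = null)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        TimeSpan delay = baseDelay ?? TimeSpan.FromSeconds(1);
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex) when (attempt < maxAttempts && IsTransient(ex))
            {
                // On the last attempt the filter is false, so the exception
                // propagates to the caller unchanged.
                await Task.Delay(delay);
                delay += delay;
            }
        }
    }

    // Assumption: these types indicate a transient failure. Note that treating
    // OperationCanceledException as transient also retries caller-initiated
    // cancellation, which a production version should distinguish.
    private static bool IsTransient(Exception ex) =>
        ex is OperationCanceledException
        || ex is TimeoutException
        || ex is HttpRequestException
        || ex is IOException;
}
```

An application could wrap a call such as the `ListScopes()` request from the original report in `Retry.WithBackoffAsync(() => ...)`. A library like Polly adds features this sketch omits, such as jitter and circuit breakers.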

@vinothganesh
Author

I now have stable, working code with an increased timeout, so I am closing the issue. Feel free to reopen it if the problem occurs again.
