Having a STREAMID 1 CANCEL when trying to read a file from ADLS #3948
Hello @invasionofsmallcubes, I'm routing this to some folks on the ADLS team to help investigate.
Thanks Joel.
Inside that method I'm doing some data transformation and then sending the record to a gRPC server, which of course takes some network time. The timeout seems to be roughly linear in the file size: a bigger file keeps the connection open longer. So I changed my method to read the whole file into memory first and then process it, and with that I don't have any problem. I think there is something like a watchdog that cuts the connection after a while so it doesn't consume bandwidth or keep the connection open too long. Is this assumption correct? Is there some official documentation about this behavior? I'm just guessing here.
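A minimal sketch of that whole-file workaround, assuming the stream arrives as an io.ReadCloser and process stands in for the transform-and-send-to-gRPC step:

import (
	"bufio"
	"bytes"
	"io"
)

// readAllThenProcess drains the connection first so the HTTP stream
// finishes quickly, then does the slow per-record work against the
// in-memory copy.
func readAllThenProcess(body io.ReadCloser, process func(line string) error) error {
	defer body.Close()
	data, err := io.ReadAll(body) // connection is done as soon as this returns
	if err != nil {
		return err
	}
	scanner := bufio.NewScanner(bytes.NewReader(data))
	for scanner.Scan() {
		if err := process(scanner.Text()); err != nil {
			return err
		}
	}
	return scanner.Err()
}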
Still trying to get somebody from the service team to comment on this; the extra data point will help. How much extra time is added when performing the processing?
For a 50MB file it's about 5 minutes, for 200MB about 1h 30m, etc.
So just to make sure I understand: you read a chunk of the file from ADLS, process it, then read the next chunk and repeat, and the processing makes the total read of the 50MB file take 5 minutes? So as a result the connection gets closed, presumably because the read is taking too long?
OK, I discussed this with the service team; their recommendation is to repeatedly call Open, reading the file in 4MB chunks and tracking the offset yourself, so that no single request holds the connection open for long.
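A minimal sketch of that chunked-read pattern; openRange here is a hypothetical helper standing in for an Open call that takes an offset and length, not the SDK's actual signature:

import "io"

const chunkSize = 4 * 1024 * 1024 // 4MB per request, as suggested above

// readInChunks re-opens the file for each slice so no single HTTP request
// stays open long enough to be cancelled. openRange is a hypothetical
// stand-in for an Open call with offset and length.
func readInChunks(openRange func(offset, length int64) (io.ReadCloser, error), handle func([]byte) error) error {
	var offset int64
	buf := make([]byte, chunkSize)
	for {
		rc, err := openRange(offset, chunkSize)
		if err != nil {
			return err
		}
		n, rerr := io.ReadFull(rc, buf)
		rc.Close()
		if n > 0 {
			if err := handle(buf[:n]); err != nil {
				return err
			}
			offset += int64(n)
		}
		if rerr == io.EOF || rerr == io.ErrUnexpectedEOF {
			return nil // short or empty read: end of file
		}
		if rerr != nil {
			return rerr
		}
	}
}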
Honestly this should be done at the Go SDK level. I mean, yes, one of my options was to keep track of the offset and just recover, but it doesn't really seem like something that should leak out of the SDK API. At the moment I'm fine with my solution. Can you confirm it is some kind of safeguard to avoid keeping the connection open?
Agreed it belongs in the SDK; it would be a hand-written "convenience" layer on top of our swagger-generated code. We've had a similar request, see also #3231.
@jhendrixMSFT can you be more specific about what I should do, given the code I pasted in the first post?
If you want to keep the code as-is (i.e. not read in 4MB chunks), can you try setting the idle connection timeout to zero (no timeout) or to some reasonable upper bound based on your use case. You'd create a custom HTTP client and assign it to the client's Sender:

import (
	"net"
	"net/http"
	"net/http/cookiejar"
	"time"
)

j, err := cookiejar.New(nil)
if err != nil {
	// deal with error...
}
client := filesystem.NewClient()
client.Sender = &http.Client{
	Transport: &http.Transport{
		Proxy: http.ProxyFromEnvironment,
		DialContext: (&net.Dialer{
			Timeout:   30 * time.Second,
			KeepAlive: 30 * time.Second,
			DualStack: true,
		}).DialContext,
		MaxIdleConns:          100,
		IdleConnTimeout:       0, // zero means idle connections are never closed
		TLSHandshakeTimeout:   10 * time.Second,
		ExpectContinueTimeout: 1 * time.Second,
	},
	Jar: j,
}
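For reference, net/http documents IdleConnTimeout as the maximum time an idle (keep-alive) connection stays in the pool before closing itself, with zero meaning no limit, so the snippet above keeps the transport from tearing down the connection between slow reads.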
Ok, I'll try. Thank you.
Hello,
I'm having the following issue: I wrote a Go program, scheduled on k8s, that reads files from ADLS one at a time. The file structure follows a typical offset pattern, so I just consume the files following this offset.
I can also reproduce the problem on my laptop, so it's not a k8s issue.
If I run
file -I filename
on these files I get application/octet-stream; charset=binary
I'm having the issue that, for a specific file, I'm always getting back the following error:
stream error: stream ID 1; CANCEL
Here is the way I'm consuming the file:
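(In outline it looks like the sketch below; open here is only a stand-in for the actual filesystem client call that returns the stream, not the exact code.)

import (
	"bufio"
	"io"
)

// consume reads the file line by line; open is a stand-in for the
// filesystem client's call that returns the file as an io.ReadCloser.
func consume(open func(path string) (io.ReadCloser, error), path string) error {
	body, err := open(path)
	if err != nil {
		return err
	}
	defer body.Close()
	scanner := bufio.NewScanner(body)
	for scanner.Scan() {
		// transform the line and send it to the gRPC server
		_ = scanner.Text()
	}
	return scanner.Err()
}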
and then I use the io.ReadCloser to read one line after the other and do something with it. When I try to consume this specific file I always get a different number of rows consumed on each run (always starting from the beginning).
first run:
second run:
third run:
As you can see the line count is always different, and the third run consumed fewer lines than the second.
I have no issue at all reading this file with pyspark on top of Azure Databricks.
Looking at the error through the debugger, it appears to be a http2StreamError with code http2ErrCodeCancel and StreamID 1. According to the documentation (https://http2.github.io/http2-spec/) it seems it could be the client doing this, but I'm not really sure how to verify it.
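(One way to check which side is resetting the stream: Go's net/http HTTP/2 support logs frame-level traffic when the program is run with GODEBUG=http2debug=2, so a RST_STREAM written by the client versus one received from the server should show up in that output.)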
From the doc:
it also says that
Forgot to add that I'm using the filesystem package from github.com/Azure/azure-sdk-for-go.
Again, reading this file with pyspark on Azure Databricks doesn't give any issue at all. I don't know what specifically I'm doing wrong.
Can you help me?