Worker socket timeout retrieving TransportNetwork from S3 #832

Open
abyrd opened this issue Oct 28, 2022 · 0 comments

abyrd commented Oct 28, 2022

This failure on a single worker causes the whole regional analysis to fail. Maybe in some circumstances we should allow regional analyses to continue after a small number of failures. If we had just ignored this worker, the regional analysis would likely have finished properly.

The trick is distinguishing between errors that invalidate the results and those that are one-off glitches. Maybe just looking for repeated identical errors from multiple workers is the answer.
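
To make the "repeated identical errors from multiple workers" idea concrete, here is a minimal sketch of how it might look. It is not existing R5 code: the class `WorkerErrorTracker`, its thresholds, and the use of a plain string as the error signature are all assumptions made for illustration. The analysis is failed only once the same error signature has been reported by a minimum number of distinct workers, or once the total error count exceeds a cap.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch (not part of R5): decides whether errors reported by
 * workers should fail a regional analysis or be tolerated as one-off glitches.
 */
public class WorkerErrorTracker {

    // Assumed thresholds; real values would need tuning.
    private final int minDistinctWorkers;
    private final int maxTotalErrors;

    // Error signature (e.g. the summary line of the stack trace) -> IDs of workers that reported it.
    private final Map<String, Set<String>> workersBySignature = new HashMap<>();
    private int totalErrors = 0;

    public WorkerErrorTracker(int minDistinctWorkers, int maxTotalErrors) {
        this.minDistinctWorkers = minDistinctWorkers;
        this.maxTotalErrors = maxTotalErrors;
    }

    /**
     * Records one error report.
     * @return true if the analysis should now be failed, false if the error can be tolerated for now.
     */
    public synchronized boolean recordError(String workerId, String errorSignature) {
        totalErrors++;
        Set<String> workers = workersBySignature.computeIfAbsent(errorSignature, k -> new HashSet<>());
        workers.add(workerId);
        // The same error seen on several distinct workers suggests a systematic problem.
        boolean repeatedAcrossWorkers = workers.size() >= minDistinctWorkers;
        // Too many errors overall is suspicious even if every one of them is different.
        boolean tooManyErrors = totalErrors > maxTotalErrors;
        return repeatedAcrossWorkers || tooManyErrors;
    }
}
```

With `new WorkerErrorTracker(2, 5)`, a single worker repeatedly reporting `SocketTimeoutException: Read timed out` would be tolerated, while the same signature arriving from a second worker would fail the analysis. Whatever tasks the glitching worker dropped would still need to be redelivered to another worker for the results to be complete.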

SocketTimeoutException: Read timed out, caused RuntimeException: java.net.SocketTimeoutException: Read timed out, caused TransportNetworkException: Exception occurred retrieving or building network., caused TransportNetworkException: Could not load TransportNetwork into cache.
[detail follows]
com.conveyal.r5.transit.TransportNetworkException: Could not load TransportNetwork into cache.
         at com.conveyal.r5.transit.TransportNetworkCache.getNetwork(TransportNetworkCache.java:88)
         at com.conveyal.r5.transit.TransportNetworkCache.getNetworkForScenario(TransportNetworkCache.java:125)
         at com.conveyal.r5.analyst.NetworkPreloader.buildValue(NetworkPreloader.java:109)
         at com.conveyal.r5.analyst.NetworkPreloader.synchronousPreload(NetworkPreloader.java:100)
         at com.conveyal.r5.analyst.cluster.AnalysisWorker.handleOneRegionalTask(AnalysisWorker.java:428)
         at com.conveyal.r5.analyst.cluster.AnalysisWorker.lambda$startPolling$0(AnalysisWorker.java:245)
         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
         at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.conveyal.r5.transit.TransportNetworkException: Exception occurred retrieving or building network.
         at com.conveyal.r5.transit.TransportNetworkCache.loadNetwork(TransportNetworkCache.java:373)
         at com.github.benmanes.caffeine.cache.LocalLoadingCache.lambda$newMappingFunction$2(LocalLoadingCache.java:141)
         at com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2380)
         at java.base/java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1908)
         at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2378)
         at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2361)
         at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
         at com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:54)
         at com.conveyal.r5.transit.TransportNetworkCache.getNetwork(TransportNetworkCache.java:86)
         ... 8 more
Caused by: java.lang.RuntimeException: java.net.SocketTimeoutException: Read timed out
         at com.conveyal.cluster.S3FileStorage.cacheLoader(S3FileStorage.java:100)
         at com.github.benmanes.caffeine.cache.LocalLoadingCache.lambda$newMappingFunction$2(LocalLoadingCache.java:141)
         at com.github.benmanes.caffeine.cache.UnboundedLocalCache.lambda$computeIfAbsent$2(UnboundedLocalCache.java:238)
         at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1705)
         at com.github.benmanes.caffeine.cache.UnboundedLocalCache.computeIfAbsent(UnboundedLocalCache.java:234)
         at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
         at com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:54)
         at com.conveyal.cluster.S3FileStorage.getFile(S3FileStorage.java:158)
         at com.conveyal.r5.transit.TransportNetworkCache.loadNetwork(TransportNetworkCache.java:362)
         ... 16 more
Caused by: java.net.SocketTimeoutException: Read timed out
         at java.base/java.net.SocketInputStream.socketRead0(Native Method)
         at java.base/java.net.SocketInputStream.socketRead(SocketInputStream.java:115)
         at java.base/java.net.SocketInputStream.read(SocketInputStream.java:168)
         at java.base/java.net.SocketInputStream.read(SocketInputStream.java:140)
         at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:478)
         at java.base/sun.security.ssl.SSLSocketInputRecord.readFully(SSLSocketInputRecord.java:461)
         at java.base/sun.security.ssl.SSLSocketInputRecord.decodeInputRecord(SSLSocketInputRecord.java:243)
         at java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:181)
         at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111)
         at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1429)
         at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1396)
         at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:985)
         at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
         at org.apache.http.impl.io.SessionInputBufferImpl.read(SessionInputBufferImpl.java:197)
         at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:176)
         at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:135)
         at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
         at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180)
         at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
         at com.amazonaws.services.s3.internal.S3AbortableInputStream.read(S3AbortableInputStream.java:125)
         at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
         at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
         at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
         at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180)
         at java.base/java.security.DigestInputStream.read(DigestInputStream.java:162)
         at com.amazonaws.services.s3.internal.DigestValidationInputStream.read(DigestValidationInputStream.java:59)
         at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
         at java.base/java.io.InputStream.transferTo(InputStream.java:704)
         at com.conveyal.cluster.S3FileStorage.cacheLoader(S3FileStorage.java:94)
         ... 24 more
abyrd self-assigned this Oct 28, 2022
abyrd changed the title from "Worker socket timeout retrieving TransportNetowork from S3" to "Worker socket timeout retrieving TransportNetwork from S3" Oct 28, 2022