Unable to load log servers with log-list.json #48

Open
HappyDr0id opened this issue Oct 5, 2022 · 9 comments

HappyDr0id commented Oct 5, 2022

Hey 👋

We recently migrated from 0.3.0 to 1.1.1 and have seen non-fatal reports on Crashlytics since then. Digging into it, there appears to be an internal library error that occurs intermittently (roughly once every few minutes) without visibly affecting the user's use of the app. Here is the stack trace:

2022-10-04 17:33:08.402 30118-30253/<package_name> I/CertificateTransparency: <domain_url> Failure: Unable to load log servers with log-list.json failed to load with java.lang.InterruptedException
        at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:84)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
        at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source:1)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
        at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source:1)
        at com.appmattus.certificatetransparency.internal.verifier.CertificateTransparencyBase.hasValidSignedCertificateTimestamp(CertificateTransparencyBase.kt:112)
        at com.appmattus.certificatetransparency.internal.verifier.CertificateTransparencyBase.verifyCertificateTransparency(CertificateTransparencyBase.kt:96)
        at com.appmattus.certificatetransparency.internal.verifier.CertificateTransparencyInterceptor.intercept(CertificateTransparencyInterceptor.kt:69)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:34)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        <...>
2022-10-04 17:33:08.402 30118-30253/<package_name> I/CertificateTransparency:     at java.lang.Thread.run(Thread.java:1012)
2022-10-04 17:33:08.404 30118-30253/<package_name> I/okhttp.OkHttpClient: <-- HTTP FAILED: javax.net.ssl.SSLPeerUnverifiedException: Certificate transparency failed

We never saw this report with the previous 0.3.0 version, and the failure occurs intermittently among successful "SCT trusted logs" results. Our domains have the required certificates and Certificate Transparency otherwise works as expected, so, given the failure to read the JSON file, this looks internal to the library rather than a real certificate failure.

Does it ring a bell to you? Thanks in advance for your help, and overall for your work on this CT library 🙏

@mattmook (Member)

Nothing in particular comes to mind - InterruptedException kind of implies the coroutine block has been cancelled in some way - could be a timeout? I might need to add some more tests around timeouts etc. to see if I can replicate it more reliably.
The fix may be as simple as the code detecting this exception and suppressing it (if appropriate).
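
To illustrate the mechanism suspected above, here is a minimal sketch in plain Java rather than the library's Kotlin. It assumes only that the verification path blocks its calling thread while it waits for asynchronous work, which is what the `runBlocking` frames in the trace suggest. The names `InterruptedWaitDemo`, `blockingLoad` and `interruptWhileWaiting` are hypothetical and are not part of the library.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: none of these names come from the library.
class InterruptedWaitDemo {

    // Blocks the calling thread until the latch opens, standing in for a
    // blocking bridge into asynchronous work. Returns the exception if the
    // wait was interrupted, or null if the wait completed normally.
    static Exception blockingLoad(CountDownLatch latch) {
        try {
            latch.await();
            return null;
        } catch (InterruptedException e) {
            // Restore the flag so code further up the stack can still see it.
            Thread.currentThread().interrupt();
            return e;
        }
    }

    // Runs blockingLoad on a worker thread, interrupts that thread, and
    // reports what the worker observed.
    static Exception interruptWhileWaiting() {
        CountDownLatch neverOpened = new CountDownLatch(1);
        AtomicReference<Exception> seen = new AtomicReference<>();
        Thread worker = new Thread(() -> seen.set(blockingLoad(neverOpened)));
        worker.start();
        worker.interrupt(); // stands in for a cancellation or timeout
        try {
            worker.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("test thread was interrupted", e);
        }
        return seen.get();
    }
}
```

If something like this is happening, the interrupt says nothing about the certificate, so treating it as a validation failure would be misleading.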

@mattmook (Member)

One thing to look at is how you cache the log list data (to disk, for example), since that could reduce the number of network hits.
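
As a rough illustration of the caching idea, the sketch below is a generic read-through file cache in plain Java. It is not the library's cache API: `CachedLogList`, its constructor parameters and the `network` supplier are all hypothetical, and a real setup would plug into whatever cache hooks the library exposes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Hypothetical read-through cache: serve from disk while the file is fresh,
// otherwise fetch once and store the result.
class CachedLogList {
    private final Path file;
    private final Duration maxAge;
    private final Supplier<byte[]> network; // assumed stand-in for the download

    CachedLogList(Path file, Duration maxAge, Supplier<byte[]> network) {
        this.file = file;
        this.maxAge = maxAge;
        this.network = network;
    }

    byte[] load() {
        try {
            if (Files.exists(file)) {
                Instant modified = Files.getLastModifiedTime(file).toInstant();
                // Fresh enough: skip the network entirely.
                if (modified.plus(maxAge).isAfter(Instant.now())) {
                    return Files.readAllBytes(file);
                }
            }
            byte[] fresh = network.get();
            Files.write(file, fresh);
            return fresh;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a cache like this, only the first request after expiry pays for the download, which narrows the window in which a cancelled request can interrupt a log list load.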

@mattmook (Member)

On reflection, I think this is likely solved by #39, and given no other reports of a similar issue, I'm going to close this one.

@HappyDr0id (Author)

Thanks for your answers and your work on this library 🙏 I'll check with the latest release whether this is still being reported in our production environment 🙂

ikeed commented Apr 7, 2023

> The fix may be as simple as the code detecting this exception and suppressing (if appropriate)

Unfortunately, in the case of the InterruptedException, the lib throws an SSLPeerUnverifiedException, which is indistinguishable from a validation failure. In our app, I implemented a somewhat kludgy workaround that looks at the VerificationResult from the last logger callback to determine whether we should suppress the error. It would be great if there were an easier way to distinguish these cases.

I can still reproduce this issue on 2.1.2 by spamming a bunch of requests at once.
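
The workaround described above might look roughly like the sketch below. It is written in plain Java with hypothetical names throughout: `LastResultTracker` and its `Outcome` enum are a simplified stand-in for the library's `VerificationResult`, and `record` is assumed to be called from the logger callback.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.net.ssl.SSLPeerUnverifiedException;

// Hypothetical helper: remembers the last verification outcome per host so
// an SSLPeerUnverifiedException can be classified after the fact.
class LastResultTracker {

    // Assumed simplification of the library's VerificationResult.
    enum Outcome { TRUSTED, LOG_LIST_UNAVAILABLE, UNTRUSTED }

    private final Map<String, Outcome> lastByHost = new ConcurrentHashMap<>();

    // Intended to be called from the logger callback.
    void record(String host, Outcome outcome) {
        lastByHost.put(host, outcome);
    }

    // True when the error is an SSLPeerUnverifiedException and the last
    // recorded outcome for the host was a log list loading problem.
    boolean looksTransient(String host, Exception error) {
        return error instanceof SSLPeerUnverifiedException
                && lastByHost.get(host) == Outcome.LOG_LIST_UNAVAILABLE;
    }
}
```

This stays kludgy, as noted: with many concurrent requests to the same host, the "last" outcome may belong to a different request than the one that failed.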

mattmook (Member) commented Apr 9, 2023

Have reopened for further investigation. @ikeed, from what you are saying, this definitely sounds testable/reproducible.

mattmook reopened this Apr 9, 2023

ikeed commented Apr 9, 2023

Thanks very much Matt! If you need any more info from me, just ask.

ikeed commented Apr 9, 2023

Incidentally, if the InterruptedException is too complex to fix, just embedding some kind of a "reason" field into the SSLPeerUnverifiedException would suit my needs. I'm not sure if that's the direction you want to go. We get a VerificationResult in the log callback and it would be useful to have something similar in the SSLPeerUnverifiedException itself so clients can make smarter decisions when an error is thrown. Just to help distinguish between concurrency issues and an actual dodgy cert.
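
A minimal sketch of what such a "reason" field could look like, in plain Java. This is not an existing library type: `CtFailureException`, its `Reason` values and `CtFailureHandling` are hypothetical names used only for illustration.

```java
import javax.net.ssl.SSLPeerUnverifiedException;

// Hypothetical subclass: existing catch blocks for SSLPeerUnverifiedException
// keep working, while callers that care can inspect the reason.
class CtFailureException extends SSLPeerUnverifiedException {

    enum Reason { LOG_LIST_LOAD_INTERRUPTED, NO_VALID_SCT }

    private final Reason reason;

    CtFailureException(String message, Reason reason, Throwable cause) {
        super(message);
        this.reason = reason;
        if (cause != null) {
            initCause(cause); // keep the original exception for diagnostics
        }
    }

    Reason reason() {
        return reason;
    }
}

class CtFailureHandling {
    // True when the failure came from loading the log list rather than
    // from the certificate itself.
    static boolean isInfrastructureFailure(Exception e) {
        return e instanceof CtFailureException
                && ((CtFailureException) e).reason()
                        == CtFailureException.Reason.LOG_LIST_LOAD_INTERRUPTED;
    }
}
```

A client could then retry or suppress only when `isInfrastructureFailure` returns true, and treat every other `SSLPeerUnverifiedException` as a genuine validation failure.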

dsatija commented Jul 10, 2024

Hi @mattmook, we are facing the same issue. We get this error:
Unable to load log servers with log-list.zip failed to load with kotlinx.coroutines.JobCancellationException
Here's the full log:
E/CertificateTransparencyManager: Certificate transparency failed for host : y.x.com with result: Failure: Unable to load log servers with log-list.zip failed to load with kotlinx.coroutines.JobCancellationException: Parent job is Cancelling; job=BlockingCoroutine{Cancelled}@5bb23e7
Caused by: java.lang.InterruptedException
        at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source:159)
        at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source:3)
        at com.appmattus.certificatetransparency.internal.verifier.CertificateTransparencyBase.hasValidSignedCertificateTimestamp(CertificateTransparencyBase.kt:7)
        at com.appmattus.certificatetransparency.internal.verifier.CertificateTransparencyBase.verifyCertificateTransparency(CertificateTransparencyBase.kt:77)
        at

Could you suggest what can be done here?
