Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wingtips-zipkin: NumberFormatException in WingtipsToZipkinSpanConverterDefaultImpl #14

Closed
jpiechowka opened this issue Aug 17, 2016 · 14 comments

Comments

@jpiechowka
Copy link
Contributor

I have been trying out Wingtips with Zipkin today. I love the docs and overall simplicity of use. I created simple client-server app. Client is written using Cujojs-rest Zipkin instrumentation and Server is created in Spark web framework. When I make request from client to server, and Wingtips tries to convert strings extracted from headers to long it always ends in NumberFormatException in WingtipsToZipkinSpanConverterDefaultImpl in nullSafeLong() method. The solution I found working was to add this:

protected Long nullSafeLong(String str) {
        if (str == null)
            return null;

        long longString = new BigInteger((str), 16).longValue();
        return longString;
    }
@nicmunroe
Copy link
Member

Thanks for the report @jpiechowka ! Can you copy/paste an example of one of the values being sent by the Cujojs-rest Zipkin instrumentation that is failing with the NumberFormatException in Wingtips? I'm pretty sure I know what it will look like, but I'd rather be 100% sure than assume.

We definitely want to support receiving propagation headers from Zipkin, so we'll have to come up with a solution that supports both Wingtips and Zipkin headers.

@nicmunroe
Copy link
Member

And just to make sure I'm clear on the desired outcome here: (1) Cujojs-rest (Zipkin) -> (2) Spark (Wingtips) -> (3) Zipkin server for collecting, and the failure is happening when (2) is trying to send spans to (3)? Is that correct?

@jpiechowka
Copy link
Contributor Author

This is my application with my fix applied in nullSafeLong(): wingtips-cujojs-spark-example.

@nicmunroe
Copy link
Member

Can you just copy/paste the full NumberFormatException message? i.e. java.lang.NumberFormatException: For input string: "[unparseable value]". I need to see an example of one of the [unparseable value] that you're seeing.

@jpiechowka
Copy link
Contributor Author

@nicmunroe Sure! Here it is:

java.lang.NumberFormatException: For input string: "ac65beda7e31487a"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:589)
    at java.lang.Long.parseLong(Long.java:631)
    at com.nike.wingtips.zipkin.util.WingtipsToZipkinSpanConverterDefaultImpl.nullSafeLong(WingtipsToZipkinSpanConverterDefaultImpl.java:72)
    at com.nike.wingtips.zipkin.util.WingtipsToZipkinSpanConverterDefaultImpl.convertWingtipsSpanToZipkinSpan(WingtipsToZipkinSpanConverterDefaultImpl.java:30)
    at com.nike.wingtips.zipkin.WingtipsToZipkinLifecycleListener.spanCompleted(WingtipsToZipkinLifecycleListener.java:97)
    at com.nike.wingtips.Tracer.notifySpanCompleted(Tracer.java:664)
    at com.nike.wingtips.Tracer.completeAndLogSpan(Tracer.java:547)
    at com.nike.wingtips.Tracer.completeRequestSpan(Tracer.java:470)
    at SparkServer.lambda$main$1(SparkServer.java:44)
    at spark.RouteImpl$1.handle(RouteImpl.java:58)
    at spark.webserver.MatcherFilter.doFilter(MatcherFilter.java:162)
    at spark.webserver.JettyHandler.doHandle(JettyHandler.java:61)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:189)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:119)
    at org.eclipse.jetty.server.Server.handle(Server.java:517)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:302)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:242)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:245)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
    at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:75)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:213)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:147)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
    at java.lang.Thread.run(Thread.java:745)

If you need anything more please feel free to ask. But I need to go to sleep. I will be available in couple of hours.

@codefromthecrypt
Copy link

@nicmunroe so I've been thinking about making a b3-java or b3-propagation-java project as this is a fairly common type of bug/misunderstanding. Ex. sometimes people are trying to read the hex ids as integers etc.

@nicmunroe
Copy link
Member

@jpiechowka Thanks, that's exactly what I was looking for. We'll get on this ASAP - should hopefully be a quick fix.

@adriancole Yeah that's what's happening here. Is @jpiechowka 's suggestion of new BigInteger(zipkinHexString, 16).longValue() the correct way to solve this in Java 7? I tried simply doing a Long.parseLong("ac65beda7e31487a", 16) with the example given and it failed. Java 8's Long.parseUnsignedLong("ac65beda7e31487a", 16) works, and gives the same answer as the BigInteger solution. So without looking at how zipkin is generating the hex strings it seems safe to assume they are unsigned longs and the BigInteger solution should work. But since you're here I figure it's better to double check - will the BigInteger solution create the long so it gets reported to the zipkin server correctly, or is there other massaging that needs to happen?

Context for why this bug exists - Wingtips just passes around decimal longs as strings rather than hex-encoded even when serializing to JSON, so I never considered this case. And Zipkin -> Wingtips -> * HTTP header propagation would work fine since Wingtips stores the IDs as strings internally rather than longs so it would pass through without issue. This particular case of Zipkin -> Wingtips -> Zipkin reporter fails though because the wingtips reporter assumes decimal encoded longs.

@codefromthecrypt
Copy link

Zipkin has a hipster hex to unsigned long parser which profiles well. Main
thing different than some is it tolerates less than 16 characters.

https://github.com/openzipkin/zipkin/blob/master/zipkin/src/main/java/zipkin/internal/Util.java

On 18 Aug 2016 12:22, "Nic Munroe" notifications@github.com wrote:

@jpiechowka https://github.com/jpiechowka Thanks, that's exactly what I
was looking for. We'll get on this ASAP - should hopefully be a quick fix.

@adriancole https://github.com/adriancole Yeah that's what's happening
here. Is @jpiechowka https://github.com/jpiechowka 's suggestion of new
BigInteger(zipkinHexString, 16).longValue() the correct way to solve this
in Java 7? I tried simply doing a Long.parseLong("ac65beda7e31487a", 16)
with the example given and it failed. Java 8's Long.parseUnsignedLong("ac65beda7e31487a",
16) works, and gives the same answer as the BigInteger solution. So
without looking at how zipkin is generating the hex strings it seems safe
to assume they are unsigned longs and the BigInteger solution should
work. But since you're here I figure it's better to double check - will the
BigInteger solution create the long so it gets reported to the zipkin
server correctly, or is there other massaging that needs to happen?

Context for why this bug exists - Wingtips just passes around decimal
longs as strings rather than hex-encoded even when serializing to JSON, so
I never considered this case. And Zipkin -> Wingtips -> * HTTP header
propagation would work fine since Wingtips stores the IDs as strings
internally rather than longs so it would pass through without issue. This
particular case of Zipkin -> Wingtips -> Zipkin reporter fails though
because the wingtips reporter assumes decimal encoded longs.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#14 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAD612GtnlW2Ex5Ev8XkmFA4qgMRp2Cnks5qg94VgaJpZM4Jm98J
.

@nicmunroe
Copy link
Member

nicmunroe commented Aug 18, 2016

@adriancole perfect! Thanks for the pointer. I only see one problem at this point - my naive initial approach to solving this problem was going to be to attempt decimal long parsing first, and if that failed with a NumberFormatException fall back to hex long parsing. But a service could serve both zipkin and wingtips clients at the same time, and there are hex encoded longs that have only digits in them, so a zipkin ID could be incorrectly parsed as a decimal long without error.

I ran a quick test with 100 million randomly generated longs, and about 1 in every 2000 falls into this category where the hex string could be successfully parsed as a decimal encoded string.

So at this point I'm thinking maybe you just have to tell the Wingtips-to-Zipkin-span-converter whether you expect hex or decimal encoded IDs, and supporting both at the same time won't be a supported use case? Or is a 0.05% incorrect ID rate an ok tradeoff? WDYT?

@codefromthecrypt
Copy link

Just to clarify, are you saying in wingtips that only the reporting does ids as numbers, or also header propagation? If the latter, it certainly won't be compatible with B3, which means the problem is bigger than span reporting.

The impact of not using hex ids is pretty big.. this means that B3 traces that start before wingtips won't propagate compatibly into wingtips. Similarly, outbound, if a wingtips user calls a ruby app or c# B3 instrumented app, they won't join properly.

My suggestion would be to phase in a change to wingtips to encode in B3 (lowerhex fixed width) by default. For example, start traces that way, and assume fixed-width 16 char ids are hex.

I ack that this cleanup is dirty in the brown field. One way is to propagate a special header that knows about this change and eventually remove it once internal wingtips deployments are converged. While a punch, a punch like this is certainly better earlier than later because when B3 works the same was here as outside, users can get access to the whole ecosystem compatibly.

There's probably other approaches, too. these are just my thoughts.

@nicmunroe
Copy link
Member

Unfortunately it is the latter, but good news is cleanup shouldn't be too bad. Since wingtips uses strings for the internal model, zk -> wt -> * propagation should work fine - wingtips just passes IDs through as strings without any change. The only problem comes in when wingtips does the ID creation since wingtips-created decimal IDs won't be joined properly by a hex ID B3 tracing system.

The way we collect and aggregate spans internally shouldn't be affected by changing to hex IDs so I don't think we need a phased migration strategy. I think it's just band-aid ripping time.

@nicmunroe
Copy link
Member

Ok, I've created a new issue for switching to hex IDs for proper B3 compatibility: #15 .

@jpiechowka the issue you raised here should be solved as part of the work for #15 .

nicmunroe added a commit to nicmunroe/wingtips that referenced this issue Aug 19, 2016
…s with lowercase hex encoding for trace and span IDs

Fixes issues Nike-Inc#14 and Nike-Inc#15
nicmunroe added a commit to nicmunroe/wingtips that referenced this issue Aug 19, 2016
…s with lowercase hex encoding for trace and span IDs

Fixes issues Nike-Inc#14 and Nike-Inc#15
nicmunroe added a commit to nicmunroe/wingtips that referenced this issue Aug 19, 2016
…s with lowercase hex encoding for trace and span IDs

Fixes issues Nike-Inc#14 and Nike-Inc#15
nicmunroe added a commit to nicmunroe/wingtips that referenced this issue Aug 19, 2016
…s with lowercase hex encoding for trace and span IDs

Fixes issues Nike-Inc#14 and Nike-Inc#15
@codefromthecrypt
Copy link

👏

On Fri, Aug 19, 2016 at 11:41 PM, Nic Munroe notifications@github.com
wrote:

Ok, I've created a new issue for switching to hex IDs for proper B3
compatibility: #15 #15 .

@jpiechowka https://github.com/jpiechowka the issue you raised here
should be solved as part of the work for #15
#15 .


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#14 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAD61zrCIxjIlZRBCQJXsKYmnM8weTRfks5qhc7DgaJpZM4Jm98J
.

nicmunroe added a commit to nicmunroe/wingtips that referenced this issue Aug 20, 2016
…s with lowercase hex encoding for trace and span IDs

Fixes issues Nike-Inc#14 and Nike-Inc#15
@jpiechowka
Copy link
Contributor Author

@nicmunroe Great work! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants