Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Input.readAscii multithreaded bug #264

Closed
devinrsmith opened this issue Nov 14, 2014 · 12 comments
Closed

Input.readAscii multithreaded bug #264

devinrsmith opened this issue Nov 14, 2014 · 12 comments

Comments

@devinrsmith
Copy link

After lots of debugging and testing out different theories, I found a multithreaded bug that causes non-deterministic results in Input.readAscii.

Specifically, Input.readAscii modifies the user's passed in buffer. If the user passes in the same buffer pointer to multiple Inputs, then readAscii instances may read past the proper end.

While it might seem rare that somebody is deserializing the same exact buffer multiple times, it definitely comes into play when running DBs in memory.

It looks like this is the only place in Input that modifies the buffer.

@romix
Copy link
Collaborator

romix commented Nov 14, 2014

Have you read this on the main page?
https://github.com/EsotericSoftware/kryo#threading

I think it is pretty clear about this issue.

@brianfromoregon
Copy link

Btw this issue was recently discussed in #128.

@devinrsmith
Copy link
Author

Ah! I did read through the documentation, but the byte[] part escaped me. I didn't even realize that I was calling the method with the same byte[] until I did lots of debugging. It might be useful to augment the readName KryoException to mention that the user might be passing in the same buffer.

@devinrsmith
Copy link
Author

It would be pretty easy to create a version that is just as fast but can accept the same byte[] buffer in multiple inputs. I think this is a much better solution than users copying bytes before passing into Input.

https://gist.github.com/devinrsmith/627573047e4e2884836f

My version isn't as fast b/c it does an extra copy, but that could be gotten rid of if necessary.

@NathanSweet
Copy link
Member

We would need to write an extra byte per string to avoid the copy.

On Fri, Nov 14, 2014 at 6:23 PM, Devin Smith notifications@github.com
wrote:

It would be pretty easy to create a version that is just as fast but can
accept the same byte[] buffer in multiple inputs. I think this is a much
better solution than users copying bytes before passing into Input.

https://gist.github.com/devinrsmith/627573047e4e2884836f

My version isn't as fast b/c it does an extra copy, but that could be
gotten rid of if necessary.


Reply to this email directly or view it on GitHub
#264 (comment)
.

@devinrsmith
Copy link
Author

You can avoid the extra copy w/o binary incompatibility by building up a char[] instead and calling Strings private constructor.

@NathanSweet
Copy link
Member

"building up a char[]" is a copy.

On Fri, Nov 14, 2014 at 6:55 PM, Devin Smith notifications@github.com
wrote:

You can avoid the extra copy w/o binary incompatibility by building up a
char[] instead and calling Strings private constructor.


Reply to this email directly or view it on GitHub
#264 (comment)
.

@devinrsmith
Copy link
Author

You are already doing that via the public String constructor:

@Deprecated
public String(byte ascii[], int hibyte, int offset, int count) {
    checkBounds(ascii, offset, count);
    char value[] = new char[count];

    if (hibyte == 0) {
        for (int i = count; i-- > 0;) {
            value[i] = (char)(ascii[i + offset] & 0xff);
        }
    } else {
        hibyte <<= 8;
        for (int i = count; i-- > 0;) {
            value[i] = (char)(hibyte | (ascii[i + offset] & 0xff));
        }
    }
    this.value = value;
}

@devinrsmith
Copy link
Author

I'm suggesting building the char[] yourself and calling

String(char[] value, boolean share) {
    // assert share : "unshared not supported";
    this.value = value;
}

@NathanSweet
Copy link
Member

Using the public API, String is going to do a copy no matter what as it's
the only way String can stay immutable. Two copies (doing a copy and then
giving it to String to do another copy) is what we are trying to avoid by
modifying the byte[].

String(char[], boolean) is JRE implementation specific. It's a good idea
to use it if available though. We'll still need a fallback, which I guess
can do two copies.

On Fri, Nov 14, 2014 at 7:03 PM, Devin Smith notifications@github.com
wrote:

I'm suggesting building the char[] yourself and calling

String(char[] value, boolean share) {
// assert share : "unshared not supported";
this.value = value;
}


Reply to this email directly or view it on GitHub
#264 (comment)
.

@magro
Copy link
Collaborator

magro commented Jun 9, 2018

@NathanSweet fixed in 5.0?

@NathanSweet
Copy link
Member

Yep.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

5 participants