-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Concurrent queries return randomly corrupted UASTs #34
Comments
Actually, we seem to reproduce it even with a single process. If I add |
Another observation: we did not test it with a single instance of the driver. We will test it now. |
This is the log we get. I see errors with empty |
@juanjux BTW those errors on empty files do not look like gracefully handled :) |
Does it only happen with all these files or still happens with a smaller number of files? If the latter, could you provide an example with the minimum number of files and those files to try to reproduce the problem? |
This may or may not be related with bblfsh/java-driver#19 |
We will try to reduce it now. |
We tortured it to death!
|
This is the minimal example which reproduces the problem in 50% of the cases:
|
I'll try to reproduce and fix this tomorrow as I return from CurryOn. Sorry for the delay. |
Not the same problem (but could be related), I'm seeing that parsing those init.py files directly from the driver (without the server) produce a fatal error "missing content" (since the files are empty) which should be a normal "error" (with an empty AST returned) and not a fatal (since fatal errors stop any further processing from that driver and this isn't really anything unrecoverable). I'll start by fixing that and checking if that fix the problem. This could also be related with the blocking problem because we suspect the server is not closing failing drivers correctly and thus the limit of live images is reached, but that will come later. |
Update: this bug seems to be a several-headed beast, I'll write and keep updated a list of found problems so you can see the status from this message. Maybe later I'll split this into several different issues.
|
Since I can't reproduce this problem anymore after the first fixes I'll close this one. I'll publish new docker images of the server and python-driver soon but meanwhile you can test with a server and python-driver build from ~master. Build the python driver first: python-driver$ make build then tag the created image ( server$ cd cmd/bblfsh
bblfsh$ go build
bblfsh$ sudo rm -rf /tmp/bblfsh-runtime
bblfsh$ sudo ./bblfsh server --transport=docker-daemon Then you can run the python-client (remember to use the BTW, the last version of the python-client seem to work a little different than the way it was explained in the report, now it has a #!/bin/sh
python3 -m bblfsh --disable-bblfsh-autorun -l python -f sdk/__init__.py
python3 -m bblfsh --disable-bblfsh-autorun -l python -f sdk/protocol/__init__.py
python3 -m bblfsh --disable-bblfsh-autorun -l python -f sdk/protocol/generated_pb2.py
python3 -m bblfsh --disable-bblfsh-autorun -l python -f sdk/protocol/generated_pb2_grpc.py
python3 -m bblfsh --disable-bblfsh-autorun -l python -f sdk/uast/__init__.py
python3 -m bblfsh --disable-bblfsh-autorun -l python -f sdk/uast/generated_pb2.py and then called it from this one: #!/bin/sh
sh bugtest.sh > 1.txt&
sh bugtest.sh > 2.txt&
sh bugtest.sh > 3.txt&
sh bugtest.sh > 4.txt&
sh bugtest.sh > 5.txt& |
@juanjux from the first message
|
@juanjux The bug is not reproduced the way your script works. We knew that, that's why we created the patch. Please reopen this and let us test your fixes instead of opening the second issue about the same problems. |
Reopened. I saw your comment and downloaded from that repo but failed to notice it was a specific branch, sorry about that. Anyway I'm testing now with the right branch and the same script (plus the --disable-bblfsh-autorun parameter to use my local updated server) and I'm always getting the same result. Right now I'm writing a script to do it some hundreds times before I go to lunch to be 100% safe. Sorry for closing the bug report before your test, it's pretty common on github to do it as soon as the developer considers the bug fixed and reopening if the user says its not, but I certainly didn't saw anything on that respect on our git workflow document so I could have acted too fast dumbed down by the happy feeling of closing this f*cking bug that I've been battling for a day and a half 😀 |
This will be running in the background for a while (run with full #!/bin/bash
for i in `seq 1 500`;
do
python3 -m bblfsh --disable-bblfsh-autorun bblfsh/github/com/bblfsh/sdk/__init__.py bblfsh/github/com/bblfsh/sdk/protocol/__init__.py bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2.py bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2_grpc.py bblfsh/github/com/bblfsh/sdk/uast/__init__.py bblfsh/github/com/bblfsh/sdk/uast/generated_pb2.py > 1.txt &
python3 -m bblfsh --disable-bblfsh-autorun bblfsh/github/com/bblfsh/sdk/__init__.py bblfsh/github/com/bblfsh/sdk/protocol/__init__.py bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2.py bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2_grpc.py bblfsh/github/com/bblfsh/sdk/uast/__init__.py bblfsh/github/com/bblfsh/sdk/uast/generated_pb2.py > 2.txt &
python3 -m bblfsh --disable-bblfsh-autorun bblfsh/github/com/bblfsh/sdk/__init__.py bblfsh/github/com/bblfsh/sdk/protocol/__init__.py bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2.py bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2_grpc.py bblfsh/github/com/bblfsh/sdk/uast/__init__.py bblfsh/github/com/bblfsh/sdk/uast/generated_pb2.py > 3.txt &
sleep 5
diff 1.txt 2.txt > /dev/null
D1=$?
diff 2.txt 3.txt > /dev/null
D2=$?
if [ "$D1" == 1 ] || [ "$D2" == 1 ]; then
echo "Different files"
exit 1
fi
rm 1.txt 2.txt 3.txt
echo "Finished iteration $i"
done |
Worked 500 times. I'll make a Docker image so Egor can test this without much hassle. |
Thanks! Tomorrow ML has OSF, so Egor will be able to test on Monday. |
Ok Vadim, as I spoke with Egor, I'll prepare a Dockerfile that downloads and compiles all components in ~master so this and other bugfixes are easier to test by you both before releasing new versions of the docker images on dockerhub with the bugfixes. |
I created a project with a Dockerfile and a test script here: https://github.com/juanjux/bblfsh-dev-image For testing: git clone https://github.com/juanjux/bblfsh-dev-image.git
cd bblfsh-dev-image
docker build -t bblfshdev .
# wait for the image to build...
docker run -v $(pwd):/script --privileged -it --rm bblfshdev bash -l /script/test_server_issue34.sh
# wait for the 500 iterations (* 5 seconds) of the test finish...
# ...make whatever other tests you need... Several Bothan agents died to create this Dockerfile + script and made it compatible with Ubuntu 16.04. |
Thank you @juanjux! Update: Thanks for an awesome effort on a containerizing build for bblfsh! I must be doing something wrong here, but for some reason
|
Found the same problem tonight actually, should be fixed on the current version of the docker image project. That command is run inside the docker of the Python driver (the docker inside the docker). The problem was an UID collision that was fixed adding the BUILD_UID env variable hardcoding the UID to the one the previously created prebblfsh user has (1000) and then giving permission to that UID for the drivers directories (that are mounted in |
🚀 Builds like a charm now - both, python and java driver were built. Update if anybody is interested - I put together the instructions to build the same on Debian Jessie, so you can do build latest bblfsh i.e on free Google Cloud Console
|
@bzz yep, most of the complications in the Dockerfile is because of dependencies (and dependencies of dependencies) that are missing in Ubuntu 16.04, in 16.10+ is much simpler. |
Feel free to reopen if it's not fixed. |
@EgorBu I don't see your test report here. Is everything's fixed for us? |
Not everything is well. During the first iteration, it returns constantly the wrong result: I think it may be related to #37
|
@EgorBu as I was telling you in Slack, it's a false positive because of the 5 seconds sleep is not enough for the driver instantiation, so the 3 files are still empty (bash file redirections create the files even before content is generated on stdout). Changing it to 20 seconds that iteration eliminates the problem 100% of the time. |
We use the following script to reproduce the problem:
Please note that you need the patch to client-python: https://github.com/vmarkovtsev/bblfsh.client-python/tree/main-change
Result: 1.txt, 2.txt and 3.txt are randomly different (checked with
diff
). Besides, some files are in wrong places or empty.The text was updated successfully, but these errors were encountered: