[tsm1]: panic: runtime error: index out of range #4365
@pauldix Looks like we could prevent this panic in |
I am also facing the same issue.
|
FYI, I have applied the patch d3a483e and I am still unable to start InfluxDB. Now I am getting an "unable to determine block type" panic. Is the tsm1 data file corrupted?
|
@syepes how big is |
@pauldix, it's not that big. Can I just remove this file?
|
@syepes you can remove the file, but then you'll lose everything that was in it. Can you grep your log for that file name? Also, would it be possible to zip that up along with the |
OK, no problem. Where should I upload it? |
If DecodeSameTypeBlock is called on an empty Values slice, it panics with an index out of bounds error. This func can be removed because DecodeBlock can already determine what type of values are encoded. DecodeBlock will still panic if the block cannot be decoded for other reasons. Fixes #4365
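The failure mode described above (indexing the first element of an empty slice) can be sketched as follows. This is a minimal illustration, not the tsm1 code: `Value`, `typeOfFirst`, and `errEmptyValues` are hypothetical names invented for the example, and the real fix removed the function rather than adding a guard.

```go
package main

import (
	"errors"
	"fmt"
)

// Value is a hypothetical stand-in for a decoded point; it is not the tsm1 type.
type Value struct {
	UnixNano int64
	Data     interface{}
}

// errEmptyValues is returned instead of panicking on vals[0].
var errEmptyValues = errors.New("empty values slice")

// typeOfFirst names the type of the first value. It checks the length
// before indexing, so an empty slice yields an error rather than an
// "index out of range" panic.
func typeOfFirst(vals []Value) (string, error) {
	if len(vals) == 0 {
		return "", errEmptyValues
	}
	switch vals[0].Data.(type) {
	case float64:
		return "float64", nil
	case int64:
		return "int64", nil
	case bool:
		return "bool", nil
	case string:
		return "string", nil
	default:
		return "", fmt.Errorf("unsupported value type %T", vals[0].Data)
	}
}

func main() {
	// The empty case is reported as an error the caller can handle.
	if _, err := typeOfFirst(nil); err != nil {
		fmt.Println("guarded:", err)
	}
	kind, err := typeOfFirst([]Value{{UnixNano: 1, Data: 1.5}})
	fmt.Println(kind, err)
}
```

Returning an error here is one possible design; the commit described above instead lets DecodeBlock infer the type from the encoded block, so the caller never needs to inspect the slice.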
@syepes you can upload your files using this link: http://cwok.me/cu/ghi4365 |
@beckettsean I have just uploaded the requested files |
@beckettsean I think this issue is not totally resolved. After starting a fresh InfluxDB (d8d87d9) this morning, I got another "index out of range" panic:
|
This is probably fixed by #4402. Although you'll have to start with fresh data. |
@pauldix Thanks, no problem I will try that out when it gets merged. |
#4402 was merged. |
We were seeing this panic fairly frequently; we will test with a new nightly next week and report back. |
Hi, I still have a problem. I tested with
I'm using a single node with almost 10k points/s. Thanks. |
@Issif did you start with a fresh database? |
I did. I erased everything in /wal and /data as you advised in your previous answer. Thanks. |
@Issif What does |
When rewriting a tsm file, a panic on the Values slice could happen if there were no values in the slice and the conditions of the rewrite caused DecodeAndCombine to be called with the empty slice. This could happen if the number of new values for the points was equal to the MaxPointsInBlock config option and there were no future blocks after the current one being written. When this happens, DecodeAndCombine returns a zero-length remaining values slice, which is passed back into DecodeAndCombine one last time. In this case, we now just return the original block since there is nothing new to combine. Fixes #4444 #4365
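The early return described above can be sketched like this. It is a simplified model that works on decoded values rather than encoded blocks; `point` and `combine` are hypothetical names, and the real DecodeAndCombine has a different signature.

```go
package main

import (
	"fmt"
	"sort"
)

// point is a hypothetical decoded value: a timestamp and a float.
type point struct {
	ts  int64
	val float64
}

// combine merges newVals into block. At most maxPoints values are kept in
// the returned block; any overflow is handed back as remaining.
func combine(block, newVals []point, maxPoints int) (out, remaining []point) {
	// Nothing new to combine: return the original block untouched instead
	// of indexing into an empty slice.
	if len(newVals) == 0 {
		return block, nil
	}

	// Merge by timestamp; a new value overrides an existing one.
	byTS := make(map[int64]float64, len(block)+len(newVals))
	for _, p := range block {
		byTS[p.ts] = p.val
	}
	for _, p := range newVals {
		byTS[p.ts] = p.val
	}

	merged := make([]point, 0, len(byTS))
	for ts, v := range byTS {
		merged = append(merged, point{ts: ts, val: v})
	}
	sort.Slice(merged, func(i, j int) bool { return merged[i].ts < merged[j].ts })

	if maxPoints > 0 && len(merged) > maxPoints {
		return merged[:maxPoints], merged[maxPoints:]
	}
	return merged, nil
}

func main() {
	block := []point{{ts: 1, val: 1}, {ts: 2, val: 2}}
	// The merged size equals maxPoints exactly, so remaining comes back empty.
	out, rest := combine(block, []point{{ts: 3, val: 3}}, 3)
	// Passing the empty remainder back in is the case that used to panic.
	again, _ := combine(out, rest, 3)
	fmt.Println(len(out), len(rest), len(again))
}
```

The second call in `main` mirrors the scenario in the commit message: the remainder is empty, and the function returns the block it was given.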
Fixed via #4504 |
@daviesalex Let us know if you hit it after #4504. I was able to reproduce it with #4402 so it was not completely fixed by that patch. |
Hi, I still have this issue, with
My configuration : https://gist.github.com/Issif/0893c15e2cd75156d694 Error logs : https://gist.github.com/Issif/f9b77a2f2aab812696b2 And a
It crashes after a few seconds each time (started with no data again). |
@lssif looks like a slightly different panic this time. I'll take a look. Thanks. |
Overnight we saw the panic that you all expected (in tsm1.Values.MinTime), will upgrade and re-test. Will test with #4504 included.
|
#4513 opened to track new issue. |
Installed the nightly 0.9.5 build (version 0.9.5-nightly-6f80d69), configured the tsm1 data store and started running my test program. InfluxDB is running on an Ubuntu 13.04 VM with 4 cores, 3 GB RAM and a 20 GB disk.
The test program batch inserts 15K - 30K points every 5 seconds. Each point has 7 tag values, each tag has a cardinality of 10. Two continuous queries are configured to run every 5m and 1h, downsampling the data into 5m and 1h buckets.
Influx ingested the batch inserts without problems, but memory consumption continued to grow until the process memory reached about 94% of system memory. Casual observation (using top) showed that memory appeared to increase as the continuous queries executed.
When process memory reached 94% of system memory, the system became unresponsive and a 500 error was received on the batch write. I gracefully restarted the system and influx failed to start. Subsequent attempts to start influx all fail with the same panic.
Below is a log snippet showing the panic.