Take too long to load 1 datapoint into 3000 distinct series #375
@samuraraujo The problem with writing only a single point on each HTTP request is that you'll end up spending the majority of each request on HTTP overhead, so even if each request is relatively quick (20ms total), 3000 points will still take 60 seconds, and I would expect that to be a best-case scenario. For optimum throughput, you should batch the points into a single request, like this:

```json
[
  {
    "name": "series_one",
    "columns": ["value", "time"],
    "points": [
      [99, 123412341234]
    ]
  },
  {
    "name": "series_two",
    "columns": ["value", "time"],
    "points": [
      [33, 432143214321]
    ]
  }
]
```

Are you using a library to write the data in, or are you hitting the API directly? Regarding your question about the maximum number of series, InfluxDB is designed to handle millions of time series relatively easily.
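For illustration, here is a minimal sketch of that batching approach in Python (assuming the third-party `requests` library and the 0.x `/db/<database>/series` endpoint used elsewhere in this thread; the database name, credentials, and point values are placeholders):

```python
import json

import requests  # third-party HTTP client, assumed available

# One point per series, but every series collected into a single payload.
points_by_series = {
    "series_one": [99, 123412341234],
    "series_two": [33, 432143214321],
}

payload = [
    {"name": name, "columns": ["value", "time"], "points": [point]}
    for name, point in points_by_series.items()
]

# A single HTTP request carries the whole batch (database name and
# credentials are placeholders).
resp = requests.post(
    "http://localhost:8086/db/metrics/series",
    params={"u": "root", "p": "root"},
    data=json.dumps(payload),
)
resp.raise_for_status()
```

Whether you use a client library or raw HTTP, the important part is that all of the series end up in one request body.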
Todd, I am sending the datapoints in one single batch. More precisely, 3000 datapoints are sent at once via the HTTP API, and each datapoint belongs to a different series. Is it really normal for InfluxDB to take minutes to handle this request? You wrote: "InfluxDB is designed to handle millions of time series relatively easily." Note that I am using a single machine with 8GB of memory.
@samuraraujo Writing a single batch of 3000 points should take less than a second. When running tests locally with points in 10k batches, we see sustained point throughput of > 100k/sec. Adding the overhead of a remote server would reduce that, but not drastically. Can you send over any sample data or sample code that you're using to write the points? How much data are you including with each point? We don't have any benchmarks that specifically address the number of time series, but the series names function as an index, so performance will remain high even as the number of series grows.
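As a rough point of reference, this is one way to time a 10k-point batch from the client side (a hedged sketch, not the benchmark harness referred to above; the endpoint, credentials, series name, and timestamps are placeholders):

```python
import json
import time

import requests  # third-party HTTP client, assumed available

BATCH_SIZE = 10000

# A single series carrying 10k points in one request body.
payload = [{
    "name": "throughput_test",  # placeholder series name
    "columns": ["value", "time"],
    "points": [[i, 1396038465 + i] for i in range(BATCH_SIZE)],
}]

start = time.time()
resp = requests.post(
    "http://localhost:8086/db/metrics/series",
    params={"u": "root", "p": "root", "time_precision": "s"},
    data=json.dumps(payload),
)
resp.raise_for_status()
elapsed = time.time() - start

print("%d points in %.2fs -> %.0f points/sec" % (BATCH_SIZE, elapsed, BATCH_SIZE / elapsed))
```

Client-side numbers like this include HTTP and JSON overhead, so they will come out lower than server-side figures.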
Todd, here is a sample of the data that I am trying to add to InfluxDB. Do you see anything that could cause the delay that I reported?

```
[{"name"=>"metric.news.item.b167cafa4d700a938800e9d72507ba206c49f480.view", "columns"=>["value", "time"], "points"=>[[1, 1396038465]]},
{"name"=>"metric.news.item.a6c2333267bea34350b0dec33d38989dd11d05cf.view", "columns"=>["value", "time"], "points"=>[[1, 1396038465]]},
{"name"=>"metric.news.item.289cadefe3d93953aba1dbae71099dc1093b910c.view", "columns"=>["value", "time"], "points"=>[[3, 1396038465]]},
{"name"=>"metric.news.item.7b3874298fa2d95c0651828d1d13fb6f20b0ce5f.view", "columns"=>["value", "time"], "points"=>[[2, 1396038465]]},
{"name"=>"metric.news.item.02d171b106caf3584dbfb56fb5aaca2027cabe2c.view", "columns"=>["value", "time"], "points"=>[[1, 1396038465]]},
{"name"=>"metric.news.item.f4edf7ec88215ec1c03b1c3f19ff788a18ceecbc.view", "columns"=>["value", "time"], "points"=>[[6, 1396038465]]},
{"name"=>"metric.news.item.4c35dc637aff4eec9c8e3f77a2d67a2bab998d4d.view", "columns"=>["value", "time"], "points"=>[[1, 1396038465]]},
{"name"=>"metric.news.item.dc667f337ba0a0b8e44af171419f15a5312d0762.view", "columns"=>["value", "time"], "points"=>[[1, 1396038465]]},
{"name"=>"metric.news.item.062005b04be79af3fa28d55416498085a9961e98.ctr", "columns"=>["value", "time"], "points"=>[[0.029411764705882353, 1396038465]]},
{"name"=>"metric.news.item.5f87b2e08af7e58eecc5b5080f20c275f08f5386.ctr", "columns"=>["value", "time"], "points"=>[[0.5, 1396038465]]},
{"name"=>"metric.news.item.ad048a2726f48ad29b5461c3115801bab741d606.ctr", "columns"=>["value", "time"], "points"=>[[0.25, 1396038465]]},
{"name"=>"metric.news.item.7203f60774ce2df6e27983c6b0d666ea2bce7421.ctr", "columns"=>["value", "time"], "points"=>[[0.3333333333333333, 1396038465]]},
{"name"=>"metric.news.item.208b4d79a681e2e6f63985a8d76a65e1f3b0fded.ctr", "columns"=>["value", "time"], "points"=>[[0.5, 1396038465]]},
{"name"=>"metric.news.item.4c32036d5d38e4280079df17b6853ac88da0e32f.ctr", "columns"=>["value", "time"], "points"=>[[1.0, 1396038465]]},
{"name"=>"metric.news.item.2805823d54a43dd1c45ec75f74441adccda38a84.ctr", "columns"=>["value", "time"], "points"=>[[0.010752688172043012, 1396038465]]}
]
```
Todd, you can download the file with the entire data here:
@samuraraujo This is great, thanks! With some local testing, I wasn't able to reproduce the large delay you were seeing, but I will try this dataset and let you know what I see. Could you also send along the code or command you're issuing to actually write the data?
Todd, I observe the same delay using the command:

```
curl -X POST 'http://localhost:8086/db/metrics/series?u=root&p=root&time_precision=s' -d @log.txt
```

The code is not the issue.
@samuraraujo Alright, I just tried your dataset with the same curl command you provided above, and the request returns in 2-3 seconds. Can you give me any more information about the machine you're running this on? Also, what is the behavior of the process while the write is running (CPU %, memory utilization, etc.)? Is there anything out of the ordinary in the log file?

@samuraraujo Could you also send over a copy of your config file?
Here is my config file:

Here are the machine details: Below is the machine status while running the command. Notice that I did not change the default config file.
@samuraraujo OK, it sounds like something else is going on. The CPU utilization should be much higher during writes, so something else appears to be the bottleneck. Can you run the following command while the writes are happening and send over some of the output?

Thanks!
Here is the output:
Hey @samuraraujo, can you run
Hi Shahid, here is the output: https://drive.google.com/file/d/0Bwci5MBqp2fyeDFYX3dxbWc1VG8/edit?usp=sharing

As soon as I run the command below, the I/O grows from 0.0% to 67%.

```
curl -X POST 'http://localhost:8086/db/metrics/series?u=root&p=root&time_precision=s' -d @log_json.txt
```
It looks like the filesystem is doing a lot of I/O. This is probably because we break up the request per time series (we have a story to fix this: #310) and because the default config flushes the WAL to disk on every write. So currently, with your config, you're flushing to disk 3000 times per request. To verify my theory, can you change your config to match the following:

```toml
# the number of writes after which the WAL will be flushed, 0 for flushing on every write
flush-after = 10000

# the number of writes after which a bookmark will be created
bookmark-after = 10000

# the number of writes after which an index entry is created pointing
# to the offset of the first request, defaults to 1k
index-after = 10000

# the number of requests per one log file; if new requests come in, a
# new log file will be created
requests-per-logfile = 100000
```
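To put rough numbers on why flushing on every write hurts here (a back-of-the-envelope sketch only; the per-flush latency below is an assumed figure for rotating storage, not a measurement of this system):

```python
# Rough estimate of time spent purely on WAL flushes for one batched request.
series_per_request = 3000          # the request is split into one write per series
assumed_flush_latency_s = 0.010    # ~10 ms per flush/fsync is an assumption, not measured

default_config_cost = series_per_request * assumed_flush_latency_s  # flush on every write
tuned_config_cost = 1 * assumed_flush_latency_s  # flush-after = 10000 -> at most one flush

print("flush on every write: ~%.0f s of flushing per request" % default_config_cost)
print("flush-after = 10000:  ~%.2f s of flushing per request" % tuned_config_cost)
```

Even under these assumptions, the default flush-on-every-write behavior alone can account for tens of seconds per request.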
Now it works!!! Very fast! Congrats! Can you point me to the documentation that explains these parameters in more detail?
Improving the documentation is at the top of our todo list; we'll update it in the next few weeks. Currently we have very poor documentation when it comes to configuration options. In fact, there is no documentation other than what you get in the config file itself.
I'll go ahead and close this issue, since it's been resolved.
I have 3000 series with two columns each (value, timestamp), and I try to load 1 datapoint into each series every 20s via the REST API. The task takes many minutes to complete; I would expect it to load in a few seconds.

So I have a question: how many distinct series does InfluxDB support? Thousands? Hundreds of thousands? Millions?

Are there any benchmarks measuring the time to load data into InfluxDB?

Do you have a sample or tutorial on modelling large web analytics data in InfluxDB?