NaN in StatisticsSenderService JSON output #29412
Comments
assign core |
New categories assigned: core @Dr15Jones,@smuzaffar,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks |
A new Issue was created by @makortel Matti Kortelainen. @Dr15Jones, @silviodonato, @dpiparo, @smuzaffar, @makortel can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
@bbockelm Would you be able to take a look? |
@vkuznet Is it more important to get a conforming JSON, or a sensible number instead of e.g. |
Matti, what is important is valid JSON. NaN is not a valid JSON value, so the service fails to parse the JSON. From my perspective, I don't really care whether you provide a string or a number, but for the data to be usable in possible analysis it should have a consistent data type. |
Thanks, so a |
yes, 0 or -1 will work. And, the rate of errors I see is ~1500 in 24h where the total number of docs received by the collector was about 2.5M. |
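To illustrate the agreement above (emit a numeric sentinel such as -1 instead of `nan`), here is a minimal sketch. The helper names `finiteOr` and `jsonField` and the key used in the usage note are hypothetical and do not come from `StatisticsSenderService.cc`; the sketch only shows that a non-finite value can be replaced before it reaches the JSON stream, so the field stays a number.

```cpp
#include <cassert>
#include <cmath>
#include <sstream>
#include <string>

// Hypothetical helper (not part of CMSSW): returns `value` when it is finite,
// otherwise the sentinel. NaN and +/-infinity are both replaced.
inline double finiteOr(double value, double sentinel = -1.0) {
  return std::isfinite(value) ? value : sentinel;
}

// Hypothetical helper: formats one `"key":number` JSON member.
// The key is assumed to need no escaping.
inline std::string jsonField(const std::string& key, double value) {
  const double safe = finiteOr(value);
  assert(std::isfinite(safe));  // the default sentinel itself is finite
  std::ostringstream os;
  os << '"' << key << "\":" << safe;
  return os.str();
}
```

With this, `jsonField("example_sigma", std::nan(""))` yields `"example_sigma":-1`, which keeps the data type consistent for downstream analysis, while finite values pass through unchanged.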
#35142 (and https://hypernews.cern.ch/HyperNews/CMS/get/swDevelopment/3593.html) reports a similar issue with cmssw/Utilities/StorageFactory/src/StatisticsSenderService.cc Lines 76 to 86 in 27edb97
Since the entry is there, and By quick look this snippet looks like it intends to calculate the mean and stddev since the last update, especially because
I think this leads to the second term in the Should |
@makortel, I did not inspect all the errors (as I reported today, roughly 7000 invalid-JSON failures were reported yesterday). I don't know how many of those we have per day, but I would assume this rate. So the code should check that all math operations are valid and produce valid numerical values. I don't see why you can't just check that you pass a non-negative value to sqrt and that you do not divide by zero. Those seem like obvious checks to add to the code for all values which this code streams to |
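The two checks requested above (no division by zero, no negative argument to `sqrt`) can be sketched as follows. This is an illustration under assumptions, not the code in `StatisticsSenderService.cc`: the struct `Moments`, the function `safeMoments`, and the inputs `sum`, `sumSquares`, and `count` are hypothetical names for "sum of values, sum of squared values, and number of entries since the last update", and the variance formula is the usual mean-of-squares minus square-of-mean.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical result type for the sketch.
struct Moments {
  double mean;
  double sigma;
};

// Computes mean and standard deviation with both guards in place.
inline Moments safeMoments(double sum, double sumSquares, std::uint64_t count) {
  if (count == 0) {
    return {0.0, 0.0};  // nothing accumulated: avoid 0/0
  }
  const double n = static_cast<double>(count);
  const double mean = sum / n;
  if (!std::isfinite(mean)) {
    return {0.0, 0.0};  // non-finite input: fall back to a valid number
  }
  double variance = sumSquares / n - mean * mean;
  // Rounding can push the variance slightly below zero, which would make
  // sqrt return NaN; clamp negative and non-finite values to zero.
  if (!std::isfinite(variance) || variance < 0.0) {
    variance = 0.0;
  }
  const double sigma = std::sqrt(variance);
  assert(std::isfinite(sigma));
  return {mean, sigma};
}
```

Whether 0 or -1 is the better fallback for the empty case is a policy choice; the thread above indicates either would be accepted by the collector.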
So looking at the code, the numbers can be changing on one thread while being read out on another. They are all |
@makortel For open cycles, you can refer to And as discussed at ORP in Aug, We will update the ORP spreadsheet later. |
Reported by @vkuznet in https://hypernews.cern.ch/HyperNews/CMS/get/swDevelopment/3555.html