x-pack/metricbeat/module/sql: Problem with handling types properly #40090
Comments
Hi, I think detecting the type by value is not a good approach. What about other strings like "0.002"? Also, you never know what IDs are out there or what they look like. Maybe you could handle all data as strings and let the user decide which data needs to be converted using processors?
I've started taking another look at this.

Current behaviour

"sql": {
"query": "select '0054321'::varchar(20) as strCol, 12345 as intCol",
"metrics": {
"numeric": {
"strcol": 54321,
"intcol": 12345
}
},
"driver": "postgres"
}

The strcol value is incorrectly converted to the number 54321, which drops its leading zeros.

Root Cause

The issue stems from type handling in the module's code. While github.com/lib/pq correctly interprets VARCHAR as a Go string, the current implementation in the SQL module converts all non-special types to float64 if possible, leading to the loss of leading zeros. See how github.com/lib/pq (used by the SQL module for Postgres) handles data types: https://pkg.go.dev/github.com/lib/pq#hdr-Data_Types

The relevant code is at beats/metricbeat/helper/sql/sql.go, line 169 in ce4a17b.

Proposed solution

A fix has been proposed in PR #41607 that properly handles string types. Output after the fix:

"sql": {
"driver": "postgres",
"query": "select '0054321'::varchar(20) as strCol, 12345 as intCol",
"metrics": {
"string": {
"strcol": "0054321"
},
"numeric": {
"intcol": 12345
}
}
}

The fix ensures proper type categorization and preservation of string values.

A temporary workaround is available for those affected by this bug while waiting for the official fix. You can modify the query in your configuration as shown below:

Before:
select '0054321'::varchar(20) as strCol, 12345 as intCol

After:
select 'x' || '0054321'::varchar(20) as strCol, 12345 as intCol

This prepends the character x to the value, so it can no longer be parsed as a number. The output will look like this:

"sql": {
"driver": "postgres",
"query": "select 'x' || '0054321'::varchar(20) as strCol, 12345 as intCol",
"metrics": {
"numeric": {
"intcol": 12345
},
"string": {
"strcol": "x0054321"
}
}
}

Notice the value for strcol: it is kept as a string, with the x prefix and the leading zeros intact.

But yes, we recommend waiting for the official fix if possible.
Can this be closed now that the PR is merged?
Yes, I am just waiting for the 8.x backport PR to be merged.
The fix should be available in 8.16.2. Feature freeze (FF) for 8.16.1 was yesterday. If the branch for 8.16.1 has not been cut yet, the fix could still land in the 8.16.1 release, but this is unlikely. The fix will definitely be available in the 8.16.2 release.
Creating issue with content provided by @shmsr:
The Metricbeat SQL module automatically changes the data type of VARCHAR columns to a numeric type when the values look numeric, which causes the original values to be lost in certain cases.
This is easy to reproduce with the following Metricbeat configuration:
The previous example will return 2 columns: the first column with value 0054321 (of VARCHAR type) and the second column with 12345 as a number.
But after running Metricbeat, we see the following:
Query:
Response:
which is incorrect, i.e., the leading zeroes of the VARCHAR value are lost. To retain the leading zeroes, strcol should ideally have been a string.
The same will happen, for example, with a VARCHAR value like 5501174335. It ends up being represented as the float 5.501174335E9, which is wrong.
Root cause:
The root problem exists here. If you look at the []byte and default cases, the value is handled as a string and then parsed as a float. If the parse succeeds, it becomes a float; otherwise it remains a string.
For better understanding, you can also take a look at the unit tests (test inputs) here.
In the unit tests, cases such as int, uint, etc. are handled, but float64 is expected rather than int or uint, which is not correct. Similar problems exist elsewhere.
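The behaviour described above can be reproduced with a small standalone sketch. This is not the module's actual code: `convertByValue` is a hypothetical stand-in that only mirrors the "stringify, then try to parse as float" logic, so the exact shape of the real function is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// convertByValue is a hypothetical stand-in for the logic described above:
// any value that is not specially handled is turned into a string and then
// parsed as a float64 whenever the parse succeeds.
func convertByValue(v interface{}) interface{} {
	var s string
	switch x := v.(type) {
	case []byte:
		s = string(x)
	default:
		s = fmt.Sprint(x)
	}
	if f, err := strconv.ParseFloat(s, 64); err == nil {
		return f // numeric-looking strings end up here, even VARCHAR values
	}
	return s
}

func main() {
	// A VARCHAR value with leading zeros becomes a float64; the zeros are gone.
	fmt.Printf("%T %v\n", convertByValue("0054321"), convertByValue("0054321"))
	// A long numeric-looking identifier also becomes a float64.
	fmt.Printf("%T\n", convertByValue("5501174335"))
	// An integer loses its integer type as well.
	fmt.Printf("%T\n", convertByValue(int64(12345)))
	// Only values that fail to parse stay strings.
	fmt.Printf("%T %v\n", convertByValue("abc"), convertByValue("abc"))
}
```

The point of the sketch is that the decision is made from the value alone, so the column's declared type never gets a say.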
To fix this behavior, a change is needed to handle these types properly. Also, add cases like:
i.e., any string with a leading 0 (not immediately followed by a dot) should be a string. This should handle types like VARCHAR, TEXT, etc.
Also, types like int, uint, etc. should remain as they are and not be converted to float.
And then we have to update the code here, where all those types should be handled properly. For example, numeric fields in ES should use those types and not just float64, as ES supports more numeric types.
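A minimal sketch of what type-preserving handling could look like follows. It is not the code from the actual fix: `Buckets`, `categorize`, and `hasLeadingZero` are hypothetical names, and the choice to still parse `[]byte` values as numbers (unless they start with a leading zero) is an assumption made only to illustrate the rules listed above.

```go
package main

import (
	"fmt"
	"strconv"
)

// Buckets mirrors the "numeric" / "string" grouping seen in the module output.
// The type is hypothetical and exists only for this sketch.
type Buckets struct {
	Numeric map[string]interface{}
	String  map[string]interface{}
}

// hasLeadingZero reports whether s starts with 0 not immediately followed by a dot.
func hasLeadingZero(s string) bool {
	return len(s) > 1 && s[0] == '0' && s[1] != '.'
}

// categorize keeps each value's Go type instead of guessing from its content.
func categorize(row map[string]interface{}) Buckets {
	b := Buckets{Numeric: map[string]interface{}{}, String: map[string]interface{}{}}
	for col, v := range row {
		switch x := v.(type) {
		case int, int8, int16, int32, int64,
			uint, uint8, uint16, uint32, uint64,
			float32, float64:
			b.Numeric[col] = x // integers stay integers, floats stay floats
		case string:
			b.String[col] = x // the driver said string, so keep it a string
		case []byte:
			// Assumption: raw bytes may still be parsed, except for leading-zero values.
			s := string(x)
			if f, err := strconv.ParseFloat(s, 64); err == nil && !hasLeadingZero(s) {
				b.Numeric[col] = f
			} else {
				b.String[col] = s
			}
		default:
			b.String[col] = fmt.Sprint(x)
		}
	}
	return b
}

func main() {
	b := categorize(map[string]interface{}{
		"strcol": "0054321",
		"intcol": int64(12345),
	})
	fmt.Printf("%T %v\n", b.String["strcol"], b.String["strcol"])   // string 0054321
	fmt.Printf("%T %v\n", b.Numeric["intcol"], b.Numeric["intcol"]) // int64 12345
}
```

With this shape the driver's reported type decides the bucket, and value-based guessing is limited to the `[]byte` case.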