Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix escaping filter_key in prometheus output #260

Merged
merged 3 commits into from
Mar 26, 2023
Merged

Conversation

u5surf
Copy link
Collaborator

@u5surf u5surf commented Feb 19, 2023

#142 (comment)

* Fixes vozlt#142
* it can be escaped the 2 - 4 bytes character
@u5surf u5surf requested review from SuperQ and vozlt February 20, 2023 06:45
Copy link
Owner

@vozlt vozlt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll trust you and leave it to you. The ngx_http_vhost_traffic_status_escape_prometheus() function has been merged from this PR.

@u5surf
Copy link
Collaborator Author

u5surf commented Feb 21, 2023

@jongiddy Hi, I heard you implement this function, please you could review on this patch🙏

Copy link
Contributor

@jongiddy jongiddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Prometheus exposition format is UTF-8 encoded so valid UTF-8 characters do not need to be encoded further.

Encoding valid UTF-8 characters to percent-encoding would be surprising. Why did you choose this encoding? The format does not mention this form of escape.

There are 2 problems here:

  1. \xff is an invalid UTF-8 character but ngx_utf8_decode does not detect this.
  2. Even if did detect it was invalid, and it tried to use the \xff encoding that it uses for dec > 0x10ffff that encoding is not supported.

I would suggest a fix to ngx_utf8_decode to detect more invalid UTF-8 characters, including \xff. This allows valid UTF-8 to continue to be sent literally.

Change the handling of invalid UTF-8 to do something other than \x... Maybe percent-encoding is the best we can do.

@u5surf
Copy link
Collaborator Author

u5surf commented Feb 25, 2023

@jongiddy

I would suggest a fix to ngx_utf8_decode to detect more invalid UTF-8 characters,

I agree with your suggestion. Right away I send the patch to nginx-devel team.
https://mailman.nginx.org/pipermail/nginx-devel/2023-February/34K6WOSI5JSIRESP32TY73EIIUHATOM5.html

After patch merged. to follow backwards compatibility, we suppose to prepare the branches between versions sometime soon.

@jongiddy
Copy link
Contributor

Great. Thanks for creating the nginx patch.

Also, I retract problem 2 in my comment above. My initial reading was that it would replace an invalid character with \xff in the exposition format, which Prometheus does not recognize.

Taking a second look, it sends two backslashes (\\xff) which is valid in the exposition format and the final text shown to users will be \xfftest-api.

Maybe %FFtest-api would have been better. It would allow the code to allocate 3 bytes instead of 5 for the remaining characters. But I'm not sure it is worth changing behavior that someone might be expecting.

@u5surf
Copy link
Collaborator Author

u5surf commented Feb 26, 2023

Maybe %FFtest-api would have been better. It would allow the code to allocate 3 bytes instead of 5 for the remaining characters. But I'm not sure it is worth changing behavior that someone might be expecting.

@jongiddy
I consider that your opinion is true in this aspect, but @axingblog @topicgit had said that the below behavior is not expecting because it can be occurred the invalid format error in promethes(#142 (comment)). I don't see wether it is true or not, once I heard to them what is happened but none of response from them(#142 (comment)). Thus they seem to hope that their character get to be encoding something expression format. So I choice the percent encoding which is adopted generally in RFC 2396 https://www.ietf.org/rfc/rfc2396.txt
Also I was hearing in these points, but they had not any response yet(#142 (comment)).

* it is necessary while the below patch is released officially.
* nginx/nginx@2c5fccd
@u5surf
Copy link
Collaborator Author

u5surf commented Mar 18, 2023

@jongiddy
The nginx patch has merged and I consider that it may release next versions!
nginx/nginx@2c5fccd
For compatibility, We consider that it should return as an invalid character which starts with above 0xf8 following modified nginx decode rules. I changed this patch such that. Can you check it?

@@ -187,7 +187,7 @@ ngx_http_vhost_traffic_status_escape_prometheus(ngx_pool_t *pool, ngx_str_t *buf
}
} else {
char_end = pa;
if (ngx_utf8_decode(&char_end, last - pa) > 0x10ffff) {
if (*char_end > 0xf8 || ngx_utf8_decode(&char_end, last - pa) > 0x10ffff) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (*char_end > 0xf8 || ngx_utf8_decode(&char_end, last - pa) > 0x10ffff) {
if (*pa >= 0xf8 || ngx_utf8_decode(&char_end, last - pa) > 0x10ffff) {

This is a good approach, but 0xf8 is invalid too, so you need to use >=.

I also suggest using *pa as this is the pointer to the current character. char_end happens to have the same value, but it has a different meaning associated with its use in the ngx_utf8_decode function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jongiddy Thanks! I fixed it.

@@ -237,7 +237,7 @@ ngx_http_vhost_traffic_status_escape_prometheus(ngx_pool_t *pool, ngx_str_t *buf
}
} else {
char_end = pa;
if (ngx_utf8_decode(&char_end, last - pa) > 0x10ffff) {
if (*char_end > 0xf8 || ngx_utf8_decode(&char_end, last - pa) > 0x10ffff) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jongiddy Also fixed

@u5surf u5surf merged commit a98a4b8 into vozlt:master Mar 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Invalid UTF-8 label -- Prometheus Error with VTS Exporter.
3 participants