Skip to content

Conversation

@mouhc1ne
Copy link
Contributor

@mouhc1ne mouhc1ne commented Sep 10, 2025

  • Change URL_ENCODE to encode spaces as +.
  • Add a new URL_ENCODE_COMPONENT scalar function, which encodes spaces as %20.
  • Both encoding functions encode all characters in the input except for alphanumerics, ., -, _, and ~.
  • Allow URL_DECODE to fail gracefully if the input can't be decoded, by returning null and adding a warning in the header, similar to other scalar functions.
  • Update csv-tests:
      - Test decoding for both + and %20 back to space.
      - Reduce the length of doc lines to less than 76 chars, as mentioned in the contribution guide.
  • Minor changes to the documentation.

@elasticsearchmachine elasticsearchmachine added v9.2.0 needs:triage Requires assignment of a team area label labels Sep 10, 2025
@mouhc1ne mouhc1ne requested a review from Copilot September 10, 2025 20:21
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances URL encoding functionality in ESQL by modifying existing functions and adding a new URL encoding variant. The main purpose is to provide more compliant URL encoding with different space encoding behaviors.

Key changes:

  • Modified URL_ENCODE to encode spaces as + instead of %20
  • Added new URL_ENCODE_COMPONENT function that encodes spaces as %20
  • Enhanced URL_DECODE to gracefully handle invalid input by returning null and adding warnings

Reviewed Changes

Copilot reviewed 31 out of 34 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
UrlEncodeComponent.java New scalar function implementation for component-style URL encoding
UrlEncode.java Updated to use custom encoding that produces + for spaces
UrlDecode.java Enhanced with exception handling for graceful failure
UrlCodecUtils.java New utility class providing custom URL encoding codecs
EsqlFunctionRegistry.java Registered the new URL_ENCODE_COMPONENT function
string.csv-spec Updated test cases and examples for new encoding behaviors
Various test files Added comprehensive test coverage for the new function
Documentation files Updated function descriptions and examples

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@github-actions
Copy link
Contributor

github-actions bot commented Sep 10, 2025

@github-actions
Copy link
Contributor

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

Expand for a quick overview

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

🤔 Need help?

@mouhc1ne mouhc1ne force-pushed the url_encode_component branch 2 times, most recently from 5945402 to 58925f6 Compare September 10, 2025 20:39
@mouhc1ne mouhc1ne requested a review from Copilot September 10, 2025 20:40
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 31 out of 34 changed files in this pull request and generated 3 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@mouhc1ne mouhc1ne self-assigned this Sep 10, 2025
@mouhc1ne mouhc1ne force-pushed the url_encode_component branch from 58925f6 to ae03646 Compare September 10, 2025 20:50
@mouhc1ne mouhc1ne added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL ES|QL-ui Impacts ES|QL UI labels Sep 10, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine removed the needs:triage Requires assignment of a team area label label Sep 10, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/kibana-esql (ES|QL-ui)

@mouhc1ne mouhc1ne marked this pull request as draft September 11, 2025 12:52
@mouhc1ne mouhc1ne force-pushed the url_encode_component branch 2 times, most recently from 97fb60a to 9048842 Compare September 15, 2025 13:02
@mouhc1ne mouhc1ne marked this pull request as ready for review September 15, 2025 13:17
@mouhc1ne mouhc1ne force-pushed the url_encode_component branch from 9048842 to 6d5b121 Compare September 15, 2025 14:47
@mouhc1ne mouhc1ne force-pushed the url_encode_component branch 2 times, most recently from 21fe748 to 96fd1a2 Compare September 17, 2025 14:57
@mouhc1ne mouhc1ne requested review from ivancea and nik9000 September 17, 2025 14:58
@mouhc1ne mouhc1ne force-pushed the url_encode_component branch from 3cd4855 to 9a44483 Compare September 17, 2025 15:47
@mouhc1ne mouhc1ne force-pushed the url_encode_component branch from 9a44483 to 88802bc Compare September 19, 2025 14:22
@mouhc1ne mouhc1ne force-pushed the url_encode_component branch 3 times, most recently from 0214f71 to 0516afa Compare September 21, 2025 18:39
- Change URL_ENCODE to encode spaces as +.
- Add a new URL_ENCODE_COMPONENT scalar function, which encodes spaces as %20.
- Both encoding functions encode all characters in the input except for alphanumerics, ., -, _, and ~.
- Allow URL_DECODE to fail gracefully if the input can't be decoded, by returning null and adding a warning in the header, similar to other scalar functions.
- Update csv-tests:
  - Test decoding for both + and %20 back to space.
  - Reduce the length of doc lines to less than 76 chars, as mentioned in the contribution guide.
- Minor changes to the documentation.
- Manually perform percent-encoding directly on the BreakingBytesRefBuilder scratch area, and only if the input was gonna change after encoding.
- Use Apache's PercentEncode as the ground truth during unit testing.
- Add test cases with fixed strings.
- Add fixed test cases that fail decoding
- Randomize the encoder used in decoding tests
- Add url_encode_component capability to some encode/decode tests whose behavior changed to ensure they only run if the url_encode_component capability is available
@mouhc1ne mouhc1ne force-pushed the url_encode_component branch from 0516afa to 6b2e950 Compare September 21, 2025 23:43
@mouhc1ne mouhc1ne merged commit 1af8b3e into elastic:main Sep 22, 2025
34 checks passed
gmjehovich pushed a commit to gmjehovich/elasticsearch that referenced this pull request Sep 22, 2025
- Change URL_ENCODE to encode spaces as +.
- Add a new URL_ENCODE_COMPONENT scalar function, which encodes spaces as %20.
- Both encoding functions encode all characters in the input except for alphanumerics, ., -, _, and ~.
- Allow URL_DECODE to fail gracefully if the input can't be decoded, by returning null and adding a warning in the header, similar to other scalar functions.
- Manually perform percent-encoding directly on the BreakingBytesRefBuilder scratch area, and only if the input was gonna change after encoding.
- Update csv-tests:
  - Test decoding for both + and %20 back to space.
  - Reduce the length of doc lines to less than 76 chars, as mentioned in the contribution guide.
  - Add test cases with fixed strings
- Minor changes to the documentation.
- Update Unit tests:
  - Add fixed test cases that fail decoding
  - Randomize the encoder used in decoding tests
  - Use Apache's PercentEncode as the ground truth during unit testing.
DonalEvans pushed a commit to DonalEvans/elasticsearch that referenced this pull request Sep 22, 2025
- Change URL_ENCODE to encode spaces as +.
- Add a new URL_ENCODE_COMPONENT scalar function, which encodes spaces as %20.
- Both encoding functions encode all characters in the input except for alphanumerics, ., -, _, and ~.
- Allow URL_DECODE to fail gracefully if the input can't be decoded, by returning null and adding a warning in the header, similar to other scalar functions.
- Manually perform percent-encoding directly on the BreakingBytesRefBuilder scratch area, and only if the input was gonna change after encoding.
- Update csv-tests:
  - Test decoding for both + and %20 back to space.
  - Reduce the length of doc lines to less than 76 chars, as mentioned in the contribution guide.
  - Add test cases with fixed strings
- Minor changes to the documentation.
- Update Unit tests:
  - Add fixed test cases that fail decoding
  - Randomize the encoder used in decoding tests
  - Use Apache's PercentEncode as the ground truth during unit testing.
phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Oct 2, 2025
BASE=62e53cd13c8846f0729390450a5cfbd0304e7f40
HEAD=6b2e950488ed317736d19f100fbcf58bb587fea4
Branch=main
phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Oct 8, 2025
BASE=62e53cd13c8846f0729390450a5cfbd0304e7f40
HEAD=6b2e950488ed317736d19f100fbcf58bb587fea4
Branch=main
phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Oct 17, 2025
BASE=62e53cd13c8846f0729390450a5cfbd0304e7f40
HEAD=6b2e950488ed317736d19f100fbcf58bb587fea4
Branch=main
phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Oct 23, 2025
BASE=62e53cd13c8846f0729390450a5cfbd0304e7f40
HEAD=6b2e950488ed317736d19f100fbcf58bb587fea4
Branch=main
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >enhancement ES|QL-ui Impacts ES|QL UI Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants