Skip to content

Conversation

@phananh1010
Copy link
Owner

@phananh1010 phananh1010 commented Oct 8, 2025

Single commit with tree=6b2e950488ed317736d19f100fbcf58bb587fea4^{tree}, parent=62e53cd13c8846f0729390450a5cfbd0304e7f40. Exact snapshot of upstream PR head. No conflict resolution attempted.

Summary by CodeRabbit

  • New Features
    • Added URL_ENCODE_COMPONENT to encode URL components (preview/snapshot).
    • URL_DECODE now returns null and adds a warning header when input cannot be decoded.
  • Documentation
    • Expanded descriptions for URL_ENCODE and URL_DECODE, clarifying percent-encoding rules and error behavior.
    • Added full documentation, syntax, parameters, supported types, and examples for URL_ENCODE_COMPONENT.
    • Updated examples for URL_ENCODE and URL_DECODE to use clearer sample URLs and show expected outputs.

BASE=62e53cd13c8846f0729390450a5cfbd0304e7f40
HEAD=6b2e950488ed317736d19f100fbcf58bb587fea4
Branch=main
@coderabbitai
Copy link

coderabbitai bot commented Oct 8, 2025

Walkthrough

Adds a new URL_ENCODE_COMPONENT function (docs, registry, capability, evaluator, tests), introduces UrlCodecUtils for RFC3986 encoding, refactors URL_ENCODE to use the utility and scratch buffers, clarifies URL_DECODE behavior (null + warning on failure), and updates/expands documentation, examples, JSON definitions, and test fixtures.

Changes

Cohort / File(s) Summary
Docs — ESQL function descriptions & parameters
docs/reference/query-languages/esql/_snippets/functions/description/url_decode.md, .../description/url_encode.md, .../description/url_encode_component.md, .../parameters/url_decode.md, .../parameters/url_encode.md, .../parameters/url_encode_component.md
Updated descriptions for URL_DECODE (null + warning on failure) and URL_ENCODE (explicit encoding rules). Added generated description/parameters for URL_ENCODE_COMPONENT. Minor parameter wording tweaks.
Docs — ESQL examples
docs/reference/query-languages/esql/_snippets/functions/examples/url_decode.md, .../examples/url_encode.md, .../examples/url_encode_component.md
Replaced example inputs/outputs; added example snippet for URL_ENCODE_COMPONENT.
Docs — ESQL layout/types for component
docs/reference/query-languages/esql/_snippets/functions/layout/url_encode_component.md, .../types/url_encode_component.md
Added layout and supported types snippet for URL_ENCODE_COMPONENT.
Kibana — function JSON definitions
docs/reference/query-languages/esql/kibana/definition/functions/url_decode.json, .../url_encode.json, .../url_encode_component.json
Updated descriptions, parameter texts, and examples for url_decode/url_encode. Added new definition for url_encode_component (signatures, example, preview, snapshot_only).
Kibana — function docs
docs/reference/query-languages/esql/kibana/docs/functions/url_decode.md, .../url_encode.md, .../url_encode_component.md
Synced docs with new descriptions/examples; added page for URL_ENCODE_COMPONENT.
Engine — capability & registry
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java, .../expression/ExpressionWritables.java, .../EsqlFunctionRegistry.java
Added URL_ENCODE_COMPONENT capability; registered UrlEncodeComponent as writeable and function (snapshot).
Engine — URL codec implementation
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/util/UrlCodecUtils.java, .../scalar/convert/UrlEncode.java, .../scalar/convert/UrlEncodeComponent.java, .../scalar/convert/UrlDecode.java
Introduced UrlCodecUtils (RFC3986 encoding, plus-for-space option). Refactored UrlEncode to use utility with BreakingBytesRefBuilder scratch. Added UrlEncodeComponent function using utility (%20 for spaces). Updated UrlDecode description and evaluator to warn on IllegalArgumentException.
Tests — unit and serialization
x-pack/plugin/esql/src/test/java/.../AbstractUrlEncodeDecodeTestCase.java, .../UrlEncodeTests.java, .../UrlDecodeTests.java, .../UrlEncodeComponentTests.java, .../UrlEncodeComponentErrorTests.java, .../UrlEncodeComponentSerializationTests.java
Reworked test infra to enum-driven codec types; updated ENCODE/DECODE tests; added tests for URL_ENCODE_COMPONENT (happy path, errors, serialization).
Tests — spec fixtures
x-pack/plugin/esql/qa/testFixtures/src/main/resources/string.csv-spec
Added/expanded CSV specs for URL_ENCODE_COMPONENT and component-level decode; added capabilities, new inputs (underscores, tildes, emoji), negative cases, and end-to-end scenarios.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant ESQL as ESQL Engine
  participant Fn as UrlEncode / UrlEncodeComponent
  participant Util as UrlCodecUtils
  participant Scratch as BreakingBytesRefBuilder

  Client->>ESQL: EVAL URL_ENCODE(_)/URL_ENCODE_COMPONENT(_)
  ESQL->>Fn: toEvaluator(...)
  Fn->>Scratch: acquire (THREAD_LOCAL / Fixed)
  Fn->>Util: urlEncode(val, scratch, plusForSpace)
  Util-->>Fn: BytesRef (encoded or original)
  Fn-->>ESQL: result BytesRef
  ESQL-->>Client: encoded value
  note over Fn,Util: plusForSpace=true for URL_ENCODE,<br/>false (%20) for URL_ENCODE_COMPONENT
Loading
sequenceDiagram
  autonumber
  participant Client
  participant ESQL as ESQL Engine
  participant Decode as UrlDecode

  Client->>ESQL: EVAL URL_DECODE(_)
  ESQL->>Decode: process(val)
  alt Decoding succeeds
    Decode-->>ESQL: decoded BytesRef
    ESQL-->>Client: decoded value
  else IllegalArgumentException
    Decode-->>ESQL: null
    ESQL-->>Client: null + Warning header
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I nibbled bytes and chewed the tilde,
Packed spaces neat—plus or %20 gilded.
New paths to encode each component part,
A warning if decode falls apart.
With scratchy buffers I hop along,
RFC tunes hum my URL song. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.14% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title accurately reflects that this pull request mirrors the upstream elastic#134503 for AI review by taking a snapshot of the HEAD tree, succinctly conveying the primary objective without extraneous details.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch eval/pr-134503-diff

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 62e53cd and 247f876.

⛔ Files ignored due to path filters (4)
  • docs/reference/query-languages/esql/images/functions/url_encode_component.svg is excluded by !**/*.svg
  • x-pack/plugin/esql/src/main/generated/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlDecodeEvaluator.java is excluded by !**/generated/**
  • x-pack/plugin/esql/src/main/generated/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponentEvaluator.java is excluded by !**/generated/**
  • x-pack/plugin/esql/src/main/generated/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeEvaluator.java is excluded by !**/generated/**
📒 Files selected for processing (31)
  • docs/reference/query-languages/esql/_snippets/functions/description/url_decode.md (1 hunks)
  • docs/reference/query-languages/esql/_snippets/functions/description/url_encode.md (1 hunks)
  • docs/reference/query-languages/esql/_snippets/functions/description/url_encode_component.md (1 hunks)
  • docs/reference/query-languages/esql/_snippets/functions/examples/url_decode.md (1 hunks)
  • docs/reference/query-languages/esql/_snippets/functions/examples/url_encode.md (1 hunks)
  • docs/reference/query-languages/esql/_snippets/functions/examples/url_encode_component.md (1 hunks)
  • docs/reference/query-languages/esql/_snippets/functions/layout/url_encode_component.md (1 hunks)
  • docs/reference/query-languages/esql/_snippets/functions/parameters/url_decode.md (1 hunks)
  • docs/reference/query-languages/esql/_snippets/functions/parameters/url_encode.md (1 hunks)
  • docs/reference/query-languages/esql/_snippets/functions/parameters/url_encode_component.md (1 hunks)
  • docs/reference/query-languages/esql/_snippets/functions/types/url_encode_component.md (1 hunks)
  • docs/reference/query-languages/esql/kibana/definition/functions/url_decode.json (2 hunks)
  • docs/reference/query-languages/esql/kibana/definition/functions/url_encode.json (2 hunks)
  • docs/reference/query-languages/esql/kibana/definition/functions/url_encode_component.json (1 hunks)
  • docs/reference/query-languages/esql/kibana/docs/functions/url_decode.md (1 hunks)
  • docs/reference/query-languages/esql/kibana/docs/functions/url_encode.md (1 hunks)
  • docs/reference/query-languages/esql/kibana/docs/functions/url_encode_component.md (1 hunks)
  • x-pack/plugin/esql/qa/testFixtures/src/main/resources/string.csv-spec (5 hunks)
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java (1 hunks)
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/ExpressionWritables.java (2 hunks)
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/EsqlFunctionRegistry.java (2 hunks)
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlDecode.java (2 hunks)
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncode.java (4 hunks)
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponent.java (1 hunks)
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/util/UrlCodecUtils.java (1 hunks)
  • x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/AbstractUrlEncodeDecodeTestCase.java (1 hunks)
  • x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlDecodeTests.java (1 hunks)
  • x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponentErrorTests.java (1 hunks)
  • x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponentSerializationTests.java (1 hunks)
  • x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponentTests.java (1 hunks)
  • x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeTests.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (9)
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/EsqlFunctionRegistry.java (1)
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponent.java (1)
  • UrlEncodeComponent (35-99)
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponentErrorTests.java (2)
x-pack/plugin/esql-core/src/main/java/org/elasticsearch/xpack/esql/core/expression/Expression.java (1)
  • Expression (30-214)
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponentTests.java (1)
  • UrlEncodeComponentTests (21-37)
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/util/UrlCodecUtils.java (1)
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/operator/BreakingBytesRefBuilder.java (1)
  • BreakingBytesRefBuilder (21-209)
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/ExpressionWritables.java (1)
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponent.java (1)
  • UrlEncodeComponent (35-99)
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponentTests.java (1)
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/AbstractUrlEncodeDecodeTestCase.java (1)
  • AbstractUrlEncodeDecodeTestCase (30-269)
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncode.java (2)
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/operator/BreakingBytesRefBuilder.java (1)
  • BreakingBytesRefBuilder (21-209)
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/util/UrlCodecUtils.java (1)
  • UrlCodecUtils (13-116)
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponentSerializationTests.java (1)
x-pack/plugin/esql-core/src/main/java/org/elasticsearch/xpack/esql/core/expression/Expression.java (1)
  • Expression (30-214)
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/AbstractUrlEncodeDecodeTestCase.java (1)
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/util/UrlCodecUtils.java (1)
  • UrlCodecUtils (13-116)
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponent.java (3)
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/operator/BreakingBytesRefBuilder.java (1)
  • BreakingBytesRefBuilder (21-209)
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/operator/EvalOperator.java (1)
  • EvalOperator (23-270)
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/util/UrlCodecUtils.java (1)
  • UrlCodecUtils (13-116)
🪛 markdownlint-cli2 (0.18.1)
docs/reference/query-languages/esql/_snippets/functions/types/url_encode_component.md

1-1: First line in a file should be a top-level heading

(MD041, first-line-heading, first-line-h1)


3-3: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

docs/reference/query-languages/esql/_snippets/functions/parameters/url_encode_component.md

1-1: First line in a file should be a top-level heading

(MD041, first-line-heading, first-line-h1)


3-3: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

docs/reference/query-languages/esql/_snippets/functions/layout/url_encode_component.md

1-1: First line in a file should be a top-level heading

(MD041, first-line-heading, first-line-h1)


3-3: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


4-4: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


9-9: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

docs/reference/query-languages/esql/_snippets/functions/description/url_encode_component.md

1-1: First line in a file should be a top-level heading

(MD041, first-line-heading, first-line-h1)


3-3: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

docs/reference/query-languages/esql/_snippets/functions/examples/url_encode_component.md

1-1: First line in a file should be a top-level heading

(MD041, first-line-heading, first-line-h1)


3-3: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

🔇 Additional comments (28)
docs/reference/query-languages/esql/_snippets/functions/parameters/url_decode.md (1)

6-6: LGTM!

The parameter description update improves consistency across the documentation by adding the article "The" and properly hyphenating "URL-encoded".

docs/reference/query-languages/esql/_snippets/functions/examples/url_encode_component.md (1)

1-13: LGTM!

The example correctly demonstrates URL_ENCODE_COMPONENT behavior, where all characters are percent-encoded except alphanumerics, ., -, _, and ~. The output correctly shows even protocol separators (://) and slashes are encoded, which is appropriate for component-level encoding.

Note: The static analysis hints about markdown formatting are false positives for auto-generated documentation files.

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/ExpressionWritables.java (2)

44-44: LGTM!

The import is correctly added for the new UrlEncodeComponent class.


235-235: LGTM!

The UrlEncodeComponent.ENTRY registration follows the correct pattern and is appropriately positioned alongside the related UrlEncode entry.

docs/reference/query-languages/esql/_snippets/functions/parameters/url_encode.md (1)

6-6: LGTM!

The parameter description update improves consistency by adding the article "The", matching the style used in related documentation.

docs/reference/query-languages/esql/_snippets/functions/types/url_encode_component.md (1)

1-9: LGTM!

The supported types table correctly shows that both keyword and text inputs return keyword output, which is consistent with other string conversion functions.

Note: The static analysis hints about markdown formatting are false positives for auto-generated documentation files.

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/EsqlFunctionRegistry.java (2)

99-99: LGTM!

The import is correctly added for the new UrlEncodeComponent class.


545-545: LGTM!

The UrlEncodeComponent registration in snapshot functions follows the correct pattern and is appropriately positioned alongside the related url_encode and url_decode functions.

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeTests.java (1)

29-29: LGTM!

The refactoring from a boolean parameter to PercentCodecTestType.ENCODE improves code clarity and type safety. Using an enum makes the test parameterization more maintainable and extensible for different encoding/decoding types.

docs/reference/query-languages/esql/_snippets/functions/parameters/url_encode_component.md (1)

1-7: LGTM!

The parameter documentation correctly describes the string parameter for URL_ENCODE_COMPONENT and is consistent with the documentation style used for related functions.

Note: The static analysis hints about markdown formatting are false positives for auto-generated documentation files.

docs/reference/query-languages/esql/_snippets/functions/description/url_encode.md (1)

5-5: LGTM! Documentation clarifies encoding behavior.

The updated description explicitly specifies which characters are not percent-encoded and clarifies that spaces are encoded as +. This aligns with application/x-www-form-urlencoded encoding standards.

docs/reference/query-languages/esql/kibana/docs/functions/url_encode_component.md (1)

1-9: LGTM! Clear documentation for RFC3986 component encoding.

The documentation correctly distinguishes URL_ENCODE_COMPONENT from URL_ENCODE by using %20 for spaces instead of +. This follows RFC3986 percent-encoding for URI components, which is appropriate for encoding individual query parameters or path segments.

docs/reference/query-languages/esql/_snippets/functions/layout/url_encode_component.md (1)

1-27: LGTM! Standard documentation layout for new function.

This is auto-generated documentation scaffolding for the URL_ENCODE_COMPONENT function. The structure follows the established pattern for ESQL function documentation.

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlDecodeTests.java (1)

29-29: LGTM! Improved test parameterization.

The change from boolean to enum-based parameterization (PercentCodecTestType.DECODE) improves code clarity and makes the test infrastructure more extensible for different codec types.

docs/reference/query-languages/esql/kibana/docs/functions/url_decode.md (1)

4-8: LGTM! Documentation accurately reflects error handling.

The updated description correctly documents that URL_DECODE returns null and adds a warning header when decoding fails. This aligns with the implementation change in UrlDecode.java that adds warnExceptions = { IllegalArgumentException.class } to handle invalid input gracefully.

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlDecode.java (3)

47-47: LGTM! Description updated to match documented behavior.

The function description now accurately reflects that URL_DECODE returns null and adds a warning header when decoding fails, which aligns with the warnExceptions annotation added below.


53-53: LGTM! Refined parameter description.

The parameter description is now more concise and grammatically correct: "The URL-encoded string to decode."


86-86: LGTM! Correct error handling implementation.

The warnExceptions = { IllegalArgumentException.class } annotation correctly implements the documented behavior. URLDecoder.decode() throws IllegalArgumentException for invalid escape sequences (e.g., %ZZ, incomplete sequences), and this annotation will convert those exceptions into a null return value with a warning header in the response.

docs/reference/query-languages/esql/kibana/docs/functions/url_encode.md (1)

4-7: LGTM! Consistent documentation updates.

The description clarifies the URL_ENCODE encoding rules, explicitly stating which characters are not percent-encoded and that spaces are encoded as +. The updated example is clear and demonstrates the function usage effectively.

docs/reference/query-languages/esql/_snippets/functions/description/url_decode.md (1)

5-5: LGTM! Documentation consistently updated.

The description update matches the changes in other documentation files and accurately describes the error handling behavior (null return with warning header on decode failure).

docs/reference/query-languages/esql/_snippets/functions/description/url_encode_component.md (1)

1-6: LGTM!

The description accurately documents URL_ENCODE_COMPONENT behavior. The markdown linting issues flagged by static analysis are expected for auto-generated documentation files and can be safely ignored.

docs/reference/query-languages/esql/_snippets/functions/examples/url_decode.md (1)

6-12: LGTM!

The simplified example improves clarity while maintaining accurate demonstration of URL_DECODE functionality.

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponentErrorTests.java (1)

22-37: LGTM!

The test class follows the established pattern for error testing in ESQL functions, properly integrating with the test infrastructure and correctly handling type error messages.

docs/reference/query-languages/esql/_snippets/functions/examples/url_encode.md (1)

6-11: LGTM!

The updated example effectively demonstrates URL_ENCODE behavior, showing both space encoding (as +) and percent-encoding of special characters.

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java (1)

1463-1466: LGTM!

The capability addition follows the established pattern and is properly snapshot-gated, consistent with the existing URL_ENCODE capability.

docs/reference/query-languages/esql/kibana/definition/functions/url_encode_component.json (1)

1-37: LGTM!

The function definition is well-structured with appropriate signatures, examples, and metadata. The encoding behavior is correctly documented.

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponentSerializationTests.java (1)

14-20: LGTM!

The serialization test follows the standard pattern for ESQL scalar function serialization testing, properly extending the base class and implementing the required factory method.

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/UrlEncodeComponentTests.java (1)

21-37: LGTM!

The test class properly integrates with the URL encoding/decoding test infrastructure, using the ENCODE_COMPONENT test type to ensure component-specific encoding rules are tested correctly.

"name" : "string",
"type" : "keyword",
"optional" : false,
"description" : "The URL to encode."
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider clarifying the parameter description.

The parameter description says "The URL to encode" but should be "The URL component to encode" or "The string to encode" to better reflect that this function encodes URL components (parts of a URL), not complete URLs. This distinction matters because URL_ENCODE_COMPONENT uses different encoding rules than URL_ENCODE (e.g., spaces become %20 instead of +).

Apply this diff to improve clarity:

-          "description" : "The URL to encode."
+          "description" : "The URL component to encode."

Note: The same change should be applied to line 25.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"description" : "The URL to encode."
"description" : "The URL component to encode."
🤖 Prompt for AI Agents
In
docs/reference/query-languages/esql/kibana/definition/functions/url_encode_component.json
around lines 13 and 25, the parameter description "The URL to encode." is
misleading; change it to "The URL component to encode." or "The string to
encode." to reflect that this function encodes URL components (not full URLs)
and uses component-specific encoding rules; update both occurrences to the
clearer wording.

Comment on lines 2639 to 2642
| EVAL a = TRIM(URL_DECODE(first_name))
| EVAL b = URL_DECODE(TO_LOWER(first_name))
| EVAL c = URL_DECODE(TO_LOWER(first_name))
| KEEP a,b;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Drop unused duplicate URL_DECODE evaluation.

The new EVAL c = URL_DECODE(TO_LOWER(first_name)) duplicates the previous line and c is never kept. Please drop it to avoid needless work and future confusion.

-| EVAL c = URL_DECODE(TO_LOWER(first_name))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| EVAL a = TRIM(URL_DECODE(first_name))
| EVAL b = URL_DECODE(TO_LOWER(first_name))
| EVAL c = URL_DECODE(TO_LOWER(first_name))
| KEEP a,b;
| EVAL a = TRIM(URL_DECODE(first_name))
| EVAL b = URL_DECODE(TO_LOWER(first_name))
| KEEP a,b;
🤖 Prompt for AI Agents
In x-pack/plugin/esql/qa/testFixtures/src/main/resources/string.csv-spec around
lines 2639 to 2642, there is a duplicate unused evaluation: remove the line
"EVAL c = URL_DECODE(TO_LOWER(first_name))" so you only perform the needed
evaluations for a and b and avoid computing an unused variable; keep the
remaining lines as-is (EVAL a = TRIM(URL_DECODE(first_name)), EVAL b =
URL_DECODE(TO_LOWER(first_name)), KEEP a,b;).

Comment on lines +27 to +55
scratch.grow(size);
scratch.clear();

int lo = val.offset;
int hi = val.offset + val.length;

for (int i = lo; i < hi; ++i) {
byte b = val.bytes[i];
char c = (char) (b & 0xFF);

if (plusForSpace && c == ' ') {
scratch.append((byte) '+');
continue;
}

if (isRfc3986Safe(c)) {
scratch.append(b);
continue;
}

// every encoded byte is represented by 3 chars: %XY
scratch.append((byte) '%');

// the X in %XY is the hex value for the high nibble
scratch.append((byte) HEX_DIGITS[(c >> 4) & 0x0F]);

// the Y in %XY is the hex value for the low nibble
scratch.append((byte) HEX_DIGITS[c & 0x0F]);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix aliasing bug when input already uses the scratch buffer.

If a caller feeds the result of a previous urlEncode call back into this method with the same scratch (e.g. re‑encoding a component), scratch.clear() zeros the shared BytesRef’s length before we capture the bounds. The loop then sees hi == lo and returns an empty string. Repro:

BreakingBytesRefBuilder scratch = new BreakingBytesRefBuilder(breaker, "test");
BytesRef once = UrlCodecUtils.urlEncode(new BytesRef("foo bar"), scratch, true);
BytesRef twice = UrlCodecUtils.urlEncode(once, scratch, true); // returns length 0

Capture lo/hi before clearing (or clone the input) so aliasing stays safe.

-        scratch.grow(size);
-        scratch.clear();
-
-        int lo = val.offset;
-        int hi = val.offset + val.length;
+        scratch.grow(size);
+
+        int lo = val.offset;
+        int hi = val.offset + val.length;
+
+        scratch.clear();
🤖 Prompt for AI Agents
In
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/util/UrlCodecUtils.java
around lines 27 to 55, the method clears the shared scratch buffer before
capturing the input bounds which causes an aliasing bug when the caller passes
the same BytesRef instance that backs scratch; capture the input bounds and/or
clone the input bytes before calling scratch.clear() so the loop iterates over
the original data (e.g., compute lo and hi or copy val =
BytesRef.deepCopyOf(val) prior to scratch.clear()), then proceed with encoding
into scratch.

@phananh1010 phananh1010 closed this Oct 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant