[BUG] Nested alias fields fail when referencing fields outside parent context #4559

@alexey-temnikov

Description

Query Information

PPL Command/Query:

describe test_alias_nested

or

source=test_alias_nested | fields aws.cloudtrail.event_name

Expected Result:
The query should successfully parse the index mapping and return the schema information or query results, properly resolving the alias field aws.cloudtrail.event_name to its target path api.operation.

Actual Result:

{
  "error": {
    "reason": "There was internal problem at backend",
    "details": "java.sql.SQLException: exception while executing query: Failed to read mapping for index pattern [[Ljava.lang.String;@70a2450b]",
    "type": "RuntimeException"
  },
  "status": 500
}

Full stack trace:

Caused by: java.lang.IllegalStateException: Cannot find the path [api.operation] for alias type field [event_name]
	at org.opensearch.sql.opensearch.data.type.OpenSearchDataType.lambda$parseMapping$4(OpenSearchDataType.java:146)
	at java.base/java.util.LinkedHashMap.forEach(LinkedHashMap.java:986)
	at org.opensearch.sql.opensearch.data.type.OpenSearchDataType.parseMapping(OpenSearchDataType.java:140)
	at org.opensearch.sql.opensearch.data.type.OpenSearchDataType.of(OpenSearchDataType.java:172)
	at org.opensearch.sql.opensearch.data.type.OpenSearchDataType.lambda$parseMapping$3(OpenSearchDataType.java:131)

Dataset Information

Dataset/Schema Type

  • Custom (details below)

Index Mapping

{
  "mappings": {
    "properties": {
      "aws": {
        "properties": {
          "cloudtrail": {
            "properties": {
              "event_name": {
                "type": "alias",
                "path": "api.operation"
              },
              "user_identity": {
                "type": "text"
              }
            }
          }
        }
      },
      "api": {
        "properties": {
          "operation": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

Sample Data

{
  "aws": {
    "cloudtrail": {
      "user_identity": "test-user"
    }
  },
  "api": {
    "operation": "CreateBucket"
  }
}

Bug Description

Issue Summary:
The SQL/PPL plugin fails to parse index mappings when an alias field is nested within an object and points to a field outside its parent context. This differs from the previously fixed issue #3646, which only covered aliases at the root level pointing to nested fields.

Steps to Reproduce:

  1. Create an index with a nested alias field that points to a field in a different branch of the mapping tree:
PUT /test_alias_nested
{
  "mappings": {
    "properties": {
      "aws": {
        "properties": {
          "cloudtrail": {
            "properties": {
              "event_name": {
                "type": "alias",
                "path": "api.operation"
              }
            }
          }
        }
      },
      "api": {
        "properties": {
          "operation": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
  2. Attempt to query the index using PPL:
POST /_plugins/_ppl
{
  "query": "describe test_alias_nested"
}
  3. Observe the error: Cannot find the path [api.operation] for alias type field [event_name]

Comparison with Working Scenario:
The fix in PR #3674 (issue #3646) works for aliases at the ROOT level pointing to nested fields:

{
  "mappings": {
    "properties": {
      "log": {
        "properties": {
          "url": {
            "properties": {
              "message": { "type": "text" }
            }
          }
        }
      },
      "message_alias": {
        "type": "alias",
        "path": "log.url.message"
      }
    }
  }
}

✅ This works because message_alias is at the root level.

However, it does NOT work when the alias is nested:

{
  "mappings": {
    "properties": {
      "aws": {
        "properties": {
          "cloudtrail": {
            "properties": {
              "event_name": {
                "type": "alias",
                "path": "api.operation"
              }
            }
          }
        }
      },
      "api": {
        "properties": {
          "operation": { "type": "keyword" }
        }
      }
    }
  }
}

❌ This fails because event_name is nested inside aws.cloudtrail and points to api.operation which is outside its parent context.

Impact:
This bug prevents users from querying indices with nested alias fields that reference fields outside their parent object. This is a common pattern in observability schemas like OpenTelemetry, Simple Schema for Observability (SS4O), and custom schemas, where aliases are used to provide backward compatibility or alternative field names within nested structures.

Environment Information

OpenSearch Version:
OpenSearch 3.4.0-SNAPSHOT (build date: 2025-10-14)

Additional Details:

  • The issue occurs with both the PPL describe command and regular queries
  • The bug is in the OpenSearchDataType.parseMapping() method, which is called recursively for nested objects
  • When processing nested objects, the method only has access to fields within that nested context, not the entire index mapping

Root Cause Analysis

This is a preliminary analysis and requires further investigation.

The issue is in the OpenSearchDataType.parseMapping() method at lines 137-147 in opensearch/src/main/java/org/opensearch/sql/opensearch/data/type/OpenSearchDataType.java.

When the method encounters an Object or Nested type (lines 171-172), it recursively calls parseMapping() on the nested properties:

Map<String, OpenSearchDataType> properties =
    parseMapping((Map<String, Object>) innerMap.getOrDefault("properties", Map.of()));

This recursive call processes only the properties within that nested context. When an alias field is encountered within the nested context (e.g., event_name inside aws.cloudtrail), the code attempts to resolve its path (e.g., api.operation) by flattening the result:

Map<String, OpenSearchDataType> flattenResult = traverseAndFlatten(result);

However, result only contains fields from the current parsing context (fields within aws.cloudtrail), not the entire index mapping. Therefore, when the alias points to a field outside its parent context (like api.operation at the root level), the resolution fails.
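The failure mode described above can be reproduced with a simplified model. The sketch below is not the plugin's code: the class name LocalAliasResolutionDemo and its string-based type map are assumptions made for illustration, and only the error message text is taken from the stack trace in this issue. It resolves each alias against the fields collected in the current recursive call, which is the behaviour this analysis attributes to parseMapping().

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified model of context-local alias resolution; hypothetical, not the plugin's code. */
class LocalAliasResolutionDemo {

  /** Flattens "properties" into dotted path -> type, resolving aliases per recursion level. */
  @SuppressWarnings("unchecked")
  static Map<String, String> parseMapping(Map<String, Object> properties) {
    Map<String, String> result = new LinkedHashMap<>();
    Map<String, String> aliases = new LinkedHashMap<>();
    for (Map.Entry<String, Object> entry : properties.entrySet()) {
      Map<String, Object> field = (Map<String, Object>) entry.getValue();
      String type = (String) field.getOrDefault("type", "object");
      if ("alias".equals(type)) {
        aliases.put(entry.getKey(), (String) field.get("path"));
      } else if (field.containsKey("properties")) {
        // Recursive call: the callee sees only this subtree, never its siblings or the root.
        Map<String, String> inner = parseMapping((Map<String, Object>) field.get("properties"));
        inner.forEach((name, innerType) -> result.put(entry.getKey() + "." + name, innerType));
      } else {
        result.put(entry.getKey(), type);
      }
    }
    // Aliases are looked up in the current context only, so a target in another branch is missed.
    aliases.forEach((name, path) -> {
      String target = result.get(path);
      if (target == null) {
        throw new IllegalStateException(
            String.format("Cannot find the path [%s] for alias type field [%s]", path, name));
      }
      result.put(name, target);
    });
    return result;
  }

  public static void main(String[] args) {
    // Mapping from this issue: the alias lives under aws.cloudtrail, its target under api.
    Map<String, Object> properties = Map.of(
        "aws", Map.of("properties", Map.of(
            "cloudtrail", Map.of("properties", Map.of(
                "event_name", Map.of("type", "alias", "path", "api.operation"))))),
        "api", Map.of("properties", Map.of(
            "operation", Map.of("type", "keyword"))));
    try {
      parseMapping(properties);
    } catch (IllegalStateException e) {
      // Prints: Cannot find the path [api.operation] for alias type field [event_name]
      System.out.println(e.getMessage());
    }
  }
}
```

In this model a root-level alias such as message_alias resolves, because by the time the root call looks it up, the nested fields have already been flattened into the same result map. The nested alias fails because its lookup happens inside the call for aws.cloudtrail, where api.operation has never been added.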

The fix in PR #3674 addressed the case where aliases at the root level point to nested fields, but it didn't address the case where nested aliases point to fields outside their parent context.

Tentative Proposed Fix

This is a preliminary analysis and requires further investigation.

To fix this issue, the alias resolution logic needs access to the complete index mapping, not just the current parsing context. Possible approaches:

  1. Defer alias resolution: Instead of resolving aliases during recursive parsing, collect all alias mappings and resolve them after the entire mapping tree is built. This would require passing the complete flattened mapping to the alias resolution logic.

  2. Pass parent context: Modify the parseMapping() method to accept an additional parameter containing the complete mapping tree or a reference to the root context, allowing nested alias resolution to access fields outside the current context.

  3. Two-pass parsing: First pass builds the complete mapping tree without resolving aliases, second pass resolves all aliases with access to the complete tree.
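As a rough illustration of approaches 1 and 3, the sketch below defers alias resolution until the whole tree has been walked. It is a standalone model under stated assumptions, not a patch for OpenSearchDataType: the class DeferredAliasResolutionSketch, the collect helper, and the flat path-to-type string map are all hypothetical, and the sketch only handles aliases that target leaf fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical two-pass resolver over a raw mapping; a sketch, not OpenSearchDataType. */
class DeferredAliasResolutionSketch {

  /** Pass 1: record concrete fields and pending aliases under their full dotted paths. */
  @SuppressWarnings("unchecked")
  static void collect(String prefix, Map<String, Object> properties,
      Map<String, String> fields, Map<String, String> aliases) {
    for (Map.Entry<String, Object> entry : properties.entrySet()) {
      String fullName = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
      Map<String, Object> field = (Map<String, Object>) entry.getValue();
      String type = (String) field.getOrDefault("type", "object");
      if ("alias".equals(type)) {
        // Do not resolve yet; remember the alias and its target path.
        aliases.put(fullName, (String) field.get("path"));
      } else if (field.containsKey("properties")) {
        collect(fullName, (Map<String, Object>) field.get("properties"), fields, aliases);
      } else {
        fields.put(fullName, type);
      }
    }
  }

  /** Pass 2: resolve every alias against the complete flattened mapping. */
  static Map<String, String> parseMapping(Map<String, Object> rootProperties) {
    Map<String, String> fields = new LinkedHashMap<>();
    Map<String, String> aliases = new LinkedHashMap<>();
    collect("", rootProperties, fields, aliases);
    Map<String, String> result = new LinkedHashMap<>(fields);
    aliases.forEach((name, path) -> {
      String target = fields.get(path);
      if (target == null) {
        throw new IllegalStateException(
            String.format("Cannot find the path [%s] for alias type field [%s]", path, name));
      }
      result.put(name, target);
    });
    return result;
  }

  public static void main(String[] args) {
    // Mapping from this issue.
    Map<String, Object> properties = Map.of(
        "aws", Map.of("properties", Map.of(
            "cloudtrail", Map.of("properties", Map.of(
                "event_name", Map.of("type", "alias", "path", "api.operation"))))),
        "api", Map.of("properties", Map.of(
            "operation", Map.of("type", "keyword"))));
    // Prints: keyword
    System.out.println(parseMapping(properties).get("aws.cloudtrail.event_name"));
  }
}
```

Because the lookup in the second pass runs against every field in the index, it no longer matters which branch of the tree holds the alias or its target. Approach 2 would presumably reach the same result by threading a reference to the root context through the recursive calls instead of separating the passes.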

Related Issues

Workaround

No easy workaround is available. The only option is to avoid nested alias fields that point to fields outside their parent object. Users would need to restructure their mappings to place aliases at the root level or ensure that alias targets are within the same parent object.

Metadata

Assignees

Labels

PPL (Piped processing language), bug (Something isn't working)

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
