ESQL: Move serialization for EsField #109222

Merged
merged 7 commits into from
May 31, 2024

Conversation

Member

@nik9000 nik9000 commented May 30, 2024

This moves the serialization logic for EsField into the EsField subclasses to better align with the way the rest of Elasticsearch works. It also switches them from ESQL's home-grown writeNamed thing to NamedWriteable. These are wire compatible with one another.

nik9000 added 2 commits May 30, 2024 08:31
This moves the serialization of `DataType` from `PlanNamedTypes` and
into `DataType` itself to better align with the rest of Elasticsearch.
This moves the serialization for `EsField`
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label May 30, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Contributor

@alex-spies alex-spies left a comment

Thanks Nik, I like this!

I have one minor, opinionated reservation about the organization of the named registry entries; happy to discuss, but this shouldn't be addressed in this PR, anyway.

import java.util.Map;

/**
* SQL-related information about an index field with date type
*/
public class DateEsField extends EsField {
static final NamedWriteableRegistry.Entry ENTRY = new NamedWriteableRegistry.Entry(EsField.class, "DateEsField", DateEsField::new);
Contributor

bikeshed: I know this is consistent with how the de-/serialization of the compute engine classes is organized, but this feels like it belongs directly in EsqlPlugin, or more precisely in EsqlPlugin.getNamedWriteables, not here. The same applies to EsField.getNamedWriteables.

Reason:

  • This isn't information that is fully local to the DateEsField. This already assumes knowledge of the fact that whenever we deserialize a DateEsField, we do so in a place that expects a more general EsField.
  • This applies similarly to EsField.getNamedWriteables, which leaks info about subclasses into the superclass, which actually could also just sit in the place responsible for registering the named writeables.
  • Consequently, information about how to register these named writeables is in 3 classes, while it could be in just one (EsqlPlugin).

Since we're aligning the ser/de approach of the compute engine/rest of ES with how QL used to do it, maybe it's worth discussing this before we fully commit to this approach.

Member Author

Yeah, fair. There isn't a big standard in the ES code base on how to organize this stuff, but we do tend to do things kind of like I did. But it varies.

As to your first point, it's not entirely true. If you want to deserialize a DateEsField directly you can do that - just call its deserialization ctor. That's private now because no one wants to, but nothing stops us from making it public if you have a need. Some other bits of ESQL will work like this, but I don't believe any EsField subclasses will.

The reason for putting the ENTRY in the class itself is:

  1. It allows you to make the deserialization ctor private if no one else calls it. That's a nice signal.
  2. We'll end up with dozens of these entries and it feels good to group them.
  3. It's a useful signal to say "this is the category to which this NamedWriteable belongs".

It's also important that the ENTRY ends up near-ish to the class itself so we can test deserialization using it. In this one we register only the NamedWriteables for subclasses of EsField in the test. That's generally a good thing.

Generally the rest of ES seems split. Lots of things do this, like Block and SortValue. Other things push the entry building up, like to the equivalent of EsqlPlugin. Aggs do that and it's fine. It makes testing serialization a little more complex.

It's a choice and for things shaped like EsField I come down on the "put the ENTRY in the class and make a list somewhere convenient" side.

As to leaking the subclasses, I'm not a fan of it, but I also don't hate it too much. Not badly enough to make a new class just to build the entries. The real funny thing about this hierarchy is that the "top" class isn't abstract or an interface or something. Like Block is. That bothers me more than the leaking.
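
For readers skimming the thread, the layout argued for in this reply (ENTRY next to a private deserialization ctor, plus a small list that tests can use as their whole registry) might look roughly like the sketch below. It is self-contained and uses plain `java.io` streams; `Entry`, `Reader`, `Field`, `DateField`, `writeNamed`, `readNamed`, and `roundTrip` are hypothetical stand-ins invented for illustration, not the real `NamedWriteableRegistry`, `StreamInput`, or `EsField` code from this PR.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;

final class EntryInClassSketch {
    /** Hypothetical stand-in for a reference to a stream-reading constructor. */
    interface Reader<T> {
        T read(DataInputStream in) throws IOException;
    }

    /** Hypothetical stand-in for a registry entry: category class, wire name, reader. */
    static final class Entry {
        final Class<?> category;
        final String name;
        final Reader<?> reader;

        Entry(Class<?> category, String name, Reader<?> reader) {
            this.category = category;
            this.name = name;
            this.reader = reader;
        }
    }

    /** Concrete (not abstract) top of the hierarchy, as noted in the thread. */
    static class Field {
        static final Entry ENTRY = new Entry(Field.class, "Field", Field::new);
        final String name;

        Field(String name) {
            this.name = name;
        }

        // Private: the only way in from a stream is through ENTRY.
        private Field(DataInputStream in) throws IOException {
            this(in.readUTF());
        }

        String writeableName() {
            return ENTRY.name;
        }

        void writeTo(DataOutputStream out) throws IOException {
            out.writeUTF(name);
        }

        /** The "list somewhere convenient"; this is where subclasses leak into the superclass. */
        static List<Entry> entries() {
            return List.of(Field.ENTRY, DateField.ENTRY);
        }
    }

    static class DateField extends Field {
        // Registered under the category Field.class, not DateField.class.
        static final Entry ENTRY = new Entry(Field.class, "DateField", DateField::new);

        DateField(String name) {
            super(name);
        }

        private DateField(DataInputStream in) throws IOException {
            super(in);
        }

        @Override
        String writeableName() {
            return ENTRY.name;
        }
    }

    /** Writes the wire name first so the reader can pick the right entry. */
    static void writeNamed(DataOutputStream out, Field field) throws IOException {
        out.writeUTF(field.writeableName());
        field.writeTo(out);
    }

    /** Looks the name up in the given entries and delegates to that entry's reader. */
    static Field readNamed(DataInputStream in, List<Entry> registry) throws IOException {
        String name = in.readUTF();
        for (Entry e : registry) {
            if (e.category == Field.class && e.name.equals(name)) {
                return (Field) e.reader.read(in);
            }
        }
        throw new IOException("unknown writeable [" + name + "]");
    }

    /** Serializes and deserializes using only the entries for this hierarchy. */
    static Field roundTrip(Field field) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            writeNamed(out, field);
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return readNamed(in, Field.entries());
        }
    }

    public static void main(String[] args) throws IOException {
        Field copy = roundTrip(new DateField("@timestamp"));
        System.out.println(copy.getClass().getSimpleName() + " " + copy.name); // DateField @timestamp
    }
}
```

The point of the sketch is only the shape: the round trip needs nothing beyond `Field.entries()`, which is the testing convenience mentioned above. Moving the entries into a plugin-level list would work too, but the readers would then have to be visible from there.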

Contributor

Fair enough, making serialization tests simple and not involve building a huge registry is nice, indeed.

@@ -20,4 +26,20 @@ public static DateEsField dateEsField(String name, Map<String, EsField> properti
private DateEsField(String name, DataType dataType, Map<String, EsField> properties, boolean hasDocValues) {
super(name, dataType, properties, hasDocValues);
}

private DateEsField(StreamInput in) throws IOException {
Contributor

big ++ to placing the information on how to construct a DateEsField read from an input stream here; this shouldn't be leaked outside this class.

import java.util.Map;

/**
* SQL-related information about an index field with date type
*/
public class DateEsField extends EsField {
static final NamedWriteableRegistry.Entry ENTRY = new NamedWriteableRegistry.Entry(EsField.class, "DateEsField", DateEsField::new);
Contributor

Nit: Since the string DateEsField is used both here and in getWriteableName, maybe we could use a String constant?

Applies similarly to the other EsField subclasses.
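
A minimal sketch of what the nit is asking for, assuming a hypothetical `Entry` type that only carries a name (the real entry type here is `NamedWriteableRegistry.Entry`, which is not reproduced). The `NAME` constant is an illustrative name, not something from this PR.

```java
final class WriteableNameSketch {
    /** Hypothetical, simplified entry: only the wire name matters for this sketch. */
    static final class Entry {
        final String name;

        Entry(String name) {
            this.name = name;
        }
    }

    static final class DateField {
        // Single source of truth for the wire name (hypothetical constant).
        static final String NAME = "DateEsField";
        static final Entry ENTRY = new Entry(NAME);

        String getWriteableName() {
            return NAME;
        }
    }

    public static void main(String[] args) {
        DateField field = new DateField();
        // Both sides read the same constant, so they cannot drift apart.
        System.out.println(field.getWriteableName().equals(DateField.ENTRY.name)); // true
    }
}
```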

Member Author

Ah, you know what, there's a better way actually. I'll push something.

import java.util.Map;
import java.util.TreeMap;

public abstract class AbstractEsFieldTypeTests<T extends EsField> extends AbstractNamedWriteableTestCase<EsField> {
Contributor

Big ++ to the added tests; thanks @nik9000 !

import java.util.Map;

public class DateEsFieldTests extends AbstractEsFieldTypeTests<DateEsField> {
static DateEsField randomDateEsField(int depth) {
Contributor

nit: depth suggests that's the number of levels we're gonna get. Actually, the number of levels we're gonna get is 5-depth. Maybe we should flip that around; at least, it got me confused, so it may confuse others in the future.

Member Author

Sure
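
One way to read the "flip that around" suggestion is to pass the number of levels still to be generated and count down to zero, so the argument matches what you get. The sketch below illustrates that idea only; `Field`, `randomField`, and `height` are hypothetical names, and this is not the test helper from the PR.

```java
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

final class NestedFieldSketch {
    /** Hypothetical stand-in for a field with nested properties. */
    static final class Field {
        final String name;
        final Map<String, Field> properties;

        Field(String name, Map<String, Field> properties) {
            this.name = name;
            this.properties = properties;
        }
    }

    /** Builds a field with exactly {@code levels} levels of nested properties below it. */
    static Field randomField(Random random, String name, int levels) {
        Map<String, Field> properties = new TreeMap<>();
        if (levels > 0) {
            int children = 1 + random.nextInt(3); // between 1 and 3 children
            for (int i = 0; i < children; i++) {
                String childName = name + "." + i;
                properties.put(childName, randomField(random, childName, levels - 1));
            }
        }
        return new Field(name, properties);
    }

    /** Number of nesting levels below the given field. */
    static int height(Field field) {
        int max = 0;
        for (Field child : field.properties.values()) {
            max = Math.max(max, 1 + height(child));
        }
        return max;
    }

    public static void main(String[] args) {
        Field root = randomField(new Random(42), "root", 3);
        System.out.println(height(root)); // 3, whatever the seed
    }
}
```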

Contributor

@bpintea bpintea left a comment

Lgtm

import org.elasticsearch.core.Nullable;

import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/**
* SQL-related information about an index field
Contributor

SQL-related can be dropped here (and in most other classes in this package).

@@ -53,6 +63,33 @@ protected KeywordEsField(
this.normalized = normalized;
}

private KeywordEsField(StreamInput in) throws IOException {
this(
Contributor

Not necessarily belonging to this PR, but the c'tor used here can be dropped and the one that doesn't require a DataType parameter used instead (I was wondering why we need to provide a type here).

Member Author

👍 I'll save that one I think.

@nik9000 nik9000 added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label May 31, 2024
@elasticsearchmachine elasticsearchmachine merged commit 5bd1bfc into elastic:main May 31, 2024
15 checks passed
@nik9000 nik9000 deleted the esql_write_invalid branch May 31, 2024 13:50
craigtaverner pushed a commit to craigtaverner/elasticsearch that referenced this pull request Jun 11, 2024
This moves the serialization logic for `EsField` into the `EsField`
subclasses to better align with the way rest of Elasticsearch works. It
also switches them from ESQL's home grown `writeNamed` thing to
`NamedWriteable`. These are wire compatible with one another.
craigtaverner added a commit to craigtaverner/elasticsearch that referenced this pull request Jun 11, 2024
The second prototype replaced MultiTypeField.Unresolved with MultiTypeField, but this clashed with existing behaviour around mapping unused MultiTypeFields to `unsupported` and `null`, so this new attempt simply adds new fields, resulting in more than one field with the same name.
We still need to store this new field in EsRelation, so that physical planner can insert it into FieldExtractExec, so this is quite similar to the second prototype.

The following query works in this third prototype:

```
multiIndexIpString
FROM sample_data* METADATA _index
| EVAL client_ip = TO_IP(client_ip)
| KEEP _index, @timestamp, client_ip, event_duration, message
| SORT _index ASC, @timestamp DESC
```

As with the previous prototype, we no longer need an aggregation to force the conversion function onto the data node, as the 'real' conversion is now done at field extraction time using the converter function previously saved in the EsRelation and replanned into the EsQueryExec.

Support row-stride-reader for LoadFromMany

Add missing ESQL version after rebase on main

Fixed missing block release

Simplify UnresolvedUnionTypes

Support other commands, notably WHERE

Update docs/changelog/107545.yaml

Fix changelog

Removed unused code

Slight code reduction in analyser of union types

Removed unused interface method

Fix bug in copying blocks (array overrun)

Convert MultiTypeEsField.UnresolvedField back to InvalidMappedField

This is to ensure older behaviour still works.

Simplify InvalidMappedField support

Rather than complex code to recreate InvalidMappedField from MultiTypeEsField.UnresolvedField, we rely on the fact that this is the parent class anyway, so we can resolve this during plan serialization/deserialization anyway. Much simpler

Simplify InvalidMappedField support further

Combining InvalidMappedField and MultiTypeEsField.UnresolvedField into one class simplifies plan serialization even further.

InvalidMappedField is used slightly differently in QL

We need to separate the aggregatable used in the original really-invalid mapped field from the aggregatable used if the field can indeed be used as a union-type in ES|QL.

Updated version limitation after 8.14 branch

Try debug CI failures in multi-node clusters

Support type conversion in rowstride reader on single leaf

Disable union_types from CsvTests

Keep track of per-shard converters for LoadFromMany

Simplify block loader convert function

Code cleanup

Added unit test for ValuesSourceReaderOperator including field type conversions at block loading

Added test for @timestamp and fixed related bug

It turns out that most, but not all, DataType values have the same esType as typeName, and @timestamp is one that does not, using `date` for esType and `datetime` for typename. Our EsqlIndexResolver was recording multi-type fields with `esType`, while later the actual type conversion was using an evaluator that relied on DataTypes.typeFromName(typeName).
So we fixed the EsqlIndexResolver to rather use typeName.

Added more tests, with three indices combined and two type conversions

Disable lucene-pushdown on union-type fields

Since the union-type rewriter replaced conversion functions with new FieldAttributes, these were passing the check for being possible to push-down, which was incorrect. Now we prevent that.

Set union-type aggregatable flag to false always

This simplifies the push-down check.

Fixed tests after rebase on main

Add unit tests for union-types (same field, different type)

Remove generic warnings

Test code cleanup and clarifying comments

Remove -IT_tests_only in favor of CsvTests assumeFalse

Improved comment

Code review updates

Code review updates

Remove changes to ql/EsRelation

And it turned out the latest version of union type no longer needed these changes anyway, and was using the new EsRelation in the ESQL module without these changes.

Port InvalidMappedField to ESQL

Note, this extends the QL version of InvalidMappedField, so is not a complete port. This is necessary because of the intertwining of QL IndexResolver and EsqlIndexResolver. Once those classes are disentangled, we can completely break InvalidMappedField from QL and make it a forbidden type.

Fix capabilities line after rebase on main

Revert QL FieldAttribute and extend with ESQL FieldAttribute

So as to remove any edits to QL code, we extend FieldAttribute in the ESQL code with the required changes, which simply include the `field` in the hashCode and equals methods.

Revert "Revert QL FieldAttribute and extend with ESQL FieldAttribute"

This reverts commit 168c6c75436e26b83e083cd3de8e18062e116bc9.

Switch UNION_TYPES from EsqlFeatures to EsqlCapabilities

Make hashcode and equals aligned

And removed unused method from earlier union-types work where we kept the NodeId during re-writing (which we no longer do).

Replace required_feature with required_capability after rebase

Switch union_types capability back to feature, because capabilities do not work in mixed clusters

Revert "Switch union_types capability back to feature, because capabilities do not work in mixed clusters"

This reverts commit 56d58bedf756dbad703c07bf4cdb991d4341c1ae.

Added test for multiple columns from same fields

Both IP and Date are tested

Fix bug with incorrectly resolving invalid types

And added more tests

Fixed bug with multiple fields of same name

This fix simply removes the original field already at the EsRelation level, which covers all test cases but has the side effect of having the final field no-longer be unsupported/null when the alias does not overwrite the field with the same name.
This is not exactly the correct semantic intent.
The original field name should be unsupported/null unless the user explicitly overwrote the name with `field=TO_TYPE(field)`, which effectively deletes the old field anyway.

Fixed bug with multiple conversions of the same field

This also fixes the issue with the previous fix that incorrectly reported the converted type for the original field.

More tests with multiple fields and KEEP/DROP combinations

Replace skip with capabilities in YML tests

Fixed missing ql->esql import change after merging main

Merged two InvalidMappedField classes

After the QL code was ported to esql.core, we can now make the edits directly in InvalidMappedField instead of having one extend the other.

Move FieldAttribute edits from QL to ESQL

ESQL: Prepare analyzer for LOOKUP (elastic#109045)

This extracts two fairly uncontroversial changes that were in the main
LOOKUP PR into a smaller change that's easier to review.

ESQL: Move serialization for EsField (elastic#109222)

This moves the serialization logic for `EsField` into the `EsField`
subclasses to better align with the way rest of Elasticsearch works. It
also switches them from ESQL's home grown `writeNamed` thing to
`NamedWriteable`. These are wire compatible with one another.

ESQL: Move serialization of `Attribute` (elastic#109267)

This moves the serialization of `Attribute` classes used in ESQL into
the classes themselves to better line up with the rest of Elasticsearch.

ES|QL: add MV_APPEND function (elastic#107001)

Adding `MV_APPEND(value1, value2)` function, that appends two values
creating a single multi-value. If one or both the inputs are
multi-values, the result is the concatenation of all the values, eg.

```
MV_APPEND([a, b], [c, d]) -> [a, b, c, d]
```

~I think for this specific case it makes sense to consider `null` values
as empty arrays, so that~ ~MV_APPEND(value, null) -> value~ ~It is
pretty uncommon for ESQL (all the other functions, apart from
`COALESCE`, short-circuit to `null` when one of the values is null), so
let's discuss this behavior.~

[EDIT] considering the feedback from Andrei, I changed this logic and
made it consistent with the other functions: now if one of the
parameters is null, the function returns null
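
The behaviour described in this commit message can be sketched as below. This is only an illustration of the stated semantics, modelling multi-values as `java.util.List` and a single value as a one-element list (both assumptions); `mvAppend` is a hypothetical helper and not the ES|QL implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class MvAppendSketch {
    /** Concatenates two multi-values; returns null if either input is null. */
    static <T> List<T> mvAppend(List<T> first, List<T> second) {
        if (first == null || second == null) {
            return null; // consistent with the other functions: null in, null out
        }
        List<T> result = new ArrayList<>(first.size() + second.size());
        result.addAll(first);
        result.addAll(second);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(mvAppend(List.of("a", "b"), List.of("c", "d"))); // [a, b, c, d]
        System.out.println(mvAppend(List.of("a"), null)); // null
    }
}
```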

[ES|QL] Convert string to datetime when the other side of an arithmetic operator is date_period or time_duration (elastic#108455)

* convert string to datetime when the other side of binary operator is temporal amount

ESQL: Move `NamedExpression` serialization (elastic#109380)

This moves the serialization for the remaining `NamedExpression`
subclass into the class itself, and switches all direct serialization of
`NamedExpression`s to `readNamedWriteable` and friends. All other
`NamedExpression` subclasses extend from `Attribute` whose serialization
was moved earlier. They are already registered under the "category class"
for `Attribute`. This also registers them as `NamedExpression`s.

ESQL: Implement LOOKUP, an "inline" enrich (elastic#107987)

This adds support for `LOOKUP`, a command that implements a sort of
inline `ENRICH`, using data that is passed in the request:

```
$ curl -uelastic:password -HContent-Type:application/json -XPOST \
    'localhost:9200/_query?error_trace&pretty&format=txt' \
-d'{
    "query": "ROW a=1::LONG | LOOKUP t ON a",
    "tables": {
        "t": {
            "a:long":     [    1,     4,     2],
            "v1:integer": [   10,    11,    12],
            "v2:keyword": ["cat", "dog", "wow"]
        }
    },
    "version": "2024.04.01"
}'
      v1       |      v2       |       a
---------------+---------------+---------------
10             |cat            |1
```

This required these PRs: * elastic#107624 * elastic#107634 * elastic#107701 * elastic#107762 *

Closes elastic#107306

parent 32ac5ba755dd5c24364a210f1097ae093fdcbd75
author Craig Taverner <craig@amanzi.com> 1717779549 +0200
committer Craig Taverner <craig@amanzi.com> 1718115775 +0200

Fixed compile error after merging in main

Fixed strange merge issues from main

Remove version from ES|QL test queries after merging main

Fixed union-types on nested fields

Switch to Luigi's solution, and expand nested tests

Cleanup after rebase
craigtaverner added a commit that referenced this pull request Jun 19, 2024
* Union Types Support

* Added more tests from code review

Note that one test, `multiIndexIpStringStatsInline` is muted due to failing with the error:

    UnresolvedException: Invalid call to dataType on an unresolved object ?client_ip

* Make CsvTests consistent with integration tests for capabilities

The integration tests do not fail the tests if the capability does not even exist on cluster nodes, instead the tests are ignored. The same behaviour should happen with CsvTests for consistency.

* Return assumeThat to assertThat, but change order

This way we don't have to add more features to the test framework in this PR, but we would probably want a mute feature (like a `skip` line).

* Move serialization of MultiTypeEsField to NamedWriteable approach

Since the sub-fields are AbstractConvertFunction expressions, and Expression is not yet fully supported as a category class for NamedWriteable, we need a few slight tweaks to this, notably registering this explicitly in the EsqlPlugin, as well as calling PlanStreamInput.readExpression() instead of StreamInput.readNamedWriteable(Expression.class). These can be removed later once Expression is fully supported as a category class.

* Remove attempt to mute two failed tests

We used required_capability to mute the tests, but this caused issues with CsvTests which also uses this as a spelling mistake checker for typing the capability name wrong, so we tried to use muted-tests.yml, but that only mutes tests in specific run configurations (ie. we need to mute each and every IT class separately).

So now we just remove the tests entirely. We left a comment in the muted-tests.yml file for future reference about how to mute csv-spec tests.

* Fix rather massive issue with performance of testConcurrentSerialization

Recreating the config on every test was very expensive.

* Code review by Nik

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Labels
:Analytics/ES|QL AKA ESQL auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.15.0
4 participants