Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong serialization of TermsQuery, MinimumShouldMatch and TrackTotalHits #8387

Closed
andersosthus opened this issue Oct 15, 2024 · 4 comments
Closed
Labels

Comments

@andersosthus
Copy link
Contributor

andersosthus commented Oct 15, 2024

Elastic.Clients.Elasticsearch version: 8.15.9

Elasticsearch version: N/A

.NET runtime version: 8.0

Operating system version: Win 11

Description of the problem including expected versus actual behavior:
Term in TermsQuery does not serialize properly.
MinimumShouldMatch in BoolQuery does not serialize properly.
TrackTotalHits in SearchRequest does not serialize properly.

Steps to reproduce:

using System.Text;
using Elastic.Clients.Elasticsearch;
using Elastic.Clients.Elasticsearch.Core.Search;
using Elastic.Clients.Elasticsearch.QueryDsl;
using Elastic.Transport;

var client = new ElasticsearchClient();

var request = new SearchRequest("indexName")
{
    Query = new BoolQuery
    {
        Should =
        [
            new TermsQuery
            {
                Field = "abcd",
                Term = new TermsQueryField(new FieldValue[]{"a", "b"}),
            },
        ],
        MinimumShouldMatch = MinimumShouldMatch.Fixed(1)
    },
    TrackTotalHits = new TrackHits(false)
};

using var ms = new MemoryStream();
client.Transport.Configuration.SourceSerializer.Serialize(request, ms, SerializationFormatting.Indented);
var jsonPayload = Encoding.UTF8.GetString(ms.ToArray());

Console.WriteLine(jsonPayload);

Expected behavior
Expected the following output:

{
  "query": {
    "bool": {
      "should": {
        "terms": {
          "abcd": ["a", "b"]
        }
      },
      "minimum_should_match": 1
    }
  },
  "track_total_hits": false
}

but this is the actual produces JSON:

{
  "query": {
    "bool": {
      "should": {
        "terms": {
          "abcd": {}
        }
      },
      "minimum_should_match": {}
    }
  },
  "track_total_hits": {}
}

Other notes:
Using new TrackHits(1) for TrackTotalHits gives the same output.

We've worked around it for now with custom serializers, just reporting it here to make sure you're aware of it.

Sidenote, for TermsQuery, shouldn't Term be renamed to Terms to align semantically since the use case is for one or many terms? Also, thinking about DX, TermsQuery.Term(s) should allow us to just pass in a collection in stead of a TermsQueryField which feels a lot more clunky.

@andersosthus andersosthus added 8.x Relates to 8.x client version Category: Bug labels Oct 15, 2024
@flobernd
Copy link
Member

Hi @andersosthus,

you are trying to serialize "internal" request/response types using the SourceSerializer. The SourceSerializer is exclusively meant to (de-)serialize user defined data (usually specified as a generic type TDocument).

It should work, if you use the client.RequestResponseSerializer instead.

Hint: If you update to the latest version, you can use one of the new serialization overloads:

var jsonPayload = client.RequestResponseSerializer.SerializeToString(request);

This way you don't have to manually create a MemoryStream or handle encoding. Besides that, an optimized shortcut path will be used, if the serializer derives from SystemTextJsonSerializer - which is the case for both, the DefaultRequestResponseSerializer and the DefaultSourceSerializer 🙂

Please let me know, if that answers your question.

@flobernd
Copy link
Member

Sidenote, for TermsQuery, shouldn't Term be renamed to Terms to align semantically since the use case is for one or many terms?

Yes, I agree. This is currently incorrectly named in our specification. The specification is used to generate multiple clients and other tools, which means we have to be careful with breaking changes like this one. We will discuss this point internally.

Also, thinking about DX, TermsQuery.Term(s) should allow us to just pass in a collection in stead of a TermsQueryField which feels a lot more clunky.

This is already on the list of upcoming usability improvements. For this to work, we have to generate transient implicit conversion operators.

@andersosthus
Copy link
Contributor Author

Hi @andersosthus,

you are trying to serialize "internal" request/response types using the SourceSerializer. The SourceSerializer is exclusively meant to (de-)serialize user defined data (usually specified as a generic type TDocument).

It should work, if you use the client.RequestResponseSerializer instead.

Hint: If you update to the latest version, you can use one of the new serialization overloads:

var jsonPayload = client.RequestResponseSerializer.SerializeToString(request);

This way you don't have to manually create a MemoryStream or handle encoding. Besides that, an optimized shortcut path will be used, if the serializer derives from SystemTextJsonSerializer - which is the case for both, the DefaultRequestResponseSerializer and the DefaultSourceSerializer 🙂

Please let me know, if that answers your question.

Thanks @flobernd.

Too many serializers :)

I guess I need to dig a bit deeper to find what's breaking our request. A bunch of assumptions led me to believe it was in the serializer (based on my example above), but that was clearly wrong :)

@flobernd
Copy link
Member

@andersosthus Let me know if I can do something else to help you debugging your issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants