Can't query on content type from Elasticsearch and sample is wrong #17185

Closed
Piedone opened this issue Dec 9, 2024 · 8 comments · Fixed by #17194

Comments

@Piedone
Member

Piedone commented Dec 9, 2024

Describe the bug

I can't seem to create an Elasticsearch query that filters on content type. The RecentBlogPosts query in TheBlog theme's recipe and all the samples in the documentation are broken too.

Orchard Core version

Latest main (238ed0e). This used to work in 1.8.x.

To Reproduce

  1. Run Elasticsearch locally or elsewhere, and configure it in appsettings.
  2. Set up with the Blog recipe.
  3. Run the "Blog - Elasticsearch Query" recipe.
  4. Open the RecentBlogPosts Query. Notice that its "Query" textbox is empty. This is bug 1.
  5. Try to run the same Elasticsearch query that was in the recipe and used to work under 1.8.x (see below), and observe that it doesn't return any items, though it should return the Blog Post created during setup. This is bug 2.
{
  "query": {
    "term": { "Content.ContentItem.ContentType": "BlogPost" }
  },
  "sort": [
    {
      "Content.ContentItem.CreatedUtc": "desc"
    }
  ],
  "size": 3
}

Using these variations doesn't return any items either:

{
  "query": {
    "term": { "Content.ContentItem.ContentType.keyword": "BlogPost" }
  }
}

{
  "query": {
    "match": { "Content.ContentItem.ContentType.keyword": "BlogPost" }
  }
}

{
  "query": {
    "match": { "Content.ContentItem.ContentType": "BlogPost" }
  }
}

The field is correctly mapped (as a keyword):

{
  "aliases": {},
  "mappings": {
    "dynamic_templates": [
      {
        "*.Inherited": {
          "match_mapping_type": "string",
          "path_match": "*.Inherited",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "*.Ids": {
          "match_mapping_type": "string",
          "path_match": "*.Ids",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "*.Location": {
          "match_mapping_type": "object",
          "path_match": "*.Location",
          "mapping": {
            "type": "geo_point"
          }
        }
      }
    ],
    "_meta": {
      "last_task_id": 7
    },
    "properties": {
      "Content": {
        "properties": {
          "BodyAspect": {
            "properties": {
              "Body": {
                "fields": {
                  "keyword": {
                    "ignore_above": 256,
                    "type": "keyword"
                  }
                },
                "type": "text"
              }
            },
            "type": "object"
          },
          "ContentItem": {
            "properties": {
              "Author": {
                "fields": {
                  "keyword": {
                    "ignore_above": 256,
                    "type": "keyword"
                  }
                },
                "type": "text"
              },
              "ContainedPart": {
                "properties": {
                  "Ids": {
                    "type": "keyword"
                  },
                  "Order": {
                    "type": "float"
                  }
                },
                "type": "object"
              },
              "ContentType": {
                "type": "keyword"
              },
              "CreatedUtc": {
                "type": "date"
              },
              "DisplayText": {
                "properties": {
                  "Analyzed": {
                    "type": "text"
                  },
                  "Keyword": {
                    "type": "keyword"
                  },
                  "Normalized": {
                    "type": "keyword"
                  },
                  "keyword": {
                    "fields": {
                      "keyword": {
                        "ignore_above": 256,
                        "type": "keyword"
                      }
                    },
                    "type": "text"
                  }
                },
                "type": "object"
              },
              "FullText": {
                "type": "text"
              },
              "Latest": {
                "type": "boolean"
              },
              "ModifiedUtc": {
                "type": "date"
              },
              "Owner": {
                "type": "keyword"
              },
              "Published": {
                "type": "boolean"
              },
              "PublishedUtc": {
                "type": "date"
              }
            },
            "type": "object"
          }
        },
        "type": "object"
      },
      "ContentItemId": {
        "type": "keyword"
      },
      "ContentItemVersionId": {
        "type": "keyword"
      }
    },
    "_source": {
      "enabled": true,
      "excludes": [
        "Content.ContentItem.DisplayText.Analyzed"
      ]
    }
  },
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "standard": {
            "type": "standard"
          }
        }
      },
      "creation_date": 1733704276685,
      "number_of_replicas": "1",
      "number_of_shards": "1",
      "provided_name": "default_search",
      "routing": {
        "allocation": {
          "include": {
            "_tier_preference": "data_content"
          }
        }
      },
      "uuid": "eyiyt9TNTfmnkPV3dA4Egw",
      "version": {
        "created": "8518000"
      }
    }
  }
}
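
One way to double-check which field names this mapping actually exposes is to walk it by dotted path. The sketch below is in Python and is only illustrative: `field_type` and the abbreviated `MAPPINGS` excerpt are hypothetical helpers written for this example, not part of Orchard Core or of any Elasticsearch client. On the mapping above it suggests that `Content.ContentItem.ContentType` is itself a keyword field with no `.keyword` sub-field (unlike `Author`), so the `.keyword` variations would be targeting an unmapped field.

```python
from typing import Any, Optional


def field_type(mappings: dict, path: str) -> Optional[str]:
    """Return the mapped type of a dotted field path, or None if unmapped."""
    node: Any = mappings
    for part in path.split("."):
        # Object fields nest their children under "properties"; multi-fields
        # (such as a ".keyword" sub-field) are listed under "fields".
        children = {**node.get("fields", {}), **node.get("properties", {})}
        if part not in children:
            return None
        node = children[part]
    # Object fields may omit "type"; treat a missing type as "object".
    return node.get("type", "object")


# Abbreviated excerpt of the mapping shown above (assumption: only the
# fields needed for the illustration are copied here).
MAPPINGS = {
    "properties": {
        "Content": {
            "type": "object",
            "properties": {
                "ContentItem": {
                    "type": "object",
                    "properties": {
                        "ContentType": {"type": "keyword"},
                        "Author": {
                            "type": "text",
                            "fields": {
                                "keyword": {"type": "keyword", "ignore_above": 256}
                            },
                        },
                    },
                }
            },
        }
    }
}

for name in (
    "Content.ContentItem.ContentType",
    "Content.ContentItem.ContentType.keyword",
    "Content.ContentItem.Author.keyword",
):
    print(name, "->", field_type(MAPPINGS, name))
```

This only inspects the mapping; it does not explain why the plain `Content.ContentItem.ContentType` queries return nothing.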

Keyword search from under /Search still works.

Expected behavior

The Query from the recipe should work. Also, samples like the ones under https://docs.orchardcore.net/en/latest/guides/query-content-items-based-on-taxonomy/ should also work (they're for Lucene, but this is supposed to work the same).

Logs and screenshots

@Piedone
Member Author

Piedone commented Dec 9, 2024

@Skrypt do you perhaps have some tip?

@Piedone changed the title from "Can't query on content type from Elasticsearch and sample is wrong." to "Can't query on content type from Elasticsearch and sample is wrong" on Dec 9, 2024
@Skrypt
Contributor

Skrypt commented Dec 9, 2024

Maybe it's related to the recent merge of the Elastic.Clients.Elasticsearch library?
Try reverting your branch to before that commit on the main branch to see if it fixes the issue.
If it does, we may want to check what's wrong with the recently committed changes.

Also, try removing the .keyword at the end of your field names. Maybe it got removed in that PR.

The best way to know is to inspect the index with the Elasticsearch web UI.

@MikeAlhayek
Member

@Piedone I think it may be a bug in the new Elasticsearch.Net library.

Try this instead

{
  "query": {
    "term": {
      "Content.ContentItem.ContentType": {
        "value": "BlogPost"
      }
    }
  },
  "sort": [
    {
      "Content.ContentItem.CreatedUtc": "desc"
    }
  ],
  "size": 3
}

I reported this bug as elastic/elasticsearch-net#8432 to see what they come back with.

@MikeAlhayek MikeAlhayek added this to the 3.0 milestone Dec 9, 2024
Contributor

github-actions bot commented Dec 9, 2024

We triaged this issue and set the milestone according to the priority we think is appropriate (see the docs on how we triage and prioritize issues).

This indicates when the core team may start working on it. However, if you'd like to contribute, we'd warmly welcome you to do that anytime. See our guide on contributions here.

@Skrypt
Contributor

Skrypt commented Dec 9, 2024

From memory, the ContentType field must be indexed as a keyword so that we have a .keyword name on the field to search. We need to see if this is a change in ES 8. Otherwise, I will need to analyze this and see what is happening.

@MikeAlhayek
Member

@Skrypt I think you may have missed my last comment

@Skrypt
Contributor

Skrypt commented Dec 9, 2024

Yeah, the issue here is that our Lucene implementation supports the short form in our Lucene TermQueryProvider.
These Queries need to work with both Lucene and Elasticsearch.
So, we will need to convert them to use { "value": "BlogPost" }.

Also .keyword should work unless they changed that too.
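
The conversion described above, from the short form `{ "term": { field: value } }` to the explicit `{ "term": { field: { "value": value } } }`, can be sketched as follows. This is a Python illustration only: `expand_term_shorthand` is a hypothetical helper, not Orchard Core code and not necessarily how the actual fix was done. It is deliberately naive, in that it assumes every `term` key holding an object is a term query.

```python
import json
from typing import Any


def expand_term_shorthand(query: Any) -> Any:
    """Return a copy of a query where term short forms use {"value": ...}."""
    if isinstance(query, list):
        return [expand_term_shorthand(item) for item in query]
    if not isinstance(query, dict):
        return query
    result = {}
    for key, value in query.items():
        if key == "term" and isinstance(value, dict):
            # Wrap scalar values; leave already-expanded objects untouched.
            result[key] = {
                field: spec if isinstance(spec, dict) else {"value": spec}
                for field, spec in value.items()
            }
        else:
            result[key] = expand_term_shorthand(value)
    return result


# The short-form query from the issue description.
recipe_query = {
    "query": {"term": {"Content.ContentItem.ContentType": "BlogPost"}},
    "sort": [{"Content.ContentItem.CreatedUtc": "desc"}],
    "size": 3,
}

print(json.dumps(expand_term_shorthand(recipe_query), indent=2))
```

Running the helper on an already-expanded query leaves it unchanged, so it could be applied to stored queries regardless of which form they use.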

MikeAlhayek added a commit that referenced this issue Dec 9, 2024
@Piedone
Member Author

Piedone commented Dec 9, 2024

> Maybe it's related to the recent merge of the Elastic.Clients.Elasticsearch library? Try reverting your branch to before that commit on the main branch to see if it fixes the issue. If it does, we may want to check what's wrong with the recently committed changes.
>
> Also, try removing the .keyword at the end of your field names. Maybe it got removed in that PR.
>
> The best way to know is to inspect the index with the Elasticsearch web UI.

Commit bb7359e, slightly before #17027, works with the original query.

> @Piedone I think it may be a bug in the new Elasticsearch.Net library.
>
> Try this instead
>
> {
>   "query": {
>     "term": {
>       "Content.ContentItem.ContentType": {
>         "value": "BlogPost"
>       }
>     }
>   },
>   "sort": [
>     {
>       "Content.ContentItem.CreatedUtc": "desc"
>     }
>   ],
>   "size": 3
> }
>
> I reported this bug as elastic/elasticsearch-net#8432 to see what they come back with.

This query indeed works.

3 participants