Running test #128

Open
GuillaumeCisco opened this issue Dec 7, 2015 · 11 comments

Comments

@GuillaumeCisco

Hello, thank you for providing this tool.
Sorry if my question is totally irrelevant, but I cannot manage to run the tests.

Error is:

{u'status': 400, u'error': {u'caused_by': {u'reason': u'Mapper for [_id] conflicts with existing mapping in other types:\n[mapper [_id] cannot be changed from type [string] to [int]]', u'type': u'illegal_argument_exception'}, u'root_cause': [{u'reason': u'Failed to parse mapping [ManagedButEmpty]: Mapper for [_id] conflicts with existing mapping in other types:\n[mapper [_id] cannot be changed from type [string] to [int]]', u'type': u'mapper_parsing_exception'}], u'type': u'mapper_parsing_exception', u'reason': u'Failed to parse mapping [ManagedButEmpty]: Mapper for [_id] conflicts with existing mapping in other types:\n[mapper [_id] cannot be changed from type [string] to [int]]'}}

I really don't understand why this error occurs. Maybe there are conflicts in my elasticsearch instance.
Any help would be appreciated.

By the way, I was wondering how to configure suggest mapping with bungiesearch:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html

Thank you,

@ChristopherRabotin
Owner

Are you running the tests on a new index? The error you got usually means that you are trying to write to a doc type whose _id field (in this case) is already set to be a string. I know of two ways this can happen:

  1. Some data was already written to that index and doc type before the mapping was set, and elasticsearch interpreted the data you gave as a string instead of an integer;
  2. The mapping for that index and doc type has already been defined as a string, and elasticsearch prevents you from coercing it back to an integer.
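
One way to rule out both causes is to start from an index that does not exist yet. The following is a minimal sketch, assuming `es` is an elasticsearch-py client (or any object exposing the same `indices.exists` and `indices.delete` calls). The helper name `drop_index_if_present` is hypothetical and not part of bungiesearch.

```python
def drop_index_if_present(es, index_name):
    """Delete `index_name` so the next run recreates it with a clean mapping.

    `es` is assumed to be an elasticsearch-py client. Returns True if an
    index was deleted and False if there was nothing to delete.
    """
    if es.indices.exists(index=index_name):
        es.indices.delete(index=index_name)
        return True
    return False


# Possible usage (needs a running Elasticsearch node, so it is commented out):
# from elasticsearch import Elasticsearch
# es = Elasticsearch()
# for name in ("bungiesearch_demo", "bungiesearch_demo_bis"):
#     drop_index_if_present(es, name)
```

This throws away any data in the index, so it only makes sense for throwaway test indices.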

As for the suggesters, I had planned on using them and configuring them through bungiesearch a few months ago, but never got around to it. So I sadly can't be of much help right now. However, I might end up using them in a few weeks on a project I'm currently working on. If you have any insight into them, please share it, as I'll use it too.

@GuillaumeCisco
Author

Thank you @ChristopherRabotin for this information.
What I truly don't understand is that I am running the provided tests, which should create bungiesearch_demo and bungiesearch_demo_bis as described in the settings.

The log is:

INFO:root:Overwriting implicitly defined model field description (StringField) its explicit definition: StringField.
INFO:root:Creating index bungiesearch_demo_bis with 3 doctypes.
INFO:urllib3.connectionpool:Starting new HTTP connection (1): localhost
WARNING:elasticsearch:PUT /bungiesearch_demo_bis [status:400 request:0.016s]

Do you think I have some conflicts with my settings? It is very weird. I think I'm going to test it on another machine.

Thank you,

@ChristopherRabotin
Owner

That is strange indeed. I don't recall seeing any such error in the build logs on TravisCI. Just in case, what version of ES are you running?

@NullSoldier , any thoughts on this?

@GuillaumeCisco
Author

I'm using elasticsearch 2.1.

$> curl -XGET 'localhost:9200' 
{
  "name" : "Hephaestus",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.1.0",
    "build_hash" : "72cd1f1a3eee09505e036106146dc1949dc5dc87",
    "build_timestamp" : "2015-11-18T22:40:03Z",
    "build_snapshot" : false,
    "lucene_version" : "5.3.1"
  },
  "tagline" : "You Know, for Search"
}

In case it helps, I also created a dedicated virtualenv for testing bungiesearch. It automatically installed Django 1.9, so I manually moved back to Django 1.7.11.

One last side question: why doesn't bungiesearch use the fields described in elasticsearch_dsl.field?

@ChristopherRabotin
Owner

Django 1.7 should be supported. Bungiesearch used to run on version 1.5.5 ages ago, and I don't recall making any breaking changes with Django.

As for using Elasticsearch's fields, the main reason is that the ES-dsl fields did not exist when bungiesearch was first written. They were added three or four months after bungiesearch went into production at Sparrho. I don't see any major blocker to using them right now, so to be honest that should probably be done.

@GuillaumeCisco
Author

OK, thank you for all this information.

I think I've found why it is not working thanks to:
http://stackoverflow.com/questions/33516499/elasticsearch-mapping-conflict-error-upgrading-from-1-5-to-2-0

First, I tried to create the mappings directly through the elasticsearch API, using the data generated by bungiesearch:

#!/bin/bash

curl -X DELETE localhost:9200/bungie_demo
curl -X PUT localhost:9200/bungie_demo

curl -X PUT localhost:9200/bungie_demo/Article/_mapping -d '{
    "Article": {"properties": {"updated": {"type": "date", "null_value": "2013-07-01"},
                            "description": {"boost": 1.35, "type": "string", "analyzer": "snowball"},
                            "created": {"type": "date"},
                            "title": {"boost": 1.75, "type": "string", "analyzer": "snowball"},
                            "meta_data": {"type": "string", "analyzer": "snowball"},
                            "link": {"type": "string", "analyzer": "snowball"},
                            "text_field": {"type": "string", "analyzer": "snowball"},
                            "authors": {"type": "string", "analyzer": "snowball"},
                            "effective_date": {"type": "date"},
                            "more_fields": {"type": "string", "analyzer": "snowball"},
                            "_id": {"type": "integer"},
                            "tweet_count": {"type": "integer"},
                            "id": {"type": "integer"},
                            "published": {"type": "date"}}}
}'

curl -X PUT localhost:9200/bungie_demo/User/_mapping -d '{
 "User": {
    "properties": {"updated": {"type": "date", "null_value": "2013-07-01"},
                   "user_id": {"type": "string", "analyzer": "snowball"},
                   "description": {"boost": 1.35, "type": "string", "analyzer": "snowball"},
                   "created": {"type": "date"},
                   "meta_data": {"type": "string", "analyzer": "snowball"},
                   "more_fields": {"type": "string", "analyzer": "snowball"},
                   "effective_date": {"type": "date"},
                   "_id": {"type": "string", "analyzer": "snowball"},
                   "name": {"type": "string", "analyzer": "snowball"}}}
}'

It doesn't work with elasticsearch 2.x:

{"acknowledged":true}{"acknowledged":true}{"acknowledged":true}{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Mapper for [_id] conflicts with existing mapping in other types:\n[mapper [_id] cannot be changed from type [int] to [string]]"}],"type":"illegal_argument_exception","reason":"Mapper for [_id] conflicts with existing mapping in other types:\n[mapper [_id] cannot be changed from type [int] to [string]]"},"status":400} 

In elasticsearch 2.x, _id cannot be typed as an integer in one DocType and as a string in another DocType of the same index.
More information: https://www.elastic.co/blog/great-mapping-refactoring
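
Such clashes can be spotted before anything is sent to the server. Below is a minimal sketch, assuming the mappings are available as plain dicts shaped like the ones in the script above (`{doc_type: {"properties": {...}}}`). The function name `find_type_conflicts` is hypothetical, and it only compares the `type` key, whereas elasticsearch 2.x also compares other settings.

```python
from collections import defaultdict


def find_type_conflicts(mappings):
    """Return {field: {type: [doc types]}} for fields declared with several types.

    `mappings` is assumed to look like {doc_type: {"properties": {field: spec}}}.
    Fields that have the same type everywhere are left out of the result.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for doc_type, mapping in mappings.items():
        for field, spec in mapping.get("properties", {}).items():
            seen[field][spec.get("type")].append(doc_type)
    return {
        field: {type_name: sorted(doc_types) for type_name, doc_types in by_type.items()}
        for field, by_type in seen.items()
        if len(by_type) > 1
    }


# Reduced version of the two mappings from the script above.
mappings = {
    "Article": {"properties": {"_id": {"type": "integer"}, "created": {"type": "date"}}},
    "User": {"properties": {"_id": {"type": "string"}, "created": {"type": "date"}}},
}
conflicts = find_type_conflicts(mappings)
# `conflicts` flags "_id" (integer in Article, string in User) and leaves "created" out.
```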

I suggest updating the test suite to handle elasticsearch 2.x.

What do you think?
Thank you again for your responsiveness.

@ChristopherRabotin
Owner

Yes, I agree that elasticsearch version 2 should be supported as well. Maybe the Travis file could be updated to include different versions of elasticsearch in the build matrix, cf. https://github.com/ChristopherRabotin/bungiesearch/pull/135/files .

@GuillaumeCisco
Author

One more thing: I have tried to modify the code for elasticsearch 2.0 compliance, but I still cannot run the tests.
When I execute the test suite, the tests run in alphabetical order rather than in the order they are declared.
It looks like the tests depend on one another. Can you confirm this?
Is there a way to run the tests in declaration order?
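
For context, the standard `unittest.TestLoader` sorts test method names, which is where the alphabetical order comes from. A minimal sketch of one possible workaround follows, assuming Python 3 and plain, undecorated `unittest.TestCase` methods defined in a single file. `DeclarationOrderLoader` is a hypothetical name, not something bungiesearch or Django provides, and it has not been tried against the bungiesearch suite.

```python
import unittest


class DeclarationOrderLoader(unittest.TestLoader):
    """Order test methods by the source line on which each one is defined."""

    def getTestCaseNames(self, testCaseClass):
        names = super().getTestCaseNames(testCaseClass)

        def first_line(name):
            # Line number of the `def` statement of the test method.
            return getattr(testCaseClass, name).__code__.co_firstlineno

        return sorted(names, key=first_line)


class ExampleTests(unittest.TestCase):
    def test_b_runs_first(self):
        pass

    def test_a_runs_second(self):
        pass


ordered_names = DeclarationOrderLoader().getTestCaseNames(ExampleTests)
# ordered_names lists test_b_runs_first before test_a_runs_second.
```

A suite built with `DeclarationOrderLoader().loadTestsFromTestCase(ExampleTests)` then runs the methods in that order. Tests that do not depend on each other remain the more robust fix.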

Thank you,

@NullSoldier
Contributor

I'm on vacation until the end of the week but I'll be sure to read over this and respond by the end of the week.

@notfol

notfol commented Mar 12, 2016

Areas I'm seeing the tests break using elasticsearch 2.x:

  1. _id is not mappable (already mentioned). I work around this by popping _id out of the mapping dict returned by ModelIndex.get_mapping.
  2. Fields of the same name, in the same index, in different mapping types map to the same field internally and must have the same mapping. This needs to be fixed in the indices in tests/core/search_indices.py and tests/core/search_indices_bis.py
    https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html#merging-conflicts
  3. The cluster health endpoint breaks with the timeout argument. It looks like you now need to provide a unit of time (e.g. timeout=50s).
    es.cluster.health(index=','.join(indices), wait_for_status='green', timeout='30s')
    $ curl -s http://localhost:9200/_cluster/health?timeout=50|python -m json.tool
    {
        "error": {
            "reason": "Failed to parse setting [timeout] with value [50] as a time value: unit is missing or unrecognized",
            "root_cause": [
                {
                    "reason": "Failed to parse setting [timeout] with value [50] as a time value: unit is missing or unrecognized",
                    "type": "parse_exception"
                }
            ],
            "type": "parse_exception"
        },
        "status": 400
    }
  4. The command should take wait_for_status from a command line argument. The docs state that yellow status means "Elasticsearch has allocated all of the primary shards, but some/all of the replicas have not been allocated." AFAIK this will always be the case in a single-node cluster. I saw this call work with 'green' as the wait_for_status argument in ES 1.7, but it does not seem to work in 2.2.
    es.cluster.health(index=','.join(indices), wait_for_status=options.get('wait_for_status'), timeout='30s')
    http://chrissimpson.co.uk/elasticsearch-yellow-cluster-status-explained.html
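
Items 1, 3 and 4 above can be sketched as two small helpers. This is only an illustration under stated assumptions: the mapping is assumed to have the `{"properties": {...}}` shape seen earlier in this thread, the names `without_id_field` and `health_kwargs` are hypothetical, and the `'yellow'` default is a guess suited to a single-node cluster rather than anything bungiesearch defines.

```python
import copy


def without_id_field(mapping):
    """Return a copy of `mapping` with `_id` removed from its properties.

    Assumes the shape {"properties": {...}}; the input dict is not modified.
    """
    cleaned = copy.deepcopy(mapping)
    cleaned.get("properties", {}).pop("_id", None)
    return cleaned


def health_kwargs(indices, wait_for_status="yellow", timeout_seconds=30):
    """Build keyword arguments for es.cluster.health().

    The timeout carries an explicit unit ("30s"), since a bare number is
    rejected with a parse_exception as shown in item 3.
    """
    return {
        "index": ",".join(indices),
        "wait_for_status": wait_for_status,
        "timeout": "{}s".format(timeout_seconds),
    }


# Possible usage (needs a running node and an elasticsearch-py client `es`):
# es.cluster.health(**health_kwargs(indices, options.get('wait_for_status') or 'yellow'))
```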

@ChristopherRabotin
Owner

Item number 3 sounds like an issue with the ES library, not bungiesearch. As for all the other issues, I'm open to a PR if you're up to it. Otherwise, I'll have to figure out when I can work on this. I no longer use Bungiesearch on a regular basis, hence the slower development.

