bug in lucene select #5719

Closed
publicocean0 opened this issue Feb 13, 2016 · 5 comments

@publicocean0

version 2.10

index: c.name_list  FULLTEXT    ["name","list"]     LUCENE 

insert into c set list = ['a','b'] 

select from index:c.name_list where key  LUCENE "list:b" 

----+------+----------------------------+-----
#   |@CLASS|key                         |rid  
----+------+----------------------------+-----
0   |null  |OCompositeKey{keys=[list:b]}|#56:1
----+------+----------------------------+-----

select from index:c.name_list where key  LUCENE "list:a" 
0 item(s) found. Query executed in 0.002 sec(s).


It seems that only the last element is added. Lucene handles arrays natively by adding multiple fields with the same name, one per element.
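
To make the expected behaviour concrete, here is a toy model in plain Java. It is only a sketch: it uses neither Lucene nor OrientDB classes, and `ToyFieldIndex`, `addMultiValued` and `search` are hypothetical names. It mimics what happens when one document carries several fields with the same name: every value is posted against the same record, so both `list:a` and `list:b` should match.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps "field:value" terms to the ids of records containing them.
// Illustration only; this is not how Lucene or OrientDB is implemented.
class ToyFieldIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Index a multi-valued field by posting each value under the same field name,
    // in the same spirit as a Lucene document holding several fields called "list".
    void addMultiValued(String rid, String field, List<String> values) {
        for (String value : values) {
            postings.computeIfAbsent(field + ":" + value, k -> new TreeSet<>()).add(rid);
        }
    }

    // Return the record ids whose field contains the given value.
    Set<String> search(String field, String value) {
        return postings.getOrDefault(field + ":" + value, Collections.emptySet());
    }
}
```

With the record from this report indexed as `addMultiValued("#56:1", "list", Arrays.asList("a", "b"))`, a search for `list:a` and a search for `list:b` both return `#56:1` in this model, which is the behaviour the index in the issue fails to show.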

@smolinari
Contributor

Is it correct that a lucene index should be able to index lists?

I also just tried it with Studio and it doesn't work. I get:

com.orientechnologies.orient.core.exception.OCommandExecutionException: Cannot evaluate lucene condition without index configuration.

With the index set on a property of an embedded list of strings.

Scott

@publicocean0
Author

Lucene is an index engine designed for indexing unstructured data. You can index whatever you want.
It is used above all for document services. A general document contains unstructured data. If you read a Word file and want to index the information inside it, you don't know the structure until you read the specific instance. For example, you may want to index authors: in one Word file you can find them, in another you cannot. So nothing is foreseeable. This is why Lucene succeeded: it answers a specific need requested by customers. So the initial answer is naturally yes.
I saw the error you mentioned, but it is a different bug. It appears if you create an index using the console. I worked around it by using Java or the Studio GUI (not executing a command).
In addition, today I tried to take a look at the Lucene module to patch it, because indexing a list in Lucene takes 3 instructions. But I discovered that the Lucene manager class splits a list or OCompositeKey into many updates (I think many documents). I'm not sure about this duplication; I spent only a little time on the investigation and will take another look in my spare time. It seems to create duplicates in the index.
The main problem, in my opinion, is here:
OCompositeIndexDefinition makes an initial assumption that is valid for an OrientDB index but not for a Lucene index:

    if (keyValue instanceof Collection) {
      final Collection collectionKey = (Collection) keyValue;
      if (!containsCollection)
        for (int i = 1; i < collectionKey.size(); i++) {
          final OCompositeKey compositeKey = new OCompositeKey(firstKey.getKeys());
          compositeKeys.add(compositeKey);
        }
      else
        throw new OIndexException("Composite key cannot contain more than one collection item"); // <-----

Simply put, in Lucene practically every key is valid because it corresponds to a Lucene Document class (this is a general problem, not one limited to lists).
I think this is also the cause of the duplication in Lucene indexes: it uses many Lucene Documents to do what could be done with just one Lucene Document (it creates very large index files if the index schema has many key elements).
For example, OIndexDefinition could contain an internal boolean property useStructuredKey (true for the Lucene engine, false for the remaining engines).
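
A rough sketch of the two key-building strategies described in this comment, in plain Java. Everything here is hypothetical: `KeyStrategySketch` and its methods do not exist in OrientDB, and `useStructuredKey` is only the flag name proposed above. The point is the difference in how many index entries each strategy writes for one record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in OrientDB.
class KeyStrategySketch {

    // Ordinary index strategy: one composite key per collection element,
    // so name="x" with list=[a, b] yields two separate index entries.
    static List<List<Object>> expandToCompositeKeys(Object name, Collection<?> list) {
        List<List<Object>> keys = new ArrayList<>();
        for (Object element : list) {
            keys.add(Arrays.asList(name, element));
        }
        return keys;
    }

    // Strategy suggested for Lucene: a single document whose "list" field is multi-valued.
    static Map<String, List<Object>> toSingleDocument(Object name, Collection<?> list) {
        Map<String, List<Object>> doc = new LinkedHashMap<>();
        doc.put("name", Collections.singletonList(name));
        doc.put("list", new ArrayList<>(list));
        return doc;
    }

    // useStructuredKey is the flag proposed in the comment (assumed semantics):
    // true means one document per record, false means one entry per expanded key.
    static int entriesWritten(boolean useStructuredKey, Object name, Collection<?> list) {
        return useStructuredKey ? 1 : expandToCompositeKeys(name, list).size();
    }
}
```

Under these assumptions a record with a two-element list costs two entries with key expansion and one entry with a structured key, and the gap grows with the number of elements, which matches the concern about very large index files.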

@publicocean0
Author

Another bug found: CREATE INDEX Manual2 FULLTEXT ENGINE LUCENE throws a NullPointerException.

@publicocean0
Author

Another bug found: select expand(rid) from index:A.lucene where key LUCENE "producer:'#33:295'"
It is impossible to search on an attribute that is a link to another document.

Hmm, maybe the correct form is LUCENE "producer:'#33\:295'"
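
The colon in a record id clashes with the `field:value` separator of Lucene's classic query syntax, which is why the escaped form is plausible. Lucene ships a static `QueryParser.escape` helper for this; the sketch below is a dependency-free approximation of the same idea (`LuceneEscapeSketch` is a hypothetical name, and whether OrientDB passes the escaped string through to Lucene unchanged is an assumption).

```java
// Minimal sketch: backslash-escape the characters that Lucene's classic
// query syntax treats as special. Not the library implementation.
class LuceneEscapeSketch {
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    static String escape(String raw) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\'); // prefix each special character with a backslash
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

Here `escape("#33:295")` yields `#33\:295`, since `:` is special and `#` is not, so the query term can be built as `"producer:" + escape(rid)`.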

robfrank added a commit that referenced this issue Oct 31, 2016
robfrank added a commit that referenced this issue Oct 31, 2016
@robfrank robfrank added this to the 2.2.x (next hotfix) milestone Oct 31, 2016
@robfrank
Contributor

Verified with tests. See commits.
