How groupingSearch specifies SortedNumericDocValuesField

顿顿
When I use GroupingSearch on a field indexed as SortedNumericDocValuesField,
I get an "unexpected docvalues type SORTED_NUMERIC for field 'id'
(expected=SORTED)" exception.

My code is as follows:

        String indexPath = "tmp/grouping";
        Analyzer standardAnalyzer = new StandardAnalyzer();
        Directory indexDir = FSDirectory.open(Paths.get(indexPath));
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(standardAnalyzer);
        indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
        IndexWriter masterIndex = new IndexWriter(indexDir, indexWriterConfig);

        String name = "Tom";
        for (int i = 1; i < 5; i++) {
            Document doc = new Document();
            doc.add(new StringField("name", name + "_" + i, Field.Store.YES));
            doc.add(new SortedNumericDocValuesField("id", i));
            doc.add(new StoredField("id", i));
            masterIndex.addDocument(doc);
        }
        masterIndex.commit();

        IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(indexPath)));
        IndexSearcher searcher = new IndexSearcher(reader);

        GroupingSearch groupingSearch = new GroupingSearch("id");
        TopGroups topGroups = groupingSearch.search(searcher, new MatchAllDocsQuery(), 0, 100);

        System.out.println(topGroups.totalHitCount);
        reader.close();


The exception is as follows:

Exception in thread "main" java.lang.IllegalStateException: unexpected docvalues type SORTED_NUMERIC for field 'id' (expected=SORTED). Re-index with correct docvalues type.
    at org.apache.lucene.index.DocValues.checkField(DocValues.java:317)
    at org.apache.lucene.index.DocValues.getSorted(DocValues.java:369)
    at org.apache.lucene.search.grouping.TermGroupSelector.setNextReader(TermGroupSelector.java:56)
    at org.apache.lucene.search.grouping.FirstPassGroupingCollector.doSetNextReader(FirstPassGroupingCollector.java:348)
    at org.apache.lucene.search.SimpleCollector.getLeafCollector(SimpleCollector.java:33)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:643)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
    at org.apache.lucene.search.grouping.GroupingSearch.groupByFieldOrFunction(GroupingSearch.java:141)
    at org.apache.lucene.search.grouping.GroupingSearch.search(GroupingSearch.java:113)


The version of Lucene I am using is 8.0.0.
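
For readers following the stack trace above: the exception comes from a doc-values type check (DocValues.checkField, reached through DocValues.getSorted), which suggests that grouping by field name reads SORTED doc values. The plain-Java sketch below is only a simplified model of such a check, not Lucene's actual implementation. The class DocValuesTypeCheckSketch, its enum, and its method are hypothetical names chosen for illustration.

```java
class DocValuesTypeCheckSketch {
    // Type names as they appear in the error messages quoted in this thread.
    enum DocValuesType { NUMERIC, BINARY, SORTED, SORTED_NUMERIC, SORTED_SET }

    // Hypothetical stand-in for the check seen in the stack trace:
    // reject a field whose indexed doc-values type differs from the expected one.
    static void checkField(String field, DocValuesType actual, DocValuesType expected) {
        if (actual != expected) {
            throw new IllegalStateException("unexpected docvalues type " + actual
                    + " for field '" + field + "' (expected=" + expected + "). "
                    + "Re-index with correct docvalues type.");
        }
    }

    public static void main(String[] args) {
        // A field indexed with SORTED doc values passes the check silently.
        checkField("id", DocValuesType.SORTED, DocValuesType.SORTED);

        // A SORTED_NUMERIC field is rejected, as in the original post.
        try {
            checkField("id", DocValuesType.SORTED_NUMERIC, DocValuesType.SORTED);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

On this reading, a field indexed with SortedDocValuesField (doc-values type SORTED) would pass the check, while NUMERIC, SORTED_NUMERIC, and SORTED_SET fields would be rejected in the same way.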


Finally, I would like to know how GroupingSearch can group on these three
field types: NumericDocValuesField, SortedNumericDocValuesField, and
SortedSetDocValuesField.

Thank you for your attention to this matter!
Re: How groupingSearch specifies SortedNumericDocValuesField

Martin Grigorov
Hi,

On Tue, May 14, 2019 at 8:28 PM 顿顿 <[hidden email]> wrote:

>             doc.add(new SortedNumericDocValuesField("id", i));
>             doc.add(new StoredField("id", i));
>

Are you sure both fields should have the same name ("id")?


Re: How groupingSearch specifies SortedNumericDocValuesField

顿顿
Hi,

This is from a unit test. I switched to NumericDocValuesField and got a
similar error.

I tested NumericDocValuesField, SortedNumericDocValuesField, and
SortedSetDocValuesField; none of the three can be used as the group field in
GroupingSearch. Does GroupingSearch only support SortedDocValuesField?



Martin Grigorov <[hidden email]> wrote on Wed, May 15, 2019 at 1:51 PM:

> are you sure both fields should have the same name ("id") ?
Re: How groupingSearch specifies SortedNumericDocValuesField

duriku
Hi, I managed to retrieve the groups using *SortedSetDocValuesField* in
*GroupingSearch* by initialising the GroupingSearch with a *SortedSetFieldSource*.

The problem is that when a document has multiple values in the
SortedSetDocValuesField, the grouping query does not return all the groups.

Let me demonstrate it in my example

// Indexing: the first document has the category "one"; the second document
// has the categories "two" and "three".

Document doc = new Document();
doc.add(new FacetField("Author", "Bob"));
doc.add(new SortedSetDocValuesField("category", new BytesRef("one")));
indexWriter.addDocument(config.build(taxoWriter, doc));

doc = new Document();
doc.add(new FacetField("Author", "Lisa"));
doc.add(new SortedSetDocValuesField("category", new BytesRef("two")));
doc.add(new SortedSetDocValuesField("category", new BytesRef("three")));
indexWriter.addDocument(config.build(taxoWriter, doc));

// Initializing the grouping search
ValueSource vs = new SortedSetFieldSource(groupField);
groupingSearch = new GroupingSearch(vs, new HashMap<>());

// Performing the group search
TopGroups groups = groupingSearch.search(searcher, new MatchAllDocsQuery(), 0, 100);


It returns 2 groups only and I would expect 3 groups ("one", "two" and
"three")

Is it a bug? Or am I using the API in a wrong way?
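
One possible explanation, offered as an assumption rather than a confirmed answer: SortedSetFieldSource is documented as selecting a single representative value per document (the minimum value by default), so each document would land in exactly one group. The plain-Java sketch below models that behaviour without using Lucene. The class MultiValuedGroupingSketch and both of its methods are hypothetical names for illustration only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class MultiValuedGroupingSketch {
    // Model of "one representative value per document": each document
    // contributes only its smallest value (in string order) as its group key.
    static Set<String> groupsBySmallestValue(List<List<String>> docs) {
        Set<String> groups = new TreeSet<>();
        for (List<String> values : docs) {
            if (!values.isEmpty()) {
                groups.add(Collections.min(values));
            }
        }
        return groups;
    }

    // Model of the behaviour expected in the post: every value of every
    // document becomes a group key.
    static Set<String> groupsByEveryValue(List<List<String>> docs) {
        Set<String> groups = new TreeSet<>();
        for (List<String> values : docs) {
            groups.addAll(values);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Same data as the example above: doc 1 has "one", doc 2 has "two" and "three".
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("one"),
                Arrays.asList("two", "three"));

        System.out.println(groupsBySmallestValue(docs)); // [one, three]
        System.out.println(groupsByEveryValue(docs));    // [one, three, two]
    }
}
```

Under this assumption the second document is grouped under "three" only, because "three" sorts before "two", which would account for the two groups observed.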


