Search Performance Problem 16 sec for 250K docs


Search Performance Problem 16 sec for 250K docs

M A-2
Hi there,

I have an index with about 250K documents, indexed full text.

There are 2 types of searches carried out: one using 1 field, the other using
4 fields, for a given query string.

Given the nature of the queries required, all stop words are maintained in
the index, thereby allowing for phrase queries (this is a requirement).

So to search I am using the following:

if (fldArray.length > 1) {
    // use four fields
    BooleanClause.Occur[] flags = {BooleanClause.Occur.SHOULD,
        BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD,
        BooleanClause.Occur.SHOULD};
    query = MultiFieldQueryParser.parse(queryString, fldArray, flags, analyzer); // parse the query
} else {
    // use only 1 field
    query = new QueryParser("tags", analyzer).parse(queryString);
}


When I search on the 4 fields the average search time is 16 secs.
When I search on the 1 field the average search time is 9 secs.

The Analyzer used for both searching and indexing is
Analyzer analyzer = new StandardAnalyzer(new String[]{});

The index size is about 1GB.

The documents vary in size; some are less than 1K, the max size is about 5K.



Is there anything I can do to make this faster? 16 secs is just not
acceptable.

Machine: 512MB, Celeron 2600 ... Lucene 2.0

I could go for a bigger machine but wanted to make sure that the problem was
not something I was doing, given 250K is not that large a figure.

Please Help

Thanx

Mo

Re: Search Performance Problem 16 sec for 250K docs

Erick Erickson
This is a loooonnnnnggg time; I think you're right, it's excessive.

What are you timing? The time to complete the search (i.e. get a Hits object
back), or the total time to assemble the response? I ask because the Hits
object is designed to return the first 100 or so docs efficiently. Every 100
docs or so, it re-executes the query. So if you're returning a large result
set, then using the Hits object to iterate over it, this could account for
your time. Use a HitCollector instead... But do note this from the javadoc
for HitCollector:

----
For good search performance, implementations of this method should not
call Searcher.doc(int) or IndexReader.document(int) on every document
number encountered. Doing so can slow searches by an order of magnitude
or more.
-----
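
For illustration, a minimal sketch of that approach (2.0-era API, surrounding
variables assumed from your code): collect only the doc ids in the callback,
and fetch stored fields later, only for the handful of documents you display:

    final java.util.List docIds = new java.util.ArrayList();
    searcher.search(query, new HitCollector() {
        public void collect(int doc, float score) {
            // cheap: no Searcher.doc()/IndexReader.document() call per hit
            docIds.add(new Integer(doc));
        }
    });
    // page through docIds and call searcher.doc(id) only for the docs you display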

FWIW, I have indexes larger than 1GB that return in far less time than you
are indicating, through three layers and constructing web pages in the
meantime. One of them contains over 800K documents and the response time is
around a second (haven't timed it lately). This includes 5-way sorts.

You might also either get a copy of Luke and have it explain exactly what
the parse does, or use one of the query explain calls (sorry, don't remember
them off the top of my head) to see what query is *actually* submitted and
whether it's what you expect.

Are you using wildcards? They also have an effect on query speed.

If none of this applies, perhaps you could post the query and how the
index is constructed. If you haven't already gotten a copy of Luke, I
heartily recommend it....

Hope this helps
Erick


Re: Search Performance Problem 16 sec for 250K docs

M A-2
What I am measuring is this:

Analyzer analyzer = new StandardAnalyzer(new String[]{});

    if (fldArray.length > 1) {
      BooleanClause.Occur[] flags = {BooleanClause.Occur.SHOULD,
          BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD,
          BooleanClause.Occur.SHOULD};
      query = MultiFieldQueryParser.parse(queryString, fldArray, flags, analyzer);
    } else {
      query = new QueryParser("tags", analyzer).parse(queryString);
      System.err.println("QUERY IS " + query.toString());
    }

    long ts = System.currentTimeMillis();

    hits = searcher.search(query, new Sort("sid", true));
    java.util.Set terms = new java.util.HashSet();
    query.extractTerms(terms);
    Object[] strTerms = terms.toArray();

    long ta = System.currentTimeMillis();

    System.err.println("Retrieved hits in " + (ta - ts));

Recent figures for this:

Retrieved hits in 26974 (msec)
Retrieved hits in 61415 (msec)

The query itself is just a word, e.g. "apple".

The index is constructed as follows:

// this process runs as a daemon process ...

    iwriter = new IndexWriter(INDEX_PATH, new StandardAnalyzer(new String[]{}), false);
    Document doc = new Document();
    doc.add(new Field("id", String.valueOf(story.getId()),
        Field.Store.YES, Field.Index.NO));
    doc.add(new Field("sid", String.valueOf(story.getDate().getTime()),
        Field.Store.YES, Field.Index.UN_TOKENIZED));
    String tags = getTags(story.getId());
    doc.add(new Field("tags", tags, Field.Store.YES, Field.Index.TOKENIZED));
    doc.add(new Field("headline", story.getHeadline(),
        Field.Store.YES, Field.Index.TOKENIZED));
    doc.add(new Field("blurb", story.getBlurb(),
        Field.Store.YES, Field.Index.TOKENIZED));
    doc.add(new Field("content", story.getContent(),
        Field.Store.YES, Field.Index.TOKENIZED));
    doc.add(new Field("catid", String.valueOf(story.getCategory()),
        Field.Store.YES, Field.Index.TOKENIZED));
    iwriter.addDocument(doc);

// then iwriter.close()

Optimize just runs once a day, after some deletions.

The tags are select words .. in total about 20K different ones in
combination ..

so story.getTags() -> returns a string of the type "XXX YYY ZZZ YYY CCC DDD"
story.getId() -> returns a long
story.sid -> that's a long too
story.getContent() -> returns text in most cases; sometimes it's blank
story.getHeadline() -> returns text, usually about 512 chars
story.getBlurb() -> returns text, about 255 chars
story.getCatid() -> returns a long


That covers both sections, i.e. the read and the write ..

I did look at Luke, but unfortunately the docs don't seem to refer to a
command-line interface to it (unless I missed something).. This is running on
a headless box ..

Cheers

Mohammed.




Re: Search Performance Problem 16 sec for 250K docs

Chris Hostetter-3

:     hits = searcher.search(query, new Sort("sid", true));

you don't show where searcher is initialized, and you don't clarify how
you are timing your multiple iterations -- i'm going to guess that you are
opening a new searcher every iteration, right?

sorting on a field requires pre-computing an array of information for that
field -- this is both time and space expensive, and is cached per
IndexReader/IndexSearcher -- so if you reuse the same searcher and time
multiple iterations, you'll find that the first iteration might be somewhat
slow, but the rest should be very fast.



-Hoss



Re: Search Performance Problem 16 sec for 250K docs

M A-2
Yes, there is a new searcher opened each time a search is conducted.

This is because the index is updated every 5 mins or so, due to the incoming
feed of stories ..

When you say iteration, I take it you mean search request; well, for each
search that is conducted I create a new one .. search reader, that is ..




Re: Search Performance Problem 16 sec for 250K docs

Chris Hostetter-3

: This is because the index is updated every 5 mins or so, due to the incoming
: feed of stories ..
:
: When you say iteration, i take it you mean, search request, well for each
: search that is conducted I create a new one .. search reader that is ..

yeah ... i meant iteration of your test.  don't do that.

if the index is updated every 5 minutes, then open a new searcher every 5
minutes -- and reuse it for the entire 5 minutes.  if it's updated
"sporadically throughout the day", then open a searcher and keep using it
until the index is updated, then open a new one.

reusing an IndexSearcher as long as possible is one of the biggest
performance factors in Lucene applications.
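
for illustration, a minimal sketch of that pattern (the helper name is made
up, and it assumes the 2.0-era IndexSearcher(String) constructor and
IndexReader.isCurrent()):

    // sketch: cache one IndexSearcher and reopen it only when the index has changed
    private IndexSearcher cached;   // shared by all search requests

    synchronized IndexSearcher getSearcher() throws IOException {
        if (cached == null || !cached.getIndexReader().isCurrent()) {
            // note: closing the old searcher while other threads still use it is
            // unsafe; a real version would close it once in-flight searches finish
            cached = new IndexSearcher(INDEX_PATH);
        }
        return cached;
    }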



-Hoss



Re: Search Performance Problem 16 sec for 250K docs

M A-2
Ok, I get your point; this still however means the first search on the new
searcher will take a huge amount of time .. given that this is happening now
..

i.e. new searcher -> new query -> get hits -> 20+ secs .. this happens every 5
mins or so ..

although subsequent searches may be quicker ..

Am I to assume that for a first search this amount of time is ok? .. seems like
a long time to me ..?

The other thing is the sorting is fixed .. it never changes .. it is always
sorted by the same field ..

I just built the entire index and it still takes ages ...









Re: Search Performance Problem 16 sec for 250K docs

Erik Hatcher
This is why a warming strategy like the one Solr takes is very valuable.  The
searchable index is always serving up requests as fast as Lucene
works, which is achieved by warming a new IndexSearcher with searches/
sorts/filter creation/etc. before it is swapped into use.
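
For illustration, a rough sketch of that idea (hypothetical field names,
not Solr's actual code):

    // warm a freshly opened searcher before swapping it in for live traffic
    IndexSearcher fresh = new IndexSearcher(INDEX_PATH);

    // run the expensive parts once up front: a canned query plus the sort,
    // so the per-reader FieldCache for "sid" is built before a real user hits it
    fresh.search(new TermQuery(new Term("tags", "apple")), new Sort("sid", true));

    // only now make it the searcher that requests use (closing the old one
    // safely, once it is idle, is left out of this sketch)
    liveSearcher = fresh;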

        Erik



Re: Search Performance Problem 16 sec for 250K docs

M A-2
Just ran some tests .. it appears that the problem is in the sorting ..

i.e.

//hits = searcher.search(query, new Sort("sid", true));    -> 17 secs
//hits = searcher.search(query, new Sort("sid", false)); -> 17 secs
hits = searcher.search(query);    -> less than 1 sec ..

am trying something out .. .. will keep you posted






Re: Search Performance Problem 16 sec for 250K docs

Erick Erickson
In reply to this post by M A-2
About Luke... I don't know about command-line interfaces, but you can copy
your index to a different machine and use Luke there. I do this between
Linux and Windows boxes all the time. Or, if you can mount the remote drive
so you can see it, you can just use Luke to browse to it and open it up. You
may have some latency though.....

See below...

On 8/20/06, M A <[hidden email]> wrote:
>
> Ok I get your point, this still however means the first search on the new
> searcher will take a huge amount of time .. given that this is happening
> now
> ..


You can fire one or several canned queries at the searcher whenever you open
a new one. That way the first time a *user* hits the box, the warm-up will
already have happened. Note that the same searcher can be used by multiple
threads...


i.e. new search -> new query -> get hits ->20+ secs ..  this happens every 5

> mins or so ..
>
> although subsequent searches may be quicker ..
>
> Am i to assume for a first search the amount of  time is ok -> .. seems
> like
> a long time to me ..?
>
> The other thing is the sorting is fixed .. it never changes .. it is
> always
> sorted by the same field ..


Assuming that you still have performance issues, you could think about
building your index in pre-sorted order and just avoiding the sorting
altogether. The internal Lucene document IDs are then your sort order (a newly
added doc has an ID that is always greater than any existing doc ID). I
don't know the details of your problem space, but this might be relatively
easy.... You won't want to return things in relevance order in that case. In
fact, you probably don't want relevance in place at all since your sorting
doesn't change.... I think a ConstantScoreQuery might work for you here.
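
(Purely as a hedged illustration of that last suggestion -- treat the exact
classes as an assumption to verify against your Lucene version -- the idea is
to wrap the parsed query in a filter so per-document scoring is skipped and
ordering comes from the sort alone:)

    // constant score for every hit; newest-first comes from reverse index order
    Query scoreless = new ConstantScoreQuery(new QueryFilter(query));
    hits = searcher.search(scoreless, new Sort(new SortField(null, SortField.DOC, true)));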

But I wouldn't go there unless you have evidence that your sort is slowing
you down, which is easy enough to verify by just taking it out. Don't bother
with any of this until you re-use your reader though....

i just built the entire index and it still takes ages .,..


The search took ages? Or building the index? If the former, then rebuilding
the index is irrelevant, it's the first time you use a searcher that counts.


Re: Search Performance Problem 16 sec for 250K docs

Erick Erickson
Talk about mails crossing in the aether...... wrote my response before seeing
the last two...

Sounds like you're on track.

Erick


Re: Search Performance Problem 16 sec for 250K docs

M A-2
In reply to this post by Erick Erickson
The index is already built in date order, i.e. the older documents appear
first in the index; what I am trying to achieve, however, is the latest
documents appearing first in the search results .. without the sort .. I
think they appear by relevance .. well, that's what it looked like ..

I am looking at the scoring as we speak,




Re: Search Performance Problem 16 sec for 250K docs

M A-2
Ok, this is what I have done so far ->

static class MyIndexSearcher extends IndexSearcher {
    IndexReader reader = null;

    public MyIndexSearcher(IndexReader r) {
        super(r);
        reader = r;
    }

    public void search(Weight weight, org.apache.lucene.search.Filter filter,
                       final HitCollector results) throws IOException {
        // note: the filter argument is not applied in this override
        HitCollector collector = new HitCollector() {
            public final void collect(int doc, float score) {
                try {
                    // System.err.println(" doc " + doc + " score " + score);
                    // use the stored "sid" (epoch millis) as the score, so newer
                    // stories rank higher -- this reads a stored field per hit
                    String str = reader.document(doc).get("sid");
                    results.collect(doc, Float.parseFloat(str));
                } catch (Exception e) {
                    // ignore documents whose "sid" is missing or unparsable
                }
            }
        };

        Scorer scorer = weight.scorer(reader);
        if (scorer == null)
            return;
        scorer.score(collector);
    }
}


Which is essentially an overridden method; although it's not fully optimized and I'm
sure there is a way to make it quicker, my timing has gone down to sub-5
secs a query -- not ideal, but definitely better than what I was getting before
..

In fact some searches now complete in under a sec .. which is a definite
result ..

The reason for doing it this way is simple .. the field sid stores a long
value that is the epoch, therefore the larger this value the more recent the
story and hence .. the higher it should be in the ranking ..

I guess the only bottleneck now is reading the value from the field .. since
for the multi-field queries collect(int doc, float score) gets called
a hell of a lot of times ..

now just have to find a way to eliminate low-scoring ones .. and I am set ..


Thanx





Re: Search Performance Problem 16 sec for 250K docs

Yonik Seeley-2
>          public void search(Weight weight,
> org.apache.lucene.search.Filterfilter, final HitCollector results)
> throws IOException {
>              HitCollector collector = new HitCollector() {
>                  public final void collect(int doc, float score) {
>                      try {
>                          String str = reader.document(doc).get("sid");
>                          results.collect(doc, Float.parseFloat(str));
>                      } catch(Exception e) {

Ahhh... that explains things.
Retrieving documents is much slower than using Lucene's indices.
If you want to do something like this, use FunctionQuery or use the FieldCache.
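
For illustration, a hedged sketch of the FieldCache route (assumes the
2.0-era FieldCache.DEFAULT.getStrings(); the array is built once per
IndexReader and reused, so collect() becomes a plain array lookup):

    // load every doc's "sid" once per reader, instead of reader.document(doc) per hit
    final String[] sids = FieldCache.DEFAULT.getStrings(reader, "sid");

    HitCollector collector = new HitCollector() {
        public final void collect(int doc, float score) {
            // in-memory lookup only -- no stored-field access per hit
            // (float keeps ~7 significant digits, so epoch millis lose some
            //  precision here, just as with the original Float.parseFloat)
            results.collect(doc, Float.parseFloat(sids[doc]));
        }
    };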

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server


Re: Search Performance Problem 16 sec for 250K docs

Yonik Seeley-2
In reply to this post by M A-2
On 8/20/06, M A <[hidden email]> wrote:
> The index is already built in date order i.e. the older documents appear
> first in the index, what i am trying to achieve is however the latest
> documents appearing first in the search results ..  without the sort .. i
> think they appear by relevance .. well thats what it looked like ..

You can specify a Sort by internal lucene docid (forward or reverse).
That's your fastest and least memory intensive option if the docs are
indexed in date order.


-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server


Re: Search Performance Problem 16 sec for 250K docs

M A-2
Yeah, I tried looking this up.

If I wanted to do it by document id (highest docs first), does this mean
doing something like

hits = searcher.search(query, new Sort(new SortField(DOC, true))); // or
something like that,

is this way of sorting any different performance-wise to what I was doing
before ..







Re: Search Performance Problem 16 sec for 250K docs

Yonik Seeley-2
On 8/21/06, M A <[hidden email]> wrote:

> Yeah I tried looking this up,
>
> If i wanted to do it by document id (highest docs first) , does this mean
> doing something like
>
> hits = searcher.search(query, new Sort(new SortFeild(DOC, true); // or
> something like that,
>
> is this way of sorting any different performance wise to what i was doing
> before ..

Definitely a lot faster if you don't warm up and re-use your searchers.
Sorting by docid doesn't require the FieldCache, so you don't get the
first-search penalty.

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server


Re: Search Performance Problem 16 sec for 250K docs

M A-2
I still don't get this. How would I do this, so I can try it out ..

is

searcher.search(query, new Sort(SortField.DOC))

..correct? This would return stuff in the order of the documents, so how
would I reverse this, I mean the later documents appearing first ..

searcher.search(query, new Sort(????))

How do you get document number descending .. ?? for the sort, that is




Re: Search Performance Problem 16 sec for 250K docs

Yonik Seeley-2
On 8/21/06, M A <[hidden email]> wrote:
> I still dont get this,  How would i do this, so i can try it out ..

http://lucene.apache.org/java/docs/api/org/apache/lucene/search/SortField.html#SortField(java.lang.String,%20int,%20boolean)

new Sort(new SortField(null, SortField.DOC, true))


-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server


Re: Search Performance Problem 16 sec for 250K docs

M A-2
It is indeed a lot faster ...

Will use that one now ..

hits = searcher.search(query, new Sort(new SortField(null, SortField.DOC, true)));


That is completing in under a sec for pretty much all the queries ..



