Calculating idf across multiple indexes

Calculating idf across multiple indexes

yahootintin.11533894
Hi,

Due to the size of my index, I need to split it into several separate
indexes.  I have a service that receives a query from the user, contacts each
index searcher service asynchronously, and waits for the results.  The results
are then collated and returned to the user.

The problem is that the idf isn't being calculated correctly, because each
index searcher service only sees the document frequencies in its own index
and doesn't know the total document frequency for each term.  How are others
working around this issue?
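
A minimal sketch of the effect described above, assuming the classic Lucene
idf formula, 1 + ln(numDocs / (docFreq + 1)). The class name IdfSkewDemo and
all document counts are hypothetical illustration values, not figures from
this thread.

```java
// Illustration only: per-index idf versus idf over the combined collection.
// Assumes idf = 1 + ln(numDocs / (docFreq + 1)).
class IdfSkewDemo {

    // idf for a term that occurs in docFreq out of numDocs documents.
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // Hypothetical numbers: the term is common in index A and rare in index B.
        int numDocsA = 1_000_000, docFreqA = 50_000;
        int numDocsB = 1_000_000, docFreqB = 500;

        double idfA = idf(docFreqA, numDocsA);
        double idfB = idf(docFreqB, numDocsB);
        // What a single index holding all documents would compute.
        double idfCombined = idf(docFreqA + docFreqB, numDocsA + numDocsB);

        System.out.printf("index A:  %.3f%n", idfA);
        System.out.printf("index B:  %.3f%n", idfB);
        System.out.printf("combined: %.3f%n", idfCombined);
    }
}
```

With these numbers index A weights the term lower than the combined
collection would and index B weights it higher, so the scores the two
services return are not directly comparable when collated.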



Thanks.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: Calculating idf across multiple indexes

Daniel Naber
On Tuesday 07 June 2005 00:02, [hidden email] wrote:

>  How are others working around
> this issue?

This has been fixed in the development version of Lucene. It's already
quite stable, so I suggest trying it (it needs to be checked out from SVN).

Regards
 Daniel

--
http://www.danielnaber.de

Re: Calculating idf across multiple indexes

yahootintin.11533894
Hi Daniel,

The problem is that if I tell Lucene about only one of the indexes, it has
no way of knowing what the total document frequency is across the other
index servers.

Does that make sense?  I think my collator will need to calculate the idf
somehow.
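
One way a collator could do this is sketched below, under the assumption
that every index server can report its document count and the document
frequency of each query term. ShardStats and GlobalIdf are hypothetical
names, not Lucene classes, and the formula is the classic
1 + ln(numDocs / (docFreq + 1)).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Statistics one index server would report (hypothetical type).
class ShardStats {
    final int numDocs;                   // documents in this index
    final Map<String, Integer> docFreq;  // term -> documents containing it

    ShardStats(int numDocs, Map<String, Integer> docFreq) {
        this.numDocs = numDocs;
        this.docFreq = docFreq;
    }
}

// Collator-side idf over the union of all indexes (hypothetical helper).
class GlobalIdf {

    // Sums document counts and per-term document frequencies over all shards,
    // then applies idf = 1 + ln(totalDocs / (totalDocFreq + 1)) per query term.
    static Map<String, Double> compute(List<String> queryTerms, List<ShardStats> shards) {
        long totalDocs = 0;
        for (ShardStats s : shards) {
            totalDocs += s.numDocs;
        }
        Map<String, Double> idf = new HashMap<>();
        for (String term : queryTerms) {
            long totalDocFreq = 0;
            for (ShardStats s : shards) {
                // A shard that never saw the term contributes zero.
                totalDocFreq += s.docFreq.getOrDefault(term, 0);
            }
            idf.put(term, 1.0 + Math.log(totalDocs / (double) (totalDocFreq + 1)));
        }
        return idf;
    }
}
```

How the resulting weights would be applied on each index server (for
example, sent along with the query) is left open here; the sketch only
covers the aggregation step.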



Thanks.



Re: Calculating idf across multiple indexes

Daniel Naber
On Tuesday 07 June 2005 00:49, [hidden email] wrote:

> The problem is that if I tell Lucene about only one of the indexes
> it has no way of knowing what the total document frequency is across the
> other index servers.

Can't you use ParallelMultiSearcher and/or RemoteSearchable?

Regards
 Daniel

--
http://www.danielnaber.de

Re: Calculating idf across multiple indexes

yahootintin.11533894
Hmmm... I'll look into that.  I thought MultiSearcher would still need
access to each index.  Does RemoteSearchable avoid that?  Will it allow me
to delegate searching to multiple boxes and then collate the results
correctly?
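
For reference, the collation step itself can be sketched as a plain merge
of per-server result lists. This is an illustration under the assumption
that the scores are already comparable, i.e. every server scored with the
same idf; Hit and ResultCollator are hypothetical names, not Lucene classes.

```java
import java.util.ArrayList;
import java.util.List;

// One search result as returned by an index server (hypothetical type).
class Hit {
    final String shard;  // which index server returned the hit
    final int docId;     // document id local to that server
    final float score;   // assumed comparable across servers

    Hit(String shard, int docId, float score) {
        this.shard = shard;
        this.docId = docId;
        this.score = score;
    }
}

class ResultCollator {

    // Merges the per-server result lists and returns the k highest-scoring hits.
    static List<Hit> topK(List<List<Hit>> perShardHits, int k) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perShardHits) {
            all.addAll(hits);
        }
        // Highest score first; ties broken by shard name, then doc id,
        // so the merged order is deterministic.
        all.sort((a, b) -> {
            int c = Float.compare(b.score, a.score);
            if (c != 0) return c;
            c = a.shard.compareTo(b.shard);
            if (c != 0) return c;
            return Integer.compare(a.docId, b.docId);
        });
        int limit = Math.min(Math.max(k, 0), all.size());
        return new ArrayList<>(all.subList(0, limit));
    }
}
```

If each server computed idf from its own index only, this merge would still
run, but the ranking across servers would not be meaningful.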



Thanks for the tip about the RemoteSearchable.


