Getting a document by primary key

Getting a document by primary key

Jonathan Leibiusky
I'm developing my own request handler, and given a document's primary key I
would like to get it from the index.
What is the best and fastest way to do this? I will execute this request
handler many times, so it needs to be really fast.
Sorry if it's a basic question.

Thanks!

Jonathan

Re: Getting a document by primary key

Marc Sturlese
Hey there,
I am doing the same thing and I am experiencing some trouble. I get the document data by searching by term. The problem is that when I do it many times (inside a huge for loop), the app's memory use keeps increasing until almost all of the memory is used...
Did you find any other way to do that?

Jonathan Ariel wrote
I'm developing my own request handler, and given a document's primary key I
would like to get it from the index.
What is the best and fastest way to do this? I will execute this request
handler many times, so it needs to be really fast.
Sorry if it's a basic question.

Thanks!

Jonathan

Re: Getting a document by primary key

Yonik Seeley-2
On Sun, Nov 2, 2008 at 8:09 PM, Marc Sturlese <[hidden email]> wrote:
> I am doing the same thing and I am experiencing some trouble. I get the
> document data by searching by term. The problem is that when I do it many
> times (inside a huge for loop), the app's memory use keeps increasing until
> almost all of the memory is used...

That just sounds like the way Java's garbage collection tends to
work... do you ever run out of memory (and get an exception)?

-Yonik

Re: Getting a document by primary key

Marc Sturlese
Hey there,
I never run out of memory, but I think the app is always running at the limit... The problem seems to be in this code (searching by term):
try {
    indexSearcher = new IndexSearcher(path_index) ;

    QueryParser queryParser = new QueryParser("id_field", getAnalyzer(stopWordsFile)) ;
    Query query = queryParser.parse(query_string) ;

    Hits hits = indexSearcher.search(query) ;

    if (hits.length() > 0) {
        doc = hits.doc(0) ;
    }
} catch (Exception ex) {

} finally {
    if (indexSearcher != null) {
        try {
            indexSearcher.close() ;
        } catch (Exception e) {}
        indexSearcher = null ;
    }
}

As Hits is deprecated I tried to use TermDocs and TopDocs... but the memory problem never disappeared...
If I call the garbage collector every time I run the code above, the memory doesn't increase indefinitely, but... the app works really slowly.
Any suggestions?
Thanks for replying!

Yonik Seeley wrote
On Sun, Nov 2, 2008 at 8:09 PM, Marc Sturlese <marc.sturlese@gmail.com> wrote:
> I am doing the same thing and I am experiencing some trouble. I get the
> document data by searching by term. The problem is that when I do it many
> times (inside a huge for loop), the app's memory use keeps increasing until
> almost all of the memory is used...

That just sounds like the way Java's garbage collection tends to
work... do you ever run out of memory (and get an exception)?

-Yonik

Re: Getting a document by primary key

Yonik Seeley-2
On Mon, Nov 3, 2008 at 2:40 PM, Marc Sturlese <[hidden email]> wrote:
> As Hits is deprecated I tried to use TermDocs and TopDocs...

Try using searcher.getFirstMatch(t) as Jonathan is.  It should be
faster than Hits.

> but the memory
> problem never disappeared...
> If I call the garbage collector every time I run the code above, the memory
> doesn't increase indefinitely, but... the app works really slowly.

That's really just the way Java GC works.  Don't call GC explicitly,
just pick the max amount of memory you need to give to the JVM and let
it handle the rest.

-Yonik
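
For reference, here is a rough sketch of that kind of lookup inside a custom Solr request handler, assuming the handler already has the SolrQueryRequest in hand and the schema declares a unique key field. The class and method names below are placeholders for illustration, not code from this thread:

import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.Term;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.SolrIndexSearcher;

public class PrimaryKeyLookup {
    /** Returns the stored document whose unique key equals id, or null if nothing matches. */
    public static Document byPrimaryKey(SolrQueryRequest req, String id) throws IOException {
        SolrIndexSearcher searcher = req.getSearcher();   // Solr's shared searcher; do not close it here
        String keyField = req.getSchema().getUniqueKeyField().getName();
        int docId = searcher.getFirstMatch(new Term(keyField, id));
        return (docId < 0) ? null : searcher.doc(docId);
    }
}

As for the memory ceiling Yonik mentions, that is just a JVM startup flag, e.g. -Xmx512m.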

Re: Getting a document by primary key

Otis Gospodnetic-2
In reply to this post by Marc Sturlese
Is this your code or something from Solr?
That indexSearcher = new IndexSearcher(path_index) ; is very suspicious looking.
Are you creating a new IndexSearcher for every search request?  If so, that's the cause of your memory problem.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
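
For the non-Solr case, a minimal sketch of the "open once, reuse" pattern Otis is hinting at, assuming a single on-disk index that is not being rewritten while the app runs; the class and variable names are placeholders:

import java.io.IOException;

import org.apache.lucene.search.IndexSearcher;

public class SharedSearcher {
    private final IndexSearcher searcher;

    public SharedSearcher(String indexPath) throws IOException {
        // Open the searcher once and reuse it for every lookup;
        // reopen only when the index actually changes.
        searcher = new IndexSearcher(indexPath);
    }

    public IndexSearcher get() {
        return searcher;
    }

    public void close() throws IOException {
        searcher.close();
    }
}

Every new IndexSearcher opens the index and builds its own internal structures, so constructing one per request is what drives both the memory churn and the slowness described above.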




Re: Getting a document by primary key

Yonik Seeley-2
On Mon, Nov 3, 2008 at 2:49 PM, Otis Gospodnetic
<[hidden email]> wrote:
> Is this your code or something from Solr?
> That indexSearcher = new IndexSearcher(path_index) ; is very suspicious looking.

Good point... if this is a Solr plugin, then get the SolrIndexSearcher
from the request object.
If it's not Solr, then use termenum/termdocs (and post to the right list ;-)

-Yonik
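
A rough sketch of that TermDocs approach in plain Lucene, again assuming an IndexReader that is opened once and reused; the "id_field" name is borrowed from the code earlier in the thread and the class name is a placeholder:

import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;

public class TermLookup {
    /** Returns the first document whose id_field equals id, or null if none matches. */
    public static Document byTerm(IndexReader reader, String id) throws IOException {
        TermDocs termDocs = reader.termDocs(new Term("id_field", id));
        try {
            return termDocs.next() ? reader.document(termDocs.doc()) : null;
        } finally {
            termDocs.close();
        }
    }
}

Because the id field is the primary key, at most one document should match, so taking the first hit is enough.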

Re: Getting a document by primary key

Marc Sturlese
Hey, you are right.
I'm trying to migrate my app to Solr. For the moment I am using Solr for the searching part of the app, but I am using my own Lucene app for indexing. I should have posted about this problem on the Lucene list. Sorry about that.
I am trying to use TermDocs properly now.
Thanks for your advice.

Marc
Yonik Seeley wrote
On Mon, Nov 3, 2008 at 2:49 PM, Otis Gospodnetic
<otis_gospodnetic@yahoo.com> wrote:
> Is this your code or something from Solr?
> That indexSearcher = new IndexSearcher(path_index) ; is very suspicious looking.

Good point... if this is a Solr plugin, then get the SolrIndexSearcher
from the request object.
If it's not Solr, then use termenum/termdocs (and post to the right list ;-)

-Yonik