[lucy-user] document IDs


Anil Pachuri
Hi,

Please suggest how to get the doc IDs of hits. I am assuming Lucy assigns an internal ID (starting with 0?) to each document in the index.

my $searcher = Lucy::Search::IndexSearcher->new(
    index => $path_to_index,
);

my $qparser = Lucy::Search::QueryParser->new(
    schema         => $searcher->get_schema,
    default_boolop => 'OR',
);

my $query = $qparser->parse('query');
my $hits = $searcher->hits(
    query => $query,
);

my $hit_count = $hits->total_hits;

# something like:
my $docIDs = $hits->doc_ids;

Thanks!

Re: [lucy-user] document IDs

Peter Karman
On 9/24/13 12:44 PM, Anil Pachuri wrote:

 > # something like:
 > my $docIDs = $hits->doc_ids;
 >

my @doc_ids;

while ( my $hit_doc = $hits->next ) {
     push @doc_ids, $hit_doc->get_doc_id();
}

https://metacpan.org/module/Lucy::Document::Doc

--
Peter Karman  .  http://peknet.com/  .  [hidden email]

Re: [lucy-user] document IDs

Anil Pachuri
Great, thank you very much, Peter.




Re: [lucy-user] document IDs

Anil Pachuri
Hi Peter,


When I run the program below with different queries, $hit_count changes with the query, but @doc_ids always contains exactly 10 elements. Maybe I am missing something.
Thanks, Anil


my $searcher = Lucy::Search::IndexSearcher->new(
    index => $path_to_index,
);

my $qparser = Lucy::Search::QueryParser->new(
    schema         => $searcher->get_schema,
    default_boolop => 'OR',
);

my $query = $qparser->parse('query');

my $hits = $searcher->hits(
    query => $query,
);

my $hit_count = $hits->total_hits;
print "$hit_count\n";

my @doc_ids;
while ( my $hit_doc = $hits->next ) {
    push @doc_ids, $hit_doc->get_doc_id();
}

print "@doc_ids\n";






Re: [lucy-user] document IDs

Peter Karman
On 9/25/13 2:50 PM, Anil Pachuri wrote:
> Hi Peter,
>
> When I run the program below for different queries, it gives me
> different values for $hit_count and number of elements in the array
> @doc_ids . @doc_ids always shows 10 elements even for different queries,
> while $hit_count value changes with query. May be I am missing something.

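You want the num_wanted param for hits(). It defaults to 10, which is why @doc_ids never holds more than 10 IDs while total_hits still counts every match.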


> Thanks, Anil
>
> my $searcher = Lucy::Search::IndexSearcher->new(
> index => $path_to_index,
> );
>
> my $qparser = Lucy::Search::QueryParser->new(
> schema => $searcher->get_schema,
> default_boolop => 'OR',
> );
>
> my $query = $qparser->parse('query');
>
> my $hits = $searcher->hits(
> query => $query,

   num_wanted => 1_000_000,   # really big number

> );
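
As an alternative to hard-coding a big number, one option is to ask the searcher how many documents it holds and pass that as num_wanted. The following is only a minimal sketch, assuming the $searcher and $query objects from the quoted code; all_doc_ids is a hypothetical helper name, not part of Lucy. It relies on doc_max(), which the Lucy::Search::Searcher docs describe as the maximum number of docs in the collection, so it is an upper bound on the number of hits.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lucy): return the internal doc ID of
# every document matching $query, rather than only the first 10.
sub all_doc_ids {
    my ( $searcher, $query ) = @_;

    my $hits = $searcher->hits(
        query      => $query,
        # doc_max() is an upper bound on how many docs can match.
        num_wanted => $searcher->doc_max,
    );

    # Drain the Hits iterator, keeping only each hit's doc ID.
    my @doc_ids;
    while ( my $hit_doc = $hits->next ) {
        push @doc_ids, $hit_doc->get_doc_id;
    }
    return @doc_ids;
}
```

Called as my @doc_ids = all_doc_ids( $searcher, $query ); the array should then have as many elements as total_hits reports. On a very large index this materializes every match at once, so a fixed cap may still be the better choice.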





--
Peter Karman  .  http://peknet.com/  .  [hidden email]

Re: [lucy-user] document IDs

Anil Pachuri
Super, that worked. Thanks a lot, Peter.

