List of indexed terms for a field

Paul Terray
Hello,

I am trying Solr for some projects and I am very impressed by its simplicity
and clarity of use.

I am trying to make an index: Is there any way to get a list of all indexed
terms for a field (especially a string or text one)?

Thanks.

Paul Terray
Pre-Sales Consultant, SOLLAN
27 bis, rue du Progrès, 93100 Montreuil, France
www.sollan.com

Re: List of indexed terms for a field

Tim Archambault-2
Great question. Please share any answers: I'd like to use this for a
"Google Suggest"-style Ajax scenario.


Re: List of indexed terms for a field

Yonik Seeley
In reply to this post by Paul Terray
On 6/7/06, Paul Terray <[hidden email]> wrote:
> I am trying to make an index: Is there any way to get a list of all indexed
> terms for a field (especially a string or text one)?

Hi Paul,
There isn't currently a way to do this, except perhaps by writing your
own custom request handler and using the lower-level Lucene TermEnum
(from IndexReader.terms()) after getting your hands on the underlying
IndexReader.
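
To make the idea concrete without depending on Lucene, here is a minimal Python sketch of what enumerating a field's terms amounts to. It is not Lucene or Solr code: the toy documents, the whitespace tokenizer, and the function names `field_terms` and `top_terms` are all assumptions made for illustration. A real request handler would walk the index's term dictionary through the IndexReader rather than re-tokenizing documents.

```python
from collections import Counter

# Hypothetical toy corpus: each document maps field names to text.
DOCS = [
    {"title": "solr search server", "body": "fast search"},
    {"title": "lucene search library", "body": "java library"},
    {"title": "solr faceting", "body": "facet counts"},
]


def field_terms(docs, field):
    """Return {term: document frequency} for one field.

    Assumes a naive lowercase/whitespace tokenizer; a real index would
    already hold these terms, sorted, in its term dictionary.
    """
    doc_freq = Counter()
    for doc in docs:
        # set() so a term repeated inside one document counts once
        doc_freq.update(set(doc.get(field, "").lower().split()))
    return dict(doc_freq)


def top_terms(docs, field, n):
    """Return the n most frequent terms, ties broken alphabetically."""
    freqs = field_terms(docs, field)
    return sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


if __name__ == "__main__":
    print(sorted(field_terms(DOCS, "title")))  # every term of the field
    print(top_terms(DOCS, "title", 2))         # the two most frequent
```

Ranking by document frequency is one possible reading of "top" terms; ranking by total occurrences would be an equally reasonable choice.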

This feature has been on my wish-list though.
There needs to be a syntax to request info like this, and then the
implementation.

Perhaps something along the lines of a function syntax:

@top10=terms("myfield",10)
  // request top 10 terms of "myfield", and return result under "top10"

So then the XML result from Solr would have something like this at the end:
<arr name="top10"><str>term1</str><str>term2</str><str>term3</str></arr>


@top10=termFreqs("myfield",10)   // request top 10 terms and their frequencies
Returns:
<arr name="top10"><str>term1</str><int>142</int>...
  OR
<lst name="top10"><int name="term1">142</int>...
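
The practical difference between the two shapes shows up on the client side. The following Python sketch, using only the standard library, reads either one into a list of (term, count) pairs. The sample payloads are modeled on the fragments above; the second entry (`term2`, 97) and the function name `parse_term_freqs` are made up for illustration, since no such response existed in Solr at the time.

```python
import xml.etree.ElementTree as ET

# Sample payloads modeled on the two proposed shapes. The second term
# and its count are invented sample values.
FLAT = ('<arr name="top10"><str>term1</str><int>142</int>'
        '<str>term2</str><int>97</int></arr>')
NAMED = ('<lst name="top10"><int name="term1">142</int>'
         '<int name="term2">97</int></lst>')


def parse_term_freqs(xml_text):
    """Return a list of (term, count) pairs from either proposed shape."""
    root = ET.fromstring(xml_text)
    if root.tag == "arr":
        # Flat form: children alternate <str>term</str><int>count</int>,
        # so the client has to pair them up by position.
        children = list(root)
        return [(term.text, int(count.text))
                for term, count in zip(children[0::2], children[1::2])]
    if root.tag == "lst":
        # Named form: each <int> carries its term in the name attribute.
        return [(el.get("name"), int(el.text)) for el in root]
    raise ValueError(f"unexpected element <{root.tag}>")


if __name__ == "__main__":
    print(parse_term_freqs(FLAT))
    print(parse_term_freqs(NAMED))
```

Both forms carry the same information. The named form is self-describing, while the flat form relies on the client pairing elements by position.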


-Yonik

Re: List of indexed terms for a field

Erik Hatcher
In reply to this post by Paul Terray

On Jun 7, 2006, at 3:45 AM, Paul Terray wrote:
> I am trying Solr for some projects and I am very impressed by its
> simplicity and clarity of use.
>
> I am trying to make an index: Is there any way to get a list of all
> indexed terms for a field (especially a string or text one)?

Solr does not do this out of the box, but its core architecture makes
it easy to add.

I've built a Google-Suggest-like drop-down that does this very thing. All
of my Solr code is currently going here:

        <http://svn.sourceforge.net/viewcvs.cgi/patacriticism/nines/trunk/src/solr/org/nines/>

See FacetedSearchRequestHandler.java in particular, in the branch where
(prefix != null). In this case it does something a little interesting:
a RAMDirectory is built into a custom Solr cache that indexes people's
names with tokenization and then returns just the names (not
"documents").
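
The general technique described here (tokenize names, match a typed prefix against any token, return whole names) can be sketched in a few lines of Python. This is not the NINES code and does not use Lucene: a sorted in-memory list stands in for the RAMDirectory, and the class name `NameSuggester`, the whitespace tokenizer, and the sample names are all assumptions for illustration.

```python
import bisect


class NameSuggester:
    """In-memory prefix lookup over tokenized names.

    A stand-in for the RAMDirectory-backed cache described above,
    not the actual implementation.
    """

    def __init__(self, names):
        # One (token, full name) entry per word of each name, sorted so
        # that all tokens sharing a prefix sit next to each other.
        self._entries = sorted(
            (token, name)
            for name in names
            for token in name.lower().split()
        )

    def suggest(self, prefix, limit=10):
        """Return up to `limit` names having a token that starts with prefix."""
        prefix = prefix.lower()
        # Position of the first entry whose token is >= prefix
        start = bisect.bisect_left(self._entries, (prefix,))
        matches = []
        for token, name in self._entries[start:]:
            if not token.startswith(prefix):
                break  # sorted order: no later token can match
            if name not in matches:
                matches.append(name)
            if len(matches) == limit:
                break
        return matches


if __name__ == "__main__":
    # Sample names chosen only for the demonstration
    suggester = NameSuggester([
        "Dante Gabriel Rossetti",
        "Christina Rossetti",
        "William Blake",
        "Robert Browning",
    ])
    print(suggester.suggest("ross"))
    print(suggester.suggest("b"))
```

Because each word of a name is indexed separately, typing the start of a first name, middle name, or surname finds the same entry, and the result is the name itself rather than a matching document.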

Note: the code contains domain-specific pieces at the moment. The grand
idea is to build these kinds of features into Solr once they are proven
in the field.

        Erik


RE: List of indexed terms for a field

Paul Terray
In reply to this post by Yonik Seeley
Thanks for the answer.

This is not a need for the moment, but it could be in the near future.

If it becomes so, I will see how we can implement such a thing.

As for the syntax, I would rather see it as another request parameter (and
maybe a separate URL, since the function is clearly different).

Something like:
http://localhost:8983/solr/terms/?fl=myfield&rows=10
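
From the client side, a request of this form could be assembled as in the Python sketch below. The `/solr/terms/` endpoint and its `fl` and `rows` parameters are only the proposal made in this message, not an existing Solr handler, and `build_terms_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Hypothetical endpoint taken from the proposal above; no such handler
# existed in Solr at the time of this thread.
TERMS_URL = "http://localhost:8983/solr/terms/"


def build_terms_url(field, rows, base=TERMS_URL):
    """Build a request URL of the proposed form ?fl=<field>&rows=<n>."""
    # urlencode escapes field names that contain reserved characters
    return base + "?" + urlencode({"fl": field, "rows": rows})


if __name__ == "__main__":
    print(build_terms_url("myfield", 10))
```

Reusing `fl` and `rows` keeps the request close to an ordinary Solr query, at the cost of giving those parameters a slightly different meaning on this URL.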

But perhaps I am completely off course (I am no Java developer, sorry).