Solr statistics of top searches and results returned

Solr statistics of top searches and results returned

solrpowr
Hi,

Besides my own offline processing via logs, does Solr have the functionality to give me statistics such as top searches, how many results were returned for these searches, and/or how long it took to get these results on average?


Thanks,
Bob

Re: Solr statistics of top searches and results returned

Shalin Shekhar Mangar
On Tue, May 19, 2009 at 11:50 PM, solrpowr <[hidden email]> wrote:

> Besides my own offline processing via logs, does solr have the
> functionality to give me statistics such as top searches, how many
> results were returned on these searches, and/or how long it took to
> get these results on average.
>
You can see the statistics page (see the /select section) which will tell
you average queries per second and average time per query.

There's no option to show top searches as of now.

--
Regards,
Shalin Shekhar Mangar.

RE: Solr statistics of top searches and results returned

Plaatje, Patrick
In reply to this post by solrpowr
Hi,

At the moment Solr does not have such functionality. I have written a plugin for Solr though which uses a second Solr core to store/index the searches. If you're interested, send me an email and I'll get you the source for the plugin.

Regards,

Patrick

--
View this message in context: http://www.nabble.com/Solr-statistics-of-top-searches-and-results-returned-tp23621779p23621779.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr statistics of top searches and results returned

Shalin Shekhar Mangar
On Wed, May 20, 2009 at 1:31 PM, Plaatje, Patrick <[hidden email]> wrote:

> At the moment Solr does not have such functionality. I have written a
> plugin for Solr though which uses a second Solr core to store/index the
> searches. If you're interested, send me an email and I'll get you the
> source for the plugin.
>
Patrick, this will be a useful addition. However, instead of doing this with
another core, we could keep running statistics which can be shown on the
statistics page itself. What do you think?

A related approach for showing slow queries was discussed recently. There's
an issue open which has more details:

https://issues.apache.org/jira/browse/SOLR-1101

--
Regards,
Shalin Shekhar Mangar.

RE: Solr statistics of top searches and results returned

Plaatje, Patrick
Hi Shalin,

Let me investigate. I think the challenge will be in storing/managing these statistics. I'll get back to the list when I have thought of something.

Rgrds,

Patrick


Re: Solr statistics of top searches and results returned

Grant Ingersoll-2
In reply to this post by Shalin Shekhar Mangar

On May 20, 2009, at 4:33 AM, Shalin Shekhar Mangar wrote:

> On Wed, May 20, 2009 at 1:31 PM, Plaatje, Patrick <[hidden email]> wrote:
>
>> At the moment Solr does not have such functionality. I have written a
>> plugin for Solr though which uses a second Solr core to store/index the
>> searches. If you're interested, send me an email and I'll get you the
>> source for the plugin.
>
> Patrick, this will be a useful addition. However instead of doing this
> with another core, we can keep running statistics which can be shown on
> the statistics page itself. What do you think?

I think you will want some type of persistence mechanism otherwise you  
will end up consuming a lot of resources keeping track of all the  
query strings, unless I'm missing something.  Either a Lucene index  
(Solr core) or the option of embedding a DB.  Ideally, it would be  
pluggable such that people could choose their storage mechanism.  Most  
people do this kind of thing offline via log analysis as logs can grow  
quite large quite quickly.
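
For reference, the offline log analysis mentioned above could look roughly like the following Python sketch. It assumes request log lines of the shape `path=/select params={q=...} hits=N status=0 QTime=M`; the exact format depends on the Solr version and logging setup, so the pattern is an assumption to adapt, and `summarize` is a made-up helper name, not part of Solr.

```python
import re
from collections import defaultdict
from urllib.parse import parse_qs

# Assumed shape of a Solr request log line; adjust to your own log format.
LINE_RE = re.compile(
    r"path=/select\S*\s+params=\{(?P<params>.*)\}\s+hits=(?P<hits>\d+)"
    r"\s+status=\d+\s+QTime=(?P<qtime>\d+)"
)


def summarize(lines, top=10):
    """Return (query, count, avg_hits, avg_qtime_ms) for the most frequent queries."""
    totals = defaultdict(lambda: [0, 0, 0])  # query -> [count, hits, qtime]
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not a /select request line
        # Normalise case and whitespace so "iPod" and "ipod" count as one query.
        q = parse_qs(m.group("params")).get("q", [""])[0].strip().lower()
        if not q:
            continue
        t = totals[q]
        t[0] += 1
        t[1] += int(m.group("hits"))
        t[2] += int(m.group("qtime"))
    # Most frequent first; ties broken alphabetically for stable output.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1][0], kv[0]))
    return [(q, c, h / c, ms / c) for q, (c, h, ms) in ranked[:top]]
```

Feeding it an open log file (`summarize(open("solr.log"))`, with a hypothetical file name) would give top searches, average result counts, and average query times in one pass.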


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/



Re: Solr statistics of top searches and results returned

Shalin Shekhar Mangar
On Fri, May 22, 2009 at 3:22 AM, Grant Ingersoll <[hidden email]> wrote:

>
> I think you will want some type of persistence mechanism otherwise you will
> end up consuming a lot of resources keeping track of all the query strings,
> unless I'm missing something.  Either a Lucene index (Solr core) or the
> option of embedding a DB.  Ideally, it would be pluggable such that people
> could choose their storage mechanism.  Most people do this kind of thing
> offline via log analysis as logs can grow quite large quite quickly.
>

For a general case, yes. But I was thinking more of a top 'n' queries as a
running statistic.
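
One way a bounded-memory top 'n' running statistic might be kept is sketched below in Python, in the style of the Space-Saving algorithm: only a fixed number of queries are tracked, and when the table is full the least frequent entry is replaced. This is an illustration only, not anything that exists in Solr; the class name `TopQueries` and its `capacity` parameter are made up, and the counts it reports are approximate (they can overestimate).

```python
class TopQueries:
    """Approximate top-n query counter tracking at most `capacity` queries."""

    def __init__(self, capacity=100):
        if capacity < 1:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.counts = {}

    def record(self, query):
        key = query.strip().lower()
        if key in self.counts:
            self.counts[key] += 1
        elif len(self.counts) < self.capacity:
            self.counts[key] = 1
        else:
            # Table is full: evict the least frequent query and let the
            # newcomer inherit its count plus one. Counts may therefore
            # overestimate, but memory use stays bounded.
            victim = min(self.counts, key=self.counts.get)
            self.counts[key] = self.counts.pop(victim) + 1

    def top(self, n=10):
        """Return up to n (query, count) pairs, most frequent first."""
        return sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

The sketch is not thread-safe; inside a search server the `record` call would need a lock or a single writer thread.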

--
Regards,
Shalin Shekhar Mangar.

Re: Solr statistics of top searches and results returned

Umar Shah
Hi,

Good feature to have.
Maintaining the top N would also require storing all the search queries
done so far and keeping them updated (or at least those within some time window).

Having pluggable persistent storage for all-time search queries would be great.

Tell me, how can I help?

-umar


RE: Solr statistics of top searches and results returned

Plaatje, Patrick
Hi all,

I created a script that uses a Solr Search Component, which hooks into the main Solr core and catches the searches being done. After this, it tokenizes the search and sends both the tokenized and the original query to another Solr core. I have not written a factory for this, but if required, it shouldn't be hard to modify the script and code database support into it.
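
To make the idea concrete, here is a rough Python sketch of the kind of document such a component might send to the second core. It is not taken from the plugin: the field names (`query`, `tokens`, `timestamp`, `hits`) and the function `build_search_log_doc` are hypothetical, and the regex tokenization merely stands in for whatever analyzer the real component uses.

```python
import re
from datetime import datetime, timezone


def build_search_log_doc(raw_query, hits=None, now=None):
    """Build a document describing one search, ready to index in a stats core."""
    now = now or datetime.now(timezone.utc)
    # Stand-in tokenizer: lowercase word characters only (an assumption).
    tokens = re.findall(r"\w+", raw_query.lower())
    doc = {
        "query": raw_query,   # original query string, kept verbatim
        "tokens": tokens,     # multi-valued field of lowercased terms
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    if hits is not None:
        doc["hits"] = hits    # optional: number of results returned
    return doc
```

Storing the original string alongside the tokens allows both "top full queries" and "top terms" to be computed later, for example by faceting on the respective fields.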

You can find the source here:

http://www.ipros.nl/uploads/Stats-component.zip

It includes a README, and a schema.xml that should be used.

Please let me know your thoughts.

Best,

Patrick

RE: Solr statistics of top searches and results returned

rswart
If this is not done in an async way, wouldn't this have a serious performance impact?

 

RE: Solr statistics of top searches and results returned

Plaatje, Patrick
Hi,

In our specific implementation this is not really an issue, but I can imagine it could impact performance. I guess a new thread could be spawned, which would take care of any performance issues. Thanks for pointing it out. I'll post a message when I have coded the change.
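
A minimal sketch of the asynchronous approach, in Python for brevity (a real Solr component would be written in Java). It uses one long-lived worker thread fed by a bounded queue rather than a new thread per search; `AsyncQueryLogger`, `sink`, and `max_pending` are made-up names, and `sink` stands for whatever actually stores the query (a second core, a database).

```python
import queue
import threading


class AsyncQueryLogger:
    """Hand recorded queries to a background thread so searches are not delayed."""

    def __init__(self, sink, max_pending=10000):
        self._sink = sink  # callable that stores one query
        self._queue = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def record(self, query):
        """Called on the search path; never blocks. Returns False if dropped."""
        try:
            self._queue.put_nowait(query)
            return True
        except queue.Full:
            return False  # under overload, lose a statistic rather than slow a search

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: stop the worker
                break
            try:
                self._sink(item)
            except Exception:
                pass  # a failed write must not kill the worker

    def close(self):
        """Flush pending queries and stop the worker thread."""
        self._queue.put(None)
        self._worker.join()
```

The search path only pays for an in-memory enqueue; the slow write to the statistics store happens on the worker thread.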

Regards,

Patrick

