identifying source of queries

suresh pendap-2
Hi,
We have found that application teams often fire ad-hoc queries. Some of
these are very expensive and can bring the Solr cluster down. Sometimes
they build custom scripts that run offline analytics by firing expensive
queries, which the Solr cluster was never sized to handle.

When an issue happens, we identify the long-running query from the Solr
logs. But sometimes we do not even know who is firing these queries, so
it takes a while to stop them.

We would like to be able to identify the source of the Solr queries.

Is there a way to tag the Solr queries?
Can we associate some tags or a query identifier with the query?
Can these tags be made mandatory, so that a Solr query without one
fails?

We would like to build a custom component which logs the query, the query
identifier (the tag which the user provides) and the IP address of the
client machine which fired the query.


Thanks
Suresh

Re: identifying source of queries

Shalin Shekhar Mangar
There is no built-in way, but if you are willing to create a custom
query component, it should be easy to mandate that every query has a
tag parameter by throwing an exception when it is missing. Any query
param you pass to a distributed query request should be propagated to
all query nodes as well.
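
The check such a component would perform can be sketched without any Solr classes. What follows is only a stand-alone model of the idea, not Solr's API: the class name `QueryTagCheck`, the parameter name `tag`, and the log line layout are all assumptions made for illustration. In a real deployment the same logic would live inside a custom search component and would throw Solr's own exception type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-alone model of the mandatory-tag check; no Solr classes are used. */
final class QueryTagCheck {

    /** Hypothetical parameter name; the thread does not fix one. */
    static final String TAG_PARAM = "tag";

    /** Returns the trimmed tag, or throws if it is missing or blank. */
    static String requireTag(Map<String, String> params) {
        String tag = params.get(TAG_PARAM);
        if (tag == null || tag.trim().isEmpty()) {
            throw new IllegalArgumentException(
                "Missing required parameter '" + TAG_PARAM + "'");
        }
        return tag.trim();
    }

    /** Builds the line a component might log: tag, client address, query. */
    static String logLine(Map<String, String> params, String clientIp) {
        return "tag=" + requireTag(params)
            + " client=" + clientIp
            + " q=" + params.get("q");
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "title:solr");
        params.put(TAG_PARAM, "reporting-batch");
        System.out.println(logLine(params, "10.0.0.7"));

        // Without the tag, the request is rejected instead of being logged.
        params.remove(TAG_PARAM);
        try {
            logLine(params, "10.0.0.7");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast before the query runs means an untagged expensive query never reaches the index, which is the behaviour asked for in the original post.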

In reply to suresh pendap's message of Wed, Aug 9, 2017 at 8:58 AM.



--
Regards,
Shalin Shekhar Mangar.

Re: identifying source of queries

suresh pendap-2
Thanks, Shalin, for the reply.
Do I also need to update the query parsers in order to handle the new
query param?
I can build a custom component, but dabbling with query parsers would be
way too much for me to handle.

Thanks
Suresh




In reply to Shalin Shekhar Mangar's message of Tue, Aug 8, 2017 at 9:49 PM.

Re: identifying source of queries

Shalin Shekhar Mangar
There is no need to change anything related to query parsers. A search
component should be enough.
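
For reference, a search component is usually wired in through solrconfig.xml rather than through the query parsers. The fragment below is a sketch under assumptions: the component name `tagCheck` and the class `com.example.TagCheckComponent` are hypothetical placeholders for whatever custom component gets written, and the handler shown is a generic `/select` handler that may differ from the one in an existing configuration.

```xml
<!-- Register the custom component (hypothetical class name). -->
<searchComponent name="tagCheck" class="com.example.TagCheckComponent"/>

<!-- Run it before the default components of the search handler. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <arr name="first-components">
    <str>tagCheck</str>
  </arr>
</requestHandler>
```

Listing the component under `first-components` lets it reject an untagged request before the standard query component does any work.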

In reply to suresh pendap's message of Wed, Aug 9, 2017 at 9:56 PM.



--
Regards,
Shalin Shekhar Mangar.

Re: identifying source of queries

Rick Leir-2
In reply to this post by suresh pendap-2
Suresh
If you have a web app in front of Solr, and either it or Apache logs all requests, then you should be able to match those log entries to the solr.log entries. That would tell you a source IP, though it might not help if the users are behind a NAT firewall. In that case, you could look at the NAT firewall logs.

Then again, your web app could have a login, and you could pass the username to Solr as a query parameter. But now you will be busy developing this simple web app.
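
As a rough illustration of passing the username along, the web app could append it as an extra request parameter when it builds the Solr URL. This is a minimal sketch, assuming a parameter named `user` and a class named `TaggedQueryUrl`, both invented here; the base URL used in any call is likewise a placeholder. Solr normally includes request parameters in its request log line, so the value should then appear next to the query in solr.log.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Builds a select URL that carries the caller's identity as a parameter. */
final class TaggedQueryUrl {

    /** "user" is a hypothetical parameter name chosen for this sketch. */
    static String build(String baseUrl, String query, String user) {
        return baseUrl + "/select?q=" + encode(query) + "&user=" + encode(user);
    }

    /** Form-encodes a value so that characters such as ':' and '&' are safe. */
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            build("http://localhost:8983/solr/docs", "title:solr", "alice"));
    }
}
```

Because the web app sets the parameter itself after login, end users cannot leave it out or forge it as long as they have no direct access to Solr.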

Solr authentication might help if each user has their own credentials (I do not know whether the username is logged).
Cheers -- Rick


--
Sorry for being brief. Alternate email is rickleir at yahoo dot com