[jira] [Created] (SOLR-2524) Adding grouping to Solr 3x


[jira] [Created] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
Adding grouping to Solr 3x
--------------------------

                 Key: SOLR-2524
                 URL: https://issues.apache.org/jira/browse/SOLR-2524
             Project: Solr
          Issue Type: New Feature
    Affects Versions: 3.2
            Reporter: Martijn van Groningen


Grouping was recently added to Lucene 3x. See LUCENE-1421 for more information.
I think it would be nice to also expose this functionality to Solr users who are bound to a 3.x version.
The grouping feature added to Lucene is currently a subset of the functionality that Solr 4.0-trunk offers. Mainly it doesn't support grouping by function / query.

The work involved in getting the grouping contrib to work on Solr 3x is acceptable. I have it more or less running here. It supports the response format and request parameters (except group.query and group.func) described in the FieldCollapse page on the Solr wiki.
I think it would be great if this were included in the Solr 3.2 release. Many people are using grouping as a patch now and this would help them a lot. Any thoughts?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035287#comment-13035287 ]

Michael McCandless commented on SOLR-2524:
------------------------------------------

+1 this would be awesome Martijn!!

In general we should try hard to backport features we build on trunk, to 3.x, when feasible.


[jira] [Updated] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen updated SOLR-2524:
----------------------------------------

    Attachment: SOLR-2524.patch

Attached the initial patch.

* Patch is based on what is in the trunk.
** Integrated the grouping contrib collectors
** Same response formats.
** All parameters except group.query and group.func are supported.
** Computed DocSet (for FacetComponent and StatsComponent) is based on the ungrouped result.
* Also integrated the caching collector. For this I added the group.cache=true|false and group.cache.maxSize=[number] parameters.
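
For illustration, a request using the parameters listed above could be assembled as follows. This is only a minimal Python sketch: the helper name grouping_url and the cache size of 64 are assumptions made for the example, and the parameter names are the ones given in this patch description.

```python
from urllib.parse import urlencode


def grouping_url(base, field, cache=True, max_size=None):
    """Build a select URL carrying the grouping parameters named in the patch."""
    params = [
        ("q", "*:*"),
        ("group", "true"),
        ("group.field", field),
        ("group.cache", "true" if cache else "false"),
    ]
    if max_size is not None:
        # Hypothetical value; the patch only says the parameter takes a number.
        params.append(("group.cache.maxSize", str(max_size)))
    return base + "?" + urlencode(params)


url = grouping_url("http://localhost:8983/solr/select", "inStock", max_size=64)
print(url)
```

Note that urlencode percent-encodes the `*:*` query, so the printed URL contains `q=%2A%3A%2A`.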

Things still to do:
* Integrate AllGroupsCollector for total count based on groups.
* Create a Solr test for grouping.
* Cleanup / refactor / Javadoc.


[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035526#comment-13035526 ]

Michael McCandless commented on SOLR-2524:
------------------------------------------

Awesome, that was fast!

Maybe rename group.cache.maxSize -> .maxSizeMB?  (So it's clear what the units are).

Should we default group.cache to true?  (It's false now?).

When you get the top groups from collector2, should you pass in offset instead of 0?  (Hmm -- maybe groupOffset?  It seems like you're using offset for both the first & second phase collectors?  Maybe I'm confused...).

bq. Computed DocSet (for facetComponent and StatsComponent) is based on the ungrouped result.

This matches how Solr does grouping on trunk, right?


[jira] [Assigned] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned SOLR-2524:
----------------------------------------

    Assignee: Michael McCandless


[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035586#comment-13035586 ]

Martijn van Groningen commented on SOLR-2524:
---------------------------------------------

bq. Maybe rename group.cache.maxSize -> .maxSizeMB? (So it's clear what the units are).
Yes that is a more descriptive name.

bq. Should we default group.cache to true? (It's false now?).
That makes sense.

I think that if cachedCollector.isCached() returns false, we should put something in the response indicating that the cache wasn't used because it hit the cache.maxSizeMB limit. Otherwise nobody will know whether the cache was utilized.
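
The fallback described here can be sketched roughly as below. This is a toy Python model, not the Lucene collector or the patch code: the class ToyCachingCollector, the 4-bytes-per-hit estimate, and the response key groupCacheUsed are all assumptions made for illustration.

```python
class ToyCachingCollector:
    """Toy model: remembers collected doc IDs until a size budget is exceeded."""

    BYTES_PER_DOC = 4  # assumption: one 32-bit doc ID per cached hit

    def __init__(self, delegate, max_size_mb):
        self.delegate = delegate  # first-pass collector, modelled as a list
        self.max_docs = int(max_size_mb * 1024 * 1024) // self.BYTES_PER_DOC
        self.docs = []            # set to None once the budget is exceeded

    def collect(self, doc):
        self.delegate.append(doc)  # always forward the hit to the first pass
        if self.docs is not None:
            if len(self.docs) >= self.max_docs:
                self.docs = None   # over budget: drop the whole cache
            else:
                self.docs.append(doc)

    def is_cached(self):
        return self.docs is not None

    def replay(self, other):
        if not self.is_cached():
            raise RuntimeError("cache was dropped; run the query again instead")
        for doc in self.docs:
            other.append(doc)


def second_pass(cache, rerun_query):
    """Return the second-pass hits plus a note to put in the response."""
    hits = []
    if cache.is_cached():
        cache.replay(hits)
        return hits, {"groupCacheUsed": True}
    # Cache unavailable: execute the query a second time and say so.
    for doc in rerun_query():
        hits.append(doc)
    return hits, {"groupCacheUsed": False, "warning": "cache size limit exceeded"}


first_pass = []
cache = ToyCachingCollector(first_pass, max_size_mb=1)
for doc in [3, 7, 9]:
    cache.collect(doc)
hits, note = second_pass(cache, lambda: [3, 7, 9])
print(hits, note)
```

With a budget too small for the result set, is_cached() turns false, the second pass re-runs the query, and the note carries the warning that would otherwise go unnoticed.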

When I was playing around with the cache options I noticed that searching without the cache (~350 ms) was faster than with the cache (~500 ms) on a 10M index with 1711 distinct group values. This is not what I'd expect.

bq. When you get the top groups from collector2, should you pass in offset instead of 0? (Hmm – maybe groupOffset? It seems like you're using offset for both the first & second phase collectors? Maybe I'm confused...).
I know that is confusing, but the DocSlice expects offset + len documents, so that was a quick way of doing that. I will clean that up.
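
The offset + len point can be illustrated with a small sketch. This is plain Python rather than Solr code, and the function top_docs_page and the sample scores are made up for the example: to return a window that starts at offset, the collector first has to gather the best offset + len hits.

```python
import heapq


def top_docs_page(scored_docs, offset, length):
    """Collect the best offset + length hits, then expose the requested window."""
    # The first `offset` hits must be known before they can be skipped,
    # so the collector keeps offset + length entries, not just `length`.
    best = heapq.nlargest(offset + length, scored_docs, key=lambda d: d[1])
    return best[offset:offset + length]


docs = [("a", 0.9), ("b", 0.4), ("c", 0.7), ("d", 0.1), ("e", 0.8)]
page = top_docs_page(docs, offset=2, length=2)
print(page)
```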

bq. This matches how Solr does grouping on trunk right?
Yes it does. I'm already thinking about a new collector that collects the most relevant documents of all groups. This collector should produce something like an OpenBitSet. We can use the OpenBitSet to create a DocSet. I think this should be implemented in a different issue.
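
A rough sketch of that idea, in Python and with a plain integer standing in for the OpenBitSet. The function group_head_bits and the sample hits are hypothetical, and this reading assumes "most relevant" means the highest-scoring document per group.

```python
def group_head_bits(hits):
    """hits: iterable of (doc_id, group_value, score). Returns an int bitmask."""
    heads = {}  # group value -> (doc_id, score) of the best hit seen so far
    for doc_id, group, score in hits:
        if group not in heads or score > heads[group][1]:
            heads[group] = (doc_id, score)
    bits = 0
    for doc_id, _ in heads.values():
        bits |= 1 << doc_id  # set the bit for each group's best document
    return bits


hits = [(0, "belkin", 0.2), (1, "belkin", 0.9), (2, "canon", 0.5), (5, "canon", 0.3)]
bits = group_head_bits(hits)
# Turn the bitmask back into a sorted list of doc IDs (the "DocSet").
doc_set = [d for d in range(bits.bit_length()) if (bits >> d) & 1]
print(doc_set)
```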


[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035631#comment-13035631 ]

Michael McCandless commented on SOLR-2524:
------------------------------------------

bq. I think that if the cachedCollector.isCached() returns false we should put something in the response indicating that the cache wasn't used because it hit the cache.maxSizeMB limit. Otherwise nobody will know whether the cache was utilized.

+1, and maybe log a warning?  Or is that going to be too much logging?

bq. When I was playing around with the cache options I noticed that searching without cache (~350 ms) was faster than with cache (~500 ms) on a 10M index with 1711 distinct group values. This is not what I'd expect.

That is worrisome!!  Was this a simple TermQuery?  Is it somehow possible Solr is already caching the query's results itself...?

bq. I'm already thinking about a new collector that collects all most relevant documents of all groups. This collector should produce something like an OpenBitSet. We can use the OpenBitSet to create a DocSet. I think this should be implemented in a different issue.

Cool!


[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035673#comment-13035673 ]

Martijn van Groningen commented on SOLR-2524:
---------------------------------------------

bq. and maybe log a warning?
I think a log warning should be added too. If users don't want that, they can reconfigure their logger not to log messages from the Grouping class.

bq. Was this a simple TermQuery
No, a MatchAllDocsQuery (*:*). I usually use that to maximize the number of documents to group on. Trying this on a simple term query that results in 8M documents, the query with cache is a little bit (~5%) faster. For boolean queries the cache does pay off: grouping with cache is 20% faster than without. Of course these are just numbers from my machine and my test index.

bq. Is it somehow possible Solr is already caching the query's results itself...?
Only the filterCache is used for any filter query in the fq param. The main query is not cached with the filterCache or QueryResultCache. This is the same behaviour as in the trunk.


[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036270#comment-13036270 ]

Michael McCandless commented on SOLR-2524:
------------------------------------------

{quote}
bq. Was this a simple TermQuery

No, a MatchAllDocsQuery (*:*)
{quote}

Ahh OK then that makes sense -- MatchAllDocsQuery is a mighty fast query to execute ;)  So the work done to cache it is going to be slower.


[jira] [Updated] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen updated SOLR-2524:
----------------------------------------

    Attachment: SOLR-2524.patch

Attached an updated patch.

* Added cache warning
* Added counting based on groups. This can be controlled by group.totalCount parameter. Use GROUPED to get counts based on groups and UNGROUPED to get counts based on plain documents. Default is UNGROUPED.
* Ported the trunk grouping tests to 3x. The only part I couldn't port was the random testing. The API is different in 3x.
* Added grouping by query (group.query). Porting this back to 3x was trivial.
* Second pass caching is now enabled by default.
* Changed .cache.maxSize into .cache.maxSizeMB
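
The difference between the two group.totalCount modes can be pictured with a short sketch. This is illustrative Python, not the patch code; the function total_count and the sample hits are assumptions.

```python
def total_count(hits, mode="UNGROUPED"):
    """hits: iterable of (doc_id, group_value) pairs matching the query."""
    hits = list(hits)
    if mode == "GROUPED":
        # Count distinct group values among the matching documents.
        return len({group for _, group in hits})
    if mode == "UNGROUPED":
        # Count plain matching documents (the default).
        return len(hits)
    raise ValueError("mode must be GROUPED or UNGROUPED")


# Sample data only: 17 matching documents spread over two group values.
hits = [(i, i % 3 == 0) for i in range(17)]
print(total_count(hits, "GROUPED"), total_count(hits))
```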



[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037475#comment-13037475 ]

Bill Bell commented on SOLR-2524:
---------------------------------

Will this also be applied to 4.0 and the 3.2 branch?


[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037480#comment-13037480 ]

Bill Bell commented on SOLR-2524:
---------------------------------

OK. I am trying to understand group.totalCount=grouped... I am not seeing the facets change in Solr.

Using latest 3.2 branch, and patching it with your latest patch:

http://localhost:8983/solr/select?q=*:*&group.field=inStock&group=true&facet=true&facet.field=manu&group.docSet=grouped&group.totalCount=GROUPED

{code}

<lst name="facet_counts">
  <lst name="facet_queries"/>
    <lst name="facet_fields">
      <lst name="manu">
       <int name="inc">8</int>
       <int name="apache">2</int>
       <int name="belkin">2</int>
       <int name="canon">2</int>
etc...
{code}

But in the result set above, I only see the following once:

{code}
<str name="manu">Belkin</str>
{code}


I would assume the results for this field would be:

{code}

<lst name="facet_counts">
  <lst name="facet_queries"/>
    <lst name="facet_fields">
      <lst name="manu">
       <int name="belkin">1</int>
etc...
{code}

Since it is grouped, and only belkin shows... Is there a parameter to do that?


[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037559#comment-13037559 ]

Michael McCandless commented on SOLR-2524:
------------------------------------------

bq. Will this also be applied to 4.0 and the 3.2 branch?

So the current plan is to commit this issue only for 3.2.

Solr trunk already has its own grouping implementation, from which we've factored out a shared grouping module (see LUCENE-1421).  That module is available in trunk and 3.x, but since Solr already has grouping in trunk, and it has more features than the grouping module (specifically, that you can group by docvalues derived from a function query and by arbitrary query), Solr trunk will for now keep its own private impl.

Once we've factored out more stuff from Solr (function queries, LUCENE-2883, and I think also filter caches) then we'll fix the grouping module and also cutover Solr trunk to it.  This is the current thinking anyway...


[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037562#comment-13037562 ]

Martijn van Groningen commented on SOLR-2524:
---------------------------------------------

bq. Will this also be applied to 4.0 and the 3.2 branch?
This patch will be applied on the 3x branch. This patch serves as a basis for the work needed to make the trunk use the grouping module. But I think that will be addressed in a different issue.

bq. OK. I am trying to understand group.totalCount=grouped... I am not seeing the facets change in Solr.
That is because the group.totalCount parameter only affects the total count and not the facets.

So executing the same query with group.totalCount=GROUPED:
{code:xml}
<lst name="grouped">
   <lst name="inStock">
     <int name="matches">2</int>
     <arr name="groups">
        <lst>
           <bool name="groupValue">true</bool>
{code}

And executing the same query with group.totalCount=UNGROUPED (the default):
{code:xml}
<lst name="grouped">
   <lst name="inStock">
     <int name="matches">17</int>
     <arr name="groups">
        <lst>
           <bool name="groupValue">true</bool>
{code}
So having facets based on groups is the next step :). I haven't implemented this yet, but I will probably use the group.docSet parameter for that, because it will affect not only the FacetComponent but also the StatsComponent. I think we should first focus on getting the current patch committed and then tackle this issue. Implementing it will also require another group collector.
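For readers who want to try the two settings side by side, here is a minimal sketch in Python. It builds the request URL and reads the `matches` value out of a response. The parameter name `group.totalCount` is the one used by the patch in this thread and may differ in a released version. The base URL, the `/select` path, the `*:*` query, the helper names and the completed sample response are all assumptions made for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def grouping_url(base_url, query, field, total_count="UNGROUPED"):
    """Build a select URL that groups on `field` (hypothetical helper)."""
    params = [
        ("q", query),
        ("group", "true"),
        ("group.field", field),
        # Parameter name as used by the patch discussed in this thread.
        ("group.totalCount", total_count),
    ]
    return base_url + "/select?" + urlencode(params)


def matches_for(response_xml, field):
    """Return the integer `matches` value for a grouped field, or None."""
    root = ET.fromstring(response_xml)
    for grouped in root.iter("lst"):
        if grouped.get("name") != "grouped":
            continue
        for command in grouped.findall("lst"):
            if command.get("name") != field:
                continue
            for child in command.findall("int"):
                if child.get("name") == "matches":
                    return int(child.text)
    return None


# Hypothetical, minimal response: the excerpts above are truncated, so this
# sample only mirrors their outer structure and is not real server output.
SAMPLE_RESPONSE = """<response>
  <lst name="grouped">
    <lst name="inStock">
      <int name="matches">2</int>
      <arr name="groups"/>
    </lst>
  </lst>
</response>"""

if __name__ == "__main__":
    print(grouping_url("http://localhost:8983/solr", "*:*", "inStock", "GROUPED"))
    print(matches_for(SAMPLE_RESPONSE, "inStock"))
```

Running the same query once with GROUPED and once with UNGROUPED and comparing the two `matches` values shows the difference described above, while the facet counts stay the same.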



[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037564#comment-13037564 ]

Michael McCandless commented on SOLR-2524:
------------------------------------------

Patch looks awesome Martijn!

And good job getting group-by-Query added back in!  So it's only
missing group-by-function-query-docvalues vs trunk.

Can you add a Solr CHANGES entry?  This is a great addition
(finally!).

Then I think it's ready to commit once we resolve Bill's issue
above...




[jira] [Updated] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen updated SOLR-2524:
----------------------------------------

    Attachment: SOLR-2524.patch

Attached an updated patch. I added an entry to Solr's CHANGES.txt.



[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037593#comment-13037593 ]

Michael McCandless commented on SOLR-2524:
------------------------------------------

Looks great -- I'll commit in a day or two.  Thanks Martijn!



[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037594#comment-13037594 ]

Simon Willnauer commented on SOLR-2524:
---------------------------------------

bq. Looks great – I'll commit in a day or two. Thanks Martijn!
+1 - nice work guys



[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037745#comment-13037745 ]

Hoss Man commented on SOLR-2524:
--------------------------------

bq. Solr trunk already has its own grouping implementation, from which we've factored out a shared grouping module (see LUCENE-1421). That module is available in trunk and 3.x, but since Solr already has grouping in trunk, and it has more features than the grouping module (specifically, that you can group by docvalues derived from a function query and by arbitrary query), Solr trunk will for now keep its own private impl.

FWIW: I'm a little wary of the idea that we might wind up with an alternate approach to the "grouping" functionality released in 3.2 than what would later be released in 4.0 ... i haven't looked at the approach in branch enough to understand how they differ, but i'm concerned about the hypothetical possibilities that they might have subtly different behavior in edge cases, or different perf characteristics, or that 3.2 might add a feature that is hard to support in 4.0, etc.

I say this only to raise it as a red flag to watch out for -- not because i want to stop the progress on this issue.

The first question that sprang to mind when i saw this issue was: is backporting what solr already uses on trunk to the 3x branch out of the question?

assuming it is, then i guess the main things that would help ease my fears are if:

1) we had identical Solr tests (at the request api level) on trunk and 3x to help sanity check that the two impls behave the same way
2) the folks working on the grouping refactoring felt confident that by the time we get around to releasing 4.0, the grouping refactoring would be at the point that the 3.2 impl and the 4.0 impl would be equivalent.
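As a rough illustration of point 1, a request-level check could compare the `grouped` section of the responses that a trunk build and a 3x build return for the same request. The sketch below, in Python, only does the comparison on two XML strings. The function names are hypothetical, it is not taken from the Solr test suite, and fetching the responses from the two servers is left out.

```python
import xml.etree.ElementTree as ET


def grouped_section(response_xml):
    """Return the <lst name="grouped"> element of a response, or None."""
    root = ET.fromstring(response_xml)
    for node in root.iter("lst"):
        if node.get("name") == "grouped":
            return node
    return None


def canonical(node):
    """Convert an element into nested tuples so trees can be compared with ==.

    Surrounding whitespace in text is ignored; child order is kept, since the
    order of groups and documents is part of the behavior being checked.
    """
    return (
        node.tag,
        tuple(sorted(node.attrib.items())),
        (node.text or "").strip(),
        tuple(canonical(child) for child in node),
    )


def same_grouping(xml_a, xml_b):
    """True if both responses carry an equivalent grouped section."""
    a = grouped_section(xml_a)
    b = grouped_section(xml_b)
    if a is None or b is None:
        return a is None and b is None
    return canonical(a) == canonical(b)
```

Only the `grouped` section is compared here, so fields that are expected to differ between the two servers, such as query time in the response header, do not cause false failures.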








[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037778#comment-13037778 ]

Martijn van Groningen commented on SOLR-2524:
---------------------------------------------

Hi Hoss,

bq. is backporting what solr already uses on trunk to the 3x branch out of the question?
Many folks have grouping-like requirements. Right now they have to either use trunk or patch Solr. I think that having grouping in a released version would be great.

bq. we had identical Solr tests (at the request api level) on trunk and 3x to help sanity check that the two impls behave the same way
I copied the grouping test from trunk to 3x. Only the random tests are not enabled. The general super test class (that other tests also use) is not compatible between trunk and 3x.

bq. the folks working on the grouping refactoring felt confident that by the time we get around to releasing 4.0, the grouping refactoring would be at the point that the 3.2 impl and the 4.0 impl would be equivalent.
I totally agree. I haven't opened a Solr issue yet, but I will do that soon. Basically this new issue will be concerned with incorporating the grouping module into Solr trunk without losing the current grouping functionality. I think that by the time 4.0 is released, the grouping feature set Solr supports might be larger than what is in a previous 3.x release. We can keep the response format and request parameters backward compatible.

