[jira] Created: (LUCENE-1243) A few new benchmark tasks

Nick Burch (Jira)
A few new benchmark tasks
-------------------------

                 Key: LUCENE-1243
                 URL: https://issues.apache.org/jira/browse/LUCENE-1243
             Project: Lucene - Java
          Issue Type: Improvement
          Components: contrib/benchmark
            Reporter: Mark Miller
            Priority: Minor


Here are some tasks that would be helpful to see added. They might want some expansion, but these are the basic ones I have been using:

CommitIndexTask
ReopenReaderTask
SearchWithSortTask

I do the sort in a similar way to how the highlighting was done, but another method may be better. It would just be great to have sorting.
Also, since there is no great field for sorting (the Reuters date always appears to be the same), I changed the id field from doc+id to just id. Again, maybe not the best solution, but here I am to get the ball rolling :)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


[jira] Updated: (LUCENE-1243) A few new benchmark tasks

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1243:
--------------------------------

    Attachment: benchmark-tasks.diff


[jira] Assigned: (LUCENE-1243) A few new benchmark tasks

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll reassigned LUCENE-1243:
---------------------------------------

    Assignee: Grant Ingersoll


[jira] Commented: (LUCENE-1243) A few new benchmark tasks

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583387#action_12583387 ]

Grant Ingersoll commented on LUCENE-1243:
-----------------------------------------

I get test failures in TestQualityRun when I apply this.  I am guessing it is the BasicDocMaker change.


[jira] Commented: (LUCENE-1243) A few new benchmark tasks

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583388#action_12583388 ]

Mark Miller commented on LUCENE-1243:
-------------------------------------

Yeah, sorry about that, Grant. I did not mean for that change to go in; I just wanted it as a starting point for thinking about good sort field data. At the time I needed to test non-String data and there was none.

So ... please ignore that change. Perhaps the best way is a new DocMaker for good sort data?


[jira] Commented: (LUCENE-1243) A few new benchmark tasks

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583389#action_12583389 ]

Grant Ingersoll commented on LUCENE-1243:
-----------------------------------------

I think the Reuters data has a date available. I forget how it is indexed offhand, but you could work with that, maybe. Otherwise, a new DocMaker would be reasonable, or perhaps see if Wikipedia offers any other fields.


[jira] Commented: (LUCENE-1243) A few new benchmark tasks

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583432#action_12583432 ]

Mark Miller commented on LUCENE-1243:
-------------------------------------

It does have a date... but I noticed it seems to generally be the same date for a *lot* of docs. Not much sorting to do when the value doesn't change much. I will take a look at the wiki maker.
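
To make that point concrete, here is a small sketch (a hypothetical helper with made-up sample values, not taken from the Reuters data or from the patch) that counts how many distinct values a candidate sort field has. When the ratio of distinct values to documents is low, most comparisons during a sort are ties, which is the concern with the date field.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: measures how varied a candidate sort field is. */
final class SortFieldCardinalitySketch {

    /** Returns distinct values divided by total values (0.0 for an empty list). */
    static double distinctRatio(List<String> fieldValues) {
        if (fieldValues.isEmpty()) {
            return 0.0;
        }
        Set<String> distinct = new HashSet<>(fieldValues);
        return (double) distinct.size() / fieldValues.size();
    }

    public static void main(String[] args) {
        // Made-up values: a date field where nearly every doc shares one date.
        List<String> dates = Arrays.asList(
                "2008-03-30", "2008-03-30", "2008-03-30", "2008-03-31");
        // Made-up values: an id field that is unique per doc.
        List<String> ids = Arrays.asList("0", "1", "2", "3");

        System.out.println("date ratio: " + distinctRatio(dates));
        System.out.println("id ratio:   " + distinctRatio(ids));
    }
}
```

A field whose ratio is close to 1.0 (like the bare id here) gives the sort something to do; one close to 0.0 does not.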


[jira] Updated: (LUCENE-1243) A few new benchmark tasks

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1243:
--------------------------------

    Attachment: LUCENE-1243.03.30.2008.patch

Here are just the tasks and the change to ReadTask to support sort.


[jira] Updated: (LUCENE-1243) A few new benchmark tasks

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1243:
--------------------------------

    Attachment: LUCENE-1243.patch

Better sort support: this adds a new doc maker that adds a random sort field over a specified range of values, allows choosing the sort type, and includes a sample sort benchmark algorithm.


[jira] Updated: (LUCENE-1243) A few new benchmark tasks

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1243:
---------------------------------------

    Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])
    Fix Version/s: 2.4

I think we should do this for 2.4?  Mark/Grant, is this ready to go in?


[jira] Commented: (LUCENE-1243) A few new benchmark tasks

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627773#action_12627773 ]

Grant Ingersoll commented on LUCENE-1243:
-----------------------------------------

When I last looked at it, it seemed to be in good shape, but that was a while ago.


Re: [jira] Commented: (LUCENE-1243) A few new benchmark tasks

Mark Miller-3
And all that has been added is a better way to do the sort testing: it's
a new doc maker that lets you pick a range of random numbers to generate
in a known sort field. The range of random ints to be generated can be
specified. It's fairly simple, but it's a start that works, and if we need
more than the basics that are provided, they can be added easily later.
Unless there are suggestions for any changes wanted, I think it's all
good to go in.
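
For readers following along, a minimal sketch of the idea described above (this is not the code from the attached patch): each generated document gets an extra field holding a random int drawn from a configurable range, so a sort has varied values to work on. The class name, the field name `sort_field`, the half-open range, and the fixed seed are all assumptions made for illustration, and a plain Map stands in for a Lucene Document so the example needs no Lucene classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

/** Hypothetical sketch: attaches a random int sort value to each generated document. */
final class RandomSortFieldSketch {
    // Assumed field name, chosen for illustration only.
    static final String SORT_FIELD = "sort_field";

    private final Random random;
    private final int min;   // inclusive lower bound of the sort values
    private final int max;   // exclusive upper bound of the sort values
    private int nextId = 0;

    RandomSortFieldSketch(int min, int max, long seed) {
        if (min >= max) {
            throw new IllegalArgumentException("min must be less than max");
        }
        if ((long) max - (long) min > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("range is too wide for an int bound");
        }
        this.min = min;
        this.max = max;
        this.random = new Random(seed); // fixed seed keeps runs repeatable
    }

    /** Builds one stand-in document: an id plus a random value in [min, max). */
    Map<String, String> makeDocument() {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", Integer.toString(nextId++));
        int sortValue = min + random.nextInt(max - min);
        doc.put(SORT_FIELD, Integer.toString(sortValue));
        return doc;
    }

    public static void main(String[] args) {
        RandomSortFieldSketch maker = new RandomSortFieldSketch(0, 1000, 42L);
        for (int i = 0; i < 5; i++) {
            System.out.println(maker.makeDocument());
        }
    }
}
```

Narrowing the range produces many duplicate sort values and widening it produces mostly unique ones, which is why making the range configurable is useful for a sort benchmark.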


[jira] Resolved: (LUCENE-1243) A few new benchmark tasks

Nick Burch (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll resolved LUCENE-1243.
-------------------------------------

       Resolution: Fixed
    Lucene Fields: [Patch Available]  (was: [Patch Available, New])

Committed revision 693495.
