[jira] Created: (SOLR-396) tool to auto generate stub analysis factories

kuladeep (Jira)
tool to auto generate stub analysis factories
---------------------------------------------

                 Key: SOLR-396
                 URL: https://issues.apache.org/jira/browse/SOLR-396
             Project: Solr
          Issue Type: Improvement
            Reporter: Hoss Man
            Assignee: Hoss Man
            Priority: Minor


A pet project I've been working on in my spare time has been looking at source code and byte code analysis toolkits, with the goal of writing a tool that can be pointed at a jar and will generate stub Factories for any TokenFilter or Tokenizer classes it finds that are not already in Solr.

In the end, it looks like a combination of reflection and some simple pattern matching is actually the best way to go: byte code loses information about parameter names, and reflection saves a lot of the hassle involved in pure source code analysis.

I've got a proof of concept ready that I'll attach shortly.  I hope to have some time next week to resubmit it as a patch that integrates it with the Solr build.xml, so that any time we add or update a Lucene jar we can run "ant stub-factories" and have 99% of the work done for us.
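
For illustration only, here is a minimal sketch of how the combination of reflection and simple pattern matching described above might look. It is not the attached tool. The nested TokenStream, TokenFilter and LengthFilter classes are hypothetical stand-ins so the example runs without Lucene on the classpath, and the names StubFactorySketch, paramNames and generateStub, as well as the shape of the emitted stub, are assumptions. Reflection supplies the constructor's parameter types; a regex over the source text recovers the parameter names that byte code does not keep.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StubFactorySketch {

    // Hypothetical stand-ins for the Lucene classes, so the sketch is self-contained.
    public static class TokenStream {}
    public static class TokenFilter extends TokenStream {
        protected final TokenStream input;
        protected TokenFilter(TokenStream input) { this.input = input; }
    }
    public static class LengthFilter extends TokenFilter {
        public LengthFilter(TokenStream in, int min, int max) { super(in); }
    }

    /** Pattern matching: pull parameter names out of the first public constructor in the source text. */
    public static List<String> paramNames(String source, String simpleName) {
        List<String> names = new ArrayList<>();
        Matcher m = Pattern
                .compile("public\\s+" + Pattern.quote(simpleName) + "\\s*\\(([^)]*)\\)")
                .matcher(source);
        if (m.find()) {
            for (String param : m.group(1).split(",")) {
                String[] parts = param.trim().split("\\s+");
                names.add(parts[parts.length - 1]); // last word of "Type name" is the name
            }
        }
        return names;
    }

    /** Reflection: find a public constructor whose first argument is a TokenStream and emit a stub. */
    public static String generateStub(Class<?> filter, String source) {
        Constructor<?> ctor = null;
        for (Constructor<?> c : filter.getConstructors()) {
            Class<?>[] t = c.getParameterTypes();
            if (t.length > 0 && t[0] == TokenStream.class) { ctor = c; break; }
        }
        if (ctor == null) {
            throw new IllegalArgumentException("no (TokenStream, ...) constructor on " + filter);
        }
        Class<?>[] types = ctor.getParameterTypes();
        String name = filter.getSimpleName();
        List<String> names = paramNames(source, name);

        StringBuilder sb = new StringBuilder();
        StringBuilder args = new StringBuilder("input");
        sb.append("public class ").append(name).append("Factory extends BaseTokenFilterFactory {\n");
        for (int i = 1; i < types.length; i++) {
            // Fall back to argN when the regex did not line up with the reflected constructor.
            String pname = names.size() == types.length ? names.get(i) : "arg" + i;
            sb.append("  // TODO: read from init args: ")
              .append(types[i].getSimpleName()).append(' ').append(pname).append('\n');
            args.append(", ").append(pname);
        }
        sb.append("  public TokenStream create(TokenStream input) {\n");
        sb.append("    return new ").append(name).append('(').append(args.toString()).append(");\n");
        sb.append("  }\n}\n");
        return sb.toString();
    }

    public static void main(String[] unused) {
        String source = "public LengthFilter(TokenStream in, int min, int max) { super(in); }";
        System.out.print(generateStub(LengthFilter.class, source));
    }
}
```

The emitted text is only a stub: a person still has to fill in how each option is read from the factory's init args, which is presumably the remaining 1% of the work.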

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-396) tool to auto generate stub analysis factories

kuladeep (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-396:
--------------------------

    Attachment: factory-stub.tgz

Proof of concept. To try it, you'll need to tweak the build.xml so it knows where you have Solr and lucene-java checked out.


[jira] Updated: (SOLR-396) tool to auto generate stub analysis factories

kuladeep (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-396:
--------------------------

    Attachment: SOLR-396.patch

Patch that takes the previous proof of concept and integrates it into the Solr build.xml as a new "stub-factories" target.

The patch also includes many new factories produced by this target, including some Russian and Greek factories that were stubs I filled in with some "char[] charset" selection args (not that I really understand how/why these filters use these char[]s ... it's all Unicode in the JVM, right? But the key is that the factories support all the options the filters support).




[jira] Resolved: (SOLR-396) tool to auto generate stub analysis factories

kuladeep (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man resolved SOLR-396.
---------------------------

       Resolution: Fixed
    Fix Version/s: 1.3

Committed revision 593359.

documented for future use on wiki...
http://wiki.apache.org/solr/CommitterInfo
