[jira] Created: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)
Multiple Solr Cores
-------------------

                 Key: SOLR-215
                 URL: https://issues.apache.org/jira/browse/SOLR-215
             Project: Solr
          Issue Type: Improvement
            Reporter: Henri Biestro
            Priority: Minor


Allow multiple cores in one web-application (or one class-loader):
This makes it possible to have multiple cores, created from different config & schema, in the same application.
As a side effect, this also allows different indexes.


Some background on the 'whys':
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-trunk-src.patch

The patch that allows multiple cores/indexes


[jira] Updated: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-trunk-533775.patch

The patch as it stands still requires some refactoring 'above' the Java core.
Although the 'single core' feature has been retained (i.e. the static SolrCore.getCore), SolrConfig.config could not be; the admin servlet has been modified accordingly.

Updated patch based on latest trunk.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Description:
Allow multiple cores in one web-application (or one class-loader):
This makes it possible to have multiple cores, created from different config & schema, in the same application.
As a side effect, this also allows different indexes.

Implementation notes for the patch:
The patch makes it possible to have multiple 'named' cores in the same application.
The current single-core behavior has been retained (as the core named 'null'), but the code could not be kept 100% compatible. (In particular, SolrConfig.config is gone; SolrCore.getCore() is still here, though.)

A few classes existed only as singletons and have thus been refactored:
The Config class feature set has been narrowed to class loading relative to the installation (lib) directory;
The SolrConfig class feature set has evolved towards the 'solr config' part, caching frequently accessed parameters;
The IndexSchema class uses a SolrConfig instance, because a few configuration parameters that pertain to indexing were needed;
The SolrCore is built from a SolrConfig & an IndexSchema.

The creation of a core has become:
// Create a configuration.
SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
// Create a schema.
IndexSchema schema = new IndexSchema(config, "schema0.xml");
// Create a core from the two objects above.
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
// Access a core by name (elsewhere in the application):
SolrCore core0 = SolrCore.getCore("core0");


There are a few other changes, mainly related to passing the SolrCore/SolrConfig in use through constructors.
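
As a rough illustration of the named-core lookup described above, here is a minimal, self-contained Java sketch of a name-keyed registry in which a null name stands for the default single core. NamedCore and CoreRegistry are hypothetical stand-ins invented for this note; they are not the classes in the patch, and the patch may well implement SolrCore.getCore(String) differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a core; this is NOT Solr's SolrCore. */
final class NamedCore {
    private final String name;      // null denotes the default (single) core
    private final String indexDir;

    NamedCore(String name, String indexDir) {
        this.name = name;
        this.indexDir = indexDir;
    }

    String getName() { return name; }
    String getIndexDir() { return indexDir; }
}

/** Hypothetical registry: a name-keyed map instead of one static singleton. */
final class CoreRegistry {
    // HashMap permits a null key, which models the core named 'null'.
    private static final Map<String, NamedCore> CORES = new HashMap<>();

    private CoreRegistry() {}

    /** Registers a core under its own name, replacing any previous entry. */
    static synchronized NamedCore register(NamedCore core) {
        CORES.put(core.getName(), core);
        return core;
    }

    /** Returns the core registered under the given name, or null if none. */
    static synchronized NamedCore getCore(String name) {
        return CORES.get(name);
    }

    /** Single-core style access: looks up the core registered with a null name. */
    static NamedCore getCore() {
        return getCore(null);
    }
}
```

A plain HashMap is used here because it accepts a null key; ConcurrentHashMap does not, so a concurrent variant would need a sentinel name for the default core.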

Some background on the 'whys':
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

  was:
Allow multiple cores in one web-application (or one class-loader):
This makes it possible to have multiple cores, created from different config & schema, in the same application.
As a side effect, this also allows different indexes.


Some background on the 'whys':
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355



[jira] Resolved: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro resolved SOLR-215.
--------------------------------

    Resolution: Fixed

junits & admin servlet (single core) test ok


[jira] Commented: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492743 ]

Hoss Man commented on SOLR-215:
-------------------------------

I'm confused ... why is this issue Resolved:Fixed ?


[jira] Reopened: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Otis Gospodnetic reopened SOLR-215:
-----------------------------------


I think Henri accidentally resolved this.  Reopening.
Btw. I'm *very* interested in serving multiple indices under a single Solr instance, possibly even embedded as described on the wiki or in LUCENE-212.  I may not find the time to look at the patch before next week, though.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Otis Gospodnetic updated SOLR-215:
----------------------------------

    Comment: was deleted


[jira] Commented: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492836 ]

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

I think Henri accidentally resolved this. Reopening.
Btw. I'm *very* interested in serving multiple indices under a single Solr instance, possibly even embedded as described on the wiki or in SOLR-212. I may not find the time to look at the patch before next week, though.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-trunk-538091.patch

Updated for revision 538091


[jira] Commented: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499919 ]

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

Henri - I'm starting to look at this.  I see a lot of whitespace changes in the patch.  Could you please generate a patch that doesn't have all those whitespace changes?
When you generate a diff file for the patch, these may be handy parameters to use (I'm assuming you're going to work under some kind of UNIX):

       -E  --ignore-tab-expansion
              Ignore changes due to tab expansion.

       -b  --ignore-space-change
              Ignore changes in the amount of white space.

       -w  --ignore-all-space
              Ignore all white space.

       -B  --ignore-blank-lines
              Ignore changes whose lines are all blank.

Thanks!
I just skimmed the patch and didn't see where the name of the index/core gets passed in the request.  Can you please point me to the right place to look?
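
To make the difference between the -b and -w options listed above concrete, the following Java sketch approximates how each one treats a single line before comparison. It is an illustration written for this thread, not diff's implementation; WhitespaceNormalizer and its methods are made-up names.

```java
/** Illustrative only: approximates diff's -b and -w handling of one line. */
final class WhitespaceNormalizer {
    private WhitespaceNormalizer() {}

    /** -b style: drop trailing whitespace, collapse every other run to one space. */
    static String ignoreSpaceChange(String line) {
        return line.replaceAll("\\s+$", "").replaceAll("\\s+", " ");
    }

    /** -w style: remove all whitespace. */
    static String ignoreAllSpace(String line) {
        return line.replaceAll("\\s+", "");
    }

    /** True if the two lines compare equal once changes in the amount of whitespace are ignored. */
    static boolean sameUnderB(String a, String b) {
        return ignoreSpaceChange(a).equals(ignoreSpaceChange(b));
    }

    /** True if the two lines compare equal once all whitespace is ignored. */
    static boolean sameUnderW(String a, String b) {
        return ignoreAllSpace(a).equals(ignoreAllSpace(b));
    }
}
```

Under this approximation, re-indenting a line from two spaces to four is hidden by both options, while adding whitespace where there was none (for example "x=1" becoming "x = 1") is hidden only by -w.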



[jira] Updated: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-trunk-542847.patch

A revised version of the patch based on revision 542847.

The patch was produced with the following command, run from the trunk directory:
svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N" > solr-trunk-542847.patch
This should take care of the whitespace changes as well as the inclusion of new files.
All unit tests behave as in the single core version; 133 tests, 5 failures, 0 errors

The patch also includes modifications to the admin, servlet & filters to accommodate the declaration & handling of multiple cores. The example conf & web.xml have been modified to declare 2 other cores (besides the default) named 'core0' and 'core1'.
The filter itself forwards to the proper servlet if no specific handler exists in the core configuration.
Example:
Step0:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
Will index the 2 documents in solr.xml & monitor.xml
Step1:
http://localhost:8983/solr/core0/admin/stats.jsp
Will produce the statistics page from the admin servlet on core0 index; 2 documents
Step2:
http://localhost:8983/solr/core1/admin/stats.jsp
Will produce the statistics page from the admin servlet on core1 index; no documents
Step3:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
Adds the ipod*.xml to the index of core0 and the mon*.xml to the index of core1;
by running queries from the admin interface, you can verify that the indexes have different content.
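
The URL layout in the example (a core name as the first path segment after /solr, as in /solr/core0/update) suggests a routing step along the following lines. This is only a sketch under that assumption: CorePathResolver is a hypothetical helper, the patch's actual filter code is not shown in this thread, and the path is assumed to be already stripped of the /solr context prefix.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper; not taken from the patch. */
final class CorePathResolver {
    private final Set<String> coreNames;

    CorePathResolver(String... coreNames) {
        this.coreNames = new HashSet<>(Arrays.asList(coreNames));
    }

    /**
     * Splits a path such as "/core0/update" into {coreName, remainingPath}.
     * coreName is null when the first segment is not a declared core,
     * which stands for the default core.
     */
    String[] resolve(String pathWithinApp) {
        String path = pathWithinApp.startsWith("/")
                ? pathWithinApp.substring(1)
                : pathWithinApp;
        int slash = path.indexOf('/');
        String first = slash < 0 ? path : path.substring(0, slash);
        if (coreNames.contains(first)) {
            String rest = slash < 0 ? "/" : path.substring(slash);
            return new String[] { first, rest };
        }
        return new String[] { null, "/" + path };
    }
}
```

With cores 'core0' and 'core1' declared, "/core0/update" would resolve to core0 with "/update" as the remaining path, while "/admin/stats.jsp" would fall through to the default core unchanged.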

Comments & advice welcome.



> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows to have multiple cores created from different config & schema in the same application.
> The side effect is that this also allows different indexes.
> Implementation notes for the patch:
> The patch allows to have multiple 'named' cores in the same application.
> The current single core behavior has been retained  - the core named 'null' - but code could not be kept 100% compatible. (In particular, Solrconfig.config is gone; SolrCore.getCore() is still here though).
> A few classes were only existing as singletons and have thus been refactored.
> The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
> The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
> The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
> The SolrCore is built from a SolrConfig & an IndexSchema.
> The creation of a core has become:
> //create a configuration
> SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> There are few other changes mainly related to passing through constructors the SolrCore/SolrConfig used.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355


[jira] Updated: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-trunk-542847-1.patch

Supersedes previous patches (including solr-trunk-542847.patch); all other attached patches should be ignored (& removed by anyone with proper permissions?).

Forgot to svn add some new files before creating the patch;
fixed a stupid logic error in SolrInit when parameters were missing;
added a way to get to the config & schema file names from a configured core.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500216 ]

Mike Klaas commented on SOLR-215:
---------------------------------

I haven't looked at the patch, but:

  - there are no current failures on trunk, save for a sporadic AutoCommitTest failure if the machine is heavily loaded.  Are you testing this patch in the context of other local changes?
  - if you maintain the same name for subsequent versions of the patch, JIRA automatically keeps track of the most recent for you
  - personally, I find it helpful to check out a fresh copy of trunk and apply my patch and run the tests there.  It helps ferret out the problematic issues and oversights.

cheers


[jira] Updated: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch

Thanks Mike for your useful advice;
I've corrected the (modified) tests so they now behave as the non-patched version does (i.e. no errors or failures, 133 tests); some of them were still using the 'unnamed/null' core. My bad, thanks again for pointing it out.
The 'superseding' patch is now called solr-215.patch so JIRA should take care of keeping only the last version (all others can be ignored & deleted).
This drop is based on svn revision 543145.



[jira] Updated: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch

New version of the patch that should be easier to verify.
Created with: svn diff  --diff-cmd /usr/bin/diff -x "-w -B -b -E -N -u" > ~/solr-215.patch
Verified it can be applied on clean trunk through: patch -u -p0 < ~/solr-215.patch


[jira] Updated: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Description:
What
-------
As of Solr 1.2, Solr only instantiates one SolrCore which handles one Lucene index. This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.

Why
------
The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents when needed. If you believe you need multiple indexes, deploy multiple web applications.
There are some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
Multiple cores:
Deployment issues within some organizations where IT will resist deploying multiple web applications.
Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
Embedding Solr in your own application (instead of 'raw' Lucene) and you functionally need to segregate schemas & collections.
Multiple indexes:
Multiple language collections where each document exists in different languages, analysis being language dependent.
Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
Some background on the 'whys':
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

How
------
The best analogy is to consider that instead of deploying multiple web applications, you can have one web application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
What the patch does is replace the SolrCore singleton with a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
Each core is 'named' and a static map (keyed by name) makes them easy to manage.
You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.

Details (per package)
-----------------------------
org.apache.solr.core:
The heaviest modifications are in SolrCore & SolrConfig.
SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, plus assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint is that core names can't contain '/' or '\', which avoids potential url & file path problems.
SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
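
To make the 'static map of cores keyed by name' idea concrete, here is a minimal sketch. It is not the patch's code: NamedCore and CoreRegistry are hypothetical stand-ins invented for illustration, and only the behaviours described above are modelled (lookup by name, the 'null' named default core, and the ban on '/' and '\' in names).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a core; the patch's SolrCore carries far more state. */
final class NamedCore {
    final String name;     // null denotes the default (formerly singleton) core
    final String indexDir;

    NamedCore(String name, String indexDir) {
        this.name = name;
        this.indexDir = indexDir;
    }
}

/** Hypothetical registry: a static map of cores keyed by name. */
final class CoreRegistry {
    // HashMap accepts a null key, which lets the unnamed core live under 'null'.
    private static final Map<String, NamedCore> CORES =
            Collections.synchronizedMap(new HashMap<String, NamedCore>());

    private CoreRegistry() {}

    /** Registers a core, replacing any core already stored under the same name. */
    static NamedCore register(NamedCore core) {
        String name = core.name;
        if (name != null && (name.indexOf('/') >= 0 || name.indexOf('\\') >= 0)) {
            throw new IllegalArgumentException(
                    "core name must not contain '/' or '\\': " + name);
        }
        CORES.put(name, core);
        return core;
    }

    /** Returns the core registered under the given name, or null if none. */
    static NamedCore getCore(String name) {
        return CORES.get(name);
    }

    /** Compatibility accessor: returns the 'null' named core. */
    static NamedCore getCore() {
        return CORES.get(null);
    }
}
```

With this sketch, registering new NamedCore(null, ...) makes the no-argument getCore() keep working for single-core callers, while named cores are reached through getCore("core0").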

org.apache.solr.analysis:
TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.

org.apache.solr.handler:
RequestHandlerBase takes the core as a constructor parameter.

org.apache.solr.util:
The test harness has been modified to expose the core it instantiates.

org.apache.solr.servlet:
SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
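
The 'check the request attribute first' lookup can be sketched as follows. This is only an illustration under assumptions: a plain Map stands in for the servlet request attributes so the example runs without a servlet container, and CoreLookup and resolve are hypothetical names, not part of the patch. Only the attribute name 'org.apache.solr.SolrCore' comes from the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper showing the attribute-first lookup order. */
final class CoreLookup {
    /** Attribute name quoted in the description above. */
    static final String CORE_ATTRIBUTE = "org.apache.solr.SolrCore";

    private CoreLookup() {}

    /**
     * Returns the core a filter attached to the request, or the fallback
     * (for instance the default core) when no suitable attribute is present.
     * requestAttributes is a plain map standing in for the request attributes.
     */
    static <T> T resolve(Map<String, Object> requestAttributes, Class<T> coreType, T fallback) {
        Object attached = requestAttributes.get(CORE_ATTRIBUTE);
        return coreType.isInstance(attached) ? coreType.cast(attached) : fallback;
    }
}

/** Small demonstration; Strings play the role of cores here. */
final class CoreLookupDemo {
    public static void main(String[] args) {
        Map<String, Object> attributes = new HashMap<>();
        // No attribute set yet: the fallback is used.
        System.out.println(CoreLookup.resolve(attributes, String.class, "default-core"));
        // A filter would set the attribute before forwarding.
        attributes.put(CoreLookup.CORE_ATTRIBUTE, "core1");
        System.out.println(CoreLookup.resolve(attributes, String.class, "default-core"));
    }
}
```

In a real servlet or JSP the map would be replaced by the request's own attribute accessors; the order of the checks is the point of the sketch.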

Admin/servlet:
Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.

Replication
----------------
The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate.

Future
---------
Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; besides the upload mechanism itself which should be easy, the servlet filter would have to be modified.
Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.

Misc
-------
The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN setup).
0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
1/ Have one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that the various tests still work from there. Creating the patch is key.
2/ If you used some IDE and forgot to set the auto-indentation correctly, you most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify that nothing will be missing.
3/ Apply the patch to the 'clean trunk'.
You can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
Alternatively, use the TortoiseSVN 'apply patch' command, since the patch format is 'unified diff'.


  was:
Allow multiple cores in one web-application (or one class-loader):
This allows to have multiple cores created from different config & schema in the same application.
The side effect is that this also allows different indexes.

Implementation notes for the patch:
The patch allows to have multiple 'named' cores in the same application.
The current single core behavior has been retained  - the core named 'null' - but code could not be kept 100% compatible. (In particular, Solrconfig.config is gone; SolrCore.getCore() is still here though).

A few classes were only existing as singletons and have thus been refactored.
The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
The SolrCore is built from a SolrConfig & an IndexSchema.

The creation of a core has become:
//create a configuration
SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
//create a schema
IndexSchema schema = new IndexSchema(config, "schema0.xml");
//create a core from the 2 other.
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
//Accessing a core:
SolrCore core = SolrCore.getCore("core0");


There are few other changes mainly related to passing through constructors the SolrCore/SolrConfig used.

Some background on the 'whys':
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355


Patch can now be installed on a clean trunk.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Description:
WHAT:
As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.

WHY:
The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
There are some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
Multiple cores:
Deployment issues within some organizations where IT will resist deploying multiple web applications.
Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
Embedding Solr in your own application (instead of 'raw' Lucene) when you functionally need to segregate schemas & collections.
Multiple indexes:
Multiple language collections where each document exists in different languages, analysis being language dependent.
Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.

HOW:
The best analogy is to consider that instead of deploying multiple web applications, you can have one web application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
What the patch does is replace the SolrCore singleton with a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
Each core is 'named' and a static map (keyed by name) makes them easy to manage.
You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.

USAGE (example web deployment, patch installed):
Step0:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
Will index the 2 documents in solr.xml & monitor.xml
Step1:
http://localhost:8983/solr/core0/admin/stats.jsp
Will produce the statistics page from the admin servlet on core0 index; 2 documents
Step2:
http://localhost:8983/solr/core1/admin/stats.jsp
Will produce the statistics page from the admin servlet on core1 index; no documents
Step3:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
running queries from the admin interface, you can verify indexes have different content.
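
The steps above all rely on the same url layout: the core name is a path segment between the web application root and the usual Solr path. A small sketch of that layout follows, assuming the example deployment at http://localhost:8983/solr; CoreUrls and forCore are hypothetical helper names used only for illustration.

```java
/** Hypothetical helper that builds per-core urls for the example deployment. */
final class CoreUrls {
    private CoreUrls() {}

    /**
     * Joins the web application base url, a core name and a path,
     * e.g. ("http://localhost:8983/solr", "core0", "update").
     */
    static String forCore(String baseUrl, String coreName, String path) {
        if (coreName.indexOf('/') >= 0 || coreName.indexOf('\\') >= 0) {
            throw new IllegalArgumentException("core name must not contain '/' or '\\'");
        }
        // Normalise slashes so callers may pass either form.
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        String suffix = path.startsWith("/") ? path : "/" + path;
        return base + "/" + coreName + suffix;
    }
}

/** Prints the urls used in the steps above. */
final class CoreUrlsDemo {
    public static void main(String[] args) {
        String base = "http://localhost:8983/solr";
        System.out.println(CoreUrls.forCore(base, "core0", "update"));
        System.out.println(CoreUrls.forCore(base, "core1", "admin/stats.jsp"));
    }
}
```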

USAGE (Java code):
//create a configuration
SolrConfig config = new SolrConfig("solrconfig.xml");
//create a schema
IndexSchema schema = new IndexSchema(config, "schema0.xml");
//create a core from the other 2
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
//access a core by name (a separate variable, since 'core' is already declared above)
SolrCore sameCore = SolrCore.getCore("core0");

PATCH MODIFICATIONS DETAILS (per package):
org.apache.solr.core:
The heaviest modifications are in SolrCore & SolrConfig.
SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, plus assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint is that core names can't contain '/' or '\', which avoids potential URL & file path problems.
SolrConfig (& SolrIndexConfig) are now used to hold all configuration options that need to be quickly accessible to the various components. Most of these variables were static, like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
SolrConfig inherits from Config, which has been modified; Config is now more strictly a DOM document (filled from some resource) plus methods to evaluate XPath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.
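
To make the "DOM document plus XPath, exposed through public final members" arrangement more concrete, here is a minimal sketch built on the JDK's XML APIs. XmlConfig and CoreSettings are hypothetical names, not the patch's classes, and the two settings, their XPath locations and their default values are chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical base class: a DOM document plus helpers that evaluate XPath expressions.
class XmlConfig {
    private final Document doc;
    private final XPath xpath = XPathFactory.newInstance().newXPath();

    XmlConfig(String xml) {
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse configuration", e);
        }
    }

    // Returns the text at the given path, or the default when it is absent or empty.
    String get(String path, String def) {
        try {
            String value = xpath.evaluate(path, doc).trim();
            return value.isEmpty() ? def : value;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + path, e);
        }
    }

    int getInt(String path, int def) {
        return Integer.parseInt(get(path, Integer.toString(def)));
    }
}

// Hypothetical per-core settings: values are read once at construction time and
// exposed as public final members, so each core sees its own immutable copy.
final class CoreSettings extends XmlConfig {
    public final int maxBooleanClauses;    // illustrative setting and default
    public final boolean useColdSearcher;  // illustrative setting and default

    CoreSettings(String xml) {
        super(xml);
        maxBooleanClauses = getInt("/config/query/maxBooleanClauses", 1024);
        useColdSearcher = Boolean.parseBoolean(get("/config/query/useColdSearcher", "false"));
    }

    public static void main(String[] args) {
        CoreSettings s = new CoreSettings(
                "<config><query><maxBooleanClauses>64</maxBooleanClauses></query></config>");
        System.out.println(s.maxBooleanClauses + " " + s.useColdSearcher);
    }
}
```

Because the members are final and belong to an instance rather than to a class, two cores built from different configuration files cannot overwrite each other's values, which is the problem the former static variables would have caused.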

org.apache.solr.analysis:
TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; a factory may need to read some resources to initialize itself, and the config dir is available from the config. This is partially redundant with the argument map, though.

org.apache.solr.handler:
RequestHandlerBase takes the core as a constructor parameter.

org.apache.solr.util:
The test harness has been modified to expose the core it instantiates.

org.apache.solr.servlet:
SolrDispatchFilter now instantiates a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose. Wherever an admin page or servlet was referring to the SolrCore singleton or SolrConfig, it now checks for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.

Admin/servlet:
Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
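
The "check the request attribute first" lookup can be sketched as follows. To keep the example free of the servlet API, a plain Map stands in for the request's attributes; CoreLookup and resolveCore are hypothetical names, and only the attribute key 'org.apache.solr.SolrCore' comes from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper showing the lookup order used by admin pages and servlets.
final class CoreLookup {
    // Attribute key set by the dispatch filter, as named in the description.
    static final String CORE_ATTRIBUTE = "org.apache.solr.SolrCore";

    private CoreLookup() {}

    // requestAttributes stands in for the attributes of a servlet request.
    // Returns the core set by the filter, or defaultCore when none was set.
    static Object resolveCore(Map<String, Object> requestAttributes, Object defaultCore) {
        Object core = requestAttributes.get(CORE_ATTRIBUTE);
        return core != null ? core : defaultCore;
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = new HashMap<>();
        Object defaultCore = "default-core";

        // No filter ran: fall back to the default (formerly singleton) core.
        System.out.println(resolveCore(attributes, defaultCore));

        // A filter mapped to core1 sets the attribute before forwarding.
        attributes.put(CORE_ATTRIBUTE, "core1");
        System.out.println(resolveCore(attributes, defaultCore));
    }
}
```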

REPLICATION:
The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate.

FUTURE:
Uploading a new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, but the servlet dispatch filter would need to be modified.
Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.

MISC:
The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN setup).
0/ The starting point is to have the modified code running in a local patch branch, all tests ok.
1/ Keep one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the latest clean trunk version and that the various tests still work from there. Creating the patch is key.
2/ If you used an IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the resulting space/indentation clutter in the patch. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
3/ Apply the patch to the 'clean trunk'.
The TortoiseSVN 'apply patch' command only understands 'unified diff', thus the '-u' option.
Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.


  was:
What
-------
As of Solr 1.2, Solr only instantiates one SolrCore which handles one Lucene index. This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.

Why
------
The current Solr practical wisdom is that one schema - thus one index - is most likely to accomodate your indexing needs, using a filter to segregate documents when needed. If you believe you need multiple indexes, deploy multiple web applications.
There are a some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
Multiple cores:
Deployment issues within some organizations where IT will resist deploying multiple web applications.
Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
Embedding Solr in your own application (instead of 'raw' Lucene) and you functionally need to segregate schemas & collections.
Multiple indexes:
Multiple language collections where each document exists in different languages, analysis being language dependant.
Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
Some background on the 'whys':
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

How
------
The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
Each core is 'named' and a static map (keyed by name) allows to easily manage them.
You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url.

Details (per package)
-----------------------------
org.apache.solr.core:
The heaviest modifications are in SolrCore & SolrConfig.
SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.

org.apache.solr.analysis:
TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.

org.apache.solr.handler:
RequestHandlerBase takes the core as a constructor parameter.

org.apache.solr.util:
The test harness has been modified to expose the core it instantiates.

org.apache.solr.servlet:
SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.

Admin/servlet:
Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.

Replication
----------------
The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate.

Future
---------
Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; besides the upload mechanism itself which should be easy, the servlet filter would have to be modified.
Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.

Misc
-------
The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
2/ If you used some IDE and forgot to set the auto-indentation corrrectly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing.
3/ Apply the patch to the 'clean trunk'.
You can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
Alternatively, TortoiseSVN 'apply patch' command since the patch format is 'unified diff'.



Forgot usage example in description

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch


[jira] Updated: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch

Updated to the current trunk; patch generated on a Solaris Express 10 box.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch


[jira] Commented: (SOLR-215) Multiple Solr Cores

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506717 ]

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

Henri, I think Toru is doing something useful in SOLR-255 - FederatedSearch over RMI + support for multiple local indices.  I think your work is overlapping a lot and you two need to sync, either working on a single patch or on multiple smaller patches with serial dependency.


> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accomodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are a some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.ml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content.
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config, which has been modified; Config is now more strictly a DOM document (filled from some resource) plus methods to evaluate XPath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.
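A minimal sketch of the "static map of cores keyed by name" idea described above, written against plain Java rather than the patched Solr classes. NamedCore and CoreRegistrySketch are hypothetical names, not classes from the patch; storing the 'null'-named core under an empty-string key (and therefore rejecting empty names) is an assumption made only to keep the example short.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for SolrCore; NOT the class from the patch.
final class NamedCore {
    final String name;     // null denotes the default (former singleton) core
    final String dataDir;

    NamedCore(String name, String dataDir) {
        this.name = name;
        this.dataDir = dataDir;
    }
}

// Hypothetical registry illustrating a static map of cores keyed by name.
final class CoreRegistrySketch {
    // ConcurrentHashMap rejects null keys, so the null-named core
    // is stored under this sentinel (an assumption of this sketch).
    private static final String DEFAULT_KEY = "";
    private static final Map<String, NamedCore> CORES = new ConcurrentHashMap<>();

    private CoreRegistrySketch() {}

    // Maps a core name to its map key, rejecting '/' and '\' so that
    // names stay safe to embed in URLs and file paths.
    static String key(String name) {
        if (name == null) {
            return DEFAULT_KEY;
        }
        if (name.isEmpty() || name.indexOf('/') >= 0 || name.indexOf('\\') >= 0) {
            throw new IllegalArgumentException("invalid core name: " + name);
        }
        return name;
    }

    // Registers a core; refuses to silently replace an existing one.
    static NamedCore register(NamedCore core) {
        NamedCore previous = CORES.putIfAbsent(key(core.name), core);
        if (previous != null) {
            throw new IllegalStateException("core already registered: " + core.name);
        }
        return core;
    }

    // Returns the named core, or null if no such core is registered.
    static NamedCore getCore(String name) {
        return CORES.get(key(name));
    }

    // Compatibility accessor: the null-named core plays the old singleton role.
    static NamedCore getCore() {
        return getCore(null);
    }

    public static void main(String[] args) {
        register(new NamedCore(null, "/path/to/index"));
        register(new NamedCore("core0", "/path/to/index0"));
        System.out.println(getCore().dataDir);
        System.out.println(getCore("core0").dataDir);
    }
}
```

Whether the real patch refuses duplicate registrations or replaces the existing core is not stated in the description; the sketch picks the stricter behaviour.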
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter now instantiates a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose. Wherever an admin page or servlet used to refer to the SolrCore singleton or SolrConfig, it now checks for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
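The attribute hand-off described above can be sketched without the servlet API. In this illustration a plain Map stands in for the request's attributes, and CoreLookupSketch, tagRequest and resolveCore are hypothetical names; only the attribute key 'org.apache.solr.SolrCore' comes from the description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; a Map stands in for servlet request attributes.
final class CoreLookupSketch {
    // Attribute name quoted in the patch description.
    static final String CORE_ATTRIBUTE = "org.apache.solr.SolrCore";

    private CoreLookupSketch() {}

    // What a per-core filter would do before forwarding the request.
    static void tagRequest(Map<String, Object> requestAttributes, Object core) {
        requestAttributes.put(CORE_ATTRIBUTE, core);
    }

    // What an admin page or servlet would do: prefer the core carried by
    // the request, and fall back to the default core otherwise.
    static Object resolveCore(Map<String, Object> requestAttributes, Object defaultCore) {
        Object core = requestAttributes.get(CORE_ATTRIBUTE);
        return core != null ? core : defaultCore;
    }

    public static void main(String[] args) {
        Map<String, Object> request = new HashMap<>();
        Object defaultCore = "default-core";

        // No filter has run yet: the default core is used.
        System.out.println(resolveCore(request, defaultCore));

        // After the filter for core1 has tagged the request.
        tagRequest(request, "core1");
        System.out.println(resolveCore(request, defaultCore));
    }
}
```

Checking the attribute first keeps pages that are reached without going through a per-core filter working against the default core.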
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate.
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, but the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it would be with a Windows/Netbeans/cygwin/TortoiseSVN setup):
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Keep one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the latest clean trunk version and that the various tests still work from there. Creating the patch is key.
> 2/ If you used an IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the resulting space/indentation clutter in the patch. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces etc.), nor a way to get NetbeansSVN to generate one. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify that nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.

