[jira] Created: (SOLR-215) Multiple Solr Cores


[jira] Updated: (SOLR-215) Multiple Solr Cores - remove static singleton

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-215:
-------------------------------

    Summary: Multiple Solr Cores - remove static singleton  (was: Multiple Solr Cores)

> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC section below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) when you functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web applications, you can have one web application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton with a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) makes them easy to manage.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different URL.
> USAGE (example web deployment, patch installed):
> Step0:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content.
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the other two
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core (a separate variable name avoids redeclaring 'core'):
> SolrCore sameCore = SolrCore.getCore("core0");
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, plus assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on core names is that they can't contain '/' or '\', which avoids potential URL & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to hold all configuration options that need to be quickly accessible to the various components. Most of these variables were static, like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config, which has been modified; Config is now more strictly a DOM document (filled from some resource) with methods to evaluate XPath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.
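
The static map of named cores described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: NamedCore is a hypothetical stand-in for SolrCore, and registering the core from its constructor is an assumption made so that the usage mirrors the "USAGE (Java code)" lines. A synchronized HashMap is used because it accepts the null key that represents the default core.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for SolrCore; not the class from the patch. */
final class NamedCore {
    // Static registry keyed by core name. HashMap permits a single null key,
    // which is used here for the 'null' named (default) core.
    private static final Map<String, NamedCore> CORES =
            Collections.synchronizedMap(new HashMap<>());

    private final String name;

    NamedCore(String name) {
        // Reject separators so a name is safe in both URLs and file paths.
        if (name != null && (name.indexOf('/') >= 0 || name.indexOf('\\') >= 0)) {
            throw new IllegalArgumentException(
                    "core name must not contain '/' or '\\': " + name);
        }
        this.name = name;
        CORES.put(name, this); // assumption: the constructor registers the core
    }

    /** Looks up a core by name; returns null when no such core is registered. */
    static NamedCore getCore(String name) {
        return CORES.get(name);
    }

    /** Compatibility accessor: the 'null' named core plays the old singleton's role. */
    static NamedCore getCore() {
        return getCore(null);
    }

    String getName() {
        return name;
    }
}
```

With this sketch, new NamedCore("core0") followed by NamedCore.getCore("core0") returns the same instance, and new NamedCore("a/b") is rejected.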
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter now instantiates a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose. Wherever an admin page, servlet or other page referred to the SolrCore singleton or SolrConfig, it now checks for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
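
The "check the request attribute first" lookup can be illustrated with a small sketch. It is not taken from the patch: CoreLookup and resolveCore are hypothetical names, a plain Map stands in for the servlet request's attribute table so the example needs no servlet API, and falling back to a default core when the attribute is absent is an assumption about what "first" implies.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; the Map stands in for a servlet request's attributes. */
final class CoreLookup {
    // Attribute name quoted in the issue description.
    static final String CORE_ATTRIBUTE = "org.apache.solr.SolrCore";

    /**
     * Returns the core a filter stored under CORE_ATTRIBUTE, or defaultCore
     * when no filter set the attribute (assumed fallback behaviour).
     */
    static Object resolveCore(Map<String, Object> requestAttributes, Object defaultCore) {
        Object core = requestAttributes.get(CORE_ATTRIBUTE);
        return core != null ? core : defaultCore;
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = new HashMap<>();
        Object defaultCore = "default-core";
        // No filter has run yet, so the default core is used.
        System.out.println(resolveCore(attributes, defaultCore));
        // A filter mapped to core1 would set the attribute before forwarding.
        attributes.put(CORE_ATTRIBUTE, "core1");
        System.out.println(resolveCore(attributes, defaultCore));
    }
}
```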
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate.
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch (solr-215.patch.zip, same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores - remove static singleton

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-215:
--------------------------

    Fix Version/s: 1.3

marking Fixed in 1.3

(I believe Ryan left this open to track any potential issues ...  if nothing else this way we'll remember to resolve it before releasing)



[jira] Commented: (SOLR-215) Multiple Solr Cores - remove static singleton

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529509 ]

Yonik Seeley commented on SOLR-215:
-----------------------------------

FYI, the firstSearcher/newSearcher hooks are now broken because the constructor of AbstractSolrEventListener was changed to take a SolrCore, while the code in SolrCore that creates event listeners still does a simple newInstance().



[jira] Commented: (SOLR-215) Multiple Solr Cores - remove static singleton

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529625 ]

Henri Biestro commented on SOLR-215:
------------------------------------

Replacing the line
          SolrEventListener listener = (SolrEventListener)solrConfig.newInstance(className);
With
          SolrEventListener listener = createEventListener(className);
should fix it.
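
To illustrate why a plain no-argument newInstance() breaks once a listener's only constructor takes the core, here is a minimal self-contained sketch. It is not Solr's createEventListener: Core, Listener, CoreAwareListener and ListenerFactory are hypothetical stand-ins, and the fallback to a no-argument constructor is an assumption added for listeners that do not need the core.

```java
import java.lang.reflect.Constructor;

/** Hypothetical stand-in for SolrCore. */
final class Core {
    final String name;
    Core(String name) { this.name = name; }
}

/** Hypothetical stand-in for SolrEventListener. */
interface Listener {
    String coreName();
}

/** A listener whose only constructor takes the core, so it has no no-arg constructor. */
final class CoreAwareListener implements Listener {
    private final Core core;
    public CoreAwareListener(Core core) { this.core = core; }
    public String coreName() { return core.name; }
}

final class ListenerFactory {
    /**
     * Instantiates className reflectively, preferring a public constructor
     * that accepts the core and falling back to a public no-arg constructor.
     */
    static Listener createEventListener(String className, Core core)
            throws ReflectiveOperationException {
        Class<? extends Listener> clazz =
                Class.forName(className).asSubclass(Listener.class);
        try {
            Constructor<? extends Listener> withCore = clazz.getConstructor(Core.class);
            return withCore.newInstance(core);
        } catch (NoSuchMethodException e) {
            // Assumed fallback for listeners that do not take the core.
            return clazz.getConstructor().newInstance();
        }
    }
}
```

Calling clazz.getConstructor().newInstance() directly on CoreAwareListener would throw NoSuchMethodException, which is the same kind of failure as the broken hooks described above.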

> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accomodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are a some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.ml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content.
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the config & schema above
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core later, by name:
> SolrCore sameCore = SolrCore.getCore("core0");
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, plus assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint: core names can't contain '/' or '\', which avoids potential URL & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to hold all configuration options that need to be quickly accessible to the various components. Most of these variables were previously static, like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a DOM document (filled from some resource) plus methods to evaluate XPath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init, since a factory might want to read some resources during initialization and the config dir is available through the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter now instantiates a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose. Wherever an admin page or servlet referred to the SolrCore singleton or SolrConfig, it now checks for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate.
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, but the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN setup).
> 0/ The starting point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that the various tests still work from there. Creating the patch is key.
> 2/ If you used an IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the resulting space/indentation clutter in the patch. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils): svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" verifies that nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN's 'apply patch' command only understands unified diffs, hence the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch (solr-215.patch.zip, same patch production method).
> For Solaris 10 users, patch must be GNU patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz
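
As a rough illustration of the "static map of cores keyed by names" described in the HOW and org.apache.solr.core notes above, here is a minimal Java sketch. CoreRegistry and its nested Core class are hypothetical stand-ins invented for this example, not the patch's SolrCore; only the null-named default core and the ban on '/' and '\' in core names are taken from the description.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the static map of named cores; not the real SolrCore. */
public final class CoreRegistry {

    /** Placeholder for a core: just a name and an index directory (assumption). */
    public static final class Core {
        public final String name;
        public final String indexDir;

        Core(String name, String indexDir) {
            this.name = name;
            this.indexDir = indexDir;
        }
    }

    // HashMap accepts a null key, which is used here for the 'null' named default core.
    private static final Map<String, Core> CORES = new HashMap<String, Core>();

    private CoreRegistry() {}

    /** Registers a core under the given name; null registers the default core. */
    public static synchronized Core register(String name, String indexDir) {
        // Core names must not contain '/' or '\' to avoid URL and file path problems.
        if (name != null && (name.indexOf('/') >= 0 || name.indexOf('\\') >= 0)) {
            throw new IllegalArgumentException("core name must not contain '/' or '\\': " + name);
        }
        if (CORES.containsKey(name)) {
            throw new IllegalStateException("core already registered: " + name);
        }
        Core core = new Core(name, indexDir);
        CORES.put(name, core);
        return core;
    }

    /** Looks up a core by name; returns null when no such core is registered. */
    public static synchronized Core getCore(String name) {
        return CORES.get(name);
    }

    /** Compatibility accessor: the 'null' named core plays the role of the old singleton. */
    public static Core getCore() {
        return getCore(null);
    }

    public static void main(String[] args) {
        register(null, "/path/to/index");
        register("core0", "/path/to/index0");
        System.out.println(getCore().indexDir);
        System.out.println(getCore("core0").indexDir);
    }
}
```

Keeping the default core under the null key means code written against the old no-argument accessor keeps working, while new code addresses cores by name.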

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (SOLR-215) Multiple Solr Cores - remove static singleton

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529668 ]

Ryan McKinley commented on SOLR-215:
------------------------------------

fixed the SolrEventListener issue in rev578451


> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>

[jira] Commented: (SOLR-215) Multiple Solr Cores - remove static singleton

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534433 ]

Ryan McKinley commented on SOLR-215:
------------------------------------

Mike Klaas points out a BIG BAD problem with this patch:
http://www.nabble.com/Deprecations-and-SolrConfig-patch-tf4611038.html

The token filter interface keeps:
  @Deprecated
  public void init(Map<String,String> args) {
    log.warning("calling the deprecated form of init; should be calling init(SolrConfig solrConfig, Map<String,String> args)");
    this.args=args;
  }

but this is never called, so it only tricks us into thinking it is backwards compatible.

Options:
1. Break the API -- at least no one would get fooled into thinking it works

2. Add some hacky bits to IndexSchema readTokenFilterFactory that first call the deprecated init, then call the 'real' one -- and make some clear statements somewhere about how this works and how it will go away.

I don't have time to look at this for another week or so, but it is very important.  Henri, if you have some time, it would be great if you could take a look at some options.

ryan



[jira] Commented: (SOLR-215) Multiple Solr Cores - remove static singleton

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534437 ]

Ryan McKinley commented on SOLR-215:
------------------------------------

Ok, I lied... fixing this is not hard.

Deprecation support was already baked into IndexSchema:

    TokenFilterFactory tfac = (TokenFilterFactory)solrConfig.newInstance(className);
    if (tfac instanceof SolrConfig.Initializable)
      ((SolrConfig.Initializable)tfac).init(solrConfig, DOMUtil.toMapExcept(attrs,"class"));
    else
      tfac.init(DOMUtil.toMapExcept(attrs,"class"));

The problem is that BaseTokenizerFactory and BaseTokenFilterFactory both implement SolrConfig.Initializable, so the IndexSchema assumes they are using the new interface.  If someone extends one of these Base classes and overrides the deprecated init( args ), that override is never called.

The fix is simply to call init( args ) from within init( config, args ) -- I'll remove the warning message since the deprecated form will be called by default now.
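
To make the fix concrete, here is a minimal, self-contained Java sketch of the delegation. All types below (Config, Initializable, Factory, BaseFactory, LegacyFactory) are simplified stand-ins invented for this example, not the real Solr classes; the only behaviour taken from the comment above is that init( config, args ) calls init( args ), so a plugin that overrides only the old form is still initialized.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative stand-ins only; none of these are the real Solr types. */
public final class InitDelegationSketch {

    /** Stand-in for SolrConfig. */
    public static final class Config {}

    /** Stand-in for SolrConfig.Initializable (the new two-argument form). */
    public interface Initializable {
        void init(Config config, Map<String, String> args);
    }

    /** Stand-in for the factory interface that still carries the old init form. */
    public interface Factory {
        void init(Map<String, String> args);
    }

    /** Stand-in for a Base*Factory class after the fix. */
    public static class BaseFactory implements Factory, Initializable {
        protected Map<String, String> args;

        // The deprecated single-argument form.
        public void init(Map<String, String> args) {
            this.args = args;
        }

        // The new form delegates to the old one so legacy overrides still run.
        public void init(Config config, Map<String, String> args) {
            init(args);
        }
    }

    /** A plugin written against the old API: it overrides only init(Map). */
    public static class LegacyFactory extends BaseFactory {
        public String pattern;

        @Override
        public void init(Map<String, String> args) {
            super.init(args);
            pattern = args.get("pattern");
        }
    }

    /** Mirrors the instanceof dispatch quoted from IndexSchema above. */
    public static void initialize(Factory factory, Config config, Map<String, String> args) {
        if (factory instanceof Initializable) {
            ((Initializable) factory).init(config, args);
        } else {
            factory.init(args);
        }
    }

    public static void main(String[] unused) {
        Map<String, String> args = new HashMap<String, String>();
        args.put("pattern", "x+");
        LegacyFactory legacy = new LegacyFactory();
        initialize(legacy, new Config(), args);
        System.out.println("pattern = " + legacy.pattern);
    }
}
```

Without the delegation in BaseFactory.init( config, args ), the loader would take the Initializable branch and LegacyFactory.pattern would stay null.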

ryan

> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accomodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are a some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.ml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content.
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter now instantiates a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose. Wherever an admin page or servlet referred to the SolrCore singleton or SolrConfig, it now checks for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
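
The attribute hand-off described for the filter and the admin pages can be sketched as follows. This is an illustration, not the patch's code: `CoreLookup` and its methods are hypothetical, and a plain map stands in for the servlet request's getAttribute/setAttribute calls so the example runs without a servlet container. Only the attribute name 'org.apache.solr.SolrCore' is taken from the description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper showing the "check the request attribute first" pattern.
final class CoreLookup {
    // Attribute name as given in the issue description.
    static final String CORE_ATTRIBUTE = "org.apache.solr.SolrCore";

    private CoreLookup() {}

    // What a per-core filter would do before forwarding the request.
    static void bindCore(Map<String, Object> requestAttributes, Object core) {
        requestAttributes.put(CORE_ATTRIBUTE, core);
    }

    // What an admin page or servlet would do: prefer the core bound by the filter,
    // otherwise fall back to the default (formerly singleton) core.
    static Object resolveCore(Map<String, Object> requestAttributes, Object defaultCore) {
        Object core = requestAttributes.get(CORE_ATTRIBUTE);
        return core != null ? core : defaultCore;
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = new HashMap<>();
        System.out.println(resolveCore(attributes, "default-core")); // nothing bound yet
        bindCore(attributes, "core1");
        System.out.println(resolveCore(attributes, "default-core")); // filter-bound core wins
    }
}
```
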
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate.
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, but the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the latest clean trunk version and that the various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the resulting space/indentation patch clutter. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils): svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN's 'apply patch' command only understands the 'unified diff' format, hence the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch
> I've updated the 'dev' environment to an x86 Solaris 10 box, which now generates the zipped patch (solr-215.patch.zip, same patch production method).
> For Solaris 10 users, patch must be GNU patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (SOLR-215) Multiple Solr Cores - remove static singleton

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley resolved SOLR-215.
--------------------------------

    Resolution: Fixed
      Assignee: Ryan McKinley

This was committed a while ago.  If it causes any problems, we should open a new issue to track progress.

> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Assignee: Ryan McKinley
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC section below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) when you functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web applications, you can have one web application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) makes them easy to manage.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this gives easy access to each core through a different URL.
> USAGE (example web deployment, patch installed):
> Step0:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content.

