Double Solr Installation on Single Tomcat (or Double Index)

Double Solr Installation on Single Tomcat (or Double Index)

Tom Weber
Hello,

   I need to have a second, separate index (separate data) on the same
server.

   Is it possible to do this within a single Solr install on a Tomcat
server, or do I need a second instance in the same Tomcat install?

   If either is possible, does somebody have advice on how to set this
up, and how to make sure that the two indexes do not interact?

   Many thanks for any help,

   Best Greetings,

   Tom

Re: Double Solr Installation on Single Tomcat (or Double Index)

sangraal
I've set up two separate Solr indexes on one Tomcat instance. I basically
created two separate Solr webapps. I also have one webapp that acts as the
client to both Solr instances, so the whole setup is three webapps.

I have one set of Solr source classes and an Ant task that builds a jar
file and copies it into the lib directory of both Solr webapps. This way,
if you customize your Solr install you only have to do it once. Each Solr
webapp obviously needs its own Solr config and data directories, which are
configurable through solrconfig.xml. The two indexes are completely
separate and can be configured independently through these config files.

If you need more detail let me know, I'll try to help you out.

-S


Re: Double Solr Installation on Single Tomcat (or Double Index)

Yonik Seeley-2
Another way to run multiple Solr webapps with Tomcat involves context
fragments. This lets you use a single copy of solr.war but specify
different configs (via different Solr home directories).

http://wiki.apache.org/solr/SolrTomcat

-Yonik



Re: Double Solr Installation on Single Tomcat (or Double Index)

brzozek
In reply to this post by Tom Weber


Tom Weber wrote:
> Hello,
>
>   I need to have a second separate index (separate data) on the same
> server.
>
>   Is it possible to do this within a single Solr install on a Tomcat
>   server, or do I need a second instance in the same Tomcat install ?
>
You will need separate instances within the same Tomcat.

>   If either is possible, does somebody have some advice on how to set
>   this up, and how to be sure that both indexes do not interact ?
>
Create a context XML file for each Solr application in the folder
CATALINA_HOME\conf\Catalina\localhost\ (e.g. context_name.xml):



<Context docBase="${catalina.home}/..../solr.war" debug="0"
         crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="${catalina.home}\solr_data_files\" override="true" />
</Context>

Adjust docBase to point at solr.war, and adjust solr/home to point at the
data directory (solr_data_files above) -- use a different folder for each
Solr instance.

If the context XML file is called solr1.xml, then you can access that Solr
instance at http://host:port/solr1/admin.
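
For example, a second instance could be added with a second fragment. The
names below (solr2.xml, solr_data_files2) are only illustrative; any
distinct context name and data folder will do:

<Context docBase="${catalina.home}/..../solr.war" debug="0"
         crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="${catalina.home}\solr_data_files2\" override="true" />
</Context>

Saved as CATALINA_HOME\conf\Catalina\localhost\solr2.xml, this second
instance would answer at http://host:port/solr2/admin, keeping its index
entirely inside its own solr/home directory.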


SolrCore as Singleton?

Joachim Martin
Is there a good reason for implementing SolrCore as a singleton?

We are experimenting with running Solr as a Spring service embedded in our
app. Since SolrCore is a singleton, we cannot have more than one index (not
currently a problem, but it could be).

I note the comment:

  // Singleton for now...

If there is no specific reason for making it a singleton, I'd vote for
removing this so that the SolrCore(dataDir, schema) constructor could be
used to instantiate multiple cores.

It seems to me that since the primary usage scenario for Solr is access via
REST (i.e. no Solr jar/API), the singleton pattern is not necessary here.

--Joachim

Re: SolrCore as Singleton?

Eivind Hasle Amundsen
> If there is no specific reason for making it a Singleton, I'd vote for
> removing this so that the
> SolrCore(dataDir, schema) constructor could be used to instantiate
> multiple cores.

I agree with your arguments. However (although I am new to Solr), there is
more than one way to do it, I think.

To be more specific: running several different indexes with individual data
directories and schemas seems very useful, based on my impression that many
enterprise users want this functionality. It is not difficult to imagine
such a usage pattern, in the abstract, for almost any use of Solr.

However (and this is where you should fill me in), it could be wasteful to
run multiple complete instances. Could information be shared between the
instances in some way to save resources? Perhaps what I am really trying to
say is that we have to look at the whole model when considering how to
implement better support for the desired usage pattern outlined above.

Eivind

Re: SolrCore as Singleton?

Chris Hostetter-3
In reply to this post by Joachim Martin

: Is there a good reason for implementing SolrCore as a Singleton?

I'm going to sidestep the issue of whether there *was* a good reason for
it, as well as the "does the singleton pattern make sense for the current
usage" question, and answer what I think is an equally significant
question: "what are the implications of trying to change it now?" ... the
biggest I can think of being that SolrConfig is also a static singleton,
and a *lot* of code in the Solr code base would need to be changed to
support multiple SolrConfigs ... and without multiple SolrConfigs, there
really isn't any reason to have multiple SolrCores.



-Hoss


Re: SolrCore as Singleton?

Eivind Hasle Amundsen
Chris Hostetter wrote:
> I'm going to sidestep the issue of whether there *was* a good reason for
> it, as well as the "does the singleton pattern make sense for the current
> usage" question, and answer what I think is an equally significant
> question: "what are the implications of trying to change it now?" ... the
> biggest I can think of being that SolrConfig is also a static singleton,
> and a *lot* of code in the Solr code base would need to be changed to
> support multiple SolrConfigs ... and without multiple SolrConfigs, there
> really isn't any reason to have multiple SolrCores.

This actually confirms my guess to a certain extent: moving away from the
singleton is not straightforward.

I am currently in the startup phase of my thesis on open source and
enterprise search. Having worked at perhaps the leading enterprise search
company, I have the impression that support for multiple collections is a
very common (and much sought-after) feature. It is a trend I see not just
from my own work, but also as a result of enterprise search solutions
becoming more common in general.

That said, Solr seems to be approaching the problem from a very logical
angle. What is really missing is a more abstract layer -- call it an
application framework -- that will probably come later anyway. This may
evolve naturally as part of the Solr project at a later stage, or perhaps
even as a separate open source project building on Solr.

Until such a framework is available, with appropriate configuration files,
an administration interface and so on, it seems a bit unnatural to support
multiple collections from the same application instance.

Bottom line (for now): I think that users looking for enterprise search
solutions need a simple way of creating multiple collections from within
the same application.

I apologize for the very philosophical e-mail; I tend to become somewhat
visionary and conceptual after a few beers, and this might not be the
perfect forum for these discussions(?) :)

Eivind

Re: SolrCore as Singleton?

Chris Hostetter-3

: I am currently in the startup phase of my thesis regarding open source
: and enterprise search. After having worked at perhaps the leading major
: enterprise search company, I have the impression that multiple
: collections is a very common feature (and very sought-after). It is a
: trend I see not just directly from my work, but most certainly also as a
: result of enterprise search solutions becoming more common in general.

SolrCore being a singleton doesn't prevent you from having multiple
collections per JVM -- you just need to run multiple instances of the
webapp within a single servlet container, using JNDI to specify the
separate solr.home directories. Specifics for doing this in Tomcat are on
the wiki...
   http://wiki.apache.org/solr/SolrTomcat

: Until this framework is available with its appropriate configuration
: files, administrator interface and so on in place, it seems a bit
: unnatural to support multiple collections from the same application
: instance.
:
: Bottom line (for now): I think that users looking for enterprise search
: solutions must have a simple way of creating multiple collections from
: within the same application.

Well, it's pretty easy right now to create a new collection -- it's just
two new files (solrconfig.xml and schema.xml).
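
As a rough illustration (the collection name, field names and types below
are made up, not taken from any particular setup), a bare-bones schema.xml
for a new collection might look something like this:

<schema name="collection2">
  <types>
    <!-- a single primitive type is enough to get an index running -->
    <fieldtype name="string" class="solr.StrField"/>
  </types>
  <fields>
    <field name="id"    type="string" indexed="true" stored="true"/>
    <field name="title" type="string" indexed="true" stored="true"/>
  </fields>
  <!-- uniqueKey identifies the field used to deduplicate documents -->
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>title</defaultSearchField>
</schema>

Pair this with a solrconfig.xml (the example one shipped with Solr is a
reasonable starting point) in the new collection's solr home, and point a
separate webapp instance at that directory as described above.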




-Hoss


Re: SolrCore as Singleton?

Otis Gospodnetic-2
Nice. Is the same doable under Jetty? (I've never had to deal with JNDI under Jetty.)

Otis


Re: SolrCore as Singleton?

Chris Hostetter-3

: Nice.  Is the same doable under Jetty? (never had to deal with JNDI
: under Jetty)

I haven't tried it personally, but according to Yoav, "reading" JNDI
options is part of the Servlet Spec, and billa found a reference to using
"<env-entry>" to do so...

http://www.nabble.com/Re%3A-multiple-solr-webapps-p3991310.html

...where exactly that option goes in Jetty's configuration isn't something
I'm clear on.


-Hoss


Re: SolrCore as Singleton?

Andrew May
Chris Hostetter wrote:

> : Nice.  Is the same doable under Jetty? (never had to deal with JNDI
> : under Jetty)
>
> I haven't tried it personally, but according to Yoav, "reading" JNDI
> options is part of the Servlet Spec, and billa found a reference to
> using "<env-entry>" to do so...
>
> http://www.nabble.com/Re%3A-multiple-solr-webapps-p3991310.html
>
> ...where exactly that option goes in Jetty's configuration isn't
> something I'm clear on.
>

<env-entry> values go in web.xml, so it would mean having a modified
version of solr.war for each collection.
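
For reference, the entry in question would look roughly like this inside
web.xml (the path is made up; solr/home is the JNDI name Solr looks up):

<env-entry>
  <env-entry-name>solr/home</env-entry-name>
  <env-entry-value>/path/to/solrhome1</env-entry-value>
  <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

Each repacked solr.war would carry a different <env-entry-value>, which is
why this approach means one modified war per collection.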

<env-entry> is an optional part of the Servlet spec for standalone servlet
implementations. The basic version of Jetty does not have any JNDI support;
you need to use JettyPlus (http://jetty.mortbay.org/jetty5/plus/index.html)
for that.

-Andrew

Got it working! And some questions

Michael Imbeault
In reply to this post by Tom Weber
First of all, in reference to
http://www.mail-archive.com/solr-user@.../msg00808.html ,
I got it working! The problem(s) came from solPHP; the implementation in
the wiki isn't really working, to be honest, at least for me. I had to
modify it significantly in multiple places to get it working. Tomcat 5.5,
WAMP and Windows XP.

The main problem was that addIndex was sending one doc at a time to Solr;
this would cause a problem after a few thousand docs because I was running
out of resources. I modified solr_update.php to handle batch updates, and
I'm now sending batches of 1000 docs at a time. Great indexing speed.
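
(For readers unfamiliar with the update format: batching simply means
posting one <add> element containing many <doc> elements, rather than one
POST per document. The field names below are invented for illustration;
the real ones come from your schema.)

<add>
  <doc>
    <field name="id">1001</field>
    <field name="title">First document in the batch</field>
  </doc>
  <doc>
    <field name="id">1002</field>
    <field name="title">Second document in the batch</field>
  </doc>
  <!-- ... up to 1000 <doc> elements per request ... -->
</add>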

I had a slight problem with the curl function of solr_update.php; the
custom HTTP header wasn't recognized. I now use curl_setopt($ch,
CURLOPT_POST, 1); curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string); --
much simpler, and now everything works!

So far I have indexed 15,000,000 documents (my whole collection,
basically) and the performance I'm getting is INCREDIBLE (sub-100ms query
time without warmup and no optimization at all on a 7 GB index -- and with
the cache, it gets stupid fast)! Seriously, Solr amazes me every time I use
it. I increased the HashDocSet maxSize to 75000 and will continue to tune
this value -- it helped a great deal. I will try the DisMax handler soon
too; right now the standard one is great. And I will index with a better
stopword file; the default one could really use improvements.

Some questions (couldn't find the answers in the docs):

- Is the Solr PHP in the wiki working out of the box for anyone? If not,
we could modify the wiki...

- What is the loadFactor variable of HashDocSet? Should I optimize it too?

- What are the units of the size value of the caches? Megs, number of
queries, kilobytes? It isn't described anywhere.

- Is there any way to programmatically change the OR/AND preference of the
query parser? I set it to AND by default for user queries, but I'd like to
set it to OR for some server-side queries I need to run (find related
articles, order by score).

- What's the difference between the two commit types, blocking and
non-blocking? I didn't see any difference at all; I tried both.

- Every time I issue an <optimize> command, I get the following in my
catalina logs -- should I do anything about it?

 9-Sep-2006 2:24:40 PM org.apache.solr.core.SolrException log
SEVERE: Exception during commit/optimize:java.io.EOFException: no more
data available - expected end tag </optimize> to close start tag
<optimize> from line 1, parser stopped on START_TAG seen <optimize>... @1:10

- Are there any benefits to setting the allowed memory for Tomcat higher?
Right now I'm allocating 384 megs.

I can't wait to try the new faceted queries... seriously, Solr is really,
really awesome so far. Thanks for all your work, and sorry for all the
questions!

--
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212


RE: Got it working! And some questions

Brian Lucas
Hi Michael,

I apologize for the lack of testing on SolPHP. I had to strip it down
significantly to turn it into a general class that would be usable, and
the version up there has not been extensively tested yet (I'm almost ready
to get back to it and revise it); plus, much of my coding is done in Rails
at the moment. However...

If you have a new version, could you send it my way or just upload it to
the wiki? I'd like to take a look at the changes and either put your
revised version up there or integrate both versions into a cleaner
revision of the one already there.

With respect to batch updates, it's already designed to do that (that's
why you see "array($array)" in the example -- it accepts an array of
updates), but I'd definitely like to see how you revised it.

Thanks,
Brian




Re: Re: SolrCore as Singleton?

Tim Archambault-2
In reply to this post by Andrew May
In regard to the comment about the lack of an interface, I view this as a
benefit of the tool.

Whether I'm developing with Python, PHP, ColdFusion, .NET, Java, etc., I
can create my own customizable interface. As a ColdFusion programmer with
moderate programming capabilities, this tool is perfect for my needs.




Re: SolrCore as Singleton?

Eivind Hasle Amundsen
Tim Archambault wrote:
> In regard to the comment about lack of an interface, I view this as a
> benefit of the tool.
>
> Whether I'm developing with Python, PHP, Coldfusion, .NET, Java, etc.
> I can create my own customizable interface. As a coldfusion programmer
> with moderate programming capabilities, this tool is perfect for my
> needs.

That's good to hear. I never meant that a GUI should replace anything at
all. Did it come across that way?

As the product evolves, it is only natural to add capabilities and
features. Some of these should be available through different interfaces,
including GUI(s). However, one should be able to interface with the
application at different levels. As Solr grows more complex over time, care
must be taken so that it does not become complicated. A more complex
product may have many more points of entry, so it is necessary to keep
things simple while also providing centralized configuration. Following
this philosophy, Solr users will be able to choose their level of
interaction.

(As a metaphor: some people prefer to use GNU/Linux by just installing a
distro; others compile everything and become best friends with the command
line.)

Eivind

Re: Got it working! And some questions

James liu-2
In reply to this post by Brian Lucas
> - Is the Solr PHP in the wiki working out of the box for anyone?

Could you show your php.ini? Have you tuned your PHP for performance?





Re: Got it working! And some questions

Chris Hostetter-3
In reply to this post by Michael Imbeault

: - What is the loadFactor variable of HashDocSet? Should I optimize it too?

This is the same as the loadFactor in a HashMap constructor -- but I don't
think it has much effect on performance, since the HashDocSets never
"grow".

I personally have never tuned the loadFactor :)

: - What's the units on the size value of the caches? Megs, number of
: queries, kilobytes? Not described anywhere.

"entries" ... the number of items allowed in the cache.

: - Any way to programatically change the OR/AND preference of the query
: parser? I set it to AND by default for user queries, but i'd like to set
: it to OR for some server-side queries I must do (find related articles,
: order by score).

You mean using StandardRequestHandler? ... not that I can think of off the
top of my head, but typically it makes sense to just configure what you
want for your "users" in the schema, and then make any machine-generated
queries explicit.
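
(Concretely -- assuming the standard schema.xml mechanism is what's meant
here -- the per-collection default is set with the solrQueryParser element:

<solrQueryParser defaultOperator="AND"/>

With that in place, a server-side "related articles" query can simply spell
out its own operators, e.g. q=term1 OR term2 OR term3, overriding the
default for that one request.)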

: - Whats the difference between the 2 commits type? Blocking and
: non-blocking. Didn't see any differences at all, tried both.

Do you mean the waitFlush and waitSearcher options? If either of those is
true, you shouldn't get a response back from the server until they have
finished. If they are false, the server should respond instantly even if it
takes several seconds (or maybe even minutes) to complete the operation
(optimizes can take a while in some cases -- as can opening new searchers
if you have a lot of cache warming configured).

: - Every time I do an <optimize> command, I get the following in my
: catalina logs - should I do anything about it?

The optimize command needs to be well-formed XML; try "<optimize/>"
instead of just "<optimize>".
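
(So the request body is just the self-closing form, e.g. <optimize/>. And,
assuming the waitFlush/waitSearcher options mentioned above are passed as
attributes on the element -- an assumption, not something spelled out in
this thread -- a non-blocking optimize might look like:

<optimize waitFlush="false" waitSearcher="false"/>

where the server returns immediately instead of waiting for the merge and
the new searcher to finish.)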

: - Any benefits of setting the allowed memory for Tomcat higher? Right
: now im allocating 384 megs.

The more memory you've got, the more caching you can support ... but if
your index changes so frequently, compared to the rate of *unique* queries
you receive, that your caches never fill up, it may not matter.




-Hoss


Re: Got it working! And some questions

Michael Imbeault
First of all, it seems the mailing list is having some trouble? Some of my
posts end up in the wrong thread (even new threads I start), I don't
receive them in my mail, and they're present only in the 'date' archive of
http://www.mail-archive.com and not in the 'thread' one. I also don't
receive some other people's posts in my mail; the problems started last
week, I think.

Secondly, Chris, thanks for all the useful answers; everything is much
clearer now. This info should be added to the wiki, I think -- should I do
it? I'm still a little disappointed that I can't change the OR/AND parsing
by just changing a parameter (like I can for the number of results
returned, for example); adding an OR between each word of the text I want
to compare sounds suboptimal, but I'll probably do it that way. It's a very
minor nitpick; Solr is awesome, as I said before.

@ Brian Lucas: Don't worry, solrPHP was still 99.9% functional -- great
work. Part of it sending one doc at a time was my fault: I was following
the exact sequence (add to array, submit) shown in the docs. The only thing
that could be added is a big "//TODO: change this code" before the sections
you have to change to make it work for a particular schema. I'm pretty sure
the custom-header curl submit works for everyone else; I'm on a Windows
test box with WAMP on it, so it may be caused by that. I'll send you the
changes I made to the code tomorrow; as I said, nothing major.

--
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212


Re: Got it working! And some questions

Chris Hostetter-3

: First of all, it seems the mailing list is having some troubles? Some of
: my posts end up in the wrong thread (even new threads I post), I don't
: receive them in my mail, and they're present only in the 'date archive'
: of http://www.mail-archive.com, and not in the 'thread' one? I don't
: receive some of the other peoples post in my mail too, problems started
: last week I think.

I haven't noticed any problems with mail not making it through. Some mail
clients (Gmail, for example) seem to suppress messages they can tell you
sent; maybe that's what's happening on your end? As for threads you start
not showing up in the "thread" list: according to my mailbox, all but one
message I've received from you included a "References:" header (if not an
In-Reply-To header), which causes some mail archivers to assume it's part
of an existing thread (this thread, for instance, is considered part of the
"Double Solr Installation on Single Tomcat (or Double Index)" thread) ...
you may want to experiment with your mail client (off list) to see if you
can figure out when/why this is happening.

: Secondly, Chris, thanks for all the useful answers, everything is much
: clearer now. This info should be added to the wiki I think; should I do

feel free ... that's why it's a wiki.

: it? I'm still a little disappointed that I can't change the OR/AND
: parsing by just changing some parameter (like I can do for the number of
: results returned, for example); adding a OR between each word in the
: text i want to compare sounds suboptimal, but i'll probably do it that
: way; its a very minor nitpick, solr is awesome, as I said before.

It would be a fairly simple option to add, just like changing the default
field (patches welcome!), but as I said -- typically if you don't want the
default behavior you are programmatically generating the query anyway, and
already adding some markup, so a little more doesn't make it less optimal.





-Hoss
