Solr vs. Compass

Solr vs. Compass

Ken Lane (kenlane)
We are knee-deep in a Solr project to provide a web services layer
between our Oracle DBs and a web front end (to be named later) to
supplement our numerous Business Intelligence dashboards. Someone from a
peer group questioned why we selected Solr rather than Compass to start
development. The real reason is that we had not heard of Compass until
that comment. Now I need to come up with a better answer.

 

Does anyone out there have experience in both approaches who might be
able to give a quick compare and contrast?

 

Thanks in advance,

Ken

Re: Solr vs. Compass

Lukáš Vlček
Hi,

I think that these products do not compete directly that much; each fits a
different business case. Can you tell us more about your specific situation?
What do you need to search, and where is your data? (DB, filesystem, web
...?)

Solr provides some specific extensions which are not supported directly by
Lucene (faceted search, DisMax, etc.), so if you need these then betting on
Compass might not be ideal. On the other hand, if you need to index
persistent Java objects then Compass fits perfectly into this scenario (and
if you are using Spring and JPA then setting up search can be a matter of
several modifications to configuration and annotations).

Compass is more of a Hibernate Search competitor (but Compass is not limited to
Hibernate and is not even limited to DB content).

Regards,
Lukas
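
For readers who have not used the two Solr features mentioned above, here is a minimal sketch of what a faceted DisMax request looks like over HTTP. It only builds the URL; it does not contact a server. The host, the field names (`title`, `body`, `region`, `product_line`) and the helper name `build_query_url` are assumptions made up for illustration, while `defType`, `qf`, `facet`, `facet.field` and `wt` are standard Solr request parameters.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical Solr location; adjust host, port and path to your own install.
SOLR_SELECT = "http://localhost:8983/solr/select"

def build_query_url(user_text, facet_fields, query_fields):
    """Build a Solr select URL that uses DisMax and asks for facet counts."""
    params = [
        ("q", user_text),
        ("defType", "dismax"),           # use the DisMax query parser
        ("qf", " ".join(query_fields)),  # fields (with optional boosts) to search
        ("facet", "true"),               # turn faceting on
        ("wt", "json"),                  # ask for a JSON response
    ]
    # facet.field may be repeated, once per field to facet on
    params += [("facet.field", name) for name in facet_fields]
    return SOLR_SELECT + "?" + urlencode(params)

# Field names below are placeholders, not taken from any real schema.
url = build_query_url("quarterly revenue",
                      ["region", "product_line"],
                      ["title^2", "body"])
print(url)
```

With plain Lucene or Compass, the equivalent of the facet counts would have to be written by hand, which is the point being made above.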


Re: Solr vs. Compass

Uri Boness
In addition, the most appealing feature of Compass is that it's
transactional and therefore integrates well with your infrastructure
(Spring/EJB, Hibernate, JPA, etc.). This is obviously nice for some
systems (not very large-scale ones), and the programming model is clean.
On the other hand, Solr scales much better and provides a load of
functionality that you would otherwise have to custom-build on top of
Compass/Lucene.

Re: Solr vs. Compass

Erick Erickson
In reply to this post by Ken Lane (kenlane)
Solr is, first and foremost, a text searching tool that scales. Are
you searching lots of text here or not? There are situations
in which you need both in order to accomplish your business
needs, so asking "which one is best" is tricky to answer....

FWIW
Erick

RE: Solr vs. Compass

Ken Lane (kenlane)
In reply to this post by Uri Boness
Uri, Lukas,

Thanks for your feedback. To clarify some specifics:

1. Yes, faceted search and DisMax are very important to this project.
2. Our data is imported from Oracle tables. (Unstructured sources maybe later.) We manufacture each document from DB queries.
3. Our platform won't be transactional; we will update the indexes periodically throughout the day, probably via the DataImportHandler.
 
Regards, Ken
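
A periodic update like the one described in point 3 is typically triggered by an HTTP request to the DataImportHandler. The sketch below only constructs that request URL, under the assumption that the handler is registered at `/dataimport` on a local Solr; the host, the path and the helper name `dih_command_url` are illustrative, while `command`, `clean` and `commit` are DataImportHandler request parameters.

```python
from urllib.parse import urlencode

# Hypothetical host and handler path; the real path is whatever name the
# handler was given in solrconfig.xml.
DIH_URL = "http://localhost:8983/solr/dataimport"

def dih_command_url(command, **options):
    """Return the URL for a DataImportHandler command such as delta-import."""
    allowed = {"full-import", "delta-import", "status", "abort", "reload-config"}
    if command not in allowed:
        raise ValueError("unknown DataImportHandler command: " + command)
    params = {"command": command}
    # Boolean options such as clean/commit are sent as lowercase true/false.
    for name, value in options.items():
        params[name] = str(value).lower() if isinstance(value, bool) else str(value)
    return DIH_URL + "?" + urlencode(params)

# A scheduler (cron, for example) could request this URL a few times a day.
print(dih_command_url("delta-import", clean=False, commit=True))
```

A delta import relies on the delta queries defined in the handler's data configuration, so whether it fits depends on the Oracle tables carrying something like a last-modified column.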

RE: Solr vs. Compass

Minutello, Nick

Not sure how many here have used both ...

I've used raw Lucene in the past - and after that, Compass. More recently Solr.

Here are some of the things I have noticed:

1) Stating the obvious: Solr has a server capability that Compass/Lucene does not. This means indexing/searching is available to non-Java clients without writing any code.

2) Compass does a number of things really nicely (that, afaik, aren't addressed by Solr)

+ Object-search engine mapping (great for structured data - i.e. not just text documents). I find writing the code that converts to/from a SolrDocument a bit annoying (but in my current project, the data is really simple). If you have many different kinds of things that you want to index ... Compass has an advantage.

+ Transactional updates & linking in, via transactional coordinator, a database transaction. This means that the primary persistence (database) is always in line with the index (notwithstanding some funky edge cases). This is very useful if your indexing needs to be 'realtime' - i.e. user cannot see a delay between what is served off the index vs. something they just updated. Jira is a good example (though they don't use compass).

+ Multi-index support - gives better transaction isolation (locking is done at the granularity of the subindex - but searching is across the whole index). There might be a solr analogue, but I haven't seen it yet.

3) Solr does a number of things that are really nice (that aren't really addressed by Solr)

+ auto-commit & StreamingUpdate client. This client and server-side buffering is very nice in dealing with high indexing throughput, without having to write the queuing/buffering

+ Facets

+ Handles date and integer types (and range queries on them) automatically. (With compass you have to write an index mapping so that you can do range queries)

+ I find the built-in Luke support really handy.
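
To make the range-query point concrete, here is a small sketch of how a client might format range clauses for numeric and date fields. It assumes the fields are declared with suitable numeric and date types in the Solr schema; the field names `quantity` and `created` and the helper `solr_range` are invented for the example. The `field:[low TO high]` syntax, `*` for an open end, and UTC timestamps ending in `Z` are standard Solr query syntax.

```python
from datetime import datetime, timezone

def solr_range(field, low=None, high=None):
    """Format an inclusive Solr range clause; None means an open end (*)."""
    def fmt(value):
        if value is None:
            return "*"
        if isinstance(value, datetime):
            # Solr date fields expect UTC in ISO 8601 form with a trailing Z.
            # Pass timezone-aware datetimes; naive ones are treated as local time.
            return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        return str(value)
    return "%s:[%s TO %s]" % (field, fmt(low), fmt(high))

# Hypothetical field names, for illustration only.
print(solr_range("quantity", 10, 100))
print(solr_range("created", datetime(2010, 1, 21, tzinfo=timezone.utc)))
```

The resulting strings can be passed as the `q` or `fq` parameter of a search request.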

I would say that if you are just indexing simple text, and you don't have to remain tightly in sync with the database, use Solr.
If your data is more structured, and you have more complex read/write/synchronisation-with-db requirements, go with Compass.

In terms of using them, I find Solr config a bit confusing - and like most Apache projects it has a bazillion dependencies :)
Compass can be a bit confusing also (the names, Compass, GPS, etc. don't resonate well with me (sorry Shay)) - but if you have used Hibernate, Compass will feel very familiar.

-Nick

+ In response to earlier mails: Compass is not really anything like Hibernate Search. Admittedly I haven't touched HS in a while, but it seemed they completely missed the point of using Lucene (i.e. being able to satisfy search results & statistics via collectors super-super fast without hitting the DB)

+ There seems to be an implication that Compass won't scale as well as Solr - and I'm not sure that's true at all. They will both scale as well as the underlying Lucene.



 


===============================================================================
 Please access the attached hyperlink for an important electronic communications disclaimer:
 http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html 
 ===============================================================================
 
RE: Solr vs. Compass

Minutello, Nick
 
Oops!
>> Solr does a number of things that are really nice (that aren't really addressed by Solr)
I obviously meant:

"Solr does a number of things that are really nice (that aren't really addressed by Compass)"

-N



RE: Solr vs. Compass

Adamsky, Robert
In reply to this post by Minutello, Nick

> 2) Compass does a number of things really nicely (that afaik, isn't addressed by Solr)
> + Object-search engine mapping (great for structured data - i.e. not just text documents). I find writing the code that converts to/from a SolrDocument a bit annoying (but in my current project, the data is really simple). If you have many different kinds of things that you want to index ... Compass has an advantage.

SolrJ does have the ability to map annotated POJOs to and from Solr documents.
RE: Solr vs. Compass

Minutello, Nick

Actually, that's true. But IMO it's not that great :)
After fighting it for a bit, we gave up on it ... (maybe more of a
reflection of our capabilities rather than Solr's - but I'd like to think
we are somewhat competent)

-N

Re: Solr vs. Compass

Uri Boness
In reply to this post by Minutello, Nick
>
> There seems to be an implication that compass wont scale as well as solr - and I'm not sure that's true at all. They will both scale as well as the underlying Lucene.
Lucene doesn't handle distributed search or replication out of the box;
you have to implement it using some of its features (deletion policy,
etc.). Compass provides distributed index support, but mainly through
some grid solution (GigaSpaces, Oracle Coherence, Terracotta), many of
which are commercial products, or by using the JDBC Directory, which
doesn't perform very well. Even when using Terracotta I don't know of an
actual deployment which handles hundreds of millions of documents (do
you?), so it's hard to say how well it scales. Solr, on the other hand,
already provides a distributed search/replication mechanism which is proven
to work well on very large collections. But I do agree that if you don't
need to handle such large-scale deployments, Compass may still fit your
needs. If I had to choose between Compass and Hibernate Search, I
would definitely go for Compass (much more robust architecture, not
bound to ORM, much more customizable). Moreover, transaction
support and very frequent updates (as is the case with most Compass
deployments I've seen) are not always that scalable; it very much
depends on your collection (perhaps now, with the near real-time search
support in Lucene, it can be much better supported).
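
As a rough illustration of the distributed search mentioned above: in Solr a query is fanned out by adding a `shards` parameter listing the participating cores. The sketch below only builds such a URL. The host names and the helper `distributed_query_url` are hypothetical; `shards` itself is a standard Solr request parameter whose value is a comma-separated list of `host:port/path` entries without a scheme.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

def distributed_query_url(coordinator, shard_hosts, query):
    """Build a select URL that fans one query out to several Solr shards."""
    params = {
        "q": query,
        # comma-separated host:port/path entries, no "http://" prefix
        "shards": ",".join(shard_hosts),
    }
    return "http://%s/select?%s" % (coordinator, urlencode(params))

# Hypothetical shard locations, for illustration only.
shards = ["solr1:8983/solr", "solr2:8983/solr"]
url = distributed_query_url(shards[0], shards, "title:lucene")
print(url)
```

Any shard can act as the coordinator for a request; it queries the listed shards and merges their results. Replication is configured separately on the server side and is not shown here.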

> Solrj does have ability to write pojos and annotate them for mapping
> to/from solr.
This support is extremely limited compared to Compass. Compass can
really be seen as an ORM-like framework on top of Lucene... supporting
different types of relationships and aggregation in the domain model.
This is actually one of the big differentiators between Compass and
Solr... while in Solr the schema dictates the structure of the index, in
Compass it's the domain model that defines the structure.
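
The schema-versus-domain-model distinction can be pictured with a toy example. The sketch below is neither the Compass nor the SolrJ API; it is a plain Python analogy with made-up classes (`Article`, `Author`) and a made-up helper (`to_flat_document`) showing what a schema-driven index asks of the application: related objects have to be flattened by hand into the flat set of fields the schema defines.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Author:  # hypothetical domain classes, for illustration only
    name: str

@dataclass
class Article:
    id: str
    title: str
    authors: List[Author] = field(default_factory=list)

def to_flat_document(article):
    """Flatten an Article and its related Authors into one flat field map."""
    return {
        "id": article.id,
        "title": article.title,
        # The one-to-many relationship collapses into a multi-valued field.
        "author_name": [author.name for author in article.authors],
    }

doc = to_flat_document(
    Article("42", "Solr vs. Compass", [Author("Uri"), Author("Nick")])
)
print(doc)
```

In a domain-model-driven tool such as Compass, that flattening step is derived from the mapping on the classes themselves rather than written by hand for each type.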
Re: Solr vs. Compass

Otis Gospodnetic-2
In reply to this post by Ken Lane (kenlane)
Hi Ken,

Based on this, Solr sounds like the way to go.

 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



RE: Solr vs. Compass

Minutello, Nick
In reply to this post by Uri Boness

Agree with everything you said.



RE: Solr vs. Compass

Minutello, Nick
In reply to this post by Otis Gospodnetic-2
 
I would tend to agree.

-----Original Message-----
From: Otis Gospodnetic [mailto:[hidden email]]
Sent: 22 January 2010 05:18
To: [hidden email]
Subject: Re: Solr vs. Compass

Hi Ken,

Based on this, Solr sounds like the way to go.

 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



----- Original Message ----

> From: Ken Lane (kenlane) <[hidden email]>
> To: [hidden email]
> Sent: Thu, January 21, 2010 12:07:56 PM
> Subject: RE: Solr vs. Compass
>
> Uri, Lucas,
>
> Thanks for your feedback. To clarify on some specifics,
>
> 1. Yes, faceted search and DisMax are very imortant to this project.
> 2. Our data is imported from Oracle tables. (Unstructured sources maybe later).
> We manufacture each document from DB queries.
> 3. Our platform won't be transactional, we will update the indexes
> periodically throughout the day probably via dataimport handler.
>
> Regards, Ken
>
> -----Original Message-----
> From: Uri Boness [mailto:[hidden email]]
> Sent: Thursday, January 21, 2010 11:35 AM
> To: [hidden email]
> Subject: Re: Solr vs. Compass
>
> In addition, the biggest appealing feature in Compass is that it's
> transactional and therefore integrates well with your infrastructure
> (Spring/EJB, Hibernate, JPA, etc...). This obviously is nice for some
> systems (not very large scale ones) and the programming model is clean.
> On the other hand, Solr scales much better and provides a load of
> functionality that otherwise you'll have to custom build on top of
> Compass/Lucene.



RE: Solr vs. Compass

Fuad Efendi
In reply to this post by Uri Boness
Yes, "transactional", I tried it: do we really need "transactional"? Even if "commit" takes 20 minutes?
It's their "selling point", nothing more.
HBase is not transactional, and it has a specific use case; each tool has its specific use case... in some cases Compass is the best!

Also, note that Compass (Hibernate) ((RDBMS)) uses specific "business domain model" terms with relationships; there is a huge overhead in converting "relational" into "object-oriented" (what for? Any advantages?)... Lucene does it behind the scenes: you don't have to worry that the field value "USA" (3 characters) is repeated in a few million documents, and the value "Canada" (6 characters) in another few; nothing "relational" is needed, it's done automatically without any Compass/Hibernate/table(s).


Don't think "relational".

I wrote this 2 years ago:
http://www.theserverside.com/news/thread.tss?thread_id=50711#272351


Fuad Efendi
+1 416-993-2060
http://www.tokenizer.ca/





RE: Solr vs. Compass

Fuad Efendi
Of course, I understand what "transaction" means; have you guys thought about what may happen if we transfer $123.45 from one bank account to another, and MySQL forgets to index a "decimal" column during the transaction, or the DBA was careless and forgot to create an index? Absolutely nothing.

Why embed "indexing" as a transaction dependency? Extremely weird idea. But I understand some selling points...


SOLR: it is faster than Lucene. Filtered queries run faster than traditional "AND" queries! And this is a real selling point.
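
One way to read the filtered-query claim: Solr keeps the set of documents matching a filter (fq) in its filter cache, so a repeated filter is intersected with the main query's matches instead of being evaluated again. Below is a minimal sketch of that idea in plain Java. It is a toy model, not Solr or Lucene code; the class, the method names and the "country:USA" key are all made up for illustration.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Toy model only: names are hypothetical, this is not Solr or Lucene API.
class FilterCacheSketch {
    // Cached document-id sets, keyed by the filter expression.
    private final Map<String, BitSet> cache = new HashMap<>();
    private int computations = 0;

    // Returns the doc set for a filter, computing it only on first use.
    BitSet filter(String key, Supplier<BitSet> compute) {
        BitSet docs = cache.get(key);
        if (docs == null) {
            docs = compute.get();
            computations++;
            cache.put(key, docs);
        }
        return docs;
    }

    // Intersects the main query's matches with the (possibly cached) filter set.
    BitSet search(BitSet queryMatches, String filterKey, Supplier<BitSet> compute) {
        BitSet result = (BitSet) queryMatches.clone();
        result.and(filter(filterKey, compute));
        return result;
    }

    // How many times a filter actually had to be evaluated.
    int computations() {
        return computations;
    }

    // Helper: builds a BitSet from document ids.
    static BitSet bits(int... ids) {
        BitSet b = new BitSet();
        for (int id : ids) {
            b.set(id);
        }
        return b;
    }

    public static void main(String[] args) {
        FilterCacheSketch searcher = new FilterCacheSketch();
        Supplier<BitSet> countryUsa = () -> bits(1, 2, 5, 8);
        System.out.println(searcher.search(bits(2, 3, 5), "country:USA", countryUsa));
        System.out.println(searcher.search(bits(1, 9), "country:USA", countryUsa));
        System.out.println("filter evaluated " + searcher.computations() + " time(s)");
    }
}
```

The second search reuses the cached set, which is where the saving comes from; a plain "AND" clause would be re-evaluated as part of every query.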



Thanks,

Fuad Efendi
+1 416-993-2060
http://www.linkedin.com/in/liferay

Tokenizer Inc.
http://www.tokenizer.ca/
Data Mining, Vertical Search





Re: Solr vs. Compass

Uri Boness
Wow...

Well, transactional or "transactional", whether it's a nice feature to
have or just a "selling point": bottom line, for some applications
Compass can be very appealing, for others Solr will be the choice. In the
last several years I've integrated both in different applications and
gained from both. Do your own math based on your needs, requirements and
personal preferences. But if someone asks a question, then it's always
nice to get several opinions from different experiences.

peace,
Uri


RE: Solr vs. Compass

Minutello, Nick
In reply to this post by Fuad Efendi

>> Why to embed "indexing" as a transaction dependency? Extremely weird idea.
There is nothing weird about different use cases requiring different approaches....

If you're just thinking documents and text search ... then it's less of an issue.
If you have an online application where the indexing is being used to drive certain features (not just search), then the transactionality is quite useful.

>> Even if "commit" takes 20 minutes?
>> It's their "selling point" nothing more.
"They" are not "selling" anything. You will find it's an open-source project, and the main guy is quite a smart guy.
I've never seen a commit take 20 minutes... (anything taking that long is broken, perhaps in concept)

>> Also, note that Compass (Hibernate) ((RDBMS)) use specific "business domain model" terms
>> with relationships; huge overhead to convert "relational" into "object-oriented" (why for? Any advantages?)
Perhaps the pros and cons of Object Relational Mapping are for another forum?

There is naturally some overhead in Compass's OSEM - but it makes your life easier if you work with the same domain model irrespective of whether something comes from the database or comes from the index. Typically we serve search results from the index - and when clicking on one of the results, we load from the db to get the master copy. Moreover, the OSEM does an excellent job of flattening & indexing whatever object hierarchy exists into the flat Lucene document - this is great for Google-style searches where you want to find, e.g., some product using _anything_ you can remember about it. Using Lucene or Solr, you have to write the code that constructs the Lucene document from the object & its relationships. Not an issue if you have a small number of entity types, but rather a pita if you have dozens... Or you eventually write some reflection-based thing... (i.e. you begin to write a poor man's implementation of Compass OSEM)
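
To make the "reflection-based thing" concrete, here is a rough sketch in plain Java of flattening an object and its relationships into a flat map of field names to values, plus a catch-all "all" field for the "anything you can remember" style of search. This is not how Compass OSEM is implemented; Product, Manufacturer, the dotted field names and the "all" field are assumptions made up for the example, and the sketch does not handle collections or cyclic object graphs.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a naive stand-in for object-to-document mapping.
class FlattenSketch {
    // Hypothetical domain classes.
    static class Manufacturer {
        String name;
        String country;
        Manufacturer(String name, String country) {
            this.name = name;
            this.country = country;
        }
    }

    static class Product {
        String title;
        int year;
        Manufacturer maker;
        Product(String title, int year, Manufacturer maker) {
            this.title = title;
            this.year = year;
            this.maker = maker;
        }
    }

    // Flattens an object graph into "field path" -> values, like a flat document.
    static Map<String, List<String>> flatten(Object root) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        collect(root, "", doc);
        return doc;
    }

    private static void collect(Object obj, String prefix, Map<String, List<String>> doc) {
        if (obj == null) {
            return;
        }
        for (Field f : obj.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue;
            }
            Object value;
            try {
                f.setAccessible(true);
                value = f.get(obj);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read field " + f.getName(), e);
            }
            if (value == null) {
                continue;
            }
            String name = prefix + f.getName();
            if (value instanceof String || value instanceof Number || value instanceof Boolean) {
                // Simple value: index it under its own name and in the catch-all field.
                doc.computeIfAbsent(name, k -> new ArrayList<>()).add(value.toString());
                doc.computeIfAbsent("all", k -> new ArrayList<>()).add(value.toString());
            } else {
                // Related object: recurse and prefix its fields with the relation name.
                collect(value, name + ".", doc);
            }
        }
    }

    public static void main(String[] args) {
        Product p = new Product("Trail Bike", 2009, new Manufacturer("Acme", "Germany"));
        System.out.println(flatten(p));
    }
}
```

Even this small version already has to decide how to name nested fields and what counts as a simple value, which is the kind of work a mapping layer takes off your hands once there are dozens of entity types.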

As mentioned before, they address different kinds of problems....

-Nick

 



RE: Solr vs. Compass

Fuad Efendi
> >> Why to embed "indexing" as a transaction dependency? Extremely weird idea.
> There is nothing weird about different use cases requiring different
> approaches....
>
> If you're just thinking documents and text search ... then its less of
> an issue.
> If you have an online application where the indexing is being used to
> drive certain features (not just search), then the transactionality is
> quite useful.


I mean:
- A Primary Key Constraint in an RDBMS is not the same as an index
- Index in an RDBMS: data is still searchable, even if we don't have an index

Are you sure that an index in an RDBMS is part of the transaction in current
implementations from Oracle, IBM, Sun? I never heard of such stuff; there are
no such requirements for transactions. I am talking about transactions and
referential integrity, and not about an indexed, non-tokenized, single-valued
field such as "Social Insurance Number". Indexing could be done asynchronously,
outside of the transaction; I can't imagine a use case where it must be done
inside the transaction, failing the transaction when it can't be done.

"Primary Key Constraint" is a different use case; it is not necessarily
indexing of data. Especially for Hibernate, where we mostly use surrogate
auto-generated keys.
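
The "asynchronously outside of the transaction" approach can be sketched roughly as follows in plain Java. The two maps stand in for the database and the search index, and save and drainToIndex are hypothetical names; a real setup would use an actual database plus a scheduled job or a data import step rather than in-memory maps.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model: in-memory maps stand in for the RDBMS and the search index.
class AsyncIndexSketch {
    final Map<String, String> database = new HashMap<>();
    final Map<String, String> searchIndex = new HashMap<>();
    // Ids of records that changed since the last indexing run.
    private final Queue<String> pendingIds = new ArrayDeque<>();

    // The "transaction": only the database write is on the critical path.
    void save(String id, String text) {
        database.put(id, text);
        pendingIds.add(id);
    }

    // Run later, e.g. periodically; returns how many queued changes were indexed.
    int drainToIndex() {
        int indexed = 0;
        String id;
        while ((id = pendingIds.poll()) != null) {
            searchIndex.put(id, database.get(id));
            indexed++;
        }
        return indexed;
    }

    public static void main(String[] args) {
        AsyncIndexSketch app = new AsyncIndexSketch();
        app.save("42", "Solr vs. Compass");
        System.out.println("searchable before drain: " + app.searchIndex.containsKey("42"));
        app.drainToIndex();
        System.out.println("searchable after drain: " + app.searchIndex.containsKey("42"));
    }
}
```

The trade-off is visible in the gap between save and drainToIndex: the record is stored but not yet searchable, which is fine for periodic updates and a problem only if the application relies on the index being current.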

 
-Fuad



RE: Solr vs. Compass

Fuad Efendi
In reply to this post by Minutello, Nick

> >> Even if "commit" takes 20 minutes?
> I've never seen a commit take 20 minutes... (anything taking that long
> is broken, perhaps in concept)


An "index merge" can take from a few minutes to a few hours. That's why nothing
can beat SOLR Master/Slave and sharding for huge datasets. And reopening an
IndexReader after each commit may take at least a few seconds (although it
depends on usage patterns).

"IndexReader or IndexSearcher will only see the index as of the "point in
time" that it was opened. Any changes committed to the index after the
reader was opened are not visible until the reader is re-opened."
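
The point-in-time behaviour described in that quote can be modelled in a few lines of plain Java. This is a toy model and not Lucene code: ToyIndex and ToyReader are hypothetical names, and a commit here simply publishes an immutable copy of the document list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model of point-in-time readers; not Lucene API.
class SnapshotSketch {
    static class ToyIndex {
        private final List<String> pending = new ArrayList<>();
        private List<String> committed = Collections.emptyList();

        // Buffers a document; it is not visible to any reader yet.
        void add(String doc) {
            pending.add(doc);
        }

        // Publishes buffered documents as a new immutable snapshot.
        void commit() {
            List<String> next = new ArrayList<>(committed);
            next.addAll(pending);
            pending.clear();
            committed = Collections.unmodifiableList(next);
        }

        // A reader is bound to the snapshot that was current when it was opened.
        ToyReader openReader() {
            return new ToyReader(committed);
        }
    }

    static class ToyReader {
        private final List<String> snapshot;

        ToyReader(List<String> snapshot) {
            this.snapshot = snapshot;
        }

        int numDocs() {
            return snapshot.size();
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("doc-1");
        index.commit();
        ToyReader early = index.openReader();
        index.add("doc-2");
        index.commit();
        ToyReader late = index.openReader();
        System.out.println("early reader sees " + early.numDocs() + " doc(s)");
        System.out.println("late reader sees " + late.numDocs() + " doc(s)");
    }
}
```

In the toy model opening a reader is free; in a real index it is not, which is why how often readers are reopened matters for the cost of making each commit visible.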


I am wondering how Compass opens a new instance of IndexReader (after each
commit!) - is it really implemented? I can't believe it! It will probably work
fine for small datasets (less than 100k), and 1 TPD (transaction-per-day)...
 

Very expensive and unnatural ACID...


-Fuad



RE: Solr vs. Compass

Minutello, Nick
In reply to this post by Fuad Efendi
Sorry, you have completely lost me :/

In simple terms, there are times when you want the primary storage
(database) and the Lucene index to be in sync - and updated atomically.
It all depends on the kind of application.
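
As a rough illustration of "updated atomically", here is a minimal all-or-nothing sketch in plain Java: if the index write fails, the database write is undone. It uses simple compensation on in-memory maps and is not how Compass or a real transaction manager does it; Store, saveBoth and the failNextWrite switch are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: Store stands in for both the database and the Lucene index.
class AtomicUpdateSketch {
    static class Store {
        final Map<String, String> data = new HashMap<>();
        boolean failNextWrite = false; // switch to simulate a failed write

        void put(String id, String value) {
            if (failNextWrite) {
                failNextWrite = false;
                throw new IllegalStateException("simulated write failure");
            }
            data.put(id, value);
        }
    }

    // Writes to both stores, or leaves the database as it was if the index write fails.
    static boolean saveBoth(Store db, Store index, String id, String value) {
        boolean existed = db.data.containsKey(id);
        String previous = db.data.get(id);
        db.put(id, value);
        try {
            index.put(id, value);
            return true;
        } catch (RuntimeException e) {
            // Compensate: restore the previous database state.
            if (existed) {
                db.data.put(id, previous);
            } else {
                db.data.remove(id);
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Store db = new Store();
        Store index = new Store();
        System.out.println("first save ok: " + saveBoth(db, index, "1", "old"));
        index.failNextWrite = true;
        System.out.println("second save ok: " + saveBoth(db, index, "1", "new"));
        System.out.println("database still holds: " + db.data.get("1"));
    }
}
```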

-N




