[Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download


nnagarajayya
Hi!

I am very excited to announce the availability of Solr 4.0-ALPHA with
RankingAlgorithm 1.4.4 with Realtime NRT. The Realtime NRT
implementation now supports both RankingAlgorithm and Lucene. Realtime
NRT is a high-performance NRT implementation that is more granular than
soft commit. Update performance is about 70,000 documents/sec*.
You can also scale up to 2 billion documents* in a single core, and
query a half-billion-document index in milliseconds**.

RankingAlgorithm 1.4.4 supports the entire Lucene Query Syntax, +/- and/or
boolean queries, and is compatible with the new Lucene 4.0-ALPHA API.

You can get more information about Solr 4.0-ALPHA with RankingAlgorithm
1.4.4 Realtime performance from here:
http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver_4.x

You can download Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 from here:
http://solr-ra.tgels.org

Please download and give the new version a try.

Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org

* performance seen at a user installation of Solr 4.0 with
RankingAlgorithm 1.4.3
** performance seen when using the age feature


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

project2501
What exactly is "Realtime NRT" (Near Real Time)?

On Sun, 2012-07-22 at 14:07 -0700, Nagendra Nagarajayya wrote:

> Hi!
>
> I am very excited to announce the availability of Solr 4.0-ALPHA with
> RankingAlgorithm 1.4.4 with Realtime NRT. The Realtime NRT
> implementation now supports both RankingAlgorithm and Lucene. Realtime
> NRT is a high performance and more granular NRT implementation as to
> soft commit. The update performance is about 70,000 documents / sec*.
> You can also scale up to 2 billion documents* in a single core, and
> query half a billion documents index in ms**.
>
> RankingAlgorithm 1.4.4 supports the entire Lucene Query Syntax, +/- and/or
> boolean queries and is compatible with the new Lucene 4.0-ALPHA api.
>
> You can get more information about Solr 4.0-ALPHA with RankingAlgorithm
> 1.4.4 Realtime performance from here:
> http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver_4.x
>
> You can download Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 from here:
> http://solr-ra.tgels.org
>
> Please download and give the new version a try.
>
> Regards,
>
> Nagendra Nagarajayya
> http://solr-ra.tgels.org
> http://rankingalgorithm.tgels.org
>
> * performance seen at a user installation of Solr 4.0 with
> RankingAlgorithm 1.4.3
> ** performance seen when using the age feature
>



Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

nnagarajayya
Realtime NRT is an NRT implementation available for Solr 1.4.1 through
Solr 4.0. To enable NRT, it makes an NRTIndexReader available to the
IndexSearcher for searching the index. It does not close the
SolrIndexSearcher, which is a very heavy object with caches, etc., to do
this. Since the searcher is never closed, it always uses the most recent
NRTIndexReader for searching, and you get a pipe that is always filled
with newly updated documents. The code changes handle this dynamic
pipe, which may always have something new, as in a realtime system.

Realtime NRT is different from soft commit in that it does not close the
SolrIndexSearcher object every 1000 ms, invalidating the caches, etc.
SolrIndexSearcher is a very heavy, reference-counted object with caches,
etc. Closing it every time may turn out to be expensive.

I am contributing Realtime NRT to Solr 4.0 and am working on making a
patch available.

Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org


On 7/22/2012 2:03 PM, Darren Govoni wrote:

> What exactly is "Realtime NRT" (Near Real Time)?



Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

Mark Miller-3
In reply to this post by nnagarajayya
These emails from Nagendra are very confusing. I've asked him in the past to be explicit about his announce and make it clear that it is an external project.

Since I don't think he has changed how he does announce since that request, allow me to help out:

Please note: This project has nothing to do with Apache. It is a completely external project that apparently uses Apache Solr.

It's not supported by or endorsed by Apache or the Lucene/Solr projects. It's simply a project that an external user is promoting on the Solr mailing list.

- Mark Miller
lucidimagination.com


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

Bernd Fehling
+1

What would happen if ALL external projects using Lucene and/or Solr announced on this list
that they had moved up to the next release after every release change?

Also, "Realtime NRT": if NRT stands for "Near_Real_Time", he has a "Realtime Near_Real_Time" algorithm.

Regards,
Bernd


On 23.07.2012 14:09, Mark Miller wrote:

> These emails from Nagendra are very confusing. I've asked him in the past to be explicit about his announce and make it clear that it is an external project.
>
> Since I don't think he has changed how he does announce since that request, allow me to help out:
>
> Please note: This project has nothing to do with Apache. It is a completely external project that apparently uses Apache Solr.
>
> It's not supported by or endorsed by Apache or the Lucene/Solr projects. It's simply a project that an external user is promoting on the Solr mailing list.
>
> - Mark Miller
> lucidimagination.com


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

Stefan Moises
+1, too... very confusing announcements, both because of the "official"
sounding posts and also the "double-realtime" name :P
And he also says in a follow-up post "I am contributing Realtime NRT to
Solr 4.0...", which sounds like this feature will be available in the
official 4.x Solr release, which makes it even more confusing.

The project itself sounds cool, though.

Cheers,
Stefan
On 23.07.2012 16:01, Bernd Fehling wrote:

> +1
>
> What would be if ALL external projects using lucene and/or solr are announcing on this list
> that they have stepped up to the next higher release after a release change?
>
> Also "Realtime NRT", if NRT stands for "Near_Real_Time" he has a "Realtime Near_Real_Time" Algorithm.
>
> Regards,
> Bernd

--
With best regards from Nuremberg,
Stefan Moises

*******************************************
Stefan Moises
Senior Software Developer
Head of Module Development

shoptimax GmbH
Guntherstraße 45 a
90461 Nürnberg
Nuremberg District Court (Amtsgericht Nürnberg), HRB 21703
Managing Director: Friedrich Schreieck

Tel.: 0911/25566-0
Fax:  0911/25566-29
[hidden email]
http://www.shoptimax.de
*******************************************



Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

nnagarajayya
In reply to this post by Mark Miller-3
I like Mark's suggestion of marking the announcement as an external
project. Will add that to future announcements.

Regarding the announcement itself, Apache Solr with RankingAlgorithm has
made NRT functionality available to Apache Solr since version 1.4.1.
There were lots of requests/JIRAs for this functionality (some time back)
that had not been addressed in Solr. So announcing on this list, to let
everyone in the community know that this functionality is available with
Apache Solr, is the right way to do it, right? The whole list is made up
of developers who are using Apache Solr and who are interested in
hearing about Apache Solr related stuff. I am not sure why anyone would
be offended by an announcement that NRT functionality was available
with older releases. Apache Solr 4.0 does support NRT functionality now
with soft commit, but Realtime NRT is another way of providing the
realtime functionality (much faster than soft commit). The spirit of the
Apache Software Foundation is for innovation to come in not only from
organized groups such as Apache Solr or Apache Lucene but also from
individuals, small businesses, or even large, well-funded businesses. The
ASF license also promotes that innovation may not be masked and provides
ways to bundle closed source with open source. Apache Solr with
RankingAlgorithm is available for free to everyone. It will provide
innovative ways to search that may not be available with regular Apache
Solr. So I think it is fair to announce a new release on the Apache Solr
mailing list.

This announcement was made as Apache Solr 4.0-ALPHA is a major milestone
Solr release. This would be similar to Python support for Apache Solr or
other announcements related to Apache Solr being announced on this list.


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org



On 7/23/2012 5:09 AM, Mark Miller wrote:

> These emails from Nagendra are very confusing. I've asked him in the past to be explicit about his announce and make it clear that it is an external project.
>
> Since I don't think he has changed how he does announce since that request, allow me to help out:
>
> Please note: This project has nothing to do with Apache. It is a completely external project that apparently uses Apache Solr.
>
> It's not supported by or endorsed by Apache or the Lucene/Solr projects. It's simply a project that an external user is promoting on the Solr mailing list.
>
> - Mark Miller
> lucidimagination.com


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

nnagarajayya
In reply to this post by Bernd Fehling
Thanks Bernd! Apache Solr 4.0-ALPHA is a major Solr milestone release,
so I think you will find lots of announcements related to it, like
Python support, etc. The Apache Solr with RankingAlgorithm release is
similar.

Realtime NRT is an innovative way to provide NRT functionality to Solr.
Realtime is the name of the tag used in solrconfig.xml to turn on this
functionality. I had not named the previous releases but decided to name
it from this release on, to differentiate the NRT functionality from
the one provided by soft commit. The Realtime NRT algorithm enables NRT
functionality in Solr by not closing the Searcher object, and so is very
fast. I am in the process of contributing the algorithm back to Apache
Solr as a patch.

Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org



On 7/23/2012 7:01 AM, Bernd Fehling wrote:

> +1
>
> What would be if ALL external projects using lucene and/or solr are announcing on this list
> that they have stepped up to the next higher release after a release change?
>
> Also "Realtime NRT", if NRT stands for "Near_Real_Time" he has a "Realtime Near_Real_Time" Algorithm.
>
> Regards,
> Bernd


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

Yonik Seeley-2-2
On Mon, Jul 23, 2012 at 11:37 AM, Nagendra Nagarajayya
<[hidden email]> wrote:
> Realtime NRT algorithm enables NRT functionality in
> Solr by not closing the Searcher object  and so is very fast. I am in the
> process of contributing the algorithm back to Apache Solr as a patch.

Since you're in the process of contributing this back, perhaps you
could explain your approach - it never made sense to me.

Replacing the reader in an existing SolrIndexSearcher as you do means
that all the related caches will be invalid (meaning you can't use
solr's caches).  You could just ensure that there is no auto-warming
set up for Solr's caches (which is now the default), or you could
disable caching altogether.  It's not clear what you're comparing
against when you claim it's faster.
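
The cache concern above can be illustrated with a minimal Java sketch. This is not Solr code: StaleCacheDemo, swapReader, countCached and countUncached are hypothetical names, and the "reader" is modelled as a plain list of strings. It only shows the general effect of keeping a result cache alive while the view underneath it changes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a long-lived searcher whose reader is swapped under its cache. */
final class StaleCacheDemo {
    // Stand-in for the current reader: the documents visible right now.
    private List<String> currentDocs;
    // Stand-in for a query-result cache owned by the long-lived searcher.
    private final Map<String, Long> resultCache = new HashMap<>();

    StaleCacheDemo(List<String> docs) {
        this.currentDocs = docs;
    }

    // Swap in a newer view of the index; the cache is deliberately left untouched.
    void swapReader(List<String> newDocs) {
        this.currentDocs = newDocs;
    }

    // Always evaluates against the current view.
    long countUncached(String term) {
        return currentDocs.stream().filter(d -> d.contains(term)).count();
    }

    // Returns a cached answer if one exists, even if it was computed on an older view.
    long countCached(String term) {
        return resultCache.computeIfAbsent(term, this::countUncached);
    }
}
```

In this model, an entry cached before the swap keeps being served afterwards, so the cache either has to be cleared on every swap or switched off.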

There are also consistency and concurrency issues with replacing the
reader in an existing SolrIndexSearcher, which is supposed to have a
static view of the index.  If a reader replacement happens in the
middle of a request, it's bound to cause trouble, including returning
the wrong documents!
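
A minimal Java sketch of the "wrong documents" scenario follows. It is a toy model, not Solr or Lucene code: MidRequestSwapDemo and its methods are hypothetical, and the reader is a list whose positions play the role of internal document IDs. It relies on the fact that such IDs are only meaningful relative to the reader that produced them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical two-phase request that looks up the current reader in each phase. */
final class MidRequestSwapDemo {
    // Stand-in for the current reader: list position acts as the internal doc ID.
    static final AtomicReference<List<String>> reader =
        new AtomicReference<>(Arrays.asList("apple", "banana", "cherry"));

    // Phase 1 (query): find the internal doc ID of the match in the current reader.
    static int findDocId(String term) {
        return reader.get().indexOf(term);
    }

    // Phase 2 (response writing): resolve the doc ID against whatever reader is current now.
    static String fetch(int docId) {
        return reader.get().get(docId);
    }

    // Runs both phases, with a hook in between to simulate a concurrent update.
    static String runRequest(String term, Runnable betweenPhases) {
        int docId = findDocId(term);
        betweenPhases.run();
        return fetch(docId);
    }
}
```

If the hook replaces the list with one in which "banana" has moved to a different position, the ID found in phase 1 resolves to a different document in phase 2.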

-Yonik
http://lucidimagination.com

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

Mark Miller-3
In reply to this post by nnagarajayya

On Jul 23, 2012, at 11:27 AM, Nagendra Nagarajayya wrote:

> I am not sure why any one will get offended by an announcement that NRT functionality was available with older releases.

FWIW, I'm not offended - I don't mind if third parties post announcements if they are related to Solr.

I just want to make sure it's very clear that it's a third party announce so there is no confusion - people that don't follow the lists on a daily basis read these things. A lot of these emails end up archived on various sites that collect mailing lists. It's easy to run into them without the proper context.

I think part of the confusion is the naming. Technically, Apache does not allow the use of Apache marks as part of a third party name. Instead, the name should be something like "Product X, powered by Solr"

See http://www.apache.org/foundation/marks/faq/#products

- Mark Miller
lucidimagination.com












Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

nnagarajayya
In reply to this post by Yonik Seeley-2-2
Hi Yonik:

Please see my comments below:

On 7/23/2012 8:52 AM, Yonik Seeley wrote:

> On Mon, Jul 23, 2012 at 11:37 AM, Nagendra Nagarajayya
> <[hidden email]>  wrote:
>> Realtime NRT algorithm enables NRT functionality in
>> Solr by not closing the Searcher object  and so is very fast. I am in the
>> process of contributing the algorithm back to Apache Solr as a patch.
> Since you're in the process of contributing this back, perhaps you
> could explain your approach - it never made sense to me.
>
> Replacing the reader in an existing SolrIndexSearcher as you do means
> that all the related caches will be invalid (meaning you can't use
> solr's caches).  You could just ensure that there is no auto-warming
> set up for Solr's caches (which is now the default), or you could
> disable caching altogether.  It's not clear what you're comparing
> against when you claim it's faster.

Solr with RankingAlgorithm does not replace the reader in the
SolrIndexSearcher object. All it does is override the
IndexSearcher.getIndexReader() method so as to supply an NRTReader if
realtime is enabled. All direct references to the "reader" member have
been replaced with getIndexReader() method calls.

The performance is better because SolrIndexSearcher is not closed every
1 sec, as it is with soft commit. SolrIndexSearcher is a heavy object with
caches, etc., and is reference counted. So every 1 sec this object needs
to be closed and re-allocated, the indexes need to be re-opened, and the
caches invalidated, while waiting for existing searchers to complete,
making this very expensive. Realtime NRT does not close the
SolrIndexSearcher object but makes available a new NRTReader with
document updates, i.e., getIndexReader() returns a new NRTReader.

> There are also consistency and concurrency issues with replacing the
> reader in an existing SolrIndexSearcher, which is supposed to have a
> static view of the index.  If a reader replacement happens in the
> middle of a request, it's bound to cause trouble, including returning
> the wrong documents!

The reader member is not replaced in the existing SolrIndexSearcher
object. The IndexSearcher.getIndexReader() method has been overridden in
SolrIndexSearcher, and all direct reader member access has been replaced
with a getIndexReader() method call, allowing an NRT reader to be supplied
when realtime is enabled. The concurrency is handled by the
getNRTReader() method, with the static index view now increased to the
granularity provided by the NRTIndexReader.
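
The accessor-override pattern described above could look roughly like the following Java sketch. It is only an illustration under assumptions, not the actual patch: IndexView, BaseSearcher, RealtimeSearcher and visibleGeneration are hypothetical names, and only the getIndexReader() method name is taken from the description.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for an index reader; it only carries a generation number. */
final class IndexView {
    final long generation;

    IndexView(long generation) {
        this.generation = generation;
    }
}

/** Hypothetical base searcher that holds one fixed reader. */
class BaseSearcher {
    protected final IndexView reader;

    BaseSearcher(IndexView reader) {
        this.reader = reader;
    }

    IndexView getIndexReader() {
        return reader;
    }

    // Goes through the accessor instead of touching the field directly.
    long visibleGeneration() {
        return getIndexReader().generation;
    }
}

/** Hypothetical subclass: the field is never reassigned, only the accessor is overridden. */
final class RealtimeSearcher extends BaseSearcher {
    private final boolean realtime;
    private final Supplier<IndexView> nrtReaders;

    RealtimeSearcher(IndexView initial, boolean realtime, Supplier<IndexView> nrtReaders) {
        super(initial);
        this.realtime = realtime;
        this.nrtReaders = nrtReaders;
    }

    @Override
    IndexView getIndexReader() {
        // With realtime enabled, each call may hand back a newer view.
        return realtime ? nrtReaders.get() : reader;
    }
}
```

With realtime switched off the subclass behaves like the base class; with it switched on, every code path that goes through getIndexReader() sees whatever the supplier returns at that moment.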


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org

> -Yonik
> http://lucidimagination.com
>
>



Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

Yonik Seeley-2-2
On Tue, Jul 24, 2012 at 8:24 AM, Nagendra Nagarajayya
<[hidden email]> wrote:
> SolrIndexSearcher is a heavy object with caches, etc.

As I've said, the caches are configurable, and it's trivial to disable
all caching (to the point where the cache objects are not even
created).

> The reader member is not replaced in the existing SolrIndexSearcher object.
> The IndexSearcher.getIndexReader() method has been overriden in
> SolrIndexSearcher and all direct reader member access has been replaced with
> a getIndexReader() method call allowing a NRT reader to be supplied when
> realtime is enabled.

In a single Solr request (that runs through multiple components like
query, highlight, facet, and response writing),
does IndexSearcher.getIndexReader() always return the same reader?  If
not, this breaks pretty much every standard solr component - but it
will only be apparent under load, and if you are carefully sanity
checking the results.
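
One way to get a "yes" to that question is to acquire the view once per request and pass it to every component. The Java sketch below is a hedged illustration with hypothetical names (PinnedRequest, findDocId, fetch); it models the reader as a list whose positions act as internal doc IDs and is not how either Solr or the external project is implemented.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical request object that pins one view of the index for its whole lifetime. */
final class PinnedRequest {
    // Captured once at construction; later swaps of the shared reference are not seen.
    private final List<String> pinned;

    PinnedRequest(AtomicReference<List<String>> latest) {
        this.pinned = latest.get();
    }

    // Query phase: internal doc ID is the position within the pinned view.
    int findDocId(String term) {
        return pinned.indexOf(term);
    }

    // Response phase: resolves the ID against the same pinned view.
    String fetch(int docId) {
        return pinned.get(docId);
    }
}
```

Every phase of one request then works against the same snapshot, while a request created after an update picks up the newer view.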

-Yonik
http://lucidimagination.com

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

nnagarajayya
In reply to this post by Mark Miller-3
Thanks Mark! I am already working with Apache Software Foundation on the
mark and am using the correct usage of the mark as suggested by them.

Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org


On 7/23/2012 12:15 PM, Mark Miller wrote:

> On Jul 23, 2012, at 11:27 AM, Nagendra Nagarajayya wrote:
>
>> I am not sure why any one will get offended by an announcement that NRT functionality was available with older releases.
> FWIW, I'm not offended - I don't mind if third parties post announcements if they are related to Solr.
>
> I just want to make sure it's very clear that it's a third party announce so there is no confusion - people that don't follow the lists on a daily basis read these things. A lot of these emails end up archived on various sites that collect mailing lists. It's easy to run into them without the proper context.
>
> I think part of the confusion is the naming. Technically, Apache does not allow the use of Apache marks as part of a third party name. Instead, the name should be something like "Product X, powered by Solr"
>
> See http://www.apache.org/foundation/marks/faq/#products
>
> - Mark Miller
> lucidimagination.com


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

Andy-152
In reply to this post by nnagarajayya
Nagendra,

Does RankingAlgorithm work with faceting which requires the use of cache? As new documents are added or updated, the cache will be constantly invalidated. So how would RankingAlgorithm work in this case?



Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

nnagarajayya
Yes, faceting works as before. Regarding the cache, the suggestion is to
disable the cache for realtime NRT, for now.

Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org


On 7/24/2012 2:57 PM, Andy wrote:

> Nagendra,
>
> Does RankingAlgorithm work with faceting which requires the use of cache? As new documents are added or updated, the cache will be constantly invalidated. So how would RankingAlgorithm work in this case?
>
>
> ________________________________
>   From: Nagendra Nagarajayya<[hidden email]>
> To: [hidden email]
> Sent: Tuesday, July 24, 2012 8:24 AM
> Subject: Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download
>
> Hi Yonik:
>
> Please see my comments below:
>
> On 7/23/2012 8:52 AM, Yonik Seeley wrote:
>> On Mon, Jul 23, 2012 at 11:37 AM, Nagendra Nagarajayya
>> <[hidden email]>   wrote:
>>> Realtime NRT algorithm enables NRT functionality in
>>> Solr by not closing the Searcher object  and so is very fast. I am in the
>>> process of contributing the algorithm back to Apache Solr as a patch.
>> Since you're in the process of contributing this back, perhaps you
>> could explain your approach - it never made sense to me.
>>
>> Replacing the reader in an existing SolrIndexSearcher as you do means
>> that all the related caches will be invalid (meaning you can't use
>> solr's caches).  You could just ensure that there is no auto-warming
>> set up for Solr's caches (which is now the default), or you could
>> disable caching altogether.  It's not clear what you're comparing
>> against when you claim it's faster.
> Solr with RankingAlgorithm does not replace the reader in the SolrIndexSearcher object. All it does is override the IndexSearcher.getIndexReader() method so as to supply an NRTReader if realtime is enabled. All direct references to the "reader" member have been replaced with getIndexReader() method calls.
>
> The performance is better because SolrIndexSearcher is not closed every 1 sec as it is with soft commit. SolrIndexSearcher is a heavy object with caches, etc., and is reference counted. With soft commit, every 1 sec this object needs to be closed and re-allocated, the indexes need to be re-opened, and the caches invalidated, all while waiting for existing searchers to complete, which makes it very expensive. Realtime NRT does not close the SolrIndexSearcher object but makes available a new NRTReader with the document updates, i.e. getIndexReader() returns a new NRTReader.
>
>> There are also consistency and concurrency issues with replacing the
>> reader in an existing SolrIndexSearcher, which is supposed to have a
>> static view of the index.  If a reader replacement happens in the
>> middle of a request, it's bound to cause trouble, including returning
>> the wrong documents!
> The reader member is not replaced in the existing SolrIndexSearcher object. The IndexSearcher.getIndexReader() method has been overridden in SolrIndexSearcher, and all direct access to the reader member has been replaced with a getIndexReader() method call, allowing an NRT reader to be supplied when realtime is enabled. The concurrency is handled by the getNRTReader() method, with the static index view now increased to the granularity provided by the NRTIndexReader.
>
>
> Regards,
>
> Nagendra Nagarajayya
> http://solr-ra.tgels.org
> http://rankingalgorithm.tgels.org
>
>> -Yonik
>> http://lucidimagination.com
>>
>>


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

nnagarajayya
In reply to this post by Yonik Seeley-2-2
Each request thread may return updated results.  Each component may also
in certain cases return updated results. The algorithm is designed to
handle these. The granularity of the returned results can be controlled
through a visible parameter.
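
For readers following the question quoted below, this is a small sketch of what "the same reader for the whole request" means. It is an illustration only, written in plain Java with hypothetical names (SnapshotReader, LiveSearcher, unpinned, pinned); it does not show how RankingAlgorithm or Solr handle this, and the "visible" parameter mentioned above is not modeled.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model of one request passing through two components.
class RequestViewSketch {

    /** Stand-in for a point-in-time index view. */
    static final class SnapshotReader {
        final int numDocs;
        SnapshotReader(int numDocs) { this.numDocs = numDocs; }
    }

    /** Searcher whose accessor always returns the newest published view. */
    static final class LiveSearcher {
        private final AtomicReference<SnapshotReader> latest;

        LiveSearcher(SnapshotReader initial) {
            this.latest = new AtomicReference<>(initial);
        }

        void publish(SnapshotReader newer) { latest.set(newer); }

        SnapshotReader getIndexReader() { return latest.get(); }
    }

    /** Each component asks the searcher again, so the views can differ. */
    static int[] unpinned(LiveSearcher s, Runnable updateBetween) {
        int queryDocs = s.getIndexReader().numDocs;  // "query" component
        updateBetween.run();                         // update lands mid-request
        int facetDocs = s.getIndexReader().numDocs;  // "facet" component
        return new int[] { queryDocs, facetDocs };
    }

    /** The reader is fetched once and shared by both components. */
    static int[] pinned(LiveSearcher s, Runnable updateBetween) {
        SnapshotReader view = s.getIndexReader();    // one view per request
        int queryDocs = view.numDocs;
        updateBetween.run();
        int facetDocs = view.numDocs;
        return new int[] { queryDocs, facetDocs };
    }

    public static void main(String[] args) {
        LiveSearcher a = new LiveSearcher(new SnapshotReader(10));
        int[] u = unpinned(a, () -> a.publish(new SnapshotReader(11)));
        System.out.println("unpinned: " + u[0] + " vs " + u[1]);

        LiveSearcher b = new LiveSearcher(new SnapshotReader(10));
        int[] p = pinned(b, () -> b.publish(new SnapshotReader(11)));
        System.out.println("pinned: " + p[0] + " vs " + p[1]);
    }
}
```

In the unpinned case the two components see 10 and 11 documents within one request; in the pinned case both see 10.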

Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org

On 7/24/2012 5:36 AM, Yonik Seeley wrote:

> On Tue, Jul 24, 2012 at 8:24 AM, Nagendra Nagarajayya
> <[hidden email]>  wrote:
>> SolrIndexSearcher is a heavy object with caches, etc.
> As I've said, the caches are configurable, and it's trivial to disable
> all caching (to the point where the cache objects are not even
> created).
>
>> The reader member is not replaced in the existing SolrIndexSearcher object.
>> The IndexSearcher.getIndexReader() method has been overridden in
>> SolrIndexSearcher and all direct reader member access has been replaced with
>> a getIndexReader() method call allowing a NRT reader to be supplied when
>> realtime is enabled.
> In a single Solr request (that runs through multiple components like
> query, highlight, facet, and response writing),
> does IndexSearcher.getIndexReader() always return the same reader?  If
> not, this breaks pretty much every standard solr component - but it
> will only be apparent under load, and if you are carefully sanity
> checking the results.
>
> -Yonik
> http://lucidimagination.com
>
>


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

Mark Miller-3
In reply to this post by nnagarajayya
You are changing the name, or someone at Apache told you the current name is okay?

If someone at Apache told you it was okay, who was that?

You are certainly not using the Solr mark in an approved manner and I'd hope if you are going to take advantage of our mailing list for promotion of your product, that you would not violate our trademark. You are already on shaky ground promoting a Solr fork on the Solr mailing list by announcing every release - naming your fork something with Solr in it puts you over the edge on my list.

We don't allow people to name their products things like "Solr: the wonder edition" or anything along those lines. Solr is our trademark and third party products must have their own name. The only thing we allow is the phrase "powered by Solr".

I'm on the Lucene/Solr PMC and am an Apache member and I'd find it pretty hard to believe that anyone would suggest that your usage is a correct usage of the Solr trademark.

- Mark

On Jul 24, 2012, at 8:36 AM, Nagendra Nagarajayya wrote:

> Thanks Mark! I am already working with Apache Software Foundation on the mark and am using the correct usage of the mark as suggested by them.
>
> Regards,
>
> Nagendra Nagarajayya
> http://solr-ra.tgels.org
> http://rankingalgorithm.tgels.org
>
>
> On 7/23/2012 12:15 PM, Mark Miller wrote:
>> On Jul 23, 2012, at 11:27 AM, Nagendra Nagarajayya wrote:
>>
>>> I am not sure why any one will get offended by an announcement that NRT functionality was available with older releases.
>> FWIW, I'm not offended - I don't mind if third parties post announcements if they are related to Solr.
>>
>> I just want to make sure it's very clear that it's a third party announce so there is no confusion - people that don't follow the lists on a daily basis read these things. A lot of these emails end up archived on various sites that collect mailing lists. It's easy to run into them without the proper context.
>>
>> I think part of the confusion is the naming. Technically, Apache does not allow the use of Apache marks as part of a third party name. Instead, the name should be something like "Product X, powered by Solr"
>>
>> See http://www.apache.org/foundation/marks/faq/#products
>>
>> - Mark Miller
>> lucidimagination.com

- Mark Miller
lucidimagination.com


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

Andy-152
In reply to this post by nnagarajayya
But Solr relies on the cache in faceting for performance reasons. If the cache has to be disabled, then faceting would be very slow under RankingAlgorithm, no?


________________________________
From: Nagendra Nagarajayya <[hidden email]>
To: [hidden email]
Sent: Wednesday, July 25, 2012 9:12 AM
Subject: Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

Yes faceting works as before. Regarding the cache, the suggestion is to
disable the cache for realtime NRT, for now.

Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org



Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

nnagarajayya
In reply to this post by Mark Miller-3
Mark,

Grant Ingersoll from ASF got in touch with me to ensure that I am
compliant with the Apache Trade Mark. I made changes to the names, web
pages, wiki, papers, etc. and sent back the links to Grant for approval.
You may want to check with Grant.

Regarding the fork, I am not creating a fork but actually contributing
the realtime NRT back to Apache Solr.  There was no NRT functionality in
the older versions of Solr.

Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org



On 7/25/2012 6:54 AM, Mark Miller wrote:

> You are changing the name, or someone at Apache told you the current name is okay?
>
> If someone at Apache told you it was okay, who was that?
>
> You are certainly not using the Solr mark in an approved manner and I'd hope if you are going to take advantage of our mailing list for promotion of your product, that you would not violate our trademark. You are already on shaky ground promoting a Solr fork on the Solr mailing list by announcing every release - naming your fork something with Solr in it puts you over the edge on my list.
>
> We don't allow people to name their products things like "Solr: the wonder edition" or anything along those lines. Solr is our trademark and third party products must have their own name. The only thing we allow is the phrase "powered by Solr".
>
> I'm on the Lucene/Solr PMC and am an Apache member and I'd find it pretty hard to believe that anyone would suggest that your usage is a correct usage of the Solr trademark.
>
> - Mark
>
> On Jul 24, 2012, at 8:36 AM, Nagendra Nagarajayya wrote:
>
>> Thanks Mark! I am already working with Apache Software Foundation on the mark and am using the correct usage of the mark as suggested by them.
>>
>> Regards,
>>
>> Nagendra Nagarajayya
>> http://solr-ra.tgels.org
>> http://rankingalgorithm.tgels.org
>>
>>
>> On 7/23/2012 12:15 PM, Mark Miller wrote:
>>> On Jul 23, 2012, at 11:27 AM, Nagendra Nagarajayya wrote:
>>>
>>>> I am not sure why any one will get offended by an announcement that NRT functionality was available with older releases.
>>> FWIW, I'm not offended - I don't mind if third parties post announcements if they are related to Solr.
>>>
>>> I just want to make sure it's very clear that it's a third party announce so there is no confusion - people that don't follow the lists on a daily basis read these things. A lot of these emails end up archived on various sites that collect mailing lists. It's easy to run into them without the proper context.
>>>
>>> I think part of the confusion is the naming. Technically, Apache does not allow the use of Apache marks as part of a third party name. Instead, the name should be something like "Product X, powered by Solr"
>>>
>>> See http://www.apache.org/foundation/marks/faq/#products
>>>
>>> - Mark Miller
>>> lucidimagination.com
> - Mark Miller
> lucidimagination.com


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

Mark Miller-3
On Wed, Jul 25, 2012 at 11:03 AM, Nagendra Nagarajayya <
[hidden email]> wrote:

> Mark,
>
> Grant Ingersoll from ASF got in touch with me to ensure that I am
> compliant with the Apache Trade Mark. I made changes to the names, web
> pages, wiki, papers, etc. and sent back the links to Grant for approval.
> You may want to check with Grant.
>

Great, I'm glad to hear it. From your original response I couldn't tell
when you had spoken to someone, or whether a change was coming or you
thought you were already in compliance.


>
> Regarding the fork, I am not creating a fork but actually contributing the
> realtime NRT back to Apache Solr.  There was no NRT functionality in the
> older versions of Solr.


You have a fork now though - and forks are fine. Anyone should feel
comfortable forking Apache licensed code. I just want to make sure there is
no confusion about it - that is why we have the naming rules. If you end up
contributing code back, that is great, but it's a separate thing.


>
>
> Regards,
>
> Nagendra Nagarajayya
> http://solr-ra.tgels.org
> http://rankingalgorithm.tgels.org
>
>
>
> On 7/25/2012 6:54 AM, Mark Miller wrote:
>
>> You are changing the name, or someone at Apache told you the current name
>> is okay?
>>
>> If someone at Apache told you it was okay, who was that?
>>
>> You are certainly not using the Solr mark in an approved manner and I'd
>> hope if you are going to take advantage of our mailing list for promotion
>> of your product, that you would not violate our trademark. You are already
>> on shaky ground promoting a Solr fork on the Solr mailing list by
>> announcing every release - naming your fork something with Solr in it puts
>> you over the edge on my list.
>>
>> We don't allow people to name their products things like "Solr: the
>> wonder edition" or anything along those lines. Solr is our trademark and
>> third party products must have their own name. The only thing we allow is
>> the phrase "powered by Solr".
>>
>> I'm on the Lucene/Solr PMC and am an Apache member and I'd find it pretty
>> hard to believe that anyone would suggest that your usage is a correct
>> usage of the Solr trademark.
>>
>> - Mark
>>
>> On Jul 24, 2012, at 8:36 AM, Nagendra Nagarajayya wrote:
>>
>>> Thanks Mark! I am already working with Apache Software Foundation on the
>>> mark and am using the correct usage of the mark as suggested by them.
>>>
>>> Regards,
>>>
>>> Nagendra Nagarajayya
>>> http://solr-ra.tgels.org
>>> http://rankingalgorithm.tgels.org
>>>
>>>
>>> On 7/23/2012 12:15 PM, Mark Miller wrote:
>>>
>>>> On Jul 23, 2012, at 11:27 AM, Nagendra Nagarajayya wrote:
>>>>
>>>>> I am not sure why any one will get offended by an announcement that
>>>>> NRT functionality was available with older releases.
>>>>>
>>>> FWIW, I'm not offended - I don't mind if third parties post
>>>> announcements if they are related to Solr.
>>>>
>>>> I just want to make sure it's very clear that it's a third party
>>>> announce so there is no confusion - people that don't follow the lists on a
>>>> daily basis read these things. A lot of these emails end up archived on
>>>> various sites that collect mailing lists. It's easy to run into them
>>>> without the proper context.
>>>>
>>>> I think part of the confusion is the naming. Technically, Apache does
>>>> not allow the use of Apache marks as part of a third party name. Instead,
>>>> the name should be something like "Product X, powered by Solr"
>>>>
>>>> See http://www.apache.org/foundation/marks/faq/#products
>>>>
>>>> - Mark Miller
>>>> lucidimagination.com
>> - Mark Miller
>> lucidimagination.com


--
- Mark

http://www.lucidimagination.com