I just want to ask: don't you think that response streaming could be useful for things like OLAP? E.g. if you have a sharded index, presorted and pre-joined the BJQ way, you can calculate counts for many cube cells in parallel.
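A toy sketch of that idea, with hypothetical names and shards modelled as plain lists of one dimension's values (no actual Solr or BJQ machinery): counts for each cube cell are computed per shard in parallel, then merged.

```java
import java.util.*;
import java.util.concurrent.*;

// Hypothetical sketch: per-shard partial counts for many cube cells are
// computed in parallel and merged into a single result.
public class ParallelCubeCounts {
    static Map<String, Long> countCells(List<List<String>> shards, Set<String> cells)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(shards.size());
        List<Future<Map<String, Long>>> partials = new ArrayList<>();
        for (List<String> shard : shards) {
            partials.add(pool.submit(() -> {
                Map<String, Long> counts = new HashMap<>();
                for (String v : shard)
                    if (cells.contains(v)) counts.merge(v, 1L, Long::sum);
                return counts; // partial counts for this shard
            }));
        }
        Map<String, Long> totals = new TreeMap<>(); // merged, sorted for display
        for (Future<Map<String, Long>> f : partials)
            f.get().forEach((cell, c) -> totals.merge(cell, c, Long::sum));
        pool.shutdown();
        return totals;
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> shards = List.of(
            List.of("a", "a", "b"),
            List.of("a", "b", "b", "c"));
        System.out.println(countCells(shards, Set.of("a", "b", "c"))); // {a=3, b=3, c=1}
    }
}
```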
The essential distributed test for response streaming just passed.
The current issue is that SolrJ reads the response as a whole; reading via callback is supported by the EmbeddedServer only. Anyway, it should not be a big deal. ResponseStreamingTest.java somehow works. I'm stuck on introducing response streaming in distributed search, which is actually more challenging: RespStreamDistributedTest fails.
On Fri, Mar 16, 2012 at 3:51 PM, Nicholas Ball <[hidden email]> wrote:
Mikhail & Ludovic,
Thanks for both your replies, very helpful indeed!
Ludovic, I was actually looking into just that and did some tests with
SolrJ. It does work well, but needs some changes on the Solr server if we
want to send out individual documents at various times. This could be done
with a write() and flush() to the FastOutputStream (daos) in JavaBinCodec. I
therefore think that a combination of this and Mikhail's solution would work well.
Mikhail, you mention that your solution doesn't currently work and that you're
not sure why; could it be that you haven't flushed the
data (os.flush()) you've written in the collect method of DocSetStreamer? I
think placing the output stream into the SolrQueryRequest is the way to go,
so that we can access it and write to it as we intend. However, I think
using the JavaBinCodec would be ideal so that we can work with SolrJ
directly, and not mess around with the encoding of the docs/data etc.
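The write()-and-flush() pattern discussed above can be illustrated with plain JDK streams. FastOutputStream is Solr-internal, so a BufferedOutputStream stands in for it here, and the class and method names are hypothetical:

```java
import java.io.*;

// Sketch of the flush-per-document idea: each "document" is pushed to the
// client as soon as it is written, instead of buffering the whole response.
public class FlushPerDocument {
    static void writeDocument(DataOutputStream out, String doc) throws IOException {
        out.writeUTF(doc); // encode one document
        out.flush();       // push it through the buffer immediately
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        DataOutputStream out = new DataOutputStream(new BufferedOutputStream(wire));
        writeDocument(out, "doc-1");
        int afterFirst = wire.size();  // bytes already visible to the client
        writeDocument(out, "doc-2");
        System.out.println(afterFirst > 0 && wire.size() > afterFirst); // true
    }
}
```

Without the flush() call, the BufferedOutputStream would hold everything back and `afterFirst` would be 0, which is exactly the behaviour streaming needs to avoid.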
At the moment the entry point to JavaBinCodec is through the
BinaryResponseWriter, which calls the top-level marshal() method that
encodes and sends out the entire SolrQueryResponse (line 49 @
BinaryResponseWriter). What would be ideal is to be able to break up the
response and call the JavaBinCodec for pieces of it, with a flush after each
call. I did a few tests with a simple Thread.sleep and a flush to see if this
would actually work, and it looks like it's working out perfectly. Just trying
to figure out the best way to actually do it now :) any ideas?
On another note: for a solution to work with chunked transfer encoding
(and therefore web browsers), a lot more development is going to be needed.
Not sure if it's worth trying yet, but I might look into it later down the line.
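For what it's worth, the JDK's built-in HTTP server already emits chunked transfer encoding when the response length is declared as 0, which is enough to sketch the idea (names hypothetical; this is not the Solr code path):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.*;
import java.net.*;
import java.util.*;

// Declaring a response length of 0 in sendResponseHeaders() makes the JDK
// server use Transfer-Encoding: chunked, so pieces can be flushed as produced.
public class ChunkedDemo {
    static List<String> fetch() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/stream", exchange -> {
            exchange.sendResponseHeaders(200, 0); // 0 => chunked encoding
            try (OutputStream body = exchange.getResponseBody()) {
                for (int i = 1; i <= 3; i++) {
                    body.write(("chunk-" + i + "\n").getBytes());
                    body.flush(); // each piece leaves the server immediately
                }
            }
        });
        server.start();
        List<String> lines = new ArrayList<>();
        URL url = new URL("http://localhost:" + server.getAddress().getPort() + "/stream");
        try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
            String line;
            while ((line = in.readLine()) != null) lines.add(line);
        } finally {
            server.stop(0);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetch()); // [chunk-1, chunk-2, chunk-3]
    }
}
```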
On Fri, 16 Mar 2012 07:29:20 +0300, Mikhail Khludnev
<[hidden email]> wrote:
> I looked through it. First of all, it seems to me you don't amend the regular
> "servlet" Solr server, but only the embedded one.
> Anyway, the difference is that you stream the DocList via a callback, but that
> means you've instantiated it in memory and keep it there until it is
> completely consumed. Think about a billion numFound. The core idea of my
> approach is to keep almost zero memory for the response.
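A hypothetical, Solr-free sketch of the contrast Mikhail describes: the buffering version materializes every hit before anything is written, while the streaming version writes each hit from the collect callback and keeps no per-hit state.

```java
import java.io.*;
import java.util.*;
import java.util.function.IntConsumer;

// "search" stands in for a Lucene collector loop: it reports every even id
// up to n as a hit, one callback per hit.
public class CollectVsBuffer {
    static void search(int n, IntConsumer onHit) {
        for (int id = 0; id < n; id++) if (id % 2 == 0) onHit.accept(id);
    }

    // Buffering: the doc list grows with numFound and is written afterwards.
    static List<Integer> buffered(int n) {
        List<Integer> docList = new ArrayList<>();
        search(n, docList::add);
        return docList;
    }

    // Streaming: each hit is written during collection; no per-hit state kept.
    static void streamed(int n, DataOutputStream out) {
        search(n, id -> {
            try { out.writeInt(id); out.flush(); }
            catch (IOException e) { throw new UncheckedIOException(e); }
        });
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        streamed(10, new DataOutputStream(wire));
        System.out.println(buffered(10)); // [0, 2, 4, 6, 8]
        System.out.println(wire.size());  // 5 ints = 20 bytes
    }
}
```

With a billion hits, `buffered` needs memory proportional to numFound, while `streamed` only ever holds the current hit, which is the point of the zero-memory approach.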
> On Fri, Mar 16, 2012 at 12:12 AM, lboutros <[hidden email]> wrote:
>> I was looking for something similar.
>> I tried this patch:
>> It's working quite well (I've back-ported the code to Solr 3.5.0...).
>> Is it really different from what you are trying to achieve?
>> Sent from the Solr - User mailing list archive at Nabble.com.