extending SolrIndexSearcher

extending SolrIndexSearcher

Koji Miyamoto
Hi,

I am looking at extending the source code for SolrIndexSearcher for my own
purposes.  Basically, I am trying to replace the use of Lucene's
IndexSearcher with a ParallelMultiSearcher version so that I can have a
query search both locally available indexes as well as remote indexes
available only via RMI.  This ParallelMultiSearcher is instantiated to
consist of both local and remote Searchable references.  The local
Searchables are simply IndexSearcher instances tied to local disk (separate
indexes), while the remote Searchables are made reachable via RMI.

In essence, where it used to be:

  IndexSearcher searcher = new IndexSearcher(reader);

it is now: (not the actual code but similar)

  Searchable[] searchables = new Searchable[3];
  for (int i=0; i<2; i++) {
    // Local searchable:
    searchables[i] = new IndexSearcher("/disk" + i + "/index");
  }

  // RMI searchable:  throws exception during search..
  searchables[2] = (Searchable) Naming.lookup("//remote_host:1099/remote_svc");

  ParallelMultiSearcher searcher = new ParallelMultiSearcher(searchables);

When I build the source and use it (the short story: by replacing the
relevant class file(s) within the solr.war used by the example Jetty
setup), it starts up just fine.  If I comment out the RMI searchable
line, submitting a search query to Jetty/Solr works just fine, and it is
able to search any number of indexes.  However, with the RMI searchable
left in, I get an exception thrown (here's the ending of it):

May 9, 2006 1:38:07 AM org.apache.solr.core.SolrException log
SEVERE: java.rmi.MarshalException: error marshalling arguments; nested exception is:
        java.io.NotSerializableException: org.apache.lucene.search.MultiSearcher$1
        at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:122)
        at org.apache.lucene.search.RemoteSearchable_Stub.search(Unknown Source)
        at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:248)
        at org.apache.lucene.search.Searcher.search(Searcher.java:116)
        at org.apache.lucene.search.Searcher.search(Searcher.java:95)
        at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:794)
        at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:712)
        at org.apache.solr.search.SolrIndexSearcher.getDocList(SolrIndexSearcher.java:605)
        at org.apache.solr.request.StandardRequestHandler.handleRequest(StandardRequestHandler.java:106)

So it looks like it requires Serialization somehow to get it to work.
Wondering if anyone has any ideas to get around this problem.

tia,
Koji

Re: extending SolrIndexSearcher

Chris Hostetter-3

I don't really know a lot about RMI, but as I understand it, serialization
is a core necessity -- if the arguments you want to pass to your remote
method aren't serializable, then RMI can't pass those arguments across the
wire.
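
To illustrate the point about serialization, here is a minimal sketch that
uses plain Java serialization (which RMI marshalling builds on) and makes no
RMI call at all. The `Callback` interface, `PrintingCallback` and the helper
method names are hypothetical and not taken from Lucene or Solr; the sketch
only shows that an anonymous class whose supertype is not `Serializable`
fails with a `NotSerializableException` naming a `...$1` class.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializableCheck {

    /** Hypothetical callback type standing in for a remote method argument. */
    interface Callback {
        void call(int value);
    }

    /** A named callback that opts in to Java serialization. */
    static class PrintingCallback implements Callback, Serializable {
        private static final long serialVersionUID = 1L;

        public void call(int value) {
            System.out.println("value=" + value);
        }
    }

    /**
     * Tries to serialize the argument and returns the name of the class that
     * could not be serialized, or null if serialization succeeded.
     */
    static String firstNonSerializable(Object arg) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(arg);
            return null;
        } catch (NotSerializableException e) {
            // The exception message carries the name of the offending class.
            return e.getMessage();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds an anonymous callback; Callback itself is not Serializable. */
    static Callback anonymousCallback() {
        return new Callback() {
            public void call(int value) {
                System.out.println("value=" + value);
            }
        };
    }

    public static void main(String[] args) {
        // Named, Serializable class: serialization succeeds.
        System.out.println(firstNonSerializable(new PrintingCallback()));
        // Anonymous class: reports the compiler-generated "$1" class name.
        System.out.println(firstNonSerializable(anonymousCallback()));
    }
}
```

The second call reports a compiler-generated name ending in `$1`, the same
shape as the `MultiSearcher$1` in the trace from the first message.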

That said: it's not clear to me from the pseudocode/stacktrace you
included *what* isn't serializable ... is it a Solr class or a core Lucene
class?

If it's a Lucene class, you may want to start by making a small proof
of concept RMI app that just uses the Lucene core classes, once that
works then try your changes in Solr.


: Date: Tue, 9 May 2006 02:32:45 -0700
: From: Koji Miyamoto <[hidden email]>
: Reply-To: [hidden email]
: To: [hidden email]
: Subject: extending SolrIndexSearcher



-Hoss


Re: extending SolrIndexSearcher

Koji Miyamoto
I tried it with just Lucene + RMI, and that works just fine.  It's actually
based on the Lucene In Action e-book topic on how to use
ParallelMultiSearcher (chap.5).  The relevant code snippet follows:

/*
 * search server:
 * This is the code frag for the search server, which enters
 * a wait-loop to accept requests on port 1099.
 * This server implementation is run on 2+ separate boxes: one
 * is the "master" while the rest are "slaves".  The master is
 * the main entry point, which searches both its local indexes
 * and sends requests to each slave; a slave only searches its own
 * local indexes and reports results back to the master.
 */

  //private Vector<Searchable> _searchables;
  //private Vector<String> _localDirs;
  // ...

  // add local dirs as searchables..
  for (int i=0; i<_localDirs.size(); i++) {
     System.out.println("local searchable: " + _localDirs.get(i) + " ..");
     _searchables.add(new IndexSearcher(_localDirs.get(i)));
  }

  // add remote nodes (slaves) as searchables..
  // note: only the master does this; the slaves only look at their local indexes..
  if (_remoteNodes != null) {
     Collection nodes = _remoteNodes.values();
     Iterator it = nodes.iterator();
     String node = "";
     while (it.hasNext()) {
        node = (String) it.next();
        try {
           // remote nodes (slaves) also reachable via port 1099
           _searchables.add((Searchable) Naming.lookup(
                 "//" + node + ":1099/" + _DEFAULT_SVC_NAME_));
           System.out.println("remote searchable: " + node + " ..");
        } catch (java.rmi.ConnectException e) {
           System.err.println("ERROR: unable to connect to node=" + node + " ...");
        }
     }
  }

  // just some glue to prepare list of searchables for ParallelMultiSearcher constructor..
  Searchable[] sch = new Searchable[_searchables.size()];
  for (int i=0; i<_searchables.size(); i++) {
     sch[i] = _searchables.get(i);
  }

  // start up server..
  System.setSecurityManager(new RMISecurityManager());
  LocateRegistry.createRegistry(_port);
  Searcher parallelSearcher = new ParallelMultiSearcher(sch);
  RemoteSearchable parallelImpl = new RemoteSearchable(parallelSearcher);
  Naming.rebind("//" + _nodeID + ":" + _port + "/" + _DEFAULT_SVC_NAME_, parallelImpl);
  System.out.println("SearchServer started " +
        "(nodeID=" + _nodeID +
        ", port=" + _port +
        ", role=" + ((_remoteNodes!=null)?"master":"slave") +
        ", # searchables=" + _searchables.size() + ")...");

  // enters wait state, ready to accept requests on port 1099...

========================

/*
 * search client
 * This basically does an RMI naming lookup to get a reference to
 * the master node on port 1099, then sends a search query..
 */

TermQuery query = new TermQuery(new Term("body", word));
MultiSearcher searcher = new MultiSearcher(
             new Searchable[]{_lookupRemote(_DEFAULT_SVC_NAME_)});

Hits hits = searcher.search(query);

Document doc = null;
for (int i=0; i<hits.length(); i++) {
  doc = hits.doc(i);
  // able to get hit info here...
}

// .....

private Searchable _lookupRemote(String svcName) throws Exception {
  return (Searchable) Naming.lookup("//" + _host + ":" + _port + "/" + svcName);
}

========================

With both pieces of code above, I am able to start a server on box1 (master),
another server on box2 (slave), then invoke a client that queries box1,
which gets results from searching the indexes on box1+box2.  With this
working, I then tried to incorporate ParallelMultiSearcher into Solr's
SolrIndexSearcher, since I saw that it is the place where Solr uses Lucene's
IndexSearcher.  I replaced it with ParallelMultiSearcher, where it is
initialized similar to the client code I mentioned above.

From that, it seems like Solr itself needs to marshal and unmarshal the
searcher instance SolrIndexSearcher holds, and because the
ParallelMultiSearcher is initialized with RMI stubs, it fails to proceed
with such marshal/unmarshal internal actions.  As mentioned in the first
email, if I use ParallelMultiSearcher to only look at local indexes (no RMI
stub), Solr works just fine.  So I'm wondering if there is a way to use
SolrIndexSearcher to search both local and remote indexes, even if not
through the RMI solution Lucene's ebook has suggested via its
ParallelMultiSearcher class.

tia,
Koji




Re: extending SolrIndexSearcher

Chris Hostetter-3

: IndexSearcher.  I replaced it with ParallelMultiSearcher, where it is
: initialized similar to the client code I mentioned above.
:
: From that, it seems like Solr itself needs to marshal and unmarshal the
: searcher instance SolrIndexSearcher holds, and because the
: ParallelMultiSearcher is initialized with RMI stubs, it fails to proceed
: with such marshal/unmarshal internal actions.  As mentioned in the first
: email, if I use ParallelMultiSearcher to only look at local indexes (no RMI
: stub), Solr works just fine.  So I'm wondering if there is a way to use
: SolrIndexSearcher to search both local and remote indexes, even if not
: through the RMI solution Lucene's ebook has suggested via its
: ParallelMultiSearcher class.

As I said, I don't really know a lot about RMI, but I don't think the
client code is expected to marshal/unmarshal things -- but the objects
you want to pass to remote methods (or receive back from remote
methods) need to be serializable.  Do you know what objects you got
serialization exceptions from? (You didn't include any real source -- just
pseudocode, so it's not possible to use the line numbers in your stack
trace to look at the code, because we don't know exactly what you changed.)
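
One way to answer the "which object?" question without reading line numbers
is to pull the class name out of the nested NotSerializableException. The
following is a small sketch under that assumption; `MarshalDiagnostics` and
`offendingClass` are hypothetical names, and `main` only rebuilds the shape of
the failure from the posted log rather than making a real RMI call.

```java
import java.io.NotSerializableException;
import java.rmi.MarshalException;

class MarshalDiagnostics {

    /**
     * Walks the cause chain of a failure and returns the class name reported
     * by the first NotSerializableException found, or null if there is none.
     */
    static String offendingClass(Throwable failure) {
        for (Throwable t = failure; t != null; t = t.getCause()) {
            if (t instanceof NotSerializableException) {
                // The message of a NotSerializableException is the class name.
                return t.getMessage();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Rebuild the exception nesting shown in the log; nothing is sent over RMI.
        Exception nested =
            new NotSerializableException("org.apache.lucene.search.MultiSearcher$1");
        Throwable failure = new MarshalException("error marshalling arguments", nested);
        System.out.println(offendingClass(failure));
    }
}
```

A name of the form `Outer$1` is what the Java compiler gives to the first
anonymous class declared inside `Outer`, so the trace in the first message
points at an anonymous class inside Lucene's MultiSearcher rather than at a
Solr class.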



-Hoss


Re: extending SolrIndexSearcher

Koji Miyamoto
Hi Chris,

My last email msg was in response to your suggestion:

> If it's a Lucene class, you may want to start by making a small proof
> of concept RMI app that just uses the Lucene core classes, once that
> works then try your changes in Solr.

I agree that is a good starting point to narrow things down.  So my
last msg was the actual code of my non-Solr test of ParallelMultiSearcher
with RMI calls.

As for actual solr code modification, the following are the relevant pieces:

// approximately line 65, SolrIndexSearcher class attributes:
// this was the original:
// private final IndexSearcher searcher;
// replaced with:
private final ParallelMultiSearcher searcher;


// approximately line 123, the constructor:
private SolrIndexSearcher(IndexSchema schema, String name, IndexReader r,
                          boolean closeReader, boolean enableCache) throws Exception {
    this.schema = schema;
    this.name = "Searcher@" + Integer.toHexString(hashCode())
        + (name!=null ? " "+name : "");

    log.info("Opening " + this.name);

    reader = r;

    // this is the original:
    //searcher = new IndexSearcher(r);
    // replaced with:
    searcher = _initSearcher();
....
}

// and I added this to initialize searcher:
private ParallelMultiSearcher _initSearcher() throws Exception {

      Searchable[] sch = new Searchable[3];

      // local indexes that are searchable..
      for (int i=0; i<2; i++) {
         sch[i] = new IndexSearcher("/disk" + i);
      }

      // a remote searchable available via RMI
      sch[2] = (Searchable) Naming.lookup("//somehost.com:1099/searchit");

      ParallelMultiSearcher searcher = new ParallelMultiSearcher(sch);
      return searcher;
}

With this source modification, I do an 'ant compile', repackage solr.war,
install it in the appropriate location, and start up the example ('java -jar
start.jar').

Then I submit a simple query with curl from the command line:

curl http://localhost:8080/solr/select -d version="2.1" -d start=0 \
     -d rows=10 -d indent=on -d submit=search -d q="body:blablabla"

Without the RMI searchable, the search works just fine.  With the RMI
searchable, I get an exception:

java.rmi.MarshalException: error marshalling arguments; nested exception is:
        java.io.NotSerializableException: org.apache.lucene.search.ParallelMultiSearcher$1
        at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:122)
        at org.apache.lucene.search.RemoteSearchable_Stub.search(Unknown Source)
        at org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:172)
        at org.apache.lucene.search.Searcher.search(Searcher.java:116)
        at org.apache.lucene.search.Searcher.search(Searcher.java:95)
        at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:794)
        at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:712)
        at org.apache.solr.search.SolrIndexSearcher.getDocList(SolrIndexSearcher.java:605)
        at org.apache.solr.request.StandardRequestHandler.handleRequest(StandardRequestHandler.java:106)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:585)
        at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:80)
        at org.apache.solr.servlet.SolrServlet.doPost(SolrServlet.java:70)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:767)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:860)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:408)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:350)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:195)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:164)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:536)

Looking at the last Solr frame in the trace (SolrIndexSearcher.java, line
794), the exception is thrown from this search call, which passes a newly
defined anonymous HitCollector:

    searcher.search(query, new HitCollector() {
      float minScore=Float.NEGATIVE_INFINITY;  // minimum score in the priority queue
      public void collect(int doc, float score) {
        if (filt!=null && !filt.exists(doc)) return;
        if (numHits[0]++ < lastDocRequested || score >= minScore) {
          // if docs are always delivered in order, we could use "score>minScore"
          // but BooleanScorer14 might still be used and deliver docs out-of-order?
          hq.insert(new ScoreDoc(doc, score));
          minScore = ((ScoreDoc)hq.top()).score;
        }
      }
    });

If I follow the exception trail, within Lucene it's
(repeated from above for context)

        at org.apache.lucene.search.RemoteSearchable_Stub.search(Unknown Source)
        at org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:172)
        at org.apache.lucene.search.Searcher.search(Searcher.java:116)
        at org.apache.lucene.search.Searcher.search(Searcher.java:95)

which has the following src code:

Searcher.java:95
public void search(Query query, HitCollector results)
    throws IOException {
  search(query, (Filter)null, results);
}

Searcher.java:116
public void search(Query query, Filter filter, HitCollector results)
    throws IOException {
  search(createWeight(query), filter, results);
}

ParallelMultiSearcher.java:172
public void search(Weight weight, Filter filter, final HitCollector results)
    throws IOException {
  for (int i = 0; i < searchables.length; i++) {

    final int start = starts[i];

    // >>> HERE:
    searchables[i].search(weight, filter, new HitCollector() {
        public void collect(int doc, float score) {
          results.collect(doc + start, score);
        }
      });
  }
}
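
To make the role of that wrapping collector concrete, here is a
self-contained sketch of the doc id shifting it performs. It does not use
Lucene: `Collector` is a hypothetical stand-in for HitCollector, and I am
assuming that `starts[i]` holds the number of documents in the sub-indexes
that come before sub-index i.

```java
import java.util.ArrayList;
import java.util.List;

class DocIdOffsets {

    /** Hypothetical stand-in for Lucene's HitCollector callback. */
    interface Collector {
        void collect(int doc, float score);
    }

    /**
     * Computes the starting global doc id of each sub-index from the
     * sub-index sizes: starts[i] is the sum of the sizes before index i.
     */
    static int[] computeStarts(int[] maxDocs) {
        int[] starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
        return starts;
    }

    /** Wraps a collector so that local doc ids are shifted by a fixed offset. */
    static Collector shifted(final Collector results, final int start) {
        return new Collector() {
            public void collect(int doc, float score) {
                results.collect(doc + start, score);
            }
        };
    }

    /** Reports local doc 3 from each sub-index and returns the global ids seen. */
    static List<Integer> demo(int[] maxDocs) {
        final List<Integer> seen = new ArrayList<Integer>();
        Collector results = new Collector() {
            public void collect(int doc, float score) {
                seen.add(doc);
            }
        };
        int[] starts = computeStarts(maxDocs);
        for (int i = 0; i < maxDocs.length; i++) {
            shifted(results, starts[i]).collect(3, 1.0f);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo(new int[] {10, 20, 5})); // prints [3, 13, 33]
    }
}
```

Note that the wrapper holds a reference to the caller's `results` object, so
it only does its job when `collect` runs in the same JVM as the caller.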

I'm wondering if it is a failure to deal with the HitCollector.  Any ideas?

thanks,
Koji



Re: extending SolrIndexSearcher

Chris Hostetter-3

: My last email msg was in response to your suggestion:
:
: > If it's a Lucene class, you may want to start by making a small proof
: > of concept RMI app that just uses the Lucene core classes, once that
: > works then try your changes in Solr.
:
: I agree that is a good starting point to narrow things down.  So my
: last msg was the actual code of my non-Solr test of ParallelMultiSearcher
: with RMI calls.

I understood that -- I'm glad to know it worked.  My point is that in
order for people to try to help you understand/fix the problem you are
having, they need to be able to look at the source and run it themselves.
Having the source for your bare Lucene test is great -- but without having
either a copy of your modified SolrIndexSearcher or a patch file they can
use to generate your version, there's no way to try and reproduce what's
happening.

We can't even speculate on the cause of the exception, because the line
numbers in your version of SolrIndexSearcher are completely different
after you made your change.

I'm not promising that anyone on this list will have time to try and
reproduce your problem -- it depends on how interested people might be in
RMI and connecting to remote indexes -- but without the code you're
running they can't even try.

Now that you've at least given us some idea what code is at line 794 where
the exception happens, we can at least speculate, and I'm guessing you are
right -- the problem is most likely that the anonymous HitCollector class
isn't serializable.  What I'm not sure of is whether changing the Solr
HitCollector will actually solve the problem.  If I understand your
explanation of the code correctly, ParallelMultiSearcher has its own
HitCollector which wraps the HitCollector from SolrIndexSearcher ... which
makes me wonder how your non-Solr example worked ... wouldn't the
anonymous HitCollector in ParallelMultiSearcher have had the same problem
there?

As I've said, I don't really know much about RMI, but perhaps you could
try replacing the anonymous HitCollector in SolrIndexSearcher with a
concrete subclass that implements Serializable and see if that works?
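
A rough sketch of what such a concrete subclass could look like, with one
caveat worth checking before investing in it. The `HitCollector` class below
is a hypothetical stand-in so the example runs without Lucene, and
`ListCollector` is an invented name. RMI passes arguments that are
Serializable but not Remote by value, so the round trip through serialization
here imitates what the remote side would receive: a copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class SerializableCollectorDemo {

    /** Hypothetical stand-in for Lucene's abstract HitCollector class. */
    static abstract class HitCollector {
        public abstract void collect(int doc, float score);
    }

    /** A named (non-anonymous) collector that implements Serializable. */
    static class ListCollector extends HitCollector implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<Integer> docs = new ArrayList<Integer>();

        public void collect(int doc, float score) {
            docs.add(doc);
        }
    }

    /** Serializes and deserializes a collector, which yields an independent copy. */
    static ListCollector copyViaSerialization(ListCollector original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(original);
            out.close();
            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return (ListCollector) in.readObject();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns {hits seen by the original, hits seen by the copy}. */
    static int[] demo() {
        ListCollector local = new ListCollector();
        ListCollector remoteCopy = copyViaSerialization(local);
        remoteCopy.collect(7, 0.5f); // what the receiving side would do with its copy
        return new int[] { local.docs.size(), remoteCopy.docs.size() };
    }

    public static void main(String[] args) {
        int[] sizes = demo();
        System.out.println("local=" + sizes[0] + " copy=" + sizes[1]); // prints local=0 copy=1
    }
}
```

So even if a Serializable collector gets past the marshalling error, the hits
would be collected into the copy and not into the caller's object. That may
also bear on why the standalone test worked: it called
`searcher.search(query)` and got `Hits` back instead of passing a collector,
though I have not verified which code path that takes inside Lucene.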

If it does, then maybe you could submit a patch to Jira containing your
changes?  In order to be reusable, we'd need some way to configure
the remote indexes, and I'm not sure how much interest there would be from
the rest of the community -- but a proof of concept patch would be a great
start.




-Hoss


Re: extending SolrIndexSearcher

Koji Miyamoto
Thanks Chris, yes, that makes sense.  I'll do some more experimenting along
the lines of HitCollector in Solr and from my standalone programs as well.
If I reach a meaningful resolution and it would benefit the community (if
there's interest), I'll do a proof-of-concept patch as you mentioned.

-koji
