Using Lucene index in Solr

Using Lucene index in Solr

pgwillia
Hi,

    I was wondering if there are any major differences between building an
index using Lucene and using Solr.  If there are no substantial differences,
how would one go about using an existing index created with Lucene in Solr?

Thanks,
Tricia

Re: Using Lucene index in Solr

Yonik Seeley
On 6/21/06, Tricia Williams <[hidden email]> wrote:
>    I was wondering if there are any major differences in building an index
> using Lucene and Solr.  If there are no substantial differences, how would one
> go about using an existing index created using Lucene in Solr?

You can definitely do that for the majority of indices without writing
any code... you just need to make sure the schema matches what is in
the index (make the analyzers for the field types compatible, etc).

If you have access to the source code that built the index, start
there.  If you don't, open up the index with Luke to see what you
can find out.

-Yonik

Re: Using Lucene index in Solr

pgwillia
So I've modified schema.xml to account for my Lucene index.  I've created
a field type for my custom analyzer "text_lu", created fields for those in
my index, and changed the defaultSearchField.  The index I want to use is
in the data/index folder.

Now I want to use the admin page to query my old index.  I fill in the
Query text box and press the search button.  I receive the following
message:
XML Parsing Error: syntax error
Location:
http://localhost:8080/solr/select/?stylesheet=&q=alberta&version=2.1&start=0&rows=10&indent=on
Line Number 1, Column 1:java.lang.NullPointerException

When I try to PING:
HTTP Status 500 - java.lang.NullPointerException at
org.apache.solr.search.SolrQueryParser.<init>(SolrQueryParser.java:38) at
org.apache.solr.search.QueryParsing.parseQuery(QueryParsing.java:47) at
org.apache.solr.request.StandardRequestHandler.handleRequest(StandardRequestHandler.java:90)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:592) at
org.apache.jsp.admin.ping_jsp._jspService(ping_jsp.java:70) at
etc....

Does anyone have a sense of whether these exceptions are generated
because of the custom analyzer that I am using or because of the
changes I have made to schema.xml?  What is the best way to debug my
instance of Solr?

Any help is much appreciated,
Tricia


Re: Using Lucene index in Solr

Yonik Seeley
On 6/21/06, Tricia Williams <[hidden email]> wrote:
> So I've modified schema.xml to account for my lucene index.  I've created
> a field type for my custom analyzer "text_lu", created fields for those in
> my index, and changed the defaultSearchField.  The index I want to use is
> in the data/index folder.
>
> Now I want to use the admin page to query my old index.  I fill in the
> Query text box and press the search button.  I receive the following
> message:
> XML Parsing Error: syntax error

XML parsing isn't done during querying... so what this probably means
is that the schema.xml or solrconfig.xml has a problem and failed to
parse when you started the server, hence the SolrCore failed to load.

When you try to execute a query, it tries to instantiate the SolrCore
and IndexSchema objects again, and fails again.

Look at the exceptions in your log file, and try to find the root exception.
The exception you posted suggests you might not have well-formed XML anymore.


-Yonik
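
A quick way to rule out the "not well-formed XML" possibility before digging through the server log is to run the two config files through any XML parser. The sketch below uses Python's standard library; the function names are made up for illustration, and it only checks well-formedness. It cannot tell you whether Solr will accept the field types or analyzers declared in the file.

```python
import xml.etree.ElementTree as ET


def xml_error(data):
    """Return None if data (str or bytes) is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        # The message includes the line and column where parsing stopped.
        return str(err)
    return None


def check_files(paths):
    """Map each path to None (well-formed) or to the parse error message."""
    results = {}
    for path in paths:
        # Read as bytes so the parser honours the file's own encoding declaration.
        with open(path, "rb") as handle:
            results[path] = xml_error(handle.read())
    return results
```

Calling check_files with the paths of your schema.xml and solrconfig.xml (wherever your Solr home keeps them) returns a message for any file that fails to parse, which should point at the same spot as the root exception in the log.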

Secure Solr

pgwillia
In reply to this post by pgwillia
Hi All,

    It seems to me that the way that documents are indexed and managed via
Solr using HTTP GET requests leaves your index open to malicious attacks,
as anyone with the right syntax and some information about your index
could commit changes to your index.  Is there some mechanism in Solr that
prevents this kind of attack?

Thanks,
Tricia


Re: Secure Solr

Yonik Seeley
On 6/21/06, Tricia Williams <[hidden email]> wrote:
>     It seems to me that the way that documents are indexed and managed via
> Solr using http get requests leaves your index open to malicious attacks
> as anyone with the right syntax and some information about your index
> could commit changes to your index.  Is there some mechanism in solr that
> prevents this kind of attack?

We (CNET) use Solr as a back-end system.
Web traffic goes to apache web servers, then some requests go to
app-servers to generate dynamic content, and those app servers make
requests to Solr servers for search results.

If you have any ideas about it, adding security might be useful to
others though.

-Yonik
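
To illustrate the layering described above (not CNET's actual code, which the thread does not show), here is a minimal sketch of what an app server might do: accept a few search parameters from the outside world and build the Solr select URL itself, so that nothing a visitor types can reach an update or commit endpoint. The base URL is taken from the error message earlier in the thread; the allow-list and the function name are assumptions for the example.

```python
from urllib.parse import urlencode

# Assumed internal address; Solr itself is not reachable from outside.
SOLR_SELECT_URL = "http://localhost:8080/solr/select"

# Hypothetical allow-list: only plain search parameters are passed through.
ALLOWED_PARAMS = {"q", "start", "rows"}


def build_select_url(user_params):
    """Build a read-only select URL, silently dropping anything not allow-listed."""
    safe = {key: value for key, value in user_params.items() if key in ALLOWED_PARAMS}
    # Sort so the same input always produces the same URL.
    return SOLR_SELECT_URL + "?" + urlencode(sorted(safe.items()))
```

The point of the design is that the path is fixed by the app server and the parameters are filtered, so the only requests Solr ever sees from the web tier are searches.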

Re: Using Lucene index in Solr

The Flight Captain
In reply to this post by Yonik Seeley
Do I have to set the datasource that my index references?

My data is stored in a database, and I want Solr to look up the data in that database using my existing index. At the moment, I have set the <dataDir> element in my solrconfig to point at my existing index and checked the schema of my existing index using Luke, but I can't get any results when searching in Solr.

My index was created using hibernate-search.

How can I retrieve my data in Solr using the existing Lucene index? I think I need to set the database connection details somewhere, but I'm not sure where. I have set up a DataImportHandler, but I don't want that to overwrite my existing index.


Re: Using Lucene index in Solr

hossman

: My data is stored in a database, I want Solr to look up the data in that
: database using my existing index. At the moment, I have set the <dataDir>

you seem to be confusing two issues....

: element in my solrconfig to point at my existing index, and checked the
: schema on my existing index using Luke but I can't get any results when
: searching in Solr.
:
: My index was created using hibernate-search.

...if you have an existing index, and you want to search in Solr, you have
to create a schema.xml file that tells solr what the fields are that you
have and what datatypes to treat them as -- in particular what analysers
to use when querying them.

if hibernate-search built your index, you'll need to look at how it was
configured to build the index to figure some of this out (i'm not familiar
with hibernate-search so i can't help you there) ... the
LukeRequestHandler can help you spot check the raw index if you need to
(ie: "oh, look all of the terms are lowercased so i guess i would use a
LowerCaseFilterFactory")

: How can I retrieve my data in Solr using the existing Lucene index? I think
: I need to set the database connection details somewhere, but I'm not sure
: where. I have set up a DataImportHandler, but I don't want that to
: overwrite my existing index.

If you are giving solr an existing index, it doesn't care what database it
was built from -- just what analysis rules were used when it was built.
the only thing in solr that cares about databases is the DataImportHandler
which you could use to update your index as new data gets added to your
database if you want -- but first you have to create a schema.xml that
makes sense for your index.

Alternately: create the schema.xml that you *want* to have, abandon your
existing index and use DataImportHandler to build a new index and keep it
up to date.


-Hoss
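
For the LukeRequestHandler spot check mentioned above, a small helper for building the request URL might look like the sketch below. It assumes the handler is registered at the conventional /admin/luke path in your solrconfig.xml and uses its fl and numTerms parameters; the base URL and field name in the usage note are placeholders, so adjust them to your own setup.

```python
from urllib.parse import urlencode


def luke_url(base_url, field, num_terms=10):
    """Build a LukeRequestHandler URL that lists the top terms of one field."""
    # fl picks the field to inspect; numTerms caps how many top terms come back.
    query = urlencode({"fl": field, "numTerms": num_terms})
    # /admin/luke is an assumption: use whatever path your solrconfig.xml registers.
    return f"{base_url.rstrip('/')}/admin/luke?{query}"
```

Opening luke_url("http://localhost:8080/solr", "title") in a browser and scanning the returned terms (all lowercase? stemmed? split on punctuation?) gives hints about which tokenizer and filters the matching field type in schema.xml needs.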