Search failing for matched text in large field

Paul-8
I'm using solr 1.4.1.

I have a document that has a pretty big field. If I search for a
phrase that occurs near the start of that field, it works fine. If I
search for a phrase that appears even a little ways into the field, it
doesn't find it. Is there some limit to how far into a field solr will
search?

Here's the way I'm doing the search. All I'm changing is the text I'm
searching on to make it succeed or fail:

http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text

Or, if it is not related to how large the document is, what else could
it possibly be related to? Could there be some character in that field
that is stopping the search?
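
For anyone reproducing this, the request above can be rebuilt programmatically. This is only a sketch: the host, core name (`my_core`) and field name (`text`) are the ones from the URL in this post, and the helper name `phrase_query_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Base URL copied from the post; "my_core" is the poster's core name.
SOLR_SELECT = "http://localhost:8983/solr/my_core/select/"


def phrase_query_url(phrase, highlight_field="text"):
    """Build a select URL that searches for an exact phrase with highlighting on."""
    params = [
        ("q", '"%s"' % phrase),      # wrap in double quotes for a phrase query
        ("hl", "on"),                # enable highlighting
        ("hl.fl", highlight_field),  # field to highlight
    ]
    # urlencode percent-encodes the quotes and turns spaces into '+'
    return SOLR_SELECT + "?" + urlencode(params)


print(phrase_query_url("search phrase"))
```

Only the phrase changes between the succeeding and failing requests, so the rest of the URL can be ruled out as the cause.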

Re: Search failing for matched text in large field

Sascha Szott
Hi Paul,

did you increase the value of the maxFieldLength parameter in your
solrconfig.xml?

-Sascha
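
For context: in Solr 1.4, maxFieldLength caps the number of tokens indexed per field (the example solrconfig.xml ships with 10000), and tokens past the cap never reach the index. The toy model below is not Solr code; the whitespace tokenizing, the tiny limit and the function names are assumptions chosen to show why an early phrase matches and a later one does not.

```python
def index_field(text, max_field_length):
    """Return the tokens kept when only the first N tokens of a field are indexed."""
    return text.split()[:max_field_length]


def phrase_matches(indexed_tokens, phrase):
    """True if the words of the phrase appear consecutively in the indexed tokens."""
    words = phrase.split()
    return any(
        indexed_tokens[i:i + len(words)] == words
        for i in range(len(indexed_tokens) - len(words) + 1)
    )


document = "alpha beta gamma delta epsilon zeta eta theta"
indexed = index_field(document, max_field_length=4)  # toy limit, far below 10000

print(phrase_matches(indexed, "alpha beta"))  # phrase near the start of the field
print(phrase_matches(indexed, "zeta eta"))    # phrase past the token limit
```

Raising the limit only helps documents indexed afterwards, so existing documents would need to be reindexed.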

On 23.03.2011 17:05, Paul wrote:

> I'm using solr 1.4.1.
>
> I have a document that has a pretty big field. If I search for a
> phrase that occurs near the start of that field, it works fine. If I
> search for a phrase that appears even a little ways into the field, it
> doesn't find it. Is there some limit to how far into a field solr will
> search?
>
> Here's the way I'm doing the search. All I'm changing is the text I'm
> searching on to make it succeed or fail:
>
> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>
> Or, if it is not related to how large the document is, what else could
> it possibly be related to? Could there be some character in that field
> that is stopping the search?

Re: Search failing for matched text in large field

Paul-8
Ah, no, I'll try that now.

What is the disadvantage of setting that to a really large number?

I do want the search to work for every word I give to solr. Otherwise
I wouldn't have indexed it to begin with.

On Wed, Mar 23, 2011 at 11:15 AM, Sascha Szott <[hidden email]> wrote:

> Hi Paul,
>
> did you increase the value of the maxFieldLength parameter in your
> solrconfig.xml?
>
> -Sascha
>
> On 23.03.2011 17:05, Paul wrote:
>>
>> I'm using solr 1.4.1.
>>
>> I have a document that has a pretty big field. If I search for a
>> phrase that occurs near the start of that field, it works fine. If I
>> search for a phrase that appears even a little ways into the field, it
>> doesn't find it. Is there some limit to how far into a field solr will
>> search?
>>
>> Here's the way I'm doing the search. All I'm changing is the text I'm
>> searching on to make it succeed or fail:
>>
>>
>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>>
>> Or, if it is not related to how large the document is, what else could
>> it possibly be related to? Could there be some character in that field
>> that is stopping the search?
>

Re: Search failing for matched text in large field

Jonathan Rochkind
In reply to this post by Paul-8
How large?

But rather than think about if there's something in the "searching"
that's not working, the first step might be to make sure that everything
in the _indexing_ is working -- that your field is actually being
indexed as you intend.

I forget the best way to view what's in your index -- the Luke request
handler in the Solr admin maybe?
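
If the Luke request handler is registered at its usual /admin/luke path (an assumption about this particular setup), a URL like the one built below should list the top indexed terms for a single field. The core and field names come from Paul's query; `luke_url` and the value 50 for numTerms are illustrative choices.

```python
from urllib.parse import urlencode


def luke_url(core_url, field, num_terms=50):
    """Build a Luke request handler URL showing the top terms indexed for one field."""
    params = [("fl", field), ("numTerms", num_terms)]
    return core_url.rstrip("/") + "/admin/luke?" + urlencode(params)


print(luke_url("http://localhost:8983/solr/my_core", "text"))
```

If words from late in the document never show up among the field's terms, the problem is on the indexing side rather than the query side.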

On 3/23/2011 12:05 PM, Paul wrote:

> I'm using solr 1.4.1.
>
> I have a document that has a pretty big field. If I search for a
> phrase that occurs near the start of that field, it works fine. If I
> search for a phrase that appears even a little ways into the field, it
> doesn't find it. Is there some limit to how far into a field solr will
> search?
>
> Here's the way I'm doing the search. All I'm changing is the text I'm
> searching on to make it succeed or fail:
>
> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>
> Or, if it is not related to how large the document is, what else could
> it possibly be related to? Could there be some character in that field
> that is stopping the search?
>

Re: Search failing for matched text in large field

Paul-8
In reply to this post by Sascha Szott
I increased maxFieldLength and reindexed a small number of documents.
That worked -- I got the correct results. In 3 minutes!

I assume that if I reindex all my documents, all searches will
become even slower. Is there any way to get all the results in a way
that is quick enough that my user won't get bored waiting? Is there
some optimization of this coming in solr 3.0?

On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott <[hidden email]> wrote:

> Hi Paul,
>
> did you increase the value of the maxFieldLength parameter in your
> solrconfig.xml?
>
> -Sascha
>
> On 23.03.2011 17:05, Paul wrote:
>>
>> I'm using solr 1.4.1.
>>
>> I have a document that has a pretty big field. If I search for a
>> phrase that occurs near the start of that field, it works fine. If I
>> search for a phrase that appears even a little ways into the field, it
>> doesn't find it. Is there some limit to how far into a field solr will
>> search?
>>
>> Here's the way I'm doing the search. All I'm changing is the text I'm
>> searching on to make it succeed or fail:
>>
>>
>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>>
>> Or, if it is not related to how large the document is, what else could
>> it possibly be related to? Could there be some character in that field
>> that is stopping the search?
>

Re: Search failing for matched text in large field

Jonathan Rochkind
Hmm, there's no reason it should take anywhere close to 3 minutes to get a
result from a simple search, even with very large documents/term lists.
Especially if you're really JUST doing a simple search and aren't using
faceting, the statistics component, highlighting, etc. at this point. (If
you ARE using highlighting, that could be the culprit.)

You might need more RAM allocated to the Solr JVM.  For reasons I can't
explain myself, I sometimes get pathologically slow search results when
I don't have enough RAM, even though there aren't any errors in my logs
or anything -- which adding more RAM fixes.

It's also possible (just taking random guesses, I am not familiar with
this part of Solr internals) that if you increased the maxFieldLength
on an existing index, but only reindexed SOME of the documents in that
index, then Solr is getting all confused about your index. I don't know
if Solr can handle changing the maxFieldLength on an existing index
without re-indexing all docs.

Also, if you tell us HOW large you made maxFieldLength, someone (not me)
might be able to say something about if it's so large it could create
some kind of other problem.

On 3/23/2011 1:52 PM, Paul wrote:

> I increased maxFieldLength and reindexed a small number of documents.
> That worked -- I got the correct results. In 3 minutes!
>
> I assume that if I reindex all my documents that all searches will
> become even slower. Is there any way to get all the results in a way
> that is quick enough that my user won't get bored waiting? Is there
> some optimization of this coming in solr 3.0?
>
> On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott<[hidden email]>  wrote:
>> Hi Paul,
>>
>> did you increase the value of the maxFieldLength parameter in your
>> solrconfig.xml?
>>
>> -Sascha
>>
>> On 23.03.2011 17:05, Paul wrote:
>>> I'm using solr 1.4.1.
>>>
>>> I have a document that has a pretty big field. If I search for a
>>> phrase that occurs near the start of that field, it works fine. If I
>>> search for a phrase that appears even a little ways into the field, it
>>> doesn't find it. Is there some limit to how far into a field solr will
>>> search?
>>>
>>> Here's the way I'm doing the search. All I'm changing is the text I'm
>>> searching on to make it succeed or fail:
>>>
>>>
>>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>>>
>>> Or, if it is not related to how large the document is, what else could
>>> it possibly be related to? Could there be some character in that field
>>> that is stopping the search?

Re: Search failing for matched text in large field

Sascha Szott
In reply to this post by Paul-8
On 23.03.2011 18:52, Paul wrote:
> I increased maxFieldLength and reindexed a small number of documents.
> That worked -- I got the correct results. In 3 minutes!
Did you mark the field in question as stored = false?

-Sascha

>
> I assume that if I reindex all my documents that all searches will
> become even slower. Is there any way to get all the results in a way
> that is quick enough that my user won't get bored waiting? Is there
> some optimization of this coming in solr 3.0?
>
> On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott<[hidden email]>  wrote:
>> Hi Paul,
>>
>> did you increase the value of the maxFieldLength parameter in your
>> solrconfig.xml?
>>
>> -Sascha
>>
>> On 23.03.2011 17:05, Paul wrote:
>>>
>>> I'm using solr 1.4.1.
>>>
>>> I have a document that has a pretty big field. If I search for a
>>> phrase that occurs near the start of that field, it works fine. If I
>>> search for a phrase that appears even a little ways into the field, it
>>> doesn't find it. Is there some limit to how far into a field solr will
>>> search?
>>>
>>> Here's the way I'm doing the search. All I'm changing is the text I'm
>>> searching on to make it succeed or fail:
>>>
>>>
>>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>>>
>>> Or, if it is not related to how large the document is, what else could
>>> it possibly be related to? Could there be some character in that field
>>> that is stopping the search?
>>

Storing Nested Fields

Sethi, Parampreet-2
Hi All,

This is regarding nested array functionality. My requirements are:
1. Store the category and sub-category associated with a word in Solr.
2. Allow each word to be listed under multiple categories (and thus
sub-categories).
3. Query based on category or sub-category.

One way is to have two separate array fields in Solr and make sure that
field category[0] is the super-category of field sub-category[0].

Has anyone encountered a similar problem in Solr? Any suggestions would be
great.

Thanks
Param
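
The parallel-array idea described above could look roughly like the sketch below. The field names (`word`, `category`, `sub_category`), the helper names and the sample values are all hypothetical, and the approach assumes the two multi-valued fields come back in the order their values were added.

```python
def build_doc(word, pairs):
    """Build a document from (category, sub_category) pairs for one word.

    The two lists are index-aligned: category[i] is the parent of sub_category[i].
    """
    return {
        "word": word,
        "category": [category for category, _ in pairs],
        "sub_category": [sub for _, sub in pairs],
    }


def associations(doc):
    """Recover the (category, sub_category) pairs, refusing misaligned lists."""
    if len(doc["category"]) != len(doc["sub_category"]):
        raise ValueError("category and sub_category are out of step")
    return list(zip(doc["category"], doc["sub_category"]))


doc = build_doc("jaguar", [("animals", "cats"), ("cars", "sports")])
print(doc)
print(associations(doc))
```

The weak point of this layout is that nothing enforces the alignment, so every writer of the index has to add both values together.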


Re: Search failing for matched text in large field

Paul-8
In reply to this post by Sascha Szott
I looked into the search that I'm doing a little closer and it seems
like the highlighting is slowing it down. If I do the query without
requesting highlighting it is fast. (BTW, I also have faceting and
pagination in my query. Faceting doesn't seem to change the response
time much, adding &rows= and &start= does, but not prohibitively.)

The field in question needs to be stored=true, because it is needed
for highlighting.

I'm thinking of doing this in two searches: first without highlighting
and put a progress spinner next to each result, then do an ajax call
to repeat the search with highlighting that can take its time to
finish.

(I, too, have seen random really long response times that seem to be
related to not enough RAM, but this isn't the problem because the
results here are repeatable.)
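
A rough sketch of the two-request idea, assuming the same core and field as in the original query. The function name `search_url` and the paging defaults are made up for illustration; the first URL would be fetched immediately and the second from the later ajax call.

```python
from urllib.parse import urlencode

# Base URL copied from the original query in this thread.
SOLR_SELECT = "http://localhost:8983/solr/my_core/select/"


def search_url(phrase, start=0, rows=10, highlight=False):
    """Build a phrase-query URL, adding highlighting parameters only on request."""
    params = [("q", '"%s"' % phrase), ("start", start), ("rows", rows)]
    if highlight:
        params += [("hl", "on"), ("hl.fl", "text")]
    return SOLR_SELECT + "?" + urlencode(params)


fast_url = search_url("search phrase")                  # first pass: results only
slow_url = search_url("search phrase", highlight=True)  # second pass: with snippets

print(fast_url)
print(slow_url)
```

Keeping q, start and rows identical in both requests means the highlighted snippets line up with the results already on screen.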

On Wed, Mar 23, 2011 at 2:30 PM, Sascha Szott <[hidden email]> wrote:

> On 23.03.2011 18:52, Paul wrote:
>>
>> I increased maxFieldLength and reindexed a small number of documents.
>> That worked -- I got the correct results. In 3 minutes!
>
> Did you mark the field in question as stored = false?
>
> -Sascha
>
>>
>> I assume that if I reindex all my documents that all searches will
>> become even slower. Is there any way to get all the results in a way
>> that is quick enough that my user won't get bored waiting? Is there
>> some optimization of this coming in solr 3.0?
>>
>> On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott<[hidden email]>  wrote:
>>>
>>> Hi Paul,
>>>
>>> did you increase the value of the maxFieldLength parameter in your
>>> solrconfig.xml?
>>>
>>> -Sascha
>>>
>>> On 23.03.2011 17:05, Paul wrote:
>>>>
>>>> I'm using solr 1.4.1.
>>>>
>>>> I have a document that has a pretty big field. If I search for a
>>>> phrase that occurs near the start of that field, it works fine. If I
>>>> search for a phrase that appears even a little ways into the field, it
>>>> doesn't find it. Is there some limit to how far into a field solr will
>>>> search?
>>>>
>>>> Here's the way I'm doing the search. All I'm changing is the text I'm
>>>> searching on to make it succeed or fail:
>>>>
>>>>
>>>>
>>>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>>>>
>>>> Or, if it is not related to how large the document is, what else could
>>>> it possibly be related to? Could there be some character in that field
>>>> that is stopping the search?
>>>
>

Re: Search failing for matched text in large field

Jonathan Rochkind
Yeah, you aren't going to be able to do highlighting on a very, very
large field without terrible performance.  I believe it's just the
nature of the algorithm used by the highlighting component. I don't know
of any workaround, other than inventing a new algorithm for
highlighting and writing a component for it.

Even with an AJAX call, you don't want to wait 3 minutes. Plus the load
on your server.

On 3/23/2011 3:52 PM, Paul wrote:

> I looked into the search that I'm doing a little closer and it seems
> like the highlighting is slowing it down. If I do the query without
> requesting highlighting it is fast. (BTW, I also have faceting and
> pagination in my query. Faceting doesn't seem to change the response
> time much, adding &rows= and &start= does, but not prohibitively.)
>
> The field in question needs to be stored=true, because it is needed
> for highlighting.
>
> I'm thinking of doing this in two searches: first without highlighting
> and put a progress spinner next to each result, then do an ajax call
> to repeat the search with highlighting that can take its time to
> finish.
>
> (I, too, have seen random really long response times that seem to be
> related to not enough RAM, but this isn't the problem because the
> results here are repeatable.)
>
> On Wed, Mar 23, 2011 at 2:30 PM, Sascha Szott<[hidden email]>  wrote:
>> On 23.03.2011 18:52, Paul wrote:
>>> I increased maxFieldLength and reindexed a small number of documents.
>>> That worked -- I got the correct results. In 3 minutes!
>> Did you mark the field in question as stored = false?
>>
>> -Sascha
>>
>>> I assume that if I reindex all my documents that all searches will
>>> become even slower. Is there any way to get all the results in a way
>>> that is quick enough that my user won't get bored waiting? Is there
>>> some optimization of this coming in solr 3.0?
>>>
>>> On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott<[hidden email]>    wrote:
>>>> Hi Paul,
>>>>
>>>> did you increase the value of the maxFieldLength parameter in your
>>>> solrconfig.xml?
>>>>
>>>> -Sascha
>>>>
>>>> On 23.03.2011 17:05, Paul wrote:
>>>>> I'm using solr 1.4.1.
>>>>>
>>>>> I have a document that has a pretty big field. If I search for a
>>>>> phrase that occurs near the start of that field, it works fine. If I
>>>>> search for a phrase that appears even a little ways into the field, it
>>>>> doesn't find it. Is there some limit to how far into a field solr will
>>>>> search?
>>>>>
>>>>> Here's the way I'm doing the search. All I'm changing is the text I'm
>>>>> searching on to make it succeed or fail:
>>>>>
>>>>>
>>>>>
>>>>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>>>>>
>>>>> Or, if it is not related to how large the document is, what else could
>>>>> it possibly be related to? Could there be some character in that field
>>>>> that is stopping the search?

Re: Search failing for matched text in large field

Markus Jelsma-2
In reply to this post by Paul-8
Enable TermVectors for fields that you're going to highlight. If they are
disabled, Solr will reanalyze the field, killing performance.
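
One way to check for this is to scan the schema for highlighted fields that lack termVectors="true". The snippet below is a sketch against a made-up schema fragment: the field list and the function name are hypothetical, and setting termPositions and termOffsets alongside termVectors is an assumption about what the highlighter needs rather than something stated in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the style of a Solr schema.xml <fields> section.
SCHEMA_SNIPPET = """
<fields>
  <field name="id" type="string" indexed="true" stored="true"/>
  <field name="text" type="text" indexed="true" stored="true"
         termVectors="true" termPositions="true" termOffsets="true"/>
  <field name="title" type="text" indexed="true" stored="true"/>
</fields>
"""


def fields_without_term_vectors(schema_xml, highlighted):
    """Return the names of highlighted fields that do not set termVectors="true"."""
    root = ET.fromstring(schema_xml)
    missing = []
    for field in root.iter("field"):
        name = field.get("name")
        if name in highlighted and field.get("termVectors") != "true":
            missing.append(name)
    return missing


print(fields_without_term_vectors(SCHEMA_SNIPPET, {"text", "title"}))
```

Term vectors are written at index time, so documents need to be reindexed after the schema change before the highlighter can use them.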

> I looked into the search that I'm doing a little closer and it seems
> like the highlighting is slowing it down. If I do the query without
> requesting highlighting it is fast. (BTW, I also have faceting and
> pagination in my query. Faceting doesn't seem to change the response
> time much, adding &rows= and &start= does, but not prohibitively.)
>
> The field in question needs to be stored=true, because it is needed
> for highlighting.
>
> I'm thinking of doing this in two searches: first without highlighting
> and put a progress spinner next to each result, then do an ajax call
> to repeat the search with highlighting that can take its time to
> finish.
>
> (I, too, have seen random really long response times that seem to be
> related to not enough RAM, but this isn't the problem because the
> results here are repeatable.)
>
> On Wed, Mar 23, 2011 at 2:30 PM, Sascha Szott <[hidden email]> wrote:
> > On 23.03.2011 18:52, Paul wrote:
> >> I increased maxFieldLength and reindexed a small number of documents.
> >> That worked -- I got the correct results. In 3 minutes!
> >
> > Did you mark the field in question as stored = false?
> >
> > -Sascha
> >
> >> I assume that if I reindex all my documents that all searches will
> >> become even slower. Is there any way to get all the results in a way
> >> that is quick enough that my user won't get bored waiting? Is there
> >> some optimization of this coming in solr 3.0?
> >>
> >> On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott<[hidden email]>  wrote:
> >>> Hi Paul,
> >>>
> >>> did you increase the value of the maxFieldLength parameter in your
> >>> solrconfig.xml?
> >>>
> >>> -Sascha
> >>>
> >>> On 23.03.2011 17:05, Paul wrote:
> >>>> I'm using solr 1.4.1.
> >>>>
> >>>> I have a document that has a pretty big field. If I search for a
> >>>> phrase that occurs near the start of that field, it works fine. If I
> >>>> search for a phrase that appears even a little ways into the field, it
> >>>> doesn't find it. Is there some limit to how far into a field solr will
> >>>> search?
> >>>>
> >>>> Here's the way I'm doing the search. All I'm changing is the text I'm
> >>>> searching on to make it succeed or fail:
> >>>>
> >>>>
> >>>>
> >>>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
> >>>>
> >>>> Or, if it is not related to how large the document is, what else could
> >>>> it possibly be related to? Could there be some character in that field
> >>>> that is stopping the search?