OUTOFMEMORY ERROR


MariLuz Elola
Hi, I have a problem when I run a simple query, without sorting, against an index with 210,000 documents.
After executing the query several times I get an OutOfMemoryError.
I am creating a new IndexSearcher(pathDir) for every search.
I don't know whether I need to create a single IndexSearcher and cache it.
If I search an index with only 50,000 documents, the OutOfMemoryError doesn't appear.
------------------------
ENVIRONMENT DESCRIPTION:
------------------------

---SERVER---
MEMORY 2GB
APP SERVER JBoss 3.2.3
JAVA_OPTS -Xmx640M -Xms640M

----LUCENE 1.4.3-------
INDEX approx. 210,000 documents
EACH DOCUMENT approx. 20 fields (metadata)
TEXT SIZE PER DOCUMENT 1 KB

------------------------
ERROR:
------------------------
18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
java.lang.OutOfMemoryError
18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
java.lang.OutOfMemoryError
18:52:18,660 ERROR [STDERR] java.rmi.ServerError: Unexpected Error; nested exception is:
        java.lang.OutOfMemoryError
18:52:18,661 ERROR [STDERR]     at org.jboss.ejb.plugins.LogInterceptor.handleException(LogInterceptor.java:374)
18:52:18,661 ERROR [STDERR]     at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:195)
18:52:18,661 ERROR [STDERR]     at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:122)
18:52:18,662 ERROR [STDERR]     at org.jboss.ejb.StatelessSessionContainer.internalInvoke(StatelessSessionContainer.java:331)
18:52:18,662 ERROR [STDERR]     at org.jboss.ejb.Container.invoke(Container.java:700)
18:52:18,662 ERROR [STDERR]     at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
18:52:18,662 ERROR [STDERR]     at sun.reflect.DelegatingMethodAccessorImpl.invok
.
.
Exception java.lang.OutOfMemoryError: requested 4 bytes for CMS: Work queue overflow; try -XX:-CMSParallelRemarkEnabled. Out of swap space?


Could anybody help me???

Thanks in advance

    Mari Luz





Re: OUTOFMEMORY ERROR

Erik Hatcher
We'll need some more details to help.  What query was it?

     Erik

On Jul 6, 2005, at 1:22 PM, MariLuz Elola wrote:

> Hi, I have a problem when I run a simple query, without sorting,
> against an index with 210,000 documents.
> After executing the query several times I get an OutOfMemoryError.


Re: OUTOFMEMORY ERROR

MariLuz Elola
The query is ==> ID:0*
This query returns all the documents, exactly 210,000 of them.
If the user doesn't specify any criteria in the search user interface,
the server searches all the documents.

    Mari Luz



---------------------------------------------------
Mari Luz Elola
Developer Engineer
Caleruega, 67
28033 Madrid (Spain)
Tel.: +34 91 768 46 58
mailto: [hidden email]
---------------------------------------------------
Privileged/Confidential Information may be contained in this message and is
intended solely for the use of the named addressee(s). Access to this e-mail
by anyone else is unauthorised. If you are not the intended recipient, any
disclosure, copying, distribution or re-use of the information contained in
it is prohibited and may be unlawful. Opinions, conclusions and any other
information contained in this message that do not relate to the official
business of Seinet shall be understood as neither given nor endorsed by it.
If you have received this communication in error, please notify us
immediately by replying to this mail and deleting it from your computer.
Thank you.
----- Original Message -----
From: "Erik Hatcher" <[hidden email]>
To: <[hidden email]>
Sent: Wednesday, July 06, 2005 8:12 PM
Subject: Re: OUTOFMEMORY ERROR


We'll need some more details to help.  What query was it?

     Erik



Re: OUTOFMEMORY ERROR

Erik Hatcher

On Jul 7, 2005, at 6:02 AM, MariLuz Elola wrote:
> The query is ==> ID:0*
> This query returns all the documents, exactly 210,000 of them.
> If the user doesn't specify any criteria in the search user interface,
> the server searches all the documents.

Doing a prefix query (which ID:0* is) internally builds a
BooleanQuery OR'ing all unique terms in the ID field that begin with
a "0".  The built-in limit is 1,024 clauses in a BooleanQuery.

You will need to re-think your approach.  If the goal is to return
all documents, then use IndexReader to walk them.  If the goal is to
have a general user query expression where ID:0* could be entered, you
will need to account for that possibility with more system resources
and by bumping up the BooleanQuery limit, or by indexing differently so
that not so many terms are put into the BooleanQuery.  It is
difficult to offer specific advice as I'm not sure what your use
cases are.

     Erik
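
To make the point about prefix expansion concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene's code: the class PrefixExpansionSketch, the methods expand and sampleIds, and the zero-padded ID format are all assumptions made for illustration. It only shows why a prefix that matches every unique ID produces one clause per ID and runs into a 1,024-clause limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model only: imitates the idea of prefix expansion, not Lucene's implementation.
final class PrefixExpansionSketch {
    // Mirrors the default clause limit mentioned in the thread.
    static final int MAX_CLAUSES = 1024;

    // Returns one "clause" per unique term that starts with the prefix,
    // and fails as soon as the clause limit would be exceeded.
    static List<String> expand(SortedSet<String> uniqueTerms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // Terms are sorted, so all matches are contiguous starting at the prefix.
        for (String term : uniqueTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() == maxClauses) {
                throw new IllegalStateException("too many clauses: limit is " + maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    // Builds zero-padded IDs such as 0000000, 0000001, ... (hypothetical ID format).
    static SortedSet<String> sampleIds(int count) {
        SortedSet<String> ids = new TreeSet<>();
        for (int i = 0; i < count; i++) {
            ids.add(String.format("%07d", i));
        }
        return ids;
    }

    public static void main(String[] args) {
        // 500 unique IDs stay under the limit.
        System.out.println(expand(sampleIds(500), "0", MAX_CLAUSES).size());
        // 210,000 unique IDs all start with "0", so the expansion is rejected.
        try {
            expand(sampleIds(210_000), "0", MAX_CLAUSES);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the first call yields 500 clauses and the second is rejected. Raising the limit only moves the problem: the expanded query still holds one clause per unique ID.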




Re: OUTOFMEMORY ERROR

MariLuz Elola
Thanks Erik,
I was wrong: the query that actually throws the OutOfMemoryError is ==>
ID:0* -ID:xtent.
I tried to reproduce the error with the query ID:0*, but the exception
doesn't appear.
I will use IndexReader instead of IndexSearcher to get all the
documents. It's a good idea.
Another thing: when the user searches without entering any query, internally I
create the following query ==> ID:0* OR NOT ID:xtent. When QueryParser parses
this query I get ID:0* -ID:xtent (which translates to ==> ID:0* AND NOT
ID:xtent), doesn't it? Is QueryParser working incorrectly?
About maxClauseCount (1024 by default), I am setting this property:
org.apache.lucene.search.BooleanQuery.maxClauseCount=es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;

    Mari Luz

----- Original Message -----
From: "Erik Hatcher" <[hidden email]>
To: <[hidden email]>
Sent: Thursday, July 07, 2005 2:46 PM
Subject: Re: OUTOFMEMORY ERROR



On Jul 7, 2005, at 6:02 AM, MariLuz Elola wrote:
> The query is ==> ID:0*
> This query returns all the documents, exactly 210,000 of them.
> If the user doesn't specify any criteria in the search user interface,
> the server searches all the documents.

Doing a prefix query (which ID:0* is) internally builds a
BooleanQuery OR'ing all unique terms in the ID field that begin with
a "0".  The built-in limit is 1,024 clauses in a BooleanQuery.

You will need to re-think your approach.  If the goal is to return
all documents, then use IndexReader to walk them.  If the goal is to
have a general user query expression where ID:0* could be entered, you
will need to account for that possibility with more system resources
and by bumping up the BooleanQuery limit, or by indexing differently so
that not so many terms are put into the BooleanQuery.  It is
difficult to offer specific advice as I'm not sure what your use
cases are.

     Erik





Re: OUTOFMEMORY ERROR

MariLuz Elola
Erik, I have a problem.
First, I have created several IndexWriters.
One of them has 210,000 documents, and in the future there will be
IndexWriters with millions of documents.
I need to obtain all the documents.
I am searching with the query ID:0* because this query returns all the
documents.
Specifically, I am reading the ID metadata (hits.doc(start).get(.ID)); I am
getting the IDs of all the documents of a specific IndexWriter.
I run out of memory doing it.
About maxClauseCount (1024 by default), I am setting this property:
org.apache.lucene.search.BooleanQuery.maxClauseCount=es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;
You gave me an idea: to use IndexReader instead of IndexSearcher to
get all the documents.
I think that it is not possible to use IndexReader, because I need the ID,
not the physical files:

      Directory directory = FSDirectory.getDirectory(path, false);
      IndexReader reader = IndexReader.open(directory);
      for (int i = 0; i < reader.maxDoc(); i++) ............

Moreover, "directory" contains all the documents of all the IndexWriters.


        Mari Luz



Re: OUTOFMEMORY ERROR

MariLuz Elola
Sorry, I was wrong again.
I can use IndexReader... forget the last email :-D
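
For readers following along, the walk suggested earlier in the thread can be sketched roughly as below. This is plain Java with a hypothetical DocSource interface standing in for the reader; DocSource, idOf, WalkAllIdsSketch and inMemory are names invented for this illustration. As far as I recall, the corresponding Lucene 1.4 calls are IndexReader.maxDoc(), IndexReader.isDeleted(int) and IndexReader.document(int) followed by get on the ID field, but treat that mapping as an assumption to check against the API docs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the reader calls the walk relies on.
interface DocSource {
    int maxDoc();                   // one past the highest document number
    boolean isDeleted(int docNum);  // true for slots whose document was deleted
    String idOf(int docNum);        // stored ID value of a live document
}

final class WalkAllIdsSketch {
    // Hands each live document's ID to the sink one at a time, so the caller
    // never has to hold a full result set in memory. Returns the number visited.
    static int forEachId(DocSource source, Consumer<String> sink) {
        int visited = 0;
        for (int i = 0; i < source.maxDoc(); i++) {
            if (source.isDeleted(i)) {
                continue; // deleted slots still count towards maxDoc
            }
            sink.accept(source.idOf(i));
            visited++;
        }
        return visited;
    }

    // In-memory source for the demo: a null entry marks a deleted slot.
    static DocSource inMemory(final String[] ids) {
        return new DocSource() {
            @Override public int maxDoc() { return ids.length; }
            @Override public boolean isDeleted(int docNum) { return ids[docNum] == null; }
            @Override public String idOf(int docNum) { return ids[docNum]; }
        };
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        int count = forEachId(inMemory(new String[] {"0001", null, "0003"}), seen::add);
        System.out.println(count + " live documents: " + seen);
    }
}
```

The point of the design is that no query is built at all, so there is no clause expansion and no scored hit list; memory use depends only on what the sink keeps.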

----- Original Message -----
From: "MariLuz Elola" <[hidden email]>
To: <[hidden email]>
Sent: Thursday, July 07, 2005 4:16 PM
Subject: Re: OUTOFMEMORY ERROR


> Erik, I have a problem.


Re: OUTOFMEMORY ERROR

Erik Hatcher
In reply to this post by MariLuz Elola

On Jul 7, 2005, at 9:40 AM, MariLuz Elola wrote:
> Thanks Erik,
> I was wrong: the query that actually throws the OutOfMemory error is
> ==> ID:0* -ID:xtent.
> I have tried to reproduce the error with the query ID:0*, but the
> exception doesn't appear.

> Another thing: when the user searches without entering any query,
> internally I create the following query ==> ID:0* OR NOT ID:xtent.

That's a hairy query.  I definitely do not recommend doing something  
like that with prefix queries.  Check out using a Filter for some of  
this sort of thing also.

> And when this query is parsed by QueryParser I get ID:0* -ID:xtent
> (translated ==> ID:0* AND NOT ID:xtent), right? Is QueryParser
> working incorrectly?

It depends.  By default, QueryParser uses OR as the default operator.

> About maxClauseCount (1024 by default), I am setting this property:
> org.apache.lucene.search.BooleanQuery.maxClauseCount=es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;

Bumping up that limit is not necessarily the best thing to do - I  
recommend changing your approach to querying all documents rather  
than trying to make BooleanQuery happy with an enormously inefficient  
query.

     Erik


>
>    Mari Luz
>
> ----- Original Message ----- From: "Erik Hatcher"  
> <[hidden email]>
> To: <[hidden email]>
> Sent: Thursday, July 07, 2005 2:46 PM
> Subject: Re: OUTOFMEMORY ERROR
>
>
>
> On Jul 7, 2005, at 6:02 AM, MariLuz Elola wrote:
>
>> The query is ==> ID:0*
>> This query returns all the documents, exactly 210.000 documents.
>> If the user doesn't specify any criteria in the search user
>> interface, the server searches all the documents.
>>
>
> Doing a prefix query (which ID:0* is) internally builds a
> BooleanQuery OR'ing all unique terms in the ID field that begin with
> a "0".  The built in limit is 1,024 clauses in a BooleanQuery.
>
> You will need to re-think your approach.  If the goal is to return
> all documents, then use IndexReader to walk them.  If the goal is to
> have a general user query expression where ID:0* would be entered you
> will need to account for that possibility with more system resources
> and bumping up the BooleanQuery limit or indexing differently so that
> there are not so many terms being put into the BooleanQuery.  It is
> difficult to offer specific advice as I'm not sure what your use
> cases are.
>
>     Erik
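
As a rough illustration of the expansion described above, here is a plain-Java model, not Lucene's actual implementation. It treats the ID field's unique terms as a sorted set, collects every term that starts with the prefix, and gives up once a clause limit is exceeded. The class name, the method names, and the zero-padded ID format are assumptions made up for this sketch; only the 1,024 default comes from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Plain-Java model of prefix expansion; not Lucene's actual implementation. */
final class PrefixExpansionSketch {

    /** Mirrors the default clause limit mentioned in the thread. */
    static final int MAX_CLAUSE_COUNT = 1024;

    /** Collects every term starting with the prefix, failing once the limit is exceeded. */
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // tailSet(prefix) starts at the first term >= prefix, so matches are contiguous.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last term sharing the prefix
            }
            if (clauses.size() == maxClauses) {
                throw new IllegalStateException("too many clauses: more than " + maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    /** Builds zero-padded IDs such as 0000000, 0000001, ... (a hypothetical ID format). */
    static SortedSet<String> sampleIds(int count) {
        SortedSet<String> ids = new TreeSet<>();
        for (int i = 0; i < count; i++) {
            ids.add(String.format("%07d", i));
        }
        return ids;
    }

    public static void main(String[] args) {
        // A small index stays under the limit.
        System.out.println(expand(sampleIds(500), "0", MAX_CLAUSE_COUNT).size());
        // An index the size of the one in this thread does not.
        try {
            expand(sampleIds(210000), "0", MAX_CLAUSE_COUNT);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the model is only that the cost of a prefix query grows with the number of unique terms under the prefix, not with the length of the query string.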


Re: OUTOFMEMORY ERROR

MariLuz Elola
Hi Erik, excuse me for all my questions. Thank you very much for your speedy
answers, and sorry for my bad English.
I am Spanish and I don't speak English very well.
Well, I have one more question.
In the end I am using IndexReader to return all the documents:
        Directory directory = FSDirectory.getDirectory(path, false);
        IndexReader reader = IndexReader.open(directory);
        // base and end delimit the range of document numbers to read
        for (int start = base; start < end; start++) {
            Document doc = reader.document(start);
            String id = doc.get(es.seinet.xtent.searchEngine.lucene.general.Util.ID);
            ides.add(id);
        }
It works well and quickly. The only problem is that it is impossible to sort
the results by some metadata (for example, to get all the documents ordered
by title).

My question is about the maxClauseCount parameter. I agree with you: it is
not a good idea to bump up the limit...
If I use the default value (1024) and I search, I get this error:
[SearchCollection,executeQuery] caught a class
org.apache.lucene.search.BooleanQuery$TooManyClauses
 with message: null

Is there any way to search all the documents (210.000 documents) and have it
internally work with only 1024, returning documents up to 1024, without
getting the TooManyClauses error? I need to work efficiently with collections
of more than 250.000 records, and the users normally run complex queries
(e.g. DATE:[20050601 to 20050701] AND TITLE:Lucene* ... etc.)

Ah! I have seen that you are Erik Hatcher, the author of Lucene in
Action!
I didn't understand what you meant about the filter... well, I will read the
chapter on filtering a search :-D

Thanks in advance

        Mari Luz




Re: OUTOFMEMORY ERROR

Erik Hatcher

On Jul 7, 2005, at 1:12 PM, MariLuz Elola wrote:

> Hi Erik, excuse me for all my questions. Thank you very much for
> your speedy answers, and sorry for my bad English.
> I am Spanish and I don't speak English very well.
> Well, I have one more question.
> In the end I am using IndexReader to return all the documents:
>        Directory directory = FSDirectory.getDirectory(path, false);
>        IndexReader reader = IndexReader.open(directory);
>        for (int start = base; start < end; start++) {
>            Document doc = reader.document(start);
>            String id = doc.get(es.seinet.xtent.searchEngine.lucene.general.Util.ID);
>            ides.add(id);
>        }
> It works well and quickly. The only problem is that it is impossible
> to sort the results by some metadata (for example, to get all the
> documents ordered by title).

If you truly need to have a Query that can find all documents, then  
add a special field to each document with a fixed value such as  
doc:yes and then do a TermQuery for doc:yes.  You could then leverage  
Lucene's sorting capability.
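
To make the marker-field idea concrete, here is a small plain-Java model. It is not Lucene code: the class, its methods, and the sample titles are invented for illustration, and only the doc:yes marker comes from the suggestion above. Every document gets the same fixed term when it is added, so a single exact-term lookup matches all documents and the results can then be ordered by any stored field.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a "match every document" marker term; not Lucene code. */
final class MarkerFieldSketch {

    private final List<Map<String, String>> docs = new ArrayList<>();
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Stores the document and indexes each field, plus the fixed marker doc:yes. */
    void add(Map<String, String> fields) {
        Map<String, String> doc = new HashMap<>(fields);
        doc.put("doc", "yes");
        int docId = docs.size();
        docs.add(doc);
        for (Map.Entry<String, String> e : doc.entrySet()) {
            String term = e.getKey() + ":" + e.getValue();
            postings.computeIfAbsent(term, k -> new ArrayList<>()).add(docId);
        }
    }

    /**
     * Looks up one exact term and returns the sortField values of the matching
     * documents in ascending order. Assumes every match has a value for sortField.
     */
    List<String> search(String field, String value, String sortField) {
        List<String> out = new ArrayList<>();
        List<Integer> hits = postings.getOrDefault(field + ":" + value, new ArrayList<>());
        for (int docId : hits) {
            out.add(docs.get(docId).get(sortField));
        }
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        MarkerFieldSketch index = new MarkerFieldSketch();
        index.add(Collections.singletonMap("title", "Lucene"));
        index.add(Collections.singletonMap("title", "Ant"));
        index.add(Collections.singletonMap("title", "JBoss"));
        // One term lookup finds every document; no prefix expansion is involved.
        System.out.println(index.search("doc", "yes", "title"));
    }
}
```

The lookup touches exactly one term regardless of how many documents exist, which is why this approach avoids the clause limit entirely.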

> My question is about the maxClauseCount parameter. I agree with
> you: it is not a good idea to bump up the limit...
> If I use the default value (1024) and I search, I get this
> error:
> [SearchCollection,executeQuery] caught a class
> org.apache.lucene.search.BooleanQuery$TooManyClauses
> with message: null
>
> Is there any way to search all the documents (210.000 documents)
> and have it internally work with only 1024, returning documents up
> to 1024, without getting the TooManyClauses error? I need to work
> efficiently with collections of more than 250.000 records, and the
> users normally run complex queries (e.g. DATE:[20050601 to
> 20050701] AND TITLE:Lucene* ... etc.)

The issue is that PrefixQuery, WildcardQuery, RangeQuery, and  
FuzzyQuery all expand to the terms that match in a BooleanQuery OR  
fashion.  You need to identify what terms those are and address them  
individually.  I can't offer specific advice since I don't know what  
fields you're using and what values they may contain.  But one  
example is with dates.  If you index dates at millisecond
granularity but you really only need to query by year, then there is a
good chance one of those query types will expand to enough terms to
throw TooManyClauses.  If, instead, you index dates as YYYY when all
you need is year granularity, then you have far fewer terms.  I hope
this makes sense and helps.

     Erik
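
The effect of date granularity on term counts can be sketched in plain Java. This is only a model under stated assumptions: the class name, the sample timestamps (one every 61,001 ms starting at 2005-01-01 UTC), and the three patterns are invented for illustration and say nothing about how dates were indexed in the application discussed here. It counts how many distinct terms the same timestamps produce at each granularity, which is roughly the number of clauses a range covering all of them would have to expand to.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Set;
import java.util.TreeSet;

/** Counts distinct index terms produced by the same timestamps at different granularities. */
final class DateGranularitySketch {

    /** Formats each timestamp with the given UTC pattern and counts the unique results. */
    static int distinctTerms(long[] epochMillis, String pattern) {
        DateTimeFormatter format = DateTimeFormatter.ofPattern(pattern).withZone(ZoneOffset.UTC);
        Set<String> terms = new TreeSet<>();
        for (long millis : epochMillis) {
            terms.add(format.format(Instant.ofEpochMilli(millis)));
        }
        return terms.size();
    }

    /** Hypothetical sample data: count timestamps spaced stepMillis apart. */
    static long[] sampleTimestamps(int count, long stepMillis) {
        long start = Instant.parse("2005-01-01T00:00:00Z").toEpochMilli();
        long[] out = new long[count];
        for (int i = 0; i < count; i++) {
            out[i] = start + i * stepMillis;
        }
        return out;
    }

    public static void main(String[] args) {
        long[] stamps = sampleTimestamps(5000, 61001L);
        System.out.println("millisecond terms: " + distinctTerms(stamps, "yyyyMMddHHmmssSSS"));
        System.out.println("day terms:         " + distinctTerms(stamps, "yyyyMMdd"));
        System.out.println("year terms:        " + distinctTerms(stamps, "yyyy"));
    }
}
```

With this sample data every timestamp becomes its own term at millisecond granularity, which is already well past a 1,024-clause limit, while the day and year forms collapse to a handful of terms.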