NumberFormatException when creating field cache

adb
I'm using Lucene 2.3.2 and have a date field used for sorting, stored in the
form YYYYMMDDHHMM.  I get the following exception when the FieldCache is being
generated:

java.lang.NumberFormatException: For input string: "190400-412317"
         at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
         at java.lang.Long.parseLong(Long.java:412)
         at java.lang.Long.parseLong(Long.java:461)
         at org.apache.lucene.search.ExtendedFieldCacheImpl$1.parseLong(ExtendedFieldCacheImpl.java:18)
         at org.apache.lucene.search.ExtendedFieldCacheImpl$3.createValue(ExtendedFieldCacheImpl.java:53)
         at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
         at org.apache.lucene.search.ExtendedFieldCacheImpl.getLongs(ExtendedFieldCacheImpl.java:36)
         at org.apache.lucene.search.ExtendedFieldCacheImpl.getLongs(ExtendedFieldCacheImpl.java:30)
         at org.apache.lucene.search.FieldSortedHitQueue.comparatorLong(FieldSortedHitQueue.java:254)
         at org.apache.lucene.search.FieldSortedHitQueue$1.createValue(FieldSortedHitQueue.java:194)
         at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
         at org.apache.lucene.search.FieldSortedHitQueue.getCachedComparator(FieldSortedHitQueue.java:168)
         at org.apache.lucene.search.FieldSortedHitQueue.<init>(FieldSortedHitQueue.java:56)

I can't get onto the server that holds the index at the moment, but I expect
the data in the index is corrupt.  That may be because I did not validate
certain data supplied by a 'trusted' source.  The problem now is that, as long
as that data exists as it is, I can never sort on the date field.

It may be that the original data for the Document is no longer available, so
deleting and re-creating it may not be an option.

Would it be useful to allow some sort of data tolerance when creating these
caches?  At the moment the only solution is to delete that Document.  Perhaps
the Parser implementations could return 0 for values that fail numeric
parsing.

Antony



---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: NumberFormatException when creating field cache

Mark Miller-3
Antony Bowesman wrote:

> Would it be useful to allow some sort of data tolerance when creating
> these caches?  At least now the only solution is to delete that
> Document.  Perhaps the values could then be returned as 0 in the
> Parser implementations for numeric failures.
>
Validation adds cost - it's best to validate the value yourself before
putting it in the field if you need that.
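
As an illustration of validating before indexing, here is a minimal sketch of a check for the YYYYMMDDHHMM sort field. The class and method names (SortDateValidator, isValidSortDate) are hypothetical and not part of Lucene, and treating the timestamps as UTC is an assumption made only so that the check does not depend on the local time zone:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;
import java.util.regex.Pattern;

public class SortDateValidator {
    // Exactly twelve ASCII digits: YYYYMMDDHHMM.
    private static final Pattern TWELVE_DIGITS = Pattern.compile("[0-9]{12}");

    /** Returns true only if value is twelve digits that form a real calendar date and time. */
    public static boolean isValidSortDate(String value) {
        if (value == null || !TWELVE_DIGITS.matcher(value).matches()) {
            return false;
        }
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmm");
        format.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: no DST gaps to trip over
        format.setLenient(false); // reject month 13, day 32, hour 25 and similar
        try {
            format.parse(value);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidSortDate("200904121530"));  // prints "true"
        System.out.println(isValidSortDate("190400-412317")); // prints "false"
    }
}
```

A value that fails the check can be rejected or repaired by the application before the Document is added, so the bad term never reaches the index.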

Deleting the document is not enough on its own: you also have to purge the
delete, either by causing the right merge or by optimizing (which is
guaranteed to cause the right merge). Until then, deleted or not, the
FieldCache will still attempt to parse the bad Long.

--
- Mark

http://www.lucidimagination.com





Re: NumberFormatException when creating field cache

hossman

: Would it be useful to allow some sort of data tolerance when creating these
: caches?  At least now the only solution is to delete that Document.  Perhaps
: the values could then be returned as 0 in the Parser implementations for
: numeric failures.

picking an arbitrary number wouldn't be very general purpose.  it would
hide errors.

clients that want more tolerant parsing behavior are free to provide
their own parser...
http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/search/FieldCache.IntParser.html
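
A minimal sketch of what such a tolerant parser could look like for long values. To keep the example self-contained, it declares a local LongParser interface as a stand-in, on the assumption that the Lucene long parser has the same single parseLong(String) method seen in the stack trace; the class name and the tolerant() helper are hypothetical. How the parser is then passed to the field cache depends on the Lucene version and is not shown:

```java
public class TolerantLongParserSketch {
    /** Local stand-in for the Lucene parser interface (assumed shape, not the real type). */
    public interface LongParser {
        long parseLong(String value);
    }

    /** Builds a parser that returns the caller's fallback instead of throwing on bad input. */
    public static LongParser tolerant(final long fallback) {
        return new LongParser() {
            public long parseLong(String value) {
                try {
                    return Long.parseLong(value);
                } catch (NumberFormatException e) {
                    // The bad term is mapped to the fallback; the error is no longer visible.
                    return fallback;
                }
            }
        };
    }

    public static void main(String[] args) {
        LongParser parser = tolerant(Long.MIN_VALUE);
        System.out.println(parser.parseLong("200904121530"));  // prints "200904121530"
        System.out.println(parser.parseLong("190400-412317")); // prints "-9223372036854775808"
    }
}
```

The fallback is left to the caller rather than fixed at 0, since the right value depends on where the application wants unparsable documents to sort (Long.MIN_VALUE puts them first in ascending order).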




-Hoss

