Handling Indexed, Stored and Tokenized fields

ts01
Hi,

We need to both index and store multiple fields in a document, each with its own tokenizer. The following constructor seems to let us index a field with its own tokenizer, but it does not store the value:

Field(String name, Reader reader)

The following constructor lets us both index and store a field, but when index is set to Index.TOKENIZED it uses the default tokenizer for the entire document:

Field(String name, String value, Store store, Index index)

What we need is the following: Store.YES and Index.TOKENIZED, together with a field-specific reader:

Field(String name, String value, Store store, Index index, Reader reader)

Assuming that this capability does not already exist, we added code to the Field, IndexWriter and AbstractField classes to provide it. We'd be happy to contribute this to the next Open Source release.

Another possibility was to create two fields for each document, field_indexed and field_stored, but that would be an inconvenient workaround.

Is this of interest to folks?

Thanks.
Re: Handling Indexed, Stored and Tokenized fields

Doron Cohen-2
Seems that PerFieldAnalyzerWrapper would be convenient here?
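
To make the idea concrete, here is a rough plain-Java sketch of the dispatch that PerFieldAnalyzerWrapper performs: a default analyzer plus per-field overrides looked up by field name. It is not Lucene code. The class names PerFieldTokenizer and PerFieldTokenizerDemo, the field names "title" and "sku", and the two toy tokenizers are all made up for illustration. In Lucene itself the overrides would be registered with PerFieldAnalyzerWrapper.addAnalyzer(fieldName, analyzer), the wrapper would be handed to the IndexWriter, and each field would be created with the existing Field(String name, String value, Store store, Index index) constructor, so the stored value stays the original string and only the tokenization differs per field.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Plain-Java stand-in for per-field analyzer dispatch; hypothetical, not the Lucene API.
class PerFieldTokenizer {
    private final Function<String, List<String>> defaultTokenizer;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    PerFieldTokenizer(Function<String, List<String>> defaultTokenizer) {
        this.defaultTokenizer = defaultTokenizer;
    }

    // Register a tokenizer that overrides the default for one field name.
    void addTokenizer(String fieldName, Function<String, List<String>> tokenizer) {
        perField.put(fieldName, tokenizer);
    }

    // Use the field-specific tokenizer if one is registered, else the default.
    List<String> tokenize(String fieldName, String value) {
        return perField.getOrDefault(fieldName, defaultTokenizer).apply(value);
    }
}

class PerFieldTokenizerDemo {
    // Toy default: lowercase, then split on runs of whitespace.
    static List<String> whitespace(String value) {
        String trimmed = value.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Toy override: keep the whole value as a single token.
    static List<String> keyword(String value) {
        return Collections.singletonList(value);
    }

    static PerFieldTokenizer build() {
        PerFieldTokenizer tokenizer = new PerFieldTokenizer(PerFieldTokenizerDemo::whitespace);
        tokenizer.addTokenizer("sku", PerFieldTokenizerDemo::keyword);
        return tokenizer;
    }

    public static void main(String[] args) {
        PerFieldTokenizer tokenizer = build();
        System.out.println(tokenizer.tokenize("title", "Stored And Tokenized")); // [stored, and, tokenized]
        System.out.println(tokenizer.tokenize("sku", "AB-12 CD"));               // [AB-12 CD]
    }
}
```

With this arrangement no change to Field or IndexWriter is needed: the choice of tokenizer is keyed on the field name at analysis time rather than carried by each Field instance.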

Doron

(In reply to the message from ts01 of Dec 12, 2007, 10:41 PM.)