desired constructor Field("contents", new FileReader(f), Field.Store.COMPRESS)


Boris Galitsky-2
Hello

 I need to index the content of a file (using our in-house analyzer)
and store it in compressed form.
So  Field("contents", new FileReader(f), Field.Store.COMPRESS) would be
the desired constructor
(but it does not exist in this form).

 How would one "combine"  new FileReader(f) and Field.Store.COMPRESS ?

Regards
--
Boris Galitsky.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: desired constructor Field("contents", new FileReader(f), Field.Store.COMPRESS)

Grant Ingersoll
Hi Boris,

Readers are never stored, so I don't believe you can do it that way.
Of course, you can always read the contents into a String and then
use the appropriate String-based constructor.
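
As a rough sketch of that suggestion: the helper below (ReaderText and readFully are made-up names, not part of Lucene) reads a Reader to the end and returns a String. The trailing comment shows how the result might be handed to the String-based Field constructor, assuming the Lucene 2.0-era API with Field.Store.COMPRESS and Field.Index.TOKENIZED; check it against the version you actually run.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper for illustration; not a Lucene class.
final class ReaderText {
    private ReaderText() {}

    /** Reads everything the reader has left into a String, then closes it. */
    static String readFully(Reader in) throws IOException {
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[8192];
            int n;
            // read() returns -1 once the end of the stream is reached
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } finally {
            in.close();
        }
    }
}

// Assumed usage (needs Lucene on the classpath, so it is left as a comment):
//   String text = ReaderText.readFully(new FileReader(f));
//   doc.add(new Field("contents", text,
//                     Field.Store.COMPRESS, Field.Index.TOKENIZED));
```

The analyzer configured on the IndexWriter still tokenizes the String value, so an in-house analyzer should apply the same way as with a Reader.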

Storage is a separate mechanism from indexing, so my _guess_ is that  
if you want Readers to be stored, it would result in having to use  
the Reader twice (once for indexing and once for storage), which  
isn't possible, I don't believe, since not all Readers support the  
mark() and reset() functionality.  Besides, you will get better  
performance reading once...
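
To make the "Reader twice" problem concrete, here is a small plain-JDK sketch (ReaderOnce, drain and streamBacked are names invented for this example). It builds a byte-backed reader of the same family as FileReader, which does not support mark(), and shows that a second pass over it finds nothing left.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical demo class; nothing here is Lucene API.
final class ReaderOnce {
    private ReaderOnce() {}

    /** Consumes the reader and returns how many characters were left in it. */
    static int drain(Reader in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    /**
     * Builds an InputStreamReader over in-memory bytes. FileReader extends
     * InputStreamReader, so it has the same markSupported() == false behavior.
     */
    static Reader streamBacked(String text) throws IOException {
        return new InputStreamReader(
                new ByteArrayInputStream(text.getBytes("UTF-8")), "UTF-8");
    }
}
```

Calling ReaderOnce.drain() on the same streamBacked("abc") reader twice gives 3 and then 0, and markSupported() on it is false, so there is no general way to rewind for a second (storage) pass.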


-Grant


On Aug 15, 2006, at 5:12 PM, Boris Galitsky wrote:

> Hello
>
> I need to index the content of a file (using our in-house analyzer)
> and store in compressed way.
> So  Field("contents", new FileReader(f), Field.Store.COMPRESS)  
> would be a desired constructor
> (but it does not exist in this form).
>
> How would one "combine"  new FileReader(f) and Field.Store.COMPRESS ?
>
> Regards
> --
> Boris Galitsky.

--------------------------
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org

Voice: 315-443-5484
Skype: grant_ingersoll
Fax: 315-443-6886






Re: desired constructor Field("contents", new FileReader(f), Field.Store.COMPRESS)

Chris Hostetter-3

: Storage is a separate mechanism from indexing, so my _guess_ is that
: if you want Readers to be stored, it would result in having to use
: the Reader twice (once for indexing and once for storage), which
: isn't possible, I don't believe, since not all Readers support the
: mark() and reset() functionality.  Besides, you will get better
: performance reading once...

To elaborate: this really isn't a flaw in the way Fields work -- the
fact that separate mechanisms are involved doesn't result in any penalty
that you wouldn't also face if it were all done at once.

The advantage of indexing a "Reader" based Field is that you can
tokenize/index a stream of text from a Reader without needing the entire
contents the Reader references in memory at one time -- it's a true stream
operation.  Storing a field requires the full data set to be available in
memory at once -- so if you are going to Store the data, you have to read
it into a String anyway.
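
A toy illustration of the streaming side of this, in plain JDK code (StreamingCount and countTokens are invented names, and the whitespace splitting is only a stand-in, not how Lucene's analyzers tokenize): the method walks a Reader one character at a time and never holds the full text, which is what a stored value cannot avoid.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical example; a stand-in for real tokenization, not Lucene code.
final class StreamingCount {
    private StreamingCount() {}

    /** Counts whitespace-separated tokens while holding one char at a time. */
    static int countTokens(Reader in) throws IOException {
        int tokens = 0;
        boolean inToken = false;
        int c;
        while ((c = in.read()) != -1) {
            if (Character.isWhitespace((char) c)) {
                inToken = false;          // a token (if any) just ended
            } else if (!inToken) {
                inToken = true;           // first char of a new token
                tokens++;
            }
        }
        return tokens;
    }
}
```

Memory use here stays constant no matter how large the input is; producing a String to store would grow with the input instead.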

That said, it would certainly be possible to have a convenience method that
let you construct a Stored Field with a Reader, and it could slurp in the
whole reader for you, but it would be misleading to people who expect the
Reader based Fields to be stream based.




-Hoss

