Manually create a term freq vector

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Manually create a term freq vector

Johan Oskarsson
Hi.

I'm trying to speed up my indexing process and
since I already know how many times I want a specific
word to occur in the term frequence vector I'd like to be
able to create the vector myself.

This would speed things up because I wouldn't have to
take the extra step of creating a string with the
word duplicated x times just so that lucene can
break it down into "string y occurs x times" again.

The problem is that I can't seem to figure out how to do this.
Even if I manage to create a term vector there doesn't seem to
be a good place to set it in the field.

/Johan


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Manually create a term freq vector

Grant Ingersoll
You may find this useful:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200511.mbox/%3c4368E375.2070105@...%3e


Johan Oskarsson wrote:

> Hi.
>
> I'm trying to speed up my indexing process and
> since I already know how many times I want a specific
> word to occur in the term frequence vector I'd like to be
> able to create the vector myself.
>
> This would speed things up because I wouldn't have to
> take the extra step of creating a string with the
> word duplicated x times just so that lucene can
> break it down into "string y occurs x times" again.
>
> The problem is that I can't seem to figure out how to do this.
> Even if I manage to create a term vector there doesn't seem to
> be a good place to set it in the field.
>
> /Johan
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>

--
-------------------------------------------------------------------
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244

http://www.cnlp.org 
Voice:  315-443-5484
Fax: 315-443-6886


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Loading...