Injecting additional tokens

Markus Lux
Hi,

Assume I have a String "z-4". That would be properly indexed by my Analyzer,
so I'd find the belonging document if I search for "z-4". Now I also want to
find that document if I search for "z4".
Now my approach would be to inject an additional token "z4" at indexing
time. There may also be several other characters that could be deleted in a
new token.
How could I manage that? Is there any predefined Tokenizer/Filter for this?
Or am I wrong and there is a better way to get this done?

Thanks.

--
Markus

Re: Injecting additional tokens

MilleB
Is my subscription working? I got no reply to my previous question.
Sorry for the disturbance.

On Mon, Sep 1, 2008 at 10:29 PM, Markus Lux <[hidden email]> wrote:

> Hi,
>
> Assume I have a String "z-4". That would be properly indexed by my Analyzer,
> so I'd find the belonging document if I search for "z-4". Now I also want to
> find that document if I search for "z4".
> Now my approach would be to inject an additional token "z4" at indexing
> time. There may also be several other characters that could be deleted in a
> new token.
> How could I manage that? Is there any predefined Tokenizer/Filter for this?
> Or am I wrong and there is a better way to get this done?
>
> Thanks.
>
> --
> Markus
>

Re: Injecting additional tokens

Karsten F.-2
In reply to this post by Markus Lux
Hi Markus,

Hopefully someone will tell you the predefined Filter for this.

I just want to agree that a filter is the correct place for this, and that you should be aware of the Token positions (after your filter, you must have two Tokens at the same position).

I think "WordDelimiterFilter" is a good starting point if you have to write this filter on your own.

best regards
  Karsten
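
To make the point about token positions concrete, here is a minimal sketch in plain Java. It is not the Lucene TokenFilter API: the class `TokenInjectionSketch`, the `Tok` type, and the `strip` and `inject` helpers are hypothetical names made up for illustration. It only models the idea that the stripped variant ("z4") is emitted right after the original token ("z-4") with a position increment of 0, so both tokens sit at the same position. A real filter would apply the same logic inside a TokenFilter subclass.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of a token stream; this is NOT the Lucene TokenFilter API.
final class TokenInjectionSketch {

    // Hypothetical minimal token: term text plus a position increment.
    // An increment of 0 means "same position as the previous token".
    static final class Tok {
        final String term;
        final int positionIncrement;

        Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    // Returns the term with every character listed in `removable` deleted.
    static String strip(String term, String removable) {
        StringBuilder sb = new StringBuilder(term.length());
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (removable.indexOf(c) < 0) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Passes every token through unchanged and, when stripping changes the
    // term, injects the stripped variant at the same position (increment 0).
    static List<Tok> inject(List<Tok> input, String removable) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : input) {
            out.add(t);
            String variant = strip(t.term, removable);
            if (!variant.isEmpty() && !variant.equals(t.term)) {
                out.add(new Tok(variant, 0));
            }
        }
        return out;
    }
}
```

Calling `TokenInjectionSketch.inject` on a single token "z-4" with increment 1 and `removable` set to "-" gives `[z-4(+1), z4(+0)]`. Passing more characters in `removable` covers the case where several characters should be deleted in the new token. Tokens that contain none of those characters pass through without a duplicate.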




Re: Injecting additional tokens

hossman
In reply to this post by MilleB

: Is my subscription working? I got no reply to my previous question.
: Sorry for the disturbance.

1) If you see your message show up in one of the archives, that's a pretty
good indication that your post made it to the list...
http://www.nabble.com/forum/Search.jtp?query=Raymond+Balm%C3%A8s&local=y&forum=44

2) People answer questions on the Lucene lists voluntarily... they do it
because they want to help and they enjoy it. But since it's not a paid
support service, there is no guarantee of how long it will take
someone to reply to any given post. It's not that unusual for messages
to go unanswered for a few days, particularly when it's a 3-day holiday
in the US.


-Hoss

