reversing porter stemming

reversing porter stemming

zzzzz shalev
Is it possible to take a stemmed token from an index and run some sort of reverse Porter stemming to get a logical word? The problem is that Porter stemming is very aggressive. For example, "people" is indexed as "peopl". So basically my question is:

If I have "peoples" and "people", both indexed as "peopl", is there a way to go from peopl -> people? (Retrieving the root word would be fine.)

Thanks

 
Re: reversing porter stemming

Yonik Seeley
On 6/16/06, zzzzz shalev <[hidden email]> wrote:
> Is it possible to take a stemmed token from an index and run some sort of reverse Porter stemming to get a logical word? The problem is that Porter stemming is very aggressive. For example, "people" is indexed as "peopl". So basically my question is:
>
> If I have "peoples" and "people", both indexed as "peopl", is there a way to go from peopl -> people? (Retrieving the root word would be fine.)

Interesting question... I assume this is so you can do something like
retrieve the top terms for a field and have it more readable by an
end-user?

I don't think there is a way built into Lucene, but you could get
mostly there by keeping a reverse mapping yourself.  Run a dictionary
of common words through the stemmer and keep track of what word
generated the stemmed word.
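
A rough sketch of that reverse mapping, in Python for brevity. Everything named here is an assumption for illustration: toy_stem is a stand-in that only strips a trailing "s" and "e" (it is not the Porter algorithm), and build_reverse_map and readable_form are hypothetical helpers, not Lucene APIs. In practice you would pass in the same stemmer your analyzer uses at index time.

```python
from collections import defaultdict


def toy_stem(word):
    """Stand-in stemmer for illustration only; NOT the Porter algorithm."""
    word = word.lower()
    if word.endswith("s") and len(word) > 3:
        word = word[:-1]
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]
    return word


def build_reverse_map(words, stem):
    """Map each stem to the dictionary words that produced it."""
    reverse = defaultdict(list)
    for word in words:
        key = stem(word)
        if word not in reverse[key]:
            reverse[key].append(word)
    return dict(reverse)


def readable_form(stemmed, reverse):
    """Pick a display word for a stem: the shortest known candidate,
    or the stem itself if the dictionary never produced it."""
    candidates = reverse.get(stemmed)
    if not candidates:
        return stemmed
    return min(candidates, key=len)


# Hypothetical word list; a real one would be a dictionary of common words.
words = ["people", "peoples", "index", "indexes"]
reverse = build_reverse_map(words, toy_stem)
print(readable_form("peopl", reverse))
```

Picking the shortest candidate is just one possible tie-breaker; if the word list is ordered by frequency, taking the first candidate may give more natural results. Stems that never came out of the dictionary fall back to the raw stem, which is why this only gets you "mostly there".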


-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
