Integrating Spell Checker contributed to Lucene

Integrating Spell Checker contributed to Lucene

Ivan Vasilev-2
Hi Guys,

Has anybody integrated the Spell Checker contributed to Lucene? I need
advice on where to get a free dictionary file (one that contains all
words in English) that could be used to create an instance of the
PlainTextDictionary class. For my tests I currently use the corresponding
files from the Jazzy and JADT projects, but I think I do not have the
right to use them officially outside of their applications.
 
Best Regards,
Ivan


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]
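For readers following along: PlainTextDictionary reads a plain text file with one word per line. The sketch below shows one way to produce such a file from any text you are licensed to use. The class name WordListWriter and its methods are hypothetical, invented for illustration; nothing in it comes from Lucene or from this thread, and the tokenizing rule (letters and apostrophes only, lower-cased) is an assumption you may want to change.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Hypothetical helper: turns raw text into a one-word-per-line word list.
public class WordListWriter {

    /** Returns the distinct lower-cased words found in the lines, sorted. */
    public static List<String> normalize(Iterable<String> rawLines) {
        TreeSet<String> words = new TreeSet<>();
        for (String line : rawLines) {
            // Split on anything that is neither a letter nor an apostrophe.
            for (String token : line.split("[^\\p{L}']+")) {
                // Drop quoting apostrophes but keep inner ones ("don't").
                String word = token.replaceAll("^'+|'+$", "")
                                   .toLowerCase(Locale.ROOT);
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return new ArrayList<>(words);
    }

    /** Writes the normalized words to target, one word per line, as UTF-8. */
    public static void write(Iterable<String> rawLines, Path target)
            throws IOException {
        Files.write(target, normalize(rawLines), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("words", ".txt");
        write(Arrays.asList("Lucene spell checker", "spell Checker test"), out);
        System.out.println(Files.readAllLines(out, StandardCharsets.UTF_8));
    }
}
```

The resulting file can then be handed to the PlainTextDictionary constructor; the argument type it accepts (File, Reader, Path, ...) depends on the Lucene version, so check the Javadoc of the release you use.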

Re: Integrating Spell Checker contributed to Lucene

Mathieu Lecarme
Ivan Vasilev wrote:
> Hi Guys,
>
> Has anybody integrated the Spell Checker contributed to Lucene?
http://blog.garambrogne.net/index.php?post/2008/03/07/A-lexicon-approach-for-Lucene-index
https://issues.apache.org/jira/browse/LUCENE-1190

> I need advice on where to get a free dictionary file (one that
> contains all words in English) that could be used to create an
> instance of the PlainTextDictionary class.
"All English words" is not a meaningful notion. Have a look at WordNet
and Hunspell.

> For my tests I currently use the corresponding files from the Jazzy
> and JADT projects, but I think I do not have the right to use them
> officially outside of their applications.

Re: Integrating Spell Checker contributed to Lucene

Ivan Vasilev-2
Thanks Mathieu for your help!

The contribution that you have made to Lucene with this patch seems to be
great, but the Hunspell dictionary is under the LGPL, which our company's
lawyer does not like. The WordNet dictionary seems to be more freely
licensed and may help together with your patch.
In Lucene's Jira I found the issue LUCENE-1190
<https://issues.apache.org/jira/browse/LUCENE-1190> and I have the
following questions about it:
1. There are two apthone-lexicon.patch files - is one of them out of
date? Is the 336 KB one the current version?
2. In the Lucene SVN
(http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/) I cannot
find the code of your contribution. At the same time, the link from Jira
opens a very long file containing all the classes and packages of your
patch, with each row prefixed by "+". It is very inconvenient to
recreate the source code packages from it. If possible, could you give
a link where I can get these sources as they are?

Best Regards,
Ivan



Re: Integrating Spell Checker contributed to Lucene

Mathieu Lecarme
Ivan Vasilev wrote:
> Thanks Mathieu for your help!
>
> The contribution that you have made to Lucene with this patch seems to
> be great, but the Hunspell dictionary is under the LGPL, which our
> company's lawyer does not like.
It's the spelling tool used by OpenOffice and Firefox. The dictionary
data should be multi-licensed; maybe you'll find a licence that suits
you. The older tools (aspell and ispell) may fit you better.
> The WordNet dictionary seems to be more freely licensed and may help
> together with your patch.
> In Lucene's Jira I found the issue LUCENE-1190
> <https://issues.apache.org/jira/browse/LUCENE-1190> and I have the
> following questions about it:
> 1. There are two apthone-lexicon.patch files - is one of them out of
> date? Is the 336 KB one the current version?
The bigger one is newer. I'm working on a third version.
> 2. In the Lucene SVN
> (http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/) I cannot
> find the code of your contribution.
The patch is submitted, but not yet approved.
> At the same time, the link from Jira opens a very long file containing
> all the classes and packages of your patch, with each row prefixed by
> "+". It is very inconvenient to recreate the source code packages
> from it. If possible, could you give a link where I can get these
> sources as they are?
It's a standard patch (see man patch).
Here is the SVN:
https://admin.garambrogne.net/projets/revuedepresse/browser/trunk/src/java
You can do an svn checkout.

M.
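To illustrate the "standard patch" remark: in a unified diff, a newly added file appears as a "+++ path" header followed by a hunk header of the form "@@ -0,0 +1,N @@" and then N lines that each start with "+". The patch tool is the normal way to apply such a file; the sketch below only shows how the "+" rows map back to source files. The class NewFilePatchReader and its method are hypothetical names invented here, and the code handles only newly added files, not modifications to existing ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the unified diff layout for added files.
public class NewFilePatchReader {

    // Hunk header of a newly added file, e.g. "@@ -0,0 +1,42 @@".
    private static final Pattern NEW_FILE_HUNK =
            Pattern.compile("^@@ -0,0 \\+1(?:,(\\d+))? @@.*$");

    /** Maps each newly added file path to its reconstructed lines. */
    public static Map<String, List<String>> extractNewFiles(List<String> patch) {
        Map<String, List<String>> files = new LinkedHashMap<>();
        String path = null;
        for (int i = 0; i < patch.size(); i++) {
            String line = patch.get(i);
            if (line.startsWith("+++ ")) {
                // Header looks like "+++ path<TAB>(revision 0)"; keep the path.
                path = line.substring(4).split("\t", 2)[0].trim();
                continue;
            }
            Matcher m = NEW_FILE_HUNK.matcher(line);
            if (path == null || !m.matches()) {
                continue;
            }
            // The hunk header says how many added lines follow (default 1).
            int count = m.group(1) == null ? 1 : Integer.parseInt(m.group(1));
            List<String> body = new ArrayList<>();
            for (int j = 0; j < count && i + 1 < patch.size(); j++) {
                String added = patch.get(++i);
                if (added.startsWith("+")) {
                    body.add(added.substring(1)); // strip the "+" marker
                }
            }
            files.put(path, body);
            path = null;
        }
        return files;
    }

    public static void main(String[] args) {
        List<String> patch = Arrays.asList(
                "--- src/Hello.java\t(revision 0)",
                "+++ src/Hello.java\t(revision 0)",
                "@@ -0,0 +1,2 @@",
                "+class Hello {",
                "+}");
        System.out.println(extractNewFiles(patch));
    }
}
```

Reading the line count from the hunk header, rather than scanning for lines that start with "+", avoids mistaking an added source line such as "++ counter;" for a file header.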


Re: Integrating Spell Checker contributed to Lucene

Ivan Vasilev-2
Thanks Mathieu,

I tried to check it out but without success. Anyway, I can do it manually,
but as the contribution is still not approved by Lucene, our managers
will not want it to be included in our project for now.
OK, thanks once again :)

Ivan

PS: Here is the output from SVN version win32-1.4.5:

E:\SVN\svn-win32-1.4.5\bin>svn co https://admin.garambrogne.net/projets/revuedepresse/browser/trunk/src/java temp
Error validating server certificate for 'https://admin.garambrogne.net:443':
 - The certificate is not issued by a trusted authority. Use the
   fingerprint to validate the certificate manually!
 - The certificate hostname does not match.
 - The certificate has expired.
Certificate information:
 - Hostname: admin.garambogne.net
 - Valid: from Wed, 09 Mar 2005 19:21:51 GMT until Thu, 09 Mar 2006 19:21:51 GMT
 - Issuer: Quihou, Garambrogne, Saint Mande, Val de Marne, FR
 - Fingerprint: f4:f4:09:14:e5:4e:33:f8:05:d1:a2:73:b6:0c:b4:03:d9:13:83:9b
(R)eject, accept (t)emporarily or accept (p)ermanently? t
svn: PROPFIND request failed on '/projets/revuedepresse/browser/trunk/src/java'
svn: PROPFIND of '/projets/revuedepresse/browser/trunk/src/java': 200 OK (https://admin.garambrogne.net)


Re: Integrating Spell Checker contributed to Lucene

Mathieu Lecarme
Ivan Vasilev wrote:
> Thanks Mathieu,
>
> I tried to check it out but without success. Anyway, I can do it manually,
> but as the contribution is still not approved by Lucene, our managers
> will not want it to be included in our project for now.
It's the right decision. I hope the third patch will be good enough.

The SVN URL is:
https://admin.garambrogne.net/subversion/revuedepresse/trunk/src/java
and the certificate is self-signed. You can use any login, like "anonymous".

M.


Re: Integrating Spell Checker contributed to Lucene

Ivan Vasilev-2
That's it! Now I finally got it :)
OK, I will use it only for test integration for now (if there is time for
this :) ) and will wait for the third patch.
Have a nice time :)
Ivan

