How to add machine learning to Apache lucene


Priyanka Tufchi
Hello all,

How can I add a machine learning component to Apache Lucene?


Thanks
Priyanka

Re: How to add machine learning to Apache lucene

iorixxx
Hi Priyanka,

There are existing tools that can read from a Lucene index, for example Apache Mahout (http://mahout.apache.org).

Why not use them?

Ahmet



On Wednesday, May 7, 2014 11:05 PM, Priyanka Tufchi <[hidden email]> wrote:
Hello all,

How can I add a machine learning component to Apache Lucene?


Thanks
Priyanka


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: How to add machine learning to Apache lucene

kamaci
In reply to this post by Priyanka Tufchi
Hi;

Could you explain what you need a bit more?

Thanks;
Furkan KAMACI

Re: How to add machine learning to Apache lucene

Koji Sekiguchi
In reply to this post by Priyanka Tufchi
Hi Priyanka,

 > How can I add a machine learning component to Apache Lucene?

I think your question is too broad to answer, because machine learning
covers a lot of ground.

Lucene already has a text categorization (document classification)
feature. Text categorization is a well-known NLP task that is commonly
solved with machine learning. I've written an article about it; please
take a look if you'd like:

Comparing Document Classification Functions of Lucene and Mahout
http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html

There is also an open-source project that uses Lucene to search for
similar photos/images rather than text:

alike in Apache Labs:
http://labs.apache.org/labs.html
(for an overview, please see the short slides in the folder
http://svn.apache.org/repos/asf/labs/alike/trunk/ )

Koji
--
http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html



Re: How to add machine learning to Apache lucene

aiguofer
In reply to this post by Koji Sekiguchi
I've actually been wondering about this as well. More specifically, I've been wondering whether there's any kind of framework for integrating a learning-to-rank approach (http://en.wikipedia.org/wiki/Learning_to_rank) into Lucene/Solr. Although a similar result can be achieved with boost functions, it becomes very hard to find the "optimal" boost value for each of the features you'd like to use. An ML framework like this would let us "learn" these values, and would also let us use non-numeric features (depending on the ML approach). If anyone has tried this or has some insight, I'd be really interested to hear about it.
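
To make the idea of "learning" boost values concrete, here is a minimal sketch of one simple learning-to-rank technique: a pairwise linear ranker trained with perceptron-style updates. It is not a Lucene or Solr API. The three features (text score, popularity, recency), the training pairs, and the names `score` and `learn_boosts` are all hypothetical, made up for illustration. In practice the pairs might come from click logs or editorial judgments.

```python
from typing import List, Sequence, Tuple

Features = Sequence[float]


def score(weights: Sequence[float], features: Features) -> float:
    """Linear ranking score: a weighted sum of the document's features."""
    return sum(w * x for w, x in zip(weights, features))


def learn_boosts(
    pairs: List[Tuple[Features, Features]],
    n_features: int,
    epochs: int = 50,
    learning_rate: float = 0.1,
    margin: float = 1.0,
) -> List[float]:
    """Learn one weight per feature from (better, worse) document pairs.

    Whenever the better document does not outscore the worse one by at
    least `margin`, the weights are nudged towards the difference of the
    two feature vectors.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            if score(weights, better) - score(weights, worse) < margin:
                for i in range(n_features):
                    weights[i] += learning_rate * (better[i] - worse[i])
    return weights


# Hypothetical training data. Feature order (an assumption):
# [text_score, popularity, recency]. In each pair the first document
# is the one that should rank higher.
training_pairs = [
    ([0.9, 0.2, 0.1], [0.5, 0.1, 0.3]),
    ([0.6, 0.8, 0.2], [0.7, 0.1, 0.2]),
    ([0.8, 0.5, 0.9], [0.4, 0.4, 0.1]),
]

learned = learn_boosts(training_pairs, n_features=3)
print("learned per-feature weights:", learned)
```

The learned weights play the role of the hand-tuned boost values: each one says how much its feature should contribute to the final score. A linear model like this only handles numeric features. Tree-based rankers are one way to handle other kinds, at the cost of a more complex model.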

Diego Fernandez - 爱国
Software Engineer
US GSS Supportability - Diagnostics




Re: How to add machine learning to Apache lucene

iorixxx
Hi Diego,

There is no such thing in the Lucene ecosystem yet, although some ideas come up from time to time:

http://search-lucene.com/m/WwzTb2nt1Tk1
http://search-lucene.com/m/WwzTb2d9o2m

I would like to integrate jforests (https://code.google.com/p/jforests/) and create a prototype myself in the future.

The newly added features SOLR-6088 and LUCENE-5489 look promising.

Ahmet



On Friday, May 16, 2014 6:30 PM, Diego Fernandez <[hidden email]> wrote:



I've actually been wondering about this as well.  More specifically, I've been wondering if there's any kind of framework to integrate some sort of learn to rank approach (http://en.wikipedia.org/wiki/Learning_to_rank) to Lucene/Solr.  Although a similar result can be accomplished by using boost functions, it becomes very hard to find the "optimal" boost value for each of the features that you'd like to use.  An ML framework like this could allow us to "learn" what these values are, as well as allow us to use non-numeric features (depending on the ML approach).  If anyone has tried this or has some insight I'd be really interested to hear about it.

Diego Fernandez - 爱国
Software Engineer
US GSS Supportability - Diagnostics




Re: How to add machine learning to Apache lucene

aiguofer
Great! It's good to know that this is at least being discussed. The jforests package seems really interesting; I'll start looking into it a bit more. Although I'm not a Java programmer per se, I'd be interested in helping out with this effort. If there's anything I can do, let me know!

Diego Fernandez - 爱国
Software Engineer
US GSS Supportability - Diagnostics




Re: How to add machine learning to Apache lucene

rulinma
In reply to this post by Priyanka Tufchi
These are very different things: one is search, the other is ML.
But I want to use ML results to improve Solr's search results, for example so that users buy more and view more.

Re: How to add machine learning to Apache lucene

Priyanka Tufchi
In reply to this post by kamaci
Hi Furkan

Actually, I have a set of candidate resumes, along with comments that
indicate whether each resume was selected or rejected. Now I need the
system to learn from this history, so that the next time the same or a
similar resume comes in, it can decide which bag it should go in:
selected or rejected.
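
The task described here is binary text classification. As a rough illustration only, the sketch below trains a small multinomial naive Bayes classifier with add-one smoothing on labelled resume text and then assigns a new resume to "selected" or "rejected". It uses plain Python rather than Lucene's API. The class `NaiveBayes`, the `tokenize` helper, and the tiny training set are hypothetical and invented for the example. A real system would need far more data and proper text analysis.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()               # documents seen per label
        self.term_counts = defaultdict(Counter)   # term frequencies per label
        self.vocabulary = set()

    def train(self, text, label):
        self.doc_counts[label] += 1
        for term in tokenize(text):
            self.term_counts[label][term] += 1
            self.vocabulary.add(term)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        terms = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # log prior + sum of smoothed log likelihoods of each term
            log_prob = math.log(n_docs / total_docs)
            label_total = sum(self.term_counts[label].values())
            for term in terms:
                count = self.term_counts[label][term]
                log_prob += math.log((count + 1) / (label_total + vocab_size))
            if log_prob > best_score:
                best_label, best_score = label, log_prob
        return best_label


# Hypothetical history of past decisions (made-up data).
history = [
    ("java lucene solr search engineer five years experience", "selected"),
    ("python machine learning search relevance engineer", "selected"),
    ("retail cashier customer service", "rejected"),
    ("sales assistant retail store customer support", "rejected"),
]

classifier = NaiveBayes()
for resume_text, decision in history:
    classifier.train(resume_text, decision)

print(classifier.classify("lucene search engineer with python experience"))
```

The text categorization support in Lucene mentioned earlier in this thread targets the same kind of problem, with the term statistics coming from an index of the labelled resumes instead of in-memory counters.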


Thanks
Priyanka


On Wed, May 7, 2014 at 3:23 PM, Furkan KAMACI <[hidden email]> wrote:

> Hi;
>
> Could you explain what you need a bit more?
>
> Thanks;
> Furkan KAMACI