Fwd: Software Announcement: LuSql: Database to Lucene indexing

Matthew Runo
Hello -

I wanted to forward this on, since I thought people here might be
able to use it to build indexes. So long as the Lucene version in
LuSql matches the version in Solr, it should work fine for indexing,
yes?

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
[hidden email] - 702-943-7833

Begin forwarded message:

> From: "Glen Newton" <[hidden email]>
> Date: November 17, 2008 4:32:18 AM PST
> To: [hidden email]
> Subject: Software Announcement: LuSql: Database to Lucene indexing
> Reply-To: [hidden email]
>
> LuSql is a simple but powerful tool for building Lucene indexes from
> relational databases. It is a command-line Java application for the
> construction of a Lucene index from an arbitrary SQL query of a
> JDBC-accessible SQL database. It allows a user to control a number of
> parameters, including the SQL query to use, individual
> indexing/storage/term-vector nature of fields, analyzer, stop word
> list, and other tuning parameters. In its default mode it uses
> threading to take advantage of multiple cores.
>
> LuSql can handle complex queries, allows for additional per record
> sub-queries, and has a plug-in architecture for arbitrary Lucene
> document manipulation. Its only dependencies are three Apache Commons
> libraries, the Lucene core itself, and a JDBC driver.
>
> LuSql has been extensively tested, including on a large collection of
> 6+ million full-text and metadata journal article documents, producing
> an 86GB Lucene index in ~13 hours.
>
> http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql
>
> Glen Newton
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
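To make the announcement's description concrete, here is a minimal sketch of the general idea: run an SQL query, then turn each row into a document whose fields are indexed and/or stored according to per-field settings. This is not LuSql's code. It assumes Python's `sqlite3` in place of a JDBC source and a toy in-memory inverted index in place of Lucene, and the names `FIELD_OPTIONS`, `build_index`, and `demo` are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical per-field settings, loosely mirroring Lucene's
# "indexed" and "stored" field attributes.
FIELD_OPTIONS = {
    "id": {"indexed": False, "stored": True},
    "title": {"indexed": True, "stored": True},
    "body": {"indexed": True, "stored": False},
}


def build_index(conn, query):
    """Run `query` and turn each result row into a toy index entry."""
    inverted = defaultdict(set)  # (field, term) -> set of document numbers
    stored = []                  # document number -> stored field values
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    for doc_no, row in enumerate(cursor):
        record = dict(zip(columns, row))
        kept = {}
        for field, value in record.items():
            # Fields without explicit settings are both indexed and stored.
            options = FIELD_OPTIONS.get(field, {"indexed": True, "stored": True})
            if options["stored"]:
                kept[field] = value
            if options["indexed"]:
                # Stand-in for an analyzer: lowercase and split on whitespace.
                for term in str(value).lower().split():
                    inverted[(field, term)].add(doc_no)
        stored.append(kept)
    return inverted, stored


def demo():
    """Build a two-row in-memory table and index it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE article (id INTEGER, title TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO article VALUES (?, ?, ?)",
        [
            (1, "Lucene indexing", "build an index from sql"),
            (2, "Solr import", "import rows from a database"),
        ],
    )
    return build_index(conn, "SELECT id, title, body FROM article ORDER BY id")
```

Calling `demo()` returns the inverted index and the list of stored fields; the `body` column is searchable but not stored, which is the kind of per-field choice the announcement refers to.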


Re: Software Announcement: LuSql: Database to Lucene indexing

Erik Hatcher
Yeah, it'd work, though not only does the version of Lucene need to
match, but the field indexing/storage attributes need to jibe as well
- and that is the trickier part of the equation.

But yeah, LuSQL looks slick!

        Erik
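One way to picture the attribute-matching problem is a small check that compares the field settings an external indexer was run with against the settings the Solr schema expects. The sketch below is only an illustration: the dictionaries are hand-written stand-ins (nothing is read from a real `schema.xml` or from LuSql), and `find_mismatches` is a hypothetical helper.

```python
def find_mismatches(schema_fields, indexer_fields):
    """Return {field: {attribute: (schema_value, indexer_value)}} for each difference."""
    problems = {}
    for name, indexer_attrs in indexer_fields.items():
        schema_attrs = schema_fields.get(name)
        if schema_attrs is None:
            # The indexer wrote a field the schema does not declare.
            problems[name] = {"declared_in_schema": (False, True)}
            continue
        diffs = {
            attr: (schema_attrs.get(attr), value)
            for attr, value in indexer_attrs.items()
            if schema_attrs.get(attr) != value
        }
        if diffs:
            problems[name] = diffs
    return problems


# Hand-written example settings (assumptions for illustration only).
SCHEMA = {
    "id": {"indexed": True, "stored": True},
    "title": {"indexed": True, "stored": True},
}
INDEXER = {
    "id": {"indexed": True, "stored": True},
    "title": {"indexed": True, "stored": False},
    "abstract": {"indexed": True, "stored": False},
}

MISMATCHES = find_mismatches(SCHEMA, INDEXER)
```

Here `MISMATCHES` flags `title` (stored in the schema but not by the indexer) and `abstract` (not declared in the schema), while `id` passes.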



Re: Software Announcement: LuSql: Database to Lucene indexing

Glen Newton
In reply to this post by Matthew Runo
Hello,

I'm Glen Newton, LuSql author.

Thanks for the kind words about LuSql!  :-)

I have just joined the Solr list. While I know of Solr, I have not
used it and have only limited technical knowledge of it.

That said, I am very interested in making LuSql useful to the Solr
community as well as the broader Lucene community, so if any of you
can offer any feedback on how LuSql can be changed to better support
Solr, I would appreciate it.

thanks,

Glen Newton


Re: Software Announcement: LuSql: Database to Lucene indexing

Erik Hatcher
Glen,

The thing is, Solr has a database integration built-in with the new  
DataImportHandler.   So I'm not sure how much interest Solr users  
would have in LuSql by itself.

Maybe there are LuSql features that DIH could borrow from?  Or vice  
versa?

        Erik



Re: Software Announcement: LuSql: Database to Lucene indexing

Glen Newton
Erik,

Right now there is no real abstraction like DIH in LuSql. But as
indicated in the TODO section of the documentation, I was planning on
implementing or straight borrowing DIH in the near future.

I am assuming that Solr is all multi-threaded & as performant as it
can be. Is there a test SQL database that is used to test Solr, so I
might try to do some comparisons?

Not being a Solr user, it is hard for me to know of any advantages of
LuSql over Solr. Hopefully some in the community can identify possible
overlaps / use cases. I will see what I can figure out.

Thanks,

-Glen


Re: Software Announcement: LuSql: Database to Lucene indexing

Shalin Shekhar Mangar
Hi Glen,

There is an open issue for making DIH API-friendly. Take a look and let us
know what you think.

https://issues.apache.org/jira/browse/SOLR-853

--
Regards,
Shalin Shekhar Mangar.

Re: Software Announcement: LuSql: Database to Lucene indexing

Glen Newton
Yes, I've found it.

Do you want my comments here or in solr-dev or on jira?

 Glen


Re: Software Announcement: LuSql: Database to Lucene indexing

Mike Klaas
In reply to this post by Glen Newton
On 18-Nov-08, at 6:56 AM, Glen Newton wrote:

> Erik,
>
> Right now there is no real abstraction like DIH in LuSql. But as
> indicated in the TODO section of the documentation, I was planning on
> implementing or straight borrowing DIH in the near future.
>
> I am assuming that Solr is all multi-threaded & as performant as it
> can be. Is there a test SQL database that is used to test Solr, so I
> might try to do some comparisons?

Actually, I think that Solr's multithreaded indexing could be  
improved.  It is really only analysis that is parallelizable ATM.

-Mike
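The shape Mike describes, with analysis done in parallel and index updates done serially, can be sketched roughly as follows. This is a toy illustration and not Solr's implementation: `analyze` is a stand-in tokenizer, `index_documents` is a hypothetical name, and the thread pool only shows the structure (in CPython it would not demonstrate a real speedup for CPU-bound work).

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def analyze(text):
    """Stand-in for analysis: lowercase and split on whitespace."""
    return text.lower().split()


def index_documents(texts, workers=4):
    """Analyze documents concurrently, then update the index on one thread."""
    inverted = defaultdict(set)  # term -> set of document numbers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Analysis tasks run concurrently; map() yields results in input order.
        token_lists = pool.map(analyze, texts)
        # Index updates happen serially, one document at a time.
        for doc_no, tokens in enumerate(token_lists):
            for term in tokens:
                inverted[term].add(doc_no)
    return inverted
```

For example, `index_documents(["Solr indexes rows", "LuSql indexes rows too"])` maps `"indexes"` to both documents while the write loop itself never runs in parallel.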

Re: Software Announcement: LuSql: Database to Lucene indexing

Noble Paul നോബിള്‍  नोब्ळ्
In reply to this post by Glen Newton
Hi Glen,
You can post all your questions on solr-dev first, and the valid ones
can then be moved to JIRA.

thanks,
Noble

--
--Noble Paul