Need Opinion!!

Shardul
Hi All,

I am facing a scenario where I am considering using Lucene in place of the existing implementation. The move to Lucene is going to require a lot of rework, so I thought I had better post this and ask for an expert opinion.

Background
 There are 3 tables.
 1. University
 2. Course
 3. Subjects
 
 A university can have multiple courses.
 2 universities can offer the same course.
 A course can have multiple subjects.
 2 courses can have the same set of subjects.
 
Current Implementation
 The current table is
 uni_id, course_id, subject_id

 I usually query for the uni_id based on combinations of course_id and subject_id. The user looks for a university by subject name or course name.

Proposed Implementation
 The new table would be
 uni_id, course_name, subject_name

 Index the table on course_name and subject_name using Lucene, then look up the uni_id through the Lucene index.
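
The proposed lookup can be sketched with a toy in-memory inverted index. This is a plain-Java stand-in meant only to show the idea of mapping name terms to uni_id values; it is not Lucene's API, and the class name UniIndexSketch, its methods, and the sample rows are all hypothetical. It also assumes that a query should match only when every term appears somewhere in a university's rows.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative stand-in for an inverted index; this is NOT Lucene's API.
class UniIndexSketch {
    // term -> ids of universities with a course or subject name containing the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index one denormalized row: (uni_id, course_name, subject_name).
    void addRow(int uniId, String courseName, String subjectName) {
        for (String term : tokenize(courseName + " " + subjectName)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(uniId);
        }
    }

    // Return the uni_ids that match every term in the query (AND semantics).
    // Terms are matched per university, so they may come from different rows.
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : tokenize(query)) {
            Set<Integer> ids = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(ids);
            } else {
                result.retainAll(ids);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    // Lowercase the text and split on anything that is not a letter or digit.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        UniIndexSketch index = new UniIndexSketch();
        index.addRow(1, "Computer Science", "Algorithms");
        index.addRow(2, "Computer Science", "Databases");
        index.addRow(3, "Physics", "Quantum Mechanics");
        System.out.println(index.search("computer science")); // [1, 2]
        System.out.println(index.search("databases"));        // [2]
    }
}
```

A real Lucene index would typically hold one document per row with separate course_name and subject_name fields, and would add analysis, scoring, and on-disk storage that this sketch leaves out.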
       
The table is huge: the test data has over 5,000,000 rows.

Should I move to Lucene? Would it help improve performance?

TIA.
Shardul.

Re: Need Opinion!!

Ian Lea
Hi,

Would using Lucene help improve performance?  Probably.  Lucene is
blindingly fast and 5,000,000 docs is not huge by Lucene standards.
But we don't know how fast the existing implementation is.

Should you move to Lucene?  Your call, to balance the expected
performance gain against the work involved.

If you do decide to move to Lucene you should, as Erick said just
yesterday, be sure to "take off your DB hat".
http://www.nabble.com/Re%3A-searching-in-2-indexes-p21017290.html

--
Ian.

On Tue, Dec 16, 2008 at 5:41 AM, Shardul Bhatt <[hidden email]> wrote:
> [snip]

Re: Need Opinion!!

Grant Ingersoll-2
Hi Shardul,

I was just with a client who had pretty much the exact same scenario, and
Lucene (actually Solr) was just fine.  In fact, I think you could have
a prototype of this up and running in Solr (a Lucene-based search
server) in a day or two.  As for performance, given that each record is
likely quite small, I don't think 5M records will be that big of a deal.

Of course, every situation is different, so I'd recommend spending a
few days with Solr on it, or just setting up some simple indexing/search
tests with Lucene.
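
One way to frame such a test is a small timing helper that runs the same queries through any lookup, so the existing implementation and a Lucene or Solr prototype can be compared on equal terms. The sketch below is only an assumption about how that might look: the LookupTimer class, the medianNanos method, and the sample queries are hypothetical, and the placeholder lookup would need to be replaced with a real database query or search call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper for comparing lookup implementations; not part of Lucene or Solr.
class LookupTimer {
    // Run every query through the lookup and return the median latency in nanoseconds.
    // Warm-up rounds are executed first and are not measured.
    static <R> long medianNanos(Function<String, R> lookup, List<String> queries, int warmupRounds) {
        if (queries.isEmpty()) {
            throw new IllegalArgumentException("at least one query is required");
        }
        for (int round = 0; round < warmupRounds; round++) {
            for (String query : queries) {
                lookup.apply(query);
            }
        }
        long[] samples = new long[queries.size()];
        for (int i = 0; i < samples.length; i++) {
            long start = System.nanoTime();
            lookup.apply(queries.get(i));
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[samples.length / 2];
    }

    public static void main(String[] args) {
        List<String> queries = Arrays.asList("physics", "computer science", "algorithms");
        // Placeholder lookup; swap in the existing query or a search call here.
        Function<String, Integer> placeholder = String::length;
        long median = medianNanos(placeholder, queries, 5);
        System.out.println("median latency (ns): " + median);
    }
}
```

Timings from a helper like this are rough; a realistic comparison would use a representative set of user queries and the full data set.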

-Grant

On Dec 16, 2008, at 12:41 AM, Shardul Bhatt wrote:
> [snip]

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ










