DBSight, search on database by Lucene

DBSight, search on database by Lucene

chrislusf
Hello Lucene developers,

I would like to introduce myself and say thanks to Lucene contributors
and this mailing list.
We have just released DBSight 1.0, which is a J2EE application that can
create a search engine on any relational database.

You can build a vertical search website within hours if your data is in
a JDBC-enabled database.

To demonstrate the search capability, we created a demo search over
1.7 million CD album records, using data provided by freedb.org.
http://search.dbsight.com/

DBSight is a highly configurable platform for building search.
It can crawl your database, create indexes, and display search results.
You can customize most of the components and manage the indexes, all
through a web interface.

DBSight is built with Lucene for searching, JDBC for crawling, and
Velocity for rendering.

Will this qualify DBSight to be listed on the "Powered By Lucene" wiki page?

Here is a step by step tutorial on how the demo search is created.

Resources:
    Step-by-step tutorial: http://www.dbsight.net/mediawiki/index.php?title=Step_by_step
    Demo search on freedb.org's data: http://search.dbsight.com/
    Feature list: http://www.dbsight.net/?q=node/34

Chris Lu


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

International Stemmers and Character Encoding

Edwin Mol
I have downloaded the analyzer sources from the sandbox area, but every
*Stemmer class fails to compile with the error:
"Invalid Character Constant".
Here is what a code snippet from the DutchStemmer class looks like:

  /**
   * Substitute ??, ??, ??, ??, ??, ?? , ??, ??, ??, ??
   */
  private void substitute(StringBuffer buffer) {
    for (int i = 0; i < buffer.length(); i++) {
      switch (buffer.charAt(i)) {
        case '??':
        case '??':
          {
            buffer.setCharAt(i, 'a');
            break;
          }
        case '??':
        case '??':

In this example the '??' character causes the problem.

I think the code is garbled because the Java file is being read with the
wrong character encoding.
Does anyone know if I'm correct and, more importantly, how to solve this
problem?

Thanks,

Edwin Mol


Re: International Stemmers and Character Encoding

Edwin Mol
Please ignore my previous post; I have solved the problem.
It turned out that my IDE (Eclipse) didn't use UTF-8 encoding by default.

Edwin
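
For anyone who lands here with the same "Invalid Character Constant"
error, here is a minimal sketch of the likely mechanism. It assumes the
stemmer sources are stored as UTF-8 and were decoded as ISO-8859-1; the
thread does not say which default charset was actually in use. The class
and method names are illustrative only.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Lucene or the sandbox analyzers.
class EncodingMismatchDemo {

    // Encode text as UTF-8, then decode the bytes as ISO-8859-1,
    // mimicking a tool that assumes the wrong source-file encoding.
    static String misreadAsLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u00e4"; // 'a' with diaeresis, written as an escape
        String misread = misreadAsLatin1(original);
        // One character becomes two, so a single-quoted literal in the
        // source turns into a two-character (invalid) character constant.
        System.out.println(original.length() + " -> " + misread.length());
    }
}
```

When compiling outside the IDE, passing the encoding explicitly avoids
relying on the platform default, for example:

    javac -encoding UTF-8 DutchStemmer.java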


Re: DBSight, search on database by Lucene

Erik Hatcher
In reply to this post by chrislusf

On Jun 11, 2005, at 3:25 AM, Chris Lu wrote:
> To demonstrate the search capability, DBSight created a demo search
> on 1.7 million CD albums information by freedb.org provided data.
> http://search.dbsight.com/

Nice job!  Here's the best query:

     <http://www.dbsight.com/dbs/search.do?indexName=freedb&templateName=free&q=%22tim+reynolds%22+-dave>

:)
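
For readers unfamiliar with URL encoding, the q parameter above can be
decoded with a short sketch like the one below. The class and method
names are made up for illustration, and it is an assumption that DBSight
hands the decoded string to Lucene's query parser unchanged.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of DBSight or Lucene.
class QueryDecodeDemo {

    // Decode a form-encoded query parameter: '+' becomes a space and
    // %22 becomes a double quote.
    static String decodeQueryParam(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Prints: "tim reynolds" -dave
        System.out.println(decodeQueryParam("%22tim+reynolds%22+-dave"));
    }
}
```

In Lucene's classic query syntax, the quoted part is a phrase query and
the leading '-' excludes documents containing the term that follows.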

> Will this qualify DBSight to be listed on "Powered By Lucene" wiki  
> page?

Of course.  The wiki is community maintained, so anyone who has  
Lucene Inside is welcome to add their project/product there.

     Erik


Re: DBSight, search on database by Lucene

chrislusf
Thanks.

Somehow the "Powered By" Lucene page shows up as "Immutable Page" for me,
even when I am logged in.
http://wiki.apache.org/jakarta-lucene/PoweredBy

Chris Lu


Re: DBSight, search on database by Lucene

Erik Hatcher

On Jun 11, 2005, at 1:08 PM, Chris Lu wrote:

> Thanks.
>
> Somehow I found the "Powered By" Lucene page is "Immutable Page",  
> even if I logged in.
> http://wiki.apache.org/jakarta-lucene/PoweredBy

Wow, it sure is.  I'm CC'ing infrastructure to find out why this page  
is immutable.

     Erik


Re: DBSight, search on database by Lucene

Joshua Slive-2



On Sat, 11 Jun 2005, Erik Hatcher wrote:

>
> On Jun 11, 2005, at 1:08 PM, Chris Lu wrote:
>
>> Thanks.
>>
>> Somehow I found the "Powered By" Lucene page is "Immutable Page", even if I
>> logged in.
>> http://wiki.apache.org/jakarta-lucene/PoweredBy
>
> Wow, it sure is.  I'm CC'ing infrastructure to find out why this page is
> immutable.


Ughhh... Looks like another caching problem.

A shift- or ctrl-refresh should get you the right thing.  You will know you
have the right page if your userid appears in the upper right.

It seems that, technically, moin needs to send Vary: Cookie, but this would
completely destroy our ability to cache.

What we want is for anything with a Cookie: header to totally bypass the
cache.  I don't know of any way to configure that.

Joshua.

Re: DBSight, search on database by Lucene

Paul Querna-2
Joshua Slive wrote:

>
>
>
> On Sat, 11 Jun 2005, Erik Hatcher wrote:
>
>>
>> On Jun 11, 2005, at 1:08 PM, Chris Lu wrote:
>>
>>> Thanks.
>>>
>>> Somehow I found the "Powered By" Lucene page is "Immutable Page",
>>> even if I logged in.
>>> http://wiki.apache.org/jakarta-lucene/PoweredBy
>>
>>
>> Wow, it sure is.  I'm CC'ing infrastructure to find out why this page
>> is immutable.
>
>
>
> Ughhh... Looks like another caching problem.
>
> A shift- or ctrl-refresh should get you the right thing.  You know if
> you have the right page if your userid appears in the upper-right.
>
> It seems like technically moin needs to send Vary: cookie, but this would
> completely destroy our ability to cache.

Not if we applied the patch I sent to dev@httpd on Friday.  It fixes
mod_disk_cache's handling of Vary: to keep separate copies for each
combo, instead of only a single copy.
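
The idea of keeping a separate copy per combination can be sketched
roughly as follows. This is only an illustration of Vary-aware cache
keys, not the patch itself and not how mod_disk_cache lays out its
storage; the class, method names, and cookie value are invented.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; this is not mod_disk_cache's code or data layout.
class VaryAwareCache {
    private final Map<String, String> bodies = new HashMap<>();

    // Key = URL plus the request's value for every header named in Vary.
    // requestHeaders is assumed to use lower-case header names.
    static String variantKey(String url, String[] varyHeaders,
                             Map<String, String> requestHeaders) {
        StringBuilder key = new StringBuilder(url);
        for (String name : varyHeaders) {
            String lower = name.toLowerCase(Locale.ROOT);
            key.append('\n').append(lower).append(':')
               .append(requestHeaders.getOrDefault(lower, ""));
        }
        return key.toString();
    }

    void store(String url, String[] varyHeaders,
               Map<String, String> requestHeaders, String body) {
        bodies.put(variantKey(url, varyHeaders, requestHeaders), body);
    }

    // Returns null on a cache miss.
    String lookup(String url, String[] varyHeaders,
                  Map<String, String> requestHeaders) {
        return bodies.get(variantKey(url, varyHeaders, requestHeaders));
    }

    public static void main(String[] args) {
        VaryAwareCache cache = new VaryAwareCache();
        String url = "/jakarta-lucene/PoweredBy";
        String[] vary = {"Cookie"};
        Map<String, String> anonymous = new HashMap<>();
        Map<String, String> loggedIn = new HashMap<>();
        loggedIn.put("cookie", "session=abc"); // made-up cookie value

        cache.store(url, vary, anonymous, "immutable page");
        cache.store(url, vary, loggedIn, "editable page");

        // Each request gets the variant that matches its own Cookie header.
        System.out.println(cache.lookup(url, vary, anonymous));
        System.out.println(cache.lookup(url, vary, loggedIn));
    }
}
```

With a single copy per URL, whichever response was stored last would be
served to everyone, which matches the symptom Chris saw.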

> What we want is for anything with a Cookie: header to totally bypass the
> cache.  I don't know of any way to configure that.

Moin should be sending Cache-Control: private in these cases, in
addition to the Vary: Cookie header.  If it doesn't, it will break with
other upstream proxies that we have no control over.  Fixing it so httpd
can cache fixes upstream proxies too, so it is the right thing to do.

-Paul

Re: DBSight, search on database by Lucene

chrislusf
In reply to this post by Joshua Slive-2
Thanks, guys!

I have made the changes to the wiki, following Joshua's advice.
It was the cookie/refresh problem.

Chris Lu


wiki now sends Vary: Cookie (was Re: DBSight, search on database by Lucene)

Joshua Slive-2
In reply to this post by Paul Querna-2


Paul Querna wrote:
> Joshua Slive wrote:
>> What we want is for anything with a Cookie: header to totally bypass
>> the cache.  I don't know of any way to configure that.
>
>
> Moin should be sending Cache-Control: Private in these cases, in
> addition to the Vary: Cookie header.  If they don't they will break with
> other upstream proxies that we have no control over.  Fixing it so httpd
> can cache fixes upstream proxies too, so it is the right thing to do.

I've added the Vary: Cookie header.  I believe that even with the
current naive Vary handling, this should work ok in mod_cache, since it
won't store any of the logged-in pages due to the Cache-Control headers.
So the non-cookie version should hang around in the cache.
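
As a rough illustration of that reasoning, a shared cache can decide
whether to store a response from its Cache-Control header alone. The
sketch below is a simplification and not mod_cache's actual logic; it
assumes logged-in responses carry "private" or "no-store", since the
thread does not say which directives moin sends. The class and method
names are invented.

```java
import java.util.Locale;

// Hypothetical sketch of a shared cache's storability check.
class SharedCacheStorability {

    // Returns false if the Cache-Control value contains "private" or
    // "no-store". Splitting on commas ignores quoted arguments that
    // themselves contain commas, and a qualified private="..." is treated
    // conservatively as fully private.
    static boolean isStorable(String cacheControl) {
        if (cacheControl == null) {
            return true; // no header: nothing here forbids storing
        }
        for (String directive : cacheControl.split(",")) {
            String name = directive.trim().toLowerCase(Locale.ROOT);
            int eq = name.indexOf('=');
            if (eq >= 0) {
                name = name.substring(0, eq).trim(); // drop "=value"
            }
            if (name.equals("private") || name.equals("no-store")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isStorable("max-age=3600"));       // anonymous page
        System.out.println(isStorable("private, max-age=0")); // logged-in page
    }
}
```

Under these assumptions only the anonymous response is ever stored, so a
request carrying a cookie has no cached logged-in copy to collide with.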

Anyway, I hope this makes things much less confusing for people trying
to edit the pages.

Joshua.
