How specify the analyzer when created query with api?

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

How specify the analyzer when created query with api?

Giannandrea Castaldi
Hi,
In my webapp I'm trying to use the lucene api to build  queries
instead of the QueryParser but I haven't found out where to specify
the Analyzer. Any help?
Thanks.

jean71

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: How specify the analyzer when created query with api?

Erick Erickson
What do you mean when you say "trying to use the lucene api to build
 queries"?
Are you trying to use BooleanQuery? If so, you either construct specific
clauses yourself (presumably by, say, tokenizing things yourself and
creating TermQuerys, PhraseQuerys, etc.) which *don't* need an
analyzer, or add Querys that you get from running stuff through
QueryParser that has a constructor that takes an Analyzer.

Code snippets or more specific questions would be a great help <G>...
Imagine yourself trying to respond to an e-mail like you sent. There
isn't enough information there to really understand the problem you're
trying to solve, so I have to guess.

Best
Erick

On Mon, Sep 22, 2008 at 4:46 AM, Giannandrea Castaldi <[hidden email]
> wrote:

> Hi,
> In my webapp I'm trying to use the lucene api to build  queries
> instead of the QueryParser but I haven't found out where to specify
> the Analyzer. Any help?
> Thanks.
>
> jean71
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>
Reply | Threaded
Open this post in threaded view
|

Re: How specify the analyzer when created query with api?

jean71
On Mon, Sep 22, 2008 at 2:49 PM, Erick Erickson <[hidden email]> wrote:

> What do you mean when you say "trying to use the lucene api to build
>  queries"?
> Are you trying to use BooleanQuery? If so, you either construct specific
> clauses yourself (presumably by, say, tokenizing things yourself and
> creating TermQuerys, PhraseQuerys, etc.) which *don't* need an
> analyzer, or add Querys that you get from running stuff through
> QueryParser that has a constructor that takes an Analyzer.
>
> Code snippets or more specific questions would be a great help <G>...
> Imagine yourself trying to respond to an e-mail like you sent. There
> isn't enough information there to really understand the problem you're
> trying to solve, so I have to guess.

Sorry, my question was indeed too generic.
I've written several tests to take confidence with behavior of Lucene,
all using the StandardAnalyzer and the QueryParser. All the tests
pass.
Now I'm trying to write the same tests using
Term/TermQuery/BooleanQuery but they don't works. Now I've found out
that the problem is that using directly Term/TermQuery/BooleanQuery
I'm not specifying the analyzer and then the query is not correct.
Here one test with the use of the QueryParser, that works, and the
current vesion with Term that doesn't work:

    @Test
    public void testSimpleSearch() throws Exception
    {
        LuceneContext luceneContext = new LuceneContext();
        Document luceneInAction = luceneContext.createDocument("Lucene
in action", 193, "9.90");
        Document luceneForDummies =
luceneContext.createDocument("Lucene for dummies", 245, "11.50");
        Document tomcatInAction = luceneContext.createDocument("Tomcat
in action", 728, "110.30");
        luceneContext.addDocument(luceneInAction);
        luceneContext.addDocument(luceneForDummies);
        luceneContext.addDocument(tomcatInAction);
        luceneContext.closeIndex();
        QueryParser queryParser = new QueryParser(LuceneContext.TITLE,
new StandardAnalyzer());
        Hits result = luceneContext.search(queryParser.parse("Lucene"));
        assertEquals(result, Arrays.asList(luceneInAction, luceneForDummies));
    }

   @Test
    public void testSimpleSearch() throws Exception
    {
        LuceneContext luceneContext = new LuceneContext();
        Document luceneInAction = luceneContext.createDocument("Lucene
in action", 193, "9.90");
        Document luceneForDummies =
luceneContext.createDocument("Lucene for dummies", 245, "11.50");
        Document tomcatInAction = luceneContext.createDocument("Tomcat
in action", 728, "110.30");
        luceneContext.addDocument(luceneInAction);
        luceneContext.addDocument(luceneForDummies);
        luceneContext.addDocument(tomcatInAction);
        luceneContext.closeIndex();
        Hits result = luceneContext.search(new TermQuery(new
Term(LuceneContext.TITLE, "Lucene")));
        assertEquals(result, Arrays.asList(luceneInAction, luceneForDummies));
    }

Now, in our webapp, to execute query on the index we compose a query
as string passed to QueryParser. We've read on the lucene
documentation that from code is better compose a BooleanQuery using
Term/TermQuery/... and then I'm evaluting how do it.
Thanks.
jean71

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: How specify the analyzer when created query with api?

Erick Erickson
Right, you can't do that. TermQuerys are low-level, they don't
go through analyzers, you have to do the tokenizing yourself.

StandardAnalyzer, among other things, lowercases the tokens.
If you haven't already got a copy of Luke, please do so as it's
a wonderful tool for seeing what different analyzers do. Not to
mention examining your index. Not to mention....

You could do something like
BooleanQuery bq = new BooleanQuery()
Query q = queryParser.parse("Lucene")
bq.add(q, BooleanClause.Occur.MUST)

Hits result = luceneContext.search(bq);

but I don't know whether this suits your needs.
You're still testing the query parser rather than
testing building BooleanQuerys.

(note, I haven't compiled this....).

BTW, the reason your test is failing is because
of the case issue, I believe. You'd get the
results you expect if you did something like:

luceneContext.search(new TermQuery(new
Term(LuceneContext.TITLE, "lucene")));

(not lower-case "lucene").

You could also do something like getting the
tokenStream from StandardAnalyzer call
next(), adding each returned token as a new
Term to your BooleanQuery.

But TermQuery etc. are different beasts than the
a parsed query.

But isn't this the wonderful thing about tests? They
make your assumptions explicit, and when your
assumptions are incorrect, you find it with much less
pain than in a working program where you say "the most
recent change I made couldn't possibly have broken
anything" <G>.

Best
Erick

On Mon, Sep 22, 2008 at 10:01 AM, Giannandrea Castaldi <[hidden email]
> wrote:

> On Mon, Sep 22, 2008 at 2:49 PM, Erick Erickson <[hidden email]>
> wrote:
> > What do you mean when you say "trying to use the lucene api to build
> >  queries"?
> > Are you trying to use BooleanQuery? If so, you either construct specific
> > clauses yourself (presumably by, say, tokenizing things yourself and
> > creating TermQuerys, PhraseQuerys, etc.) which *don't* need an
> > analyzer, or add Querys that you get from running stuff through
> > QueryParser that has a constructor that takes an Analyzer.
> >
> > Code snippets or more specific questions would be a great help <G>...
> > Imagine yourself trying to respond to an e-mail like you sent. There
> > isn't enough information there to really understand the problem you're
> > trying to solve, so I have to guess.
>
> Sorry, my question was indeed too generic.
> I've written several tests to take confidence with behavior of Lucene,
> all using the StandardAnalyzer and the QueryParser. All the tests
> pass.
> Now I'm trying to write the same tests using
> Term/TermQuery/BooleanQuery but they don't works. Now I've found out
> that the problem is that using directly Term/TermQuery/BooleanQuery
> I'm not specifying the analyzer and then the query is not correct.
> Here one test with the use of the QueryParser, that works, and the
> current vesion with Term that doesn't work:
>
>    @Test
>    public void testSimpleSearch() throws Exception
>    {
>        LuceneContext luceneContext = new LuceneContext();
>        Document luceneInAction = luceneContext.createDocument("Lucene
> in action", 193, "9.90");
>        Document luceneForDummies =
> luceneContext.createDocument("Lucene for dummies", 245, "11.50");
>        Document tomcatInAction = luceneContext.createDocument("Tomcat
> in action", 728, "110.30");
>        luceneContext.addDocument(luceneInAction);
>        luceneContext.addDocument(luceneForDummies);
>        luceneContext.addDocument(tomcatInAction);
>        luceneContext.closeIndex();
>        QueryParser queryParser = new QueryParser(LuceneContext.TITLE,
> new StandardAnalyzer());
>        Hits result = luceneContext.search(queryParser.parse("Lucene"));
>        assertEquals(result, Arrays.asList(luceneInAction,
> luceneForDummies));
>    }
>
>   @Test
>    public void testSimpleSearch() throws Exception
>    {
>        LuceneContext luceneContext = new LuceneContext();
>        Document luceneInAction = luceneContext.createDocument("Lucene
> in action", 193, "9.90");
>        Document luceneForDummies =
> luceneContext.createDocument("Lucene for dummies", 245, "11.50");
>        Document tomcatInAction = luceneContext.createDocument("Tomcat
> in action", 728, "110.30");
>        luceneContext.addDocument(luceneInAction);
>        luceneContext.addDocument(luceneForDummies);
>        luceneContext.addDocument(tomcatInAction);
>        luceneContext.closeIndex();
>        Hits result = luceneContext.search(new TermQuery(new
> Term(LuceneContext.TITLE, "Lucene")));
>        assertEquals(result, Arrays.asList(luceneInAction,
> luceneForDummies));
>    }
>
> Now, in our webapp, to execute query on the index we compose a query
> as string passed to QueryParser. We've read on the lucene
> documentation that from code is better compose a BooleanQuery using
> Term/TermQuery/... and then I'm evaluting how do it.
> Thanks.
> jean71
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>
Reply | Threaded
Open this post in threaded view
|

Re: How specify the analyzer when created query with api?

jean71
On Mon, Sep 22, 2008 at 5:42 PM, Erick Erickson <[hidden email]> wrote:
...
>
> But isn't this the wonderful thing about tests? They
> make your assumptions explicit, and when your
> assumptions are incorrect, you find it with much less
> pain than in a working program where you say "the most
> recent change I made couldn't possibly have broken
> anything" <G>.

Yes, I completely agree with you.

jean71

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]