browsing query at Servlet level

Maria Sifniotis
Hello all!

Newish Nutch user here, so I ask for your patience!

I have set up Nutch, changed a few things I wanted changed, and got my index and servlet working fine. Query results return fine, so everything is set.

My question: instead of running a text query against the content of a page, is it possible to do a browsing function? I'll illustrate, because I may not be making any sense.

One of my objectives is to check whether a web page has images and record that as a field during the indexing phase, such as:
doc.add(new Field("hasImgs", "YES", Field.Store.YES, Field.Index.TOKENIZED));
This works ok!

Suppose I harvest 10 web pages; 5 of them have images and 5 don't. How can I direct my bean.search to look at the hasImgs field for the value YES and then display only those pages?

I don't care about text-based queries here; I just want to see how many of my indexed pages have a YES in that field. I can do this with Luke, but I need it to be in the Tomcat application.

Any clues?

Thank you very much!

Maria


     

Re: browsing query at Servlet level

jthompson-2
Hi Maria,

If I understand correctly what you want, I think you want to use the
addRequiredTerm() method of the Query object before you use your query in the
search:

addRequiredTerm(String term, String field)
          Add a required term in a specified field.

http://lucene.apache.org/nutch/apidocs/org/apache/nutch/searcher/Query.html
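
To make the idea of a required term on a field concrete, here is a minimal sketch in plain Java. It is a stand-in model over an in-memory list, not the Nutch or Lucene API: the class RequiredFieldSketch, the methods requireTerm and samplePages, and the example.com URLs are all hypothetical and exist only for illustration. It mirrors the scenario from the question (10 pages, 5 flagged with hasImgs = YES) and keeps the (term, field) argument order of addRequiredTerm.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in model only: plain Java collections, not the Nutch or Lucene API.
public class RequiredFieldSketch {

    // Keep only the documents whose `field` holds exactly `term`.
    // The (term, field) order mirrors addRequiredTerm(String term, String field).
    static List<Map<String, String>> requireTerm(
            List<Map<String, String>> docs, String term, String field) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            if (term.equals(doc.get(field))) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // Build ten toy pages; the even-numbered ones are flagged as having images.
    static List<Map<String, String>> samplePages() {
        List<Map<String, String>> pages = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            Map<String, String> doc = new HashMap<>();
            doc.put("url", "http://example.com/page" + i); // hypothetical URL
            if (i % 2 == 0) {
                doc.put("hasImgs", "YES");
            }
            pages.add(doc);
        }
        return pages;
    }

    public static void main(String[] args) {
        List<Map<String, String>> pages = samplePages();
        List<Map<String, String>> hits = requireTerm(pages, "YES", "hasImgs");
        System.out.println(hits.size() + " of " + pages.size() + " pages have images");
        for (Map<String, String> hit : hits) {
            System.out.println(hit.get("url"));
        }
    }
}
```

In Nutch itself the corresponding step would presumably be calling addRequiredTerm("YES", "hasImgs") on the Query handed to bean.search. Whether that matches anything depends on how the field was indexed and on the query setup, which this sketch does not cover.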

HTH,
John

On Tue, Jul 8, 2008 at 8:09 AM, Maria Sifniotis <[hidden email]> wrote:

Re: browsing query at Servlet level

Maria Sifniotis
Hi John, thanks for the answer!

I was looking at the addRequiredTerm() method the other day, but from what I saw, doesn't it require a query term anyway, that is, a keyword to be searched for in the content?

Maybe I did not understand it very well from the docs. I'll have a look tomorrow at work and update.

Cheers again for the help,

Maria



--- On Tue, 7/8/08, John Thompson <[hidden email]> wrote:

> From: John Thompson <[hidden email]>
> Subject: Re: browsing query at Servlet level
> To: [hidden email], [hidden email]
> Date: Tuesday, July 8, 2008, 1:05 PM