[jira] Created: (NUTCH-291) OpenSearchServlet should return "date" as well as "lastModified"


Nick Burch (Jira)
OpenSearchServlet should return "date" as well as "lastModified"
----------------------------------------------------------------

         Key: NUTCH-291
         URL: http://issues.apache.org/jira/browse/NUTCH-291
     Project: Nutch
        Type: Improvement

  Components: web gui  
    Versions: 0.8-dev    
    Reporter: Stefan Neufeind


Currently OpenSearchServlet provides lastModified, but only when the last-modified date is known.

Since you can sort by "date" (which is lastModified or, if that is not present, the fetch date), it would be useful if OpenSearchServlet provided "date" as well.
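
To make the fallback rule concrete, here is a minimal sketch of how "date" could be derived. The class and method names (EffectiveDate, resolve) are hypothetical and not part of Nutch, and the use of epoch milliseconds for both values is an assumption made only for illustration.

```java
import java.util.OptionalLong;

/** Hypothetical helper for illustration only; not part of Nutch. */
final class EffectiveDate {

    private EffectiveDate() {}

    /**
     * Returns the value to report as "date": lastModified when it is known,
     * otherwise the fetch time. Both values are assumed to be epoch milliseconds.
     */
    static long resolve(OptionalLong lastModified, long fetchTime) {
        return lastModified.orElse(fetchTime);
    }
}
```

With this rule, a page that has no last-modified information would still get a usable "date" equal to the time it was fetched.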

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Updated: (NUTCH-291) OpenSearchServlet should return "date" as well as "lastModified"

Nick Burch (Jira)
     [ http://issues.apache.org/jira/browse/NUTCH-291?page=all ]

Stefan Neufeind updated NUTCH-291:
----------------------------------

    Attachment: NUTCH-291-unfinished.patch

I tried implementing this in OpenSearchServlet.java (see patch). The idea for this patch is based on more.jsp. However, I receive:

java.lang.NumberFormatException: null
        java.lang.Long.parseLong(Long.java:372)
        java.lang.Long.<init>(Long.java:671)
        org.apache.nutch.searcher.OpenSearchServlet.doGet(OpenSearchServlet.java:230)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

I guess this happens because the date is not present here. I tried tracking down the problem, and it seems that the field also needs to be provided in java/org/apache/nutch/searcher/IndexSearcher.java. But I assume that the Lucene engine here correctly provides the date field.
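
The trace above is what Java produces when a null string reaches Long.parseLong, so one way to avoid it would be to check the field value before parsing. The following is only a sketch under assumptions: DateFieldParser and parseMillis are hypothetical names, not code from the patch or from Nutch, and it assumes the field value is a decimal string of epoch milliseconds.

```java
import java.util.OptionalLong;

/** Hypothetical helper for illustration only; not taken from the patch or from Nutch. */
final class DateFieldParser {

    private DateFieldParser() {}

    /**
     * Parses a field value that is assumed to hold epoch milliseconds as a decimal string.
     * Returns an empty result instead of throwing when the value is missing or malformed.
     */
    static OptionalLong parseMillis(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            // A field that was never returned by the searcher ends up here.
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(raw.trim()));
        } catch (NumberFormatException e) {
            // Malformed content is treated the same as a missing field.
            return OptionalLong.empty();
        }
    }
}
```

A caller could then skip the element, or fall back to another value, whenever the result is empty.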

Maybe somebody could fix up my patch and then commit it as well. I think it would be good to always have the date available in the RSS feed.


[jira] Commented: (NUTCH-291) OpenSearchServlet should return "date" as well as "lastModified"

Nick Burch (Jira)
In reply to this post by Nick Burch (Jira)
    [ http://issues.apache.org/jira/browse/NUTCH-291?page=comments#action_12414445 ]

Stefan Groschupf commented on NUTCH-291:
----------------------------------------

lastModified is only indexed if you enable the index-more plugin.
If you think the way lastModified and date are stored in the index should change, please submit a patch for MoreIndexingFilter.


[jira] Commented: (NUTCH-291) OpenSearchServlet should return "date" as well as "lastModified"

Nick Burch (Jira)
In reply to this post by Nick Burch (Jira)
    [ http://issues.apache.org/jira/browse/NUTCH-291?page=comments#action_12414466 ]

Stefan Neufeind commented on NUTCH-291:
---------------------------------------

Which approach is preferable: always setting lastModified even when the web server did not return it (possibly unclean), or always returning date as well (cleaner?)?
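
To illustrate the second option, here is a rough sketch that leaves lastModified untouched and always adds date. It uses the standard Java DOM API. The class name ItemDates, the plain element names "item", "lastModified" and "date", and the epoch-millisecond values are assumptions for illustration and do not reflect the servlet's actual output format.

```java
import java.util.OptionalLong;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical sketch for illustration only; not the servlet's real code. */
final class ItemDates {

    private ItemDates() {}

    /** Creates an empty DOM document, wrapping the checked exception. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Builds an item that carries lastModified only when it is known,
     * and always carries date (lastModified, or the fetch time as fallback).
     */
    static Element buildItem(Document doc, OptionalLong lastModified, long fetchTime) {
        Element item = doc.createElement("item");
        if (lastModified.isPresent()) {
            appendText(doc, item, "lastModified", Long.toString(lastModified.getAsLong()));
        }
        appendText(doc, item, "date", Long.toString(lastModified.orElse(fetchTime)));
        return item;
    }

    /** Appends a child element with the given name and text content. */
    private static void appendText(Document doc, Element parent, String name, String text) {
        Element child = doc.createElement(name);
        child.setTextContent(text);
        parent.appendChild(child);
    }
}
```

With this approach lastModified keeps its meaning of "reported by the web server", while consumers that only need something to sort or display can rely on date.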
