nutch fetch status codes


Tranquil
hi,

Can someone explain the various status codes and their meanings?
fetched, unfetched  - pretty obvious

db_gone - ?
db_redir_perm - ?
db_redir_temp - ?

Eyal Edri
Re: nutch fetch status codes

Andrzej Białecki-2
eyal edri wrote:
> hi,
>
> Can someone explain the various status codes and their meanings?
> fetched, unfetched  - pretty obvious
>
> db_gone - ?

We tried several times to retrieve this page (3 times by default), and
it was either forbidden by robots.txt, or we got HTTP 404.

> db_redir_perm - ?

This URL is redirected to a different URL using HTTP 301 (Moved
Permanently). The HTTP spec says that in this case the original URL should
not be used anymore.

> db_redir_temp - ?

This URL is redirected to a different URL using HTTP 302 (Found, originally
named "Moved Temporarily").
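
To make the mapping described above concrete, here is a minimal sketch in Python. It is not Nutch's actual code or API: the function name classify_fetch_result, its parameters, and the retry handling are hypothetical, and the db_fetched / db_unfetched names simply follow the db_ prefix used for the other statuses. It reads the explanation as "a page becomes db_gone once the configured number of attempts (3 by default) has been used up on robots.txt denials or HTTP 404 responses".

```python
# Hypothetical sketch of the status mapping; not taken from the Nutch source.
MAX_ATTEMPTS = 3  # "3 times by default", per the explanation above


def classify_fetch_result(http_status, robots_denied=False, attempts=1,
                          max_attempts=MAX_ATTEMPTS):
    """Return a CrawlDb-style status name for one URL.

    http_status   -- HTTP response code of the latest attempt (None if no request was made)
    robots_denied -- True if robots.txt forbids fetching the URL
    attempts      -- how many fetch attempts have been made so far
    """
    if robots_denied or http_status == 404:
        # Assumption: the URL is only given up on after the last allowed attempt.
        return "db_gone" if attempts >= max_attempts else "db_unfetched"
    if http_status == 301:
        return "db_redir_perm"  # permanent redirect: stop using the original URL
    if http_status == 302:
        return "db_redir_temp"  # temporary redirect: the original URL stays valid
    if http_status == 200:
        return "db_fetched"
    # Any other outcome is left for a later attempt in this sketch.
    return "db_unfetched"


if __name__ == "__main__":
    for code in (200, 301, 302):
        print(code, classify_fetch_result(code))
    print(404, classify_fetch_result(404, attempts=3))
```

A real crawler distinguishes many more cases (timeouts, 5xx errors, other redirect codes); the sketch only covers the three statuses asked about plus the two obvious ones.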


--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

Re: nutch fetch status codes

misc

Hello-

    I should point out that these statuses correspond to standard HTTP
response codes (404, 301, 302) rather than anything Nutch-specific, so if you
want more information you will probably get more thorough results from the
HTTP documentation.
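
As a quick way to look up what a given HTTP code means, Python's standard library ships the official reason phrases. The helper name reason_phrase below is just for illustration; HTTPStatus itself is part of the standard http module.

```python
from http import HTTPStatus


def reason_phrase(code):
    """Return the standard reason phrase for an HTTP status code."""
    return HTTPStatus(code).phrase


if __name__ == "__main__":
    # The three codes mentioned in this thread.
    for code in (301, 302, 404):
        print(code, reason_phrase(code))
    # prints:
    # 301 Moved Permanently
    # 302 Found
    # 404 Not Found
```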

                        see you
                            -J


----- Original Message -----
From: "Andrzej Bialecki" <[hidden email]>
To: <[hidden email]>
Sent: Tuesday, September 18, 2007 8:57 AM
Subject: Re: nutch fetch status codes

