[jira] [Commented] (NUTCH-2466) Sitemap processor to follow redirects



JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/NUTCH-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347762#comment-16347762 ]

Markus Jelsma commented on NUTCH-2466:

Another note: it is curious to see browser developers allow more than ten redirects. I have never observed any benefit in following more than a few. Stranger still is IE's choice to jump from eleven to 120!

Can anyone reading this clarify the usefulness of following more than ten redirects? Or even 120?

Either they made bad choices, or I am not aware of their view of the variety of crap on the web. Probably the latter is true.
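
For anyone weighing where to put the limit, here is a minimal sketch of a capped redirect follower. It is not the code from the attached patch and it is not Nutch's API. The names `resolve_redirects`, `next_location`, `max_redirects` and `TooManyRedirects` are hypothetical, the default of 5 is an arbitrary assumption, and the URLs in the usage note are made up. The lookup function is passed in so that the sketch runs without network access.

```python
from typing import Callable, Optional


class TooManyRedirects(Exception):
    """Raised when a chain loops or exceeds the configured limit."""


def resolve_redirects(
    url: str,
    next_location: Callable[[str], Optional[str]],
    max_redirects: int = 5,  # assumed default, not a Nutch setting
) -> str:
    """Follow a redirect chain and return the final URL.

    next_location(url) must return the redirect target of url,
    or None when url is not redirected.
    """
    seen = {url}
    hops = 0
    while True:
        target = next_location(url)
        if target is None:
            # No further redirect: this is the final URL.
            return url
        if hops >= max_redirects:
            raise TooManyRedirects(f"gave up after {hops} redirects at {url}")
        if target in seen:
            raise TooManyRedirects(f"redirect loop detected at {target}")
        seen.add(target)
        url = target
        hops += 1
```

With a chain such as http://example.com/sitemap.xml > https://example.com/sitemap.xml > https://example.com/sitemap_index.xml stored in a dict, `resolve_redirects(start, chain.get)` returns the sitemap_index.xml URL, while `max_redirects=1` raises `TooManyRedirects` on the second hop.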

> Sitemap processor to follow redirects
> -------------------------------------
>                 Key: NUTCH-2466
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2466
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.13
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Minor
>             Fix For: 1.15
>         Attachments: NUTCH-2466.patch, NUTCH-2466.patch, NUTCH-2466.patch
> The sitemap processor does follow the http > https redirect, but not the redirect that comes after it, e.g. to the sitemap_index.xml that some websites have.

This message was sent by Atlassian JIRA