[jira] Created: (NUTCH-418) Fixes parsing of XHTML (e.g. title)

[jira] Created: (NUTCH-418) Fixes parsing of XHTML (e.g. title)

ASF GitHub Bot (Jira)
Fixes parsing of XHTML (e.g. title)
-----------------------------------

                 Key: NUTCH-418
                 URL: http://issues.apache.org/jira/browse/NUTCH-418
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 0.8.2
         Environment: Ubuntu Linux
            Reporter: Michael Wechner


Fixes parsing of XHTML (e.g. title)

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
[jira] Updated: (NUTCH-418) Fixes parsing of XHTML (e.g. title)

ASF GitHub Bot (Jira)
     [ http://issues.apache.org/jira/browse/NUTCH-418?page=all ]

Michael Wechner updated NUTCH-418:
----------------------------------

    Attachment: parse-xhtml-patch.txt

patch which fixes the mime-type

> Fixes parsing of XHTML (e.g. title)
> -----------------------------------
>
>                 Key: NUTCH-418
>                 URL: http://issues.apache.org/jira/browse/NUTCH-418
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 0.8.2
>         Environment: Ubuntu Linux
>            Reporter: Michael Wechner
>         Attachments: parse-xhtml-patch.txt
>
>
> Fixes parsing of XHTML (e.g. title)


       
[jira] Commented: (NUTCH-418) Fixes parsing of XHTML (e.g. title)

ASF GitHub Bot (Jira)
    [ http://issues.apache.org/jira/browse/NUTCH-418?page=comments#action_12460282 ]
           
Sami Siren commented on NUTCH-418:
----------------------------------

We should perhaps include the rest of changes made in NUTCH-362.



       
[jira] Closed: (NUTCH-418) Fixes parsing of XHTML (e.g. title)

ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/NUTCH-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  closed NUTCH-418.
-----------------------------------

       Resolution: Fixed
    Fix Version/s: 0.9.0

Already applied.

> Fixes parsing of XHTML (e.g. title)
> -----------------------------------
>
>                 Key: NUTCH-418
>                 URL: https://issues.apache.org/jira/browse/NUTCH-418
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 0.8.2
>         Environment: Ubuntu Linux
>            Reporter: Michael Wechner
>             Fix For: 0.9.0
>
>         Attachments: parse-xhtml-patch.txt
>
>
> Fixes parsing of XHTML (e.g. title)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.