[jira] Created: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child.

[jira] Created: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child.

Sebastian Nagel (Jira)
NodeWalker.skipChildren don't wrok for more than 1 child.
---------------------------------------------------------

                 Key: NUTCH-529
                 URL: https://issues.apache.org/jira/browse/NUTCH-529
             Project: Nutch
          Issue Type: Bug
            Reporter: Emmanuel Joke
             Fix For: 1.0.0
         Attachments: NUTCH-529.patch

I used NodeWalker to parse an HTML page and skip elements such as "SELECT" together with their children. I noticed that it did not skip the "OPTION" elements that are children of the SELECT element. It skips the child when there is only one, but when there are 8 child elements it keeps them.
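
For illustration, the intended behaviour can be sketched with a small stack-based walker. This is only a sketch under stated assumptions: `SimpleNodeWalker`, `visitedElements` and the sample markup are hypothetical names made up here, and the code is neither Nutch's `NodeWalker` nor the attached patch. It shows one way a `skipChildren` call can drop every child of the current node, however many there are.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class SkipChildrenSketch {

  /** Hypothetical stand-in for a stack-based DOM walker; not Nutch's NodeWalker. */
  static class SimpleNodeWalker {
    private final Deque<Node> pending = new ArrayDeque<>();
    private int pushedForCurrent = 0;

    SimpleNodeWalker(Node root) {
      pending.push(root);
    }

    boolean hasNext() {
      return !pending.isEmpty();
    }

    Node nextNode() {
      Node current = pending.pop();
      NodeList children = current.getChildNodes();
      // Push in reverse order so that the first child is visited first.
      for (int i = children.getLength() - 1; i >= 0; i--) {
        pending.push(children.item(i));
      }
      pushedForCurrent = children.getLength();
      return current;
    }

    /** Drops every child that was pushed for the most recently returned node. */
    void skipChildren() {
      for (int i = 0; i < pushedForCurrent; i++) {
        pending.pop();
      }
      pushedForCurrent = 0;
    }
  }

  /** Returns the element names visited when the children of "select" are skipped. */
  static List<String> visitedElements(String xml) {
    Node root;
    try {
      root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
          .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("could not parse sample markup", e);
    }
    SimpleNodeWalker walker = new SimpleNodeWalker(root);
    List<String> names = new ArrayList<>();
    while (walker.hasNext()) {
      Node node = walker.nextNode();
      if (node.getNodeType() != Node.ELEMENT_NODE) {
        continue; // text nodes have no children, so nothing is left behind
      }
      names.add(node.getNodeName());
      if ("select".equalsIgnoreCase(node.getNodeName())) {
        walker.skipChildren();
      }
    }
    return names;
  }

  public static void main(String[] args) {
    // Eight OPTION children, matching the case described in the report.
    StringBuilder xml = new StringBuilder("<form><select>");
    for (int i = 0; i < 8; i++) {
      xml.append("<option>").append(i).append("</option>");
    }
    xml.append("</select><input/></form>");
    System.out.println(visitedElements(xml.toString())); // prints [form, select, input]
  }
}
```

The sketch counts the children at the moment they are pushed, so the skip does not depend on how many siblings there are.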


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child.

Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Joke updated NUTCH-529:
--------------------------------

    Attachment: NUTCH-529.patch

patch attached


[jira] Commented: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child.

Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516367 ]

Doğacan Güney commented on NUTCH-529:
-------------------------------------

Could you also add a JUnit test case? (Actually, since NodeWalker is used in a few places, we really need a general test case that tests all of its public methods.)


[jira] Updated: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child.

Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doğacan Güney updated NUTCH-529:
--------------------------------

    Summary: NodeWalker.skipChildren doesn't work for more than 1 child.  (was: NodeWalker.skipChildren don't wrok for more than 1 child.)

Fixed typo in issue title.


[jira] Updated: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child.

Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Joke updated NUTCH-529:
--------------------------------

    Attachment: TestNodeWalker.java

Junit test provided.


[jira] Commented: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child.

Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12526245 ]

Doğacan Güney commented on NUTCH-529:
-------------------------------------

Thanks for the test case, Emmanuel, but it doesn't compile as it seems to depend on neko. Test cases should only depend on what's in nutch core (unless they are plugin test cases, of course).


[jira] Updated: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child.

Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Joke updated NUTCH-529:
--------------------------------

    Attachment: TestNodeWalker.java

Another version, without the dependency on Neko.


[jira] Commented: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child.

Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529392 ]

Doğacan Güney commented on NUTCH-529:
-------------------------------------

Emmanuel, Nutch seems to pass the JUnit test case even without your patch applied to NodeWalker. When attaching a test case to a bug-related issue, try to make unpatched Nutch fail :).
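
The point about making the unpatched code fail can be illustrated with a small, dependency-free sketch. Everything in it is an assumption made for illustration: `Item`, `walk`, `selectWithOptions` and the `popOnlyOne` flag are hypothetical, and the "buggy" branch is invented to mimic the reported symptom rather than taken from the actual Nutch code. It suggests why a test with a single child cannot expose this kind of defect, while a test with several children can.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class RegressionCheckSketch {

  /** Minimal tree node, used instead of a DOM to keep the sketch dependency-free. */
  static final class Item {
    final String name;
    final List<Item> children = new ArrayList<>();

    Item(String name) {
      this.name = name;
    }
  }

  /**
   * Walks the tree depth-first and skips the children of "select".
   * When popOnlyOne is true, the skip removes a single child: a hypothetical
   * bug chosen to mimic the reported symptom, not the actual Nutch defect.
   */
  static List<String> walk(Item root, boolean popOnlyOne) {
    Deque<Item> pending = new ArrayDeque<>();
    pending.push(root);
    List<String> visited = new ArrayList<>();
    while (!pending.isEmpty()) {
      Item current = pending.pop();
      visited.add(current.name);
      // Push in reverse order so that the first child is visited first.
      for (int i = current.children.size() - 1; i >= 0; i--) {
        pending.push(current.children.get(i));
      }
      if (current.name.equals("select")) {
        int total = current.children.size();
        int toDrop = popOnlyOne ? Math.min(1, total) : total;
        for (int i = 0; i < toDrop; i++) {
          pending.pop();
        }
      }
    }
    return visited;
  }

  /** Builds a "select" node with the given number of "option" children. */
  static Item selectWithOptions(int count) {
    Item select = new Item("select");
    for (int i = 0; i < count; i++) {
      select.children.add(new Item("option"));
    }
    return select;
  }

  public static void main(String[] args) {
    // One child: both variants skip it, so this input cannot expose the bug.
    System.out.println(walk(selectWithOptions(1), true).contains("option"));  // prints false
    System.out.println(walk(selectWithOptions(1), false).contains("option")); // prints false
    // Eight children: only the buggy variant still visits "option" nodes.
    System.out.println(walk(selectWithOptions(8), true).contains("option"));  // prints true
    System.out.println(walk(selectWithOptions(8), false).contains("option")); // prints false
  }
}
```

A regression test built on the eight-child input would fail against the buggy variant and pass against the fixed one, which is the property asked for above.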


[jira] Updated: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child.

Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Joke updated NUTCH-529:
--------------------------------

    Attachment: TestNodeWalker.java

Oops... well, that reflects my lack of experience with unit tests. Sorry about that.

I have attached another class, which should be correct.



[jira] Updated: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child.

Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Joke updated NUTCH-529:
--------------------------------

    Attachment:     (was: TestNodeWalker.java)


[jira] Updated: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child.

Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Joke updated NUTCH-529:
--------------------------------

    Attachment:     (was: TestNodeWalker.java)


[jira] Closed: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child.

Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doğacan Güney closed NUTCH-529.
-------------------------------


Resolved and committed.


[jira] Resolved: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child.

Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doğacan Güney resolved NUTCH-529.
---------------------------------

    Resolution: Fixed

Fixed in rev. 578703.


[jira] Commented: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child.

Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/NUTCH-529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530040 ]

Hudson commented on NUTCH-529:
------------------------------

Integrated in Nutch-Nightly #217 (See [http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly/217/])
