RSS Feed Parser

RSS Feed Parser

Zaheed Haque
Hello:

I am really hoping that Chris Mattmann's RSS parser will make it into
release 0.7.

http://issues.apache.org/jira/browse/NUTCH-30

I got it working from last night's SVN. I believe newbie users like me
would benefit very much from having it as part of the distribution. +1
for this plugin!

Thanks Chris for solving my problem!!
--
Best Regards
Zaheed Haque

RE: RSS Feed Parser

chrismattmann
Hi Zaheed,

 Thanks for the nice comments. I went ahead and wrote an HTML page that
summarizes what I sent to Zaheed with respect to installing the parse-rss
plugin. You can find the small guide here:

http://www-scf.usc.edu/~mattmann/parse-rss-install.html


Thanks,
  Chris


______________________________________________
Chris A. Mattmann
[hidden email]
Staff Member
Modeling and Data Management Systems Section (387)
Data Management Systems and Technologies Group

_________________________________________________
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                        Mailstop:  171-246
_______________________________________________________

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.




VOTE: (Re: RSS Feed Parser)

Andrzej Białecki-2
My apologies to Chris - I was supposed to import this plugin before the
release; however, due to changes in my travel plans and other work, I ran
out of time... :-(

We are in the no-commit period now, before the release. I could do the
import now, if other committers approve this exception. As a safety
measure against this short testing period I would leave it disabled by
default.

Please vote +1 if I should commit it before the release, or -1 if after.

--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: [Nutch-general] VOTE: (Re: RSS Feed Parser)

Otis Gospodnetic-2-2
+1

Otis (not a real Nutch committer)



Re: [Nutch-general] VOTE: (Re: RSS Feed Parser)

Erik Hatcher
In reply to this post by Andrzej Białecki-2
+1  - with it disabled there isn't much risk.


Re: [Nutch-general] VOTE: (Re: RSS Feed Parser)

Jon Shoberg
Same for me ....    +1  - with it disabled there isn't much risk.


Re: [Nutch-general] VOTE: (Re: RSS Feed Parser)

Piotr Kosiorowski
+1 as it would be disabled.
P.

RE: VOTE: (Re: RSS Feed Parser)

Fuad Efendi
In reply to this post by Andrzej Białecki-2
+1
Need more samples!
(Can I vote? I am a novice... Just a few funny days!)



Re: [Nutch-general] RE: RSS Feed Parser

American Jeff Bowden
In reply to this post by chrismattmann
Where can I obtain the source of commons-feedparser-0.6-fork.jar?  It
doesn't appear to be in commons svn or on the feedparser site.


Re: [Nutch-general] RE: RSS Feed Parser

chrismattmann
Hi Jeff,

 commons-feedparser-fork is a branch I made of the feedparser 0.6 code
base. It removes some of the jar files that shipped with the standard
feedparser 0.6 distro and conflicted with the jar files in Nutch's lib
directory. Specifically, I changed it so that the core jaxen libraries the
feed parser relies on use jdom rather than dom4j (see the postings on the
Nutch list around March 2005 between John X, Stefan G. and me). This
required changing about 9 or 10 of the feedparser source files to use the
jdom Node classes rather than the dom4j ones.

If you like, I can put the forked feedparser code up on my website and
post the link to the list.

Thanks,
  Chris




Re: [Nutch-general] RE: RSS Feed Parser

American Jeff Bowden
Yes please, that would be great.  I couldn't even figure out where to
find the 0.6 version of feedparser, much less your patches to it.


RE: [Nutch-general] RE: RSS Feed Parser

chrismattmann
Hi Jeff,

 Okay, here is the link to the commons-feedparser source that includes my
modifications:

http://www-scf.usc.edu/~mattmann/feedparser-0.6-fork-src.zip


Thanks!

Cheers,
  Chris



Re: [Nutch-general] RE: RSS Feed Parser

American Jeff Bowden
I notice that build.xml still creates commons-feedparser-0.5.0-RC1.jar
but I'll assume you're just renaming it manually to -0.6-fork.

Thanks.


Chris Mattmann wrote:

>Hi Jeff,
>
> Okay, here is the link to commons-feedparser source that includes my
>modifications:
>
>http://www-scf.usc.edu/~mattmann/feedparser-0.6-fork-src.zip
>
>
>Thanks!
>
>Cheers,
>  Chris
>
>
>______________________________________________
>Chris A. Mattmann
>[hidden email]
>Staff Member
>Modeling and Data Management Systems Section (387)
>Data Management Systems and Technologies Group
>
>_________________________________________________
>Jet Propulsion Laboratory            Pasadena, CA
>Office: 171-266B                        Mailstop:  171-246
>_______________________________________________________
>
>Disclaimer:  The opinions presented within are my own and do not reflect
>those of either NASA, JPL, or the California Institute of Technology.
>
>
>
>  
>
>>-----Original Message-----
>>From: Jeff Bowden [mailto:[hidden email]]
>>Sent: Wednesday, August 24, 2005 10:45 PM
>>To: [hidden email]; [hidden email]
>>Subject: Re: [Nutch-general] RE: RSS Feed Parser
>>
>>Yes please, that would be great.  I couldn't even figure out where to
>>find the 0.6 version of feedparser, much less your patches to it.
>>
>>Chris Mattmann wrote:
>>
>>    
>>
>>>Hi Jeff,
>>>
>>>  commons-feedparser-fork was a branched off version of the feedparser
>>>      
>>>
>>0.6
>>    
>>
>>>base code that I made, which removed some of the specific jar files that
>>>were part of standard 0.6 feedparser distro that conflicted with the jar
>>>files included in Nutch's lib directory. Specifically, I changed it so
>>>      
>>>
>>that
>>    
>>
>>>the core jaxen libraries that the feed parser relied on weren't dom4j,
>>>      
>>>
>>but
>>    
>>
>>>in fact were jdom (see postings on the Nutch list around March 2005
>>>      
>>>
>>between
>>    
>>
>>>John X, Stefan G. and I). This required changing about 9 or 10 of the
>>>      
>>>
>>source
>>    
>>
>>>files for the feedparser to use the jdom Node classes rather than the
>>>      
>>>
>>dom4j.
>>    
>>
>>>If you like, I can put up a link to the feedparser forked code on my
>>>website, and post the link to the list.
>>>
>>>Thanks,
>>> Chris
>>>
>>>
>>>
>>>On 8/24/05 2:04 PM, "American Jeff Bowden" <[hidden email]>
>>>wrote:
>>>
>>>
>>>
>>>      
>>>
>>>>Where can I obtain the source of commons-feedparser-0.6-fork.jar?  It
>>>>doesn't appear to be in commons svn or on the feedparser site.
>>>>
>>>>Chris Mattmann wrote:
>>>>
>>>>
>>>>
>>>>        
>>>>
>>>>>Hi Zaheed,
>>>>>
>>>>>Thanks for the nice comments. I've went ahead and wrote an HTML page
>>>>>          
>>>>>
>>that
>>    
>>
>>>>>summarizes what I sent to Zaheed with respect to installing the parse-
>>>>>          
>>>>>
>>rss
>>    
>>
>>>>>plugin. You can find the small guide here:
>>>>>
>>>>>http://www-scf.usc.edu/~mattmann/parse-rss-install.html
>>>>>
>>>>>
>>>>>Thanks,
>>>>>Chris
>>>>>
>>>>>


RE: [Nutch-general] RE: RSS Feed Parser

chrismattmann
Hi Jeff,
 
  Yup, that's correct.

Thanks,
 Chris




> -----Original Message-----
> From: American Jeff Bowden [mailto:[hidden email]]
> Sent: Thursday, August 25, 2005 12:37 PM
> To: [hidden email]
> Cc: [hidden email]
> Subject: Re: [Nutch-general] RE: RSS Feed Parser
>
> I notice that build.xml still creates commons-feedparser-0.5.0-RC1.jar
> but I'll assume you're just renaming it manually to -0.6-fork.
>
> Thanks.
>
>
> Chris Mattmann wrote:
>
> >Hi Jeff,
> >
> > Okay, here is the link to commons-feedparser source that includes my
> >modifications:
> >
> >http://www-scf.usc.edu/~mattmann/feedparser-0.6-fork-src.zip
> >
> >
> >Thanks!
> >
> >Cheers,
> >  Chris
> >
> >
> >
> >
> >
> >
> >
> >>-----Original Message-----
> >>From: Jeff Bowden [mailto:[hidden email]]
> >>Sent: Wednesday, August 24, 2005 10:45 PM
> >>To: [hidden email]; [hidden email]
> >>Subject: Re: [Nutch-general] RE: RSS Feed Parser
> >>
> >>Yes please, that would be great.  I couldn't even figure out where to
> >>find the 0.6 version of feedparser, much less your patches to it.
> >>
> >>Chris Mattmann wrote:
> >>
> >>>Hi Jeff,
> >>>
> >>>  commons-feedparser-fork was a branched off version of the feedparser 0.6
> >>>base code that I made, which removed some of the specific jar files that
> >>>were part of standard 0.6 feedparser distro that conflicted with the jar
> >>>files included in Nutch's lib directory. Specifically, I changed it so that
> >>>the core jaxen libraries that the feed parser relied on weren't dom4j, but
> >>>in fact were jdom (see postings on the Nutch list around March 2005 between
> >>>John X, Stefan G. and I). This required changing about 9 or 10 of the source
> >>>files for the feedparser to use the jdom Node classes rather than the dom4j.
> >>>
> >>>If you like, I can put up a link to the feedparser forked code on my
> >>>website, and post the link to the list.
> >>>
> >>>Thanks,
> >>> Chris
> >>>
> >>>
> >>>
> >>>On 8/24/05 2:04 PM, "American Jeff Bowden" <[hidden email]>
> >>>wrote:
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>>Where can I obtain the source of commons-feedparser-0.6-fork.jar?  It
> >>>>doesn't appear to be in commons svn or on the feedparser site.
> >
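
The manual rename confirmed in this exchange (build.xml produces commons-feedparser-0.5.0-RC1.jar, which is then renamed to the -0.6-fork name) could be scripted roughly as below. This is only a sketch: the two jar names come from the thread, but the function name and both directory paths are hypothetical and would need to match your own checkout.

```python
import shutil
from pathlib import Path

# Jar names as mentioned in the thread.
BUILT_JAR = "commons-feedparser-0.5.0-RC1.jar"  # what build.xml reportedly produces
FORK_JAR = "commons-feedparser-0.6-fork.jar"    # the name the jar is renamed to


def install_fork_jar(build_dir: Path, plugin_lib_dir: Path) -> Path:
    """Copy the built feedparser jar into plugin_lib_dir under the fork name.

    Both directories are assumptions for illustration; the thread does not
    say where the build output or the plugin's lib directory live.
    """
    source = build_dir / BUILT_JAR
    if not source.is_file():
        raise FileNotFoundError(f"expected build output not found: {source}")
    plugin_lib_dir.mkdir(parents=True, exist_ok=True)
    target = plugin_lib_dir / FORK_JAR
    # Copy instead of move so the original build output stays in place.
    shutil.copy2(source, target)
    return target
```

For example, `install_fork_jar(Path("feedparser-fork/build"), Path("parse-rss/lib"))` would leave the renamed jar in the second directory; both paths here are placeholders, not locations taken from the thread.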