About Heritrix crawl: can anyone on this Nutch forum help me? Thanks


xingjian
The use case "A.3. Mirroring .html Files Only" in
http://crawler.archive.org/articles/user_manual/usecases.html says:

......
On the Setting screen, I'll want to set the following for the
NotMatchesFilePatternDecideRule:

decision: REJECT
use-preset-pattern: CUSTOM
regexp: .*(/|\.html)$
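
To make the effect of these three settings concrete: a "not matches" rule with decision REJECT should reject every URI that does not match the regexp, so only URIs ending in "/" or ".html" get through. Below is a minimal stand-alone Java sketch of that logic. It does not use Heritrix itself. The class name, method name, and sample URLs are made up for illustration, and it assumes the rule matches the regexp against the whole URI string.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone helper for illustration; not part of Heritrix.
class HtmlOnlyFilterSketch {
    // The regexp from the settings above (backslash doubled for a Java string).
    static final Pattern KEEP = Pattern.compile(".*(/|\\.html)$");

    // "Not matches" + REJECT: a URI is rejected when it does NOT match the regexp.
    // Assumption: the whole URI string must match, as with Matcher.matches().
    static boolean isRejected(String uri) {
        return !KEEP.matcher(uri).matches();
    }

    public static void main(String[] args) {
        // Made-up sample URIs.
        String[] samples = {
            "http://example.com/",
            "http://example.com/docs/index.html",
            "http://example.com/logo.png",
            "http://example.com/page.htm"
        };
        for (String uri : samples) {
            System.out.println((isRejected(uri) ? "REJECT  " : "not rejected  ") + uri);
        }
    }
}
```

With these samples, the directory URI and the .html page are not rejected, while the .png and .htm URIs are. In a real crawl the final decision also depends on the other rules in the decide-rule sequence.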


......

How do I configure the above in the Submodules of Heritrix? I don't know
how. Can anyone help me? Thanks.