[Nutch Wiki] Update of "FrontPage" by ChrisMattmann

Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "FrontPage" page has been changed by ChrisMattmann:

- add similarity documentation for naive bayes and cosine

   * [[http://www.covert.io/post/18414889381/accumulo-nutch-and-gora|Accumulo, Nutch, and Gora]] - A step-by-step tutorial /!\ Very Old /!\
  ==== Other Tutorial(s) ====
+  * Focused Crawling with Nutch using [[SimilarityScoringFilter|Cosine Similarity]] or using [[NaiveBayesParseFilter|Naive Bayes]].
   * [[http://hadoop.apache.org/common/docs/stable/|Hadoop Tutorial]] Nutch being based on Hadoop, it helps to have a better understanding of Hadoop.
   * [[NutchHadoopSingleNodeTutorial|Running Nutch in (pseudo) distributed mode]] - How to set up and run Nutch in Hadoop pseudo-distributed mode.
   * RunNutchInEclipse - How to configure, build, crawl and debug Nutch within Eclipse