[jira] [Updated] (NUTCH-2507) NutchTutorial wiki page has a lot of outdated command line calls when it starts with the solr interaction

David Pilato (Jira)

     [ https://issues.apache.org/jira/browse/NUTCH-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Nagel updated NUTCH-2507:
-----------------------------------
    Fix Version/s: 1.17

> NutchTutorial wiki page has a lot of outdated command line calls when it starts with the solr interaction
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-2507
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2507
>             Project: Nutch
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 1.14
>            Reporter: artodeto
>            Priority: Major
>              Labels: documentation, easyfix
>             Fix For: 1.17
>
>
> h2. Section "Step-by-Step: Indexing into Apache Solr"
> replace:
> {code:java}
> Example: bin/nutch index http://localhost:8983/solr crawl/crawldb/ -linkdb crawl/linkdb/ crawl/segments/20131108063838/ -filter -normalize -deleteGone{code}
> with:
> {code:java}
> Example: bin/nutch index -Dsolr.server.url=http://localhost:8983/solr/nutch ${NUTCH_RUNTIME_HOME}/crawl/crawldb/ -linkdb ${NUTCH_RUNTIME_HOME}/crawl/linkdb/ ${NUTCH_RUNTIME_HOME}/crawl/segments/20131108063838/ -filter -normalize -deleteGone{code}
>  
> h2. Section "Step-by-Step: Deleting Duplicates"
> replace:
> {code:java}
>      Usage: bin/nutch dedup <solr url>
>      Example: /bin/nutch dedup http://localhost:8983/solr
> {code}
> with:
> {code:java}
>      Usage: bin/nutch dedup <path to the crawldb> <solr url>
>      Example: /bin/nutch dedup ${NUTCH_RUNTIME_HOME}/crawl/crawldb/ http://localhost:8983/sol
> {code}
> h2. Section "Step-by-Step: Cleaning Solr"
> replace:
> {code:java}
>      Usage: bin/nutch clean -Dsolr.server.url=<solr index url> <crawldb>
>      Example: /bin/nutch clean -Dsolr.server.url=http://localhost:8983/solr/nutch crawl/crawldb/
> {code}
> with:
> {code}
>      Usage: bin/nutch clean -Dsolr.server.url=<solr index url> <crawldb>
>      Example: /bin/nutch clean -Dsolr.server.url=http://localhost:8983/solr/nutch ${NUTCH_RUNTIME_HOME}/crawl/crawldb/
> {code}
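
The corrected index and clean calls quoted above share one pattern: the Solr URL is passed via -Dsolr.server.url and every crawl path sits under ${NUTCH_RUNTIME_HOME}/crawl. A minimal dry-run sketch of that pattern follows. It only prints the command lines and does not run Nutch. The helper names nutch_index_cmd and nutch_clean_cmd and the fallback path /path/to/nutch/runtime/local are assumptions made for illustration; the URL and the segment name are the example values from the issue.

```shell
#!/bin/sh
# Dry-run sketch: prints the corrected commands instead of executing them.
# nutch_index_cmd and nutch_clean_cmd are hypothetical helper names.

# Build the "index" call: Solr URL via -Dsolr.server.url,
# crawldb, linkdb and segment paths under <runtime home>/crawl.
nutch_index_cmd() {
  home=$1; solr_url=$2; segment=$3
  printf 'bin/nutch index -Dsolr.server.url=%s %s/crawl/crawldb/ -linkdb %s/crawl/linkdb/ %s/crawl/segments/%s/ -filter -normalize -deleteGone\n' \
    "$solr_url" "$home" "$home" "$home" "$segment"
}

# Build the "clean" call: same URL property, crawldb path only.
nutch_clean_cmd() {
  home=$1; solr_url=$2
  printf 'bin/nutch clean -Dsolr.server.url=%s %s/crawl/crawldb/\n' "$solr_url" "$home"
}

# Example values: URL and segment name are the ones quoted in the issue;
# the fallback runtime home is a placeholder, not a real default.
runtime_home=${NUTCH_RUNTIME_HOME:-/path/to/nutch/runtime/local}
nutch_index_cmd "$runtime_home" http://localhost:8983/solr/nutch 20131108063838
nutch_clean_cmd "$runtime_home" http://localhost:8983/solr/nutch
```

Printing the commands first makes it easy to check the expanded paths before pasting them into a shell where Nutch is installed.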



--
This message was sent by Atlassian Jira
(v8.3.4#803005)