segread vs. readseg

Stefan Groschupf-2
Hi developers,

we have commands like readdb and readlinkdb, but also segread. Wouldn't it be
more consistent to name the command readseg instead of segread?
... just a thought.

Stefan



Re: segread vs. readseg

Andrzej Białecki-2
Stefan Groschupf wrote:
> Hi developers,
>
> we have commands like readdb and readlinkdb, but also segread. Wouldn't it be
> more consistent to name the command readseg instead of segread?
> ... just a thought.

Yes, it seems more consistent. However, if we change it then scripts
people wrote would break. We could support both aliases in 0.8, and give
a deprecation message.

What do others think?

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: segread vs. readseg

Stefan Neufeind
Andrzej Bialecki wrote:

> Stefan Groschupf wrote:
>> Hi developers,
>>
>> we have commands like readdb and readlinkdb, but also segread. Wouldn't it be
>> more consistent to name the command readseg instead of segread?
>> ... just a thought.
>
> Yes, it seems more consistent. However, if we change it then scripts
> people wrote would break. We could support both aliases in 0.8, and give
> a deprecation message.
>
> What do others think?

Same feeling here. Agreed.

   Stefan

Re: segread vs. readseg

Andrzej Białecki-2
Stefan Neufeind wrote:

> Andrzej Bialecki wrote:
>> Stefan Groschupf wrote:
>>> Hi developers,
>>>
>>> we have commands like readdb and readlinkdb, but also segread. Wouldn't it be
>>> more consistent to name the command readseg instead of segread?
>>> ... just a thought.
>>
>> Yes, it seems more consistent. However, if we change it then scripts
>> people wrote would break. We could support both aliases in 0.8, and
>> give a deprecation message.
>>
>> What do others think?
>
> Same feeling here. Agreed.

What about the following?

Index: bin/nutch
===================================================================
--- bin/nutch    (revision 424960)
+++ bin/nutch    (working copy)
@@ -40,7 +40,7 @@
   echo "  generate          generate new segments to fetch"
   echo "  fetch             fetch a segment's pages"
   echo "  parse             parse a segment's pages"
-  echo "  segread           read / dump segment data"
+  echo "  readseg           read / dump segment data"
   echo "  mergesegs         merge several segments, with optional filtering and slicing"
   echo "  updatedb          update crawl db from segments after fetching"
   echo "  invertlinks       create a linkdb from parsed segments"
@@ -158,7 +158,10 @@
   CLASS=org.apache.nutch.crawl.CrawlDbMerger
 elif [ "$COMMAND" = "readlinkdb" ] ; then
   CLASS=org.apache.nutch.crawl.LinkDbReader
+elif [ "$COMMAND" = "readseg" ] ; then
+  CLASS=org.apache.nutch.segment.SegmentReader
 elif [ "$COMMAND" = "segread" ] ; then
+  echo "[DEPRECATED] Command 'segread' is deprecated, use 'readseg' instead."
   CLASS=org.apache.nutch.segment.SegmentReader
 elif [ "$COMMAND" = "mergesegs" ] ; then
   CLASS=org.apache.nutch.segment.SegmentMerger
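
The same alias-plus-deprecation idea can also be sketched as a small standalone function. This is only an illustration under stated assumptions: `resolve_class` is a hypothetical helper that does not exist in bin/nutch, and sending the warning to stderr is a variation on the patch above (which echoes to stdout), chosen here so that the function's stdout carries only the class name.

```shell
#!/bin/sh
# Hypothetical helper (not part of bin/nutch): map a command name to the
# Java class that implements it, accepting the old name as a deprecated alias.
resolve_class() {
  case "$1" in
    readseg)
      echo "org.apache.nutch.segment.SegmentReader" ;;
    segread)
      # Old alias: still works, but warn on stderr (assumption: the patch
      # above prints this message on stdout instead).
      echo "[DEPRECATED] Command 'segread' is deprecated, use 'readseg' instead." >&2
      echo "org.apache.nutch.segment.SegmentReader" ;;
    readlinkdb)
      echo "org.apache.nutch.crawl.LinkDbReader" ;;
    *)
      # Unknown command: report on stderr and signal failure to the caller.
      echo "unknown command: $1" >&2
      return 1 ;;
  esac
}
```

A caller would use it as `CLASS=$(resolve_class "$COMMAND") || exit 1`; both names resolve to the same class, and only the old one prints the warning.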





Re: segread vs. readseg

Stefan Groschupf-2
I like it!

On 24.07.2006 at 16:10, Andrzej Bialecki wrote:

> Stefan Neufeind wrote:
>> Andrzej Bialecki wrote:
>>> Stefan Groschupf wrote:
>>>> Hi developers,
>>>>
>>>> we have commands like readdb and readlinkdb, but also segread. Wouldn't it
>>>> be more consistent to name the command readseg instead of segread?
>>>> ... just a thought.
>>>
>>> Yes, it seems more consistent. However, if we change it then  
>>> scripts people wrote would break. We could support both aliases  
>>> in 0.8, and give a deprecation message.
>>>
>>> What do others think?
>>
>> Same feeling here. Agreed.
>
> What about the following?
>
> Index: bin/nutch
> ===================================================================
> --- bin/nutch    (revision 424960)
> +++ bin/nutch    (working copy)
> @@ -40,7 +40,7 @@
>   echo "  generate          generate new segments to fetch"
>   echo "  fetch             fetch a segment's pages"
>   echo "  parse             parse a segment's pages"
> -  echo "  segread           read / dump segment data"
> +  echo "  readseg           read / dump segment data"
>   echo "  mergesegs         merge several segments, with optional filtering and slicing"
>   echo "  updatedb          update crawl db from segments after fetching"
>   echo "  invertlinks       create a linkdb from parsed segments"
> @@ -158,7 +158,10 @@
>   CLASS=org.apache.nutch.crawl.CrawlDbMerger
> elif [ "$COMMAND" = "readlinkdb" ] ; then
>   CLASS=org.apache.nutch.crawl.LinkDbReader
> +elif [ "$COMMAND" = "readseg" ] ; then
> +  CLASS=org.apache.nutch.segment.SegmentReader
> elif [ "$COMMAND" = "segread" ] ; then
> +  echo "[DEPRECATED] Command 'segread' is deprecated, use 'readseg' instead."
>   CLASS=org.apache.nutch.segment.SegmentReader
> elif [ "$COMMAND" = "mergesegs" ] ; then
>   CLASS=org.apache.nutch.segment.SegmentMerger