Backwards compatibility strategy


Sami Siren-2
Hello all,

Currently there are many places in Nutch that try to handle older
formats of serialized data. In the longer run, at least, this will make
the code harder to understand, harder to test and harder to maintain.

IMO it would be cleaner to offer conversion through separate tools (like
CrawlDbConverter) and keep the rest of the code free of such
functionality. Opinions?

I personally favor starting from scratch when switching versions, but
there are probably users who wish to convert older data. Or are there?

--
 Sami Siren

Re: Backwards compatibility strategy

Doğacan Güney-3
Hi,

On Nov 22, 2007 7:45 PM, Sami Siren <[hidden email]> wrote:
> Hello all,
>
> Currently there are many places in Nutch that try to handle older
> formats of serialized data. In the longer run, at least, this will make
> the code harder to understand, harder to test and harder to maintain.
>
> IMO it would be cleaner to offer conversion through separate tools (like
> CrawlDbConverter) and keep the rest of the code free of such
> functionality. Opinions?

I disagree. Posts on nutch-user show that people get confused when we
break compatibility. If backward-compatibility code mixed into other
code is getting messy, then we can use conversion tools, but they should
be transparent to the regular user. For example, before a Nutch job
runs, a small program can check whether any conversion needs to be
applied (it can check compatibility by reading a few records of a
segment), then print a warning, run the conversion job first, and then
run the requested job.
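
A minimal sketch of such a pre-job check, written in Python only to
keep it short. Everything in it is an assumption made for illustration:
records are plain dicts carrying a "version" field, and
CURRENT_VERSION, needs_conversion, convert_record and
run_with_compat_check are hypothetical names, not existing Nutch APIs.

```python
import itertools
import warnings

CURRENT_VERSION = 2  # hypothetical current on-disk format version
SAMPLE_SIZE = 3      # how many records to inspect before deciding


def needs_conversion(records, sample_size=SAMPLE_SIZE):
    """Return True if any of the first few records uses an older format."""
    sample = itertools.islice(records, sample_size)
    return any(rec.get("version", 1) < CURRENT_VERSION for rec in sample)


def convert_record(rec):
    """Hypothetical upgrade: version 1 stored the fetch time under 'time'."""
    if rec.get("version", 1) >= CURRENT_VERSION:
        return rec
    upgraded = {k: v for k, v in rec.items() if k != "time"}
    upgraded["fetch_time"] = rec.get("time")
    upgraded["version"] = CURRENT_VERSION
    return upgraded


def run_with_compat_check(segment, job):
    """Check a list of records, convert if needed, then run the job."""
    if needs_conversion(segment):
        warnings.warn("segment uses an older format; converting first")
        segment = [convert_record(rec) for rec in segment]
    return job(segment)
```

The job itself only ever sees records in the current format, so the
compatibility handling stays in one place while the user still does not
have to run a converter by hand.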

>
> I personally favor starting from scratch when switching versions, but
> there are probably users who wish to convert older data. Or are there?
>
> --
>  Sami Siren
>



--
Doğacan Güney