best file system for NDFS?

Stefan Groschupf-2
Hi geeks,

I don't have much deep knowledge of Unix file systems, so my question
is: what would be the best file system for the data nodes of the Nutch
distributed file system (NDFS)?
Does it make any difference which file system is used?
Would ReiserFS be a good choice?

Thanks for any comments from the unix experts.
Stefan

Re: best file system for NDFS?

Andrzej Białecki-2
Stefan Groschupf wrote:

> Hi geeks,
>
> I don't have much deep knowledge of Unix file systems, so my question
> is: what would be the best file system for the data nodes of the Nutch
> distributed file system (NDFS)?
> Does it make any difference which file system is used?
> Would ReiserFS be a good choice?


Most of the time we deal with very large files, with sequential access.
Only in a few places do we deal with a lot of small files (e.g. indexing).
So, I think the best would be an FS optimized for efficient sequential
write/read of large files.

Is reiserfs such an FS? I'm not sure. I think these requirements point
rather to a fairly primitive FS (not FAT - a real FS ;-) ); perhaps
reiserfs is too complex.

When in doubt, test.
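
A minimal sketch of such a test, assuming Python is available on the data
node. The function name sequential_throughput and the default sizes are
made up for illustration and are not part of Nutch or NDFS. It writes one
large file sequentially into a directory on the file system under test,
reads it back, and reports MB/s for both. Note that a freshly written
file is usually served from the OS page cache, so for meaningful read
numbers the file should be larger than RAM (or the cache dropped first).

```python
import os
import tempfile
import time


def sequential_throughput(directory, size_mb=64, chunk_kb=1024):
    """Write, then read, one large file sequentially.

    Returns (write_mb_per_s, read_mb_per_s). All names and defaults here
    are illustrative assumptions, not Nutch/NDFS code.
    """
    chunk = b"\0" * (chunk_kb * 1024)
    n_chunks = (size_mb * 1024) // chunk_kb
    written_mb = n_chunks * chunk_kb / 1024
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Sequential write, forced to disk so the timing includes the flush.
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())
        write_s = max(time.perf_counter() - start, 1e-9)

        # Sequential read of the same file, chunk by chunk.
        start = time.perf_counter()
        total = 0
        with open(path, "rb") as f:
            while True:
                data = f.read(len(chunk))
                if not data:
                    break
                total += len(data)
        read_s = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)
    if total != n_chunks * len(chunk):
        raise IOError("read back fewer bytes than were written")
    return written_mb / write_s, written_mb / read_s
```

Running it once per candidate file system, with the directory pointing at
a mount of that file system, gives numbers that can be compared directly.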

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: best file system for NDFS?

Rod Taylor-2
On Tue, 2005-12-13 at 21:43 +0100, Andrzej Bialecki wrote:
>
> Most of the time we deal with very large files, with sequential
> access.
> Only in a few places do we deal with a lot of small files (e.g. indexing).
> So, I think the best would be an FS optimized for efficient
> sequential
> write/read of large files.

But beware of what happens if you run more than one task per machine.
Each individual task might be sequential, but several running in
parallel will generate plenty of disk head movement that approximates
random I/O, especially on a filesystem that uses small blocks and a
driver with poor read-ahead support.
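
A rough way to observe this effect, sketched in Python under the
assumption that each task can be modeled as one thread reading its own
file sequentially. The names concurrent_read_throughput and _read_all are
hypothetical helpers, not Nutch code. Comparing the aggregate MB/s for
n_readers=1 against larger values shows how much concurrent streams cost
on a given file system. As with any read test, the files must be larger
than the page cache for the disk to be exercised at all.

```python
import os
import tempfile
import threading
import time


def _read_all(path, chunk_size, counts, index):
    """Read one file sequentially and record how many bytes were read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            total += len(data)
    counts[index] = total


def concurrent_read_throughput(directory, n_readers=4, size_mb=16, chunk_kb=64):
    """Aggregate MB/s when n_readers threads each read their own file."""
    chunk_size = chunk_kb * 1024
    paths = []
    counts = [0] * n_readers
    try:
        # Create one file per reader (illustrative sizes, not tuned).
        for _ in range(n_readers):
            fd, path = tempfile.mkstemp(dir=directory)
            paths.append(path)
            with os.fdopen(fd, "wb") as f:
                f.write(os.urandom(1024 * 1024) * size_mb)
                f.flush()
                os.fsync(f.fileno())

        threads = [
            threading.Thread(target=_read_all, args=(p, chunk_size, counts, i))
            for i, p in enumerate(paths)
        ]
        start = time.perf_counter()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        for p in paths:
            os.remove(p)
    return sum(counts) / (1024 * 1024) / elapsed
```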

--
Rod Taylor <[hidden email]>


Re: best file system for NDFS?

Leen Toelen
In reply to this post by Andrzej Białecki-2
I would say the same. I don't think anyone can predict what will
happen, so I suggest someone runs some tests with different
filesystems AND different block sizes etc. Results will probably
differ across hardware as well.
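
One possible shape for such a sweep, as a Python sketch. The function
names sweep_chunk_sizes and fs_block_size and the chosen sizes are
assumptions for illustration only. It times an unbuffered sequential
write at several application I/O sizes and reports the block size the
file system itself advertises through os.statvfs (Unix only). Changing
the file system's own block size still means reformatting the partition
and rerunning the sweep.

```python
import os
import tempfile
import time


def fs_block_size(directory):
    """Block size reported by the file system holding `directory` (Unix)."""
    return os.statvfs(directory).f_bsize


def sweep_chunk_sizes(directory, chunk_sizes_kb=(4, 64, 1024), size_mb=8):
    """Time a sequential write per I/O size; returns {chunk_kb: MB/s}."""
    results = {}
    for chunk_kb in chunk_sizes_kb:
        chunk = b"\0" * (chunk_kb * 1024)
        n_chunks = (size_mb * 1024) // chunk_kb
        written_mb = n_chunks * chunk_kb / 1024
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            start = time.perf_counter()
            # buffering=0 so each write() call reaches the OS at this size.
            with os.fdopen(fd, "wb", buffering=0) as f:
                for _ in range(n_chunks):
                    f.write(chunk)
                os.fsync(f.fileno())
            elapsed = max(time.perf_counter() - start, 1e-9)
        finally:
            os.remove(path)
        results[chunk_kb] = written_mb / elapsed
    return results
```

Repeating the sweep on each candidate file system and on each type of
machine keeps the hardware differences visible in the results.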

Regards,
Leen Toelen

