How to use Hadoop's API to write to a file in append mode?

fatcat-2
Hi,
If a file already exists in HDFS and I want to write to it in append mode,
how do I do this?

Thanks.

RE: How to use Hadoop's API to write to a file in append mode?

Mahadev Konar
Hi Wayne,
HDFS does not support append mode. It is a write-once file system.

Regards
Mahadev

> -----Original Message-----
> From: Wayne Liu [mailto:[hidden email]]
> Sent: Sunday, May 27, 2007 8:29 AM
> To: [hidden email]
> Subject: How to use Hadoop's API to write to a file in append mode?
>
> Hi,
> If a file already exists in HDFS and I want to write to it in append mode,
> how do I do this?
>
> Thanks.


Re: How to use Hadoop's API to write to a file in append mode?

fatcat-2
Thanks, Mahadev.

But as you know, the data we crawl from the web grows little by little.
If Hadoop does not support append mode, then we have to create a new file
every time we get new data. I do not think that's a good idea.

RE: How to use Hadoop's API to write to a file in append mode?

Yoram Arnon
This does impose some limitations, not so much on crawling as on making the information available.
You'd crawl, writing to a file until it's large enough or until some time has elapsed, then close the file and
start a new one. At that point the closed file becomes available for applications to consume, and it also becomes immutable.
If you want something like a daily file, you can set up a daily directory instead, since directories are fully modifiable
while files are not.
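
The rolling pattern described above could be sketched roughly as follows. This is only an illustration, not Hadoop code: it uses java.nio.file against the local file system as a stand-in so that it runs without a cluster, and the class name RollingWriter, the part-NNNNN file naming, and the maxBytes threshold are all made up for the example. On HDFS the same shape would presumably go through org.apache.hadoop.fs.FileSystem.create(Path) instead of Files.newOutputStream. Only the size-based roll is shown; a time-based roll would be a similar check.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.LocalDate;

/**
 * Hypothetical helper: writes records into a sequence of write-once files,
 * grouped under one directory per day. A file is never reopened once closed.
 */
class RollingWriter implements AutoCloseable {
    private final Path baseDir;   // root under which daily directories are created
    private final long maxBytes;  // assumed threshold: roll once this many bytes are written
    private OutputStream out;     // currently open file, or null if none is open
    private long written;         // bytes written to the current file
    private int sequence;         // per-writer counter used to name files

    RollingWriter(Path baseDir, long maxBytes) {
        this.baseDir = baseDir;
        this.maxBytes = maxBytes;
    }

    /** Appends one record to the current file, rolling to a new file when full. */
    void write(String record) {
        byte[] bytes = (record + "\n").getBytes(StandardCharsets.UTF_8);
        try {
            if (out == null) {
                openNext();
            }
            out.write(bytes);
            written += bytes.length;
            if (written >= maxBytes) {
                close(); // the closed file is now complete and is never modified again
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates the next file inside today's directory, e.g. 2007-05-29/part-00000. */
    private void openNext() throws IOException {
        Path dayDir = baseDir.resolve(LocalDate.now().toString());
        Files.createDirectories(dayDir);
        Path file = dayDir.resolve(String.format("part-%05d", sequence++));
        // CREATE_NEW fails if the file already exists, which mirrors write-once
        // semantics. The counter restarts at 0 in a new process, so a real
        // version would need a naming scheme that survives restarts.
        out = Files.newOutputStream(file, StandardOpenOption.CREATE_NEW);
        written = 0;
    }

    /** Closes the current file, if any; the next write starts a new one. */
    @Override
    public void close() {
        if (out == null) {
            return;
        }
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            out = null;
        }
    }
}
```

A crawler would keep one such writer open, call write(...) for each fetched record, and call close() on shutdown. Consumers would then read only the files that have already been closed.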

The write-once limitation on files may be removed at some point, but probably not in the near future.

Yoram

-----Original Message-----
From: Wayne Liu [mailto:[hidden email]]
Sent: Tuesday, May 29, 2007 4:50 AM
To: [hidden email]
Subject: Re: How to use Hadoop's API to write to a file in append mode?

Thanks, Mahadev.

But as you know, the data we crawl from the web grows little by little.
If Hadoop does not support append mode, then we have to create a new file
every time we get new data. I do not think that's a good idea.