Corrupt DFS edits-file


Corrupt DFS edits-file

Espen Amble Kolstad-2
Hi,

I'm running hadoop-0.9-dev and my edits-file has become corrupt. When I try
to start the namenode I get the following error:
2006-12-08 20:38:57,431 ERROR dfs.NameNode -
java.io.FileNotFoundException: Parent path does not exist:
/user/trank/dotno/segments/20061208154235/parse_data/part-00000
        at
org.apache.hadoop.dfs.FSDirectory$INode.addNode(FSDirectory.java:186)
        at
org.apache.hadoop.dfs.FSDirectory.unprotectedMkdir(FSDirectory.java:714)
        at org.apache.hadoop.dfs.FSEditLog.loadFSEdits(FSEditLog.java:254)
        at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:191)
        at
org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:320)
        at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:226)
        at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:142)
        at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:134)
        at org.apache.hadoop.dfs.NameNode.main(NameNode.java:585)

I've grepped through my edits-file to see what's wrong. It seems the
edits-file is missing an OP_MKDIR for
/user/trank/dotno/segments/20061208154235/parse_data.

Is there a tool for fixing an edits-file, or for inserting an OP_MKDIR?

- Espen

Re: Corrupt DFS edits-file

Albert Chern
This happened to me too, but the problem was the OP_MKDIR instructions
were in the wrong order.  That is, in the edits file the parent
directory was created after the child.  Maybe you should check to see
if that's the case.

I fixed it by using vi in combination with xxd.  When you have the
file open in vi, press Escape and issue the command "%!xxd".  This
will convert the binary file to hexadecimal.  Then you can search
through and perform the necessary edits.  I don't remember exactly what
the bytes were, but it was something like opcode, length of path (in
binary), then the path itself.  After you're done, issue the command
"%!xxd -r" to convert it back to binary.  Remember to back up your files
before you do this!  I also had to remove a trailing byte that got
appended for some reason during the binary/hex conversion.
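
Roughly, the round trip looked like this (the path below is just an example;
your edits file lives under whatever dfs.name.dir points to, and the exact
record layout may differ between versions):

    cp /path/to/dfs/name/edits /path/to/dfs/name/edits.bak    (back up first!)
    vi /path/to/dfs/name/edits
        :%!xxd          (filter the buffer through xxd into an editable hex dump)
        /parse_data     (find the records for the broken path and fix the hex)
        :%!xxd -r       (convert the hex dump back to binary)
        :wq             (write and quit; then check for a stray trailing byte)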

Anyhow, this is a serious bug and could lead to data loss for a lot of
people.  I think we should report it.


RE: Corrupt DFS edits-file

Dhruba Borthakur-2
Hi Albert and Espen,

With an eye on debugging more on this issue, I have the following questions:
1. Did you have more than one directory in dfs.name.dir?
2. Was this a new cluster, or an existing cluster that was recently
upgraded to 0.9.0?
3. Did any unnatural Namenode restarts occur immediately before the problem
started occurring?

With an eye on making it easier to recover from such a corruption:
1. Will it help to make the fsimage/edits files ASCII, so that they can be
easily edited by hand?
2. Does it make sense for HDFS to automatically create a directory
equivalent to /lost+found? While processing the EditLog, if the parent
directory of a file does not exist, the file could go into /lost+found?

Thanks,
dhruba


RE: Corrupt DFS edits-file

Christian Kunz
FYI: there is an open issue for this:
HADOOP-745

-Christian


Re: Corrupt DFS edits-file

Konstantin Shvachko
Could you please add your comments to HADOOP-745?
http://issues.apache.org/jira/browse/HADOOP-745

It could be helpful for whoever is going to fix it.


Re: RE: Corrupt DFS edits-file

Albert Chern
In reply to this post by Dhruba Borthakur-2
Hi Dhruba,

This happened some time ago so my memory's sketchy, but I'll do my
best to answer the questions:

> 1. Did you have more than one directory in dfs.name.dir?
> 2. Was this a new cluster, or an existing cluster that was recently
> upgraded to 0.9.0?

It happened after we restarted the DFS when upgrading from 0.7 to 0.8.
In the process, we also added multiple directories to dfs.name.dir.

> 3. Did any unnatural Namenode restarts occur immediately before the problem
> started occurring?

Not sure about this one.

> 1. Will it help to make the fsimage/edits files ASCII, so that they can be
> easily edited by hand?

That's a good idea, but I don't know if the effect on backwards
compatibility is worth it.  Editing these files is probably not
something that most people will do.  Maybe some sort of conversion
tool that goes from the binary to text and vice versa would be more
useful.

> 2. Does it make sense for HDFS to automatically create a directory
> equivalent to /lost+found? While processing the EditLog, if the parent
> directory of a file does not exist, the file could go into /lost+found?

Yes.  At least this way people can start up their DFSs after corruption.


Re: Corrupt DFS edits-file

Philippe Gassmann
In reply to this post by Espen Amble Kolstad-2


Hi all,

Some time ago, I had a similar issue
(http://issues.apache.org/jira/browse/HADOOP-760 that duplicates
http://issues.apache.org/jira/browse/HADOOP-227).

My first thought about this was to do automatic checkpointing by
merging the edits log into the fsimage (as described in HADOOP-227).

But this approach cannot work if the edits log is corrupted (i.e.
non-mergeable). So I believe we should think about another recovery method.

AFAIK, datanodes are only aware of the blocks they own. I think we
could add a little more information to each block: the path on
the filesystem and the block number. If the namenode is totally crashed
(corrupted edit log), the fs image could be rebuilt fairly easily by
querying all datanodes about their blocks.
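
Just to make the idea concrete, a purely hypothetical sketch (none of these
types exist in Hadoop; block ordering and replica handling are glossed over):

    import java.util.*;

    // Hypothetical: each datanode reports (path, block index, block id) per block.
    class ReportedBlock {
        String path;     // full file path, e.g. /user/.../part-00000/data
        int    index;    // position of the block within the file
        long   blockId;
    }

    class NamespaceRebuilder {
        // path -> blocks ordered by their index within the file
        private final Map<String, SortedMap<Integer, Long>> files =
            new TreeMap<String, SortedMap<Integer, Long>>();

        void accept(ReportedBlock b) {
            SortedMap<Integer, Long> blocks = files.get(b.path);
            if (blocks == null) {
                blocks = new TreeMap<Integer, Long>();
                files.put(b.path, blocks);
            }
            blocks.put(b.index, b.blockId);  // duplicate reports from replicas just overwrite
        }

        // Once all datanodes have reported, 'files' is enough to write a new image:
        // create every parent directory, then each file with its ordered block list.
    }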

WDYT ?

cheers,
--
Philippe.
Reply | Threaded
Open this post in threaded view
|

Re: Corrupt DFS edits-file

Konstantin Shvachko
Philippe,
Periodic checkpointing will bound the size of the edits file.
So it will not grow as big as it does now, and even if it does get
corrupted, that will be a relatively small amount of information compared
to the current state, where one can lose weeks of data if the name-node
is not restarted periodically.

Another thing is that the name-node should fall into safe mode when an
edit log transaction fails, and wait until the administrator fixes the
problem and turns safe mode off.

Espen,
I once had a corrupted edits file. I don't remember what was corrupted,
but the behavior was similar: the name-node wouldn't start. I included
some custom code in FSImage.loadFSImage to deal with the inconsistency.
Once the correct image was created I discarded the custom code.
In your case the log is trying to create a directory named
/user/trank/dotno/segments/20061208154235/parse_data/part-00000
which is wrong, since part-00000 is supposed to be a file.
Have you already restored your image?

--Konstantin


Re: Corrupt DFS edits-file

Andrzej Białecki-2
Konstantin Shvachko wrote:
> In your case the log is trying to create a directory named
> /user/trank/dotno/segments/20061208154235/parse_data/part-00000
> which is wrong, since part-00000 is supposed to be a file.

Not so - parse_data is created using MapFileOutputFormat, which creates
as many part-xxxxx subdirectories (MapFiles) as there are reduce tasks, and
puts {data, index} files in them. So .../parse_data/part-00000 should be a
directory.
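
So the expected layout is roughly this (one part-xxxxx MapFile directory
per reduce task; names here follow the example above):

    parse_data/
        part-00000/
            data
            index
        part-00001/
            data
            index
        ...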

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Urgent: Production Issues

Jagadeesh-3
In reply to this post by Konstantin Shvachko
Hi All,

I am running Hadoop 0.7.2 in a production environment and it has stored
~170GB of data. The deployment architecture I am using is described below.

I am using 4 nodes with 1.3TB of storage each, and the master node is not
being used for storage. So I have 5 servers in total, of which 4 are
running Hadoop nodes. This setup was working fine for the last 20-25 days
and there were no issues. As mentioned earlier, the total storage has now
gone up to ~170GB.

A couple of days back, I noticed an error where Hadoop was not accepting
new files: uploads always failed, but downloads were still working fine. I
was getting the exception "writing <filename>.crc failed". When I tried
restarting the service, I was getting the messages "jobtracker not
available" and "tasktracker not available". Then I had to kill all the
processes on the master node as well as on the client nodes to restart the
service.

After that everything worked fine for one more day, and now I keep getting
the message

failure closing block of file /user/root/.LICENSE.txt2233331.crc to node
node1:50010

Even if I restart the service, I get this message again after 10 minutes.

I read on the mailing list that this issue is resolved in 0.9.0, but I am a
bit skeptical about moving to 0.9.0 as I don't know whether I will end up
losing the files that are already stored. Kindly confirm this and I will
move to 0.9.0; please also tell me the steps or precautions I should take
before moving to 0.9.0.

Thanks and Regards
Jugs


RE: Urgent: Production Issues

Jagadeesh-3
Hi All,

Over the past day we have managed to migrate our clusters from 0.7.2 to
0.9.0. We had no hitches migrating our development cluster, and we just
finished upgrading our live cluster. Along the way we picked up a few tips
for anybody else who is taking the same path:

* Always take a backup of your edits and fsimage files before migrating.

* Since the live cluster was running for more than 3 months, we had some
problems with shutting it down - something we didn't foresee when we tried
it on a much smaller development cluster (since it is restarted frequently).
However I managed to do that by killing all the processes in the master
node as well as in the cluster nodes. There is no harm in doing that, I
believe, as the application interfacing with HDFS was shut down before
killing the processes.

* Please note that the Namenode was in safe mode for a longer period than I
expected; I believe it was trying to re-index the files.

* With this new release there were some hiccups initially with respect to
the number of simultaneous connections allowed and we solved it by
introducing an object pool in our application.

 

Also, as a level of redundancy with the primary index, we are now mirroring
it using SuSE Linux clustering (it is a commercial product as part of SuSE
Linux Enterprise Server 10.2). This is the best way we have found to
introduce further redundancy with the index, previously we had tried to
solve this problem by synching and using heartbeat, and we also had our own
solution which would synch and attempt to detect a failure by making
frequent requests, but neither worked as well as the SuSE server.

 

We created virtual environments (User modes) in 3 servers running SuSE and
the masternode is running within that. So any change in those environments
will be propagated to other 2 servers in the SuSE cluster and in the event
of one server going down, other will take charge and the application can
still communicate using the same hostname / ip address, a very fast and
stable solution. Failure detection we developed ourselves so that the
cluster could respond faster, with custom requests and responses to make
sure our application is functioning (eg. instead of doing a very simple ping
test to check the state of a server in the cluster, we would do a number of
application-level API calls to make sure that the server is not only alive,
but that it responds with the expected result and there is full data
integrity).

 

I would like to thank everybody who helped us out with the Hadoop aspects
of our storage cluster. If you are experiencing something similar to what
we are, feel free to contact me.


Merry X'mas and Happy New Year!!!

 

Thanks

Jugs


Re: Urgent: Production Issues

Doug Cutting
Jagadeesh wrote:
> Over the past day we have managed to migrate our clusters from 0.7.2 to
> 0.9.0.

Thanks for sharing your experiences.

Please note that there is now a 0.9.2 release.  There should be no
compatibility issues upgrading from 0.9.0 to 0.9.2, and a number of bugs
are fixed, so I would recommend that upgrade.

Cheers,

Doug

Re: Urgent: Production Issues

Konstantin Shvachko
In reply to this post by Jagadeesh-3
Jagadeesh wrote:

>Hi All,
>
>Over the past day we have managed to migrate our clusters from 0.7.2 to
>0.9.0. We had no hitches migrating our development cluster, and we just
>finished upgrading our live cluster. Along the way we picked up a few tips
>for anybody else who is taking the same path:
>
>* Always take a backup of your edits and fsimage files before migrating.
>  
>
There is a procedure describing the steps to take to make an upgrade safer:
http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
It is recommended to follow at least the non-optional steps during an upgrade.
An automatic upgrade implementation (coming soon) will make upgrades
less painful.
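
The backup step by itself is roughly the following, done while the cluster
is shut down (the directory is just an example; copy every directory listed
in dfs.name.dir):

    bin/stop-all.sh                                      (stop all daemons first)
    cp -rp /path/to/dfs/name /path/to/dfs/name.backup    (preserves fsimage and edits)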

>* Since the live cluster was running for more than 3 months, we had some
>problems with shutting it down - something we didn't foresee when we tried
>it on a much smaller development cluster (since it is restarted frequently).
>However I managed to do that by killing all the processes in the master
>node as well as in the cluster nodes. There is no harm in doing that, I
>believe, as the application interfacing with HDFS was shut down before
>killing the processes.
>
>* Please note that the Namenode was in safe mode for a longer period than I
>expected; I believe it was trying to re-index the files.
>  
>
The reason for that is most probably that the edits file got very big, so
the name-node had to spend a lot of time merging the edits with the image.
It is highly recommended to restart the cluster, or at least the name-node,
once in a while until periodic checkpointing is implemented.

>* With this new release there were some hiccups initially with respect to
>the number of simultaneous connections allowed and we solved it by
>introducing an object pool in our application.
>
>
>
>Also, as a level of redundancy with the primary index, we are now mirroring
>it using SuSE Linux clustering (it is a commercial product as part of SuSE
>Linux Enterprise Server 10.2). This is the best way we have found to
>introduce further redundancy with the index, previously we had tried to
>solve this problem by synching and using heartbeat, and we also had our own
>solution which would synch and attempt to detect a failure by making
>frequent requests, but neither worked as well as the SuSE server.
>  
>
Hadoop 0.9.* introduces multiple directories for replicating the name-space
image and edits files, so that if you lose one hard drive you have the same
image on another drive, potentially on another node accessible through NFS.
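
A minimal hadoop-site.xml sketch of that setting (the two directories are
just examples; the second one could be an NFS mount):

    <property>
      <name>dfs.name.dir</name>
      <value>/disk1/dfs/name,/mnt/remote/dfs/name</value>
      <!-- comma-separated list; the name-node writes the image and edits
           to every directory in the list -->
    </property>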

>We created virtual environments (User modes) in 3 servers running SuSE and
>the masternode is running within that. So any change in those environments
>will be propagated to other 2 servers in the SuSE cluster and in the event
>of one server going down, other will take charge and the application can
>still communicate using the same hostname / ip address, a very fast and
>stable solution. Failure detection we developed ourselves so that the
>cluster could respond faster, with custom requests and responses to make
>sure our application is functioning (eg. instead of doing a very simple ping
>test to check the state of a server in the cluster, we would do a number of
>application-level API calls to make sure that the server is not only alive,
>but that it responds with the expected result and there is full data
>integrity).
>  
>
Very interesting.

>