NDFS Bug, Mapred from SVN - Tokenizer and New Line Error


Jon Shoberg
I'm trying to start an NDFS datanode and keep getting the following error:

[jon@vortech nutchmapre]$ bin/nutch datanode
050728 213401 10 parsing file:/usr/local/nutchmapre/conf/nutch-default.xml
050728 213402 10 parsing file:/usr/local/nutchmapre/conf/nutch-site.xml
050728 213402 10 Opened server at 7000
050728 213402 11 Starting DataNode in: /tmp/nutch/ndfs/data/data
050728 213402 11 Exception: java.util.NoSuchElementException
050728 213402 11 Lost connection to namenode.  Retrying...

I opened the source and added a stack trace to
src/java/org/apache/nutch/ndfs/DataNode.java

    551   public void run() {
    552     LOG.info("Starting DataNode in: "+data.data);
    553     while (true) {
    554       try {
    555         offerService();
    556       } catch (Exception ex) {
    557         LOG.info("Exception: " + ex);
    558         LOG.info("Lost connection to namenode.  Retrying...");
    559         ex.printStackTrace(); /*** Added by [hidden email] ***/
    560         try {
    561           Thread.sleep(5000);
    562         } catch (InterruptedException ie) {
    563         }
    564       }
    565     }
    566   }

The stack trace shows the following:

java.util.NoSuchElementException
        at java.util.StringTokenizer.nextToken(StringTokenizer.java:259)
        at org.apache.nutch.ndfs.DF.<init>(DF.java:52)
        at org.apache.nutch.ndfs.FSDataset.getCapacity(FSDataset.java:204)
        at org.apache.nutch.ndfs.DataNode.offerService(DataNode.java:134)
        at org.apache.nutch.ndfs.DataNode.run(DataNode.java:555)
        at java.lang.Thread.run(Thread.java:534)

Looking at the code in src/java/org/apache/nutch/ndfs/DF.java:

     38     Process process = Runtime.getRuntime().exec(new String[] {"df","-k",path});
     39
     40     try {
     41       if (process.waitFor() == 0) {
     42         BufferedReader lines =
     43           new BufferedReader(new InputStreamReader(process.getInputStream()));
     44
     45         lines.readLine();                         // skip headings
     46
     47         StringTokenizer tokens =
     48           new StringTokenizer(lines.readLine(), " \t\n\r\f%");
     49
     50         this.filesystem = tokens.nextToken();
     51         this.capacity = Long.parseLong(tokens.nextToken()) * 1024;
     52         this.used = Long.parseLong(tokens.nextToken()) * 1024;
     53         this.available = Long.parseLong(tokens.nextToken()) * 1024;
     54         this.percentUsed = Integer.parseInt(tokens.nextToken());
     55         this.mount = tokens.nextToken();
     56
     57       } else {
     58         throw new IOException
     59           (new BufferedReader(new InputStreamReader(process.getErrorStream()))
     60            .readLine());
     61       }
There is a call to "df -k". Here is the output from my df -k:

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                     286735816  31398804 240771636  12% /
/dev/hda1               101086     31962     63905  34% /boot
none                    387484         0    387484   0% /dev/shm
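
Note the entry for /dev/mapper/VolGroup00-LogVol00: the device name is
long enough that df wraps the numeric columns onto a second line.  As a
quick standalone illustration (my own snippet, not code from the Nutch
tree), tokenizing just that wrapped first line with the same delimiters
DF.java uses yields a single token, so the second call to nextToken()
throws exactly the exception in the trace above:

    import java.util.StringTokenizer;

    public class DFRepro {
      public static void main(String[] args) {
        // First data line of the df -k output above; df pushed the
        // numeric columns onto the next line of output.
        String firstLine = "/dev/mapper/VolGroup00-LogVol00";

        // Same delimiters as DF.java
        StringTokenizer tokens = new StringTokenizer(firstLine, " \t\n\r\f%");

        System.out.println(tokens.nextToken()); // /dev/mapper/VolGroup00-LogVol00
        System.out.println(tokens.nextToken()); // java.util.NoSuchElementException
      }
    }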

I'm sure this email will not preserve the formatting of the df output
100%, but that extra newline is what trips up DF.java.  This should be
easy to fix; I have some local hacks (a rough sketch of the idea follows
below) and may be able to submit something more final.
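
The sketch covers the relevant portion of the DF constructor only (an
outline, not a tested patch): after reading the filesystem name, check
whether any tokens remain and, if not, tokenize the next line of df
output before parsing the numeric columns.

        lines.readLine();                         // skip headings

        StringTokenizer tokens =
          new StringTokenizer(lines.readLine(), " \t\n\r\f%");

        this.filesystem = tokens.nextToken();

        // If the device name was long, df wrapped the entry and the
        // numeric columns are on the next line; tokenize that line instead.
        if (!tokens.hasMoreTokens()) {
          tokens = new StringTokenizer(lines.readLine(), " \t\n\r\f%");
        }

        this.capacity = Long.parseLong(tokens.nextToken()) * 1024;
        this.used = Long.parseLong(tokens.nextToken()) * 1024;
        this.available = Long.parseLong(tokens.nextToken()) * 1024;
        this.percentUsed = Integer.parseInt(tokens.nextToken());
        this.mount = tokens.nextToken();

Alternatively, passing the POSIX -P option to df keeps each filesystem on
a single line with GNU df, but I have not checked how portable that is
across the platforms people run this on.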

-j