NDFS question

9 messages
NDFS question

Egor Chernodarov
Hello!

I want to test NDFS on my Nutch installation, but I have run into a problem.
I started from the wiki, which has a quick demo for NDFS:
http://wiki.apache.org/nutch/NutchDistributedFileSystem

On "$ nutch ndfs -put local_file /test/testfile" (or "./nutch admin db
-create", etc.) I always get the exception "Could not obtain new output block":
=======================================================================
050830 061956 Waiting to find target node
Exception in thread "main" java.io.IOException: Could not obtain new
output block for file /test/testfile
        at org.apache.nutch.ndfs.NDFSClient$NameNodeCaller.getNewOutputBlock(NDFSClient.java:921)
        at org.apache.nutch.ndfs.NDFSClient$NDFSOutputStream.nextBlockOutputStream(NDFSClient.java:616)
        at org.apache.nutch.ndfs.NDFSClient$NDFSOutputStream.<init>(NDFSClient.java:597)
        at org.apache.nutch.ndfs.NDFSClient.create(NDFSClient.java:85)
        at org.apache.nutch.fs.NDFSFileSystem.create(NDFSFileSystem.java:76)
        at org.apache.nutch.fs.NDFSFileSystem.create(NDFSFileSystem.java:71)
        at org.apache.nutch.io.SequenceFile$Writer.<init>(SequenceFile.java:80)
        at org.apache.nutch.io.MapFile$Writer.<init>(MapFile.java:94)
        at org.apache.nutch.db.WebDBWriter.<init>(WebDBWriter.java:1507)
        at org.apache.nutch.db.WebDBWriter.createWebDB(WebDBWriter.java:1438)
        at org.apache.nutch.tools.WebDBAdminTool.main(WebDBAdminTool.java:172)
=======================================================================

On the namenode I see something like this:
=======================================================================
050830 061445 Pending transfer from server.domain.com:7000 to 3 destinations
050830 061447 Renewed lease [Lease.  Holder: NDFSClient_-1094164187, heldlocks: 1, pendingcreates: 1]
050830 061448 Pending transfer from server.domain.com:7000 to 3 destinations
050830 061451 Pending transfer from server.domain.com:7000 to 3 destinations
050830 061454 Pending transfer from server.domain.com:7000 to 3 destinations
050830 061455 Renewed lease [Lease.  Holder: NDFSClient_-1094164187, heldlocks: 1, pendingcreates: 1]
050830 061457 Pending transfer from server.domain.com:7000 to 3 destinations
050830 061500 Pending transfer from server.domain.com:7000 to 3 destinations
050830 061503 Pending transfer from server.domain.com:7000 to 3 destinations
050830 061503 Renewed lease [Lease.  Holder: NDFSClient_-1094164187, heldlocks: 1, pendingcreates: 1]
=======================================================================

But if I run the datanode and the namenode on the same server, everything works!

With "$ nutch ndfs -report" I see the list of my datanodes, but the
datanodes are identified by their external hostnames. I think the namenode
tries to connect to the datanodes by these non-local hostnames, which is
impossible because the firewall does not allow incoming connections from
external network interfaces to that port (7000).

Is that right? Could this be the cause of the error?

So, can you please tell me how I can configure the namenode to use the
local interfaces for data transfer? I can't reconfigure the firewall.

Red Hat ES 3.0, nutch-2005-08-25 (>nutch-0.7).
$ java -version
java version "1.4.2-01"
Java(TM) 2 Runtime Environment, Standard Edition (build Blackdown-1.4.2-01)
Java HotSpot(TM) 64-Bit Server VM (build Blackdown-1.4.2-01, mixed mode)


Thanks for your time!


--
Best regards,
 Chernodarov Egor
                               


Another NDFS question

Ian C. Blenke
Would there be any interest in a FUSE (well, FUSE-J) or FiST system-level
filesystem presentation?

I've written CornFS to solve an internal cluster storage problem, but
NDFS looks like it would address the distributed archival problem with
an eye toward retrieval.
(http://ian.blenke.com/blog/projects/cornfs/cornfs.html)

Since Lucene/Apache are more multi-platform, something akin to a WebDAV
backend might be more appropriate.

Once NDFS is exposed to userspace for scripts to use, admin types will
embrace it for managing the cluster.

It might not be a focus now, but it seems to be low-hanging fruit
that would only help the project.

 - Ian C. Blenke <[hidden email]> http://ian.blenke.com



Re: NDFS question

Doug Cutting-2
In reply to this post by Egor Chernodarov
It sounds like you're using a nightly build of trunk.  The NDFS code in
trunk is old.  The NDFS code is currently maintained in a branch named
"mapred".  Please check out the mapred branch and retry.

svn co https://svn.apache.org/repos/asf/lucene/nutch/branches/mapred/

Doug


Re: Another NDFS question

Doug Cutting-2
In reply to this post by Ian C. Blenke
Ian C. Blenke wrote:
> When NDFS is exposed to userspace for scripts to use, admins types will
> embrace it for managing the cluster.

Our intent is to add some servlets which run on each datanode providing
access to the filesystem for non-Java programs.

Most operations would be quite simple, e.g.:

- to write a file, post its content to a URL like:
   http://datanode:XXXX/write?name=my.file

- to read a file, get file content from URLs like:
   http://datanode:XXXX/read?name=my.file
   http://datanode:XXXX/read?name=my.file&start=2048&length=1024

- to remove a file:
   http://datanode:XXX/remove?name=my.file

Similarly for rename, copy, etc.
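As a rough illustration, a client could assemble these URLs mechanically. The sketch below (Java, to match the rest of the thread's code) builds the hypothetical read and write URLs; the endpoint paths, parameter names, host, and port are placeholders from the proposal above, not an existing NDFS API:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: builds the hypothetical datanode URLs described above.
// The /read and /write endpoints are the proposal from this message,
// not a shipped API; host and port are placeholders.
public class NdfsUrls {
    static String readUrl(String host, int port, String name, long start, long length)
            throws UnsupportedEncodingException {
        // URL-encode the file name in case it contains special characters
        return "http://" + host + ":" + port + "/read?name="
                + URLEncoder.encode(name, "UTF-8")
                + "&start=" + start + "&length=" + length;
    }

    static String writeUrl(String host, int port, String name)
            throws UnsupportedEncodingException {
        return "http://" + host + ":" + port + "/write?name="
                + URLEncoder.encode(name, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readUrl("datanode", 8080, "my.file", 2048, 1024));
        System.out.println(writeUrl("datanode", 8080, "my.file"));
    }
}
```

URL-encoding the name guards against file names containing characters such as spaces or `&`.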

The only somewhat complicated thing would be directory listings.  These
would be handled with a simple REST interface, where some simple XML is
returned.  Ideally a stylesheet could be specified so that one can use
the directory listing URL to view the filesystem from a browser.

These servlets could easily be implemented in terms of the
NutchFileSystem API, and deployed with Jetty.  To my knowledge, no one
is currently working on this.  A volunteer would be welcome.

Doug

Re: Another NDFS question

Ian C. Blenke
Doug Cutting wrote:

> Ian C. Blenke wrote:
>
>> When NDFS is exposed to userspace for scripts to use, admins types
>> will embrace it for managing the cluster.
>
> Our intent is to add some servlets which run on each datanode
> providing access to the filesystem for non-Java programs.
>
> Most operations would be quite simple, e.g.:
>
> - to write a file, post its content to a URL like:
>   http://datanode:XXXX/write?name=my.file
>
> - to read a file, get file content from URLs like:
>   http://datanode:XXXX/read?name=my.file
>   http://datanode:XXXX/read?name=my.file&start=2048&length=1024
>
> - to remove a file:
>   http://datanode:XXX/remove?name=my.file
>
> Similarly for rename, copy, etc.

Not very RESTful, but simple.

> The only somewhat complicated thing would be directory listings.
> These would be handled with a simple REST interface, where some simple
> XML is returned.  Ideally a stylesheet could be specified so that one
> can use the directory listing URL to view the filesystem from a browser.

From a bash scripting standpoint, this would be complicated to access
without a userspace command to wrap it.

A RESTish interface works well for perl/python/ruby, though I think they
would much rather have a native object wrapper (SWIG something together).

> These servlets could easily be implemented in terms of the
> NutchFileSystem API, and deployed with Jetty.  To my knowledge, no one
> is currently working on this.  A volunteer would be welcome.

If portability is a key goal, FUSE or FiST probably aren't ideal (no
Windows or OS X ports, for example).

A simple WebDAV interface seems like the closest thing to a standard
that you are attempting to approximate with the RESTful interface. The
added benefit would be support from davfs2, Finder, Microsoft
Web Folders, etc.

Perhaps something that plugs into Jakarta Slide? An NDFS backend to Slide
would potentially benefit a distributed CMS as well (without a
versioning history, as that appears to be beyond the scope of NDFS).

I would be interested in implementing something like this if there is
indeed interest.

- Ian C. Blenke <[hidden email]> http://ian.blenke.com/



Re: Re: Another NDFS question

Erik Hatcher
In reply to this post by Doug Cutting-2
What you've just described, Doug, is WebDAV!  There is an
implementation of it built into Tomcat, but a more full-featured
version is Slide - http://jakarta.apache.org/slide/ .

There is also a JSR (#170) for a content repository, being implemented
in open source as Jackrabbit:
http://incubator.apache.org/projects/jackrabbit.html

Apache's mod_dav is also well worth mentioning, as it is extensible
and surely quite fast.

I'm not sure how well any of these that I've mentioned jibe with the
goals of NDFS.  I have done a fair bit of homework on WebDAV in the
past, once even implementing a prototype server before Slide was viable.

     Erik





Re: Another NDFS question

Doug Cutting-2
In reply to this post by Ian C. Blenke
Ian C. Blenke wrote:
>> The only somewhat complicated thing would be directory listings.
>> These would be handled with a simple REST interface, where some simple
>> XML is returned.  Ideally a stylesheet could be specified so that one
>> can use the directory listing URL to view the filesystem from a browser.
>
> From a bash scripting standpoint, this would be complicated to access
> without a userspace command to wrap it.

Good point.  WebDAV has cadaver for shell access, so maybe WebDAV is
the way to go.

> A simple WebDAV interface seems like the closest thing to a standard
> that you are attempting to approximate with the RESTful interface. The
> added benefit would be support from DavFS2, Finder, Microsoft
> Webfolders, etc.
>
> Perhaps something that plugs into Jakarta Slide? A NDFS backend to Slide
> would potentially benefit a distributed CMS as well (without  a
> versioning history, as that appears to be beyond the scope of NDFS).
>
> I would be interested in implementing something like this if there is
> indeed interest.

That would be great!

NDFS is designed to reliably and efficiently support very large data
collections.  It is not designed to be a full-featured replacement for
desktop filesystems, but rather is a lean-and-mean storage system for
distributed computations.  Its primary users are developers and system
administrators.  Such folks don't require fancy graphical user
interfaces, though they are a nice bonus.  Programmatic access from
non-Java programs is also a goal.  Easy publishing from, e.g., web
authoring tools is not a goal.

WebDAV looks to me to meet these needs without too much baggage.  It may
encourage non-target audiences to use NDFS, but we can deal with that as
a documentation issue.  For example, sophisticated versioning, security
and permission systems are outside the scope of NDFS.

Doug

Re[2]: NDFS question

Egor Chernodarov
In reply to this post by Doug Cutting-2
Hello, Doug!

I tried the "mapred" branch, but I still get errors like this:
$./nutch ndfs -put ./test.txt /test.txt
=====================
050831 055936 Client connection to 192.168.0.170:9000: starting
050831 060245 Waiting to find target node
=====================
On namenode I see :
050831 055936 Server connection on port 9000 from 192.168.0.170: starting

At the same time, "$ ./nutch ndfs -report" works fine:
=====================
Total effective bytes: 0 (0.0 k)
Effective replication multiplier: Infinity
-------------------------------------------------
Datanodes available: 1

Name: server.domain.com:7000
Total raw bytes: 75487932416 (70.30 Gb)
Used raw bytes: 7289752863 (6.78 Gb)
% used: 9.65%
Last contact with namenode: Wed Aug 31 06:08:32 CDT 2005
=====================

What else can I try? I'm really interested in NDFS...

Thanks for any help.





--
Best regards,              
 Chernodarov Egor


Re[2]: NDFS question

Egor Chernodarov
In reply to this post by Doug Cutting-2
Hello, Doug!

 I have fixed my problem. As I suspected, the problem was with the network
 interfaces: the datanode takes the internet (external) address instead of
 the local one. I believe this can be configured somewhere in the virtual
 machine, but I can't find where.

 I think many people have several IPs per server but need to use a
 specific IP for NDFS. My solution for this situation is below.
 Changes in NDFS/DataNode.java:
--------------------------------------------------------------------
   public DataNode(NutchConf conf, String datadir) throws IOException {
        this(/*InetAddress.getLocalHost().getHostName(),*/
             conf.get(InetAddress.getLocalHost().getHostName() + ".realip",
                      InetAddress.getLocalHost().getHostName()),
             new File(datadir),
             createSocketAddr(conf.get("fs.default.name", "local")));
--------------------------------------------------------------------

And now we can define the hostname or IP for NDFS in nutch-site.xml like this:
<property>
  <name>your.hostname.here.realip</name>
  <value>192.168.0.24</value>
</property>
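The key behavior of that change is the lookup fallback: conf.get(hostname + ".realip", hostname) returns the configured address if one is set, and otherwise falls back to the hostname itself. A minimal sketch of that semantics, using a plain Map as a stand-in for NutchConf (an assumption for illustration only):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the lookup semantics in the patch above:
// conf.get(hostname + ".realip", hostname) yields the configured address
// if one is set, otherwise falls back to the hostname itself.
// A plain Map stands in for NutchConf here (illustration only).
public class RealIpLookup {
    static String resolve(Map<String, String> conf, String hostname) {
        return conf.getOrDefault(hostname + ".realip", hostname);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("server.domain.com.realip", "192.168.0.24");
        System.out.println(resolve(conf, "server.domain.com")); // configured IP
        System.out.println(resolve(conf, "other.host"));        // falls back to hostname
    }
}
```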


 P.S. I've used the nutch-mapred release.




--
Best regards,              
 Chernodarov Egor