Running into an Issue

9 messages
Running into an Issue

Jamal, Sarfaraz
So I feel I have made some progress on Nutch.

However, I am now getting another error that I am having difficulty working through:

bin/nutch inject TestCrawl/crawldb url

produces the output below.

Do you have to run Cygwin under Admin for it to work?

Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
        at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
        at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:570)
        at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
        at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:173)
        at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:160)
        at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:94)
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
        at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:131)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
        at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:432)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Unknown Source)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:376)
        at org.apache.nutch.crawl.Injector.run(Injector.java:467)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.nutch.crawl.Injector.main(Injector.java:441)
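[Editor's note: this `UnsatisfiedLinkError` on `NativeIO$Windows.access0` usually means Hadoop cannot find its Windows native binaries (`winutils.exe` and `hadoop.dll`). As a minimal sketch, assuming a conventional Hadoop layout (the install path is hypothetical, not taken from this thread), a pre-flight check before running `bin/nutch` might look like:]

```shell
#!/bin/sh
# Sketch: verify the Hadoop native Windows binaries exist before running
# bin/nutch. The HADOOP_HOME path passed in is an assumption -- point it
# at your own install.
check_hadoop_native() {
    home="$1"
    missing=0
    for f in winutils.exe hadoop.dll; do
        if [ ! -f "$home/bin/$f" ]; then
            echo "missing: $home/bin/$f"
            missing=1
        fi
    done
    return $missing
}

# Example usage (the path is a placeholder):
# check_hadoop_native "/cygdrive/c/hadoop" && echo "native binaries present"
```

If either file is missing, obtaining binaries built for your Hadoop version and pointing `HADOOP_HOME` at them is the commonly reported fix.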

RE: Running into an Issue

Markus Jelsma-2
Hi, there are some Windows API calls in there that I will never understand. Are you working with some kind of symlinks, or whatever they are called in Windows? There must be something preventing Nutch/Hadoop from getting access to your disk. Check permissions, disk space, and whatever else you can think of.

M.
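[Editor's note: the permission and disk-space checks Markus suggests can be sketched as a small probe. This is illustrative only; the directory to test (e.g. Hadoop's local temp dir) is an assumption:]

```shell
#!/bin/sh
# Sketch of the suggested checks: confirm a directory is writable and
# report the free space on its filesystem. The directory is an argument;
# which directory matters for your setup is an assumption.
probe_dir() {
    d="$1"
    [ -w "$d" ] || { echo "not writable: $d"; return 1; }
    # -P forces POSIX single-line output so awk can pick the Available column
    df -kP "$d" | awk 'NR==2 {print "available KB:", $4}'
}

# Example usage (path is a placeholder):
# probe_dir /tmp/hadoop-local
```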
 

RE: Running into an Issue

Jamal, Sarfaraz
It's actually running under Cygwin on top of Windows. But yes, I do believe Cygwin might be using symbolic links.
However, it is not running under Administrator.

I will try to get it up on a sandbox or virtual machine that I can have Admin access to.

Thanks for your help,

Sas


RE: Running into an Issue

Jamal, Sarfaraz
In reply to this post by Markus Jelsma-2
Hi Markus,

So I am now running this on a machine that I have administrative access to, and I am getting a different error message:

Do you (or anyone) have any ideas?

$ ./nutch inject ../TestCrawl/crawldb ../url [I tried from the bin folder and the root nutch folder]

Injector: starting at 2016-07-13 10:02:54
Injector: crawlDb: ../TestCrawl/crawldb
Injector: urlDir: ../url
Injector: Converting injected urls to crawl db entries.
Injector: java.lang.NullPointerException
        at java.lang.ProcessBuilder.start(Unknown Source)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
        at org.apache.hadoop.util.Shell.run(Shell.java:418)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:633)
        at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:467)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:849)
        at org.apache.hadoop.fs.FileSystem.createNewFile(FileSystem.java:1149)
        at org.apache.nutch.util.LockUtil.createLockFile(LockUtil.java:58)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:357)
        at org.apache.nutch.crawl.Injector.run(Injector.java:467)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.nutch.crawl.Injector.main(Injector.java:441)
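[Editor's note: a NullPointerException from `ProcessBuilder.start` inside Hadoop's `Shell` class is commonly reported when `winutils.exe` cannot be located on the PATH. As a hedged sketch (the Hadoop install path here is hypothetical), the environment can be prepared like so:]

```shell
#!/bin/sh
# Sketch: build a PATH value with a Hadoop bin directory prepended, so
# Hadoop's Shell class can find winutils.exe. The hadoop_home argument
# is an assumption -- substitute your own install location.
with_hadoop_path() {
    hadoop_home="$1"
    printf '%s/bin:%s\n' "$hadoop_home" "$PATH"
}

# Example usage (path is a placeholder):
# export HADOOP_HOME=/cygdrive/c/hadoop
# export PATH="$(with_hadoop_path "$HADOOP_HOME")"
# bin/nutch inject TestCrawl/crawldb urls
```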

 


RE: Running into an Issue

Markus Jelsma-2
In reply to this post by Jamal, Sarfaraz
Jamal, I really have no idea, but it smells like a Windows-related problem again.

See:
http://stackoverflow.com/questions/27201505/hadoop-exception-in-thread-main-java-lang-nullpointerexception
http://stackoverflow.com/questions/30379441/mapreduce-development-inside-eclipse-on-windows

Markus
 
 

RE: Running into an Issue

Jamal, Sarfaraz
Hi Markus,

I got Nutch up and running on CentOS =) so yay!

However, I would still like to try and see if I can get it up and running on Windows.

I enabled debug logging and found and fixed some small problems.

Now I am getting this error; does it ring a bell by any chance (for anyone)?

$ bin/nutch inject testcrawl/crawldb urls
Injector: starting at 2016-07-14 10:01:08
Injector: crawlDb: testcrawl/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Stopwatch.elapsedMillis()J
        at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:278)
        at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:59)
        at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:375)
        at org.apache.hadoop.mapreduce.lib.input.DelegatingInputFormat.getSplits(DelegatingInputFormat.java:115)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:493)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:510)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:376)
        at org.apache.nutch.crawl.Injector.run(Injector.java:467)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.nutch.crawl.Injector.main(Injector.java:441)
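[Editor's note: a `NoSuchMethodError` on `com.google.common.base.Stopwatch.elapsedMillis` is the classic signature of a Guava version conflict: that method was removed in later Guava releases, while this Hadoop version still calls it, so a newer Guava jar is likely shadowing the older one on the classpath. A minimal sketch for locating the conflicting jars (the lib directory is an argument, and the layout is an assumption):]

```shell
#!/bin/sh
# Sketch: list every Guava jar under a lib directory. More than one result
# usually means a classpath conflict; removing or downgrading the newer jar
# is the commonly reported remedy for this Hadoop version.
find_guava_jars() {
    libdir="$1"
    find "$libdir" -name 'guava-*.jar' | sort
}

# Example usage (directory is a placeholder):
# find_guava_jars apache-nutch-1.x/lib
```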



RE: Running into an Issue

Jamal, Sarfaraz
In reply to this post by Markus Jelsma-2
I GOT IT TO WORK!!! (on Windows) =) =) =)

Thank you everyone :)

Sas

-----Original Message-----
From: Markus Jelsma [mailto:[hidden email]]
Sent: Wednesday, July 13, 2016 10:44 AM
To: [hidden email]
Subject: RE: Running into an Issue

Jamal, i really have no idea but it smells like a Windows related problem again.

See:
http://stackoverflow.com/questions/27201505/hadoop-exception-in-thread-main-java-lang-nullpointerexception
http://stackoverflow.com/questions/30379441/mapreduce-development-inside-eclipse-on-windows

Markus
 
 
-----Original message-----

> From:Jamal, Sarfaraz <[hidden email]>
> Sent: Wednesday 13th July 2016 16:09
> To: [hidden email]
> Subject: RE: Running into an Issue
>
> Hi Markus,
>
> So I am now running this on a machine that I have administrative
> access to And I am getting a different error message:
>
> Do you (or anyone) have any ideas?
>
> $ ./nutch inject ../TestCrawl/crawldb ../url [I tried from the bin
> folder and the root nutch folder[
>
> Injector: starting at 2016-07-13 10:02:54
> Injector: crawlDb: ../TestCrawl/crawldb
> Injector: urlDir: ../url
> Injector: Converting injected urls to crawl db entries.
> Injector: java.lang.NullPointerException
>         at java.lang.ProcessBuilder.start(Unknown Source)
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
>         at org.apache.hadoop.util.Shell.run(Shell.java:418)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
>         at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
>         at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
>         at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:633)
>         at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:467)
>         at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
>         at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:849)
>         at org.apache.hadoop.fs.FileSystem.createNewFile(FileSystem.java:1149)
>         at org.apache.nutch.util.LockUtil.createLockFile(LockUtil.java:58)
>         at org.apache.nutch.crawl.Injector.inject(Injector.java:357)
>         at org.apache.nutch.crawl.Injector.run(Injector.java:467)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.nutch.crawl.Injector.main(Injector.java:441)
>

>
> -----Original Message-----
> From: Markus Jelsma [mailto:[hidden email]]
> Sent: Tuesday, July 12, 2016 4:52 AM
> To: [hidden email]
> Subject: RE: Running into an Issue
>
> Hi, there are some Windows API calls in there that I will never understand. Are there some kind of symlinks you are working with, or whatever they are called in Windows? There must be something blocking Nutch/Hadoop from getting access to your disk. Check permissions, disk space, and whatever else you can think of.
>
> M.
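[Editor's note: a small diagnostic sketch along the lines Markus suggests. The directory names are the ones used in this thread and may differ on your machine.]

```shell
# Check ownership/permissions of the crawl directories and free disk space.
check_crawl_dirs() {
    for d in "$@"; do
        if [ -d "$d" ]; then
            ls -ld "$d"          # owner and permission bits
        else
            echo "missing: $d"
        fi
    done
}

check_crawl_dirs TestCrawl/crawldb url
df -h . | tail -n1               # free space on the current volume
```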

> -----Original message-----
> > From:Jamal, Sarfaraz <[hidden email]>
> > Sent: Monday 11th July 2016 22:46
> > To: Nutch help <[hidden email]>
> > Subject: Running into an Issue
> >
> > So I feel I have made some progress on Nutch
> >
> > However I am now getting another error which I am having difficulty navigating through:
> >
> > bin/nutch inject TestCrawl/crawldb url
> >
> > produces this below
> >
> > Do you have to run Cygwin under Admin for it to work?
> >
> > Injector: Converting injected urls to crawl db entries.
> > Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
> >         at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
> >         at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:570)
> >         at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
> >         at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:173)
> >         at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:160)
> >         at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:94)
> >         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
> >         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
> >         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
> >         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
> >         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
> >         at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:131)
> >         at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
> >         at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
> >         at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:432)
> >         at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
> >         at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Unknown Source)
> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> >         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
> >         at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
> >         at org.apache.nutch.crawl.Injector.inject(Injector.java:376)
> >         at org.apache.nutch.crawl.Injector.run(Injector.java:467)
> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> >         at org.apache.nutch.crawl.Injector.main(Injector.java:441)
> >
>
RE: Running into an Issue

yongyao
Hi Jamal,

I am having the same problem. Can you tell me how you solved this?

Thanks,
J
Re: Running into an Issue

huihui
In reply to this post by Jamal, Sarfaraz
Hi Jamal

I have been stuck for a whole month trying to set up Nutch 2.3.
Can you please share some insight into how you solved that "Exception in thread "main" java.lang.UnsatisfiedLinkError" issue?

Thanks!!!!