Calculating PageRank using Nutch-Crawler


Anurag
How can I make the Nutch crawler calculate the number of outlinks and inlinks?

Is there any article about this?

--
Kumar Anurag
Re: Calculating PageRank using Nutch-Crawler

Alex McLintock
Hi -

First thing to do is to learn about the linkdb. When you've learnt what it
contains then come back with some more questions :-)


On 27 November 2010 11:21, Anurag <[hidden email]> wrote:

>
> How can I make the Nutch crawler calculate the number of outlinks and inlinks?
>
> Is there any article about this?
>
> --
> Kumar Anurag
>
Re: Calculating PageRank using Nutch-Crawler

Anurag
Thanks for replying!

I studied the linkdb in my nutch-1.0 folder. I ran "bin/nutch readlinkdb crawl/linkdb -dump links" and saw that the dump lists each URL followed by its inlinks.

Some examples:

http://abinader.cabledogs.org        Inlinks:
 fromUrl: http://blog.qt.nokia.com/2010/11/09/qt-4-7-1-and-qt-mobility-1-1-0-released/ anchor: Bruno Abinader

http://barnacity.net        Inlinks:
 fromUrl: http://blog.qt.nokia.com/2010/11/17/kdab-and-partners-build-kde-based-mobile-app-suite-using-qt-4-7/ anchor: suy

http://blog.qt.nokia.com        Inlinks:
 fromUrl: http://qt.nokia.com/products anchor: The Qt Blog
 fromUrl: http://qt.nokia.com/products/qt-addons anchor: More…
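
For counting, a dump in this format can be parsed with a short script. The following is a minimal Python sketch, not part of Nutch: the names parse_linkdb_dump and count_links are made up for illustration, and it assumes every entry is a URL followed by "Inlinks:" and then one "fromUrl: ... anchor: ..." line per inlink, as in the examples above. Outlink counts derived this way only cover links that are recorded in the linkdb, so they may undercount.

```python
from collections import Counter

def parse_linkdb_dump(text):
    """Return {target_url: [source_url, ...]} from 'readlinkdb -dump' text."""
    inlinks = {}
    target = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith("Inlinks:"):
            # Header line: the URL that the following fromUrl lines point to.
            target = line[: -len("Inlinks:")].strip()
            inlinks.setdefault(target, [])
        elif line.startswith("fromUrl:") and target is not None:
            rest = line[len("fromUrl:"):].strip()
            # Everything before " anchor:" is the source URL.
            source = rest.split(" anchor:", 1)[0].strip()
            inlinks[target].append(source)
    return inlinks

def count_links(inlinks):
    """Return (inlink_counts, outlink_counts) as Counters keyed by URL."""
    in_counts = Counter({url: len(sources) for url, sources in inlinks.items()})
    out_counts = Counter(src for sources in inlinks.values() for src in sources)
    return in_counts, out_counts

# Sample entry copied from the dump shown above.
SAMPLE = """http://blog.qt.nokia.com        Inlinks:
 fromUrl: http://qt.nokia.com/products anchor: The Qt Blog
 fromUrl: http://qt.nokia.com/products/qt-addons anchor: More…
"""

if __name__ == "__main__":
    in_counts, out_counts = count_links(parse_linkdb_dump(SAMPLE))
    for url, n in in_counts.items():
        print(url, "inlinks:", n)
    for url, n in out_counts.items():
        print(url, "outlinks seen:", n)
```

With a real dump, the text would be read from the part file(s) written under the "links" output directory.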


So now I have the inlinks for each URL (url <--- inlinks).

How should I proceed to calculate the PageRank of each URL?
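
One way to proceed from a url-to-inlinks mapping is plain power iteration. The sketch below is textbook PageRank in Python, not Nutch's own scoring code; the function name pagerank, the damping factor of 0.85, and the fixed iteration count are assumptions chosen for illustration. Pages without outlinks spread their rank evenly over all pages so that the scores keep summing to 1.

```python
def pagerank(inlinks, damping=0.85, iterations=50):
    """Textbook PageRank over {target_url: [source_url, ...]}."""
    # Collect every URL that appears as a target or as a source.
    nodes = set(inlinks)
    for sources in inlinks.values():
        nodes.update(sources)
    if not nodes:
        return {}

    # Out-degree of each URL, derived from the inlink lists.
    out_degree = {node: 0 for node in nodes}
    for sources in inlinks.values():
        for src in sources:
            out_degree[src] += 1

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages with no outlinks is shared among all pages.
        dangling = sum(rank[node] for node in nodes if out_degree[node] == 0)
        new_rank = {}
        for node in nodes:
            incoming = sum(rank[src] / out_degree[src]
                           for src in inlinks.get(node, []))
            new_rank[node] = (1 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank

if __name__ == "__main__":
    # Tiny made-up graph: a -> b, a -> c, b -> c.
    example = {"b": ["a"], "c": ["a", "b"]}
    for url, score in sorted(pagerank(example).items(), key=lambda kv: -kv[1]):
        print(url, round(score, 4))
```

The input has the same shape as a parsed linkdb dump: each key is a URL and each value is the list of URLs linking to it.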

I also read this 1
             2

Thanks.
Kumar Anurag
Re: Calculating PageRank using Nutch-Crawler

Anurag
Help! Please reply.
Kumar Anurag
Re: Calculating PageRank using Nutch-Crawler

Dennis Kubes-2
LinkRank in the org.apache.nutch.scoring.webgraph package calculates a
modified version of PageRank.  The WebGraph tool calculates the inlinks
and outlinks of pages.  The LinkRank tool uses that to compute a PageRank
that ignores reciprocal links and links within the same domain.  The
NodeDumper tool will export the number of inlinks, outlinks, scores, etc.
for the URLs.  The ScoreUpdater tool will put the LinkRank scores back
into the CrawlDb to be used with fetching and index scoring.  All of this
is in the 1.2 branch, but it should get you started.
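
To illustrate the kind of filtering described above, here is a rough Python sketch. It is not the actual LinkRank code: filter_links is a hypothetical name, and comparing hostnames is a simplifying assumption standing in for however Nutch decides that two URLs belong to the same domain. It drops links whose source and target share a host, and links that are reciprocated.

```python
from urllib.parse import urlsplit

def host(url):
    """Lower-cased hostname of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()

def filter_links(edges):
    """Keep (source, target) pairs that are neither same-host nor reciprocal."""
    edge_set = set(edges)
    kept = []
    for src, dst in edges:
        if host(src) == host(dst):
            continue  # link stays within one host
        if (dst, src) in edge_set:
            continue  # the target links straight back to the source
        kept.append((src, dst))
    return kept

if __name__ == "__main__":
    # Made-up links for illustration only.
    links = [
        ("http://a.example/1", "http://a.example/2"),  # same host
        ("http://a.example/", "http://b.example/"),    # reciprocal pair
        ("http://b.example/", "http://a.example/"),
        ("http://a.example/", "http://c.example/"),    # kept
    ]
    print(filter_links(links))
```

The remaining links could then be fed to any PageRank-style computation.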

Dennis

On 12/21/2010 02:38 AM, Anurag wrote:
> Help! Please reply.
>
> -----
> Kumar Anurag
>
Re: Calculating PageRank using Nutch-Crawler

Anurag
Thanks a lot! I will try it.
Kumar Anurag