invertlinks running slow

DS Jha
Hi everyone,

I am building my crawl database by crawling in smaller chunks: each run
executes the generate, fetch, and updatedb commands for about 30-50K
URLs. To keep the index updated with each run, I also run the
invertlinks and index commands on each segment. It seems that as my
crawl db grows, invertlinks takes much longer for each segment. Even
for a segment of about 10-15K URLs, it takes 3+ hours. Should it take
this long just to run invertlinks on about 10-20K documents?
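
For reference, the per-segment cycle described above can be sketched roughly as follows. This is only an illustration: the `bin/nutch` launcher path, the `crawl/...` directory layout, and the segment name are assumptions, and the argument order follows the Nutch 0.8/0.9-style command line as I understand it, so it should be checked against the installed version. The helper `cycle_commands` is hypothetical and only builds the command lines; it does not run anything.

```python
from typing import List

# Assumed path to the Nutch launcher script, relative to the install directory.
NUTCH = "bin/nutch"


def cycle_commands(crawldb: str, segments_dir: str, linkdb: str,
                   index_dir: str, segment: str, top_n: int) -> List[List[str]]:
    """Build the command lines for one generate/fetch/update/invert/index run.

    `segment` is the directory that `generate` creates under `segments_dir`;
    in a real run it is only known after `generate` finishes, so the caller
    would look up the newest directory there before running the later steps.
    """
    return [
        # Select up to top_n URLs from the crawl db into a new segment.
        [NUTCH, "generate", crawldb, segments_dir, "-topN", str(top_n)],
        # Fetch the pages listed in that segment.
        [NUTCH, "fetch", segment],
        # Merge the fetch results back into the crawl db.
        [NUTCH, "updatedb", crawldb, segment],
        # Invert links for this one segment only (the slow step in question).
        [NUTCH, "invertlinks", linkdb, segment],
        # Index the segment using the crawl db and link db.
        [NUTCH, "index", index_dir, crawldb, linkdb, segment],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    # All paths and the segment name below are made up for illustration.
    for cmd in cycle_commands("crawl/crawldb", "crawl/segments", "crawl/linkdb",
                              "crawl/indexes", "crawl/segments/20080101000000",
                              50000):
        print(" ".join(cmd))
```

Note that invertlinks is given a single segment here, yet it still writes into the one shared link db, which keeps growing across runs.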


DS Jha