[crawl] Response content length is not known

Christophe Noel
Hello,

I used to crawl with the last Nutch 0.6 release. With the new CVS version,
I sometimes get "Response content length is not known" messages. What does
this mean?

Thanks for your help.


(sample log)
050719 145804 http.auth.ntlm.username =
050719 145804 fetcher.server.delay = 2000
050719 145804 http.max.delays = 100
050719 145805 Configured Client
050719 145806 fetching http://www.certech.be/
050719 145806  - protocol redirect to http://www.cetic.be/indexEN.php3
050719 145806 fetching http://www.cetic.be/indexEN.php3
050719 145807 Response content length is not known
050719 145807 Response content length is not known
050719 145812 Response content length is not known
050719 145813 Response content length is not known
050719 145817 Response content length is not known
050719 145818 Response content length is not known
050719 145822 Response content length is not known
050719 145822 Response content length is not known
050719 145826 Response content length is not known
050719 145827 Response content length is not known
050719 145830 Response content length is not known
050719 145835 Response content length is not known
050719 145842 Response content length is not known
050719 145848 Response content length is not known
050719 145854 Response content length is not known
050719 145858 Response content length is not known
050719 145901 Response content length is not known
050719 145907 Response content length is not known
050719 145910 status: segment 20050719145801, 21 pages, 0 errors, 349565 bytes, 67271 ms
050719 145910 status: 0.31217015 pages/s, 40.596638 kb/s, 16645.953 bytes/page
050719 145911 Updating /home/cn/nutch-focus/nutch/trunk/agoria2.19jul/db
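
As a reading aid for the two status lines in the log above, here is a small sketch that recomputes the reported rates from the raw counts (21 pages, 349565 bytes, 67271 ms). The variable names are made up for illustration. The interpretation of "kb/s" as kilobits per second with 1 kb = 1024 bits is an assumption inferred from the numbers, not something confirmed from the Nutch source.

```python
# Raw counts copied from the first status line of the sample log.
pages = 21
total_bytes = 349565
elapsed_ms = 67271

elapsed_s = elapsed_ms / 1000.0

# Pages fetched per second.
pages_per_s = pages / elapsed_s

# Assumption: "kb/s" means kilobits per second (bytes * 8 / 1024).
# This reading reproduces the logged value; kilobytes would not.
kbits_per_s = total_bytes * 8 / 1024 / elapsed_s

# Average page size in bytes.
bytes_per_page = total_bytes / pages

print(f"{pages_per_s:.6f} pages/s, {kbits_per_s:.4f} kb/s, {bytes_per_page:.3f} bytes/page")
```

Under that assumption the recomputed values agree with the second status line up to single-precision rounding, which suggests the fetch itself completed normally (21 pages, 0 errors) despite the repeated messages.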