<noindex>do not index</noindex>

Stefan Groschupf-2
Hi,
as far as I can see, Nutch's HTML parser only supports the noindex meta
tag (<meta name="ROBOTS" content="NOINDEX,NOFOLLOW">), but there is also
an unofficial HTML <noindex> tag:
http://www.webmasterworld.com/forum10003/2703.htm
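
To make the difference between the two mechanisms concrete, here is a minimal sketch in Python (not Nutch code, and not how Nutch's parser is implemented). It assumes the meta tag excludes the whole page, while the unofficial <noindex> element only excludes the text it wraps. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class NoIndexAwareExtractor(HTMLParser):
    """Hypothetical helper: collects indexable text while honoring
    the robots meta tag and unofficial <noindex> blocks."""

    def __init__(self):
        super().__init__()
        self.page_noindex = False  # set by <meta name="robots" content="noindex">
        self._noindex_depth = 0    # > 0 while inside <noindex>...</noindex>
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            directives = [d.strip().lower() for d in content.split(",")]
            if "noindex" in directives:
                self.page_noindex = True
        elif tag == "noindex":
            self._noindex_depth += 1

    def handle_endtag(self, tag):
        if tag == "noindex" and self._noindex_depth > 0:
            self._noindex_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside every <noindex> block.
        if self._noindex_depth == 0:
            self._chunks.append(data)

    def indexable_text(self):
        if self.page_noindex:
            return ""
        # Collapse runs of whitespace into single spaces.
        return " ".join(" ".join(self._chunks).split())


def extract_indexable_text(html):
    parser = NoIndexAwareExtractor()
    parser.feed(html)
    parser.close()
    return parser.indexable_text()
```

With this sketch, a page carrying the ROBOTS meta tag yields no indexable text at all, whereas a page with a <noindex> block around, say, its navigation menu still yields the rest of its text.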

Maybe supporting it would be another way to make Nutch more polite.
Also, please remember my patch to support the crawl-delay property in
robots.txt. That would also be an important step toward making Nutch more
polite, and may be a better approach than removing the Nutch crawler
identification.
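
For readers unfamiliar with crawl-delay, the following is a minimal sketch of how a crawler might read it, using Python's standard urllib.robotparser rather than the Nutch patch mentioned above. The robots.txt content, the agent name "ExampleBot", the helper name, and the fallback delay are all assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt used only for this example.
ROBOTS_TXT = """\
User-agent: ExampleBot
Crawl-delay: 5
Disallow: /private/

User-agent: *
Crawl-delay: 1
"""


def crawl_delay_for(robots_txt, user_agent, default_delay=1.0):
    """Return the number of seconds to wait between requests to a host.

    Falls back to default_delay (an assumed crawler-side setting)
    when robots.txt gives no Crawl-delay for this agent.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default_delay
```

A polite fetcher would then sleep for that many seconds between successive requests to the same host, instead of relying on a single fixed delay for every site.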

Thoughts?
Stefan

Re: <noindex>do not index</noindex>

Jérôme Charron
> as far as I can see, Nutch's HTML parser only supports the noindex meta
> tag (<meta name="ROBOTS" content="NOINDEX,NOFOLLOW">), but there is also
> an unofficial HTML <noindex> tag:
> http://www.webmasterworld.com/forum10003/2703.htm

Hello Stefan,

Here is a previous discussion about this:
http://www.mail-archive.com/nutch-user@.../msg04576.html


> Maybe supporting it would be another way to make Nutch more polite.

+1

Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/