Sending an empty http.agent.version


Yossi Tamari
Hi,

http.agent.version defaults in nutch-default.xml to Nutch-1.14-SNAPSHOT
(depending on the version of course).

If I want to override it so that no version is sent as part of the user agent,
there is nothing I can do in nutch-site.xml: an empty string there causes the
default to be used, and any non-empty value causes a slash and that value to
be appended to http.agent.name.

As far as I can see, the only way to override it is to remove the value in
nutch-default.xml, which is probably not the "correct" way, considering it
contains a comment saying "Do not modify this file directly".

This was asked previously in
https://www.mail-archive.com/user@.../msg15341.html, but
without a helpful answer.

I would be willing to push a fix where setting the string to "null" would
cause it to be ignored, if the maintainers are on board.

               Yossi.

Re: Sending an empty http.agent.version

Sebastian Nagel
Hi,

> I would be willing to push a fix where setting the string to "null" would
> cause it to be ignored, if the maintainers are on board.

There are a couple of properties (http.accept.*) which are unset by setting
their value to whitespace only.  Why not follow this convention?
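
To illustrate that convention, here is a minimal sketch of how an agent
string could be assembled so that a whitespace-only version is treated as
unset. The class UserAgentSketch and the method buildAgent are hypothetical
names chosen for this example; this is not Nutch's actual code, only one way
the behavior might look.

```java
// Hypothetical sketch, not taken from the Nutch code base.
final class UserAgentSketch {

    /**
     * Returns "name/version", or just "name" when the version is null,
     * empty, or whitespace only (the "unset" convention discussed above).
     */
    static String buildAgent(String name, String version) {
        String trimmedName = name.trim();
        if (version == null || version.trim().isEmpty()) {
            // Version is considered unset: no slash, no version suffix.
            return trimmedName;
        }
        return trimmedName + "/" + version.trim();
    }

    public static void main(String[] args) {
        // "MyCrawler" is a made-up agent name used only for illustration.
        System.out.println(buildAgent("MyCrawler", "Nutch-1.14-SNAPSHOT"));
        System.out.println(buildAgent("MyCrawler", " "));
    }
}
```

With this approach a single space in nutch-site.xml would be enough to drop
the version, so no special "null" marker string would be needed.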

Yes, help is welcome! Please open an issue for this at
    https://issues.apache.org/jira/projects/NUTCH
and, if possible, provide a patch or a pull request on GitHub
with the Jira issue ID "NUTCH-XXX" in the title.


Thanks,
Sebastian


On 10/23/2017 02:59 PM, Yossi Tamari wrote:

> [...]