[jira] [Commented] (NUTCH-2649) Optionally skip TLS/SSL certificate validation for protocol-selenium and protocol-htmlunit


Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/NUTCH-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020226#comment-17020226 ]

ASF GitHub Bot commented on NUTCH-2649:
---------------------------------------

sebastian-nagel commented on issue #496: Fix for NUTCH-2649: Optionally skip TLS/SSL certificate validation fo…
URL: https://github.com/apache/nutch/pull/496#issuecomment-576678970
 
 
   Thanks, @balashashanka! Code compiles now. I've tested protocol-selenium on https://expired.badssl.com/:
   ```
   $> bin/nutch parsechecker -Dhttp.tls.certificates.check=true \
        -Dplugin.includes='protocol-selenium|parse-tika' \
        -Dselenium.grid.binary=.../geckodriver \
        -Dselenium.enable.headless=true \
        -followRedirects -dumpText    https://expired.badssl.com/
   Fetch failed with protocol status: exception(16), lastModified=0: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed
   ```
   and with the check disabled:
   ```
   $> bin/nutch parsechecker -Dhttp.tls.certificates.check=false \
        -Dplugin.includes='protocol-selenium|parse-tika' \
        -Dselenium.grid.binary=.../geckodriver \
        -Dselenium.enable.headless=true \
        -followRedirects -dumpText    https://expired.badssl.com/
   ...
   https://expired.badssl.com/
   Version: 5
   Status: success(1,0)
   ...
   expired. badssl.com
   ```
   
   I'll merge the PR soon.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


> Optionally skip TLS/SSL certificate validation for protocol-selenium and protocol-htmlunit
> ------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-2649
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2649
>             Project: Nutch
>          Issue Type: Improvement
>          Components: protocol
>    Affects Versions: 1.15
>            Reporter: Sebastian Nagel
>            Assignee: Shashanka Balakuntala Srinivasa
>            Priority: Minor
>             Fix For: 1.17
>
>
> NUTCH-2648 adds a property to enable/disable TLS/SSL certificate validation for protocol-http, protocol-httpclient and protocol-okhttp. It should also be supported by the remaining protocol plugins:
> * protocol-selenium,
> * protocol-interactiveselenium and
> * protocol-htmlunit
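
For readers unfamiliar with what "skipping certificate validation" means at the JDK level, here is a minimal sketch. It is not taken from the PR and is not how the Nutch plugins implement the switch (browser-based plugins may instead configure the browser or driver). The class and method names (`TlsCheckSketch`, `trustManagersFor`, `contextFor`) are hypothetical; only the standard `javax.net.ssl` API is used. The boolean parameter stands in for the value of `http.tls.certificates.check`.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical illustration class, not part of Nutch.
public class TlsCheckSketch {

  /** Trust manager that accepts every certificate chain (insecure by design). */
  public static final class TrustAllManager implements X509TrustManager {
    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType) {
      // Intentionally empty: no validation is performed.
    }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType) {
      // Intentionally empty: expired or self-signed certificates pass.
    }

    @Override
    public X509Certificate[] getAcceptedIssuers() {
      return new X509Certificate[0];
    }
  }

  /**
   * Returns null when validation is requested, so that SSLContext.init
   * falls back to the JDK's default trust managers; otherwise returns a
   * single trust-all manager.
   */
  public static TrustManager[] trustManagersFor(boolean checkCertificates) {
    return checkCertificates ? null : new TrustManager[] { new TrustAllManager() };
  }

  /** Builds a TLS context that validates certificates only if asked to. */
  public static SSLContext contextFor(boolean checkCertificates)
      throws GeneralSecurityException {
    SSLContext ctx = SSLContext.getInstance("TLS");
    ctx.init(null, trustManagersFor(checkCertificates), null);
    return ctx;
  }

  public static void main(String[] args) throws Exception {
    // The flag mirrors the role of http.tls.certificates.check in the issue.
    for (boolean check : new boolean[] { true, false }) {
      SSLContext ctx = contextFor(check);
      System.out.println("check=" + check + " protocol=" + ctx.getProtocol());
    }
  }
}
```

A context built with `contextFor(false)` would accept a host such as https://expired.badssl.com/, which is why disabling the check is only appropriate for crawling, never for transferring sensitive data.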



--
This message was sent by Atlassian Jira
(v8.3.4#803005)