Solr 8.4.1 Hadoop authentication with Kerberos


Andras Salamon

Hi Solr devs,


We are trying to use Hadoop authentication with Kerberos in Solr 8.4.1 and have run into a problem. We are using a fork based on Hadoop 3.1.1. We run on JDK 8, so Solr falls back to HTTP/1.1, but we also tested with JDK 11 (HTTP/2) and got the same error.


We have already applied a few upstream changes that are either not yet committed (https://issues.apache.org/jira/browse/SOLR-9840) or were committed only later (https://issues.apache.org/jira/browse/SOLR-11554).


The important part of our security.json file is:


"authentication": {
        "class": "org.apache.solr.security.ConfigurableInternodeAuthHadoopPlugin",
        "sysPropPrefix": "solr.authentication.",
        "type": "multi-scheme",
        "clientBuilderFactory": "org.apache.solr.client.solrj.impl.Krb5HttpClientBuilder",


When we try to add a document using curl, we receive a 401 error:


curl -k --negotiate -u : 'https://quasar-mdzaga-1.vpc.cloudera.com:8985/solr/test2/update' -H 'Content-type:application/json' -d ' [ {"id":"book3", "title":"book3title", "author":"author"} ]'

{
  "responseHeader":{
    "rf":2147483647,
    "status":401,
    "QTime":18},
  "error":{
    "metadata":[
      "error-class","org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException",
      "root-error-class","org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException"],
    "msg":"Async exception during distributed update: Error from server at https://quasar-mdzaga-3.vpc.cloudera.com:8985/solr/test2_shard2_replica_n6/: Authentication required\n\n\n\nrequest: https://quasar-mdzaga-3.vpc.cloudera.com:8985/solr/test2_shard2_replica_n6/",
    "Code":401}}

We have debugged the problem and found that curl authenticates to the first node successfully, but the internode TOLEADER request then fails, because the sending node does not respond to the 401 challenge that is part of the SPNEGO mechanism:


HTTP/1.1 401 Unauthorized access
...
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; HttpOnly
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=iso-8859-1
Content-Length: 287
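
For reference, the reply a SPNEGO client is expected to give to this challenge is the same request again with an "Authorization: Negotiate <base64 token>" header (RFC 4559). Below is a minimal Java sketch of just that step. The class and method names are made up for illustration and are not Solr or Jetty APIs, and the token bytes are a placeholder rather than a real Kerberos ticket (obtaining one needs a KDC and JGSS).

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Illustrative only: hypothetical names, not Solr or Jetty APIs. */
public class NegotiateSketch {

    /** True when the response is a SPNEGO challenge like the one shown above. */
    public static boolean isNegotiateChallenge(int status, String wwwAuthenticate) {
        if (status != 401 || wwwAuthenticate == null) {
            return false;
        }
        String scheme = "Negotiate";
        // Case-insensitive prefix match on the authentication scheme.
        return wwwAuthenticate.trim().regionMatches(true, 0, scheme, 0, scheme.length());
    }

    /** Formats the Authorization header value for a GSS-API token. */
    public static String authorizationHeader(byte[] gssToken) {
        return "Negotiate " + Base64.getEncoder().encodeToString(gssToken);
    }

    public static void main(String[] args) {
        // Placeholder bytes, NOT a real Kerberos/SPNEGO token.
        byte[] placeholderToken = "abc".getBytes(StandardCharsets.US_ASCII);
        if (isNegotiateChallenge(401, "Negotiate")) {
            System.out.println("Authorization: " + authorizationHeader(placeholderToken));
        }
    }
}
```

In our case it is this second, authenticated attempt that never gets sent for the TOLEADER request.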



Checking the code shows that ConcurrentUpdateHttp2SolrClient calls Http2SolrClient.initOutStream, which creates an OutputStreamContentProvider whose isReproducible flag is false. In that case Jetty's AuthenticationProtocolHandler does not continue the authentication, because the request body cannot be sent a second time.
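
To make the behaviour concrete, here is a simplified Java model of the decision as we understand it. It is not Jetty's actual code: the Body interface and the class names are hypothetical stand-ins, and only the isReproducible idea is taken from the real client.

```java
/** Simplified model of the retry decision; hypothetical names, not Jetty code. */
public class ReplayDecisionSketch {

    /** Stand-in for a request body with the isReproducible flag mentioned above. */
    public interface Body {
        boolean isReproducible();
    }

    /** A fully buffered body can be sent again after a 401. */
    public static final class BufferedBody implements Body {
        private final byte[] bytes;

        public BufferedBody(byte[] bytes) {
            this.bytes = bytes.clone();
        }

        @Override
        public boolean isReproducible() {
            return true;
        }
    }

    /** A body written through an output stream is consumed while it is sent. */
    public static final class StreamedBody implements Body {
        @Override
        public boolean isReproducible() {
            return false;
        }
    }

    /** The request is only repeated with credentials if its body can be replayed. */
    public static boolean canRetryWithCredentials(int status, Body body) {
        if (status != 401) {
            return false; // nothing to answer
        }
        return body == null || body.isReproducible();
    }

    public static void main(String[] args) {
        System.out.println("buffered body: " + canRetryWithCredentials(401, new BufferedBody(new byte[0])));
        System.out.println("streamed body: " + canRetryWithCredentials(401, new StreamedBody()));
    }
}
```

The streamed case corresponds to the update stream opened by initOutStream, which is why the 401 is surfaced to the caller instead of being answered.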


Relevant code sections:


https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.4.1/solr/solrj/src/java/org/apache/solr/client/solrj/impl/ConcurrentUpdateHttp2SolrClient.java#L212


https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.4.1/solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java#L299


https://github.com/eclipse/jetty.project/blob/jetty-9.4.19.v20190610/jetty-client/src/main/java/org/eclipse/jetty/client/AuthenticationProtocolHandler.java#L192


We have also found a workaround. If ConcurrentUpdateHttp2SolrClient sends a simple request that authenticates successfully before it calls Http2SolrClient.initOutStream, authentication works correctly, not only for that simple request but also for the requests that follow. So right now we send an OPTIONS request at that point and just ignore the response.


The OPTIONS request is sent before the update stream is set up, so if we send multiple documents in a single update, only one OPTIONS request is sent to each leader.
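
The shape of the workaround can be sketched in plain Java as follows. This is a model of the ordering only, not our actual patch: PreAuthSketch, Probe and the in-memory "stream" are hypothetical names, the probe stands in for the OPTIONS request, and the class is not thread-safe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Model of "probe once, then stream"; hypothetical names, not Solr code. */
public class PreAuthSketch {

    /** Hypothetical hook that sends the OPTIONS request and ignores the answer. */
    public interface Probe {
        void send(String leaderUrl);
    }

    private final Probe probe;
    // One in-memory "update stream" per leader, standing in for the real stream.
    private final Map<String, List<String>> openStreams = new HashMap<>();

    public PreAuthSketch(Probe probe) {
        this.probe = probe;
    }

    /** Queues a document for a leader, probing only when its stream is first opened. */
    public void add(String leaderUrl, String doc) {
        List<String> stream = openStreams.get(leaderUrl);
        if (stream == null) {
            probe.send(leaderUrl); // pre-authenticate before the stream exists
            stream = new ArrayList<>();
            openStreams.put(leaderUrl, stream);
        }
        stream.add(doc);
    }

    /** Number of documents queued for the given leader. */
    public int queued(String leaderUrl) {
        List<String> stream = openStreams.get(leaderUrl);
        return stream == null ? 0 : stream.size();
    }

    public static void main(String[] args) {
        List<String> probes = new ArrayList<>();
        PreAuthSketch client = new PreAuthSketch(probes::add);
        client.add("leaderA", "book1");
        client.add("leaderA", "book2");
        client.add("leaderB", "book3");
        System.out.println("probes sent: " + probes);
    }
}
```

Here three documents for two leaders result in two probes, one per leader, which matches what we see with the OPTIONS request.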


Although this workaround works for us, we are not sure that this is the best place to ensure pre-authentication between the nodes. Can anybody suggest a better place to handle it?


Is there anybody here who is successfully using Solr 8 with Hadoop authentication and Kerberos?


Thanks,

Andras