[jira] Commented: (SOLR-1395) Integrate Katta


JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919784#action_12919784 ]

tom liu commented on SOLR-1395:
-------------------------------

No, that's four queries:
# on solr01, url is /select?fl=id,score&...
#* Shard=solrhome02#solrhome02
#* Shard=solrhome01#solrhome01
# on solr01, url is /select?ids=SOLR1000&fl=id,score,id&...
#* Shard=solrhome02#solrhome02
# on solr02, url is /select?ids=SOLR1000&fl=id,score&...
#* Shard=solrhome02#solrhome02
#* Shard=solrhome01#solrhome01
# on solr02, url is /select?ids=SOLR1000&ids=SOLR1000&...
#* Shard=solrhome01#solrhome01

If the original query includes shards=*, the master Solr sends * to kattaclient.
kattaclient (katta.Client) then selects a node such as solr01 and sends it shards=solrhome01#solrhome01,solrhome02#solrhome02.
On that middle shard, SearchHandler and QueryComponent invoke the distributed process, for example createMainQuery and createRetrieveDocs.
So, on any node, the query is split into two queries:
# the first selects id and score
# the second selects the docs
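
A minimal sketch of this two-step pattern may help. It is not Solr or Katta code: the Shard and Hit classes, the method names, the scores, and the stored values are all made up for illustration, and only the shard names and the id SOLR1000 are taken from the URLs above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: none of these types are Solr or Katta classes. */
final class TwoPhaseSketch {

    /** Hypothetical in-memory shard: id -> score and id -> stored document. */
    static final class Shard {
        final String name;
        final Map<String, Double> scores = new LinkedHashMap<>();
        final Map<String, String> docs = new LinkedHashMap<>();

        Shard(String name) { this.name = name; }

        Shard add(String id, double score, String doc) {
            scores.put(id, score);
            docs.put(id, doc);
            return this;
        }
    }

    /** A first-step hit: unique key, score, and the shard that reported it. */
    static final class Hit {
        final String id;
        final double score;
        final Shard shard;

        Hit(String id, double score, Shard shard) {
            this.id = id;
            this.score = score;
            this.shard = shard;
        }
    }

    /** Step 1 (like fl=id,score): gather ids and scores from every shard, keep the top rows. */
    static List<Hit> topIds(List<Shard> shards, int rows) {
        List<Hit> merged = new ArrayList<>();
        for (Shard s : shards) {
            for (Map.Entry<String, Double> e : s.scores.entrySet()) {
                merged.add(new Hit(e.getKey(), e.getValue(), s));
            }
        }
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(merged.subList(0, Math.min(rows, merged.size())));
    }

    /** Step 2 (like ids=...): fetch stored docs only from the shard that owns each id. */
    static Map<String, String> fetchDocs(List<Hit> hits) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Hit h : hits) {
            out.put(h.id, h.shard.docs.get(h.id));
        }
        return out;
    }

    /** Made-up sample data; the scores and stored values are placeholders. */
    static List<Shard> demoShards() {
        return Arrays.asList(
            new Shard("solrhome01#solrhome01").add("DOC-A", 0.4, "stored fields of DOC-A"),
            new Shard("solrhome02#solrhome02").add("SOLR1000", 1.5, "stored fields of SOLR1000"));
    }

    public static void main(String[] args) {
        List<Hit> hits = topIds(demoShards(), 1);
        System.out.println(fetchDocs(hits));
    }
}
```

In this sketch the first step contacts both shards while the second step contacts only the shard that owns the winning id, which is the same shape as the shard lists under each URL above.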

I have changed the QueryComponent class as follows:
{code:title=distributedProcess|borderStyle=solid}
        // Added by tom liu
        // Check whether the full distributed process is needed
        boolean isShard = rb.req.getParams().getBool(ShardParams.IS_SHARD, false);
        // On a sub-shard, the full distributed process is not needed
        if (isShard) {
            if (rb.stage < ResponseBuilder.STAGE_PARSE_QUERY)
                return ResponseBuilder.STAGE_PARSE_QUERY;
            if (rb.stage == ResponseBuilder.STAGE_PARSE_QUERY) {
                createDistributedIdf(rb);
                return ResponseBuilder.STAGE_EXECUTE_QUERY;
            }
            if (rb.stage < ResponseBuilder.STAGE_EXECUTE_QUERY)
                return ResponseBuilder.STAGE_EXECUTE_QUERY;
            if (rb.stage == ResponseBuilder.STAGE_EXECUTE_QUERY) {
                createMainQuery(rb);
                return ResponseBuilder.STAGE_GET_FIELDS;
            }
            if (rb.stage < ResponseBuilder.STAGE_GET_FIELDS)
                return ResponseBuilder.STAGE_GET_FIELDS;
            if (rb.stage == ResponseBuilder.STAGE_GET_FIELDS) {
                return ResponseBuilder.STAGE_DONE;
            }
            return ResponseBuilder.STAGE_DONE;
        }
        // End of addition
        ...
{code}
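
To see what the isShard branch above does over a whole request, here is a stand-alone sketch that walks the stages in order. It is a simplification, not Solr code: the numeric stage values are assumed for illustration (only their ordering matters), the loop stands in for a handler driving a single component, and the two create* calls are replaced by strings recorded in a list.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in for the isShard branch; not part of Solr. */
final class ShardStageSketch {
    // Assumed values for illustration; only the ordering matters here.
    static final int STAGE_START = 0;
    static final int STAGE_PARSE_QUERY = 1000;
    static final int STAGE_EXECUTE_QUERY = 2000;
    static final int STAGE_GET_FIELDS = 3000;
    static final int STAGE_DONE = Integer.MAX_VALUE;

    /** Mirrors the isShard branch: returns the next stage and records what would be called. */
    static int nextStage(int stage, List<String> actions) {
        if (stage < STAGE_PARSE_QUERY) return STAGE_PARSE_QUERY;
        if (stage == STAGE_PARSE_QUERY) {
            actions.add("createDistributedIdf");
            return STAGE_EXECUTE_QUERY;
        }
        if (stage < STAGE_EXECUTE_QUERY) return STAGE_EXECUTE_QUERY;
        if (stage == STAGE_EXECUTE_QUERY) {
            actions.add("createMainQuery");
            return STAGE_GET_FIELDS;
        }
        if (stage < STAGE_GET_FIELDS) return STAGE_GET_FIELDS;
        // At or past STAGE_GET_FIELDS nothing further is issued.
        return STAGE_DONE;
    }

    /** Drives the stages from start to done, as a single-component handler loop might. */
    static List<String> run() {
        List<String> actions = new ArrayList<>();
        int stage = STAGE_START;
        while (stage != STAGE_DONE) {
            stage = nextStage(stage, actions);
        }
        return actions;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Under these assumptions the walk records createDistributedIdf and createMainQuery and nothing at the get-fields stage, so a sub-shard would issue the main query only and never reach the createRetrieveDocs step mentioned earlier.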

{code:title=handleResponses|borderStyle=solid}
    if ((sreq.purpose & ShardRequest.PURPOSE_GET_TOP_IDS) != 0) {
      mergeIds(rb, sreq);
      // Added by tom liu
      // Check whether the full distributed process is needed
      boolean isShard = rb.req.getParams().getBool(ShardParams.IS_SHARD, false);
      if (isShard) {
        // On a sub-shard, fall through to the GET_FIELDS check below
        sreq.purpose = ShardRequest.PURPOSE_GET_FIELDS;
      }
      // End of addition
    }

    if ((sreq.purpose & ShardRequest.PURPOSE_GET_FIELDS) != 0) {
      returnFields(rb, sreq);
      return;
    }
{code}
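
The effect of overwriting sreq.purpose can be shown with a small stand-alone sketch. The bit values below are illustrative assumptions rather than the real ShardRequest constants, and mergeIds and returnFields are replaced by strings recorded in a list.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in for the purpose checks in handleResponses; not part of Solr. */
final class PurposeFlagSketch {
    // Assumed bit values for illustration; the real constants live in ShardRequest.
    static final int PURPOSE_GET_TOP_IDS = 0x04;
    static final int PURPOSE_GET_FIELDS = 0x40;

    /** Returns the handlers that would run for one response with the given purpose bits. */
    static List<String> handle(int purpose, boolean isShard) {
        List<String> calls = new ArrayList<>();
        if ((purpose & PURPOSE_GET_TOP_IDS) != 0) {
            calls.add("mergeIds");
            if (isShard) {
                // Overwrite, as in the patch: all other purpose bits are dropped.
                purpose = PURPOSE_GET_FIELDS;
            }
        }
        if ((purpose & PURPOSE_GET_FIELDS) != 0) {
            calls.add("returnFields");
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(handle(PURPOSE_GET_TOP_IDS, true));
        System.out.println(handle(PURPOSE_GET_TOP_IDS, false));
    }
}
```

With isShard set, one response passes through both handlers; without it, only the id merge runs. Because the assignment replaces the whole mask, any other bits set on the request are lost in this sketch; using |= instead would keep them, if that matters for other components.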

{code:title=createMainQuery|borderStyle=solid}
    sreq.params = new ModifiableSolrParams(rb.req.getParams());
    // TODO: base on current params or original params?

    // Added by tom liu
    // Check whether the full distributed process is needed
    boolean isShard = rb.req.getParams().getBool(ShardParams.IS_SHARD, false);
    if (isShard) {
        // isShard=true: leave the params unchanged
    } else {
    // End of addition
        // don't pass through any shards param
        sreq.params.remove(ShardParams.SHARDS);
    ...
{code}

{code:title=returnFields|borderStyle=solid}
      boolean returnScores = (rb.getFieldFlags() & SolrIndexSearcher.GET_SCORES) != 0;

      // Changed by tom liu:
      // loop over every shard response instead of asserting that there is exactly one
      //assert(sreq.responses.size() == 1);
      //ShardResponse srsp = sreq.responses.get(0);
      for (ShardResponse srsp : sreq.responses) {
          SolrDocumentList docs = (SolrDocumentList)srsp.getSolrResponse().getResponse().get("response");

          String keyFieldName = rb.req.getSchema().getUniqueKeyField().getName();
      ...
{code}


> Integrate Katta
> ---------------
>
>                 Key: SOLR-1395
>                 URL: https://issues.apache.org/jira/browse/SOLR-1395
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: Next
>
>         Attachments: back-end.log, front-end.log, hadoop-core-0.19.0.jar, katta-core-0.6-dev.jar, katta.node.properties, katta.zk.properties, log4j-1.2.13.jar, solr-1395-1431-3.patch, solr-1395-1431-4.patch, solr-1395-1431-katta0.6.patch, solr-1395-1431-katta0.6.patch, solr-1395-1431.patch, SOLR-1395.patch, SOLR-1395.patch, SOLR-1395.patch, test-katta-core-0.6-dev.jar, zkclient-0.1-dev.jar, zookeeper-3.2.1.jar
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> We'll integrate Katta into Solr so that:
> * Distributed search uses Hadoop RPC
> * Shard/SolrCore distribution and management
> * Zookeeper based failover
> * Indexes may be built using Hadoop

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.