[jira] Created: (SOLR-471) Distributed Solr Client

[jira] Created: (SOLR-471) Distributed Solr Client

Prajeeth Emanuel (Jira)
Distributed Solr Client
-----------------------

                 Key: SOLR-471
                 URL: https://issues.apache.org/jira/browse/SOLR-471
             Project: Solr
          Issue Type: New Feature
          Components: clients - java
    Affects Versions: 1.3
            Reporter: Nguyen Kien Trung
            Priority: Minor


Inspired by memcached java clients.
The ability to update/search/delete among many solr instances
Client parameters:
- List of solr servers
- Number of replicas
Client functions:
- Update: using consistent hashing to determine what documents are going to be stored in what server. Get the list of servers (equal to number of replicas) and issue parallel UPDATE
- Search: parallel search all servers, aggregate distinct results
- Delete: parallel delete in all servers
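
The update routing described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the class name ConsistentHashRing, the use of MD5, and the count of 100 virtual nodes per server are all hypothetical choices. The ring maps a document id to a list of distinct servers whose length equals the number of replicas; the caller would then issue the UPDATE to each of them in parallel.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch: picks `replicas` distinct servers for a document id on a hash ring. */
class ConsistentHashRing {
    // Assumed value: number of points each server occupies on the ring.
    private static final int VIRTUAL_NODES = 100;

    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int serverCount;

    ConsistentHashRing(List<String> servers) {
        Set<String> distinct = new LinkedHashSet<>(servers);
        for (String server : distinct) {
            for (int i = 0; i < VIRTUAL_NODES; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }
        serverCount = distinct.size();
    }

    /** Returns up to `replicas` distinct servers, walking clockwise from the id's position. */
    List<String> serversFor(String docId, int replicas) {
        int wanted = Math.min(replicas, serverCount);
        Set<String> chosen = new LinkedHashSet<>();
        if (wanted <= 0) {
            return new ArrayList<>(chosen);
        }
        // First pass: ring positions at or after the document's hash.
        for (String server : ring.tailMap(hash(docId), true).values()) {
            chosen.add(server);
            if (chosen.size() == wanted) {
                return new ArrayList<>(chosen);
            }
        }
        // Second pass: wrap around to the start of the ring.
        for (String server : ring.values()) {
            chosen.add(server);
            if (chosen.size() == wanted) {
                break;
            }
        }
        return new ArrayList<>(chosen);
    }

    /** First 8 bytes of the MD5 digest as a long (the choice of MD5 is an assumption). */
    private static long hash(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

With three servers and two replicas, serversFor("doc-1", 2) would return two different server URLs, and the same two on every call, so a later update of the same document reaches the same servers.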

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (SOLR-471) Distributed Solr Client

Prajeeth Emanuel (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nguyen Kien Trung updated SOLR-471:
-----------------------------------

    Description:
Inspired by memcached java clients.
The ability to update/search/delete among many solr instances
Client parameters:
- List of solr servers
- Number of replicas

Client functions:
- Update: using consistent hashing to determine what documents are going to be stored in what server. Get the list of servers (equal to number of replicas) and issue parallel UPDATE
- Search: parallel search all servers, aggregate distinct results
- Delete: parallel delete in all servers


[jira] Updated: (SOLR-471) Distributed Solr Client

Prajeeth Emanuel (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nguyen Kien Trung updated SOLR-471:
-----------------------------------

    Attachment: distributedclient.patch

- Changed toString() and overrode hashCode() and equals() in the SolrDocument model, so that unique SolrDocument objects can be filtered using a set
- Created test cases that set up multiple SolrHttpServers and perform update/delete/query operations
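
The set-based filtering mentioned above might look something like the sketch below. It does not reproduce the patch: ResultDoc is a hypothetical stand-in for SolrDocument, ResultMerger is an invented helper, and comparing on both fields is an assumption about what "unique" means. The point is only that once equals() and hashCode() agree, a Set drops documents returned by more than one server.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

/** Hypothetical stand-in for a result document; not the real SolrDocument class. */
final class ResultDoc {
    final String id;
    final String title;

    ResultDoc(String id, String title) {
        this.id = id;
        this.title = title;
    }

    // Assumption: two documents are the same when all of their fields match.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ResultDoc)) {
            return false;
        }
        ResultDoc that = (ResultDoc) other;
        return Objects.equals(id, that.id) && Objects.equals(title, that.title);
    }

    // Must be consistent with equals() for hash-based sets to work.
    @Override
    public int hashCode() {
        return Objects.hash(id, title);
    }

    @Override
    public String toString() {
        return "ResultDoc{id=" + id + ", title=" + title + "}";
    }
}

/** Hypothetical helper that aggregates per-server results into one distinct list. */
final class ResultMerger {
    static List<ResultDoc> mergeDistinct(List<List<ResultDoc>> perServerResults) {
        // LinkedHashSet keeps first-seen order while discarding duplicates.
        Set<ResultDoc> seen = new LinkedHashSet<>();
        for (List<ResultDoc> results : perServerResults) {
            seen.addAll(results);
        }
        return new ArrayList<>(seen);
    }
}
```

Because a document is written to several replicas, a search across all servers can return it several times; merging through the set leaves one copy of each.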

[jira] Commented: (SOLR-471) Distributed Solr Client

Prajeeth Emanuel (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566144#action_12566144 ]

Yonik Seeley commented on SOLR-471:
-----------------------------------

Hi Trung, have you had a look at SOLR-303?
It implements distributed search in Solr itself... I think that may have a couple of advantages:
- if it's in Solr, any type of client can use it
- possible (but not easy) for custom components to be distributed
- access to schema for proper sorting
- easier multi-tier distributed search

I've been thinking about the indexing side recently too.  Longer term we need something very robust (fault tolerant on the indexing side, ability to resize the server pool, ability to self-synchronize among shards, etc.).  In the short term I was thinking of something that simply fanned out requests to a list of servers based on a simple hash (no need for consistent hash in this simple scheme).  I originally thought about having this simple fan-out indexer reside outside Solr, but it occurred to me that if we wanted to support all of Solr's input types (multi-doc XML, CSV, etc.), it should probably happen inside Solr after the doc had been parsed.
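
For comparison, the simple-hash fan-out could be as small as the sketch below. The class name SimpleHashRouter and the use of String.hashCode() on the document id are assumptions made for illustration; nothing here is taken from Solr itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: routes each document id to one server by hash modulo pool size. */
final class SimpleHashRouter {
    private final List<String> servers;

    SimpleHashRouter(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        // Defensive copy so later changes to the caller's list do not alter routing.
        this.servers = new ArrayList<>(servers);
    }

    String serverFor(String docId) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(docId.hashCode(), servers.size());
        return servers.get(index);
    }
}
```

The trade-off is that changing the number of servers changes the modulus, so most ids map to a different server afterwards. That is acceptable while the pool is fixed, and it is the case consistent hashing is meant to handle when the pool has to be resized.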

[jira] Commented: (SOLR-471) Distributed Solr Client

Prajeeth Emanuel (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566666#action_12566666 ]

Nguyen Kien Trung commented on SOLR-471:
----------------------------------------

Thanks, Yonik. Actually, I did have a glance at SOLR-303.
I'm working on a Java project that requires interaction with multiple customized Solr instances, and the solution that SOLR-303 offers did not meet that requirement. So I made this workaround, thinking the patch may be helpful to others who are in the same situation.

I'm quite new to Solr but very excited about the promising features that Solr is going to achieve.
