Document Security Model Question

Document Security Model Question

kchellappa
I had earlier posted a similar discussion on LinkedIn, and David Smiley rightly advised me that solr-user is a better place for technical discussions.

----------------------------------

Our hosted product supports searching educational resources. Our customers can make specific resources unavailable to their users, and availability also depends on licensing. Our current solution uses the database's full-text search support and handles availability as part of the SQL.

My task is to move the search from the database's full-text search into Solr. I searched through earlier posts, found some that were loosely related, and am thinking along the following lines:

a) Use the authorization model. I can add fields such as allow and/or deny to the index, each containing a list of customers. At query time, I can add a constraint based on the customer ID. I am concerned about performance if these fields hold a lot of values, and this approach also requires constant reindexing whenever a value in one of these fields changes.

b) Use a query-time join. Keep the resource-to-customer availability mapping in separate inner documents. We are planning to deploy on SolrCloud, and I have read about some challenges with query-time joins on SolrCloud, so this may not work for us.

c) Other ideas?
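
To make option (a) concrete, here is a minimal sketch of how the query-time constraint might be built. It assumes the index has multi-valued string fields named `allow` and `deny` holding customer IDs; those field names, the helper names, and the sample customer ID are illustrative assumptions, not something settled in this thread.

```python
from urllib.parse import urlencode


def quote_term(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def acl_params(user_query: str, customer_id: str) -> dict:
    """Build Solr request parameters with an ACL filter query (fq).

    Assumes hypothetical `allow` and `deny` fields: the document must list
    the customer in `allow` and must not list the customer in `deny`.
    """
    term = quote_term(customer_id)
    return {"q": user_query, "fq": f"+allow:{term} -deny:{term}"}


params = acl_params("photosynthesis", "cust42")
# URL-encoded query string that could be appended to a /select request
print(urlencode(params))
```

Keeping the restriction in `fq` rather than in `q` means it does not influence relevance scoring, and Solr can cache the filter per customer.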
 
Excerpts from David Smiley's response:

You're right that there may be some re-indexing as security rules change. If many Lucene/Solr documents share identical access control with other documents, then it may make more sense to externally determine which unique set of access-control sets the user has access to, then finally search by id -- which will hopefully not be a huge number. I've seen this done both externally and with a Solr core to join on.
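
One way to read this suggestion: each document carries the ID of the access-control set it belongs to, and the customer's accessible set IDs are resolved outside Solr before searching. The sketch below assumes a hypothetical integer field `acl_set_id` and an in-memory mapping standing in for the external lookup; both are assumptions made for illustration.

```python
from typing import Dict, Optional, Set

# Hypothetical stand-in for the external lookup (e.g. the customer database):
# customer ID -> IDs of the access-control sets that customer may see.
CUSTOMER_ACL_SETS: Dict[str, Set[int]] = {
    "cust42": {7, 3},
    "cust99": set(),
}


def acl_set_filter(customer_id: str, field: str = "acl_set_id") -> Optional[str]:
    """Return a filter query restricting results to the customer's ACL sets.

    Returns None when the customer has no accessible sets, in which case the
    caller can skip the search and return an empty result.
    """
    ids = sorted(CUSTOMER_ACL_SETS.get(customer_id, set()))
    if not ids:
        return None
    joined = " OR ".join(str(i) for i in ids)
    return f"{field}:({joined})"


print(acl_set_filter("cust42"))
```

With this layout, a rule change for a customer only touches the external mapping; documents need reindexing only when they move to a different access-control set.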



Re: Document Security Model Question

Rajinimaski
Hi,

For the case "it requires constant reindexing if a value in this field
changes": if the ACLs for documents keep changing, a Solr PostFilter is one
option. We use it in our system, which has close to a billion documents
and approximately 5,000 users.


But it is important to check how frequently the ACLs change and choose a
solution based on that. The first option in your list works efficiently
without affecting search performance. If the values change infrequently,
re-indexing only the affected documents should not be a concern. If changes
are frequent, a PostFilter can be used instead, though it will add some
delay to queries.
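
For readers unfamiliar with the idea, here is a small simulation of the post-filtering control flow. Solr's actual PostFilter is a Java interface; this Python sketch is not that API and only illustrates the principle that the ACL check runs last, against documents that already matched the main query. The function name and the shape of the cache are assumptions.

```python
from typing import Dict, Iterable, Iterator, Set


def post_filter(
    candidates: Iterable[str],
    user: str,
    acl_cache: Dict[str, Set[str]],
) -> Iterator[str]:
    """Yield only the candidate document IDs the user is allowed to see.

    `candidates` are IDs that already matched the main query, so the
    per-document ACL check is paid only for those, not for the whole index.
    `acl_cache` maps document ID -> set of users allowed to see it.
    """
    for doc_id in candidates:
        if user in acl_cache.get(doc_id, set()):
            yield doc_id


# Documents missing from the cache are treated as not visible.
acl_cache = {"d1": {"u1"}, "d2": {"u2"}}
visible = list(post_filter(["d1", "d2", "d3"], "u1", acl_cache))
print(visible)
```

Because the ACLs live outside the index in this model, a permission change needs no reindexing; the trade-off is the extra per-document check at query time.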


Thanks

On Fri, Nov 15, 2013 at 4:32 AM, kchellappa <[hidden email]> wrote:

> [...]

Re: Document Security Model Question

kchellappa
Thanks, Rajinimaski, for the response.

Agreed that if the changes are frequent, the first option wouldn't work efficiently. The other challenge is that, in our case, for each resource it is easier and more efficient to get a list of changes since the last checkpoint (because of our deployment model for customer databases) than to get a snapshot of allowed/disallowed status across all customers.


In your PostFilter implementation, do you cache the ACLs in memory, have them updated periodically from outside Solr, and have the post filter just consult the cache, or something along those lines?
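
To illustrate the kind of cache I have in mind, here is a rough sketch of an in-memory ACL cache that is updated from a list of changes since the last checkpoint rather than from full snapshots. The class name, the change tuple layout, and the integer checkpoint are all assumptions for illustration; this is not a description of Rajinimaski's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set, Tuple

# One change record: (resource_id, customer_id, is_allowed)
Change = Tuple[str, str, bool]


@dataclass
class AclCache:
    """In-memory map of resource ID -> customers allowed to see it."""

    allowed: Dict[str, Set[str]] = field(default_factory=dict)
    checkpoint: int = 0  # marker of the last change batch applied

    def apply_changes(self, changes: Iterable[Change], new_checkpoint: int) -> None:
        """Apply changes recorded since the last checkpoint, in order."""
        for resource_id, customer_id, is_allowed in changes:
            customers = self.allowed.setdefault(resource_id, set())
            if is_allowed:
                customers.add(customer_id)
            else:
                customers.discard(customer_id)
        self.checkpoint = new_checkpoint

    def can_see(self, customer_id: str, resource_id: str) -> bool:
        """Check used by the filter for each candidate document."""
        return customer_id in self.allowed.get(resource_id, set())


cache = AclCache()
cache.apply_changes([("r1", "c1", True), ("r1", "c2", True)], new_checkpoint=1)
cache.apply_changes([("r1", "c1", False)], new_checkpoint=2)
print(cache.can_see("c2", "r1"), cache.can_see("c1", "r1"))
```

A periodic job outside Solr would fetch the changes after `checkpoint` and call `apply_changes`, while the filter only ever calls `can_see`.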