Conceptual Question

Andreas Balke [Digiden GmbH]
Hey all,

I checked out Solr and I'm pretty amazed, since it could save us a lot
of work. We are working on a document management system and are currently
changing the document structure so that it validates against predefined
schemas. Each document will consist of several 'complex types', which are
comparable to XML snippets (e.g. a two-column set of image and list, a
two-column list of links and paragraph, ...).

There are two main points that I'm asking myself (and discussing with my
colleagues), which will decide whether we spend more time on this
solution or not. In short, here they are:

* We will have content that is nested much like a fragment of HTML.
That means it will be much more deeply nested than the examples given
for Solr. I guess this will be possible via the import schema.

The important point (and first question) is that our schema is very
likely to change. That means we will have 'revisions' of documents,
where each revision has its own, slightly different schema.
Structuring the documents themselves wouldn't be a problem, I guess, as
we could define 'id' and 'rev' as unique in combination. But how can we
handle the revision-dependent schema? Is there a good way to do this?

* The second big problem is also related to versioning and the changing
schemas. Assume we have a lot of documents that are built from document
type 'Foo' rev 1. Now we decide that the schema of Foo changes, so the
documents already stored become invalid. To solve this, we will create
some kind of update procedure that migrates all documents to the new
schema.

Is there a way to solve this problem without fetching all Foo:rev:1
documents and re-importing them as Foo:rev:2 documents? As I write this,
it seems to me that this is a stupid question, since there is no change
interface. Nevertheless, do you see any problems here if approx. 10000
documents are affected at once?

I would be very happy to hear any opinions on my questions.

thank you very much,
andi
 

--
 Andreas Balke // Lead Developer
Digiden GmbH • Agentur für Kommunikationslösungen
In der Backfabrik • Saarbrückerstraße 37b • D-10405 Berlin
Fon: +49 (30) 446 749 425 • Fax: +49 (30) 446 749 479
www.digiden.de

HRB 96276 B • Geschäftsführer: Mike Petersen




Re: Conceptual Question

Yonik Seeley-2
On 6/14/07, Andreas Balke [Digiden GmbH] <[hidden email]> wrote:
> the important point (and first question) is that our schema is very
> likely to change. that means we will have 'revisions' of documents
> whereas each revision has its own, slightly different, schema.
> structuring the documents itself wouldn't be a problem i guess, as we
> could define an 'id' and 'rev' as unique in combination. but how can we
> handle the revision dependent schema? is there a good way for such thing?

Make changes to the schema in a backward compatible way.
You can easily add new fields to a schema without any impact to
existing documents.

However, if you change the type of an existing field, or how it's
analyzed, then it doesn't make sense for documents before the change
and after to coexist (both sets would not be searchable in a
consistent manner).
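
To illustrate why documents indexed before and after an analysis change cannot be searched consistently, here is a minimal sketch in Python. It is a toy inverted index, not Solr or Lucene code; the names `analyze_v1`, `analyze_v2` and `ToyIndex` are made up for this example, and the "analysis change" is assumed to be nothing more than adding lowercasing.

```python
from collections import defaultdict


def analyze_v1(text):
    """Old analysis (assumed): split on whitespace, keep case as-is."""
    return text.split()


def analyze_v2(text):
    """New analysis (assumed): split on whitespace and lowercase tokens."""
    return [token.lower() for token in text.split()]


class ToyIndex:
    """A tiny inverted index: token -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text, analyzer):
        # The tokens written to the index depend on the analyzer in use
        # at index time.
        for token in analyzer(text):
            self.postings[token].add(doc_id)

    def search(self, query, analyzer):
        # All query tokens must match (AND semantics).
        tokens = analyzer(query)
        if not tokens:
            return set()
        result = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.postings.get(token, set())
        return result


index = ToyIndex()
index.add("foo-1", "Solr Schema", analyze_v1)  # indexed before the change
index.add("foo-2", "Solr Schema", analyze_v2)  # indexed after the change

# Same text in both documents, but neither analyzer finds both of them.
hits_new = index.search("Solr", analyze_v2)
hits_old = index.search("Solr", analyze_v1)
print(hits_new, hits_old)
```

Both documents contain the same text, yet a query analyzed the new way only matches the document indexed the new way, and vice versa. Reindexing the old documents with the new analysis removes the mismatch.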

> * the second big problem is also related to the versioning and its
> changing schemas. assume we will have a lot of documents that are built
> of a document type 'Foo' rev 1. now we decide that the schema of Foo
> changes, the documents, already stored, become invalid somehow. to solve
> this, we will create some kind of update procedure that fits all
> documents to the new schema.

The easiest way is to simply delete and reindex all the docs that should change.
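
As a rough sketch of what "delete and reindex" could look like, the following Python builds the XML update messages (`<delete>`, `<add>`, `<commit/>`) that would then be posted to Solr's update handler. The field names `key`, `id`, `type`, `rev` and `title`, the combined `id:rev` key, and the `migrate_foo` conversion are assumptions for illustration only; the documents are assumed to come from the original source system, not from the index.

```python
import xml.etree.ElementTree as ET


def delete_by_query(query):
    """Build a <delete><query>...</query></delete> update message."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "query").text = query
    return ET.tostring(delete, encoding="unicode")


def add_docs(docs):
    """Build an <add> message with one <doc> per dictionary."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            ET.SubElement(doc_el, "field", name=name).text = str(value)
    return ET.tostring(add, encoding="unicode")


def migrate_foo(doc):
    """Hypothetical rev 1 -> rev 2 conversion of a 'Foo' document."""
    migrated = dict(doc)
    migrated["rev"] = 2
    # Assumed convention: id and rev combined into a single unique key.
    migrated["key"] = f"{doc['id']}:2"
    return migrated


# Documents as they exist in the source system (made-up sample data).
old_docs = [
    {"key": "42:1", "id": 42, "type": "Foo", "rev": 1, "title": "Hello"},
]

messages = [
    delete_by_query("type:Foo AND rev:1"),
    add_docs([migrate_foo(doc) for doc in old_docs]),
    "<commit/>",
]

for message in messages:
    print(message)
```

Each message would be sent as the body of an HTTP POST to the update handler; the transport is left out here so the sketch stays self-contained.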

> will there be a way to solve this problem without fetching all Foo:rev:1
> documents and re-importing them as Foo:rev:2 documents? as i write this,
> it seems to me that this is a stupid question since there is no change
> interface.

There is a change interface in JIRA, as long as all of the fields
originally sent are stored.
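
The condition about stored fields matters because a document can only be rebuilt from the values the index can return. A toy model in Python (this is not the API of the JIRA issue; `rebuild_document` and the sample fields are made up) shows what would be lost if a field were indexed but not stored:

```python
def rebuild_document(stored_fields, changes):
    """Rebuild a document from its stored values plus the changed fields."""
    updated = dict(stored_fields)
    updated.update(changes)
    return updated


# The document as it was originally sent (made-up sample data).
original = {"id": "42", "title": "Hello", "body": "some long text"}

# Assumption: 'body' was indexed but not stored, so it cannot be read back.
stored_names = {"id", "title"}
stored = {name: value for name, value in original.items()
          if name in stored_names}

rebuilt = rebuild_document(stored, {"title": "Hello again"})
lost = sorted(set(original) - set(rebuilt))
print(rebuilt, lost)
```

Here the rebuilt document no longer has a `body` field, so reindexing it would silently drop that content. If every field is stored, nothing is lost.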

-Yonik

Re: Conceptual Question

Frédéric Glorieux
Hi Yonik,

Sorry to jump on an old post.

> There is a change interface in JIRA, as long as all of the fields
> originally sent are stored.

Do you remember the JIRA issue, or a keyword to find it? It sounds useful
in some cases, for example when you are working on analyzers. That
could become a real-life case for me in the future.

--
Frédéric Glorieux
École nationale des chartes
direction des nouvelles technologies et de l'informatique

Re: Conceptual Question

Chris Hostetter-3

: > There is a change interface in JIRA, as long as all of the fields
: > originally sent are stored.
:
: Do you remember the JIRA issue, or a token to find it ? It sounds useful
: in some cases, for example, when you are working on analysers. That
: could be real life for me in future.

https://issues.apache.org/jira/browse/SOLR-139


-Hoss