For an "XML" fieldtype

Frédéric Glorieux
Hi all,

And thanks for all the work done.

Several of us need an "XML" fieldType. I have implemented something that
works for me, and I have some questions:
  * did I miss an existing feature?
  * is this the right approach?
  * is it generic enough to be committed?

Use cases for storing XML in a field:
  * for short records, such as bibliographic entries or products, where
the document is only metadata, a search result could be the full original
document. A catalog is then just a query, and Lucene could store the
records instead of a separate persistence layer;
  * an abstract may contain a link or other tags that are important to
display in search results;
  * a field may contain structured data, not only for display (e.g.
personal information), and XML is easier to parse than a text format when
the results are processed with XSLT;
...

Proposed solution:
The indexing framework uses a fast XML pull parser. For performance, the
field content should be kept as a string (or CDATA), but an XML fieldType
could avoid escaping when the field is written out in the results.

This could be an XMLField, very close to StrField (XML is not a good
source for tokenization).
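
To make the difference concrete, here is a minimal, self-contained sketch
of the two output behaviours. It does not use Solr's actual FieldType or
response-writer classes: the class XmlFieldSketch, the methods writeStr
and writeXml, and the <fragment> element are all hypothetical names chosen
for illustration. It also assumes the stored value is already well-formed
XML, since the idea above is to keep it as a plain string without
re-parsing it.

```java
// Hypothetical sketch only: these names are NOT part of Solr's API.
final class XmlFieldSketch {

    /** Escapes the characters that are significant in XML text content. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    /** StrField-like behaviour: the stored value is escaped on output. */
    static String writeStr(String name, String stored) {
        // Assumes 'name' is a simple field name that needs no escaping.
        return "<str name=\"" + name + "\">" + escape(stored) + "</str>";
    }

    /**
     * XMLField-like behaviour: the stored value is copied verbatim, so
     * tags inside it stay tags in the response. Assumes the stored value
     * is well-formed; nothing is checked here.
     */
    static String writeXml(String name, String stored) {
        return "<fragment name=\"" + name + "\">" + stored + "</fragment>";
    }

    public static void main(String[] args) {
        String stored = "See <a href=\"http://example.org/\">the catalogue</a>.";
        System.out.println(writeStr("abstract", stored));
        System.out.println(writeXml("abstract", stored));
    }
}
```

With the escaped form, a client has to unescape and re-parse the value
before an XSLT stylesheet can match the link; with the verbatim form the
link is already an element in the result tree.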

Thanks for any comments.

--
Frédéric Glorieux
École nationale des chartes
Direction des nouvelles technologies et de l'informatique

Re: For an "XML" fieldtype

Frédéric Glorieux
Hi all,

Sorry to repost on this issue.
Is there a standard way to use a field to store the XML source of a document?
If not, is a custom fieldType the solution?

Or is this a "solr-user" question?

Sorry if I have posted in the wrong place.

--
Frédéric Glorieux
École nationale des chartes
Direction des nouvelles technologies et de l'informatique

