Default XML Output Schema

sangraal
Perhaps a silly question, but I'm wondering if anyone can tell me why Solr
outputs XML like this:

<doc>
<int name="id">201038</int>
<int name="siteId">31</int>
<date name="modified">2006-09-15T21:36:39.000Z</date>
</doc>

rather than like this:

<doc>
<id type="int">201038</id>
<siteId type="int">31</siteId>
<modified type="date">2006-09-15T21:36:39.000Z</modified>
</doc>

A front-end PHP developer I know is having trouble parsing the default Solr
output because of that format and mentioned it would be much easier in the
latter format... so I was curious whether there is a reason it is the way it is.

-Sangraal

Re: Default XML Output Schema

Yonik Seeley-2
On 9/21/06, sangraal aiken <[hidden email]> wrote:
> Perhaps a silly question, but I'm wondering if anyone can tell me why Solr
> outputs XML like this:

During the initial development of Solr (2004), I remember putting both
options forward, and most developers preferred to have a limited number
of well-defined tags.

It allows you to have rather arbitrary field names, which you couldn't
have if you used the field name as the tag.
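
To illustrate the point about arbitrary field names, here is a minimal Python
sketch. The field name "2006 price" is hypothetical and chosen only because it
starts with a digit and contains a space, which an XML element name may not do.

```python
import xml.etree.ElementTree as ET

# Hypothetical field name: fine as an attribute value, illegal as a tag name.
field_name = "2006 price"

# Attribute style (the default Solr output): the name is just attribute text.
attr_style = '<doc><int name="%s">31</int></doc>' % field_name
doc = ET.fromstring(attr_style)
parsed_name = doc[0].get("name")

# Tag style: the same field name does not form a well-formed element.
tag_style = "<doc><%s>31</%s></doc>" % (field_name, field_name)
try:
    ET.fromstring(tag_style)
    tag_style_ok = True
except ET.ParseError:
    tag_style_ok = False
```

With the attribute style the parser never has to treat the field name as
markup, so any string the index allows can be carried through.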

It also allows consistency with custom data.  For example, here is the
representation of an array of integers:
<arr><int>1</int><int>2</int></arr>
If field names were used as tags, we would have to either make up a
dummy name, or we wouldn't be able to use the same style.
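
One consequence of the fixed tag set is that a client can convert values by
looking at the tag alone, including unnamed array members. A rough Python
sketch, assuming only the int, date and arr tags seen in this thread (the
"siteIds" field is made up for the example):

```python
import xml.etree.ElementTree as ET

def to_value(elem):
    """Convert one typed element to a Python value, based only on its tag."""
    if elem.tag == "arr":
        # Array members carry a type tag but no name attribute.
        return [to_value(child) for child in elem]
    if elem.tag == "int":
        return int(elem.text)
    # Other types (date, str, ...) are left as text in this sketch.
    return elem.text

def doc_to_dict(doc_elem):
    """Map each field's name attribute to its converted value."""
    return {child.get("name"): to_value(child) for child in doc_elem}

sample = """<doc>
<int name="id">201038</int>
<arr name="siteIds"><int>31</int><int>32</int></arr>
<date name="modified">2006-09-15T21:36:39.000Z</date>
</doc>"""

record = doc_to_dict(ET.fromstring(sample))
```

The same to_value function handles a top-level field and an array member,
which is the consistency being described.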


> <doc>
> <int name="id">201038</int>
> <int name="siteId">31</int>
> <date name="modified">2006-09-15T21:36:39.000Z</date>
> </doc>
>
> rather than like this:
>
> <doc>
> <id type="int">201038</id>
> <siteId type="int">31</siteId>
> <modified type="date">2006-09-15T21:36:39.000Z</modified>
> </doc>
>
> A front-end PHP developer I know is having trouble parsing the default Solr
> output because of that format and mentioned it would be much easier in the
> latter format... so I was curious whether there is a reason it is the way it is.

There are a number of options for you.
You could write your own QueryResponseWriter to output XML just as you
like it, or use an XSLT stylesheet in conjunction with
http://issues.apache.org/jira/browse/SOLR-49
or use another format such as JSON.

-Yonik

Re: Default XML Output Schema

sangraal
Thanks for the great explanation, Yonik. I passed it on to my colleagues for
reference... I knew there was a good reason.

-Sangraal

Re: Re: Default XML Output Schema

Tim Archambault-2
This structure was an obstacle for me at first too, working in ColdFusion.
However, I was able to write a function that dynamically builds a
query recordset for both facets and search results and will accommodate
new or additional fields at any time. If I can do it, any reasonable
programmer can handle it.
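
For readers who want to picture such a function, here is a minimal sketch in
Python rather than ColdFusion. It is not the code described above: the
build_recordset name, the <result> wrapper, and the second sample document are
assumptions made for illustration. Columns are discovered from the name
attributes, so a new field simply becomes a new column.

```python
import xml.etree.ElementTree as ET

def build_recordset(xml_text):
    """Return (columns, rows) for every <doc> element in the response."""
    root = ET.fromstring(xml_text)
    columns = []   # field names in order of first appearance
    records = []   # one dict per document
    for doc in root.iter("doc"):
        record = {}
        for field in doc:
            name = field.get("name")
            if name not in columns:
                columns.append(name)
            record[name] = field.text
        records.append(record)
    # Fields missing from a document become None in its row.
    rows = [[record.get(col) for col in columns] for record in records]
    return columns, rows

sample = """<result>
<doc><int name="id">201038</int><int name="siteId">31</int></doc>
<doc><int name="id">201039</int><str name="title">hello</str></doc>
</result>"""

columns, rows = build_recordset(sample)
```

Values are kept as text here; type conversion based on the tag could be
layered on top if needed.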
