OAI on SOLR already done?

OAI on SOLR already done?

Paul Libbrecht-4

Hello list,

I've come across a few Google results indicating that some Solr-based servers implement the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).

Is there a reusable implementation available as an add-on to Solr?

Thanks in advance,

paul

Re: OAI on SOLR already done?

Péter Király
Hi,

I don't know whether it fits your needs, but we are building a tool
based on Drupal (the eXtensible Catalog Drupal Toolkit) that can harvest
records via OAI-PMH and index them into Solr. The records are
harvested, processed, and stored in MySQL, and then we index them into
Solr. We provide several ways to manipulate the original values before
they are sent to Solr. The toolkit is modular, so you can change
settings in an admin interface or write your own "hooks" (special
Drupal functions) to tailor the application to your needs. We support
only Dublin Core and our own FRBR-like schema (called the XC schema), but
you can add more schemas. Since this forum is about Solr and not about
applications using Solr, if you are interested in this tool, please write
me a private message, or visit http://eXtensibleCatalog.org or the
module's page at http://drupal.org/project/xc.

Hope this helps,

Péter
eXtensible Catalog


Re: OAI on SOLR already done?

Paul Libbrecht-4
Péter,

I'm afraid your tool is a harvester, whereas I am looking for an OAI-PMH provider (server) service.

Your project did appear early in the Google results.

paul



Re: OAI on SOLR already done?

Péter Király
Hi Paul,

Yes, you are right: the project is about harvesting, not about being harvestable.

Péter


Re: OAI on SOLR already done?

Jonathan Rochkind
In reply to this post by Paul Libbrecht-4
The trick is that you can't just have a generic black-box OAI-PMH
provider on top of any Solr index. How would it know where to get the
metadata elements it needs, such as the title or the last-updated date?
Any given Solr index might not even have these in stored fields, and a
given app might want to look them up from somewhere other than stored
fields.

If the Solr index does have them in stored fields, and you do want to
get them from the stored fields, then it's, I think (famous last words),
relatively straightforward code to write: a mapping from Solr stored
fields to the metadata elements needed for OAI-PMH, and then simply
outputting the XML template with those filled in.
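
To make the mapping idea concrete, here is a minimal sketch in Python, written as an external client rather than anything built into Solr. The Solr field names (`id`, `last_modified`, `title`, `author`, `description`), the `FIELD_MAP` table, and the function name are all hypothetical; a real index would need its own mapping, and a full provider would also have to implement the OAI-PMH verbs, resumption tokens, and error responses. The sketch only shows how one already-fetched Solr document (as a dict of stored fields) could be turned into an OAI-PMH `record` carrying `oai_dc` metadata.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)

# Hypothetical mapping: Solr stored field name -> Dublin Core element name.
FIELD_MAP = {
    "title": "title",
    "author": "creator",
    "description": "description",
}


def solr_doc_to_oai_record(doc, id_field="id", date_field="last_modified"):
    """Build an OAI-PMH <record> element from a dict of Solr stored fields."""
    record = ET.Element(f"{{{OAI_NS}}}record")

    # The header needs a unique identifier and a datestamp.
    header = ET.SubElement(record, f"{{{OAI_NS}}}header")
    ET.SubElement(header, f"{{{OAI_NS}}}identifier").text = str(doc[id_field])
    ET.SubElement(header, f"{{{OAI_NS}}}datestamp").text = str(doc[date_field])

    # The metadata section wraps the Dublin Core elements.
    metadata = ET.SubElement(record, f"{{{OAI_NS}}}metadata")
    dc = ET.SubElement(metadata, f"{{{OAI_DC_NS}}}dc")
    for solr_field, dc_element in FIELD_MAP.items():
        value = doc.get(solr_field)
        if value is None:
            continue  # field not stored for this document
        # Multi-valued Solr fields come back as lists.
        values = value if isinstance(value, list) else [value]
        for v in values:
            ET.SubElement(dc, f"{{{DC_NS}}}{dc_element}").text = str(v)
    return record


if __name__ == "__main__":
    # Made-up document, standing in for one entry of a Solr response.
    example = {
        "id": "oai:example.org:1",
        "last_modified": "2011-02-02T20:46:00Z",
        "title": "An example record",
        "author": ["First Author", "Second Author"],
    }
    print(ET.tostring(solr_doc_to_oai_record(example), encoding="unicode"))
```

Keeping the mapping in a plain table is what would make such a tool configurable per index: only `FIELD_MAP` and the two header field names change from one Solr schema to another.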

I am not aware of anyone who has done this as a reusable tool that you
can configure for your own Solr index. You could possibly do it
solely with Solr's built-in templating facilities (JSP, XSLT, or other
templating options I am not familiar with) rather
than as an external Solr client app, or it could be an external Solr
client app.

This is actually very similar to something someone else asked
a few days ago: "Does anyone have an OpenSearch add-on for Solr?" It is
the same kind of problem, just with a different XML output template
(usually RSS or Atom) instead of OAI-PMH.


RE: OAI on SOLR already done?

Demian Katz
I already replied to the original poster off-list, but it seems that it may be worth weighing in here as well...

The next release of VuFind (http://vufind.org) is going to include OAI-PMH server support.  As you say, there is really no way to plug OAI-PMH directly into Solr...  but a tool like VuFind can provide a fairly generic, extensible, Solr-based platform for building an OAI-PMH server.  Obviously this is helpful for some use cases and not others...  but I'm happy to provide more information if anyone needs it.

- Demian

Re: OAI on SOLR already done?

gearond
Does something like this work to extract dates, phone numbers, addresses across
international formats and languages?

Or, just in the plain ol' USA?

 Dennis Gearon


Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a better
idea to learn from others’ mistakes, so you do not have to make them yourself.
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.




Re: OAI on SOLR already done?

Jonathan Rochkind
On 2/2/2011 5:19 PM, Dennis Gearon wrote:
> Does something like this work to extract dates, phone numbers, addresses across
> international formats and languages?
>
> Or, just in the plain ol' USA?

What are you talking about? Nothing discussed in this thread
does any 'extracting' of dates, phone numbers, or addresses at all,
whether in international or domestic formats.


Re: OAI on SOLR already done?

Paul Libbrecht-4
In reply to this post by gearond
I would think OAI certainly has a trans-national format for dates.
And that probably dovetails well with Solr's own date format.

But none of that is user-facing, so in principle there is no culture dependency.
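
As a small illustration of how closely the two formats line up: OAI-PMH datestamps are UTC values in ISO 8601 form (`YYYY-MM-DDThh:mm:ssZ`, or just `YYYY-MM-DD` at day granularity), and Solr date fields are returned in the same UTC form, optionally with fractional seconds. The Python sketch below assumes seconds granularity on the OAI side; the function name is made up for this example.

```python
from datetime import datetime, timezone


def solr_date_to_oai_datestamp(solr_date):
    """Convert a Solr UTC date string to an OAI-PMH datestamp (seconds granularity)."""
    # Solr may include fractional seconds, e.g. 2011-02-02T22:19:00.123Z.
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in solr_date else "%Y-%m-%dT%H:%M:%SZ"
    parsed = datetime.strptime(solr_date, fmt).replace(tzinfo=timezone.utc)
    # Drop any fractional part; OAI-PMH datestamps stop at whole seconds.
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    print(solr_date_to_oai_datestamp("2011-02-02T22:19:00.123Z"))
```

Since both sides are always in UTC and neither is locale-formatted, no language or country handling is involved.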

paul


On 2 Feb 2011, at 23:19, Dennis Gearon wrote:

> Does something like this work to extract dates, phone numbers, addresses across
> international formats and languages?
>
> Or, just in the plain ol' USA?


Re: OAI on SOLR already done?

gearond
In reply to this post by Jonathan Rochkind
I guess I didn't understand 'metadata'. That's why I asked the question.

 Dennis Gearon

