How to index data from multiple data sources

Ing. Yusniel Hidalgo Delgado

Dear Lucene/Solr community,

I have recently started working with Solr and need help with the following usage scenario. I am working on a project to extract and search bibliographic metadata from PDF files. First, the PDF files are processed to extract bibliographic metadata such as title, authors, affiliations, keywords, and abstract. This metadata is stored in a relational database and then indexed in Solr via DIH (DataImportHandler). However, I also need to index the full text of each PDF and keep the same ID for the metadata indexed through DIH and the full text of the PDF in the Solr index. How can I do that? How should I configure solrconfig.xml and schema.xml for this?
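
To make the goal concrete, here is a minimal sketch of one possible approach, not a confirmed solution: after DIH has indexed the metadata, the full text is added to the same document with a Solr atomic update keyed by the shared ID. The field name `fulltext`, the use of the database primary key as the Solr `id`, and the helper names are all assumptions for illustration. Atomic updates also assume that the update log is enabled in solrconfig.xml and that the other fields are stored, so that Solr can rebuild the document.

```python
import json


def build_atomic_update(doc_id, fulltext, field="fulltext"):
    """Build one Solr atomic-update document.

    The "set" modifier replaces the value of `field` on the document
    whose unique key equals `doc_id`, leaving the other fields as they are.
    `field` is a hypothetical field name, not one from the original post.
    """
    return {"id": str(doc_id), field: {"set": fulltext}}


def build_update_payload(fulltext_by_id, field="fulltext"):
    """Serialize a mapping of ID -> extracted text as a JSON update body.

    The resulting string is meant to be POSTed to the core's /update
    handler with Content-Type application/json.
    """
    items = sorted(fulltext_by_id.items(), key=lambda kv: str(kv[0]))
    docs = [build_atomic_update(doc_id, text, field) for doc_id, text in items]
    return json.dumps(docs, ensure_ascii=False)


if __name__ == "__main__":
    # The ID must match the one DIH used for the metadata row.
    payload = build_update_payload({"42": "Full text extracted from the PDF"})
    print(payload)
```

The sketch only builds the request body. Extracting the text from the PDF and sending the request to Solr are left out, since they depend on the tools already used in the project.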

Thanks in advance.

Best regards.

Yusniel Hidalgo Delgado
Semantic Web Research Group
University of Informatics Sciences 
Havana, Cuba
