lucene searching in pdf

anton feldmann
I am writing a program to search inside PDF documents. I have problems
generating an index from a large number of PDF documents. I want to
store more than one PDF file in a single index, and I want the program
to return, for each hit: 1. the file (absolute path), 2. the word and
lexeme, 3. the score, and 4. the line. How do I get n PDF documents
stored in one index with 1, 2, and 4?
I wrote a program that builds an index of my filesystem, and I can
search that index to find files. However, I cannot read PDF files and
parse them with Lucene.
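
To make the requested data model concrete, here is a minimal plain-Java sketch. It is not Lucene code and does not parse PDFs: it assumes the text of each PDF has already been extracted by a separate library, and the names LineIndex, Hit, addDocument, and search are hypothetical. It only illustrates one way to keep n documents in a single index so that every hit carries the absolute path (1), the word (2), and the line number (4). It does no stemming and computes no score. In Lucene, a comparable layout would probably be one indexed document per line or page, with the path and line number stored as fields.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of a line-level inverted index; not part of Lucene. */
class LineIndex {

    /** One hit: the file it came from, the 1-based line number, and the term. */
    static final class Hit {
        final String path;
        final int line;
        final String term;

        Hit(String path, int line, String term) {
            this.path = path;
            this.line = line;
            this.term = term;
        }
    }

    // term -> every (path, line) occurrence, across all added documents
    private final Map<String, List<Hit>> postings = new HashMap<>();

    /** Adds the already-extracted text of one document under its path. */
    void addDocument(String path, String text) {
        String[] lines = text.split("\\R"); // split on any line break
        for (int i = 0; i < lines.length; i++) {
            // Lowercase, then split on anything that is not a letter or digit.
            String[] tokens = lines[i].toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
            for (String token : tokens) {
                if (token.isEmpty()) {
                    continue; // produced by leading punctuation or an empty line
                }
                List<Hit> hits = postings.get(token);
                if (hits == null) {
                    hits = new ArrayList<>();
                    postings.put(token, hits);
                }
                hits.add(new Hit(path, i + 1, token));
            }
        }
    }

    /** Returns all occurrences of the word, in the order they were indexed. */
    List<Hit> search(String word) {
        List<Hit> hits = postings.get(word.toLowerCase(Locale.ROOT));
        return hits == null ? Collections.<Hit>emptyList() : hits;
    }

    public static void main(String[] args) {
        LineIndex index = new LineIndex();
        index.addDocument("/docs/a.pdf", "Der Hund bellt.\nDie Katze jagt den Hund.");
        index.addDocument("/docs/b.pdf", "Kein Treffer hier.\nEin Hund im Garten.");
        for (Hit hit : index.search("Hund")) {
            System.out.println(hit.path + " line " + hit.line + ": " + hit.term);
        }
    }
}
```

Because each posting keeps its own path and line, adding another document never overwrites earlier ones; all files share the same term map.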

I would also like to have an analyzer for every language that Lucene supports.

       IndexWriter write = new IndexWriter(index, new GermanAnalyzer(),

At the moment I use only the GermanAnalyzer.


anton feldmann