Could we even avoid inlining the tika-core and
tika-parsers jars, or is that something that's
needed for the Export-Package rules to work? If
the latter, can we exclude org.apache.tika.parser
subpackages from being exported so that only
tika-core gets inlined?
I have created TIKA-342 with a patch that embeds all dependencies,
tweaks the exports by using a different export instruction
(<_exportcontents>), and explicitly imports org.w3c.dom so that
xmlbeans does not have to be inlined. This seems to work.
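
For readers who have not seen these instructions, here is a rough sketch of what such a maven-bundle-plugin configuration could look like. It is not the actual TIKA-342 patch: the Embed-Dependency scope filter and the org.apache.tika.* export pattern are assumptions chosen for illustration.

```xml
<plugin>
  <groupId>org.apache.felix</groupId>
  <artifactId>maven-bundle-plugin</artifactId>
  <extensions>true</extensions>
  <configuration>
    <instructions>
      <!-- Embed the dependency jars in the bundle instead of inlining their classes.
           The scope filter is an assumption, not taken from the patch. -->
      <Embed-Dependency>*;scope=compile|runtime</Embed-Dependency>
      <Embed-Transitive>true</Embed-Transitive>
      <!-- Export packages found in the embedded jars without copying
           their classes into the bundle. The pattern is illustrative. -->
      <_exportcontents>org.apache.tika.*</_exportcontents>
      <!-- Import org.w3c.dom explicitly, then everything else that is referenced. -->
      <Import-Package>org.w3c.dom,*</Import-Package>
    </instructions>
  </configuration>
</plugin>
```

Unlike Export-Package, <_exportcontents> only adds entries to the Export-Package header of the manifest and does not pull matching classes into the bundle, which is why it works together with embedded jars.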
Now, I do not understand the phrase "can we exclude
org.apache.tika.parser subpackages from being exported". Do you mean that
the Tika-Parser contents should not be exported? Maybe that would be a
good thing in terms of loose coupling.
But at least one user of Tika (Apache Jackrabbit trunk) accesses the
tika.parser subpackages directly, so not exporting them would require
that user to be adapted (which may even be a good idea in terms of
loose coupling).
Also, the Tika-Parsers bundle exports everything, so having the
Tika-Bundle not export the parser subpackages sounds a bit asymmetric...
My (uninitiated) proposal would be:
* In the tika-parsers project move all tika.parser subpackages down
one level below an impl package. That is, e.g. move the
org.apache.tika.parser.image package to
This clearly signals: this is implementation and should not be
directly accessed by client code. It also causes the bundle plugin
to stop exporting it (the default is to export everything
except impl and internal packages and their subpackages).
* Do not export the tika.parser.impl subpackages in the tika-bundle.
* Require clients to refrain from calling the parsers directly and
to use the AutoDetectParser instead.
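
As a sketch of the second point, the exclusion in the tika-bundle could be written with a negated pattern. This assumes the proposed org.apache.tika.parser.impl package exists after the move (it does not exist today), and the org.apache.tika.* pattern is again only illustrative.

```xml
<instructions>
  <!-- Hypothetical: org.apache.tika.parser.impl is the package proposed above.
       The negated pattern comes first so that it wins over the wildcard. -->
  <_exportcontents>
    !org.apache.tika.parser.impl.*,
    org.apache.tika.*
  </_exportcontents>
</instructions>
```

The ordering matters because the patterns are evaluated left to right, so a package already matched by the negation is never picked up by the later wildcard.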