What to export from the tika-bundle?


Felix Meschberger-2

In a comment to TIKA-340 Jukka Zitting writes:

      Could we even avoid inlining the tika-core and
      tika-parsers jars, or is that something that's
      needed for the Export-Package rules to work? If
      the latter, can we exclude org.apache.tika.parser
      subpackages from being exported so that only
      tika-core gets inlined?

I have created TIKA-342 [1] with a patch that embeds all dependencies,
tweaks the exports using a different export statement
(<_exportcontents>), and explicitly imports org.w3c.dom so that
xmlbeans does not need to be inlined. This seems to work.

Now, I do not understand the phrase "can we exclude
org.apache.tika.parser subpackages from being exported". Do you mean
that the Tika-Parsers contents should not be exported? Maybe that would
be a good thing in terms of loose coupling.

But at least one user of Tika (Apache Jackrabbit trunk) accesses the
tika.parser subpackages directly, so not exporting them would require
that user to be adapted (which may even be a good idea in terms of
loose coupling).

Also, the Tika-Parsers bundle exports everything. So having the
Tika-Bundle not export the parser subpackages sounds a bit asymmetric.

My (uninitiated) proposal would be:

   * In the tika-parsers project move all tika.parser subpackages down
     one level below an impl package. That is, e.g. move the
     org.apache.tika.parser.image package to
     This clearly signals: this is implementation and should not be
     directly accessed by client code. Also it causes the bundle plugin
     to not export it any longer (the default is to export everything
     except impl and internal packages and their subpackages).

   * Do not export the tika.parser.impl subpackages from the tika-bundle.

   * Require clients to refrain from calling directly into the parsers
     and instead use the AutoDetectParser.
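
To illustrate the effect of the first point, here is a minimal Java sketch
of the default export rule as described above (export everything except
impl and internal packages and their subpackages). It is a simplified model,
not the bundle plugin's actual matching code; the class and method names are
made up for illustration, and org.apache.tika.parser.impl.image is only an
assumed example of a relocated package name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Simplified model of the default export rule described in the post.
// Assumption: a package is withheld when any dot-separated segment of its
// name is exactly "impl" or "internal". The real plugin may match differently.
class ExportRuleSketch {

    // True if the package would be exported under the modeled default rule.
    static boolean isExportedByDefault(String packageName) {
        return Arrays.stream(packageName.split("\\."))
                .noneMatch(s -> s.equals("impl") || s.equals("internal"));
    }

    // Keeps only the packages that the modeled rule would export.
    static List<String> exported(List<String> packages) {
        return packages.stream()
                .filter(ExportRuleSketch::isExportedByDefault)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> packages = Arrays.asList(
                "org.apache.tika.parser",
                "org.apache.tika.parser.image",        // current layout
                "org.apache.tika.parser.impl.image");  // hypothetical name after the move
        // Prints the packages that remain visible to client bundles.
        System.out.println(exported(packages));
    }
}
```

Under this model, a package moved below impl drops out of the exported set
without any explicit export instruction, which is the point of the proposal.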



[1] https://issues.apache.org/jira/browse/TIKA-342