Tika GUI can't get the original file

Juri Linkov
Hello,

I'm using the MediaInfo project (http://mediainfo.sourceforge.net/)
to get metadata from audio/video files.  Its Java JNA wrapper, available from
http://sourceforge.net/p/mediainfo/code/5416/tree/MediaInfoLib/trunk/Source/MediaInfoDLL/MediaInfoDLL.JNA.java
requires a filename as the input argument.  This is not a problem,
since getFile() can return the original file from TikaInputStream.

This works flawlessly in the CLI version.  But when using the GUI interface,
the original stream gets wrapped through ProgressMonitorInputStream,
so hasFile() returns false.  Fortunately, getFile() automagically creates
a spooled temporary file.  Thanks for handling this.

The remaining problem is that this is often inefficient:
dropping a very large multi-GB video file onto the GUI window
degrades performance, because the user has to wait while
the file is copied to a temporary file.

What do you think about creating a subclass of ProgressMonitorInputStream
that, like TikaInputStream, would keep a reference to the original file?
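
As a rough illustration of the idea, here is a minimal sketch of such a subclass. The class name FileBackedProgressMonitorInputStream and its getFile() accessor are hypothetical and not part of Tika or Swing. It relies only on the standard javax.swing.ProgressMonitorInputStream constructor, and it does not show how the GUI or TikaInputStream would detect and use the retained file.

```java
import java.awt.Component;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import javax.swing.ProgressMonitorInputStream;

/**
 * Hypothetical sketch (not an existing Tika class): a progress-monitoring
 * stream that remembers the file it was opened from.
 */
class FileBackedProgressMonitorInputStream extends ProgressMonitorInputStream {

    // The file the stream reads from, kept so callers can reach it directly.
    private final File file;

    public FileBackedProgressMonitorInputStream(
            Component parent, Object message, File file) throws IOException {
        // Open the file ourselves so the stream and the reference always match.
        super(parent, message, new FileInputStream(file));
        this.file = file;
    }

    /** Returns the original file, so no temporary copy has to be spooled. */
    public File getFile() {
        return file;
    }
}
```

Code that receives the stream would still have to check for this type (for example with instanceof) and call getFile() instead of spooling to a temporary file, so some change on the consuming side is assumed as well.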

--
Best regards,
Juri

Re: Tika GUI can't get the original file

Nick Burch-2
On Fri, 8 Mar 2013, Juri Linkov wrote:
> This works flawlessly in the CLI version.  But when using the GUI
> interface, the original stream gets wrapped through
> ProgressMonitorInputStream, so hasFile() returns false.  Fortunately,
> getFile() automagically creates a spooled temporary file.  Thanks for
> handling this.

The Tika GUI is mostly intended to be used for debugging and demos. Is
there a reason why you're using it with such large files? Is it still for
demos/testing, or for something else?

Nick

Re: Tika GUI can't get the original file

Juri Linkov
> The Tika GUI is mostly intended to be used for debugging and demos. Is
> there a reason why you're using it with such large files? Is it still for
> demos/testing, or for something else?

Yes, it is for testing mostly, so there is no pressing need to improve this :)

--
Best regards,
Juri