[jira] [Commented] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932626#comment-15932626 ]

Avtar Singh commented on TIKA-2298:

I am not able to run the VGG16 model in dl4j.
When I try to run the full-fledged model, I get this error:
Exception in thread "main" java.lang.OutOfMemoryError: Cannot allocate new FloatPointer(138357544): totalBytes = 1G, physicalBytes = 2G
        at org.bytedeco.javacpp.FloatPointer.<init>(FloatPointer.java:76)
        at org.nd4j.linalg.api.buffer.BaseDataBuffer.<init>(BaseDataBuffer.java:445)
        at org.nd4j.linalg.api.buffer.FloatBuffer.<init>(FloatBuffer.java:57)
        at org.nd4j.linalg.api.buffer.factory.DefaultDataBufferFactory.createFloat(DefaultDataBufferFactory.java:236)
        at org.nd4j.linalg.factory.Nd4j.createBuffer(Nd4j.java:1301)
        at org.nd4j.linalg.factory.Nd4j.createBuffer(Nd4j.java:1275)
        at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:252)
        at org.nd4j.linalg.cpu.nativecpu.NDArray.<init>(NDArray.java:109)
        at org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory.create(CpuNDArrayFactory.java:247)
        at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:4768)
        at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:4726)
        at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:3861)
        at org.deeplearning4j.nn.graph.ComputationGraph.init(ComputationGraph.java:342)
        at org.deeplearning4j.nn.graph.ComputationGraph.init(ComputationGraph.java:274)
        at org.deeplearning4j.nn.modelimport.keras.KerasModel.getComputationGraph(KerasModel.java:483)
        at org.deeplearning4j.nn.modelimport.keras.KerasModel.getComputationGraph(KerasModel.java:471)
        at org.deeplearning4j.nn.modelimport.keras.KerasModelImport.importKerasModelAndWeights(KerasModelImport.java:178)
        at modelImport.ModelImportConfig.main(ModelImportConfig.java:18)
Caused by: java.lang.OutOfMemoryError: Native allocator returned address == 0
        at org.bytedeco.javacpp.FloatPointer.<init>(FloatPointer.java:70)
        ... 17 more
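
For scale, the failing allocation can be sized with a little arithmetic. The sketch below is only an illustration and not part of dl4j or Tika; the class and method names (AllocationSize, bytesForFloats, toMiB) are made up for this example. It assumes the FloatPointer(138357544) request in the trace above is a single buffer of 32-bit floats, which matches the 138,357,544 parameters of the full VGG16 network. Such a buffer is allocated in native memory, outside the Java heap.

```java
public class AllocationSize {

    /** Bytes needed for a contiguous buffer of 32-bit floats. */
    static long bytesForFloats(long elementCount) {
        return elementCount * Float.BYTES; // Float.BYTES is 4
    }

    /** Converts a byte count to mebibytes (1 MiB = 1024 * 1024 bytes). */
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // Element count copied from the FloatPointer(...) message in the trace.
        long elements = 138_357_544L;
        long bytes = bytesForFloats(elements);
        System.out.printf("parameter buffer: %d bytes (%.1f MiB)%n",
                bytes, toMiB(bytes));
    }
}
```

So the parameters alone need roughly 528 MiB in one contiguous native block, before any activations or working copies, which is a large share of what is free on a 4 GB machine.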

When I run the model variant labeled 'NoTop', it reports: Invalid configuration.
From the source code of the helper functions, I found out that the JSON file needs fixing.

I am running on a 6th-gen i5 with 4 GB of RAM.
I tried two operating systems: Ubuntu and Windows.
Is there any way I can run it?

> To improve object recognition parser so that it may work without external RESTful service setup
> -----------------------------------------------------------------------------------------------
>                 Key: TIKA-2298
>                 URL: https://issues.apache.org/jira/browse/TIKA-2298
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.14
>            Reporter: Avtar Singh
>              Labels: ObjectRecognitionParser
>             Fix For: 1.15
>   Original Estimate: 672h
>  Remaining Estimate: 672h
> When ObjectRecognitionParser was built to do image recognition, there wasn't
> good support for Java frameworks.  All the popular neural networks were in
> C++ or python.  Since there was nothing that runs within JVM, we tried
> several ways to glue them to Tika (like CLI, JNI, gRPC, REST).
> However, this game is changing slowly now. Deeplearning4j, the most famous
> neural network library for JVM, now supports importing models that are
> pre-trained in python/C++ based kits [5].
> *Improvement:*
> It would be nice to have an implementation of ObjectRecogniser that
> doesn't require any external setup (like installation of native libraries or
> starting REST services). Reasons: it is easier to distribute and it cuts the
> IO time.

This message was sent by Atlassian JIRA