[jira] [Commented] (TIKA-2672) Upgrade dl4j to 1.0.0-beta


JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/TIKA-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537293#comment-16537293 ]

ASF GitHub Bot commented on TIKA-2672:
--------------------------------------

chrismattmann commented on issue #241: Fix for TIKA-2672
URL: https://github.com/apache/tika/pull/241#issuecomment-403554981
 
 
   InceptionV3 works great!
   
   ## Inception server
   
   ```
   nonas:tika2.0.0 mattmann$ tika --config=tika-dl/src/test/resources/org/apache/tika/dl/imagerec/dl4j-inception3-config.xml
   Jul 09, 2018 10:18:53 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
   WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed.
   See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
   for optional dependencies.
   
   Jul 09, 2018 10:18:53 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
   WARNING: Tesseract OCR is installed and will be automatically applied to image files unless
   you've excluded the TesseractOCRParser from the default parser.
   Tesseract may dramatically slow down content extraction (TIKA-2359).
   As of Tika 1.15 (and prior versions), Tesseract is automatically called.
   In future versions of Tika, users may need to turn the TesseractOCRParser on via TikaConfig.
   Jul 09, 2018 10:18:53 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
   WARNING: org.xerial's sqlite-jdbc is not loaded.
   Please provide the jar on your classpath to parse sqlite files.
   See tika-parsers/pom.xml for the correct version.
   INFO  Starting Apache Tika 2.0.0-SNAPSHOT server
   INFO  Using custom config: tika-dl/src/test/resources/org/apache/tika/dl/imagerec/dl4j-inception3-config.xml
   INFO  Cache doesn't exist. Going to make a copy
   INFO  This might take a while! GET https://github.com/USCDataScience/tika-dockers/releases/download/v0.2/inception_v3_keras_2.h5
   INFO  Cache doesn't exist. Going to make a copy
   INFO  This might take a while! GET https://github.com/USCDataScience/tika-dockers/releases/download/v0.2/imagenet_class_index.json
   INFO  Going to load Inception network...
   INFO  Unexpected end-of-input: expected close marker for OBJECT (from [Source: {"config": {"output_layers": [["predictions", 0, 0]], "layers": [{"class_name": "InputLayer", "name": "input_1", "config": {"batch_input_shape": [null, null, null, 3], "dtype": "float32", "sparse": false, "name": "input_1"}, "inbound_nodes": []}, {"class_name": "Conv2D", "name": "conv2d_1", "config": {"activity_regularizer": null, "strides": [2, 2], "padding": "valid", "kernel_regularizer": null, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "distribution": "uniform", "mode": "fan_avg", "seed": null}}, "data_format": "channels_last", "activation": "linear", "bias_regularizer": null, "kernel_size": [3, 3], "dilation_rate": [1, 1], "use_bias": false, "trainable": true, "kernel_constraint": null, "bias_constraint": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "filters": 32, "name": "conv2d_1"}, "inbound_nodes": [[["input_1", 0, 0, {}]]]}, {"class_name": "BatchNormalization", "name": "batch_normalization_1", "config": {"center": true, "gamma_initializer": {"class_name": "Ones", "config": {}}, "beta_constraint": null, "gamma_constraint": null, "moving_variance_initializer": {"class_name": "Ones", "config": {}}, "moving_mean_initializer": {"class_name": "Zeros", "config": {}}, "scale": false, "momentum": 0.99, "gamma_regularizer": null, "trainable": true, "epsilon": 0.001, "axis": 3, "beta_initializer": {"class_name": "Zeros", "config": {}}, "beta_regularizer": null, "name": "batch_normalization_1"}, "inbound_nodes": [[["conv2d_1", 0, 0, {}]]]}, {"class_name": "Activation", "name": "activation_1", "config": {"activation": "relu", "trainable": ....suppressed
   {"activity_regularizer": null, "strides": [1, 1], "padding": "same", "kernel_regularizer": null, "kernel_i; line: 1, column: 64001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@7c711375; line: 1, column: 36001]
   INFO  Unexpected end-of-input within/between OBJECT entries
    at [Source: java.io.StringReader@57cf54e1; line: 1, column: 40001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@5b03b9fe; line: 1, column: 44001]
   INFO  Unrecognized token 'tru': was expecting 'null', 'true', 'false' or NaN
    at [Source: java.io.StringReader@37d4349f; line: 1, column: 56001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@434a63ab; line: 1, column: 52001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@6e0f5f7f; line: 1, column: 56001]
   INFO  Unexpected end-of-input: expected close marker for ARRAY (from [Source: java.io.StringReader@2805d709; line: 1, column: 45999])
    at [Source: java.io.StringReader@2805d709; line: 1, column: 60001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@3ee37e5a; line: 1, column: 64001]
   INFO  Unexpected end-of-input in FIELD_NAME
    at [Source: java.io.StringReader@2ea41516; line: 1, column: 68001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@3a44431a; line: 1, column: 72001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@3c7f66c4; line: 1, column: 76001]
   INFO  Unexpected end-of-input: was expecting closing quote for a string value
    at [Source: java.io.StringReader@194bcebf; line: 1, column: 80001]
   INFO  Unexpected end-of-input in FIELD_NAME
    at [Source: java.io.StringReader@17497425; line: 1, column: 84001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@f0da945; line: 1, column: 88001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@4803b726; line: 1, column: 92001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@ffaa6af; line: 1, column: 96001]
   INFO  Unexpected end-of-input: was expecting closing quote for a string value
    at [Source: java.io.StringReader@53ce1329; line: 1, column: 68001]
   INFO  Unexpected end-of-input: expected close marker for ARRAY (from [Source: java.io.StringReader@316bcf94; line: 1, column: 67972])
    at [Source: java.io.StringReader@316bcf94; line: 1, column: 80001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@6404f418; line: 1, column: 76001]
   INFO  Unexpected end-of-input in FIELD_NAME
    at [Source: java.io.StringReader@3e11f9e9; line: 1, column: 80001]
   INFO  Unexpected end-of-input in FIELD_NAME
    at [Source: java.io.StringReader@1de5f259; line: 1, column: 84001]
   INFO  Unexpected end-of-input within/between OBJECT entries
    at [Source: java.io.StringReader@729d991e; line: 1, column: 88001]
   INFO  Unexpected end-of-input in FIELD_NAME
    at [Source: java.io.StringReader@31fa1761; line: 1, column: 92001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@957e06; line: 1, column: 96001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@32502377; line: 1, column: 100001]
   INFO  Unexpected end-of-input in FIELD_NAME
    at [Source: java.io.StringReader@2c1b194a; line: 1, column: 104001]
   INFO  Unexpected end-of-input within/between ARRAY entries
    at [Source: java.io.StringReader@4dbb42b7; line: 1, column: 108001]
   INFO  Unexpected end-of-input within/between OBJECT entries
    at [Source: java.io.StringReader@66f57048; line: 1, column: 112001]
   INFO  Unexpected end-of-input: was expecting closing quote for a string value
    at [Source: java.io.StringReader@550dbc7a; line: 1, column: 116001]
   INFO  Unexpected end-of-input in FIELD_NAME
    at [Source: java.io.StringReader@21282ed8; line: 1, column: 120001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@36916eb0; line: 1, column: 124001]
   INFO  Unrecognized token 'fals': was expecting 'null', 'true', 'false' or NaN
    at [Source: java.io.StringReader@7bab3f1a; line: 1, column: 160001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@437da279; line: 1, column: 100001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@23c30a20; line: 1, column: 104001]
   INFO  Unexpected end-of-input in FIELD_NAME
    at [Source: java.io.StringReader@1e1a0406; line: 1, column: 108001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@3cebbb30; line: 1, column: 112001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@12aba8be; line: 1, column: 116001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@290222c1; line: 1, column: 120001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@67f639d3; line: 1, column: 124001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@6253c26; line: 1, column: 128001]
   INFO  Unexpected end-of-input: was expecting closing quote for a string value
    at [Source: java.io.StringReader@49049a04; line: 1, column: 132001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@71a8adcf; line: 1, column: 136001]
   INFO  Unexpected end-of-input in FIELD_NAME
    at [Source: java.io.StringReader@27462a88; line: 1, column: 140001]
   INFO  Unexpected end-of-input in FIELD_NAME
    at [Source: java.io.StringReader@82de64a; line: 1, column: 144001]
   INFO  Unexpected end-of-input within/between ARRAY entries
    at [Source: java.io.StringReader@659499f1; line: 1, column: 148001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@51e69659; line: 1, column: 152001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@47e2e487; line: 1, column: 156001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@201a4587; line: 1, column: 160001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@61001b64; line: 1, column: 132001]
   INFO  Unexpected end-of-input: was expecting closing quote for a string value
    at [Source: java.io.StringReader@4310d43; line: 1, column: 136001]
   INFO  Unexpected end-of-input in FIELD_NAME
    at [Source: java.io.StringReader@54a7079e; line: 1, column: 140001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@26e356f0; line: 1, column: 144001]
   INFO  Unexpected end-of-input: was expecting closing quote for a string value
    at [Source: java.io.StringReader@47d9a273; line: 1, column: 148001]
   INFO  Unexpected end-of-input: was expecting closing quote for a string value
    at [Source: java.io.StringReader@4b8ee4de; line: 1, column: 152001]
   INFO  Unexpected end-of-input: was expecting closing '"' for name
    at [Source: java.io.StringReader@27f981c6; line: 1, column: 156001]
   INFO  Unexpected end-of-input: expected close marker for OBJECT (from [Source: java.io.StringReader@1b11171f; line: 1, column: 143942])
    at [Source: java.io.StringReader@1b11171f; line: 1, column: 160001]
   INFO  Unrecognized token 'tru': was expecting 'null', 'true', 'false' or NaN
    at [Source: java.io.StringReader@1151e434; line: 1, column: 182001]
   INFO  Loaded [CpuBackend] backend
   INFO  Number of threads used for NativeOps: 2
   INFO  Number of threads used for BLAS: 2
   INFO  Backend used: [CPU]; OS: [Mac OS X]
   INFO  Cores: [4]; Memory: [3.6GB];
   INFO  Blas vendor: [MKL]
   INFO  Starting ComputationGraph with WorkspaceModes set to [training: ENABLED; inference: ENABLED], cacheMode set to [NONE]
   INFO  Loaded the Inception model. Time taken=2657ms
   INFO  Recogniser = org.apache.tika.dl.imagerec.DL4JInceptionV3Net
   INFO  Recogniser Available = true
   INFO  Setting the server's publish address to be http://localhost:9998/
   INFO  jetty-8.y.z-SNAPSHOT
   INFO  Started SelectChannelConnector@localhost:9998
   INFO  Started Apache Tika server at http://localhost:9998/
   INFO  rmeta (autodetecting type)
   INFO  Time taken 1014ms
   INFO  Add RecognisedObject{label='lion' (en), id='291', confidence=0.9375439286231995}
   
   ```
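
The end-of-input positions in the parse messages above all sit one past a multiple of 2000, and the model still loads afterwards. That would be consistent with a reader that retries parsing with a larger buffer until the JSON is complete. This is an inference from the log, not something the comment states. The sketch below only illustrates that retry pattern in Python; the function name and the 2000-character step are assumptions for illustration, not DL4J code.

```python
import json


def parse_with_growing_buffer(raw, step=2000):
    """Parse ever-longer prefixes of `raw` until one is valid JSON.

    Returns (parsed_value, failed_sizes), where failed_sizes lists the
    prefix lengths that were tried and rejected. `step` is a hypothetical
    buffer increment chosen to match the spacing seen in the log.
    """
    failed_sizes = []
    size = step
    while True:
        try:
            return json.loads(raw[:size]), failed_sizes
        except json.JSONDecodeError:
            if size >= len(raw):
                # The whole input was read and is still invalid,
                # so a larger buffer cannot help.
                raise
            failed_sizes.append(size)
            size += step


# A made-up config, long enough to need several attempts.
config = json.dumps({"layers": [{"name": f"conv2d_{i}"} for i in range(300)]})
parsed, failures = parse_with_growing_buffer(config)
print(len(config), failures)
```

Each rejected prefix corresponds to one "Unexpected end-of-input" style failure. Under this reading the messages are noise from the retries rather than a sign of a broken model file.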
   
   ## Inception client
   
   ```
   nonas:imagerec mattmann$ curl -T lion.jpg http://localhost:9998/rmeta | python -mjson.tool
     % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                    Dload  Upload   Total   Spent    Left  Speed
   100 45617    0  1176  100 44441    920  34790  0:00:01  0:00:01 --:--:-- 34801
   [
       {
           "Content-Type": "image/jpeg",
           "OBJECT": "lion (0.93754)",
           "X-Parsed-By": [
               "org.apache.tika.parser.CompositeParser",
               "org.apache.tika.parser.recognition.ObjectRecognitionParser"
           ],
           "X-TIKA:content": "<html xmlns=\"http://www.w3.org/1999/xhtml\">\n<head>\n<meta name=\"org.apache.tika.parser.recognition.object.rec.impl\" content=\"org.apache.tika.dl.imagerec.DL4JInceptionV3Net\" />\n<meta name=\"X-Parsed-By\" content=\"org.apache.tika.parser.CompositeParser\" />\n<meta name=\"X-Parsed-By\" content=\"org.apache.tika.parser.recognition.ObjectRecognitionParser\" />\n<meta name=\"OBJECT\" content=\"lion (0.93754)\" />\n<meta name=\"Content-Type\" content=\"image/jpeg\" />\n<title></title>\n</head>\n<body><ol id=\"objects\">\t<li id=\"291\"> lion [en](confidence = 0.937544)</li>\n</ol>\n</body></html>",
           "X-TIKA:parse_time_millis": "1086",
           "org.apache.tika.parser.recognition.object.rec.impl": "org.apache.tika.dl.imagerec.DL4JInceptionV3Net"
       }
   ]
   ```
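
For scripted use, the `OBJECT` field in the response above can be split into a label and a confidence value. This is a minimal sketch that assumes the `label (confidence)` format shown in the output and that a field with several values arrives as a list, as `X-Parsed-By` does. The function name is hypothetical.

```python
import json
import re

# Matches values such as "lion (0.93754)".
OBJECT_PATTERN = re.compile(r"^(?P<label>.+) \((?P<confidence>[0-9.]+)\)$")


def recognised_objects(rmeta_json):
    """Return (label, confidence) pairs from a Tika /rmeta JSON response."""
    results = []
    for document in json.loads(rmeta_json):
        values = document.get("OBJECT", [])
        if isinstance(values, str):
            # A single value is a plain string rather than a list.
            values = [values]
        for value in values:
            match = OBJECT_PATTERN.match(value)
            if match:
                results.append(
                    (match.group("label"), float(match.group("confidence")))
                )
    return results


# Trimmed copy of the response shown above.
sample = '[{"Content-Type": "image/jpeg", "OBJECT": "lion (0.93754)"}]'
print(recognised_objects(sample))  # [('lion', 0.93754)]
```

The input could just as well be the saved output of the `curl` command above.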

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


> Upgrade dl4j to 1.0.0-beta
> --------------------------
>
>                 Key: TIKA-2672
>                 URL: https://issues.apache.org/jira/browse/TIKA-2672
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>         Attachments: TIKA-2672.patch
>
>
> Let's try to upgrade dl4j. I think I got us most of the way there, but I hit this error when reading the JSON config file. Can someone with more knowledge of layer specs help ([~thammegowda], perhaps :))?
> {noformat}
> org.deeplearning4j.exception.DL4JInvalidConfigException: Invalid configuration for layer (idx=-1, name=convolution2d_2, type=ConvolutionLayer) for width dimension:  Invalid input configuration for kernel width. Require 0 < kW <= inWidth + 2*padW; got (kW=3, inWidth=1, padW=0)
> Input type = InputTypeConvolutional(h=149,w=1,c=32), kernel = [3, 3], strides = [1, 1], padding = [0, 0], layer size (output channels) = 32, convolution mode = Truncate
> {noformat}
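
The condition quoted in the error can be checked by hand. The sketch below restates it in Python, together with the usual output-size formula for a convolution without "same" padding. The 299-pixel input is an assumption (the standard Inception v3 input size); the kernel, stride and padding values come from the error and from the `conv2d_1` layer in the server log. Neither function is DL4J code.

```python
def check_kernel(kernel, in_size, pad):
    """Condition from the error message: 0 < k <= in + 2*pad."""
    return 0 < kernel <= in_size + 2 * pad


def truncate_output_size(in_size, kernel, stride, pad=0):
    """Output size of a convolution that drops any partial window."""
    return (in_size + 2 * pad - kernel) // stride + 1


# Width reported in the error: kW=3, inWidth=1, padW=0.
print(check_kernel(3, 1, 0))            # False
# Height reported in the error: h=149 with the same kernel.
print(check_kernel(3, 149, 0))          # True
# A 299-pixel side (assumed) through a 3x3, stride-2 convolution.
print(truncate_output_size(299, 3, 2))  # 149
```

The height of 149 is what a 299-pixel side would give after the first convolution, so only the width of 1 violates the check.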



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)