Re: Regarding Image Captioning in Tika for Image MIME Types


Re: Regarding Image Captioning in Tika for Image MIME Types

Thamme Gowda
Hi Kranthi Kiran,

Welcome to the Tika community. We are glad you are interested in working on
the issue.
Please remember to CC the dev@tika mailing list for future discussions
related to Tika.

 *Should the model be trainable by the user?*
The bare minimum requirement is to provide a pre-trained model and make
the parser work out of the box without training (assume no GPUs; assume a
JVM and nothing else).
Of course, the parser configuration should have options to change the
models by changing the path.
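
For concreteness, "out of the box" means a user with nothing but a JVM and a
running tika-server should get recognition output from a plain HTTP call. A
minimal sketch (assuming a local tika-server on its default port 9998 with the
object recognition parser enabled; only the /rmeta endpoint itself is a given):

    import requests

    # Send an image to tika-server's recursive-metadata endpoint; with the
    # object recognition parser enabled, the returned metadata should carry
    # the recognized object labels (and, eventually, captions).
    with open("testJPEG.jpg", "rb") as f:
        resp = requests.put(
            "http://localhost:9998/rmeta",
            data=f.read(),
            headers={"Content-Type": "image/jpeg"},
        )
    print(resp.json())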

As part of this GSoC project, integration alone isn't enough work. If you go
through the links provided on the Jira page, you will notice that there are
models for image recognition but no ready-made models for captioning. We
will have to train the im2txt network on the dataset and make it
available. Thus we will have to open source the training utilities,
documentation, and any supplementary tools we build along the way. We will
have to document all of this in the Tika wiki for advanced users!

This is a GSoC issue and thus we expect to work on it during the summer.

For now, if you want a small task to familiarise yourself with Tika, I have
a suggestion:
Currently, Tika uses the Inception V3 model from Google for image recognition.
The Inception V4 model was released recently and has proved to be more
accurate than V3.

How about upgrading Tika to use the newer Inception model?
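
For orientation, the current V3 flow boils down to loading Google's frozen
GraphDef and feeding it raw JPEG bytes -- a rough sketch (the file and tensor
names come from the public Inception V3 release; the surrounding code is
illustrative):

    import tensorflow as tf

    def classify(image_path, graph_pb="classify_image_graph_def.pb"):
        # Inception V3 ships as a frozen GraphDef (.pb), so it can be
        # imported directly without rebuilding the network in code.
        with tf.gfile.GFile(graph_pb, "rb") as f:
            graph_def = tf.GraphDef()
            graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name="")
        with tf.Session() as sess:
            softmax = sess.graph.get_tensor_by_name("softmax:0")
            image_data = tf.gfile.GFile(image_path, "rb").read()
            # Returns a probability distribution over the ImageNet labels.
            return sess.run(softmax, {"DecodeJpeg/contents:0": image_data})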

Let me know if you have more questions.

Cheers,
TG

*--*
*Thamme Gowda*
TG | @thammegowda <https://twitter.com/thammegowda>
~Sent via somebody's Webmail server!

On Sun, Mar 19, 2017 at 11:56 AM, Kranthi Kiran G V <
[hidden email]> wrote:

> Hello,
> I'm Kranthi, a 3rd-year computer science undergrad at NIT, Warangal and a
> member of the Deep Learning research group at our college. I'm interested in
> taking up the issue. I believe it would be a great contribution to the Apache
> Tika community.
>
> This is what I have done until now:
>
> 1) Built Tika from source using Maven and explored it.
> 2) Tried the object recognition module from the command line. (I should
> probably start using the Docker version to speed up my progress.)
>
> I have yet to import a Keras model into dl4j. I have some doubts regarding the
> requirements since I'm new to this community. *Should the model be
> trainable by the user?* This is important because the Inception v3 model
> without re-training has performed poorly for me (I'm currently training it
> with fewer steps due to the limited computational resources I have -- a
> GTX 1070).
>
> TODO (Before submitting the proposal):
>
> 1) Create a test REST API for Tika
> 2) Import a few models in dl4j.
> 3) Train im2txt on my computer.
>
> Thank you,
> Kranthi Kiran
>

Re: Regarding Image Captioning in Tika for Image MIME Types

Thamme Gowda
Hi Kranthi Kiran,

Please find my replies below:

Let me know if you have more questions.

Thanks,
TG
*--*
*Thamme Gowda*
TG | @thammegowda <https://twitter.com/thammegowda>
~Sent via somebody's Webmail server!

On Tue, Mar 21, 2017 at 12:21 PM, Kranthi Kiran G V <
[hidden email]> wrote:

> Hello Thamme Gowda,
>
> Thank you for letting me know of the developer mailing list. I have
> created an issue [1] and will be working on it.
> The change is not straightforward, since the Inception V3 pre-trained model
> ships as a graph while the Inception V4 pre-trained model is packaged in the
> form of a checkpoint (ckpt) [2].
>

Okay, I see: Inception V3 has a graph, V4 has a checkpoint.
I assume there should be a way to restore the model from a checkpoint? Please
refer to
https://www.tensorflow.org/programmers_guide/variables#checkpoint_files
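
As a sketch of what that restore looks like with the slim release of V4 (this
assumes the model code from tensorflow/models/slim is on the PYTHONPATH; the
checkpoint filename is from the release page):

    import tensorflow as tf
    from nets import inception  # tensorflow/models/slim

    slim = tf.contrib.slim

    # Unlike V3's frozen .pb, the V4 release is a checkpoint: we first
    # rebuild the graph in code, then restore the variables into it.
    images = tf.placeholder(tf.float32, [None, 299, 299, 3])
    with slim.arg_scope(inception.inception_v4_arg_scope()):
        logits, _ = inception.inception_v4(images, num_classes=1001,
                                           is_training=False)

    saver = tf.train.Saver()
    with tf.Session() as sess:
        saver.restore(sess, "inception_v4.ckpt")
        # From here the graph can be frozen (variables converted to
        # constants) to produce a .pb that drops into the existing V3 flow.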


>
> What do you think of using Keras to implement the Inception V4 model? It
> would make the job of scaling it on CPU clusters easier if we can use
> deeplearning4j's model import.
>
> Should I proceed in that direction?
>
> Regarding GSoC, what kind of computational resources are we given access to?
> We would have to train the Show and Tell network. It takes a lot of
> computational resources.
>
> If GPUs are not used, we would have to use a CPU cluster. So, the code has
> to be re-written (from the Google implementation of Inception V4).
>
>
Training Inception V4 from scratch requires too much effort, time, and
resources. We are not aiming for that, at least not as part of Tika
and GSoC. The suggestion I mentioned earlier was to replace the Inception V3
model with the Inception V4 pre-trained model/checkpoint, since that will be
more beneficial to the Tika user community :-)
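
(On the Keras/DL4J route mentioned above: the Keras side would look roughly
like the sketch below. Keras does not bundle an Inception V4 application, so
V3 is shown as a stand-in; the import call in the comment is from DL4J's
Keras model import module.)

    from keras.applications.inception_v3 import InceptionV3

    # Build the model with ImageNet weights and save it to HDF5, the
    # format DL4J's Keras model import consumes on the JVM side, e.g.
    # KerasModelImport.importKerasModelAndWeights("inception.h5")
    model = InceptionV3(weights="imagenet")
    model.save("inception.h5")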



>
> [1] https://issues.apache.org/jira/browse/TIKA-2306
> [2] https://github.com/tensorflow/models/tree/master/slim#pre-trained-models

Re: Regarding Image Captioning in Tika for Image MIME Types

Raunaq Abhyankar
In reply to this post by Thamme Gowda
Hi,
I'm Raunaq Abhyankar from Mumbai. I'm a final-year computer engineering student. I'm interested in working on Tika during the summer.

I was able to successfully classify images using Inception v4, and the results are better than Inception v3's!

However, I have one problem: I can run the script independently but am finding it difficult to integrate it with Tika. Can you please guide me in this regard?

Thanks

PFA: screenshot of the Inception v4 result on testJPEG.jpg

--
Regards,
Raunaq Abhyankar

Re: Regarding Image Captioning in Tika for Image MIME Types

Thamme Gowda
In reply to this post by Thamme Gowda
Hi Kranthi Kiran,

1. Thanks for the update. I look forward to your PR.

2. I don't have complete details about compute resources from GSoC. I think
Google offers free credits (approx. $300) when students sign up for Google
Compute Engine. I am not worried about it at this time; we can sort it out
later.

3. Great to know!

Best,
TG

*--*
*Thamme Gowda*
TG | @thammegowda <https://twitter.com/thammegowda>
~Sent via somebody's Webmail server!

On Fri, Mar 24, 2017 at 10:42 PM, Kranthi Kiran G V <
[hidden email]> wrote:

> Apologies if I was ambiguous.
>
> 1) I have already started working on the improvement. The general method
> is working. I'll send a merge request after I port the REST method, too.
>
> 2) I was referring to the computational resources needed to train the final
> layer of im2txt to output captions. Google hasn't released a
> pre-trained model.
>
> 3) I will update the developer community with a tentative GSoC schedule
> by tonight. It would be great if the community could give me suggestions.

Re: Regarding Image Captioning in Tika for Image MIME Types

Thamme Gowda
In reply to this post by Raunaq Abhyankar
Hi Raunaq,

Welcome to the Tika community! We are pleased to know that you are interested
in working on this issue!

Please coordinate with Kranthi Kiran, who is also working on the same issue,
to avoid duplicate effort.
Yes, https://issues.apache.org/jira/browse/TIKA-2306 is the place to carry
out the discussions!


Thanks,
TG

*--*
*Thamme Gowda*
TG | @thammegowda <https://twitter.com/thammegowda>
~Sent via somebody's Webmail server!


Re: Regarding Image Captioning in Tika for Image MIME Types

Mattmann, Chris A (3010)
In reply to this post by Thamme Gowda
Sounds great, and understood. Please prepare your proposal and share it with
Thamme and me, your (potential) mentors, for feedback.

Thanks much.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Principal Data Scientist, Engineering Administrative Office (3010)
Manager, NSF & Open Source Projects Formulation and Development Offices (8212)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 180-503E, Mailstop: 180-503
Email: [hidden email]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


From: Kranthi Kiran G V <[hidden email]>
Date: Wednesday, March 29, 2017 at 9:17 AM
To: Thamme Gowda <[hidden email]>
Cc: Chris Mattmann <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: Regarding Image Captioning in Tika for Image MIME Types

Hello,
1) I have submitted a PR, which can be found here: <https://github.com/apache/tika/pull/163>.
2) After working on the Show and Tell model for a week, I realized that the computational resources I have are enough to take up the challenge.
Here is a sample caption I generated after a few days of training.
INFO:tensorflow:Loading model from checkpoint: /media/timberners/magicae/models/im2txt/im2txt/model/train/model.ckpt-174685
INFO:tensorflow:Successfully loaded checkpoint: model.ckpt-174685
Captions for image COCO_val2014_000000224477.jpg:
  0) a man riding a wave on top of a surfboard . (p=0.016002)
  1) a man riding a surfboard on a wave in the ocean . (p=0.007747)
  2) a man riding a wave on a surfboard in the ocean . (p=0.007673)
The evaluation is on the image used in the example on im2txt's page <https://github.com/tensorflow/models/tree/master/im2txt#generating-captions>.
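
For reference, the beam-search inference that produces such captions looks
roughly like this with the im2txt code (module layout from
tensorflow/models/im2txt; the checkpoint and vocabulary paths below are
placeholders for the machine-specific ones in the log above):

    import math
    import tensorflow as tf
    from im2txt import configuration, inference_wrapper
    from im2txt.inference_utils import caption_generator, vocabulary

    CHECKPOINT = "model/train/model.ckpt-174685"  # trained checkpoint
    VOCAB_FILE = "data/word_counts.txt"           # vocab from preprocessing
    IMAGE_FILE = "COCO_val2014_000000224477.jpg"

    # Rebuild the inference graph and get a function to restore weights.
    g = tf.Graph()
    with g.as_default():
        model = inference_wrapper.InferenceWrapper()
        restore_fn = model.build_graph_from_config(
            configuration.ModelConfig(), CHECKPOINT)
    g.finalize()

    vocab = vocabulary.Vocabulary(VOCAB_FILE)
    with tf.Session(graph=g) as sess:
        restore_fn(sess)
        generator = caption_generator.CaptionGenerator(model, vocab)
        image = tf.gfile.GFile(IMAGE_FILE, "rb").read()
        for i, caption in enumerate(generator.beam_search(sess, image)):
            words = [vocab.id_to_word(w) for w in caption.sentence[1:-1]]
            print("%d) %s . (p=%f)" %
                  (i, " ".join(words), math.exp(caption.logprob)))
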
I'm excited to release the pre-trained model (if I'm allowed to) to the public during my GSoC journey, to enable everyone to use it even if they do not have enough resources. I think it would be a great contribution to both Apache Tika and the computer vision community as a whole.
3) I am working on the schedule. I will be submitting a draft on the GSoC page. Should I send it here, too?
Regarding my other commitments, I will be working with Amazon India Development Centre from May 10th to July 10th. They offer flexible working hours.
I would be able to dedicate 40-45 hours per week. My ability to balance both is shown by my current work at the Deep Learning Research Group - NITW alongside college.
What do you think?


Re: Regarding Image Captioning in Tika for Image MIME Types

Kranthi Kiran G V
Hello mentors,

I have released a trained model of the neural image captioning system,
im2txt.
It can be found here:
https://github.com/KranthiGV/Pretrained-Show-and-Tell-model

I am hopeful it will benefit both the research community and the Apache
Tika community for image captioning.

Have a look at it!

Thank you,
Kranthi Kiran GV,
CS 3/4 Undergrad,
NIT Warangal


Re: Regarding Image Captioning in Tika for Image MIME Types

Thamme Gowda
This is awesome.
Thanks :-)

*--*
*Thamme Gowda*
TG | @thammegowda <https://twitter.com/thammegowda>
~Sent via somebody's Webmail server!
