Welcome Thejan Wijesinghe as an Apache Tika PMC and committer!

Chris Mattmann
Welcome to Thejan Wijesinghe who has joined as a new Tika PMC member and
committer!

Please say a bit about yourself…thanks!

Cheers,
Chris


Re: Welcome Thejan Wijesinghe as an Apache Tika PMC and committer!

Thejan Wijesinghe-2
Hi Chris Mattmann,

Thank you for the invitation.

Hi everyone,

First of all, I should say that I am very excited to be on board. Being a PMC
member in Tika is a huge accomplishment, because Tika is one of those TLPs in
Apache with a history of more than 10 years.

I’m currently a final-year undergraduate at the Univ. of Moratuwa, Sri Lanka.
I have a keen interest in information retrieval, data science and machine
learning related domains. Since Tika is one of the key technologies used in
many information retrieval applications, I got the opportunity to work with it
a couple of years back, but I never got the chance to use Tika for an
industry-level application until my internship. During my internship, I worked
with a startup in SL to build their own cognitive platform, where I had to use
some of the Apache technologies such as Kafka, Solr, Superset (incubating) and
Tika. We successfully completed the initial version of the platform, and I
still work as an external consultant for the same project.

However, becoming a committer to Apache Tika was one of the life goals I set
when I got selected as the Google Summer of Code intern at Apache Tika in 2017.
My project was “Supporting Image-to-Text (Image Captioning) in Tika for Image
MIME Types” [1]. It was an amazing project idea by Thamme Gowda, which
attracted a lot of attention. I was mentored by Chris Mattmann and Thamme
Gowda. I feel very lucky to have met these two people in my life, because if
not for them, I don’t think I would ever have found the guidance to become a
PMC member or a committer.

Most of my contributions are related to enhancing ML-based capabilities in
Tika. I have many future plans to improve the Tika-dl module, including adding
a parser with NMT-based translation, a sentiment parser, and a dl4j-based
captioning parser to tika-dl. I would also love to improve Tika’s capabilities
in MIME type detection and language detection. Other than that, I would love
to clean up some of the parsers in Tika. Our code base is quite a big one that
has evolved over many years, and I have seen instances where some of the
parsers are not in their appropriate place. To point out just one example, we
have an age recognizer parser in the Tika-nlp module while having a sentiment
parser under the Tika-parsers module.

I know that’s quite a lot of plans I have for Tika, but I have nothing to be
afraid of because I have an entire lifetime to accomplish them.

[1] https://issues.apache.org/jira/browse/TIKA-2262

Thanks and Best Regards,
ThejanW


On Tue, May 8, 2018 at 12:10 AM Chris Mattmann wrote:

> Welcome to Thejan Wijesinghe who has joined as a new Tika PMC member and
> committer!
>
> Please say a bit about yourself…thanks!
>
> Cheers,
> Chris