Welcome Thejan Wijesinghe as an Apache Tika PMC and committer!

Welcome Thejan Wijesinghe as an Apache Tika PMC and committer!

Chris Mattmann
Welcome to Thejan Wijesinghe who has joined as a new Tika PMC member and
committer!

Please say a bit about yourself…thanks!

Cheers,
Chris


Re: Welcome Thejan Wijesinghe as an Apache Tika PMC and committer!

Thejan Wijesinghe
Hi Chris Mattmann,

Thank you for the invitation.

Hi everyone,

First of all, I should say that I am very excited to be on board. Being a
PMC member in Tika is a huge accomplishment, because Tika is one of those
TLPs in Apache with a history of more than 10 years.

I'm currently a final-year undergraduate at the University of Moratuwa,
Sri Lanka. I have a keen interest in information retrieval, data science,
and machine learning. Since Tika is a key technology in many information
retrieval applications, I got the opportunity to work with it a couple of
years back, but I never got the chance to use it in an industry-level
application until my internship. During my internship, I worked with a
startup in Sri Lanka to build their own cognitive platform, where I used
several Apache technologies such as Kafka, Solr, Superset (incubating),
and Tika. We successfully completed the initial version of the platform,
and I still work as an external consultant for the same project.

Becoming a committer to Apache Tika has been one of my life goals since I
was selected as a Google Summer of Code intern at Apache Tika in 2017. My
project was "Supporting Image-to-Text (Image Captioning) in Tika for Image
MIME Types" [1]. It was an amazing project idea by Thamme Gowda that
attracted a lot of attention. I was mentored by Chris Mattmann and Thamme
Gowda. I feel very lucky to have met these two people in my life, because
without them I don't think I would ever have found the guidance to become
a PMC member or a committer.

Most of my contributions are related to enhancing ML-based capabilities in
Tika. I have many plans to improve the tika-dl module, including adding a
parser with NMT-based translation, a sentiment parser, and a dl4j-based
captioning parser. I would also love to improve Tika's capabilities in
MIME type detection and language detection. Other than that, I would love
to clean up some of the parsers in Tika. Our code base is quite big and
has evolved over many years, and I have seen instances where parsers are
not in their appropriate place. For example, we have an age recognizer
parser in the tika-nlp module, while a sentiment parser sits under the
tika-parsers module.

I know that's quite a lot of plans for Tika, but I have nothing to be
afraid of, because I have an entire lifetime to accomplish them.

[1] https://issues.apache.org/jira/browse/TIKA-2262

Thanks and Best Regards,
ThejanW


On Tue, May 8, 2018 at 12:10 AM Chris Mattmann <[hidden email]> wrote:

> Welcome to Thejan Wijesinghe who has joined as a new Tika PMC member and
> committer!
>
> Please say a bit about yourself…thanks!
>
> Cheers,
> Chris


Re: Welcome Thejan Wijesinghe as an Apache Tika PMC and committer!

Tyler Palsulich
Welcome, Thejan! It's great to have you on board!

Tyler

(Catching up on some old email.)

On Thu, May 10, 2018, 4:52 PM Thejan Wijesinghe <[hidden email]> wrote:

> Hi Chris Mattmann, Thank you for the invitation. [...]