Third Tika report

Third Tika report

Jukka Zitting
Hi,

This is my draft for the third Tika report. This report completes the
initial three-month period, after which we will be reporting only once
per quarter.

I'm hoping that we could have our first release out by the next report
in September, but I guess it's safer not to set any expectations in
the official report at this point.

<report>
Tika is a toolkit for detecting and extracting metadata and structured
text content from various documents using existing parser libraries.
Tika entered incubation on March 22nd, 2007.

Community

The Tika mailing lists have been relatively quiet lately, probably
because with little code we don't yet have many concrete issues to
talk about.

Development

We saw the first piece of Tika code when Chris A. Mattmann ported the
Nutch metadata framework to Tika. Rida Benjelloun is currently working
on bringing Lius code into Tika but the initial commits on that front
have not yet happened.

Issues before graduation

The Tika project is still at an early stage of incubation. We need to
continue bringing in the initial codebases and probably target an
initial incubating release later this year. We also need to work on
growing the community and figuring out how to best interact with
external parser projects.
</report>

BR,

Jukka Zitting

Re: Third Tika report

Bertrand Delacretaz
On 6/12/07, Jukka Zitting <[hidden email]> wrote:

> This is my draft for the third Tika report....

Looks good to me, thanks!

-Bertrand

Re: Third Tika report

Rida Benjelloun
Hi Jukka,
Thanks for the report. I created a Lius issue in Jira 3 days ago. This
version of Lius removes the Lucene dependencies and uses the Nutch office parser.
https://issues.apache.org/jira/browse/TIKA-7
Regards.

--
---------------------------------------------------------
Rida Benjelloun
Doculibre inc.
[hidden email]
[hidden email]
Cel: 418-262-3222
Tel: 418-353-3390
Site Web : http://www.doculibre.com
---------------------------------------------------------

Re: Third Tika report

Jukka Zitting
Hi,

On 6/13/07, Rida Benjelloun <[hidden email]> wrote:
> Thanks for the report. I created a Lius issue in Jira 3 days ago. This
> version of Lius removes the Lucene dependencies and uses the Nutch office parser.
> https://issues.apache.org/jira/browse/TIKA-7

Thanks for that! I noticed your work but unfortunately haven't yet had
time to take a closer look. I'll change the report (see [1]) to mention
that we have the source in the issue tracker, and I'll get back to the
details in Jira.

[1] http://wiki.apache.org/incubator/June2007

BR,

Jukka Zitting

Re: Third Tika report

chrismattmann
Jukka,

 Thanks for putting this together. +1, looks great, as usual.

Cheers,
  Chris
