[jira] [Commented] (TIKA-94) Speech-to-text transcription

ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/TIKA-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338663#comment-17338663 ]

ASF GitHub Bot commented on TIKA-94:
------------------------------------

lewismc commented on pull request #406:
URL: https://github.com/apache/tika/pull/406#issuecomment-831595625


   @tballison I know you and I spoke about refactoring this as a simple parser interface...
   I would like to merge it for the time being and I can begin to work on the refactoring in a separate ticket.


--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


> Speech-to-text transcription
> ----------------------------
>
>                 Key: TIKA-94
>                 URL: https://issues.apache.org/jira/browse/TIKA-94
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Lewis John McGibbney
>            Priority: Minor
>              Labels: new-parser
>
> Like OCR for image files (TIKA-93), we could try using speech recognition to extract text content (where available) from audio (and video!) files.
> The CMU Sphinx engine (http://cmusphinx.sourceforge.net/) looks promising and comes with a friendly license.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)