"An Open Source HMM-based Text-to-Speech System for Brazilian Portuguese".

  • 2010
  • Igor Couto, Nelson Neto, Vincent Tadaiesky, Aldebaro Klautau, Ranniery Maia.

Abstract: Text-to-speech (TTS) is currently a mature technology that is used in many applications. Some modules of a TTS system depend on the language and, while there are many public resources for English, the resources for some underrepresented languages are still limited. This work describes the development of a complete TTS system for Brazilian Portuguese which expands the already available resources. The system uses the MARY framework and is based on the hidden Markov model (HMM) speech synthesis approach. The contributions of this work include implementing syllabification, determination of the stressed syllable, and grapheme-to-phoneme (G2P) conversion. This work also describes the steps for organizing the developed resources and implementing a Brazilian Portuguese voice within the MARY framework. These resources are made available and facilitate research in text analysis and HMM-based synthesis for Brazilian Portuguese.

Keywords: Text-to-speech systems, HMM-based speech synthesis, text analysis.

Download

"FFTranscriber: Software para Transcrição Otimizado para Aplicações Forenses" [FFTranscriber: Transcription Software Optimized for Forensic Applications].

  • 2010
  • Renan Moura, Nelson Neto, Carlos Patrick, Pedro Batista and Aldebaro Klautau.

To streamline the process of transcribing audio to text, this proposal presents the FFTranscriber application, which integrates in a single interface all the tools needed for that task. FFTranscriber consists of two integrated workspaces: one for audio processing and a text editor. Another convenience is the speech recognition module, through which the forensic expert responsible for the transcription can supply the audio file, or even speak the content of the audio, to automatically retrieve the corresponding transcription, which can then be edited.

Download

"Free tools and resources for Brazilian Portuguese speech recognition".

  • 2010
  • Nelson Neto, Patrick Silva, Aldebaro Klautau, Isabel Trancoso.

An automatic speech recognition system has modules that depend on the language and, while there are many public resources for some languages (e.g., English and Japanese), the resources for Brazilian Portuguese (BP) are still limited. This work describes the development of resources and free tools for BP speech recognition, consisting of text and audio corpora, a phonetic dictionary, a grapheme-to-phone converter, and language and acoustic models. All of them are publicly available and, together with a proposed application programming interface, have been used for the development of several new applications, including a speech module for the OpenOffice suite. Performance tests are presented, comparing the developed BP system with commercial software. The paper also describes an application that uses synthesis and speech recognition together with a natural language processing module dedicated to statistical machine translation. This application allows the translation of spoken conversations from BP to English and vice versa. The resources make it easier for other academic groups and industry to adopt BP speech technologies.

Download

"New resources for Brazilian Portuguese: Results for grapheme-to-phoneme and phone classification".

  • 2006
  • C. Hosn, L. A. N. Baptista, T. Imbiriba, and A. Klautau.

Abstract: Speech processing is a data-driven technology that relies on public corpora and associated resources. In contrast to languages such as English, there are few resources for Brazilian Portuguese (BP). Consequently, there are no publicly available scripts to design baseline BP systems. This work discusses some efforts towards decreasing this gap and presents results for two speech processing tasks for BP: phone classification and grapheme-to-phoneme (G2P) conversion. The former task used hidden Markov models to classify phones from the Spoltech and TIMIT corpora. The G2P module adopted machine learning methods such as decision trees and was tested on a new BP pronunciation dictionary and the following languages: British English, American English and French.

Download