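Speakers:
Dr. Yannis Agiomyrgiannakis, Senior Research Scientist, Google (London, UK)
Dr. Li Bo (李博), Research Scientist, Google (Mountain View, CA, USA)
Host: Prof. Xie Lei (謝磊)
Time: Thursday, March 31, 2016, 2:00-3:30 PM
Venue: Lecture Hall 105, School of Computer Science
Abstract: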
As speech-based conversational agents such as Alexa, Cortana, Google Now and Siri become the preferred interface for human-machine interaction, there is renewed interest in text-to-speech (TTS) technologies. This talk surveys TTS from an industrial perspective and presents new developments in vocoding, statistical mapping and voice morphing that significantly outperform the baseline and even challenge the status quo.
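(As speech-driven intelligent agents such as Amazon Alexa, Microsoft Cortana, Google Now and Apple Siri grow popular in human-machine interaction, demand for speech synthesis technology keeps rising. This talk presents, from an industrial perspective, recent progress in the speech synthesis technologies Google is working on, including vocoding, statistical mapping and voice conversion, areas in which Google's technology is at the leading edge of the industry.)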
Speaker Biographies
Yannis Agiomyrgiannakis completed his PhD thesis, "Sinusoidal Speech Coding for Voice-over-IP", in 2006 at the University of Crete under Yannis Stylianou. He then held a post-doctoral position in the Text-to-Speech Synthesis group at France Telecom, working with Olivier Rosec on speech coding for TTS systems, glottal inversion and voice transformation. He later joined Phonetic Arts, Paul Taylor's Cambridge startup that was introducing speech synthesis to the game industry and was acquired by Google in 2010. He is now the DSP tech lead for Google TTS. He is the author of 20+ publications and 17 patents in speech coding, speech processing and speech synthesis. His interests include signal processing, speech coding, speech analysis and modeling, statistical modeling, sinusoidal synthesis, text-to-speech, Voice-over-IP, source/channel coding, vector quantization, multiple description coding, DSP implementation, glottal inversion and voice morphing.
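Dr. Li Bo (李博) graduated from the School of Computer Science at Northwestern Polytechnical University and received his PhD from the National University of Singapore. He is currently a Research Scientist at Google, working on speech recognition and speech synthesis.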
Organizers: School of Computer Science; Shaanxi Provincial Key Laboratory of Speech and Image Information Processing