Academic Seminars

Academic Seminar Announcement for September 9, 2013

September 6, 2013

Title: Selected Topics in Multimedia Analytics (Chinese title: Hot Topics in Multimedia Content Analysis)

Speaker: Dr. Xavier Anguera (Research Scientist, Multimedia Research Group, Telefonica Research, Barcelona, Spain)

Host: Prof. Lei Xie (谢磊)

Venue: Academic Lecture Hall 105, 澳门威尼斯人注册 Building

Time: Monday, September 9, 2013, 10:20-11:50 a.m.

About the speaker:

Xavier Anguera: Ing. [MS] 2001, UPC University (Barcelona, Spain); [MS] 2001, European Masters in Language and Speech; Ph.D. 2006, UPC University, with a thesis on speaker diarization for multi-microphone meeting recordings. From 2001 to 2003 he worked at the Panasonic Speech Technology Lab in Santa Barbara, CA, on text-to-speech for several languages. From 2004 to 2006 he was a visiting researcher at the International Computer Science Institute (ICSI) in Berkeley, CA. Since 2007 he has been a research scientist at Telefonica Research in Barcelona. His research interests cover speech processing (both speaker- and content-based) and multimodal multimedia processing. He has published over 60 peer-reviewed papers and has several accepted or pending patents. He is an active member of the IEEE and ACM, and has served on the organizing and program committees of several multimedia and speech conferences.

Abstract:

In this talk I will cover three topics that I have been working on over the last three years. First, I will talk about multimodal video-copy detection, which focuses on determining whether a given video contains modified excerpts taken from an original video. For this we implemented a novel binary audio fingerprint that we call MASK, which I will describe. Next, I will talk about speaker recognition, in which we want to determine whether an audio recording of a speaker matches the identity the speaker claims. For this task we have developed a novel binary speaker representation and modeling technique. Last, I will speak about spoken web search, in which a given audio query is searched for inside a large audio database. For this we have proposed a novel algorithm based on Dynamic Time Warping (DTW) that allows the audio database to be pre-indexed for faster retrieval of matches and uses very little memory compared with standard DTW techniques.
