Patents


P. Mowlaee, R. Saeidi, and G. Kubin, "Iterative closed-loop speech enhancement", European Patent, filed on August 23rd, 2013.

 

Publications


Note: Journal articles are reported in italics

2014

  • M. Matassoni, A. Brutti, P. Svaizer, "Acoustic modeling based on Early-to-Late Reverberation Ratio for robust ASR", Accepted In Proc. IWAENC 2014.

  • C. E. Cancino Chacon and P. Mowlaee, “Least Squares Phase Estimation of Mixed Signals”, Accepted In Proc. Interspeech 2014.

  • M. Matassoni, R. F. Astudillo, A. Katsamanis, and M. Ravanelli, “The DIRHA-GRID corpus: baseline and tools for multi-room distant speech recognition using distributed microphones,” Accepted In Proc. Interspeech 2014.

  • J. A. Morales Cordovilla, H. Pessentheiner, M. Hagmüller and G. Kubin, “Room Localization for Distant Speech Recognition”, Accepted In Proc. Interspeech 2014.

  • P. Mowlaee, R. Saeidi, Y. Stylianou, INTERSPEECH 2014 Special Session: "On Phase Importance in Speech Processing Applications", Accepted In Proc. Interspeech 2014.

  • P. Mowlaee, M. K. Watanabe and R. Saeidi, “Show & Tell: Iterative Refinement of Amplitude and Phase in Single-channel Speech Enhancement”, Accepted In Proc. Interspeech 2014.

  • B. Schuppler, M. Adda-Decker, and J. A. Morales-Cordovilla, "Pronunciation variation in read and conversational Austrian German", Accepted In Proc. Interspeech 2014.

  • A. Tsiami, I. Rodomagoulakis, P. Giannoulis, A. Katsamanis, G. Potamianos and P. Maragos, “ATHENA: A Greek Multi-Sensory Database for Home Automation Control”, Accepted In Proc. Interspeech 2014.

  • J. A. M. Cordovilla, M. Hagmüller, H. Pessentheiner and G. Kubin, “Distant Speech Recognition in Reverberant Noisy Conditions Employing a Microphone Array”, Accepted In Proc. Eusipco 2014.

  • A. Tsiami, A. Katsamanis, P. Maragos and G. Potamianos, “Experiments in Acoustic Source Localization Using Sparse Arrays in Adverse Indoors Environments”, Accepted In Proc. Eusipco 2014.

  • P. Giannoulis, G. Potamianos, A. Katsamanis and P. Maragos, “Multi-Microphone Fusion for Detection of Speech and Acoustic Events in Smart Spaces”, Accepted In Proc. Eusipco 2014.

  • R. F. Astudillo, S. Braun, and E. A. P. Habets, “A multi-channel feature compensation approach for robust ASR in noisy and reverberant environments,” In Proc. REverberant Voice Enhancement and Recognition Benchmark (REVERB) workshop, 2014.

  • R. F. Astudillo, A. Abad, I. Trancoso, “Accounting for the Residual Uncertainty of Multi-Layer Perceptron based Features”, In Proc. ICASSP 2014.

  • A. Katsamanis, I. Rodomagoulakis, G. Potamianos, P. Maragos, and A. Tsiami, “Robust far-field spoken command recognition for home automation combining adaptation and multichannel processing”, In Proc. ICASSP, 2014.

  • A. Brutti, M. Matassoni, "On the use of Early-to-Late Reverberation Ratio for ASR in reverberant environments", In Proc. ICASSP 2014.

  • A. Brutti, M. Ravanelli, P. Svaizer, M. Omologo, "A speech event detection and localization task for multiroom environments", In Proc. HSCMA 2014.

  • C. M. Guerrero, M. Omologo, "Word boundary agreement to combine multi-microphone hypotheses in distant speech recognition", In Proc. HSCMA 2014.

  • P. Giannoulis, A. Tsiami, I. Rodomagoulakis, A. Katsamanis, G. Potamianos, P. Maragos, “The ATHENA-RC system for speech activity detection and speaker localization in the DIRHA smart home”, In Proc. HSCMA 2014.

  • L. Cristoforetti, M. Ravanelli, M. Omologo, A. Sosi, A. Abad, M. Hagmueller, P. Maragos, "The DIRHA simulated corpus", In Proc. LREC 2014.

  • B. Schuppler, M. Hagmüller, J. A. Morales-Cordovilla, and H. Pessentheiner, "GRASS: The Graz Corpus of Read and Spontaneous Speech", In Proc. LREC 2014.

 

2013

  • G. Evangelopoulos, A. Zlatintsi, A. Potamianos, P. Maragos, K. Rapantzikos, G. Skoumas and Y. Avrithis, “Multimodal Saliency and Fusion for Movie Summarization Based on Aural, Visual, and Textual Attention”, IEEE Transactions on Multimedia, vol. 15, no. 7, November 2013.

  • P. Mowlaee and R. Saeidi, "On Phase Importance in Parameter Estimation in Single-channel Speech Enhancement", Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, Vancouver, Canada, May 2013.

  • A. Abad, R. Astudillo, I. Trancoso, “The L2F Spoken Web Search system for Mediaeval 2013”, Proc. of the MediaEval Workshop, Barcelona, Spain, October 2013.

  • M. Matassoni, "Controllare la casa con la voce: il progetto DIRHA", Proc. AISV 2013, Venezia, Jan 2013.

  • P. Mowlaee and R. Saeidi, "Target Speaker Separation in a Multisource Environment Using Speaker-dependent Postfilter and Noise Estimation", Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, Vancouver, Canada, May 2013.

  • P. Mowlaee and M. Watanabe, "Iterative Sinusoidal-based Partial Phase Reconstruction in Single-Channel Source Separation", Proc. 14th Annual Conference of the International Speech Communication Association (Interspeech), Lyon, France, pp. 832-836, August 2013.

  • P. Mowlaee, M. Watanabe, and R. Saeidi, "Show & Tell: Phase-Aware Single-Channel Speech Enhancement", Proc. 14th Annual Conference of the International Speech Communication Association (Interspeech), Lyon, France, pp. 1872-1874, August 2013.

  • P. Langjahr, and P. Mowlaee, "Objective Quality Assessment of Target Speaker Separation Performance in Multisource Reverberant Environment", Proc. Interspeech 2013 Satellite Workshop: 4th International Workshop on Perceptual Quality of Systems (PQS 2013), Vienna, Austria, pp. 89-94, Sep. 2013.

  • A. K. Fuchs, J. A. Morales-Cordovilla, and M. Hagmueller, “ASR for electrolaryngeal speech”, Proc. Automatic Speech Recognition and Understanding Workshop (ASRU 2013), Dec. 2013 (accepted).

  • I. Rodomagoulakis, G. Potamianos and P. Maragos, “Advances in Large Vocabulary Continuous Speech Recognition in Greek: Modeling and Nonlinear Features”, to be presented in the 21st European Signal Processing Conference (EUSIPCO 2013), Marrakech, Morocco, September 2013.

  • H. Medeiros, H. Moniz, F. Batista, I. Trancoso, L. Nunes, “Disfluency Detection Based on Prosodic Features for University Lectures”, In Proc. of Interspeech 2013, Lyon, France, August 2013.

  • A. Abad, L.J. Rodríguez-Fuentes, M. Penagarikano, A. Varona and G. Bordel, "On the Calibration and Fusion of Heterogeneous Spoken Term Detection Systems", In Proc. of Interspeech 2013, Lyon, France, August 2013.

  • A. Brutti, M. Omologo, “Geometric contamination for GMM/UBM speaker verification in reverberant environments”, In Proc. of Interspeech 2013, Lyon, France, August 2013.

  • I. Rodomagoulakis, P. Giannoulis, Z.-I. Skordilis, P. Maragos and G. Potamianos, “Experiments on Far-field Multichannel Speech Processing in Smart Homes”, to be presented in the 18th Int’l Conference on Digital Signal Processing, (DSP 2013), Santorini, Greece, July 2013.

  • E. Khoury et al., "The 2013 Speaker Recognition Evaluation in Mobile Environment", In Proc. of the 6th IAPR International Conference on Biometrics (ICB 2013), Madrid, Spain, June 2013.

  • P. Mowlaee, J. A. Morales-Cordovilla, F. Pernkopf, H. Pessentheiner, M. Hagmueller, G. Kubin. "The 2nd 'CHiME' speech separation and recognition challenge: approaches on single-channel source separation and model-driven speech enhancement". The 2nd CHiME Workshop, Vancouver, Canada, May 2013.

  • J. A. Morales-Cordovilla, H. Pessentheiner, M. Hagmueller, P. Mowlaee, F. Pernkopf, G. Kubin. "A German distant speech recognizer based on 3D beamforming and harmonic missing data mask". AIA-DAGA, Merano, Italy, March 2013.
     

2012

  • A. Abad and R.F. Astudillo, "The L2F Spoken Web Search system for Mediaeval 2012", Proc. MediaEval 2012, Pisa, Italy, Oct. 2012.

  • E. Antonakos, V. Pitsikalis, I. Rodomagoulakis and P. Maragos, “Unsupervised Classification of Extreme Facial Events Using Active Appearance Models Tracking for Sign Language Videos”, Proc. ICIP 2012, Orlando, Florida, USA, Sep. 2012.

  • R.F. Astudillo, A. Abad, J.P. da S. Neto, “Uncertainty driven Compensation of Multi-Stream MLP Acoustic Models for Robust ASR”, Proc. InterSpeech 2012, Portland, Oregon, Sep. 2012.

  • R.F. Astudillo, D. Kolossa, A. Abad, S. Zeiler, R. Saeidi, P. Mowlaee, J.P. da S. Neto and R. Martin, “Integration of Beamforming and Uncertainty-of-Observation Techniques for Robust ASR in Multi-Source Environments”, Computer Speech and Language, Special issue on Multisource Environments (accepted).

  • A. Brutti, M. Omologo, P. Svaizer, “Maximum a Posteriori Trajectory Estimation for Acoustic Source Tracking”, Proc. IWAENC 2012, Aachen, Germany, Sep. 2012.

  • C. Georgakis, P. Maragos, G. Evangelopoulos and D. Dimitriadis, “Dominant Spatiotemporal Modulations and Energy Tracking in Videos: Application to Interest Point Detection for Action Recognition”, Proc. ICIP 2012, Orlando, Florida, USA, Sep. 2012.

  • P. Giannoulis and G. Potamianos, “A hierarchical approach with feature selection for emotion recognition from speech”, Proc. LREC 2012, Istanbul, Turkey, May 2012.

  • J.A. Morales-Cordovilla, P. Cabañas-Molero, V. Sanchez and A.M. Peinado, “A robust pitch extractor based on DTW lines and CASA with application in noisy speech recognition”, Proc. Iberspeech 2012, Madrid, Spain, Nov. 2012.

  • H. Pessentheiner, S. Petrik, and H. Romsdorfer, "Beamforming using uniform circular arrays for distant speech recognition in reverberant environments and double talk scenarios", Proc. InterSpeech 2012, Portland, Oregon, USA, Sep. 2012.

  • H. Pessentheiner, G. Kubin, and H. Romsdorfer, "Improving Beamforming for Distant Speech Recognition in Reverberant Environments Using a Genetic Algorithm for Planar Array Synthesis", Proc. ITG Conference on Speech Communication 2012, Braunschweig, Germany, Sep. 2012.

  • M. Ravanelli, A. Sosi, P. Svaizer, and M. Omologo, "Impulse Response Estimation for Robust Speech Recognition in a Reverberant Environment", Proc. EUSIPCO 2012, Bucharest, Romania, Aug. 2012.