P. Mowlaee, R. Saeidi, and G. Kubin, "Iterative closed-loop speech enhancement", European Patent, filed on August 23rd, 2013.



Note: Journal articles are reported in italics



  • E. Messner, H. Pessentheiner & J. A. Morales-Cordovilla et al. "Adaptive Differential Microphone Arrays Used As A Front-End For An Automatic Speech Recognition System", ICASSP 2015 (accepted).

  • Z. I. Skordilis, A. Tsiami, P. Maragos, G. Potamianos, L. Spelgatti and R. Sannino, "Multichannel Speech Enhancement Using MEMS Microphones", ICASSP 2015 (accepted).

  • M. Fakhry, P. Svaizer, and M. Omologo, "Audio source separation using a redundant library of source spectral bases for nonnegative tensor factorization", ICASSP 2015 (accepted).

  • E. Zwyssig, M. Ravanelli, P. Svaizer, and M. Omologo, "A multi-channel corpus for distant-speech interaction in presence of known interferences", ICASSP 2015 (accepted).




  • J. Kulmer, P. Mowlaee & M. K. Watanabe, "A probabilistic approach for phase estimation in single-channel speech enhancement using von Mises phase priors", In Proc. MLSP 2014.

  • J. A. Morales-Cordovilla, H. Pessentheiner, M. Hagmüller et al., "CVX-Optimized Beamforming and Vector Taylor Series Compensation with German ASR Employing Star-Shaped Microphone Array", In Proc. IberSPEECH 2014.

  • M. Matos, A. Abad, R. Astudillo and I. Trancoso, "Recognition of distant voice commands for home applications in Portuguese", In Proc. IberSPEECH 2014.

  • A. Abad, M. Matos, H. Meinedo, R. Astudillo and I. Trancoso, "The L2F system for the EVALITA-2014 speech activity detection challenge in domestic environments", In Proc. EVALITA 2014.

  • A. Brutti, M. Ravanelli, M. Omologo, "SASLODOM: Speech Activity detection and Speaker LOcalization in DOMestic environments", In Proc. EVALITA 2014.

  • B. Schuppler, S. Grill, A. Menrath and J. A. Morales Cordovilla, "Automatic phonetic transcription in two steps: forced alignment and burst detection", In Proc. SLSP 2014.

  • M. Matassoni, A. Brutti, P. Svaizer, "Acoustic modeling based on Early-to-Late Reverberation Ratio for robust ASR", In Proc. IWAENC 2014.

  • P. Mowlaee, C. Nachbar, "Speaker Dependent Speech Enhancement Using Sinusoidal Model", In Proc. IWAENC 2014.

  • P. Mowlaee, R. Saeidi, "Time-Frequency Constraints for Phase Estimation in Single-Channel Speech Enhancement", In Proc. IWAENC 2014.

  • M. Fakhry, P. Svaizer, and M. Omologo, "Reverberant Audio Source Separation using Partially Pre-trained Nonnegative Matrix Factorization", In Proc. IWAENC 2014.

  • C. E. Cancino Chacon and P. Mowlaee, "Least Squares Phase Estimation of Mixed Signals", In Proc. INTERSPEECH 2014.

  • M. Matassoni, R. F. Astudillo, A. Katsamanis, and M. Ravanelli, "The DIRHA-GRID corpus: baseline and tools for multi-room distant speech recognition using distributed microphones", In Proc. INTERSPEECH 2014.

  • H. K. Maganti and M. Matassoni, "Auditory processing-based features for improving speech recognition in adverse acoustic conditions", EURASIP Journal on Audio, Speech, and Music Processing 2014.

  • M. Ravanelli, M. Omologo, "On the selection of the impulse responses for distant-speech recognition based on contaminated speech training", In Proc. INTERSPEECH 2014.

  • J. A. Morales Cordovilla, H. Pessentheiner, M. Hagmüller and G. Kubin, "Room Localization for Distant Speech Recognition", In Proc. INTERSPEECH 2014.

  • P. Mowlaee, R. Saeidi, Y. Stylianou, INTERSPEECH 2014 Special Session: "On Phase Importance in Speech Processing Applications", In Proc. INTERSPEECH 2014.

  • P. Mowlaee, M. K. Watanabe and R. Saeidi, "Show & Tell: Iterative Refinement of Amplitude and Phase in Single-channel Speech Enhancement", In Proc. INTERSPEECH 2014.

  • B. Schuppler, M. Adda-Decker, and J. A. Morales Cordovilla, "Pronunciation variation in read and conversational Austrian German", In Proc. INTERSPEECH 2014.

  • A. Tsiami, I. Rodomagoulakis, P. Giannoulis, A. Katsamanis, G. Potamianos and P. Maragos, "ATHENA: A Greek Multi-Sensory Database for Home Automation Control", In Proc. INTERSPEECH 2014.

  • C. M. Guerrero, M. Omologo, "Exploiting Inter-Microphone Agreement for Hypothesis Combination in Distant Speech Recognition", In Proc. EUSIPCO 2014.

  • J. A. Morales Cordovilla, M. Hagmüller, H. Pessentheiner and G. Kubin, "Distant Speech Recognition in Reverberant Noisy Conditions Employing a Microphone Array", In Proc. EUSIPCO 2014.

  • A. Tsiami, A. Katsamanis, P. Maragos and G. Potamianos, "Experiments in Acoustic Source Localization Using Sparse Arrays in Adverse Indoors Environments", In Proc. EUSIPCO 2014.

  • P. Giannoulis, G. Potamianos, A. Katsamanis and P. Maragos, "Multi-Microphone Fusion for Detection of Speech and Acoustic Events in Smart Spaces", In Proc. EUSIPCO 2014.

  • C. Leitner, J. A. Morales-Cordovilla and F. Pernkopf, "Evaluation of speech enhancement based on pre-image iterations using automatic speech recognition", In Proc. EUSIPCO 2014.

  • A. Zehetner, M. Hagmüller & F. Pernkopf, "Wake-Up-Word Spotting for Mobile Systems", In Proc. EUSIPCO 2014.

  • R. F. Astudillo, S. Braun, and E. A. P. Habets, "A multi-channel feature compensation approach for robust ASR in noisy and reverberant environments", In Proc. REverberant Voice Enhancement and Recognition Benchmark (REVERB) workshop, 2014.

  • R. F. Astudillo, A. Abad, I. Trancoso, "Accounting for the Residual Uncertainty of Multi-Layer Perceptron based Features", In Proc. ICASSP 2014.

  • R. Peharz, G. Kapeller, P. Mowlaee et al., "Modeling speech with sum-product networks: Application to bandwidth extension", In Proc. ICASSP 2014.

  • A. Katsamanis, I. Rodomagoulakis, G. Potamianos, P. Maragos, and A. Tsiami, "Robust far-field spoken command recognition for home automation combining adaptation and multichannel processing", In Proc. ICASSP 2014.

  • A. Brutti, M. Matassoni, "On the use of Early-to-Late Reverberation Ratio for ASR in reverberant environments", In Proc. ICASSP 2014.

  • A. Brutti, M. Ravanelli, P. Svaizer, M. Omologo, "A speech event detection and localization task for multiroom environments", In Proc. HSCMA 2014.

  • C. M. Guerrero, M. Omologo, "Word boundary agreement to combine multi-microphone hypotheses in distant speech recognition", In Proc. HSCMA 2014.

  • P. Giannoulis, A. Tsiami, I. Rodomagoulakis, A. Katsamanis, G. Potamianos, P. Maragos, "The ATHENA-RC system for speech activity detection and speaker localization in the DIRHA smart home", In Proc. HSCMA 2014.

  • L. Cristoforetti, M. Ravanelli, M. Omologo, A. Sosi, A. Abad, M. Hagmueller, P. Maragos, "The DIRHA simulated corpus", In Proc. LREC 2014.

  • B. Schuppler, M. Hagmüller, J. A. Morales-Cordovilla, and H. Pessentheiner, "GRASS: The Graz Corpus of Read and Spontaneous Speech", In Proc. LREC 2014.



  • G. Evangelopoulos, A. Zlatintsi, A. Potamianos, P. Maragos, K. Rapantzikos, G. Skoumas and Y. Avrithis, “Multimodal Saliency and Fusion for Movie Summarization Based on Aural, Visual, and Textual Attention”, IEEE Transactions on Multimedia, vol. 15, no. 7, November 2013.

  • P. Mowlaee and R. Saeidi, "On Phase Importance in Parameter Estimation in Single-Channel Speech Enhancement", In Proc. ICASSP 2013.

  • P. Mowlaee and R. Saeidi, "Target Speaker Separation in a Multisource Environment Using Speaker-dependent Postfilter and Noise Estimation", In Proc. ICASSP 2013.

  • A. Abad, R. Astudillo, I. Trancoso, “The L2F Spoken Web Search system for Mediaeval 2013”, In Proc. of the MediaEval Workshop, Barcelona, Spain, October 2013.

  • M. Matassoni, "Controllare la casa con la voce: il progetto DIRHA", In Proc. AISV 2013.

  • P. Mowlaee and M. Watanabe, "Iterative Sinusoidal-based Partial Phase Reconstruction in Single-Channel Source Separation", In Proc. INTERSPEECH 2013.


  • H. Medeiros, H. Moniz, F. Batista, I. Trancoso, L. Nunes, "Disfluency Detection Based on Prosodic Features for University Lectures", In Proc. INTERSPEECH 2013.

  • A. Abad, L.J. Rodríguez-Fuentes, M. Penagarikano, A. Varona and G. Bordel, "On the Calibration and Fusion of Heterogeneous Spoken Term Detection Systems", In Proc. INTERSPEECH 2013.

  • A. Brutti, M. Omologo, “Geometric contamination for GMM/UBM speaker verification in reverberant environments”, In Proc. of INTERSPEECH 2013.

  • P. Langjahr and P. Mowlaee, "Objective Quality Assessment of Target Speaker Separation Performance in Multisource Reverberant Environment", Proc. Interspeech 2013 Satellite Workshop: 4th International Workshop on Perceptual Quality of Systems (PQS 2013), Vienna, Austria, pp. 89-94, Sep. 2013.

  • A. K. Fuchs, J. A. Morales-Cordovilla, and M. Hagmueller, "ASR for electrolaryngeal speech", In Proc. ASRU 2013.

  • I. Rodomagoulakis, G. Potamianos and P. Maragos, “Advances in Large Vocabulary Continuous Speech Recognition in Greek: Modeling and Nonlinear Features”, In Proc. EUSIPCO 2013.

  • I. Rodomagoulakis, P. Giannoulis, Z.-I. Skordilis, P. Maragos and G. Potamianos, “Experiments on Far-field Multichannel Speech Processing in Smart Homes”, In Proc. DSP 2013.

  • E. Khoury et al., "The 2013 Speaker Recognition Evaluation in Mobile Environment", In Proc. of the 6th IAPR International Conference on Biometrics (ICB 2013), Madrid, Spain, June 2013.

  • P. Mowlaee, J. A. Morales-Cordovilla, F. Pernkopf, H. Pessentheiner, M. Hagmueller, G. Kubin, "The 2nd 'CHiME' speech separation and recognition challenge: approaches on single-channel source separation and model-driven speech enhancement", The 2nd CHiME Workshop, Vancouver, Canada, May 2013.

  • J. A. Morales-Cordovilla, H. Pessentheiner, M. Hagmueller, P. Mowlaee, F. Pernkopf, G. Kubin, "A German distant speech recognizer based on 3D beamforming and harmonic missing data mask", AIA-DAGA, Merano, Italy, March 2013.


  • A. Abad and R.F. Astudillo, "The L2F Spoken Web Search system for Mediaeval 2012", Proc. MediaEval 2012, Pisa, Italy, Oct. 2012.

  • E. Antonakos, V. Pitsikalis, I. Rodomagoulakis and P. Maragos, “Unsupervised Classification of Extreme Facial Events Using Active Appearance Models Tracking for Sign Language Videos”, Proc. ICIP 2012, Orlando, Florida, USA, Sep. 2012.

  • R.F. Astudillo, A. Abad, J.P. da S. Neto, "Uncertainty driven Compensation of Multi-Stream MLP Acoustic Models for Robust ASR", Proc. INTERSPEECH 2012, Portland, Oregon, USA, Sep. 2012.

  • R.F. Astudillo, D. Kolossa, A. Abad, S. Zeiler, R. Saeidi, P. Mowlaee, J.P. da S. Neto and R. Martin, "Integration of Beamforming and Uncertainty-of-Observation Techniques for Robust ASR in Multi-Source Environments", Computer Speech and Language, Special issue on Multisource Environments (accepted).

  • A. Brutti, M. Omologo, P. Svaizer, “Maximum a Posteriori Trajectory Estimation for Acoustic Source Tracking”, Proc. IWAENC 2012, Aachen, Germany, Sep. 2012.

  • C. Georgakis, P. Maragos, G. Evangelopoulos and D. Dimitriadis, “Dominant Spatiotemporal Modulations and Energy Tracking in Videos: Application to Interest Point Detection for Action Recognition”, Proc. ICIP 2012, Orlando, Florida, USA, Sep. 2012.

  • P. Giannoulis and G. Potamianos, “A hierarchical approach with feature selection for emotion recognition from speech”, Proc. LREC 2012, Istanbul, Turkey, May 2012.

  • J.A. Morales-Cordovilla, P. Cabañas-Molero, V. Sanchez and A.M. Peinado, “A robust pitch extractor based on DTW lines and CASA with application in noisy speech recognition”, Proc. IberSPEECH 2012, Madrid, Spain, Nov. 2012.

  • H. Pessentheiner, S. Petrik, and H. Romsdorfer, "Beamforming using uniform circular arrays for distant speech recognition in reverberant environments and double talk scenarios", Proc. INTERSPEECH 2012, Portland, Oregon, USA, Sep. 2012.

  • H. Pessentheiner, G. Kubin, and H. Romsdorfer, "Improving Beamforming for Distant Speech Recognition in Reverberant Environments Using a Genetic Algorithm for Planar Array Synthesis", Proc. ITG Conference on Speech Communication 2012, Braunschweig, Germany, Sep. 2012.

  • M. Ravanelli, A. Sosi, P. Svaizer, and M. Omologo, "Impulse Response Estimation for Robust Speech Recognition in a Reverberant Environment", Proc. EUSIPCO 2012, Bucharest, Romania, Aug. 2012.