Hemant A. Patil

ORCID: 0000-0002-4068-2005

Affiliations:
  • Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT), Gandhinagar, India


According to our database, Hemant A. Patil authored at least 249 papers between 2004 and 2024.

Bibliography

2024
Vulnerability issues in Automatic Speaker Verification (ASV) systems.
EURASIP J. Audio Speech Music. Process., December, 2024

Modeling musical expectancy via reinforcement learning and directed graphs.
Multim. Tools Appl., March, 2024

Morse wavelet transform-based features for voice liveness detection.
Comput. Speech Lang., March, 2024

CQT-Based Cepstral Features for Classification of Normal vs. Pathological Infant Cry.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Voice Privacy Using Time-Scale and Pitch Modification.
SN Comput. Sci., 2024

Noise Robust Whisper Features for Dysarthric Automatic Speech Recognition.
Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

2023
Multiple voice disorders in the same individual: Investigating handcrafted features, multi-label classification algorithms, and base-learners.
Speech Commun., July, 2023

On significance of constant-Q transform for pop noise detection.
Comput. Speech Lang., 2023

Replay spoof detection using energy separation based instantaneous frequency estimation from quadrature and in-phase components.
Comput. Speech Lang., 2023

Analysis of Mandarin vs English Language for Emotional Voice Conversion.
Proceedings of the Speech and Computer - 25th International Conference, 2023

Linear Frequency Residual Features for Infant Cry Classification.
Proceedings of the Speech and Computer - 25th International Conference, 2023

Transfer Learning Using Whisper for Dysarthric Automatic Speech Recognition.
Proceedings of the Speech and Computer - 25th International Conference, 2023

Constant-Q Based Harmonic and Pitch Features for Normal vs. Pathological Infant Cry Classification.
Proceedings of the Speech and Computer - 25th International Conference, 2023

Quantifying the Emotional Landscape of Music with Three Dimensions.
Proceedings of the Speech and Computer - 25th International Conference, 2023

Linear Frequency Residual Cepstral Coefficients for Speech Emotion Recognition.
Proceedings of the Speech and Computer - 25th International Conference, 2023

On the Asymptotic Behaviour of the Speech Signal.
Proceedings of the Speech and Computer - 25th International Conference, 2023

Robustness of Whisper Features for Infant Cry Classification.
Proceedings of the Speech and Computer - 25th International Conference, 2023

Modified Group Delay Features for Emotion Recognition.
Proceedings of the Pattern Recognition and Machine Intelligence, 2023

Noise Robust Whisper Features for Dysarthric Severity-Level Classification.
Proceedings of the Pattern Recognition and Machine Intelligence, 2023

Spoken Language Identification Using Linear Frequency Residual Cepstral Coefficients.
Proceedings of the Pattern Recognition and Machine Intelligence, 2023

Whisper Features for Dysarthric Severity-Level Classification.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Whisper Encoder features for Infant Cry Classification.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Cochlear Filter-Based Cepstral Features for Dysarthric Severity-Level Classification.
Proceedings of the 31st European Signal Processing Conference, 2023

Attentions for Short Duration Speech Classification.
Proceedings of the 31st European Signal Processing Conference, 2023

Analysis of Emotions in Speech using AESDD.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Exploring Residual Cepstral Features for Spoken Language Identification.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Classification of Normal vs. Pathological Infant Cries Using Morse Wavelets.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Relevance of Quadrature Phase For Replay Detection in Voice Assistants (VAs).
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022
Music footprint recognition via sentiment, identity, and setting identification.
Multim. Tools Appl., 2022

Voice privacy using CycleGAN and time-scale modification.
Comput. Speech Lang., 2022

Effectiveness of energy separation-based instantaneous frequency estimation for cochlear cepstral features for synthetic and voice-converted spoofed speech detection.
Comput. Speech Lang., 2022

Improving the potential of Enhanced Teager Energy Cepstral Coefficients (ETECC) for replay attack detection.
Comput. Speech Lang., 2022

Analysis of Time-Averaged Feature Extraction Techniques on Infant Cry Classification.
Proceedings of the Speech and Computer - 24th International Conference, 2022

Significance of Energy Features for Severity Classification of Dysarthria.
Proceedings of the Speech and Computer - 24th International Conference, 2022

Continuous Wavelet Transform for Severity-Level Classification of Dysarthria.
Proceedings of the Speech and Computer - 24th International Conference, 2022

Significance of Distance on Pop Noise for Voice Liveness Detection.
Proceedings of the Speech and Computer - 24th International Conference, 2022

Significance of Distance Measures for Speaker Anonymization.
Proceedings of the IEEE International Conference on Signal Processing and Communications, 2022

Morse Wavelet Features for Pop Noise Detection.
Proceedings of the IEEE International Conference on Signal Processing and Communications, 2022

Robustness of DAS Beamformer Over MVDR for Replay Attack Detection On Voice Assistants.
Proceedings of the IEEE International Conference on Signal Processing and Communications, 2022

Noisy Student Teacher Training with Self Supervised Learning for Children ASR.
Proceedings of the IEEE International Conference on Signal Processing and Communications, 2022

Teager Energy Based-Detection of One-point and Two-point Replay Attacks: Towards Cross-Database Generalization.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

The Impact of Room Acoustics on Replay Speech Signal.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Data Augmentation for Infant Cry Classification.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Effect of Speaker-Microphone Proximity on Pop Noise: Continuous Wavelet Transform-Based Approach.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Constant Q Cepstral coefficients for classification of normal vs. Pathological infant cry.
Proceedings of the IEEE International Conference on Acoustics, 2022

Subband Teager Energy Representations for Infant Cry Analysis and Classification.
Proceedings of the 30th European Signal Processing Conference, 2022

Voice Liveness Detection using Constant-Q Transform-Based Features.
Proceedings of the 30th European Signal Processing Conference, 2022

Non-Cepstral Uncertainty Vector for Replay Spoofed Speech Detection.
Proceedings of the 30th European Signal Processing Conference, 2022

Features Motivated From Uncertainty Principle for Classification of Normal vs. Pathological Infant Cry.
Proceedings of the 30th European Signal Processing Conference, 2022

Linear Frequency Residual Cepstral Features for Replay Spoof Detection on ASVSpoof 2019.
Proceedings of the 30th European Signal Processing Conference, 2022

Energy Separation Based Instantaneous Frequency Estimation from Quadrature and In-Phase Components for Replay Spoof Detection.
Proceedings of the 30th European Signal Processing Conference, 2022

Morlet Wavelet-Based Voice Liveness Detection using Convolutional Neural Network.
Proceedings of the 30th European Signal Processing Conference, 2022

2021
Non-intrusive quality assessment of noise-suppressed speech using unsupervised deep features.
Speech Commun., 2021

Residual Neural Network precisely quantifies dysarthria severity-level based on short-duration speech segments.
Neural Networks, 2021

Utterance partitioning for speaker recognition: an experimental review and analysis with new findings under GMM-SVM framework.
Int. J. Speech Technol., 2021

Detection of replay spoof speech using teager energy feature cues.
Comput. Speech Lang., 2021

Modified Group Delay Function Using Different Spectral Smoothing Techniques for Voice Liveness Detection.
Proceedings of the Speech and Computer - 23rd International Conference, 2021

Spectral Root Features for Replay Spoof Detection in Voice Assistants.
Proceedings of the Speech and Computer - 23rd International Conference, 2021

Voice Privacy Through Time-Scale and Pitch Modification.
Proceedings of the Pattern Recognition and Machine Intelligence, 2021

Voice Liveness Detection Using Bump Wavelet with CNN.
Proceedings of the Pattern Recognition and Machine Intelligence, 2021

Voice Privacy Through x-Vector and CycleGAN-Based Anonymization.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Cross-Teager Energy Cepstral Coefficients for Replay Spoof Detection on Voice Assistants.
Proceedings of the IEEE International Conference on Acoustics, 2021

Modified Group Delay Cepstral Coefficients for Voice Liveness Detection.
Proceedings of the 29th European Signal Processing Conference, 2021

Data Augmentation Using CycleGAN for End-to-End Children ASR.
Proceedings of the 29th European Signal Processing Conference, 2021

Exploiting Phase-based Features for Whisper vs. Speech Classification.
Proceedings of the 29th European Signal Processing Conference, 2021

Significance of Constant-Q Transform for Voice Liveness Detection.
Proceedings of the 29th European Signal Processing Conference, 2021

Teager Energy Subband Filtered Features for Near and Far-Field Automatic Speech Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Deep Convolutional Neural Network for Voice Liveness Detection.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
Combination of Amplitude and Frequency Modulation Features for Presentation Attack Detection.
J. Signal Process. Syst., 2020

Amplitude and Frequency Modulation-based features for detection of replay Spoof Speech.
Speech Commun., 2020

Effectiveness of Transfer Learning on Singing Voice Conversion in the Presence of Background Music.
Proceedings of the International Conference on Signal Processing and Communications, 2020

Intelligibility Improvement of Dysarthric Speech using MMSE DiscoGAN.
Proceedings of the International Conference on Signal Processing and Communications, 2020

Analysis of Teager Energy Profiles for Spoof Speech Detection.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Novel Variable Length Teager Energy Profiles for Replay Spoof Detection.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Mspec-Net: Multi-Domain Speech Conversion Network.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Weak Speech Supervision: A case study of Dysarthria Severity Classification.
Proceedings of the 28th European Signal Processing Conference, 2020

Energy Separation Based Features for Replay Spoof Detection for Voice Assistant.
Proceedings of the 28th European Signal Processing Conference, 2020

CinC-GAN for Effective F0 prediction for Whisper-to-Normal Speech Conversion.
Proceedings of the 28th European Signal Processing Conference, 2020

Teager Energy Cepstral Coefficients for Classification of Normal vs. Whisper Speech.
Proceedings of the 28th European Signal Processing Conference, 2020

Query-By-Example Spoken Term Detection Using Generative Adversarial Network.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Symmetry In The Structure Of Musical Nodes.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Significance of CMVN for Replay Spoof Detection.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Subband Channel Selection using TEO for Replay Spoof Detection in Voice Assistants.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Design of Voice Privacy System using Linear Prediction.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019
A novel approach to remove outliers for parallel voice conversion.
Comput. Speech Lang., 2019

Vocal Tract Length Normalization using a Gaussian mixture model framework for query-by-example spoken term detection.
Comput. Speech Lang., 2019

Novel Inception-GAN for Whispered-to-Normal Speech Conversion.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Novel Teager Energy Based Subband Features for Audio Acoustic Scene Detection and Classification.
Proceedings of the Pattern Recognition and Machine Intelligence, 2019

Whether to Pretrain DNN or not?: An Empirical Analysis for Voice Conversion.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Phone Aware Nearest Neighbor Technique Using Spectral Transition Measure for Non-Parallel Voice Conversion.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Energy Separation-Based Instantaneous Frequency Estimation for Cochlear Cepstral Feature for Replay Spoof Detection.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Novel Metric Learning for Non-parallel Voice Conversion.
Proceedings of the IEEE International Conference on Acoustics, 2019

Analysis of Reverberation via Teager Energy Features for Replay Spoof Speech Detection.
Proceedings of the IEEE International Conference on Acoustics, 2019

Energy Separation Algorithm Based Spectrum Estimation for Very Short Duration of Speech.
Proceedings of the 27th European Signal Processing Conference, 2019

Combining Evidences from Variable Teager Energy Source and Mel Cepstral Features for Classification of Normal vs. Pathological Voices.
Proceedings of the 27th European Signal Processing Conference, 2019

Effectiveness of Cross-Domain Architectures for Whisper-to-Normal Speech Conversion.
Proceedings of the 27th European Signal Processing Conference, 2019

Novel Enhanced Teager Energy Based Cepstral Coefficients for Replay Spoof Detection.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Novel Adaptive Generative Adversarial Network for Voice Conversion.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Speech Demodulation-based Techniques for Replay and Presentation Attack Detection.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Significance of Higher-Order Spectral Analysis in Infant Cry Classification.
Circuits Syst. Signal Process., 2018

Combining evidences from magnitude and phase information using VTEO for person recognition using humming.
Comput. Speech Lang., 2018

Design of mixture of GMMs for Query-by-Example Spoken Term Detection.
Comput. Speech Lang., 2018

Feature Extraction from Temporal Phase for Speaker Recognition.
Proceedings of the 2018 International Conference on Signal Processing and Communications (SPCOM), 2018

Advances in Low Resource ASR: A Deep Learning Perspective.
Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018

Neural Networks-based Automatic Speech Recognition for Agricultural Commodity in Gujarati Language.
Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018

Relative Phase Shift Features for Replay Spoof Detection System.
Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018

Combining Phase-based Features for Replay Spoof Detection System.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Novel Demodulation-Based Features using Classifier-level Fusion of GMM and CNN for Replay Detection.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Novel Amplitude Weighted Frequency Modulation Features for Replay Spoof Detection.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Novel Empirical Mode Decomposition Cepstral Features for Replay Spoof Detection.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Novel Linear Frequency Residual Cepstral Features for Replay Attack Detection.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Effectiveness of Generative Adversarial Network for Non-Audible Murmur-to-Whisper Speech Conversion.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Effectiveness of Dynamic Features in INCA and Temporal Context-INCA.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Unsupervised Vocal Tract Length Warped Posterior Features for Non-Parallel Voice Conversion.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Auditory Filterbank Learning Using ConvRBM for Infant Cry Classification.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Auditory Filterbank Learning for Temporal Modulation Features in Replay Spoof Speech Detection.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

DA-IICT/IIITV System for Low Resource Speech Recognition Challenge 2018.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Effectiveness of Speech Demodulation-Based Features for Replay Detection.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Novel Variable Length Energy Separation Algorithm Using Instantaneous Amplitude Features for Replay Detection.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Time-Frequency Masking-Based Speech Enhancement Using Generative Adversarial Network.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Novel Spectral Root Cepstral Features for Replay Spoof Detection.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Significance of Teager Energy Operator Phase for Replay Spoof Detection.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Replay Spoof Detection using Power Function Based Features.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Novel Inter Mixture Weighted GMM Posteriorgram for DNN and GAN-based Voice Conversion.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Time-Frequency Mask-based Speech Enhancement using Convolutional Generative Adversarial Network.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

A Survey on Replay Attack Detection for Automatic Speaker Verification (ASV) System.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Significance of Source-Filter Interaction for Classification of Natural vs. Spoofed Speech.
IEEE J. Sel. Top. Signal Process., 2017

Cochlear Filter and Instantaneous Frequency Based Features for Spoofed Speech Detection.
IEEE J. Sel. Top. Signal Process., 2017

Partial matching and search space reduction for QbE-STD.
Comput. Speech Lang., 2017

Novel Phase Encoded Mel Cepstral Features for Speaker Verification.
Proceedings of the Speech and Computer - 19th International Conference, 2017

Novel Linear Prediction Temporal Phase Based Features for Speaker Recognition.
Proceedings of the Speech and Computer - 19th International Conference, 2017

Fusion of a Novel Volterra-Wiener Filter Based Nonlinear Residual Phase and MFCC for Speaker Verification.
Proceedings of the Speech and Computer - 19th International Conference, 2017

Novel Phase Encoded Mel Filterbank Energies for Environmental Sound Classification.
Proceedings of the Pattern Recognition and Machine Intelligence, 2017

Analysis of Features and Metrics for Alignment in Text-Dependent Voice Conversion.
Proceedings of the Pattern Recognition and Machine Intelligence, 2017

Novel Gammatone Filterbank Based Spectro-Temporal Features for Robust Phoneme Recognition.
Proceedings of the Pattern Recognition and Machine Intelligence, 2017

Spoken Keyword Retrieval Using Source and System Features.
Proceedings of the Pattern Recognition and Machine Intelligence, 2017

Effectiveness of Mel Scale-Based ESA-IFCC Features for Classification of Natural vs. Spoofed Speech.
Proceedings of the Pattern Recognition and Machine Intelligence, 2017

Novel Shifted Real Spectrum for Exact Signal Reconstruction.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Unsupervised Representation Learning Using Convolutional Restricted Boltzmann Machine for Spoof Speech Detection.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Unsupervised Filterbank Learning Using Convolutional Restricted Boltzmann Machine for Environmental Sound Classification.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Novel Variable Length Teager Energy Separation Based Instantaneous Frequency Features for Replay Detection.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Novel Amplitude Scaling method for bilinear frequency Warping-based Voice Conversion.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Quality assessment of voice converted speech using articulatory features.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Sub-band Autoencoder features for Automatic Speech Recognition.
Proceedings of the Ninth International Conference on Advances in Pattern Recognition, 2017

Unsupervised Filterbank Learning for Speech-based Access System for Agricultural Commodity.
Proceedings of the Ninth International Conference on Advances in Pattern Recognition, 2017

Two Stage Zero-resource Approaches for QbE-STD.
Proceedings of the Ninth International Conference on Advances in Pattern Recognition, 2017

Novel Energy Separation Based Frequency Modulation Features for Spoofed Speech Classification.
Proceedings of the Ninth International Conference on Advances in Pattern Recognition, 2017

Effectiveness of ideal ratio mask for non-intrusive quality assessment of noise suppressed speech.
Proceedings of the 25th European Signal Processing Conference, 2017

VTLN-warped Gaussian posteriorgram for QbE-STD.
Proceedings of the 25th European Signal Processing Conference, 2017

Novel energy separation based instantaneous frequency features for spoof speech detection.
Proceedings of the 25th European Signal Processing Conference, 2017

Novel TEO-based Gammatone features for environmental sound classification.
Proceedings of the 25th European Signal Processing Conference, 2017

On the convergence of INCA algorithm.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

A novel filtering-based F0 estimation algorithm with an application to voice conversion.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Combining evidences from detection sources for query-by-example spoken term detection.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Novel Unsupervised Auditory Filterbank Learning Using Convolutional RBM for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Newborn infant's cry analysis.
Int. J. Speech Technol., 2016

Spectral analysis of infant cries and adult speech.
Int. J. Speech Technol., 2016

Non-intrusive Quality Assessment of Synthesized Speech using Spectral Features and Support Vector Regression.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Novel Pre-processing using Outlier Removal in Voice Conversion.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Jerk Minimization for Acoustic-To-Articulatory Inversion.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Novel Subband Autoencoder Features for Detection of Spoofed Speech.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Novel Subband Autoencoder Features for Non-Intrusive Quality Assessment of Noise Suppressed Speech.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Unsupervised Deep Auditory Model Using Stack of Convolutional RBMs for Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Native Language Identification Using Spectral and Source-Based Features.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Novel Nonlinear Prediction Based Features for Spoofed Speech Detection.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Filterbank learning using Convolutional Restricted Boltzmann Machine for speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Analysis of natural and synthetic speech using Fujisaki model.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Effectiveness of fundamental frequency (F0) and strength of excitation (SOE) for spoofed speech detection.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Novel deep autoencoder features for non-intrusive speech quality assessment.
Proceedings of the 24th European Signal Processing Conference, 2016

Unsupervised learning of temporal receptive fields using convolutional RBM for ASR task.
Proceedings of the 24th European Signal Processing Conference, 2016

2015
Combining Evidences from Mel Cepstral and Cochlear Cepstral Features for Speaker Recognition Using Whispered Speech.
Proceedings of the Text, Speech, and Dialogue - 18th International Conference, 2015

Vocal Tract Length Normalization Features for Audio Search.
Proceedings of the Text, Speech, and Dialogue - 18th International Conference, 2015

Modified Group Delay Based Features for Asthma and HIE Infant Cries Classification.
Proceedings of the Text, Speech, and Dialogue - 18th International Conference, 2015

Significance of Unvoiced Segments and Fundamental Frequency in Infant Cry Analysis.
Proceedings of the Text, Speech, and Dialogue - 18th International Conference, 2015

Combining Evidences from Bark Scale and Mel Scale Warped Features for VTLN.
Proceedings of the 2nd International Conference on Perception and Machine Intelligence, 2015

Significance of Phase-based Features for Person Recognition Using Humming.
Proceedings of the 2nd International Conference on Perception and Machine Intelligence, 2015

Classification of Stop Consonants using Modulation Spectrogram-Based Features.
Proceedings of the 2nd International Conference on Perception and Machine Intelligence, 2015

Fusion of TEO Phase with MFCC Features for Speaker Verification.
Proceedings of the 2nd International Conference on Perception and Machine Intelligence, 2015

Combining evidences from mel cepstral, cochlear filter cepstral and instantaneous frequency features for detection of natural vs. spoofed speech.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A novel filtering based approach for epoch extraction.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Effectiveness of multiscale fractal dimension for improvement of frame classification rate.
Proceedings of the 23rd European Signal Processing Conference, 2015

Spectral transition measure for detection of obstruents.
Proceedings of the 23rd European Signal Processing Conference, 2015

Classification of normal and pathological infant cries using bispectrum features.
Proceedings of the 23rd European Signal Processing Conference, 2015

2014
Development of vocal tract length normalized phonetic engine for Gujarati and Marathi languages.
Proceedings of the 2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA), 2014

Obstruent classification using modulation spectrogram based features.
Proceedings of the 2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA), 2014

Effectiveness of fractal dimension for ASR in low resource language.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Exploiting speech source information for vowel landmark detection for low resource language.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Deterministic annealing EM algorithm for developing TTS system in Gujarati.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Fusion of magnitude and phase-based features for objective evaluation of TTS voice.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Novel approach for estimating length of the vocal folds using Fujisaki model.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Exploiting Variable length Teager Energy Operator in melcepstral features for person recognition from humming.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Classification of pathological infant cries using modulation spectrogram features.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Chaotic mixed excitation source for speech synthesis.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Effectiveness of PLP-based phonetic segmentation for speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2014

Effectiveness of multiscale fractal dimension-based phonetic segmentation in speech synthesis for low resource language.
Proceedings of the 2014 International Conference on Asian Language Processing, 2014

A spectral transition measure based MELCEPSTRAL features for obstruent detection.
Proceedings of the 2014 International Conference on Asian Language Processing, 2014

Vocal tract length normalization for vowel recognition in low resource languages.
Proceedings of the 2014 International Conference on Asian Language Processing, 2014

Influence of various asymmetrical contextual factors for TTS in a low resource language.
Proceedings of the 2014 International Conference on Asian Language Processing, 2014

A Cepstral Mean Subtraction based features for Singer Identification.
Proceedings of the 2014 International Conference on Asian Language Processing, 2014

Nonlinear analysis of natural vs. HTS-based synthetic speech.
Proceedings of the 2014 International Conference on Asian Language Processing, 2014

Development of language resources for speech application in Gujarati and Marathi.
Proceedings of the 2014 International Conference on Asian Language Processing, 2014

Use of glottal inverse filtering for asthma and HIE infant cries classification.
Proceedings of the 2014 International Conference on Asian Language Processing, 2014

Classification of phonemes using modulation spectrogram based features for Gujarati language.
Proceedings of the 2014 International Conference on Asian Language Processing, 2014

The Blizzard Challenge 2014.
Proceedings of the Blizzard Challenge 2014, Singapore, Singapore, September 19, 2014, 2014

2013
Classification of Fricatives Using Novel Modulation Spectrogram Based Features.
Proceedings of the Pattern Recognition and Machine Intelligence, 2013

Speaker Recognition Using Sparse Representation via Superimposed Features.
Proceedings of the Pattern Recognition and Machine Intelligence, 2013

Algorithms for speech segmentation at syllable-level for text-to-speech synthesis system in Gujarati.
Proceedings of the 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013

A syllable-based framework for unit selection synthesis in 13 Indian languages.
Proceedings of the 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013

Development of speech corpora in Gujarati and Marathi for phonetic transcription.
Proceedings of the 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013

Data collection and corpus design for analysis of normal and pathological infant cry.
Proceedings of the 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013

Development of corpora for person recognition using humming, singing and speech.
Proceedings of the 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013

Importance of Utterance Partitioning in SVM Classifier with GMM Supervectors for Text-Independent Speaker Verification.
Proceedings of the Mining Intelligence and Knowledge Exploration, 2013

Nonlinear prediction of speech signal using Volterra-Wiener series.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Use of PLP Cepstral Features for Phonetic Segmentation.
Proceedings of the 2013 International Conference on Asian Language Processing, 2013

A Novel Gaussian Filter-Based Automatic Labeling of Speech Data for TTS System in Gujarati Language.
Proceedings of the 2013 International Conference on Asian Language Processing, 2013

2012
Static and dynamic information derived from source and system features for person recognition from humming.
Int. J. Speech Technol., 2012

Combining Evidence from Temporal and Spectral Features for Person Recognition Using Humming.
Proceedings of the Perception and Machine Intelligence - First Indo-Japan Conference, 2012

Novel Interleaving Schemes for Speaker Recognition over Lossy Networks.
Proceedings of the Perception and Machine Intelligence - First Indo-Japan Conference, 2012

Significance of magnitude and phase information via VTEO for humming based biometrics.
Proceedings of the 5th IAPR International Conference on Biometrics, 2012

A comparison of waveform fractal dimension techniques for voice pathology classification.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Combining Evidences from Mel Cepstral Features and Cepstral Mean Subtracted Features for Singer Identification.
Proceedings of the 2012 International Conference on Asian Language Processing, 2012

Phonetic Transcription of Fricatives and Plosives for Gujarati and Marathi Languages.
Proceedings of the 2012 International Conference on Asian Language Processing, 2012

Person Recognition Using Humming, Singing and Speech.
Proceedings of the 2012 International Conference on Asian Language Processing, 2012

2011
Effectiveness of Teager energy operator for epoch detection from speech signals.
Int. J. Speech Technol., 2011

Combining Evidence from Spectral and Source-Like Features for Person Recognition from Humming.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Novel VTEO Based Mel Cepstral Features for Classification of Normal and Pathological Voices.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Novel Temporal and Spectral Features Derived from TEO for Classification Normal and Dysphonic Voices.
Proceedings of the Frontiers in Computer Education [International Conference on Frontiers in Computer Education], 2011

Design of a Query-by-Humming System for Hindi Songs Using DDTW Based Approach.
Proceedings of the International Conference on Asian Language Processing, 2011

2010
Novel Variable length Teager Energy Based features for person recognition from their hum.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Variable Length Teager Energy Based Mel Cepstral Features for Identification of Twins.
Proceedings of the Pattern Recognition and Machine Intelligence, 2009

Design and Implementation of HMM-VQ based Isolated Digit Recognition System.
Proceedings of the 4th Indian International Conference on Artificial Intelligence, 2009

DA-IICT Cross-lingual and Multilingual Corpora for Speaker Recognition.
Proceedings of the Seventh International Conference on Advances in Pattern Recognition, 2009

A Novel Approach to Identification of Speakers from Their Hum.
Proceedings of the Seventh International Conference on Advances in Pattern Recognition, 2009

A Novel Modified Polynomial Network Design for Dialect Recognition.
Proceedings of the Seventh International Conference on Advances in Pattern Recognition, 2009

Infant Identification from Their Cry.
Proceedings of the Seventh International Conference on Advances in Pattern Recognition, 2009

2008
A Novel Approach to Language Identification Using Modified Polynomial Networks.
Proceedings of the Speech, 2008

Development of speech corpora for speaker recognition research and evaluation in Indian languages.
Int. J. Speech Technol., 2008

LP spectra vs. Mel spectra for identification of professional mimics in Indian languages.
Int. J. Speech Technol., 2008

Identifying Perceptually Similar Languages Using Teager Energy Based Cepstrum.
Eng. Lett., 2008

Identification of Speakers from Their Hum.
Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

On the development of variable length Teager energy operator (VTEO).
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

2007
Cepstral Domain Teager Energy for Identifying Perceptually Similar Languages.
Proceedings of the Pattern Recognition and Machine Intelligence, 2007

Advances in Speaker Recognition: A Feature Based Approach.
Proceedings of the International Conference on Artificial Intelligence and Pattern Recognition, 2007

Identifying Phonetically Similar Languages Using Teager Energy Based Cepstrum.
Proceedings of the International Conference on Artificial Intelligence and Pattern Recognition, 2007

2006
Design of Cross-lingual and Multilingual Corpora for Speaker Recognition Research and Evaluation in Indian Languages.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

A New Data Fusion Technique and Performance Measure for Identification of Twins in Marathi.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Design of Cubic Spline Wavelet for Open Set Speaker Classification in Marathi.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

2005
The Wavelet Packet Based Cepstral Features for Open Set Speaker Classification in Marathi.
Proceedings of the From Data and Information Analysis to Knowledge Engineering, 2005

2004
The Teager Energy Based Features for Identification of Identical Twins in Multi-lingual Environment.
Proceedings of the Neural Information Processing, 11th International Conference, 2004

