Vidhyasaharan Sethu

Orcid: 0000-0001-8492-1787

According to our database, Vidhyasaharan Sethu authored at least 102 papers between 2007 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







Continuous Emotion Ambiguity Prediction: Modeling With Beta Distributions.
IEEE Trans. Affect. Comput., 2024

Blind Estimation of Sub-band Acoustic Parameters from Ambisonics Recordings using Spectro-Spatial Covariance Features.
CoRR, 2024

AER-LLM: Ambiguity-aware Emotion Recognition Leveraging Large Language Models.
CoRR, 2024

A Joint Spectro-Temporal Relational Thinking Based Acoustic Modeling Framework.
CoRR, 2024

Dual-Constrained Dynamical Neural ODEs for Ambiguity-aware Continuous Emotion Prediction.
CoRR, 2024

Binaural Selective Attention Model for Target Speaker Extraction.
CoRR, 2024

Aligning Tiered Assessments With Course Learning Outcomes.
Proceedings of the IEEE International Conference on Teaching, 2024

Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling.
Proceedings of the IEEE International Conference on Acoustics, 2024

A Probability Gradient Based Approach for Sampling Boundaries of In-Domain Data.
Proceedings of the IEEE International Conference on Acoustics, 2024

A Tiered Learning Framework for Self-Guided Engineering Design Education.
Proceedings of the IEEE Global Engineering Education Conference, 2024

ChatGPT in the Classroom: A Shift in Engineering Design Education.
Proceedings of the IEEE Global Engineering Education Conference, 2024

DNN controlled adaptive front-end for replay attack detection systems.
Speech Commun., October, 2023

A Novel Markovian Framework for Integrating Absolute and Relative Ordinal Emotion Information.
IEEE Trans. Affect. Comput., 2023

Spatial HuBERT: Self-supervised Spatial Speech Representation Learning for a Single Talker from Multi-channel Audio.
CoRR, 2023

From Interval to Ordinal: A HMM based Approach for Emotion Label Conversion.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Improving wav2vec2-based Spoken Language Identification by Learning Phonological Features.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Constrained Dynamical Neural ODE for Time Series Modelling: A Case Study on Continuous Emotion Prediction.
Proceedings of the IEEE International Conference on Acoustics, 2023

Belief Mismatch Coefficient (BMC): A Novel Interpretable Measure of Prediction Accuracy for Ambiguous Emotion States.
Proceedings of the 11th International Conference on Affective Computing and Intelligent Interaction, 2023

A Novel Sequential Monte Carlo Framework for Predicting Ambiguous Emotion States.
Proceedings of the IEEE International Conference on Acoustics, 2022

Compensation Techniques for Speaker Variability in Continuous Emotion Prediction.
IEEE Trans. Affect. Comput., 2021

Teaching Signal Processing Through Frequent and Diverse Design: A Pedagogical Approach.
IEEE Signal Process. Mag., 2021

An adaptive transmission line cochlear model based front-end for replay attack detection.
Speech Commun., 2021

Multimodal Affect Models: An Investigation of Relative Salience of Audio and Visual Cues for Emotion Prediction.
Frontiers Comput. Sci., 2021

Parametric Distributions to Model Numerical Emotion Labels.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

AusKidTalk: An Auditory-Visual Corpus of 3- to 12-Year-Old Australian Children's Speech.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Generalized Two-Stage Rank Regression Framework for Depression Score Prediction from Speech.
IEEE Trans. Affect. Comput., 2020

Natural Language Processing Methods for Acoustic and Landmark Event-Based Features in Speech-Based Depression Detection.
IEEE J. Sel. Top. Signal Process., 2020

Adversarial Multi-Task Learning for Speaker Normalization in Replay Detection.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Cochlear Signal Processing: A Platform for Learning the Fundamentals of Digital Signal Processing.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Estimating cognitive load from speech gathered in a complex real-life training exercise.
Int. J. Hum. Comput. Stud., 2019

The Ambiguous World of Emotion Representation.
CoRR, 2019

Speech Based Emotion Prediction: Can a Linear Model Work?
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Auditory Inspired Spatial Differentiation for Replay Spoofing Attack Detection.
Proceedings of the IEEE International Conference on Acoustics, 2019

Phoneme Specific Modelling and Scoring Techniques for Anti Spoofing System.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Novel Bag-of-Optimised-Clusters Front-End for Speech based Continuous Emotion Prediction.
Proceedings of the 8th International Conference on Affective Computing and Intelligent Interaction, 2019

Using Gaussian Processes with LSTM Neural Networks to Predict Continuous-Time, Dimensional Emotion in Ambiguous Speech.
Proceedings of the 8th International Conference on Affective Computing and Intelligent Interaction, 2019

Generalized Variability Model for Speaker Verification.
IEEE Signal Process. Lett., 2018

Using language cluster models in hierarchical language identification.
Speech Commun., 2018

Speech-based Continuous Emotion Prediction by Learning Perception Responses related to Salient Events: A Study based on Vocal Affect Bursts and Cross-Cultural Affect in AVEC 2018.
Proceedings of the 2018 on Audio/Visual Emotion Challenge and Workshop, 2018

Modulation Dynamic Features for the Detection of Replay Attacks.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Deep Siamese Architecture Based Replay Detection for Secure Voice Biometric.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Sub-band Envelope Features Using Frequency Domain Linear Prediction for Short Duration Language Identification.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Demonstrating and Modelling Systematic Time-varying Annotator Disagreement in Continuous Emotion Annotation.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Speaker-Phonetic Vector Estimation for Short Duration Speaker Verification.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

End-to-End Hierarchical Language Identification System.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Factorized Hidden Variability Learning for Adaptation of Short Duration Language Identification Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Dynamic Multi-Rater Gaussian Mixture Regression Incorporating Temporal Dependencies of Emotion Uncertainty Using Kalman Filters.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Use of Claimed Speaker Models for Replay Detection.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Second Order Factorized Model Adaptation for Short Duration Language Identification.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Front-End for Antispoofing Countermeasures in Speaker Verification: Scattering Spectral Decomposition.
IEEE J. Sel. Top. Signal Process., 2017

A flipped mode approach to teaching an electronic system design course.
Proceedings of the IEEE 6th International Conference on Teaching, 2017

Investigating Word Affect Features and Fusion of Probabilistic Predictions Incorporating Uncertainty in AVEC 2017.
Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge, Mountain View, CA, USA, October 23, 2017

Independent Modelling of High and Low Energy Speech Frames for Spoofing Detection.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Incorporating Local Acoustic Variability Information into Short Duration Speaker Verification.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Investigating Scalability in Hierarchical Language Identification System.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Bidirectional Modelling for Short Duration Language Identification.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

An Investigation of Emotion Prediction Uncertainty Using Gaussian Mixture Regression.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Gaussian Process Regression for Continuous Emotion Recognition with Global Temporal Invariance.
Proceedings of the 1st IJCAI Workshop on Artificial Intelligence in Affective Computing (AffComp 2017), 2017

Salience based lexical features for emotion recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Investigating the use of scattering coefficients for replay attack detection.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Modeling variable length phoneme sequences - A step towards linguistic information for speech emotion recognition in wider world.
Proceedings of the Seventh International Conference on Affective Computing and Intelligent Interaction, 2017

Staircase Regression in OA RVM, Data Selection and Gender Dependency in AVEC 2016.
Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge, 2016

Investigation of Sub-Band Discriminative Information Between Spoofed and Genuine Speech.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Twin Model G-PLDA for Duration Mismatch Compensation in Text-Independent Speaker Verification.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Parallel Speaker and Content Modelling for Text-Dependent Speaker Verification.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Out of Set Language Modelling in Hierarchical Language Identification.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

A Feature Normalisation Technique for PLLR Based Language Identification Systems.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Factor Analysis Based Speaker Normalisation for Continuous Emotion Prediction.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

A hierarchical framework for language identification.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Analysis of acoustic space variability in speech affected by depression.
Speech Commun., 2015

An Investigation of Annotation Delay Compensation and Output-Associative Fusion for Multimodal Continuous Emotion Prediction.
Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge, 2015

Twitter: A New Online Source of Automatically Tagged Data for Conversational Speech Emotion Recognition.
Proceedings of the 1st International Workshop on Affect & Sentiment in Multimedia, 2015

An Iterative Multi Range Non-Negative Matrix Factorization Algorithm for Polyphonic Music Transcription.
Proceedings of the 16th International Society for Music Information Retrieval Conference, 2015

A model based voice activity detector for noisy environments.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Phonemes frequency based PLLR dimensionality reduction for language recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Relevance vector machine for depression prediction.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Weighted pairwise Gaussian likelihood regression for depression score prediction.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Scalable I-vector concatenation for PLDA based language identification system.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

An i-vector GPLDA system for speech based emotion recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

The UNSW submission to INTERSPEECH 2014 compare cognitive load challenge.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Probabilistic acoustic volume analysis for speech affected by depression.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Variability compensation in small data: Oversampled extraction of i-vectors for the classification of depressed speech.
Proceedings of the IEEE International Conference on Acoustics, 2014

On the use of speech parameter contours for emotion recognition.
EURASIP J. Audio Speech Music. Process., 2013

Diagnosis of depression by behavioural signals: a multimodal approach.
Proceedings of the 3rd ACM international workshop on Audio/visual emotion challenge, 2013

GMM based speaker variability compensated system for interspeech 2013 compare emotion challenge.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Modeling spectral variability for the classification of depressed speech.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Speaker variability in speech based emotion models - Analysis and normalisation.
Proceedings of the IEEE International Conference on Acoustics, 2013

Speaker variability in emotion recognition - an adaptation based approach.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

PNCC-ivector-SRC based speaker verification.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Investigation of spectral centroid features for cognitive load classification.
Speech Commun., 2011

Investigation of the robustness of a non-uniform filterbank for cognitive load classification.
Proceedings of the 8th International Conference on Information, 2011

Automatic emotion recognition: an investigation of acoustic and prosodic parameters.
PhD thesis, 2009

Pitch contour parameterisation based on linear stylisation for emotion recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Speaker dependency of spectral features and speech production cues for automatic emotion classification.
Proceedings of the IEEE International Conference on Acoustics, 2009

Selective Weighting of Undecimated Wavelet Coefficients for Noise Reduction in SAR Interferograms.
EURASIP J. Adv. Signal Process., 2008

Phonetic and speaker variations in automatic emotion classification.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Empirical mode decomposition based weighted frequency feature for speech-based emotion classification.
Proceedings of the IEEE International Conference on Acoustics, 2008

A Novel Technique for Noise Reduction in InSAR Images.
IEEE Geosci. Remote. Sens. Lett., 2007

Group delay features for emotion detection.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Speaker Normalisation for Speech-Based Emotion Detection.
Proceedings of the 15th International Conference on Digital Signal Processing, 2007
