2025
MultiHuman-Testbench: Benchmarking Image Generation for Multiple Humans.
CoRR, June, 2025
Tripartite Weight-Space Ensemble for Few-Shot Class-Incremental Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
2024
Hollowed Net for On-Device Personalization of Text-to-Image Diffusion Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Balanced Learning for Multi-Domain Long-Tailed Speaker Recognition.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024
Feature Diversification and Adaptation for Federated Domain Generalization.
Proceedings of the Computer Vision - ECCV 2024, 2024
FedHide: Federated Learning by Hiding in the Neighbors.
Proceedings of the Computer Vision - ECCV 2024, 2024
2023
Multi-Scale Temporal Feature Fusion for Few-Shot Action Recognition.
Proceedings of the IEEE International Conference on Image Processing, 2023
Label Shift Adapter for Test-Time Adaptation under Covariate and Label Shifts.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Few-Shot Common Action Localization via Cross-Attentional Fusion of Context and Temporal Dynamics.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Neural Transformation Network to Generate Diverse Views for Contrastive Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Progressive Random Convolutions for Single Domain Generalization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
Leaky Gated Cross-Attention for Weakly Supervised Multi-Modal Temporal Action Localization.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022
Domain Agnostic Few-shot Learning for Speaker Verification.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
ConFeSS: A Framework for Single Source Cross-Domain Few-Shot Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022
Improving Test-Time Adaptation Via Shift-Agnostic Weight Regularization and Nearest Source Prototypes.
Proceedings of the Computer Vision - ECCV 2022, 2022
Multi-Head Modularization to Leverage Generalization Capability in Multi-Modal Networks.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
2021
Distribution Estimation to Automate Transformation Policies for Self-Supervision.
CoRR, 2021
Federated Learning of User Verification Models Without Sharing Embeddings.
Proceedings of the 38th International Conference on Machine Learning, 2021
Cross-Attentional Audio-Visual Fusion for Weakly-Supervised Action Localization.
Proceedings of the 9th International Conference on Learning Representations, 2021
Efficient Action Recognition via Dynamic Knowledge Propagation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Prototype-Based Personalized Pruning.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021
Subspectral Normalization for Neural Audio Data Processing.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021
2020
Federated Learning of User Authentication Models.
CoRR, 2020
End-to-End Lane Marker Detection via Row-wise Classification.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
An End-to-End Text-Independent Speaker Verification Framework with a Keyword Adversarial Network.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Weakly Labeled Sound Event Detection using Tri-training and Adversarial Learning.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events 2019 (DCASE 2019), 2019
Acoustic Scene Classification Based on a Large-margin Factorized CNN.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events 2019 (DCASE 2019), 2019
2017
Speaker Clustering by Iteratively Finding Discriminative Feature Space and Cluster Labels.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
2012
Loss-Scaled Large-Margin Gaussian Mixture Models for Speech Emotion Classification.
IEEE Trans. Speech Audio Process., 2012
Phoneme Classification using Constrained Variational Gaussian Process Dynamical System.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012
Joint Kernel Learning for Supervised Image Segmentation.
Proceedings of the Computer Vision - ACCV 2012, 2012
2011
Large Margin Discriminative Semi-Markov Model for Phonetic Recognition.
IEEE Trans. Speech Audio Process., 2011
Learning a discriminative visual codebook using homonym scheme.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2011
2010
Wearable sensor activity analysis using semi-Markov models with a grammar.
Pervasive Mob. Comput., 2010
Parametric emotional singing voice synthesis.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2010
Large-margin training of semi-Markov model for phonetic recognition.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2010
2009
Speech emotion recognition via a max-margin framework incorporating a loss function based on the Watson and Tellegen's emotion model.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2009
2004
Hybrid utterance verification based on n-best models and model derived from Kullback-Leibler divergence.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004