Ziqiang Shi
ORCID: 0000-0002-3105-6213
According to our database, Ziqiang Shi authored at least 60 papers between 2010 and 2024.
Bibliography
2024
RealSinger: Ultra-realistic singing voice generation via stochastic differential equations.
Neurocomputing, 2024
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
SchröWave: Realistic voice generation by solving two-stage conditional Schrödinger bridge problems.
Digit. Signal Process., September, 2023
Semi-Supervised Contrastive Learning with Soft Mask Attention for Facial Action Unit Detection.
Proceedings of the IEEE International Conference on Acoustics, 2023
CheckSORT: Refined Synthetic Data Combination and Optimized SORT for Automatic Retail Checkout.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
Digit. Signal Process., 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
CoRR, 2021
ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation.
CoRR, 2021
2020
IEEE Trans. Comput. Soc. Syst., 2020
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Learning Temporal Relations from Semantic Neighbors for Acoustic Scene Classification.
IEEE Signal Process. Lett., 2020
CoRR, 2020
Hodge and Podge: Hybrid Supervised Sound Event Detection with Multi-Hot MixMatch and Composition Consistence Training.
CoRR, 2020
La Furca: Iterative Context-Aware End-to-End Monaural Speech Separation Based on Dual-Path Deep Parallel Inter-Intra Bi-LSTM with Attention.
CoRR, 2020
FurcaNeXt: End-to-End Monaural Speech Separation with Dynamic Gated Dilated Temporal Convolutional Networks.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020
ATReSN-Net: Capturing Attentive Temporal Relations in Semantic Neighborhood for Acoustic Scene Classification.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Speech Separation Based on Multi-Stage Elaborated Dual-Path Deep BiLSTM with Auxiliary Identity Loss.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Hodge and Podge: Hybrid Supervised Sound Event Detection with Multi-Hot MixMatch and Composition Consistence Training.
Proceedings of the 28th European Signal Processing Conference, 2020
2019
FurcaNeXt: End-to-end monaural speech separation with dynamic gated dilated temporal convolutional networks.
CoRR, 2019
FurcaNet: An end-to-end deep gated convolutional, long short-term memory, deep neural networks for single channel speech separation.
CoRR, 2019
CoRR, 2019
Deep Attention Gated Dilated Temporal Convolutional Networks with Intra-Parallel Convolutional Modules for End-to-End Monaural Speech Separation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
End-to-End Monaural Speech Separation with Multi-Scale Dynamic Weighted Gated Dilated Convolutional Pyramid Network.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 18th IEEE International Conference On Machine Learning And Applications, 2019
Furcax: End-to-end Monaural Speech Separation Based on Deep Gated (De)convolutional Neural Networks with Adversarial Example Training.
Proceedings of the IEEE International Conference on Acoustics, 2019
HODGEPODGE: Sound Event Detection Based on Ensemble of Semi-Supervised Learning Methods.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events 2019 (DCASE 2019), 2019
2018
A Double Joint Bayesian Approach for J-Vector Based Text-dependent Speaker Verification.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018
Latent Factor Analysis of Deep Bottleneck Features for Speaker Verification with Random Digit Strings.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Joint Learning of J-Vector Extractor and Joint Bayesian Model for Text Dependent Speaker Verification.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Double Joint Bayesian Modeling of DNN Local I-Vector for Text Dependent Speaker Verification with Random Digit Strings.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
2017
Multi-view Probability Linear Discrimination Analysis for Multi-view Vector Based Text Dependent Speaker Verification.
CoRR, 2017
Better Worst-Case Complexity Analysis of the Block Coordinate Descent Method for Large Scale Machine Learning.
Proceedings of the 16th IEEE International Conference on Machine Learning and Applications, 2017
Multi-view (Joint) probability linear discrimination analysis for J-vector based text dependent speaker verification.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
2016
Empirical study of PROXTONE and PROXTONE+ for Fast Learning of Large Scale Sparse Models.
CoRR, 2016
2015
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2015
Online and Stochastic Universal Gradient Methods for Minimizing Regularized Hölder Continuous Finite Sums in Machine Learning.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2015
2013
ACM Trans. Intell. Syst. Technol., 2013
Identification of Objectionable Audio Segments Based on Pseudo and Heterogeneous Mixture Models.
IEEE Trans. Speech Audio Process., 2013
Audio Segment Classification Using Online Learning Based Tensor Representation Feature Discrimination.
IEEE Trans. Speech Audio Process., 2013
Fudan at MediaEval 2013: Violent Scenes Detection Using Motion Features and Part-Level Attributes.
Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, 2013
Proceedings of the IJCAI 2013, 2013
2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
2011
Online Learning for Classification of Low-rank Representation Features and Its Applications in Audio Segment Classification.
CoRR, 2011
CoRR, 2011
Heterogeneous mixture models using sparse representation features for applause and laugh detection.
Proceedings of the 2011 IEEE International Workshop on Machine Learning for Signal Processing, 2011
Real-World Speech/Non-Speech Audio Classification Based on Sparse Representation Features and GPCs.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the Neural Information Processing - 18th International Conference, 2011
2010
Int. J. Pattern Recognit. Artif. Intell., 2010