Ziqiang Shi

Orcid: 0000-0002-3105-6213

According to our database, Ziqiang Shi authored at least 60 papers between 2010 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
RealSinger: Ultra-realistic singing voice generation via stochastic differential equations.
Neurocomputing, 2024

Generative Modelling with High-Order Langevin Dynamics.
CoRR, 2024

Conditional Velocity Score Estimation for Image Restoration.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Multimedia Generative Modelling with High-Order Langevin Dynamics.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Langwave: Realistic Voice Generation Based on High-Order Langevin Dynamics.
Proceedings of the IEEE International Conference on Acoustics, 2024

Noisy Image Restoration Based on Conditional Acceleration Score Approximation.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
SchröWave: Realistic voice generation by solving two-stage conditional Schrödinger bridge problems.
Digit. Signal Process., September, 2023

Semi-Supervised Contrastive Learning with Soft Mask Attention for Facial Action Unit Detection.
Proceedings of the IEEE International Conference on Acoustics, 2023

CheckSORT: Refined Synthetic Data Combination and Optimized SORT for Automatic Retail Checkout.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
ITÔN: End-to-end audio generation with Itô stochastic differential equations.
Digit. Signal Process., 2022

ItôWave: Itô Stochastic Differential Equation is all You Need for Wave Generation.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Multi-modal Affect Analysis using standardized data within subjects in the Wild.
CoRR, 2021

ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation.
CoRR, 2021

2020
Link Prediction Adversarial Attack Via Iterative Gradient Attack.
IEEE Trans. Comput. Soc. Syst., 2020

Pyramidal Temporal Pooling With Discriminative Mapping for Audio Classification.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Learning Temporal Relations from Semantic Neighbors for Acoustic Scene Classification.
IEEE Signal Process. Lett., 2020

LoRRaL: Facial Action Unit Detection Based on Local Region Relation Learning.
CoRR, 2020

Toward the pre-cocktail party problem with TasTas+.
CoRR, 2020

Hodge and Podge: Hybrid Supervised Sound Event Detection with Multi-Hot MixMatch and Composition Consistence Training.
CoRR, 2020

La Furca: Iterative Context-Aware End-to-End Monaural Speech Separation Based on Dual-Path Deep Parallel Inter-Intra Bi-LSTM with Attention.
CoRR, 2020

FurcaNeXt: End-to-End Monaural Speech Separation with Dynamic Gated Dilated Temporal Convolutional Networks.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

ATReSN-Net: Capturing Attentive Temporal Relations in Semantic Neighborhood for Acoustic Scene Classification.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Speech Separation Based on Multi-Stage Elaborated Dual-Path Deep BiLSTM with Auxiliary Identity Loss.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Hodge and Podge: Hybrid Supervised Sound Event Detection with Multi-Hot MixMatch and Composition Consistence Training.
Proceedings of the 28th European Signal Processing Conference, 2020

2019
Learning from Adversarial Features for Few-Shot Classification.
CoRR, 2019

FurcaNeXt: End-to-end monaural speech separation with dynamic gated dilated temporal convolutional networks.
CoRR, 2019

FurcaNet: An end-to-end deep gated convolutional, long short-term memory, deep neural networks for single channel speech separation.
CoRR, 2019

Is CQT more suitable for monaural speech separation than STFT? an empirical study.
CoRR, 2019

Deep Attention Gated Dilated Temporal Convolutional Networks with Intra-Parallel Convolutional Modules for End-to-End Monaural Speech Separation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-End Monaural Speech Separation with Multi-Scale Dynamic Weighted Gated Dilated Convolutional Pyramid Network.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Robustness Evaluation of Deep Learning Models Based on Local Prediction Consistency.
Proceedings of the 18th IEEE International Conference On Machine Learning And Applications, 2019

Furcax: End-to-end Monaural Speech Separation Based on Deep Gated (De)convolutional Neural Networks with Adversarial Example Training.
Proceedings of the IEEE International Conference on Acoustics, 2019

HODGEPODGE: Sound Event Detection Based on Ensemble of Semi-Supervised Learning Methods.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events 2019 (DCASE 2019), 2019

2018
Link Prediction Adversarial Attack.
CoRR, 2018

A Double Joint Bayesian Approach for J-Vector Based Text-dependent Speaker Verification.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Latent Factor Analysis of Deep Bottleneck Features for Speaker Verification with Random Digit Strings.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Joint Learning of J-Vector Extractor and Joint Bayesian Model for Text Dependent Speaker Verification.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Double Joint Bayesian Modeling of DNN Local I-Vector for Text Dependent Speaker Verification with Random Digit Strings.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017
Multi-view Probability Linear Discrimination Analysis for Multi-view Vector Based Text Dependent Speaker Verification.
CoRR, 2017

Better Worst-Case Complexity Analysis of the Block Coordinate Descent Method for Large Scale Machine Learning.
Proceedings of the 16th IEEE International Conference on Machine Learning and Applications, 2017

Multi-view (Joint) probability linear discrimination analysis for J-vector based text dependent speaker verification.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016
Empirical study of PROXTONE and PROXTONE+ for Fast Learning of Large Scale Sparse Models.
CoRR, 2016

2015
Soft Margin Based Low-Rank Audio Signal Classification.
Neural Process. Lett., 2015

Large Scale Optimization with Proximal Stochastic Newton-Type Gradient Descent.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2015

Online and Stochastic Universal Gradient Methods for Minimizing Regularized Hölder Continuous Finite Sums in Machine Learning.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2015

2013
Audio classification with low-rank matrix representation features.
ACM Trans. Intell. Syst. Technol., 2013

Identification of Objectionable Audio Segments Based on Pseudo and Heterogeneous Mixture Models.
IEEE Trans. Speech Audio Process., 2013

Audio Segment Classification Using Online Learning Based Tensor Representation Feature Discrimination.
IEEE Trans. Speech Audio Process., 2013

Online Douglas-Rachford splitting method.
CoRR, 2013

Fudan at MediaEval 2013: Violent Scenes Detection Using Motion Features and Part-Level Attributes.
Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, 2013

Guarantees of Augmented Trace Norm Models in Tensor Recovery.
Proceedings of the IJCAI 2013, 2013

2012
Identifiability of multivariate logistic mixture models.
CoRR, 2012

Guarantees of Augmented Trace Norm Models in Tensor Recovery.
CoRR, 2012

Low-rank Audio Signal Classification Under Soft Margin and Trace Norm Constraints.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011
Online Learning for Classification of Low-rank Representation Features and Its Applications in Audio Segment Classification.
CoRR, 2011

Trace Norm Regularized Tensor Classification and Its Online Learning Approaches.
CoRR, 2011

Heterogeneous mixture models using sparse representation features for applause and laugh detection.
Proceedings of the 2011 IEEE International Workshop on Machine Learning for Signal Processing, 2011

Real-World Speech/Non-Speech Audio Classification Based on Sparse Representation Features and GPCs.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A Novel Framework Based on Trace Norm Minimization for Audio Event Detection.
Proceedings of the Neural Information Processing - 18th International Conference, 2011

2010
Study on the Recognition of Objectionable Audio.
Int. J. Pattern Recognit. Artif. Intell., 2010

