Shiliang Zhang
Orcid: 0000-0002-9524-1602
According to our database, Shiliang Zhang authored at least 239 papers between 2008 and 2024.
Bibliography
2024
Adaptive Robust Tracking Control With Active Learning for Linear Systems With Ellipsoidal Bounded Uncertainties.
IEEE Trans. Autom. Control., November, 2024
Efficient Video Transformers via Spatial-temporal Token Merging for Action Recognition.
ACM Trans. Multim. Comput. Commun. Appl., April, 2024
IEEE Trans. Pattern Anal. Mach. Intell., March, 2024
Switched Surplus-Based Distributed Security Dispatch for Smart Grid With Persistent Packet Loss.
IEEE Internet Things J., February, 2024
IEEE Trans. Image Process., 2024
IEEE Trans. Image Process., 2024
Neural Networks, 2024
CoRR, 2024
Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition.
CoRR, 2024
CoRR, 2024
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens.
CoRR, 2024
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs.
CoRR, 2024
CoRR, 2024
A Bionic Data-driven Approach for Long-distance Underwater Navigation with Anomaly Resistance.
CoRR, 2024
CoRR, 2024
CoTuning: A Large-Small Model Collaborating Distillation Framework for Better Model Generalization.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Hourglass-AVSR: Down-Up Sampling-Based Computational Efficiency Model for Audio-Visual Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
FunCodec: A Fundamental, Reproducible and Integrable Open-Source Toolkit for Neural Speech Codec.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the European Control Conference, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Pattern Recognit., November, 2023
IEEE Trans. Circuits Syst. Video Technol., August, 2023
IEEE Trans. Pattern Anal. Mach. Intell., August, 2023
IEEE Trans. Multim., 2023
IEEE Trans. Image Process., 2023
IEEE Signal Process. Lett., 2023
CoRR, 2023
Privacy-preserving transactive energy systems: Key topics and open research challenges.
CoRR, 2023
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models.
CoRR, 2023
Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation.
CoRR, 2023
Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer.
CoRR, 2023
Adaptive robust tracking control with active learning for linear systems with ellipsoidal bounded uncertainties.
CoRR, 2023
SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability.
CoRR, 2023
Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model.
CoRR, 2023
Unleashing the Full Potential of Product Quantization for Large-Scale Image Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for speech recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Personality-aware Training based Speaker Adaptation for End-to-end Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Speech and Noise Dual-Stream Spectrogram Refine Network With Speech Distortion Loss For Robust Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
2022
Introduction to the Special Issue on Fine-Grained Visual Recognition and Re-Identification.
ACM Trans. Multim. Comput. Commun. Appl., 2022
IEEE Trans. Neural Networks Learn. Syst., 2022
Bidirectional Posture-Appearance Interaction Network for Driver Behavior Recognition.
IEEE Trans. Intell. Transp. Syst., 2022
IEEE Trans. Circuits Syst. Video Technol., 2022
Pattern Recognit., 2022
IEEE Trans. Pattern Anal. Mach. Intell., 2022
IEEE Trans. Pattern Anal. Mach. Intell., 2022
Int. J. Comput. Vis., 2022
MFCCA: Multi-Frame Cross-Channel attention for multi-speaker ASR in Multi-party meeting scenario.
CoRR, 2022
Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios.
CoRR, 2022
Extended vehicle energy dataset (eVED): an enhanced large-scale dataset for deep learning on vehicle trip energy consumption.
CoRR, 2022
CoRR, 2022
Contextualize differential privacy in image database: a lightweight image differential privacy approach based on principle component analysis inverse.
CoRR, 2022
MFCCA: Multi-Frame Cross-Channel Attention for Multi-Speaker ASR in Multi-Party Meeting Scenario.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Proceedings of the 4th ACM International Conference on Multimedia in Asia, 2022
Separate-to-Recognize: Joint Multi-target Speech Separation and Speech Recognition for Speaker-attributed ASR.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Towards Language-universal Mandarin-English Speech Recognition with Unsupervised Label Synchronous Adaptation.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022
A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the 33rd British Machine Vision Conference 2022, 2022
MDERank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022
2021
IEEE Trans. Image Process., 2021
IEEE Trans. Circuits Syst. Video Technol., 2021
Pattern Recognit. Lett., 2021
Int. J. Comput. Vis., 2021
Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information.
CoRR, 2021
Enhancing Social Relation Inference with Concise Interaction Graph and Discriminative Scene Representation.
CoRR, 2021
CoRR, 2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
An Energy Consumption Model for Electrical Vehicle Networks via Extended Federated-learning.
Proceedings of the IEEE Intelligent Vehicles Symposium, 2021
Investigation of Spatial-Acoustic Features for Overlapping Speech Detection in Multiparty Meetings.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Graph Consistency Based Mean-Teaching for Unsupervised Domain Adaptive Person Re-Identification.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
2020
Group-Group Loss-Based Global-Regional Feature Learning for Vehicle Re-Identification.
IEEE Trans. Image Process., 2020
IEEE Trans. Image Process., 2020
CDbin: Compact Discriminative Binary Descriptor Learned With Efficient Neural Network.
IEEE Trans. Circuits Syst. Video Technol., 2020
E²BoWs: An end-to-end Bag-of-Words model via deep convolutional neural network for image retrieval.
Neurocomputing, 2020
Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model.
CoRR, 2020
Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-Identification.
CoRR, 2020
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Neural Zero-Inflated Quality Estimation Model for Automatic Speech Recognition System.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Self-Supervised Adversarial Multi-Task Learning for Vocoder-Based Monaural Speech Enhancement.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-identification.
Proceedings of the Computer Vision - ECCV 2020, 2020
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
IEEE Trans. Multim., 2019
IEEE Trans. Image Process., 2019
DR²-Net: Deep Residual Reconstruction Network for image compressive sensing.
Neurocomputing, 2019
Commun. Stat. Simul. Comput., 2019
Automatic Spelling Correction with Transformer for CTC-based End-to-End Speech Recognition.
CoRR, 2019
Proceedings of the 2019 on International Conference on Multimedia Retrieval, 2019
Proceedings of the 2nd IEEE Conference on Multimedia Information Processing and Retrieval, 2019
Investigation of Transformer Based Spelling Correction Model for CTC-Based End-to-End Mandarin Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Audio Tagging with Compact Feedforward Sequential Memory Network and Audio-to-Audio Ratio Based Data Augmentation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Robust Audio-visual Speech Recognition Using Bimodal Dfsmn with Multi-condition Training and Dropout Regularization.
Proceedings of the IEEE International Conference on Acoustics, 2019
Investigation of Modeling Units for Mandarin Speech Recognition Using Dfsmn-ctc-smbr.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
2018
IEEE Trans. Neural Networks Learn. Syst., 2018
Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching.
IEEE Trans. Multim., 2018
AutoBD: Automated Bi-Level Description for Scalable Fine-Grained Visual Categorization.
IEEE Trans. Image Process., 2018
Learning Affective Features With a Hybrid Deep Model for Audio-Visual Emotion Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2018
Pattern Recognit., 2018
Multi-Task Learning with Low Rank Attribute Embedding for Multi-Camera Person Re-Identification.
IEEE Trans. Pattern Anal. Mach. Intell., 2018
The Scale Effect on Spatial Interaction Patterns: An Empirical Study Using Taxi O-D data of Beijing and Shanghai.
IEEE Access, 2018
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
2017
IEEE ACM Trans. Audio Speech Lang. Process., 2017
Watch, attend and parse: An end-to-end neural network based approach to handwritten mathematical expression recognition.
Pattern Recognit., 2017
Attributes driven tracklet-to-tracklet person re-identification using latent prototypes space mapping.
Pattern Recognit., 2017
Image Vis. Comput., 2017
CoRR, 2017
DR²-Net: Deep Residual Reconstruction Network for Image Compressive Sensing.
CoRR, 2017
Proceedings of the 2017 ACM on Multimedia Conference, 2017
Proceedings of the 2017 ACM on Multimedia Conference, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017
Proceedings of the IEEE International Conference on Computer Vision, 2017
Feedforward sequential memory networks based encoder-decoder model for machine translation.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
2016
IEEE Trans. Image Process., 2016
Hybrid Orthogonal Projection and Estimation (HOPE): A New Framework to Learn Neural Networks.
J. Mach. Learn. Res., 2016
The USTC NELSLIP Systems for Trilingual Entity Detection and Linking Tasks at TAC KBP 2016.
Proceedings of the 2016 Text Analysis Conference, 2016
Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016
Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016
Learning FOFE based FNN-LMs with noise contrastive estimation and part-of-speech features.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Compact Feedforward Sequential Memory Networks for Large Vocabulary Continuous Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the Computer Vision - ECCV 2016, 2016
2015
IEEE Trans. Image Process., 2015
IEEE Trans. Pattern Anal. Mach. Intell., 2015
Multim. Syst., 2015
Feedforward Sequential Memory Networks: A New Structure to Learn Long-term Dependency.
CoRR, 2015
A Fixed-Size Encoding Method for Variable-Length Sequences with its Application to Neural Network Language Models.
CoRR, 2015
Hybrid Orthogonal Projection and Estimation (HOPE): A New Framework to Probe and Learn Neural Networks.
CoRR, 2015
Proceedings of the 2015 IEEE International Conference on Mobile Services, MS 2015, New York City, NY, USA, June 27, 2015
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 7th International Conference on Internet Multimedia Computing and Service, 2015
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015
The Fixed-Size Ordinally-Forgetting Encoding Method for Neural Network Language Models.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015
2014
IEEE Trans. Image Process., 2014
IEEE J. Emerg. Sel. Topics Circuits Syst., 2014
Comput. Vis. Image Underst., 2014
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014
Proceedings of the International Conference on Multimedia Retrieval, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the Computer Vision - ACCV 2014, 2014
2013
Edge-SIFT: Discriminative Binary Descriptor for Scalable Partial-Duplicate Mobile Search.
IEEE Trans. Image Process., 2013
Proceedings of the International Conference on Multimedia Retrieval, 2013
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013
2011
Generating Descriptive Visual Words and Visual Phrases for Large-Scale Image Applications.
IEEE Trans. Image Process., 2011
Building descriptive and discriminative visual codebook for large-scale image applications.
Multim. Tools Appl., 2011
Comput. Vis. Image Underst., 2011
Proceedings of the IEEE 13th International Workshop on Multimedia Signal Processing (MMSP 2011), 2011
2010
Proceedings of the Advances in Multimedia Information Processing - PCM 2010, 2010
Proceedings of the 18th International Conference on Multimedia 2010, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the 9th ACM International Conference on Image and Video Retrieval, 2010
2009
Proceedings of the 17th International Conference on Multimedia 2009, 2009
Proceedings of the International Conference on Image Processing, 2009
2008
Proceedings of the Advances in Multimedia Information Processing, 2008
Proceedings of the 16th International Conference on Multimedia 2008, 2008
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008