Shiliang Zhang

Xuehui Ma

CoRR, 2022

MFCCA: Multi-Frame Cross-Channel Attention for Multi-Speaker ASR in Multi-Party Meeting Scenario.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Asymmetric Label Propagation for Video Object Segmentation.

Zhen Chen

Ming Yang

Proceedings of the 4th ACM International Conference on Multimedia in Asia, 2022

Separate-to-Recognize: Joint Multi-target Speech Separation and Speech Recognition for Speaker-attributed ASR.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Towards Language-universal Mandarin-English Speech Recognition with Unsupervised Label Synchronous Adaptation.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

SpikingSIM: A Bio-Inspired Spiking Simulator.

Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Transformer-Based Domain Adaptation for Event Data Classification.

Junwei Zhao

Tiejun Huang

Proceedings of the IEEE International Conference on Acoustics, 2022

Modeling The Detection Capability Of High-Speed Spiking Cameras.

Proceedings of the IEEE International Conference on Acoustics, 2022

Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge.

Proceedings of the IEEE International Conference on Acoustics, 2022

M2Met: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge.

Proceedings of the IEEE International Conference on Acoustics, 2022

ProsoSpeech: Enhancing Prosody with Quantized Vector Pre-Training in Text-To-Speech.

Proceedings of the IEEE International Conference on Acoustics, 2022

Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Contextual Instance Decoupling for Robust Multi-Person Pose Estimation.

Dongkai Wang

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Domain Generalization Capability Enhancement for Binary Neural Networks.

Jianming Ye

Shunan Mao

Proceedings of the 33rd British Machine Vision Conference 2022, 2022

MDERank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021

Progressive Feature Enhancement for Person Re-Identification.

Yingji Zhong

Yaowei Wang

IEEE Trans. Image Process., 2021

Multi-View Spatial Attention Embedding for Vehicle Re-Identification.

IEEE Trans. Circuits Syst. Video Technol., 2021

Diverse part attentive network for video-based person re-identification.

Pattern Recognit. Lett., 2021

Viewpoint and Scale Consistency Reinforcement for UAV Vehicle Re-Identification.

Int. J. Comput. Vis., 2021

Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information.

CoRR, 2021

BeamTransformer: Microphone Array-based Overlapping Speech Detection.

CoRR, 2021

Enhancing Social Relation Inference with Concise Interaction Graph and Discriminative Scene Representation.

CoRR, 2021

Large-Scale Spatio-Temporal Person Re-identification: Algorithm and Benchmark.

CoRR, 2021

AAformer: Auto-Aligned Transformer for Person Re-Identification.

CoRR, 2021

Simplified Self-Attention for Transformer-Based end-to-end Speech Recognition.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Robust Pose Estimation in Crowded Scenes with Direct Pose-Level Inference.

Dongkai Wang

Gang Hua

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Hybrid Network Compression via Meta-Learning.

Jianming Ye

Jingdong Wang

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

An Energy Consumption Model for Electrical Vehicle Networks via Extended Federated-learning.

Proceedings of the IEEE Intelligent Vehicles Symposium, 2021

Investigation of Spatial-Acoustic Features for Overlapping Speech Detection in Multiparty Meetings.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Extremely Low Footprint End-to-End ASR System for Smart Device.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Graph Consistency Based Mean-Teaching for Unsupervised Domain Adaptive Person Re-Identification.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Intra-Inter Camera Similarity for Unsupervised Person Re-Identification.

Shiyu Xuan

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Group-Group Loss-Based Global-Regional Feature Learning for Vehicle Re-Identification.

IEEE Trans. Image Process., 2020

Multi-Scale Temporal Cues Learning for Video Person Re-Identification.

Tiejun Huang

IEEE Trans. Image Process., 2020

CDbin: Compact Discriminative Binary Descriptor Learned With Efficient Neural Network.

IEEE Trans. Circuits Syst. Video Technol., 2020

E2BoWs: An end-to-end Bag-of-Words model via deep convolutional neural network for image retrieval.

Neurocomputing, 2020

Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model.

CoRR, 2020

Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-Identification.

CoRR, 2020

Domain Adaptive Person Re-Identification via Coupling Optimization.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

SAN-M: Memory Equipped Self-Attention for End-to-End Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Neural Zero-Inflated Quality Estimation Model for Automatic Speech Recognition System.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Self-Supervised Adversarial Multi-Task Learning for Vocoder-Based Monaural Speech Enhancement.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

PAN: Phoneme-Aware Network for Monaural Speech Enhancement.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-identification.

Proceedings of the Computer Vision - ECCV 2020, 2020

Robust Partial Matching for Person Search in the Wild.

Yingji Zhong

Xiaoyu Wang

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

GLAD: Global-Local-Alignment Descriptor for Scalable Person Re-Identification.

IEEE Trans. Multim., 2019

Deep Representation Learning With Part Loss for Person Re-Identification.

IEEE Trans. Image Process., 2019

DR2-Net: Deep Residual Reconstruction Network for image compressive sensing.

Neurocomputing, 2019

An outlier detection scheme for dynamical sequential datasets.

Commun. Stat. Simul. Comput., 2019

Automatic Spelling Correction with Transformer for CTC-based End-to-End Speech Recognition.

Zhijie Yan

CoRR, 2019

EAGER: Edge-Aided imaGe undERstanding System.

Jianzhong He

Proceedings of the 2019 on International Conference on Multimedia Retrieval, 2019

Self-Guided Hash Coding for Large-Scale Person Re-identification.

Ming Yang

Proceedings of the 2nd IEEE Conference on Multimedia Information Processing and Retrieval, 2019

Investigation of Transformer Based Spelling Correction Model for CTC-Based End-to-End Mandarin Speech Recognition.

Zhijie Yan

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Towards Language-Universal Mandarin-English Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Audio Tagging with Compact Feedforward Sequential Memory Network and Audio-to-Audio Ratio Based Data Augmentation.

Zhiying Huang

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Resolution-invariant Person Re-Identification.

Shunan Mao

Ming Yang

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Global-Local Temporal Representations for Video Person Re-Identification.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Robust Audio-visual Speech Recognition Using Bimodal DFSMN with Multi-condition Training and Dropout Regularization.

Proceedings of the IEEE International Conference on Acoustics, 2019

Investigation of Modeling Units for Mandarin Speech Recognition Using DFSMN-CTC-sMBR.

Proceedings of the IEEE International Conference on Acoustics, 2019

Bi-Directional Cascade Network for Perceptual Edge Detection.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Multi-Scale 3D Convolution Network for Video Based Person Re-Identification.

Tiejun Huang

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Sequential Outlier Criterion for Sparsification of Online Adaptive Filtering.

IEEE Trans. Neural Networks Learn. Syst., 2018

Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching.

IEEE Trans. Multim., 2018

AutoBD: Automated Bi-Level Description for Scalable Fine-Grained Visual Categorization.

IEEE Trans. Image Process., 2018

Interacting Tracklets for Multi-Object Tracking.

IEEE Trans. Image Process., 2018

Learning Affective Features With a Hybrid Deep Model for Audio-Visual Emotion Recognition.

IEEE Trans. Circuits Syst. Video Technol., 2018

Multi-type attributes driven multi-camera person re-identification.

Pattern Recognit., 2018

Multi-Task Learning with Low Rank Attribute Embedding for Multi-Camera Person Re-Identification.

IEEE Trans. Pattern Anal. Mach. Intell., 2018

The Scale Effect on Spatial Interaction Patterns: An Empirical Study Using Taxi O-D data of Beijing and Shanghai.

IEEE Access, 2018

SCAN: Spatial and Channel Attention Network for Vehicle Re-Identification.

Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

VP-ReID: Vehicle and Person Re-Identification System.

Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

Acoustic Modeling with DFSMN-CTC and Joint CTC-CE Learning.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Compact Feedforward Sequential Memory Networks for Small-footprint Keyword Spotting.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

RAM: A Region-Aware Deep Model for Vehicle Re-Identification.

Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Deep-FSMN for Large Vocabulary Continuous Speech Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Deep Feed-Forward Sequential Memory Networks for Speech Synthesis.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Person Transfer GAN to Bridge Domain Gap for Person Re-Identification.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Nonrecurrent Neural Structure for Long-Term Dependence.

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Watch, attend and parse: An end-to-end neural network based approach to handwritten mathematical expression recognition.

Pattern Recognit., 2017

Attributes driven tracklet-to-tracklet person re-identification using latent prototypes space mapping.

Pattern Recognit., 2017

DSP: Discriminative Spatial Part modeling for Fine-Grained Visual Categorization.

Image Vis. Comput., 2017

E2BoWs: An End-to-End Bag-of-Words Model via Deep Convolutional Neural Network.

CoRR, 2017

Deep Representation Learning with Part Loss for Person Re-Identification.

CoRR, 2017

DR2-Net: Deep Residual Reconstruction Network for Image Compressive Sensing.

CoRR, 2017

One-Shot Fine-Grained Instance Retrieval.

Proceedings of the 2017 ACM on Multimedia Conference, 2017

GLAD: Global-Local-Alignment Descriptor for Pedestrian Retrieval.

Proceedings of the 2017 ACM on Multimedia Conference, 2017

Gaussian Prediction Based Attention for Online End-to-End Speech Recognition.

Junfeng Hou

Li-Rong Dai

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Large-scale person re-identification as retrieval.

Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Pose-Driven Deep Convolutional Model for Person Re-identification.

Proceedings of the IEEE International Conference on Computer Vision, 2017

Feedforward sequential memory networks based encoder-decoder model for machine translation.

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Learning the number of nodes in DNNs with activation mask.

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016

Coarse-to-Fine Description for Fine-Grained Visual Categorization.

IEEE Trans. Image Process., 2016

Hybrid Orthogonal Projection and Estimation (HOPE): A New Framework to Learn Neural Networks.

Hui Jiang

Li-Rong Dai

J. Mach. Learn. Res., 2016

Neural Networks Models for Entity Discovery and Linking.

CoRR, 2016

Note on the perfect EIC-graphs.

Jun Yue

Xia Zhang

Appl. Math. Comput., 2016

The USTC NELSLIP Systems for Trilingual Entity Detection and Linking Tasks at TAC KBP 2016.

Proceedings of the 2016 Text Analysis Conference, 2016

USTC at NTCIR-12 STC Task.

Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016

Multimodal Deep Convolutional Neural Network for Audio-Visual Emotion Recognition.

Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

Learning FOFE based FNN-LMs with noise contrastive estimation and part-of-speech features.

Junfeng Hou

Li-Rong Dai

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Compact Feedforward Sequential Memory Networks for Large Vocabulary Continuous Speech Recognition.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Future Context Attention for Unidirectional LSTM Based Acoustic Model.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Deep Attributes Driven Multi-camera Person Re-identification.

Proceedings of the Computer Vision - ECCV 2016, 2016

2015

Cross Indexing With Grouplets.

IEEE Trans. Multim., 2015

An Attribute-Assisted Reranking Model for Web Image Search.

IEEE Trans. Image Process., 2015

Semantic-Aware Co-Indexing for Image Retrieval.

IEEE Trans. Pattern Anal. Mach. Intell., 2015

Multi-order visual phrase for scalable partial-duplicate visual search.

Multim. Syst., 2015

Feedforward Sequential Memory Networks: A New Structure to Learn Long-term Dependency.

CoRR, 2015

A Fixed-Size Encoding Method for Variable-Length Sequences with its Application to Neural Network Language Models.

CoRR, 2015

Feedforward Sequential Memory Neural Networks without Recurrent Feedback.

CoRR, 2015

Hybrid Orthogonal Projection and Estimation (HOPE): A New Framework to Probe and Learn Neural Networks.

Hui Jiang

CoRR, 2015

Orientational Spatial Part Modeling for Fine-Grained Visual Categorization.

Proceedings of the 2015 IEEE International Conference on Mobile Services, MS 2015, New York City, NY, USA, June 27, 2015

Augmented Feature Fusion for Image Retrieval System.

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Rectified linear neural networks with tied-scalar regularization for LVCSR.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Efficient indexing for large-scale image search.
