Zhenye Gan

Orcid: 0000-0002-2431-1159

According to our database, Zhenye Gan authored at least 35 papers between 2013 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
SoftPatch+: Fully unsupervised anomaly classification and segmentation.
Pattern Recognit., 2025

2024
MobileMamba: Lightweight Multi-Receptive Visual Mamba Network.
CoRR, 2024

LLaVA-KD: A Framework of Distilling Multimodal Large Language Models.
CoRR, 2024

PSPU: Enhanced Positive and Unlabeled Learning by Leveraging Pseudo Supervision.
CoRR, 2024

ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection.
CoRR, 2024

Efficient Multimodal Large Language Models: A Survey.
CoRR, 2024

MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection.
CoRR, 2024

Real-IAD: A Real-World Multi-View Dataset for Benchmarking Versatile Industrial Anomaly Detection.
CoRR, 2024

DMAD: Dual Memory Bank for Real-World Anomaly Detection.
CoRR, 2024

LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

PSPU: Enhanced Positive and Unlabeled Learning by Leveraging Pseudo Supervision.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Learning Hybrid Negative Probability Model for Weakly-Supervised Whole Slide Image Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024

TransAVS: End-to-End Audio-Visual Segmentation with Transformer.
Proceedings of the IEEE International Conference on Acoustics, 2024

Real-IAD: A Real-World Multi-View Dataset for Benchmarking Versatile Industrial Anomaly Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Speech Recognition of Noisy Tibetan Based on Parallel Branch Structure.
Proceedings of the 2024 4th International Conference on Artificial Intelligence, 2024

Rethinking Reverse Distillation for Multi-Modal Anomaly Detection.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Hear to Segment: Unmixing the Audio to Guide the Semantic Segmentation.
CoRR, 2023

MixTeacher: Mining Promising Labels with Mixed Scale Teacher for Semi-Supervised Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Calibrated Teacher for Sparsely Annotated Object Detection.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
A Tibetan-dependent speaker recognition method based on deep learning.
Multim. Tools Appl., 2022

CFNet: Learning Correlation Functions for One-Stage Panoptic Segmentation.
CoRR, 2022

Iterative Few-shot Semantic Segmentation from Image Label Text.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Learning Distinctive Margin toward Active Domain Adaptation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-high Resolution Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2019
Study on the Tones Biases of Mandarin Speaker in Amdo Tibetan Areas Based on Statistics.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Perception and Production of Mandarin Monosyllabic Tones by Amdo Tibetan College Students.
Proceedings of the Natural Language Processing and Chinese Computing, 2018

Mandarin-Tibetan Cross-Lingual Voice Conversion System Based on Deep Neural Network.
Proceedings of the 2018 2nd International Conference on Computer Science and Artificial Intelligence, 2018

A DNN-based Mandarin-Tibetan cross-lingual speech synthesis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Towards Realizing Mandarin-Tibetan Bi-lingual Emotional Speech Synthesis with Mandarin Emotional Training Corpus.
Proceedings of the Data Science, 2017

Improved CNN-based facial landmarks tracking via ridge regression at 150 Fps on mobile devices.
Proceedings of the 10th International Congress on Image and Signal Processing, 2017

2016
Towards Realizing Sign Language-to-Speech Conversion by Combining Deep Learning and Statistical Parametric Speech Synthesis.
Proceedings of the Social Computing, 2016

Research on text analysis for Tibetan statistical parametric speech synthesis.
Proceedings of the 9th International Congress on Image and Signal Processing, 2016

2015
Using speaker adaptive training to realize Mandarin-Tibetan cross-lingual speech synthesis.
Multim. Tools Appl., 2015

2014
Realizing speech enhancement by combining EEMD and K-SVD dictionary training algorithm.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

2013
Realizing Tibetan speech synthesis by speaker adaptive training.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

