Yapeng Tian

ORCID: 0000-0003-1423-4513

According to our database, Yapeng Tian authored at least 85 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
STDAN: Deformable Attention Network for Space-Time Video Super-Resolution.
IEEE Trans. Neural Networks Learn. Syst., August, 2024

Cross Modality Bias in Visual Question Answering: A Causal View With Possible Worlds VQA.
IEEE Trans. Multim., 2024

STADNet: Spatial-Temporal Attention-Guided Dual-Path Network for cardiac cine MRI super-resolution.
Medical Image Anal., 2024

Language-Guided Joint Audio-Visual Editing via One-Shot Adaptation.
CoRR, 2024

Semantic Grouping Network for Audio Source Separation.
CoRR, 2024

AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation.
CoRR, 2024

Hear Me, See Me, Understand Me: Audio-Visual Autism Behavior Recognition.
CoRR, 2024

Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation.
CoRR, 2024

SignLLM: Sign Languages Production Large Language Models.
CoRR, 2024

Robust Active Speaker Detection in Noisy Environments.
CoRR, 2024

Text-to-Audio Generation Synchronized with Videos.
CoRR, 2024

Efficiently Leveraging Linguistic Priors for Scene Text Spotting.
CoRR, 2024

OSCaR: Object State Captioning and State Change Representation.
CoRR, 2024

LAVSS: Location-Guided Audio-Visual Spatial Audio Separation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

CookAR: Affordance Augmentations in Wearable AR to Support Kitchen Tool Interactions for People with Low Vision.
Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 2024

OSCaR: Object State Captioning and State Change Representation.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

SaSR-Net: Source-Aware Semantic Representation Network for Enhancing Audio-Visual Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Towards Efficient Audio-Visual Learners via Empowering Pre-trained Vision Transformers with Cross-Modal Adaptation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

T-VSL: Text-Guided Visual Sound Source Localization in Mixtures.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

SPICA: Interactive Video Content Exploration through Augmented Audio Descriptions for Blind or Low-Vision Viewers.
Proceedings of the CHI Conference on Human Factors in Computing Systems, 2024

MIMOSA: Human-AI Co-Creation of Computational Spatial Audio Effects on Videos.
Proceedings of the 16th Conference on Creativity & Cognition, 2024

2023
Adaptive channel-modulated personalized federated learning for magnetic resonance image reconstruction.
Comput. Biol. Medicine, October, 2023

Meta-Learning-Based Degradation Representation for Blind Super-Resolution.
IEEE Trans. Image Process., 2023

GDSSR: Toward Real-World Ultra-High-Resolution Image Super-Resolution.
IEEE Signal Process. Lett., 2023

DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation.
CoRR, 2023

Separating Invisible Sounds Toward Universal Audiovisual Scene-Aware Sound Separation.
CoRR, 2023

Neural Acoustic Context Field: Rendering Realistic Room Impulse Response With Neural Fields.
CoRR, 2023

CMRxRecon: An open cardiac MRI dataset for the competition of accelerated image reconstruction.
CoRR, 2023

SignDiff: Learning Diffusion Models for American Sign Language Production.
CoRR, 2023

DiffI2I: Efficient Diffusion Model for Image-to-Image Translation.
CoRR, 2023

DAVIS: High-Quality Audio-Visual Separation with Generative Diffusion Models.
CoRR, 2023

Towards Long Form Audio-visual Video Understanding.
CoRR, 2023

Unveiling Cross Modality Bias in Visual Question Answering: A Causal View with Possible Worlds VQA.
CoRR, 2023

EgoVSR: Towards High-Quality Egocentric Video Super-Resolution.
CoRR, 2023

DiffAVA: Personalized Text-to-Audio Generation with Visual Alignment.
CoRR, 2023

AV-SAM: Segment Anything Model Meets Audio-Visual Localization and Segmentation.
CoRR, 2023

PEANUT: A Human-AI Collaborative Tool for Annotating Audio-Visual Data.
Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 2023

Disentangled Counterfactual Learning for Physical Audiovisual Commonsense Reasoning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Dual Arbitrary Scale Super-Resolution for Multi-contrast MRI.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Knowledge Distillation based Degradation Estimation for Blind Super-Resolution.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Basic Binary Convolution Unit for Binarized Image Restoration Network.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

DiffIR: Efficient Diffusion Model for Image Restoration.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Audio-Visual Class-Incremental Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Class-Incremental Grouping Network for Continual Audio-Visual Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Structured Sparsity Learning for Efficient Video Super-Resolution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Audio-Visual Grouping Network for Sound Localization from Mixtures.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Egocentric Audio-Visual Object Localization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Towards Unified, Explainable, and Robust Multisensory Perception.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Learning in Audio-visual Context: A Review, Analysis, and New Perspective.
CoRR, 2022

Multi-modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

DuDoCAF: Dual-Domain Cross-Attention Fusion with Recurrent Transformer for Fast Multi-contrast MR Imaging.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2022, 2022

Correspondences for image and video reconstruction.
Proceedings of the Imaging and Multimedia Analytics at the Edge 2022, 2022

Learning Spatio-Temporal Downsampling for Effective Video Upscaling.
Proceedings of the Computer Vision - ECCV 2022, 2022

Learning to Answer Questions in Dynamic Audio-Visual Scenarios.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Transformer-empowered Multi-scale Contextual Matching and Aggregation for Multi-contrast MRI Super-resolution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Coarse-to-Fine Embedded PatchMatch and Multi-Scale Dynamic Aggregation for Reference-Based Super-resolution.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Efficient Non-local Contrastive Attention for Image Super-resolution.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Residual Dense Network for Image Restoration.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Zooming SlowMo: An Efficient One-Stage Framework for Space-Time Video Super-Resolution.
CoRR, 2021

Video Matting via Consistency-Regularized Graph Neural Networks.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Can Audio-Visual Integration Strengthen Robustness Under Multimodal Attacks?
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Space-Time Memory Network for Sounding Object Localization in Videos.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020
LCSCNet: Linear Compressing-Based Skip-Connecting Network for Image Super-Resolution.
IEEE Trans. Image Process., 2020

Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing.
Proceedings of the Computer Vision - ECCV 2020, 2020

Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

TDAN: Temporally-Deformable Alignment Network for Video Super-Resolution.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Deep Learning for Single Image Super-Resolution: A Brief Review.
IEEE Trans. Multim., 2019

Deep Audio Prior.
CoRR, 2019

CFSNet: Toward a Controllable Feature Space for Image Restoration.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Audio-Visual Interpretable and Controllable Video Captioning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Audio-Visual Event Localization in the Wild.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019


2018
An Attempt towards Interpretable Audio-Visual Video Captioning.
CoRR, 2018

Deep Learning for Single Image Super-Resolution: A Brief Review.
CoRR, 2018

Audio-Visual Event Localization in Unconstrained Videos.
Proceedings of the Computer Vision - ECCV 2018, 2018

Residual Dense Network for Image Super-Resolution.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Adaptive anchor-point selection for single image super-resolution.
Proceedings of the 2017 IEEE Visual Communications and Image Processing, 2017


2016
Consistent Coding Scheme for Single-Image Super-Resolution Via Independent Dictionaries.
IEEE Trans. Multim., 2016

Anchored neighborhood regression based single image super-resolution from self-examples.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

2015
Single-image super-resolution using clustering-based global regression and propagation filtering.
Proceedings of the 3rd IAPR Asian Conference on Pattern Recognition, 2015

