Zhenheng Yang

Orcid: 0000-0003-0303-5885

According to our database, Zhenheng Yang authored at least 28 papers between 2016 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths.

Weijia Mao

Zhenheng Yang

Mike Zheng Shou

CoRR, February, 2025

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution.

CoRR, January, 2025

2024

Parallelized Autoregressive Visual Generation.

CoRR, 2024

InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption.

CoRR, 2024

InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning.

CoRR, 2024

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation.

CoRR, 2024

OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation.

CoRR, 2024

2021

Weakly Supervised Instance Segmentation for Videos With Temporal Mask Consistency.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding.

IEEE Trans. Pattern Anal. Mach. Intell., 2020

Enhancing Model Parallelism in Neural Architecture Search for Multidevice System.

IEEE Micro, 2020

SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization.

CoRR, 2020

SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization.

Proceedings of the Computer Vision - ECCV 2020, 2020

2019

Activity Driven Weakly Supervised Object Detection.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Joint Unsupervised Learning of Optical Flow and Depth by Watching Stereo Videos.

CoRR, 2018

Face and Body Association for Video-Based Face Recognition.

Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding.

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

LEGO: Learning Edge With Geometry All at Once by Watching Videos.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Occlusion Aware Unsupervised Learning of Optical Flow.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Unsupervised Learning of Geometry From Videos With Edge-Aware Depth-Normal Consistency.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Occlusion Aware Unsupervised Learning of Optical Flow.

CoRR, 2017

Unsupervised Learning of Geometry with Edge-aware Depth-Normal Consistency.

CoRR, 2017

TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals.

Proceedings of the IEEE International Conference on Computer Vision, 2017

TALL: Temporal Activity Localization via Language Query.

Proceedings of the IEEE International Conference on Computer Vision, 2017

Spatio-Temporal Action Detection with Cascade Proposal and Location Anticipation.

Zhenheng Yang

Jiyang Gao

Ram Nevatia

Proceedings of the British Machine Vision Conference 2017, 2017

RED: Reinforced Encoder-Decoder Networks for Action Anticipation.

Jiyang Gao

Zhenheng Yang

Ram Nevatia

Proceedings of the British Machine Vision Conference 2017, 2017

Cascaded Boundary Regression for Temporal Action Detection.

Jiyang Gao

Zhenheng Yang

Ram Nevatia

Proceedings of the British Machine Vision Conference 2017, 2017

2016

A multi-scale cascade fully convolutional network face detector.

Zhenheng Yang

Ramakant Nevatia

Proceedings of the 23rd International Conference on Pattern Recognition, 2016
