Bin Xiao

Orcid: 0000-0001-6477-5911

Affiliations:

Microsoft Cloud+AI, Microsoft Research Asia, China
South China University of Technology, School of Electronic and Information Engineering, China

According to our database, Bin Xiao authored at least 35 papers between 2014 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.


Bibliography

2024

Dynamic Ensemble Reasoning for LLM Experts.

CoRR, 2024

Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion.

CoRR, 2024

Real-Time Image Segmentation via Hybrid Convolutional-Transformer Architecture Search.

CoRR, 2024

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Efficient Modulation for Vision Networks.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance.

CoRR, 2023

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data.

CoRR, 2023

TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

i-Code: An Integrative and Composable Multimodal Learning Framework.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks.

CoRR, 2022

CLIP-TD: CLIP Targeted Distillation for Vision-Language Tasks.

CoRR, 2022

Efficient Self-supervised Vision Transformers for Representation Learning.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training.

Proceedings of the Computer Vision - ECCV 2022, 2022

TinyViT: Fast Pretraining Distillation for Small Vision Transformers.

Proceedings of the Computer Vision - ECCV 2022, 2022

DaViT: Dual Attention Vision Transformers.

Proceedings of the Computer Vision - ECCV 2022, 2022

MiniViT: Compressing Vision Transformers with Weight Multiplexing.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Unified Contrastive Learning in Image-Text-Label Space.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Deep High-Resolution Representation Learning for Visual Recognition.

IEEE Trans. Pattern Anal. Mach. Intell., 2021

Florence: A New Foundation Model for Computer Vision.

CoRR, 2021

Focal Self-attention for Local-Global Interactions in Vision Transformers.

CoRR, 2021

Focal Attention for Long-Range Interactions in Vision Transformers.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

CvT: Introducing Convolutions to Vision Transformers.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Lite-HRNet: A Lightweight High-Resolution Network.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Bottom-Up Human Pose Estimation via Disentangled Keypoint Regression.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Dynamic Head: Unifying Object Detection Heads With Attentions.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive Keypoint Estimates.

CoRR, 2020

HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

3D Human Pose Estimation via Explicit Compositional Depth Maps.

Haiping Wu

Bin Xiao

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Bottom-up Higher-Resolution Networks for Multi-Person Pose Estimation.

CoRR, 2019

High-Resolution Representations for Labeling Pixels and Regions.

CoRR, 2019

Deep High-Resolution Representation Learning for Human Pose Estimation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Simple Baselines for Human Pose Estimation and Tracking.

Bin Xiao

Haiping Wu

Yichen Wei

Proceedings of the Computer Vision - ECCV 2018, 2018

2014

Mariana: Tencent Deep Learning Platform and its Applications.

Proc. VLDB Endow., 2014
