2024
FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated Learning.
Trans. Mach. Learn. Res., 2024
TransUNet: Rethinking the U-Net architecture design for medical image segmentation through the lens of transformers.
Medical Image Anal., 2024
AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation.
CoRR, 2024
What If We Recaption Billions of Web Images with LLaMA-3?
CoRR, 2024
Autoregressive Pretraining with Mamba in Vision.
CoRR, 2024
Medical Vision Generalist: Unifying Medical Imaging Tasks in Context.
CoRR, 2024
Mamba-R: Vision Mamba ALSO Needs Registers.
CoRR, 2024
SPFormer: Enhancing Vision Transformer with Superpixel Representation.
CoRR, 2024
Brain Tumor Segmentation Through Supervoxel Transformer.
Proceedings of the IEEE International Symposium on Biomedical Imaging, 2024
Tuning LayerNorm in Attention: Towards Efficient Multi-Modal LLM Finetuning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
From Pixels to Objects: A Hierarchical Approach for Part and Object Segmentation Using Local and Global Aggregation.
Proceedings of the Computer Vision - ECCV 2024, 2024
A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties.
Proceedings of the Computer Vision - ECCV 2024, 2024
SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference.
Proceedings of the Computer Vision - ECCV 2024, 2024
Masked Autoencoders are Secretly Efficient Learners.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
BNET: Batch Normalization With Enhanced Linear Transformation.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023
A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties.
CoRR, 2023
SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference.
CoRR, 2023
3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers.
CoRR, 2023
3D-TransUNet for Brain Metastases Segmentation in the BraTS2023 Challenge.
Proceedings of the Brain Tumor Segmentation, and Cross-Modality Domain Adaptation for Medical Image Segmentation, 2023
SwinMM: Masked Multi-view with Swin Transformers for 3D Medical Image Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023
Superpixel Transformers for Efficient Semantic Segmentation.
IROS, 2023
3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
2022
Waymo Open Dataset: Panoramic Video Panoptic Segmentation.
CoRR, 2022
Proceedings of the Tenth International Conference on Learning Representations, 2022
Waymo Open Dataset: Panoramic Video Panoptic Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022
In Defense of Image Pre-Training for Spatiotemporal Recognition.
Proceedings of the Computer Vision - ECCV 2022, 2022
2021
Are Transformers more robust than CNNs?
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Shape-Texture Debiased Neural Network Training.
Proceedings of the 9th International Conference on Learning Representations, 2021
CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Batch Normalization with Enhanced Linear Transformation.
CoRR, 2020
CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Network.
CoRR, 2020
AtomNAS: Fine-Grained End-to-End Neural Architecture Search.
Proceedings of the 8th International Conference on Learning Representations, 2020
Neural Architecture Search for Lightweight Non-Local Networks.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
Learning to Refine 3D Human Pose Sequences.
Proceedings of the 2019 International Conference on 3D Vision, 2019
2018
Online Dictionary Learning for Approximate Archetypal Analysis.
Proceedings of the Computer Vision - ECCV 2018, 2018
2016
Scene text script identification with Convolutional Recurrent Neural Networks.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016