Boqing Gong

ORCID: 0000-0003-3915-5977

According to our database, Boqing Gong authored at least 124 papers between 2009 and 2025.

Bibliography

2025
Large-scale multi-center CT and MRI segmentation of pancreas with deep learning.
Medical Image Anal., 2025

2024
Open Long-Tailed Recognition in a Dynamic World.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2024

OmnixR: Evaluating Omni-modality Language Models on Reasoning across Modalities.
CoRR, 2024

ε-VAE: Denoising as Visual Decoding.
CoRR, 2024

SOAR: Self-supervision Optimized UAV Action Recognition with Efficient Object-Aware Pretraining.
CoRR, 2024

The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise.
CoRR, 2024

Automatic Jailbreaking of the Text-to-Image Generative AI Systems.
CoRR, 2024

On Discrete Prompt Optimization for Diffusion Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

VideoPrism: A Foundational Visual Encoder for Video Understanding.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Language Model Beats Diffusion - Tokenizer is key to visual generation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Structured Video-Language Modeling with Temporal Grouping and Spatial Grounding.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Understanding the Impact of Negative Prompts: When and How Do They Take Effect?
Proceedings of the Computer Vision - ECCV 2024, 2024

Instruct-Imagen: Image Generation with Multi-modal Instruction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Distilling Vision-Language Models on Millions of Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Towards A Unified Neural Architecture for Visual Recognition and Reasoning.
CoRR, 2023

Multi-modal Domain Adaptation for REG via Relation Transfer.
CoRR, 2023

VideoGLUE: Video General Understanding Evaluation of Foundation Models.
CoRR, 2023

Federated Learning of Shareable Bases for Personalization-Friendly Image Classification.
CoRR, 2023

Identity Encoder for Personalized Diffusion.
CoRR, 2023

Domain Generalization with Adversarial Intensity Attack for Medical Image Segmentation.
CoRR, 2023

Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models.
CoRR, 2023

Spatiotemporally Discriminative Video-Language Pre-Training with Text Grounding.
CoRR, 2023

Video Timeline Modeling For News Story Understanding.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Module-wise Adaptive Distillation for Multimodality Foundation Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Unified Visual Relationship Detection with Vision and Language Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

On Calibrating Semantic Segmentation Models: Analyses and An Algorithm.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
2.5D visual relationship detection.
Comput. Vis. Image Underst., 2022

On Calibrating Semantic Segmentation Models: Analysis and An Algorithm.
CoRR, 2022

Federated Multi-Target Domain Adaptation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

Surrogate Gap Minimization Improves Sharpness-Aware Training.
Proceedings of the Tenth International Conference on Learning Representations, 2022

When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Anti-Neuron Watermarking: Protecting Personal Data Against Unauthorized Neural Networks.
Proceedings of the Computer Vision - ECCV 2022, 2022

LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds.
Proceedings of the Computer Vision - ECCV 2022, 2022

Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

medXGAN: Visual Explanations for Medical Classifiers through a Generative Latent Space.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

On Temporal Granularity in Self-Supervised Video Representation Learning.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021
Towards a Unified Foundation Model: Jointly Pre-Training Transformers on Unpaired Images and Text.
CoRR, 2021

Exploring Temporal Granularity in Self-Supervised Video Representation Learning.
CoRR, 2021

Anti-Neuron Watermarking: Protecting Personal Data Against Unauthorized Neural Model Training.
CoRR, 2021

Bridging the Gap Between Object Detection and User Intent via Query-Modulation.
CoRR, 2021

When Vision Transformers Outperform ResNets without Pretraining or Strong Data Augmentations.
CoRR, 2021

A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection.
CoRR, 2021

Analyzing Deep Neural Network's Transferability via Fréchet Distance.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

On Model Calibration for Long-Tailed Object Detection and Instance Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Large-Scale Meta-Learning with Continual Trajectory Shifting.
Proceedings of the 38th International Conference on Machine Learning, 2021

Contrastive Learning for Label Efficient Semantic Segmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

MosaicOS: A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

A Lazy Approach to Long-Horizon Gradient-Based Meta-Learning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

CrossVQA: Scalably Generating Benchmarks for Systematically Testing VQA Generalization.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Spatiotemporal Contrastive Video Representation Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Ranking Neural Checkpoints.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

MoViNets: Mobile Video Networks for Efficient Video Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Adversarially Adaptive Normalization for Single Domain Generalization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Robust and Accurate Object Detection via Adversarial Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Class-Balanced Distillation for Long-Tailed Visual Recognition.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020
A Curriculum Domain Adaptation Approach to the Semantic Segmentation of Urban Scenes.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Classifier and Exemplar Synthesis for Zero-Shot Learning.
Int. J. Comput. Vis., 2020

Smooth Adversarial Training.
CoRR, 2020

When Ensembling Smaller Models is More Efficient than Single Large Models.
CoRR, 2020

Look, Listen, and Act: Towards Audio-Visual Embodied Navigation.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius.
Proceedings of the 8th International Conference on Learning Representations, 2020

Improving Object Detection with Selective Self-supervised Self-training.
Proceedings of the Computer Vision - ECCV 2020, 2020

Adversarial Examples Improve Image Recognition.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Neural Networks Are More Productive Teachers Than Human Raters: Active Mixup for Data-Efficient Knowledge Distillation From a Blackbox Model.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition From a Domain Adaptation Perspective.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Open Compound Domain Adaptation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Compound Domain Adaptation in an Open World.
CoRR, 2019

Defending Against Adversarial Attacks Using Random Forests.
CoRR, 2019

Synthesized Policies for Transfer and Adaptation across Tasks and Environments.
CoRR, 2019

Joint Modeling of Dense and Incomplete Trajectories for Citywide Traffic Volume Inference.
Proceedings of the World Wide Web Conference, 2019

End-to-End Video Captioning With Multitask Reinforcement Learning.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Optimize Deep Convolutional Neural Network with Ternarized Weights and High Accuracy.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Facial Image-to-Video Translation by a Hidden Affine Transformation.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks.
Proceedings of the 36th International Conference on Machine Learning, 2019

CAMOU: Learning Physical Vehicle Camouflages to Adversarially Attack Detectors in the Wild.
Proceedings of the 7th International Conference on Learning Representations, 2019

DHER: Hindsight Experience Replay for Dynamic Goals.
Proceedings of the 7th International Conference on Learning Representations, 2019

Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization Without Accessing Target Domain Data.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

A Fast and Accurate One-Stage Approach to Visual Grounding.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Constructing Self-Motivated Pyramid Curriculums for Cross-Domain Semantic Segmentation: A Non-Adversarial Approach.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Defending Against Adversarial Attacks Using Random Forest.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Not All Frames Are Equal: Weakly-Supervised Video Grounding With Contextual Similarity and Visual Clustering Losses.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Large-Scale Long-Tailed Recognition in an Open World.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

A Robust Zero-Sum Game Framework for Pool-based Active Learning.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Controllable Image-to-Video Translation: A Case Study on Facial Expression Generation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Learning a Multi-Concept Video Retrieval Model with Multiple Latent Variables.
ACM Trans. Multim. Comput. Commun. Appl., 2018

Exploring a SOT-MRAM Based In-Memory Computing for Data Processing.
IEEE Trans. Multi Scale Comput. Syst., 2018

Defend Deep Neural Networks Against Adversarial Examples via Fixed and Dynamic Quantized Activation Functions.
CoRR, 2018

Blind Pre-Processing: A Robust Defense Method Against Adversarial Examples.
CoRR, 2018

A Semi-Supervised Two-Stage Approach to Learning from Noisy Labels.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Synthesize Policies for Transfer and Adaptation across Tasks and Environments.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Improving the Improved Training of Wasserstein GANs: A Consistency Term and Its Dual Effect.
Proceedings of the 6th International Conference on Learning Representations, 2018

Improving Sequential Determinantal Point Processes for Supervised Video Summarization.
Proceedings of the Computer Vision - ECCV 2018, 2018

How Local Is the Local Diversity? Reinforcing Sequential Determinantal Point Processes with Dynamic Ground Sets for Supervised Video Summarization.
Proceedings of the Computer Vision - ECCV 2018, 2018

Deep Face Detector Adaptation Without Negative Transfer or Catastrophic Forgetting.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Geometry Guided Convolutional Neural Networks for Self-Supervised Video Representation Learning.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

End-to-End Learning of Motion Representation for Video Understanding.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Geodesic Flow Kernel and Landmarks: Kernel Methods for Unsupervised Domain Adaptation.
Proceedings of the Domain Adaptation in Computer Vision Applications, 2017

A Multisource Domain Generalization Approach to Visual Attribute Detection.
Proceedings of the Domain Adaptation in Computer Vision Applications, 2017

Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes.
Proceedings of the IEEE International Conference on Computer Vision, 2017

VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Query-Focused Video Summarization: Dataset, Evaluation, and a Memory Network Based Approach.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Improving Facial Attribute Prediction Using Semantic Segmentation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Weighted geodesic flow kernel for interpersonal mutual influence modeling and emotion recognition in dyadic interactions.
Proceedings of the Seventh International Conference on Affective Computing and Intelligent Interaction, 2017

2016
Infinite-Label Learning with Semantic Output Codes.
CoRR, 2016

Improved Dropout for Shallow and Deep Learning.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Query-Focused Extractive Video Summarization.
Proceedings of the Computer Vision - ECCV 2016, 2016

Webly-Supervised Video Recognition by Mutually Voting for Relevant Web Images and Web Video Frames.
Proceedings of the Computer Vision - ECCV 2016, 2016

An Empirical Study and Analysis of Generalized Zero-Shot Learning for Object Recognition in the Wild.
Proceedings of the Computer Vision - ECCV 2016, 2016

Fast Zero-Shot Image Tagging.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Learning Attributes Equals Multi-Source Domain Generalization.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Synthesized Classifiers for Zero-Shot Learning.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Large-Margin Determinantal Point Processes.
Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, 2015

2014
Learning Kernels for Unsupervised Domain Adaptation with Applications to Visual Object Recognition.
Int. J. Comput. Vis., 2014

Diverse Sequential Subset Selection for Supervised Video Summarization.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013
Learning Semantic Signatures for 3D Object Retrieval.
IEEE Trans. Multim., 2013

Reshaping Visual Datasets for Domain Adaptation.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Connecting the Dots with Landmarks: Discriminatively Learning Domain-Invariant Features for Unsupervised Domain Adaptation.
Proceedings of the 30th International Conference on Machine Learning, 2013

2012
Geodesic flow kernel for unsupervised domain adaptation.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
3D object retrieval with semantic attributes.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

2009
Boosting 3D object retrieval by object flexibility.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Automatic facial expression recognition on a single 3D face by exploring shape deformation.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

