Junwei Liang

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

Prioritized Semantic Learning for Zero-Shot Instance Navigation.

Proceedings of the Computer Vision - ECCV 2024, 2024

VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal Forecasting.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

TFNet: Exploiting Temporal Cues for Fast and Accurate LiDAR Semantic Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

FinTextQA: A Dataset for Long-form Financial Question Answering.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

GeoDeformer: Geometric Deformable Transformer for Action Recognition.

CoRR, 2023

AdaFocus: Towards End-to-end Weakly Supervised Learning for Long-Video Action Understanding.

CoRR, 2023

PostRainBench: A comprehensive benchmark and a new model for precipitation forecasting.

CoRR, 2023

PatchMixer: A Patch-Mixing Architecture for Long-Term Time Series Forecasting.

Zeying Gong

Yujin Tang

CoRR, 2023

TFNet: Exploiting Temporal Cues for Fast and Accurate LiDAR Semantic Segmentation.

CoRR, 2023

Spatial-Temporal Alignment Network for Action Recognition.

Jinhui Ye

CoRR, 2023

STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition.

Christophe De Vleeschouwer

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Multi-dataset Training of Transformers for Robust Action Recognition.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Transformer-based System for Action Spotting in Soccer Videos.

Proceedings of the MMSports@MM 2022: Proceedings of the 5th International ACM Workshop on Multimedia Content Analysis in Sports, 2022

SoccerNet 2022 Challenges Results.

Alexandre Alahi

Bernard Ghanem

Marc Van Droogenbroeck

Miguel Santos Marques

Proceedings of the MMSports@MM 2022: Proceedings of the 5th International ACM Workshop on Multimedia Content Analysis in Sports, 2022

Stargazer: A Transformer-based Driver Action Detection System for Intelligent Transportation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

2021

MSNet: A Multilevel Instance Segmentation Network for Natural Disaster Damage Assessment in Aerial Videos.

Xiaoyu Zhu

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Weakly Supervised 3D Semantic Segmentation Using Cross-Image Consensus and Inter-Voxel Affinity Relations.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020

Spatial-Temporal Alignment Network for Action Recognition and Detection.

CoRR, 2020

Joint Analysis and Prediction of Human Actions and Paths in Video.

CoRR, 2020

SimAug: Learning Robust Representations from 3D Simulation for Pedestrian Trajectory Prediction in Unseen Cameras.

CoRR, 2020

Argus: Efficient Activity Detection System for Extended Video Analysis.

Proceedings of the IEEE Winter Applications of Computer Vision Workshops, 2020

SimAug: Learning Robust Representations from Simulation for Trajectory Prediction.

Proceedings of the Computer Vision - ECCV 2020, 2020

The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Focal Visual-Text Attention for Memex Question Answering.

IEEE Trans. Pattern Anal. Mach. Intell., 2019

Technical Report of the DAISY System - Shooter Localization, Models, Interface, and Beyond.

Jay D. Aronson

CoRR, 2019

Minding the Gaps in a Video Action Analysis Pipeline.

Proceedings of the IEEE Winter Applications of Computer Vision Workshops, 2019

MMVG-INF-Etrol@TRECVID 2019: Activities in Extended Video.

Proceedings of the 2019 TREC Video Retrieval Evaluation, 2019

Shooter Localization Using Social Media Videos.

Jay D. Aronson

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Peeking Into the Future: Predicting Future Person Activities and Locations in Videos.

Juan Carlos Niebles

Li Fei-Fei

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Shooter Localization Using Videos in the Wild.

Jay D. Aronson

Proceedings of the 2019 International Conference on Content-Based Multimedia Indexing, 2019

2018

Multimodal Co-Training for Selecting Good Examples from Webly Labeled Video.

Ryota Hinami

Shin'ichi Satoh

CoRR, 2018

Informedia @ TRECVID 2018: Ad-hoc Video Search, Video to Text Description, Activities in Extended video.

Jia Chen

Shizhe Chen

Qin Jin

Proceedings of the 2018 TREC Video Retrieval Evaluation, 2018

Multimodal Filtering of Social Media for Temporal Monitoring and Event Analysis.

Po-Yao Huang

Jean-Baptiste Lamare

Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

Focal Visual-Text Attention for Visual Question Answering.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

MemexQA: Visual Memex Question Answering.

CoRR, 2017

Informedia @ TRECVID 2017.

Proceedings of the 2017 TREC Video Retrieval Evaluation, 2017

Leveraging Multi-modal Prior Knowledge for Large-scale Concept Learning in Noisy Web Data.

Deyu Meng

Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017

Temporal localization of audio events for conflict monitoring in social media.

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Synchronization for multi-perspective videos in the wild.

Poyao Huang

Jia Chen

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Webly-Supervised Learning of Multimodal Video Detectors.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

An Event Reconstruction Tool for Conflict Monitoring Using Social Media.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Exploiting Multi-modal Curriculum in Noisy Web Data for Large-scale Concept Learning.

Deyu Meng

CoRR, 2016

Informedia @ TRECVID 2016.

Proceedings of the 2016 TREC Video Retrieval Evaluation, 2016

Video Description Generation using Audio and Visual Cues.

Qin Jin

Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

Generating Natural Video Descriptions via Multimodal Processing.

Qin Jin

Xiaozhu Lin

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Learning to Detect Concepts from Webly-Labeled Video Data.

Deyu Meng