Zhou Yu

Orcid: 0000-0001-8407-1137

Affiliations:
  • Hangzhou Dianzi University, Key Laboratory of Complex Systems Modeling and Simulation, Hangzhou, China


According to our database, Zhou Yu authored at least 55 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Effective Video Summarization by Extracting Parameter-Free Motion Attention.
ACM Trans. Multim. Comput. Commun. Appl., July, 2024

Confidence correction for trained graph convolutional networks.
Pattern Recognit., 2024

Imp: Highly Capable Large Multimodal Models for Mobile Devices.
CoRR, 2024

3D Question Answering with Scene Graph Reasoning.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

MPOD123: One Image to 3D Content Generation Using Mask-Enhanced Progressive Outline-to-Detail Optimization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
MARN: Multi-level Attentional Reconstruction Networks for Weakly Supervised Video Temporal Grounding.
Neurocomputing, October, 2023

Bilaterally Slimmable Transformer for Elastic and Efficient Visual Question Answering.
IEEE Trans. Multim., 2023

Parameter-Efficient Transfer Learning for Audio-Visual-Language Tasks.
CoRR, 2023

Contrastive Perturbation Network for Weakly Supervised Temporal Sentence Grounding.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

Parameter-Efficient Transfer Learning for Audio-Visual-Language Tasks.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Prompting Large Language Models with Answer Heuristics for Knowledge-Based Visual Question Answering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Deep relational self-Attention networks for scene graph generation.
Pattern Recognit. Lett., 2022

Question-relationship guided graph attention network for visual question answer.
Multim. Syst., 2022

Towards Efficient and Elastic Visual Question Answering with Doubly Slimmable Transformer.
CoRR, 2022

Delegate-based Utility Preserving Synthesis for Pedestrian Image Anonymization.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

2021
SPRNet: Single-Pixel Reconstruction for One-Stage Instance Segmentation.
IEEE Trans. Cybern., 2021

Long-Term Video Question Answering via Multimodal Hierarchical Memory Attentive Networks.
IEEE Trans. Circuits Syst. Video Technol., 2021

Accelerated masked transformer for dense video captioning.
Neurocomputing, 2021

Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems.
CoRR, 2021

Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

2020
Compositional Attention Networks With Two-Stream Fusion for Video Question Answering.
IEEE Trans. Image Process., 2020

Multimodal Transformer With Multi-View Visual Representation for Image Captioning.
IEEE Trans. Circuits Syst. Video Technol., 2020

Intra- and Inter-modal Multilinear Pooling with Multitask Learning for Video Grounding.
Neural Process. Lett., 2020

Weakly-Supervised Multi-Level Attentional Reconstruction Network for Grounding Textual Queries in Videos.
CoRR, 2020

Deep Multimodal Neural Architecture Search.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

2019
End-to-end visual grounding via region proposal networks and bilinear pooling.
IET Comput. Vis., 2019

Multimodal Unified Attention Networks for Vision-and-Language Interactions.
CoRR, 2019

Single Pixel Reconstruction for One-stage Instance Segmentation.
CoRR, 2019

Deep Modular Co-Attention Networks for Visual Question Answering.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
User-Click-Data-Based Fine-Grained Image Recognition via Weakly Supervised Metric Learning.
ACM Trans. Multim. Comput. Commun. Appl., 2018

Beyond Bilinear: Generalized Multimodal Factorized High-Order Pooling for Visual Question Answering.
IEEE Trans. Neural Networks Learn. Syst., 2018

Comprehensive Distance-Preserving Autoencoders for Cross-Modal Retrieval.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Ontology-Driven Hierarchical Deep Learning for Fashion Recognition.
Proceedings of the IEEE 1st Conference on Multimedia Information Processing and Retrieval, 2018

Open-Ended Long-form Video Question Answering via Adaptive Hierarchical Reinforced Networks.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

2017
Beyond Bilinear: Generalized Multi-modal Factorized High-order Pooling for Visual Question Answering.
CoRR, 2017

Privacy Setting Recommendation for Image Sharing.
Proceedings of the 16th IEEE International Conference on Machine Learning and Applications, 2017

Deep Mixture of Experts with Diverse Task Spaces.
Proceedings of the 16th IEEE International Conference on Machine Learning and Applications, 2017

Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering.
Proceedings of the IEEE International Conference on Computer Vision, 2017

2015
RAISE: A Whole Process Modeling Method for Unstructured Data Management.
Proceedings of the 2015 IEEE International Conference on Multimedia Big Data, BigMM 2015, 2015

2014
Sparse Multi-Modal Hashing.
IEEE Trans. Multim., 2014

Hashing with List-Wise learning to rank.
Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2014

Discriminative coupled dictionary hashing for fast cross-media retrieval.
Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2014

Cross-Media Hashing with Neural Networks.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Cross-media hashing with kernel regression.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

2012
LuSH: A Generic High-Dimensional Index Framework.
Proceedings of the Web-Age Information Management, 2012

Image Ranking via Attribute Boosted Hypergraph.
Proceedings of the Advances in Multimedia Information Processing - PCM 2012, 2012

2010
Fire Surveillance Method Based on Quaternionic Wavelet Features.
Proceedings of the Advances in Multimedia Modeling, 2010

Error-correcting output hashing in fast similarity search.
Proceedings of the Second International Conference on Internet Multimedia Computing and Service, 2010

2009
Shanghai Jiao Tong University participation in high-level feature extraction and surveillance event detection at TRECVID 2009.
Proceedings of the TRECVID 2009 workshop participants notebook papers, 2009

Structure-Preserving Colorization Based on Quaternionic Phase Reconstruction.
Proceedings of the Advances in Multimedia Information Processing, 2009

2008
Shanghai Jiao Tong University participation in high-level feature extraction, automatic search and surveillance event detection at TRECVID 2008.
Proceedings of the TRECVID 2008 workshop participants notebook papers, 2008

