Yongkang Wong

Orcid: 0000-0002-1239-4428

Affiliations:
  • National University of Singapore, School of Computing, Singapore


According to our database, Yongkang Wong authored at least 96 papers between 2009 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Recurrent Appearance Flow for Occlusion-Free Virtual Try-On.
ACM Trans. Multim. Comput. Commun. Appl., August, 2024

PAINT: Photo-realistic Fashion Design Synthesis.
ACM Trans. Multim. Comput. Commun. Appl., February, 2024

Unsupervised Domain Adaptation by Causal Learning for Biometric Signal-based HCI.
ACM Trans. Multim. Comput. Commun. Appl., February, 2024

Multi2Human: Controllable human image generation with multimodal controls.
Neurocomputing, 2024

STAR: Skeleton-aware Text-based 4D Avatar Generation with In-Network Motion Retargeting.
CoRR, 2024

TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment.
CoRR, 2024

Bridging the Intent Gap: Knowledge-Enhanced Visual Generation.
CoRR, 2024

Privacy-Enhancing Person Re-identification Framework - A Dual-Stage Approach.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

MCM: Multi-condition Motion Synthesis Framework.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Improving Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Finetuning Text-to-Image Diffusion Models for Fairness.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Semantic-Aware Triplet Loss for Image Classification.
IEEE Trans. Multim., 2023

Learning to Minimize the Remainder in Supervised Learning.
IEEE Trans. Multim., 2023

Fair Representation: Guaranteeing Approximate Multiple Group Fairness for Unknown Tasks.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

ELIP: Efficient Language-Image Pre-training with Fewer Vision Tokens.
CoRR, 2023

MCM: Multi-condition Motion Synthesis Framework for Multi-scenario.
CoRR, 2023

A Study on Differentiable Logic and LLMs for EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2023.
CoRR, 2023

Narrative Graph for Narrative Generation from Long Videos.
Proceedings of the 2nd Workshop on User-centric Narrative Summarization of Long Videos, 2023

NarSUM '23: The 2nd Workshop on User-Centric Narrative Summarization of Long Videos.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

2022
Enhanced 3D Shape Reconstruction With Knowledge Graph of Category Concept.
ACM Trans. Multim. Comput. Commun. Appl., 2022

Relation-Aware Compositional Zero-Shot Learning for Attribute-Object Pair Recognition.
IEEE Trans. Multim., 2022

Learning to Predict Gradients for Semi-Supervised Continual Learning.
CoRR, 2022

Don't Pour Cereal into Coffee: Differentiable Temporal Logic for Temporal Action Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Panel Discussion: Emerging Topics on Video Summarization.
Proceedings of the NarSUM '22: 1st Workshop on User-centric Narrative Summarization of Long Videos, 2022

Compute to Tell the Tale: Goal-Driven Narrative Generation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Distance Matters in Human-Object Interaction Detection.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

NarSUM '22: 1st Workshop on User-centric Narrative Summarization of Long Videos.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Chairs Can Be Stood On: Overcoming Object Bias in Human-Object Interaction Detection.
Proceedings of the Computer Vision, 2022

2021
DeepDance: Music-to-Dance Motion Choreography With Adversarial Learning.
IEEE Trans. Multim., 2021

Toward Multi-Modal Conditioned Fashion Image Translation.
IEEE Trans. Multim., 2021

Scene Graph Inference via Multi-Scale Context Modeling.
IEEE Trans. Circuits Syst. Video Technol., 2021

Direction Concentration Learning: Enhancing Congruency in Machine Learning.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Unsupervised Motion Representation Learning with Capsule Autoencoders.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning to Predict Trustworthiness with Steep Slope Loss.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning Causal Representation for Training Cross-Domain Pose Estimator via Generative Interventions.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020
G-Softmax: Improving Intraclass Compactness and Interclass Separability of Features.
IEEE Trans. Neural Networks Learn. Syst., 2020

Interact as You Intend: Intention-Driven Human-Object Interaction Detection.
IEEE Trans. Multim., 2020

Video Storytelling: Textual Summaries for Events.
IEEE Trans. Multim., 2020

Unsupervised Online Video Object Segmentation With Motion Property Understanding.
IEEE Trans. Image Process., 2020

Visual Social Relationship Recognition.
Int. J. Comput. Vis., 2020

GradMix: Multi-source Transfer across Domains and Tasks.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Weakly-Supervised Multi-Person Action Recognition in 360° Videos.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

n-Reference Transfer Learning for Saliency Prediction.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
A Multi-sensor Framework for Personal Presentation Analytics.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Multi-Modal and Multi-Domain Embedding Learning for Fashion Retrieval and Analysis.
IEEE Trans. Multim., 2019

Dual-Stream Recurrent Neural Network for Video Captioning.
IEEE Trans. Circuits Syst. Video Technol., 2019

Surface-Electromyography-Based Gesture Recognition by Multi-View Deep Learning.
IEEE Trans. Biomed. Eng., 2019

A multi-stream convolutional neural network for sEMG-based gesture recognition in muscle-computer interface.
Pattern Recognit. Lett., 2019

LSTM-based multi-label video event detection.
Multim. Tools Appl., 2019

G-softmax: Improving Intra-class Compactness and Inter-class Separability of Features.
CoRR, 2019

sEMG-Based Gesture Recognition With Embedded Virtual Hand Poses and Adversarial Learning.
IEEE Access, 2019

Explainable Video Action Reasoning via Prior Knowledge and State Transitions.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Unsupervised Domain Adaptation for 3D Human Pose Estimation.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Human-imperceptible Privacy Protection Against Machines.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Self-supervised Representation Learning Using 360° Data.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Learning to Detect Human-Object Interactions With Knowledge.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Learning to Learn From Noisy Labeled Data.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Learning Controllable Face Generator from Disjoint Datasets.
Proceedings of the Computer Analysis of Images and Patterns, 2019

2018
Video Storytelling.
CoRR, 2018

A Fine-Grained Spatial-Temporal Attention Model for Video Captioning.
IEEE Access, 2018

Unsupervised Learning of View-invariant Action Representations.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

2017
Benchmarking a Multimodal and Multiview and Interactive Dataset for Human Action Recognition.
IEEE Trans. Cybern., 2017

Hierarchical & multimodal video captioning: Discovering and transferring multimodal knowledge for vision to language.
Comput. Vis. Image Underst., 2017

Multi-Camera Action Dataset for Cross-Camera Action Recognition Benchmarking.
Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision, 2017

Tianjin University and National University of Singapore at TRECVID 2017: Video to Text Description.
Proceedings of the 2017 TREC Video Retrieval Evaluation, 2017

Attention Transfer from Web Images for Video Recognition.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Understanding Fashion Trends from Street Photos via Neighbor-Constrained Embedding Learning.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Semi-Supervised Learning for Surface EMG-based Gesture Recognition.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Dual-Glance Model for Deciphering Social Relationships.
Proceedings of the IEEE International Conference on Computer Vision, 2017

2016
Multi-Camera Action Dataset (MCAD): A Dataset for Studying Non-overlapped Cross-Camera Action Recognition.
CoRR, 2016

Demo Paper: PreSense - An Assistive Presentation Self-Quantification System.
Proceedings of the IEEE International Symposium on Multimedia, 2016

Multi-stream Deep Learning Framework for Automated Presentation Assessment.
Proceedings of the IEEE International Symposium on Multimedia, 2016

Towards protecting biometric templates without sacrificing performance.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Marker-Less 3D Human Motion Capture with Monocular Image Sequence and Height-Maps.
Proceedings of the Computer Vision - ECCV 2016, 2016

2015
Multi-Camera Saliency.
IEEE Trans. Pattern Anal. Mach. Intell., 2015

Multi-modal & Multi-view & Interactive Benchmark Dataset for Human Action Recognition.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Multi-sensor Self-Quantification of Presentations.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Label Consistent Quadratic Surrogate model for visual saliency prediction.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
Automatic classification of Human Epithelial type 2 cell Indirect Immunofluorescence images using Cell Pyramid Matching.
Pattern Recognit., 2014

On robust face recognition via sparse coding: the good, the bad and the ugly.
IET Biom., 2014

Multi-view action recognition by cross-domain learning.
Proceedings of the IEEE 16th International Workshop on Multimedia Signal Processing, 2014

View-invariant feature discovering for multi-camera human action recognition.
Proceedings of the IEEE 16th International Workshop on Multimedia Signal Processing, 2014

Recovering Social Interaction Spatial Structure from Multiple First-Person Views.
Proceedings of the 3rd International Workshop on Socially-Aware Multimedia, 2014

Scalable Decision-Theoretic Coordination and Control for Real-time Active Multi-Camera Surveillance.
Proceedings of the International Conference on Distributed Smart Cameras, 2014

Discovering Person Identity via Large-Scale Observations.
Proceedings of the Computer Vision - ACCV 2014 Workshops, 2014

2013
On Robust Face Recognition via Sparse Encoding: the Good, the Bad, and the Ugly.
CoRR, 2013

Classification of Human Epithelial type 2 cell indirect immunofluoresence images via codebook based descriptors.
Proceedings of the 2013 IEEE Workshop on Applications of Computer Vision, 2013

Temporal encoded F-formation system for social interaction detection.
Proceedings of the ACM Multimedia Conference, 2013

Video analytics for surveillance camera networks.
Proceedings of the 19th IEEE International Conference on Networks, 2013

2012
Towards robust identity inference under surveillance environments: from still images to video sequences.
PhD thesis, 2012

On robust biometric identity verification via sparse encoding of faces: Holistic vs local approaches.
Proceedings of the 2012 International Joint Conference on Neural Networks (IJCNN), 2012

Combined Learning of Salient Local Descriptors and Distance Metrics for Image Set Face Verification.
Proceedings of the Ninth IEEE International Conference on Advanced Video and Signal-Based Surveillance, 2012

2011
Patch-based probabilistic image quality assessment for face selection and improved video-based face recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010
Dynamic Amelioration of Resolution Mismatches for Local Feature Based Identity Inference.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

2009
Regression Based Non-frontal Face Synthesis for Improved Identity Verification.
Proceedings of the Computer Analysis of Images and Patterns, 13th International Conference, 2009

