Qiang Wang

ORCID: 0000-0002-2986-967X

Affiliations:
  • Hong Kong Baptist University, Kowloon Tong, Hong Kong


According to our database, Qiang Wang authored at least 51 papers between 2012 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression.
CoRR, 2024

STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs.
CoRR, 2024

ParZC: Parametric Zero-Cost Proxies for Efficient NAS.
CoRR, 2024

Towards Efficient and Reliable LLM Serving: A Real-World Workload Study.
CoRR, 2024

3D Question Answering for City Scene Understanding.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Scheduling Deep Learning Jobs in Multi-Tenant GPU Clusters via Wise Resource Sharing.
Proceedings of the 32nd IEEE/ACM International Symposium on Quality of Service, 2024

Pruner-Zero: Evolving Symbolic Pruning Metric From Scratch for Large Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

ScheMoE: An Extensible Mixture-of-Experts Distributed Training System with Tasks Scheduling.
Proceedings of the Nineteenth European Conference on Computer Systems, 2024

LPZero: Language Model Zero-cost Proxy Search from Zero.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Multi-task Domain Adaptation for Language Grounding with 3D Objects.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
CF-NeRF: Camera Parameter Free Neural Radiance Fields with Incremental Learning.
CoRR, 2023

FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs.
CoRR, 2023

SVDE: Scalable Value-Decomposition Exploration for Cooperative Multi-Agent Reinforcement Learning.
CoRR, 2023

Explicifying Neural Implicit Fields for Efficient Dynamic Human Avatar Modeling via a Neural Explicit Surface.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Rethinking Disparity: A Depth Range Free Multi-View Stereo Based on Disparity.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Energy-Aware Non-Preemptive Task Scheduling With Deadline Constraint in DVFS-Enabled Heterogeneous Clusters.
IEEE Trans. Parallel Distributed Syst., 2022

Scale-Consistent Fusion: From Heterogeneous Local Sampling to Global Immersive Rendering.
IEEE Trans. Image Process., 2022

Energy-Efficient Online Scheduling of Transformer Inference Services on GPU Servers.
IEEE Trans. Green Commun. Netw., 2022

EASNet: Searching Elastic and Accurate Network Architecture for Stereo Matching.
Proceedings of the Computer Vision - ECCV 2022, 2022

SphereDepth: Panorama Depth Estimation from Spherical Domain.
Proceedings of the International Conference on 3D Vision, 2022

2021
FADNet++: Real-Time and Accurate Disparity Estimation with Configurable Networks.
CoRR, 2021

Scale-Consistent Fusion: from Heterogeneous Local Sampling to Global Immersive Rendering.
CoRR, 2021

Energy-aware Task Scheduling with Deadline Constraint in DVFS-enabled Heterogeneous Clusters.
CoRR, 2021

IRS: A Large Naturalistic Indoor Robotics Stereo Dataset to Train Deep Models for Disparity and Surface Normal Estimation.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

EDNet: Efficient Disparity Estimation With Cost Volume Combination and Attention-Based Spatial Residual.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
GPGPU Performance Estimation With Core and Memory Frequency Scaling.
IEEE Trans. Parallel Distributed Syst., 2020

ESetStore: An Erasure-Coded Storage System With Fast Data Recovery.
IEEE Trans. Parallel Distributed Syst., 2020

Communication Contention Aware Scheduling of Multiple Deep Learning Training Jobs.
CoRR, 2020

GPGPU performance estimation for frequency scaling using cross-benchmarking.
Proceedings of the GPGPU@PPoPP '20: 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit colocated with 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

Energy-efficient Inference Service of Transformer-based Deep Learning Models on GPUs.
Proceedings of the 2020 International Conferences on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, 2020

Communication-Efficient Distributed Deep Learning with Merged Gradient Sparsification on GPUs.
Proceedings of the 39th IEEE Conference on Computer Communications, 2020

FADNet: A Fast and Accurate Network for Disparity Estimation.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format.
Proceedings of the 26th IEEE International Conference on Parallel and Distributed Systems, 2020

Layer-Wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees.
Proceedings of the ECAI 2020 - 24th European Conference on Artificial Intelligence, 29 August-8 September 2020, Santiago de Compostela, Spain, 2020

Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training.
Proceedings of the 20th IEEE/ACM International Symposium on Cluster, 2020

2019
IRS: A Large Synthetic Indoor Robotics Stereo Dataset for Disparity and Surface Normal Estimation.
CoRR, 2019

A Convergence Analysis of Distributed SGD with Communication-Efficient Gradient Sparsification.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks.
Proceedings of the 39th IEEE International Conference on Distributed Computing Systems, 2019

The Impact of GPU DVFS on the Energy and Performance of Deep Learning: an Empirical Study.
Proceedings of the Tenth ACM International Conference on Future Energy Systems, 2019

2018
G-CRS: GPU Accelerated Cauchy Reed-Solomon Coding.
IEEE Trans. Parallel Distributed Syst., 2018

Modeling and Evaluation of Synchronous Stochastic Gradient Descent in Distributed Deep Learning on Multiple GPUs.
CoRR, 2018

A DAG Model of Synchronous Stochastic Gradient Descent in Distributed Deep Learning.
Proceedings of the 24th IEEE International Conference on Parallel and Distributed Systems, 2018

Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs.
Proceedings of the 2018 IEEE 16th Intl Conf on Dependable, 2018

2017
GPGPU Power Estimation with Core and Memory Frequency Scaling.
SIGMETRICS Perform. Evaluation Rev., 2017

An optimal parallel implementation of Markov Clustering based on the coordination of CPU and GPU.
J. Intell. Fuzzy Syst., 2017

A survey and measurement study of GPU DVFS on energy conservation.
Digit. Commun. Networks, 2017

EPPMiner: An Extended Benchmark Suite for Energy, Power and Performance Characterization of Heterogeneous Architecture.
Proceedings of the Eighth International Conference on Future Energy Systems, 2017

2016
Benchmarking State-of-the-Art Deep Learning Software Tools.
Proceedings of the 7th International Conference on Cloud Computing and Big Data, 2016

2013
P-FAD: Real-Time Face Detection Scheme on Embedded Smart Cameras.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2013

On-line configuration of large scale surveillance networks using mobile smart camera.
Proceedings of the Seventh International Conference on Distributed Smart Cameras, 2013

2012
RaFFD: Resource-aware Fast Foreground Detection in embedded smart cameras.
Proceedings of the 2012 IEEE Global Communications Conference, 2012

