Chuanxiong Guo

Orcid: 0000-0002-0730-8468

According to our database, Chuanxiong Guo authored at least 74 papers between 2001 and 2023.

Awards

IEEE Fellow

IEEE Fellow 2022, "for contributions to design of data center networking".

Bibliography

2023
Predicting GPU Failures With High Precision Under Deep Learning Workloads.
Proceedings of the 16th ACM International Conference on Systems and Storage, 2023

SRNIC: A Scalable Architecture for RDMA NICs.
Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation, 2023

BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing.
Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation, 2023

Lyra: Elastic Scheduling for Deep Learning Clusters.
Proceedings of the Eighteenth European Conference on Computer Systems, 2023

2022
dPRO: A Generic Profiling and Optimization System for Expediting Distributed DNN Training.
CoRR, 2022

Aryl: An Elastic Cluster Scheduler for Deep Learning.
CoRR, 2022

Prediction of GPU Failures Under Deep Learning Workloads.
CoRR, 2022

FAERY: An FPGA-accelerated Embedding-based Retrieval System.
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

Tiara: A Scalable and Efficient Hardware Acceleration Architecture for Stateful Layer-4 Load Balancing.
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022

Collie: Finding Performance Anomalies in RDMA Subsystems.
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022

dPRO: A Generic Performance Diagnosis and Optimization Toolkit for Expediting Distributed DNN Training.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

2021
Serving DNN Models with Multi-Instance GPUs: A Case of the Reconfigurable Machine Scheduling Problem.
CoRR, 2021

AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly.
Proceedings of the 9th International Conference on Learning Representations, 2021

Building verified neural networks with specifications for systems.
Proceedings of the APSys '21: 12th ACM SIGOPS Asia-Pacific Workshop on Systems, 2021

2020
Observing and Mitigating Micro-Burst Traffic in Data Center Networks.
IEEE/ACM Trans. Netw., 2020

A Unified Architecture for Accelerating Distributed DNN Training in Heterogeneous GPU/CPU Clusters.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Elastic parameter server load distribution in deep learning clusters.
Proceedings of the SoCC '20: ACM Symposium on Cloud Computing, 2020

2019
Tagger: Practical PFC Deadlock Prevention in Data Center Networks.
IEEE/ACM Trans. Netw., 2019

A generic communication scheduler for distributed DNN training acceleration.
Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019

NetBouncer: Active Device and Link Failure Localization in Data Center Networks.
Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation, 2019

FreeFlow: Software-based Virtual RDMA Networking for Containerized Clouds.
Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation, 2019

Tiresias: A GPU Cluster Manager for Distributed Deep Learning.
Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation, 2019

2018
TerseCades: Efficient Data Compression in Stream Processing.
Proceedings of the 2018 USENIX Annual Technical Conference, 2018

Capturing and Enhancing In Situ System Observability for Failure Detection.
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018

Deepview: Virtual Disk Failure Diagnosis and Pattern Detection for Azure.
Proceedings of the 15th USENIX Symposium on Networked Systems Design and Implementation, 2018

Micro-Burst in Data Centers: Observations, Analysis, and Mitigations.
Proceedings of the 2018 IEEE 26th International Conference on Network Protocols, 2018

Optimus: an efficient dynamic resource scheduler for deep learning clusters.
Proceedings of the Thirteenth EuroSys Conference, 2018

2017
CubicRing: Exploiting Network Proximity for Distributed In-Memory Key-Value Store.
IEEE/ACM Trans. Netw., 2017

Orchestrating Bulk Data Transfers across Geo-Distributed Datacenters.
IEEE Trans. Cloud Comput., 2017

deTector: a Topology-aware Monitoring System for Data Center Networks.
Proceedings of the 2017 USENIX Annual Technical Conference, 2017

Virtualized Network Coding Functions on the Internet.
Proceedings of the 37th IEEE International Conference on Distributed Computing Systems, 2017

Gray Failure: The Achilles' Heel of Cloud-Scale Systems.
Proceedings of the 16th Workshop on Hot Topics in Operating Systems, 2017

2016
Explicit Path Control in Commodity Data Centers: Design and Applications.
IEEE/ACM Trans. Netw., 2016

RDMA over Commodity Ethernet at Scale.
Proceedings of the ACM SIGCOMM 2016 Conference, Florianopolis, Brazil, August 22-26, 2016, 2016

Deadlocks in Datacenter Networks: Why Do They Form, and How to Avoid Them.
Proceedings of the 15th ACM Workshop on Hot Topics in Networks, 2016

2015
Congestion Control for Large-Scale RDMA Deployments.
Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, 2015

Pingmesh: A Large-Scale System for Data Center Network Latency Measurement and Analysis.
Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, 2015

CubicRing: Enabling One-Hop Failure Detection and Recovery for Distributed In-Memory Storage Systems.
Proceedings of the 12th USENIX Symposium on Networked Systems Design and Implementation, 2015

UniDrive: Synergize Multiple Consumer Cloud Storage Services.
Proceedings of the 16th Annual Middleware Conference, Vancouver, BC, Canada, December 07, 2015

2013
IP-Geolocation Mapping for Moderately Connected Internet Regions.
IEEE Trans. Parallel Distributed Syst., 2013

ICTCP: Incast Congestion Control for TCP in Data-Center Networks.
IEEE/ACM Trans. Netw., 2013

Moving Big Data to The Cloud: An Online Cost-Minimizing Approach.
IEEE J. Sel. Areas Commun., 2013

Datacast: A Scalable and Efficient Reliable Group Data Delivery Service for Data Centers.
IEEE J. Sel. Areas Commun., 2013

Moving big data to the cloud.
Proceedings of the IEEE INFOCOM 2013, Turin, Italy, April 14-19, 2013, 2013

PACE: Policy-Aware Application Cloud Embedding.
Proceedings of the IEEE INFOCOM 2013, Turin, Italy, April 14-19, 2013, 2013

Per-packet load-balanced, low-latency routing for clos-based data center networks.
Proceedings of the Conference on emerging Networking Experiments and Technologies, 2013

2012
DAC: Generic and Automatic Address Configuration for Data Center Networks.
IEEE/ACM Trans. Netw., 2012

Using CPU as a traffic co-processing unit in commodity switches.
Proceedings of the first workshop on Hot topics in software defined networks, 2012

RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store.
Proceedings of the 4th USENIX Workshop on Hot Topics in Cloud Computing, 2012

Tuning ECN for data center networks.
Proceedings of the Conference on emerging Networking Experiments and Technologies, 2012

Datacast: a scalable and efficient reliable group data delivery service for data centers.
Proceedings of the Conference on emerging Networking Experiments and Technologies, 2012

2011
Scalable and cost-effective interconnection of data-center servers using dual server ports.
IEEE/ACM Trans. Netw., 2011

ServerSwitch: A Programmable and High Performance Platform for Data Center Networks.
Proceedings of the 8th USENIX Symposium on Networked Systems Design and Implementation, 2011

RDCM: Reliable data center multicast.
Proceedings of the INFOCOM 2011. 30th IEEE International Conference on Computer Communications, 2011

2010
Time-domain sending rate and response function of eXplicit Control Protocol.
Telecommun. Syst., 2010

Generic and automatic address configuration for data center networks.
Proceedings of the ACM SIGCOMM 2010 Conference on Applications, 2010

SecondNet: a data center network virtualization architecture with bandwidth guarantees.
Proceedings of the 2010 ACM Conference on Emerging Networking Experiments and Technology, 2010

2009
CAFE: a configurable packet forwarding engine for data center networks.
Proceedings of the ACM SIGCOMM 2009 Workshop on Programmable Routers for Extensible Services of Tomorrow, 2009

BCube: a high performance, server-centric network architecture for modular data centers.
Proceedings of the ACM SIGCOMM 2009 Conference on Applications, 2009

A scalable micro wireless interconnect structure for CMPs.
Proceedings of the 15th Annual International Conference on Mobile Computing and Networking, 2009

FiConn: Using Backup Port for Server Interconnection in Data Centers.
Proceedings of the INFOCOM 2009. 28th IEEE International Conference on Computer Communications, 2009

Mining the Web and the Internet for Accurate IP Address Geolocations.
Proceedings of the INFOCOM 2009. 28th IEEE International Conference on Computer Communications, 2009

MDCube: a high performance network structure for modular data center interconnection.
Proceedings of the 2009 ACM Conference on Emerging Networking Experiments and Technology, 2009

2008
Dcell: a scalable and fault-tolerant network structure for data centers.
Proceedings of the ACM SIGCOMM 2008 Conference on Applications, 2008

Improved Smoothed Round Robin Schedulers for High-Speed Packet Networks.
Proceedings of the INFOCOM 2008. 27th IEEE International Conference on Computer Communications, 2008

Design and Analysis of an XCP-TCP Gateway.
Proceedings of the 2008 International Conference on Information Networking, 2008

2007
Generic Application-Level Protocol Analyzer and its Language.
Proceedings of the Network and Distributed System Security Symposium, 2007

G-3: An O(1) Time Complexity Packet Scheduler That Provides Bounded End-to-End Delay.
Proceedings of the INFOCOM 2007. 26th IEEE International Conference on Computer Communications, 2007

2005
End-system-based mobility support in IPv6.
IEEE J. Sel. Areas Commun., 2005

2004
SRR: an O(1) time-complexity packet scheduler for flows in multiservice packet networks.
IEEE/ACM Trans. Netw., 2004

A seamless and proactive end-to-end mobility solution for roaming across heterogeneous wireless networks.
IEEE J. Sel. Areas Commun., 2004

Shield: vulnerability-driven network filters for preventing known vulnerability exploits.
Proceedings of the ACM SIGCOMM 2004 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, August 30, 2004

2003
Efficient mobility management for vertical handoff between WWAN and WLAN.
IEEE Commun. Mag., 2003

2001
SRR: An O(1) time complexity packet scheduler for flows in multi-service packet networks.
Proceedings of the ACM SIGCOMM 2001 Conference on Applications, 2001
