Dezun Dong

IEEE Trans. Parallel Distributed Syst., 2022

Hybrid Memory Buffer Microarchitecture for High-Radix Routers.

IEEE Trans. Computers, 2022

MUA-Router: Maximizing the Utility-of-Allocation for On-chip Pipelining Routers.

Cunlu Li

ACM Trans. Archit. Code Optim., 2022

CP-SGD: Distributed stochastic gradient descent with compression and periodic compensation.

J. Parallel Distributed Comput., 2022

Understanding node connection modes in Multi-Rail Fat-tree.

Yuyang Wang

J. Parallel Distributed Comput., 2022

Revisiting network congestion avoidance through adaptive packet-chaining reservation.

Comput. Networks, 2022

FastCredit: Expediting credit-based congestion control in datacenters.

Comput. Networks, 2022

Alleviating Performance Interference Through Intra-Queue I/O Isolation for NVMe-over-Fabrics.

Wenhao Gu

Xuchao Xie

Proceedings of the Network and Parallel Computing, 2022

Fine-grained code-comment semantic interaction analysis.

Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension, 2022

Fast-Converging Congestion Control in Datacenter Networks.

Proceedings of the IEEE Symposium on Computers and Communications, 2022

A Quantitative Study of the Spatiotemporal I/O Burstiness of HPC Application.

Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Optimized MPI collective algorithms for dragonfly topology.

Guangnan Feng

Yutong Lu

Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

DC4: Reconstructing Data-Credit-Coupled Congestion Control for Data Centers.

Proceedings of the 51st International Conference on Parallel Processing, 2022

STEGNN: Spatial-Temporal Embedding Graph Neural Networks for Road Network Forecasting.

Proceedings of the 28th IEEE International Conference on Parallel and Distributed Systems, 2022

A Transformable NVMeoF Queue Design for Better Differentiating Read and Write Request Processing.

Proceedings of the 28th IEEE International Conference on Parallel and Distributed Systems, 2022

DNNEmu: A Lightweight Performance Emulator for Distributed DNN Training.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2022

LTNoT: Realizing the Trade-Offs Between Latency and Throughput in NVMe over TCP.

Wenhao Gu

Xuchao Xie

Proceedings of the Algorithms and Architectures for Parallel Processing, 2022

THperf: Enabling Accurate Network Latency Measurement for Tianhe-2 System.

Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

ERA: ECN-Ratio-Based Congestion Control in Datacenter Networks.

Proceedings of the 22nd IEEE International Symposium on Cluster, 2022

Reservoir: Enhance the Burst-flow Tolerance in Datacenter Networks.

Proceedings of the Tenth International Conference on Advanced Cloud and Big Data, 2022

2021

CIB-HIER: Centralized Input Buffer Design in Hierarchical High-radix Routers.

ACM Trans. Archit. Code Optim., 2021

Communication optimization strategies for distributed deep neural network training: A survey.

J. Parallel Distributed Comput., 2021

Harmonia: Explicit Congestion Notification and Credit-Reservation Transport Converged Congestion Control in Datacenters.

J. Comput. Sci. Technol., 2021

Performance Evaluation of Memory-Centric ARMv8 Many-Core Architectures: A Case Study with Phytium 2000+.

J. Comput. Sci. Technol., 2021

CCRP: Converging Credit-Based and Reactive Protocols in Datacenters.

Int. J. Parallel Program., 2021

MP-CREDIT: Multi-path credit for high-speed data center transports.

Comput. Networks, 2021

LIBSHALOM: optimizing small and irregular-shaped matrix multiplications on ARMv8 multi-cores.

Proceedings of the International Conference for High Performance Computing, 2021

Taming Congestion and Latency in Low-Diameter High-Performance Datacenters.

Proceedings of the Network and Parallel Computing, 2021

MPICC: Multi-Path INT-Based Congestion Control in Datacenter Networks.

Proceedings of the Network and Parallel Computing, 2021

vSketchDLC: A Sketch on Distributed Deep Learning Communication via Fine-grained Tracing Visualization.

Proceedings of the Network and Parallel Computing, 2021

Evaluation of Topology-Aware All-Reduce Algorithm for Dragonfly Networks.

Proceedings of the Network and Parallel Computing, 2021

FastTune: Timely and Precise Congestion Control in Data Center Network.

Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, September 30, 2021

PAARD: Proximity-Aware All-Reduce Communication for Dragonfly Networks.

FastHorovod: Expediting Parallel Message-Passing Schedule for Distributed DNN Training.

Proceedings of the IEEE Symposium on Computers and Communications, 2021

Characterizing Small-Scale Matrix Multiplications on ARMv8-based Many-Core Architectures.

Weiling Yang

Jianbin Fang

Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

MR-tree: A Parametric Family of Multi-Rail Fat-tree.

Yuyang Wang

Proceedings of the IEEE International Performance, 2021

PFT: A Congestion Avoidance Method based on Proactive Flow Throttling at Endpoints.

Proceedings of the 17th IFIP/IEEE International Symposium on Integrated Network Management, 2021

CD-SGD: Distributed Stochastic Gradient Descent with Compression and Delay Compensation.

Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

Breaking One-RTT Barrier: Ultra-Precise and Efficient Congestion Control in Datacenter Networks.

Proceedings of the 30th International Conference on Computer Communications and Networks, 2021

NEPG: Partitioning Large-Scale Power-Law Graphs.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2021

A Novel Reinforcement Learning Framework for Adaptive Routing in Network-on-Chips.

Proceedings of the 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, 2021

Exploring Node Connection Modes in Multi-Rail Fat-tree.

Yuyang Wang

Proceedings of the IEEE International Conference on Cluster Computing, 2021

RELAR: A Reinforcement Learning Framework for Adaptive Routing in Network-on-Chips.

Proceedings of the IEEE International Conference on Cluster Computing, 2021

2020

Spatially Bursty I/O on Supercomputers: Causes, Impacts and Solutions.

IEEE Trans. Parallel Distributed Syst., 2020

OD-SGD: One-Step Delay Stochastic Gradient Descent for Distributed Training.

ACM Trans. Archit. Code Optim., 2020

DancerFly: An Order-Aware Network-on-Chip Router On-the-Fly Mitigating Multi-path Packet Reordering.

Int. J. Parallel Program., 2020

ssd-sgd: communication sparsification for distributed deep learning training.

CoRR, 2020

OD-SGD: One-step Delay Stochastic Gradient Descent for Distributed Training.

CoRR, 2020

Communication Optimization Strategies for Distributed Deep Learning: A Survey.

CoRR, 2020

APCC: Agile and Precise Congestion Control in Datacenters.

Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2020

Bundlefly: a low-diameter topology for multicore fiber.

Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

FastCredit: Expediting Credit-based Proactive Transports in Datacenters.

Proceedings of the 26th IEEE International Conference on Parallel and Distributed Systems, 2020

Converging Credit-based and Reactive Datacenter Transport using ECN and RTT.

Proceedings of the 22nd IEEE International Conference on High Performance Computing and Communications; 18th IEEE International Conference on Smart City; 6th IEEE International Conference on Data Science and Systems, 2020

Reducing Tail Latency in Proactive Congestion Control via Moderate Speculation.

Ke Wu

SSP: Speeding up Small Flows for Proactive Transport in Datacenters.

Proceedings of the IEEE International Conference on Cluster Computing, 2020

2019

SketchDLC: A Sketch on Distributed Deep Learning Communication via Trace Capturing.

ACM Trans. Archit. Code Optim., 2019

HARE: History-Aware Adaptive Routing Algorithm for Endpoint Congestion in Networks-on-Chip.

Int. J. Parallel Program., 2019

ExpressPass+: ECN-friendly Credit Reservation Congestion Control for Datacenters.

Proceedings of the ACM SIGCOMM 2019 Conference Posters and Demos, 2019

HyFabric: Minimizing FCT in Optical and Electrical Hybrid Data Center Networks.

Proceedings of the ACM SIGCOMM 2019 Conference Posters and Demos, 2019

ExpressPass++: Credit-Effecient Congestion Control for Data Centers.

Proceedings of the 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2019

Measuring the Coexistence Competitiveness of ECN- or RTT-Based ExpressPass and TCP in Data Centers.

Proceedings of the 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2019

DeepHiR: improving high-radix router throughput with deep hybrid memory buffer microarchitecture.

Proceedings of the ACM International Conference on Supercomputing, 2019

Network Congestion Avoidance through Packet-chaining Reservation.

Proceedings of the 48th International Conference on Parallel Processing, 2019

EC4: ECN and Credit-Reservation Converged Congestion Control.

Proceedings of the 25th IEEE International Conference on Parallel and Distributed Systems, 2019

PPS: A Low-Latency and Low-Complexity Switching Architecture Based on Packet Prefetch and Arbitration Prediction.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2019

2018

RoB-Router: A Reorder Buffer Enabled Low Latency Network-on-Chip Router.

IEEE Trans. Parallel Distributed Syst., 2018

Congestion control in high-speed lossless data center networks: A survey.

Shan Huang

Wei Bai

Future Gener. Comput. Syst., 2018

DETOUR: A Large-Scale Non-blocking Optical Data Center Fabric.

Jinzhen Bao

Baokang Zhao

Proceedings of the Supercomputing Frontiers - 4th Asian Conference, 2018

CRSP: Network Congestion Control through Credit Reservation.

Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2018

BFRP: Endpoint Congestion Avoidance Through Bilateral Flow Reservation.

Proceedings of the 37th IEEE International Performance Computing and Communications Conference, 2018

Eca-Router: On Achieving Endpoint Congestion Aware Switch Allocation in the On-Chip Network.

Cunlu Li

Proceedings of the 36th IEEE International Conference on Computer Design, 2018

2017

Energy-efficient NoC with multi-granularity power optimization.

J. Supercomput., 2017

HERO: A Hybrid Electrical and Optical Multicast for Accelerating High-Performance Data Center Applications.

Proceedings of the Posters and Demos Proceedings of the Conference of the ACM Special Interest Group on Data Communication, 2017

A Scalable and Resilient Microarchitecture Based on Multiport Binding for High-Radix Router Design.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

An Efficient Label Routing on High-Radix Interconnection Networks.

Proceedings of the 23rd IEEE International Conference on Parallel and Distributed Systems, 2017

iCAST: Accelerating High-Performance Data Center Applications by Hybrid Electrical and Optical Multicast.

Proceedings of the 23rd IEEE International Conference on Parallel and Distributed Systems, 2017

Exploiting contention and congestion aware switch allocation in network-on-chips.

Cunlu Li

Proceedings of the ACM Turing 50th Celebration Conference, 2017

NoC power optimization using combined routing algorithms.

Ji Wu

Li Wang

Proceedings of the 16th IEEE/ACIS International Conference on Computer and Information Science, 2017

2016

Detailed and clock-driven simulation for HPC interconnection network.

Frontiers Comput. Sci., 2016

Galaxyfly: A Novel Family of Flexible-Radix Low-Diameter Topologies for Large-Scales Interconnection Networks.

Proceedings of the 2016 International Conference on Supercomputing, 2016

CCAS: Contention and congestion aware switch allocation for network-on-chips.

Proceedings of the 34th IEEE International Conference on Computer Design, 2016

RoB-Router: Low Latency Network-on-Chip Router Microarchitecture Using Reorder Buffer.

Proceedings of the 24th IEEE Annual Symposium on High-Performance Interconnects, 2016

MBL: A Multi-stage Bufferless High-radix Router.

Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

2015

High Performance Interconnect Network for Tianhe System.

J. Comput. Sci. Technol., 2015

FlyCast: Free-Space Optics Accelerating Multicast Communications in Physical Layer.

Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, 2015

Chameleon: Adaptive energy-efficient heterogeneous network-on-chip.

Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

HVCRouter: Energy Efficient Network-on-Chip Router with Heterogeneous Virtual Channels.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

2014

The TH Express high performance interconnect networks.

Frontiers Comput. Sci., 2014

PathZip: A lightweight scheme for tracing packet path in wireless sensor networks.

Comput. Networks, 2014

FLYER: Fine-grained landmark based greedy geographic routing under uncertain locations.

Proceedings of the IEEE International Conference on Communications, 2014

2013

Fine-Grained Location-Free Planarization in Wireless Sensor Networks.

IEEE Trans. Mob. Comput., 2013

Fine-Grained Landmark Based Greedy Geographic Routing with Guaranteed Delivery Under Uncertain Locations.

Proceedings of the IEEE 10th International Conference on Mobile Ad-Hoc and Sensor Systems, 2013

WormPlanar: Topological Planarization Based Wormhole Detection in Wireless Networks.

Proceedings of the 42nd International Conference on Parallel Processing, 2013

2012

Distributed Coverage in Wireless Ad Hoc and Sensor Networks by Topological Graph Approaches.

IEEE Trans. Computers, 2012

MDS-Based Wormhole Detection Using Local Topology in Wireless Sensor Networks.

Int. J. Distributed Sens. Networks, 2012

PathZip: Packet path tracing in wireless sensor networks.

Proceedings of the 9th IEEE International Conference on Mobile Ad-Hoc and Sensor Systems, 2012

2011

Edge Self-Monitoring for Wireless Sensor Networks.

IEEE Trans. Parallel Distributed Syst., 2011

Component-based localization in sparse wireless networks.

IEEE/ACM Trans. Netw., 2011

Topological Detection on Wormholes in Wireless Ad Hoc and Sensor Networks.

IEEE/ACM Trans. Netw., 2011

Connectivity-Based Wormhole Detection in Ubiquitous Sensor Networks.

J. Inf. Sci. Eng., 2011

Fine-grained location-free planarization in wireless sensor networks.

Proceedings of the INFOCOM 2011. 30th IEEE International Conference on Computer Communications, 2011

2010

Distributed Coverage in Wireless Ad Hoc and Sensor Networks by Topological Graph Approaches.

Proceedings of the 2010 International Conference on Distributed Computing Systems, 2010

2009

Fine-grained boundary recognition in wireless ad hoc and sensor networks by topological methods.

Yunhao Liu

Proceedings of the 10th ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2009

WormCircle: Connectivity-Based Wormhole Detection in Wireless Ad Hoc and Sensor Networks.

Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

2008

Self-monitoring for sensor networks.

Yunhao Liu