Jingwen Leng
ORCID: 0000-0002-5660-5493
Affiliations:
- Shanghai Jiao Tong University, Department of Computer Science and Engineering, Shanghai, China
- IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
- University of Texas at Austin, Department of Electrical and Computer Engineering, TX, USA (PhD 2016)
According to our database, Jingwen Leng authored at least 91 papers between 2013 and 2025.
Bibliography
2025
BAFT: bubble-aware fault-tolerant framework for distributed DNN training with hybrid parallelism.
Frontiers Comput. Sci., January, 2025
2024
Vortex: Efficient Sample-Free Dynamic Tensor Program Optimization via Hardware-aware Strategy Space Hierarchization.
CoRR, 2024
CoRR, 2024
A Tale of Two Domains: Exploring Efficient Architecture Design for Truly Autonomous Things.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
Cicero: Addressing Algorithmic and Architectural Bottlenecks in Neural Rendering by Radiance Warping and Memory Optimizations.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-To-Coarse Attention.
Proceedings of the IEEE International Conference on Acoustics, 2024
JUNO: Optimizing High-Dimensional Approximate Nearest Neighbour Search with Sparsity-Aware Algorithm and Ray-Tracing Core Mapping.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Fractal: Joint Multi-Level Sparse Pattern Tuning of Accuracy and Performance for DNN Pruning.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
GMLake: Efficient and Transparent GPU Memory Defragmentation for Large-scale DNN Training with Virtual Memory Stitching.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
Improving Cluster Utilization Through Adaptive Resource Management for Deep Neural Network and CPU Jobs Colocation.
IEEE Trans. Computers, December, 2023
Accelerating Generic Graph Neural Networks via Architecture, Compiler, Partition Method Co-Design.
CoRR, 2023
DFlow: Efficient Dataflow-based Invocation Workflow Execution for Function-as-a-Service.
CoRR, 2023
ImaGen: A General Framework for Generating Memory- and Power-Efficient Image Processing Accelerators.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
FIRST: Exploiting the Multi-Dimensional Attributes of Functions for Power-Aware Serverless Computing.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
PAC: Preference-Aware Co-location Scheduling on Heterogeneous NUMA Architectures To Improve Resource Utilization.
Proceedings of the 37th International Conference on Supercomputing, 2023
Chimera: An Analytical Optimizing Framework for Effective Compute-intensive Operators Fusion.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
Not All Resources are Visible: Exploiting Fragmented Shadow Resources in Shared-State Scheduler Architecture.
Proceedings of the 2023 ACM Symposium on Cloud Computing, SoCC 2023, 2023
Proceedings of the 20th ACM International Conference on Computing Frontiers, 2023
Proceedings of the 20th ACM International Conference on Computing Frontiers, 2023
uGrapher: High-Performance Graph Operator Computation via Unified Abstraction for Graph Neural Networks.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
Efficient Activation Quantization via Adaptive Rounding Border for Post-Training Quantization.
CoRR, 2022
Proceedings of the 18th International Conference on Mobility, Sensing and Networking, 2022
ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Proceedings of the IEEE 33rd International Symposium on Software Reliability Engineering, 2022
PAME: precision-aware multi-exit DNN serving for reducing latencies of batched inferences.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022
Proceedings of the Tenth International Conference on Learning Representations, 2022
Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022
Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022
SALO: an efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022
VELTAIR: towards high-performance multi-tenant deep learning services via adaptive compilation and scheduling.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
2021
Erratum to "Predictive Guardbanding: Program-Driven Timing Margin Reduction for GPUs".
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2021
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2021
System-level Early-stage Modeling and Evaluation of IVR-assisted Processor Power Delivery System.
ACM Trans. Archit. Code Optim., 2021
ZIPPER: Exploiting Tile- and Operator-level Parallelism for General and Scalable Graph Neural Network Acceleration.
CoRR, 2021
Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction.
Proceedings of the International Conference for High Performance Computing, 2021
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
AlphaR: Learning-Powered Resource Management for Irregular, Dynamic Microservice Graph.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021
Characterizing and Demystifying the Implicit Convolution Algorithm on Commercial Matrix-Multiplication Accelerators.
Proceedings of the IEEE International Symposium on Workload Characterization, 2021
Dubhe: Towards Data Unbiasedness with Homomorphic Encryption in Federated Learning Client Selection.
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021
Proceedings of the 39th IEEE International Conference on Computer Design, 2021
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021
2020
Voltage-Stacked Power Delivery Systems: Reliability, Efficiency, and Power Management.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020
J. Parallel Distributed Comput., 2020
Probabilistic robust regression with adaptive weights - a case study on face recognition.
Frontiers Comput. Sci., 2020
CoRR, 2020
Towards QoS-Aware and Resource-Efficient GPU Microservices Based on Spatial Multitasking GPUs In Datacenters.
CoRR, 2020
Survey and design of paleozoic: a high-performance compiler tool chain for deep learning inference accelerator.
CCF Trans. High Perform. Comput., 2020
IEEE Comput. Archit. Lett., 2020
Proceedings of the International Conference for High Performance Computing, 2020
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
DLFusion: An Auto-Tuning Compiler for Layer Fusion on Deep Neural Network Accelerator.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2020
Sturgeon: Preference-aware Co-location for Improving Utilization of Power Constrained Computers.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020
URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020
Proceedings of the 40th IEEE International Conference on Distributed Computing Systems, 2020
Asymmetric Resilience: Exploiting Task-Level Idempotency for Transient Error Recovery in Accelerator-Based Systems.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020
Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020
How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention.
Proceedings of the 28th International Conference on Computational Linguistics, 2020
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020
2019
IEEE Trans. Computers, 2019
Bandwidth and Locality Aware Task-stealing for Manycore Architectures with Bandwidth-Asymmetric Memory.
ACM Trans. Archit. Code Optim., 2019
Characterizing Perception Module Performance and Robustness in Production-Scale Autonomous Driving System.
Proceedings of the Network and Parallel Computing, 2019
Themis: Predicting and Reining in Application-Level Slowdown on Spatial Multitasking GPUs.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019
Proceedings of the 25th IEEE International Symposium on On-Line Testing and Robust System Design, 2019
Avalon: towards QoS awareness and improved utilization through multi-resource management in datacenters.
Proceedings of the ACM International Conference on Supercomputing, 2019
Ebird: Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services.
Proceedings of the 37th IEEE International Conference on Computer Design, 2019
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
2018
Voltage-Stacked GPUs: A Control Theory Driven Cross-Layer Solution for Practical Voltage Stacking in GPUs.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Proceedings of the 36th IEEE International Conference on Computer Design, 2018
Efficient and reliable power delivery in voltage-stacked manycore system with hybrid charge-recycling regulators.
Proceedings of the 55th Annual Design Automation Conference, 2018
2017
Proceedings of the 54th Annual Design Automation Conference, 2017
2015
Proceedings of the 48th International Symposium on Microarchitecture, 2015
Proceedings of the 48th International Symposium on Microarchitecture, 2015
GPU voltage noise: Characterization and hierarchical smoothing of spatial and temporal voltage noise interference in GPU architectures.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
2014
IEEE Comput. Archit. Lett., 2014
Proceedings of the International Symposium on Low Power Electronics and Design, 2014
2013
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013