Mingyu Gao

Orcid: 0000-0001-8433-7281

Affiliations:
  • Tsinghua University, Beijing, China
  • Stanford University, Stanford, CA, USA


According to our database, Mingyu Gao authored at least 35 papers between 2015 and 2024.

Bibliography

2024
PimPam: Efficient Graph Pattern Matching on Real Processing-in-Memory Hardware.
Proc. ACM Manag. Data, 2024

A system capable of verifiably and privately screening global DNA synthesis.
CoRR, 2024

Bulkor: Enabling Bulk Loading for Path ORAM.
Proceedings of the IEEE Symposium on Security and Privacy, 2024

NDPBridge: Enabling Cross-Bank Coordination in Near-DRAM-Bank Processing Architectures.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

Seesaw: Compensating for Nonlinear Reduction with Linear Computations for Private Inference.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

Trimma: Trimming Metadata Storage and Latency for Hybrid Memory Systems.
Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, 2024

2023
Optimizing DNNs With Partially Equivalent Transformations and Automated Corrections.
IEEE Trans. Computers, December, 2023

SODA: A Set of Fast Oblivious Algorithms in Distributed Secure Data Analytics.
Proc. VLDB Endow., 2023

FLARE: A Fast, Secure, and Memory-Efficient Distributed Analytics Framework (Flavor: Systems).
Proc. VLDB Endow., 2023

When Tree Meets Hash: Reducing Random Reads for Index Structures on Persistent Memories.
Proc. ACM Manag. Data, 2023

KAPLA: Pragmatic Representation and Fast Solving of Scalable NN Accelerator Dataflow.
CoRR, 2023

Honeycomb: Secure and Efficient GPU Executions via Static Validation.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Baryon: Efficient Hybrid Memory Management with Compression and Sub-Blocking.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

ABNDP: Co-optimizing Data Access and Load Balance in Near-Data Processing.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

GZKP: A GPU Accelerated Zero-Knowledge Proof System.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

Spada: Accelerating Sparse Matrix Multiplication with Adaptive Dataflow.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

Secure MLaaS with Temper: Trusted and Efficient Model Partitioning and Enclave Reuse.
Proceedings of the Annual Computer Security Applications Conference, 2023

2022
PPMLAC: high performance chipset architecture for secure multi-party computation.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

ShEF: shielded enclaves for cloud FPGAs.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

FINGERS: exploiting fine-grained parallelism in graph mining accelerators.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections.
Proceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation, 2021

PipeZK: Accelerating Zero-Knowledge Proof with a Pipelined Architecture.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

2020
Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc.
Proceedings of the Third Conference on Machine Learning and Systems, 2020

Interstellar: Using Halide's Scheduling Language to Analyze DNN Accelerators.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

2019
Optimizing DNN Computation with Relaxed Graph Substitutions.
Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019

TANGRAM: Optimized Coarse-Grained Dataflow for Scalable NN Accelerators.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018
DNN Dataflow Choice Is Overrated.
CoRR, 2018

GraphP: Reducing Communication for PIM-Based Graph Processing with Efficient Data Partition.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

2017
DRAF: A Low-Power DRAM-Based Reconfigurable Acceleration Fabric.
IEEE Micro, 2017

3D nanosystems enable embedded abundant-data computing: special session paper.
Proceedings of the Twelfth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis Companion, 2017

TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

2016
HRL: Efficient and flexible reconfigurable logic for near-data processing.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

2015
Energy-Efficient Abundant-Data Computing: The N3XT 1,000x.
Computer, 2015

Practical Near-Data Processing for In-Memory Analytics Frameworks.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

