Tsung Tai Yeh

Orcid: 0000-0002-2401-9916

According to our database, Tsung Tai Yeh authored at least 18 papers between 2010 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
ReSA: Reconfigurable Systolic Array for Multiple Tiny DNN Tensors.
ACM Trans. Archit. Code Optim., September, 2024

TinyTS: Memory-Efficient TinyML Model Compiler Framework on Microcontrollers.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

OC-DLRM: Minimizing the I/O Traffic of DLRM Between Main Memory and OCSSD.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

WER: Maximizing Parallelism of Irregular Graph Applications Through GPU Warp EqualizeR.
Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024

2023
StreamNet: Memory-Efficient Streaming Tiny Deep Learning Inference on the Microcontroller.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

COLAB: Collaborative and Efficient Processing of Replicated Cache Requests in GPU.
Proceedings of the 28th Asia and South Pacific Design Automation Conference, 2023

2022
Lego: Dynamic Tensor-Splitting Multi-Tenant DNN Models on Multi-Chip-Module Architecture.
Proceedings of the 19th International SoC Design Conference, 2022

2021
Deadline-Aware Offloading for High-Throughput Accelerators.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

2020
Dimensionality-Aware Redundant SIMT Instruction Elimination.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

2019
Pagoda: A GPU Runtime System for Narrow Tasks.
ACM Trans. Parallel Comput., 2019

Optimizing GPU Cache Policies for MI Workloads.
CoRR, 2019

Optimizing GPU Cache Policies for MI Workloads.
Proceedings of the IEEE International Symposium on Workload Characterization, 2019

2017
Pagoda: Fine-Grained GPU Resource Virtualization for Narrow Tasks.
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

2016
POSTER: Pagoda: A Runtime System to Maximize GPU Utilization in Data Parallel Tasks with Limited Parallelism.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015
An Energy-Efficient and Reliable Storage Mechanism for Data-Intensive Academic Archive Systems.
ACM Trans. Storage, 2015

2012
CacheRAID: An Efficient Adaptive Write Cache Policy to Conserve RAID Disk Array Energy.
Proceedings of the IEEE Fifth International Conference on Utility and Cloud Computing, 2012

2011
Parallel non-linear dimension reduction algorithm on GPU.
Int. J. Granul. Comput. Rough Sets Intell. Syst., 2011

2010
Efficient Parallel Algorithm for Nonlinear Dimensionality Reduction on GPU.
Proceedings of the 2010 IEEE International Conference on Granular Computing, 2010

