Tao Zhang

Affiliations:
  • NVIDIA Corporation, Santa Clara, CA, USA
  • Pennsylvania State University, Department of Computer Science and Engineering, PA, USA (former)


According to our database1, Tao Zhang authored at least 31 papers between 2009 and 2024.

Collaborative distances:
  • Dijkstra number2 of four.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

On csauthors.net:

Bibliography

2024
Toward CXL-Native Memory Tiering via Device-Side Profiling.
CoRR, 2024

NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

2023
HBP: Hierarchically Balanced Pruning and Accelerator Co-Design for Efficient DNN Inference.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Polaris: Enhancing CXL-based Memory Expanders with Memory-side Prefetching.
Proceedings of the Advanced Parallel Processing Technologies, 2023

2022
Flatfish: A Reinforcement Learning Approach for Application-Aware Address Mapping.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

2020
DARB: A Density-Adaptive Regular-Block Pruning for Deep Neural Networks.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
RC-NVM: Dual-Addressing Non-Volatile Memory Architecture Supporting Both Row and Column Memory Accesses.
IEEE Trans. Computers, 2019

DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks.
CoRR, 2019

Sparse Tensor Core: Algorithm and Hardware Co-Design for Vector-wise Sparse Neural Networks on Modern GPUs.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

2018
RC-NVM: Enabling Symmetric Row and Column Memory Accesses for In-memory Databases.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

2016
BACH: A Bandwidth-Aware Hybrid Cache Hierarchy Design with Nonvolatile Memories.
J. Comput. Sci. Technol., 2016

Building a Low Latency, Highly Associative DRAM Cache with the Buffered Way Predictor.
Proceedings of the 28th International Symposium on Computer Architecture and High Performance Computing, 2016

PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Fine-granularity tile-level parallelism in non-volatile memory architecture with two-dimensional bank subdivision.
Proceedings of the 53rd Annual Design Automation Conference, 2016

2015
NVMain 2.0: A User-Friendly Memory Simulator to Model (Non-)Volatile Memory Systems.
IEEE Comput. Archit. Lett., 2015

Overcoming the challenges of crossbar resistive memory architectures.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

2014
Half-DRAM: A high-bandwidth and low-power DRAM architecture from the rethinking of fine-grained activation.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

Using multi-level cell STT-RAM for fast and energy-efficient local checkpointing.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2014

CREAM: A Concurrent-Refresh-Aware DRAM Memory architecture.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

TSV power supply array electromigration lifetime analysis in 3D ICS.
Proceedings of the Great Lakes Symposium on VLSI 2014, GLSVLSI '14, Houston, TX, USA - May 21, 2014

3D-SWIFT: a high-performance 3D-stacked wide IO DRAM.
Proceedings of the Great Lakes Symposium on VLSI 2014, GLSVLSI '14, Houston, TX, USA - May 21, 2014

2013
Lazy Precharge: An overhead-free method to reduce precharge overhead for memory parallelism improvement of DRAM system.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

Thermomechanical stress-aware management for 3D IC designs.
Proceedings of the Design, Automation and Test in Europe, 2013

An efficient run-time encryption scheme for non-volatile main memory.
Proceedings of the International Conference on Compilers, 2013

2011
MorphCache: A Reconfigurable Adaptive Multi-level Cache hierarchy.
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

Using NEM relay to improve 3DIC cost efficiency.
Proceedings of the 2011 IEEE International 3D Systems Integration Conference (3DIC), Osaka, Japan, January 31, 2011

2010
A customized design of DRAM controller for on-chip 3D DRAM stacking.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2010

A 3D SoC design for H.264 application with on-chip DRAM stacking.
Proceedings of the IEEE International Conference on 3D System Integration, 2010

2009
CheckerCore: enhancing an FPGA soft core to capture worst-case execution times.
Proceedings of the 2009 International Conference on Compilers, 2009

Arithmetic unit design using 180nm TSV-based 3D stacking technology.
Proceedings of the IEEE International Conference on 3D System Integration, 2009

Investigation and comparison of thermal distribution in synchronous and asynchronous 3D ICs.
Proceedings of the IEEE International Conference on 3D System Integration, 2009


  Loading...