Perry H. Wang

According to our database1, Perry H. Wang authored at least 20 papers between 2001 and 2024.

Collaborative distances:
  • Dijkstra number2 of four.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

On csauthors.net:

Bibliography

2024
OzMAC: An Energy-Efficient Sparsity-Exploiting Multiply-Accumulate-Unit Design for DL Inference.
CoRR, 2024

Commercial Evaluation of Zero-Skipping MAC Design for Bit Sparsity Exploitation in DL Inference.
Proceedings of the 32nd IFIP/IEEE International Conference on Very Large Scale Integration, 2024

2023
tubGEMM: Energy-Efficient and Sparsity-Effective Temporal-Unary-Binary Based Matrix Multiply Unit.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2023

2020
7.1 A 3.4-to-13.3TOPS/W 3.6TOPS Dual-Core Deep-Learning Accelerator for Versatile AI Applications in 7nm 5G Smartphone SoC.
Proceedings of the 2020 IEEE International Solid- State Circuits Conference, 2020

2011
Bothnia: a dual-personality extension to the Intel integrated graphics driver.
ACM SIGOPS Oper. Syst. Rev., 2011

2010
Intel nehalem processor core made FPGA synthesizable.
Proceedings of the ACM/SIGDA 18th International Symposium on Field Programmable Gate Arrays, 2010

2009
Intel® atom<sup>TM</sup> processor core made FPGA-synthesizable.
Proceedings of the ACM/SIGDA 17th International Symposium on Field Programmable Gate Arrays, 2009

2008
Pangaea: a tightly-coupled IA32 heterogeneous chip multiprocessor.
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008

2007
EXOCHI: architecture and programming environment for a heterogeneous multi-core multithreaded system.
Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, 2007

Sequencer virtualization.
Proceedings of the 21th Annual International Conference on Supercomputing, 2007

2006
Multiple Instruction Stream Processor.
Proceedings of the 33rd International Symposium on Computer Architecture (ISCA 2006), 2006

2005
A Dependency Chain Clustered Microarchitecture.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

2004
Helper Threads via Virtual Multithreading.
IEEE Micro, 2004

Physical Experimentation with Prefetching Helper Threads on Intel's Hyper-Threaded Processors.
Proceedings of the 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 2004

Helper threads via virtual multithreading on an experimental itanium<sup>®</sup> 2 processor-based platform.
Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004

2003
Inferno: a functional simulation infrastructure for modeling microarchitectural data speculations.
Proceedings of the 2003 IEEE International Symposium on Performance Analysis of Systems and Software, 2003

2002
Post-Pass Binary Adaptation for Software-Based Speculative Precomputation.
Proceedings of the 2002 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2002

Memory Latency-Tolerance Approaches for Itanium Processors: Out-of-Order Execution vs. Speculative Precomputation.
Proceedings of the Eighth International Symposium on High-Performance Computer Architecture (HPCA'02), 2002

Quantitative Evaluation of the Register Stack Engine and Optimizations for Future Itanium Processors.
Proceedings of the 6th Annual Workshop on Interaction between Compilers and Computer Architecture (INTERACT-6 2002), 2002

2001
Register Renaming and Scheduling for Dynamic Execution of Predicated Code.
Proceedings of the Seventh International Symposium on High-Performance Computer Architecture (HPCA'01), 2001


  Loading...