Wei-Chung Hsu

Shih-Wei Liao

Proceedings of the ICPP Workshops 2021: 50th International Conference on Parallel Processing, 2021

2019

Exploiting SIMD Asymmetry in ARM-to-x86 Dynamic Binary Translation.

ACM Trans. Archit. Code Optim., 2019

Processor-Tracing Guided Region Formation in Dynamic Binary Translation.

ACM Trans. Archit. Code Optim., 2019

Optimizing data permutations in structured loads/stores translation and SIMD register mapping for a cross-ISA dynamic binary translator.

J. Syst. Archit., 2019

Efficient Dynamic Device Placement for Deep Neural Network Training on Heterogeneous Systems.

Zi Xuan Huang

Sheng-Yu Fu

Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2019

Exploiting Vector Processing in Dynamic Binary Translation.

Proceedings of the 48th International Conference on Parallel Processing, 2019

Translating Traditional SIMD Instructions to Vector Length Agnostic Architectures.

Sheng-Yu Fu

Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

2018

Exploring hidden coherency of Ray-Tracing for heterogeneous systems using online feedback methodology.

Vis. Comput., 2018

Improving SIMD Parallelism via Dynamic Binary Translation.

ACM Trans. Embed. Comput. Syst., 2018

Efficient and retargetable SIMD translation in a dynamic binary translator.

Softw. Pract. Exp., 2018

Efficient synthetic light field generation using adaptive multi-level rendering.

Liang-Chi Tseng

Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems, 2018

Dynamic tuning of applications using restricted transactional memory.

Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems, 2018

Automatically Migrating Sequential Applications to Heterogeneous System Architecture.

Proceedings of the 2018 International Conference on High Performance Computing & Simulation, 2018

Exploiting SIMD capability in an ARMv7-to-ARMv8 dynamic binary translator.

Proceedings of the International Conference on Compilers, 2018

2017

A Pipeline-Based Ray-Tracing Runtime System for HSA-Compliant Frameworks.

Yu-Tsung Miao

IEEE Trans. Multim., 2017

On Static Binary Translation of ARM/Thumb Mixed ISA Binaries.

ACM Trans. Embed. Comput. Syst., 2017

ReRanz: A Light-Weight Virtual Machine to Mitigate Memory Disclosure Attacks.

Proceedings of the 13th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2017

Adaptive runtime exploiting sparsity in tensor of deep learning neural network on heterogeneous systems.

Proceedings of the 2017 International Conference on Embedded Computer Systems: Architectures, 2017

Efficient Synthetic Light Field Rendering on Heterogeneous Systems Using a Pipeline-Based Runtime Design.

Liang-Chi Tseng

Proceedings of the International Conference on Research in Adaptive and Convergent Systems, 2017

Dynamic translation of structured Loads/Stores and register mapping for architectures with SIMD extensions.

Proceedings of the 18th ACM SIGPLAN/SIGBED Conference on Languages, 2017

Exploiting Asymmetric SIMD Register Configurations in ARM-to-x86 Dynamic Binary Translation.

Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016

Optimizing Control Transfer and Memory Virtualization in Full System Emulators.

ACM Trans. Archit. Code Optim., 2016

Building a KVM-based Hypervisor for a Heterogeneous System Architecture Compliant System.

Proceedings of the 12th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2016

HSAemu 2.0: Full System Emulation for HSA platforms with Soft-MMU.

Proceedings of the International Conference on Research in Adaptive and Convergent Systems, 2016

Exploiting Longer SIMD Lanes in Dynamic Binary Translation.

Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

A pipeline-based runtime technique for improving Ray-Tracing on HSA-compliant systems.

Yu-Tsung Miao

Proceedings of the IEEE International Conference on Multimedia and Expo, 2016

2015

An Adaptive Heterogeneous Runtime Framework for Irregular Applications.

J. Signal Process. Syst., 2015

A dynamic binary translation system in a client/server environment.

J. Syst. Archit., 2015

Automatic validation for binary translation.

Comput. Lang. Syst. Struct., 2015

HSPT: Practical Implementation and Efficient Management of Embedded Shadow Page Tables for Cross-ISA System Virtual Machines.

Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2015

SIMD Code Translation in an Enhanced HQEMU.

Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

Runtime techniques for efficient Ray-Tracing on heterogeneous systems.

Proceedings of the 2015 IEEE International Conference on Digital Signal Processing, 2015

Improving SIMD code generation in QEMU.

Sheng-Yu Fu

Jan-Jan Wu

Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

2014

Efficient and Retargetable Dynamic Binary Translation on Multicores.

IEEE Trans. Parallel Distributed Syst., 2014

Extended Instruction Exploration for Multiple-Issue Architectures.

ACM Trans. Embed. Comput. Syst., 2014

A Retargetable Static Binary Translator for the ARM Architecture.

Bor-Yeh Shen

Wuu Yang

ACM Trans. Archit. Code Optim., 2014

DBILL: an efficient and retargetable dynamic binary instrumentation framework using llvm backend.

Proceedings of the 10th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2014

Efficient memory virtualization for Cross-ISA system mode emulation.

Proceedings of the 10th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2014

An Adaptive Heterogeneous Runtime for Irregular Applications in the Case of Ray-Tracing (Extended Abstract).

Proceedings of the Network and Parallel Computing, 2014

Author retrospective for code scheduling and register allocation in large basic blocks.

Proceedings of the ACM International Conference on Supercomputing 25th Anniversary Volume, 2014

HSAemu - A full system emulator for HSA platforms.

Proceedings of the 2014 International Conference on Hardware/Software Codesign and System Synthesis, 2014

Dynamic and Adaptive Calling Context Encoding.

Proceedings of the 12th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2014

2013

The design and implementation of heterogeneous multicore systems for energy-efficient speculative thread execution.

Yangchun Luo

Antonia Zhai

ACM Trans. Archit. Code Optim., 2013

Improving dynamic binary optimization through early-exit guided code region formation.

Proceedings of the ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (co-located with ASPLOS 2013), 2013

Effective code discovery for ARM/Thumb mixed ISA binaries in a static binary translator.

Proceedings of the International Conference on Compilers, 2013

2012

Effectiveness of Compiler-Directed Prefetching on Data Mining Benchmarks.

J. Circuits Syst. Comput., 2012

Design of communication interface and control system for intelligent humanoid robot.

Comput. Appl. Eng. Educ., 2012

An LLVM-based hybrid binary translation system.

Proceedings of the 7th IEEE International Symposium on Industrial Embedded Systems, 2012

HQEMU: a multi-threaded and retargetable dynamic binary translator on multicores.

Proceedings of the 10th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2012

LLBT: an LLVM-based static binary translator.

Proceedings of the 15th International Conference on Compilers, 2012

A hybrid just-in-time compiler for android: comparing JIT types and the result of cooperation.

Proceedings of the 15th International Conference on Compilers, 2012

2011

Efficient and effective misaligned data access handling in a dynamic binary translation system.

Jianjun Li

Chenggang Wu

ACM Trans. Archit. Code Optim., 2011

LnQ: Building High Performance Dynamic Binary Translators with Existing Compiler Backends.

Proceedings of the International Conference on Parallel Processing, 2011

PQEMU: A Parallel System Emulator Based on QEMU.

Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

Dynamic register promotion of stack variables.

Jianjun Li

Chenggang Wu

Proceedings of the CGO 2011, 2011

A method-based ahead-of-time compiler for android applications.

Proceedings of the 14th International Conference on Compilers, 2011

2010

Local-loop based robot action control module using independent microprocessors.

Comput. Appl. Eng. Educ., 2010

Energy efficient speculative threads: dynamic thread allocation in Same-ISA heterogeneous multicore systems.

Yangchun Luo

Antonia Zhai

Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2009

Exploring speculative parallelism in SPEC2006.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2009

Dynamic performance tuning for speculative threads.

Yangchun Luo

Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

Reducing Code Size by Graph Coloring Register Allocation and Assignment Algorithm for Mixed-Width ISA Processor.

Proceedings of the 12th IEEE International Conference on Computational Science and Engineering, 2009

An Evaluation of Misaligned Data Access Handling Mechanisms in Dynamic Binary Translation Systems.

Jianjun Li

Chenggang Wu

Proceedings of the CGO 2009, 2009

2008

Sufficient sunlight supply for home care using local closed-loop shutter control system.

Proceedings of the IEEE International Conference on Systems, 2008

08441 Final Report - Emerging Uses and Paradigms for Dynamic Binary Translation.

Proceedings of the Emerging Uses and Paradigms for Dynamic Binary Translation, 26.10., 2008

2007

CIM: A Reliable Metric for Evaluating Program Phase Classifications.

Sreekumar V. Kodakara

IEEE Comput. Archit. Lett., 2007

Analysis of Statistical Sampling in Microarchitecture Simulation: Metric, Methodology and Program Characterization.

Sreekumar V. Kodakara

Proceedings of the IEEE 10th International Symposium on Workload Characterization, 2007

An Architecture for the Interoperability of Multimedia Messaging Services between GPRS and PHS Cellular Networks.

Wen-Chuan Hsieh

Yu-Yuan Hsu

Proceedings of the 3rd International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2007), 2007

COBRA: An Adaptive Runtime Binary Optimization Framework for Multithreaded Applications.

Jinpyo Kim

Pen-Chung Yew

Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Entropy-Based Profile Characterization and Classification for Automatic Profile Management.

Proceedings of the Advances in Computer Systems Architecture, 2007

2006

Recovery code generation for general speculative optimizations.

ACM Trans. Archit. Code Optim., 2006

Supporting Speculative Multithreading on Simultaneous Multithreaded Processors.

Proceedings of the High Performance Computing, 2006

Region Monitoring for Local Phase Detection in Dynamic Optimization Systems.

Abhinav Das

Jiwei Lu

Proceedings of the Fourth IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2006), 2006

A Study of the Performance Potential for Dynamic Instruction Hints Selection.

Proceedings of the Advances in Computer Systems Architecture, 11th Asia-Pacific Conference, 2006

Issues and Support for Dynamic Register Allocation.

Proceedings of the Advances in Computer Systems Architecture, 11th Asia-Pacific Conference, 2006

2005

Dynamic Helper Threaded Prefetching on the Sun UltraSPARC CMP Processor.

Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-38 2005), 2005

Dynamic Code Region (DCR) Based Program Phase Tracking and Prediction for Dynamic Optimizations.

Jinpyo Kim

Sreekumar V. Kodakara

David J. Lilja

Pen-Chung Yew

Proceedings of the High Performance Embedded Architectures and Compilers, 2005

Performance of Runtime Optimization on BLAST.

Proceedings of the 3rd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2005), 2005

A General Compiler Framework for Speculative Optimizations Using Data Speculative Code Motion.

Proceedings of the 3rd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2005), 2005

2004

A compiler framework for speculative optimizations.

ACM Trans. Archit. Code Optim., 2004

Design and Implementation of a Lightweight Dynamic Optimization System.

J. Instr. Level Parallelism, 2004

Data Dependence Profiling for Speculative Optimizations.

Proceedings of the Compiler Construction, 13th International Conference, 2004

Continuous Adaptive Object-Code Re-optimization Framework.

Proceedings of the Advances in Computer Systems Architecture, 9th Asia-Pacific Conference, 2004

A Compiler Framework for Recovery Code Generation in General Speculative Optimizations.

Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT 2004), 29 September, 2004

2003

A compiler framework for speculative analysis and optimizations.

Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation, 2003

The Performance of Runtime Data Cache Prefetching in a Dynamic Optimization System.

Proceedings of the 36th Annual International Symposium on Microarchitecture, 2003

Speculative Register Promotion Using Advanced Load Address Table (ALAT).

Proceedings of the 1st IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2003), 2003

Dynamic Trace Selection Using Performance Monitoring Hardware Sampling.

Howard Chen

Dong-yuan Chen

Proceedings of the 1st IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2003), 2003

2002

An Empirical Study on the Granularity of Pointer Analysis in C Programs.

Proceedings of the Languages and Compilers for Parallel Computing, 15th Workshop, 2002

On the Impact of Naming Methods for Heap-Oriented Pointers in C Programs.

Proceedings of the International Symposium on Parallel Architectures, 2002

On the Predictability of Program Behavior Using Different Input Data Sets.

Proceedings of the 6th Annual Workshop on Interaction between Compilers and Computer Architecture (INTERACT-6 2002), 2002

1998

A Performance Study of Instruction Cache Prefetching Methods.

IEEE Trans. Computers, 1998

1997

Data Prefetching on the HP PA-8000.

Vatsa Santhanam

Edward H. Gornish

Proceedings of the 24th International Symposium on Computer Architecture, 1997

1996

Instruction Scheduling for the HP PA-8000.

David A. Dunn

Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture, 1996

1993

Toward Effective Scalar Hardware for Highly Vectorizable Applications.

J. Parallel Distributed Comput., 1993

Performance of Cached DRAM Organizations in Vector Supercomputers.

Proceedings of the 20th Annual International Symposium on Computer Architecture, 1993

1992

Prefetching in Supercomputer Instruction Caches.

Proceedings of Supercomputing '92, 1992

On the instruction-level characteristics of scalar code in highly-vectorized scientific applications.

Proceedings of the 25th Annual International Symposium on Microarchitecture, 1992

1991

An Empirical Study of the CRAY Y-MP Processor Using the Perfect Club Benchmarks.

Gurindar S. Sohi

Proceedings of the 18th Annual International Symposium on Computer Architecture. Toronto, 1991

1990

The use of intermediate memories for low-latency memory access in supercomputer scalar units.

Gurindar S. Sohi

J. Supercomput., 1990

Future general purpose supercomputer architectures.

Christopher C. Hsiung

Proceedings of Supercomputing '90, New York, NY, USA, November 12-16, 1990

Exploitation of operation-level parallelism in a processor of the CRAY X-MP.

Gurindar S. Sohi

Proceedings of the 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors, 1990

1989

On the Minimization of Loads/Stores in Local Register Allocation.

Charles N. Fischer

IEEE Trans. Software Eng., 1989

1988

Code scheduling and register allocation in large basic blocks.

Proceedings of the 2nd international conference on Supercomputing, 1988

1987

WISQ: A Restartable Architecture Using Queues.

Proceedings of the 14th Annual International Symposium on Computer Architecture. Pittsburgh, 1987

1986

On the Use of Registers vs. Cache to Minimize Memory Traffic.
