Yuan-Shin Hwang

Orcid: 0000-0002-1348-082X

According to our database, Yuan-Shin Hwang authored at least 45 papers between 1994 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







Enhancing LLVM Optimizations for Linear Recurrence Programs on RVV.
Proceedings of the 52nd International Conference on Parallel Processing Workshops, 2023

Pointer-Based Divergence Analysis for OpenCL 2.0 Programs.
ACM Trans. Parallel Comput., 2021

A framework for scheduling dependent programs on GPU architectures.
J. Syst. Archit., 2020

Support OpenCL 2.0 Compiler on LLVM for PTX Simulators.
J. Signal Process. Syst., 2019

GPUBlocks: GUI Programming Tool for CUDA and OpenCL.
J. Signal Process. Syst., 2019

Devise Rust Compiler Optimizations on RISC-V Architectures with SIMD Instructions.
Proceedings of the 48th International Conference on Parallel Processing, 2019

Architecture and Compiler Support for GPUs Using Energy-Efficient Affine Register Files.
ACM Trans. Design Autom. Electr. Syst., 2018

Scheduling Methods to Optimize Dependent Programs for GPU Architecture.
Proceedings of the 47th International Conference on Parallel Processing, 2018

Graph Support and Scheduling for OpenCL on Heterogeneous Multi-core Systems.
Proceedings of the 47th International Conference on Parallel Processing, 2018

Floating accumulator architecture.
Microprocess. Microsystems, 2017

Enabling PoCL-based runtime frameworks on the HSA for OpenCL 2.0 support.
J. Syst. Archit., 2017

Analyzing OpenCL 2.0 workloads using a heterogeneous CPU-GPU simulator.
Proceedings of the 2017 IEEE International Symposium on Performance Analysis of Systems and Software, 2017

OpenCL 2.0 Compiler Adaptation on LLVM for PTX Simulators.
Proceedings of the 46th International Conference on Parallel Processing Workshops, 2017

Energy Efficient Affine Register File for GPU Microarchitecture.
Proceedings of the 45th International Conference on Parallel Processing Workshops, 2016

CUDABlock: A GUI Programming Tool for CUDA.
Proceedings of the 44th International Conference on Parallel Processing Workshops, 2015

Support of Probabilistic Pointer Analysis in the SSA Form.
IEEE Trans. Parallel Distributed Syst., 2012

Doubling the number of registers on ARM processors.
Proceedings of the 16th Workshop on Interaction between Compilers and Computer Architectures, 2012

DisIRer: Converting a retargetable compiler into a multiplatform binary translator.
ACM Trans. Archit. Code Optim., 2010

On reducing load/store latencies of cache accesses.
J. Syst. Archit., 2010

Trading Conditional Execution for More Registers on ARM Processors.
Proceedings of the IEEE/IFIP 8th International Conference on Embedded and Ubiquitous Computing, 2010

Indirect-Mapped Caches: Approximating Set-Associativity with Direct-Mapped Caches.
Proceedings of the 2009 International Conference on Computer Design, 2009

Snug set-associative caches: Reducing leakage power of instruction and data caches with no performance penalties.
ACM Trans. Archit. Code Optim., 2007

Dynamic Load-Balancing of Jini and .NET Services.
Proceedings of the 2006 International Conference on Parallel Processing Workshops (ICPP Workshops 2006), 2006

Dynamic Load-Balancing of Jini Services with Smart Proxies.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2005

Snug set-associative caches: reducing leakage power while improving performance.
Proceedings of the 2005 International Symposium on Low Power Electronics and Design, 2005

Minimal Steiner Trees in X Architecture with Obstacles.
Proceedings of the 2005 International Conference on Computer Design, 2005

Interprocedural Probabilistic Pointer Analysis.
IEEE Trans. Parallel Distributed Syst., 2004

Novel Hierarchical Interconnection Networks for High-Performance Multicomputer Systems.
J. Inf. Sci. Eng., 2004

Hierarchical Interconnection Networks Based on (3, 3)-Graphs for Massively Parallel Processors.
IEICE Trans. Inf. Syst., 2004

An Efficient Algorithm for Perfect Load Balancing on Hypercube Multiprocessors.
J. Supercomput., 2003

Interprocedural definition-use chains of dynamic pointer-linked data structures.
Sci. Program., 2003

Identifying parallelism in programs with cyclic graphs.
J. Parallel Distributed Comput., 2003

Compiler support for speculative multithreading architecture with probabilistic points-to analysis.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2003

Parallelizing graph construction operations in programs with cyclic graphs.
Parallel Comput., 2002

Compiler Optimizations with DSP-Specific Semantic Descriptions.
Proceedings of the Languages and Compilers for Parallel Computing, 15th Workshop, 2002

Probabilistic Points-to Analysis.
Proceedings of the Languages and Compilers for Parallel Computing, 2001

Runtime and Compiler Support for Irregular Computations.
Proceedings of the Compiler Optimizations for Scalable Parallel Systems Languages, 2001

Programming Irregular Applications: Runtime Support, Compilation and Tools.
Adv. Comput., 1997

Identifying DEF/USE Information of Statements that Construct and Traverse Dynamic Recursive Data Structures.
Proceedings of the Languages and Compilers for Parallel Computing, 1997

Side Effect Analysis on User-Defined Reduction Functions with Dynamic Pointer-Linked Data Structures.
Proceedings of the Languages and Compilers for Parallel Computing, 1996

Runtime Support and Compilation Methods for User-Specified Irregular Data Distributions.
IEEE Trans. Parallel Distributed Syst., 1995

Runtime and Language Support for Compiling Adaptive Irregular Programs on Distributed-memory Machines.
Softw. Pract. Exp., 1995

Supporting irregular distributions using data-parallel languages.
IEEE Parallel Distributed Technol. Syst. Appl., 1995

Communication Optimizations for Irregular Scientific Computations on Distributed Memory Architectures.
J. Parallel Distributed Comput., 1994

Run-time and compile-time support for adaptive irregular problems.
Proceedings of Supercomputing '94, 1994
