2024
Simulation and parametric optimal design of active radial magnetic fluid bearing.
Int. J. Model. Identif. Control., 2024
Unleashing CPU Potential for Executing GPU Programs Through Compiler/Runtime Optimizations.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024
Enabling Fine-Grained Incremental Builds by Making Compiler Stateful.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2024
2023
Concrete Type Inference for Code Optimization using Machine Learning with SMT Solving.
Proc. ACM Program. Lang., October, 2023
HIPLZ: Enabling performance portability for exascale systems.
Concurr. Comput. Pract. Exp., 2023
2022
Study on the law of the structure parameters influence on thermal deformation of magnetic poles of magnetic-liquid double suspension bearing.
Int. J. Comput. Appl. Technol., 2022
2021
Attention Mechanism-Based CNN-LSTM Model for Wind Turbine Fault Prediction Using SSN Ontology Annotation.
Wirel. Commun. Mob. Comput., 2021
Study on temperature rise and thermal deformation of rotor caused by eddy current loss of magnetic-liquid double suspension bearing.
Int. J. Model. Identif. Control., 2021
Identifying Behavior Dispatchers for Malware Analysis.
Proceedings of the ASIA CCS '21: ACM Asia Conference on Computer and Communications Security, 2021
2020
Advanced Graph-Based Deep Learning for Probabilistic Type Inference.
CoRR, 2020
OmpMemOpt: Optimized Memory Movement for Heterogeneous Computing.
Proceedings of the Euro-Par 2020: Parallel Processing, 2020
2018
Compile-Time Library Call Detection Using CAASCADE and XALT.
Proceedings of the High Performance Computing, 2018
Detecting MPI usage anomalies via partial program symbolic execution.
Proceedings of the International Conference for High Performance Computing, 2018
Parallel sparse flow-sensitive points-to analysis.
Proceedings of the 27th International Conference on Compiler Construction, 2018
2015
Finding Tizen security bugs through whole-system static analysis.
CoRR, 2015
LLVM-based communication optimizations for PGAS programs.
Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, 2015
Parallelizing a discrete event simulation application using the Habanero-Java multicore library.
Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores, 2015
A Composable Deadlock-Free Approach to Object-Based Isolation.
Proceedings of the Euro-Par 2015: Parallel Processing, 2015
2014
Inter-iteration Scalar Replacement Using Array SSA Form.
Proceedings of the Compiler Construction - 23rd International Conference, 2014
2013
A Transformation Framework for Optimizing Task-Parallel Programs.
ACM Trans. Program. Lang. Syst., 2013
A decoupled non-SSA global register allocation using bipartite liveness graphs.
ACM Trans. Archit. Code Optim., 2013
Accelerating Habanero-Java programs with OpenCL generation.
Proceedings of the 2013 International Conference on Principles and Practices of Programming on the Java Platform: Virtual Machines, 2013
Isolation for nested task parallelism.
Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications, 2013
Speculative Execution of Parallel Programs with Precise Exception Semantics on GPUs.
Proceedings of the Languages and Compilers for Parallel Computing, 2013
Compiler-Driven Data Layout Transformation for Heterogeneous Platforms.
Proceedings of the Euro-Par 2013: Parallel Processing Workshops, 2013
Interprocedural strength reduction of critical sections in explicitly-parallel programs.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013
2012
Efficient data race detection for async-finish parallelism.
Formal Methods Syst. Des., 2012
Scalable and precise dynamic datarace detection for structured parallelism.
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2012
Finish Accumulators: An Efficient Reduction Construct for Dynamic Task Parallelism.
Proceedings of the Languages and Compilers for Parallel Computing, 2012
Practical Permissions for Race-Free Parallelism.
Proceedings of the ECOOP 2012 - Object-Oriented Programming, 2012
2011
Permission Regions for Race-Free Parallelism.
Proceedings of the Runtime Verification - Second International Conference, 2011
Habanero-Java: the new adventures of old X10.
Proceedings of the 9th International Conference on Principles and Practice of Programming in Java, 2011
Intermediate language extensions for parallelism.
Proceedings of the SPLASH'11 Workshops, 2011
Proceedings of the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2011
The design and implementation of the Habanero-Java parallel programming language.
Proceedings of the Companion to the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2011
Communication Optimizations for Distributed-Memory X10 Programs.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011
2010
Efficient Selection of Vector Instructions Using Dynamic Programming.
Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010
SLAW: A scalable locality-aware adaptive work-stealing scheduler.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010
Reducing task creation and termination overhead in explicitly parallel programs.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010
Automatic vector instruction selection for dynamic compilation.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010
2009
Hierarchical Place Trees: A Portable Abstraction for Task Parallelism and Data Movement.
Proceedings of the Languages and Compilers for Parallel Computing, 2009
2008
Constraint based optimization of stationary fields.
Proceedings of the 6th International Symposium on Principles and Practice of Programming in Java, 2008
Adaptive Loop Tiling for a Multi-cluster CMP.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2008
2007
Optimizing Chip Multiprocessor Work Distribution Using Dynamic Compilation.
Proceedings of the Euro-Par 2007, 2007
2005
Loop Parallelisation for the Jikes RVM.
Proceedings of the Sixth International Conference on Parallel and Distributed Computing, 2005