Wei Zhang
ORCID: 0000-0003-1343-2817
Affiliations:
- Virginia Commonwealth University, Compiler, Architecture, and Realtime Systems Lab, Richmond, VA, USA
- Southern Illinois University
- Pennsylvania State University (PhD)
According to our database, Wei Zhang authored at least 155 papers between 2001 and 2021.
Bibliography
2021
J. Circuits Syst. Comput., 2021
2020
Reducing CPU-GPU Interferences to Improve CPU Performance in Heterogeneous Architectures.
J. Comput. Sci. Eng., 2020
Proceedings of the 38th IEEE International Conference on Computer Design, 2020
Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020
Packing Narrow-Width Operands to Improve Energy Efficiency of General-Purpose GPU Computing.
Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020
2019
Proceedings of the National Cyber Summit, 2019
Proceedings of the 2019 International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, 2019
Proceedings of the Advances in Neural Networks - ISNN 2019, 2019
Cracking Randomized Coalescing Techniques with An Efficient Profiling-Based Side-Channel Attack to GPU.
Proceedings of the 8th International Workshop on Hardware and Architectural Support for Security and Privacy, 2019
Improving Parallelism of Breadth First Search (BFS) Algorithm for Accelerated Performance on GPUs.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019
2018
Cache-Aware SPM Allocation Algorithms for Performance and Energy Optimization on Hybrid SPM-Cache Architecture.
J. Comput. Sci. Eng., 2018
Estimating the Worst-Case Execution Time of the Shared Data Cache in Integrated CPU-GPU Architectures.
J. Comput. Sci. Eng., 2018
J. Comput. Sci. Eng., 2018
Cache-Aware SPM Allocation to Reduce Worst-Case Execution Time for Hybrid SPM-Caches.
J. Circuits Syst. Comput., 2018
Reducing Inter-Application Interferences in Integrated CPU-GPU Heterogeneous Architecture.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018
Exploiting GPU with 3D Stacked Memory to Boost Performance for Data-Intensive Applications.
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018
2017
Enhancing GPU Performance by Efficient Hardware-Based and Hybrid L1 Data Cache Bypassing.
J. Comput. Sci. Eng., 2017
J. Comput. Sci. Eng., 2017
A Sample-Based Dynamic CPU and GPU LLC Bypassing Method for Heterogeneous CPU-GPU Architectures.
Proceedings of the 2017 IEEE Trustcom/BigDataSE/ICESS, Sydney, Australia, August 1-4, 2017, 2017
GPU Register Packing: Dynamically Exploiting Narrow-Width Operands to Improve Performance.
Proceedings of the 2017 IEEE Trustcom/BigDataSE/ICESS, Sydney, Australia, August 1-4, 2017, 2017
Proceedings of the 20th IEEE International Symposium on Real-Time Distributed Computing, 2017
Proceedings of the 23rd IEEE International Conference on Parallel and Distributed Systems, 2017
Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017
Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017
2016
Warp-Based Load/Store Reordering to Improve GPU Data Cache Time Predictability and Performance.
Proceedings of the 19th IEEE International Symposium on Real-Time Distributed Computing, 2016
Cache locking vs. partitioning for real-time computing on integrated CPU-GPU processors.
Proceedings of the 35th IEEE International Performance Computing and Communications Conference, 2016
2015
Profiling-based L1 data cache bypassing to improve GPU performance and energy efficiency.
SIGBED Rev., 2015
Scratchpad Memory Architectures and Allocation Algorithms for Hard Real-Time Multicore Processors.
J. Comput. Sci. Eng., 2015
J. Comput. Sci. Eng., 2015
Proceedings of the Sixteenth International Symposium on Quality Electronic Design, 2015
Proceedings of the Sixteenth International Symposium on Quality Electronic Design, 2015
Proceedings of the IEEE 18th International Symposium on Real-Time Distributed Computing, 2015
Proceedings of the IEEE 18th International Symposium on Real-Time Distributed Computing, 2015
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015
2014
J. Comput. Sci. Eng., 2014
Comparing Separate and Statically-Partitioned Caches for Time-Predictable Multicore Processors.
J. Comput. Sci. Eng., 2014
Two-Level Scratchpad Memory Architectures to Achieve Time Predictability and High Performance.
J. Comput. Sci. Eng., 2014
IEEE Embed. Syst. Lett., 2014
Proceedings of the Fifteenth International Symposium on Quality Electronic Design, 2014
Proceedings of the 17th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, 2014
Proceedings of the IEEE 33rd International Performance Computing and Communications Conference, 2014
Proceedings of the IEEE 33rd International Performance Computing and Communications Conference, 2014
Exploiting Hybrid SPM-Cache Architectures to Reduce Energy Consumption for Embedded Computing.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014
Characterizing Energy Consumption of Real-Time and Media Benchmarks on Hybrid SPM-Caches.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014
Improving Energy Efficiency with Dynamic Compiler-Directed Function Unit Power Control.
Proceedings of the 12th IEEE International Conference on Embedded and Ubiquitous Computing, 2014
Hop-Based Priority Scheduling to Improve Worst-Case Inter-core Communication Latency.
Proceedings of the 12th IEEE International Conference on Embedded and Ubiquitous Computing, 2014
Proceedings of the 2014 International Conference on Compilers, 2014
2013
Static worst-case lifetime estimation of wireless sensor networks: A case study on VigilNet.
J. Syst. Archit., 2013
J. Comput. Sci. Eng., 2013
Counter-Based Approaches for Efficient WCET Analysis of Multicore Processors with Shared Caches.
J. Comput. Sci. Eng., 2013
J. Comput. Sci. Eng., 2013
On the interactions between real-time scheduling and inter-thread cached interferences for multicore processors.
Proceedings of the International Symposium on Quality Electronic Design, 2013
Proceedings of the IEEE 32nd International Performance Computing and Communications Conference, 2013
Proceedings of the IEEE 32nd International Performance Computing and Communications Conference, 2013
Compiler-based approach to reducing leakage energy of instruction scratch-pad memories.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013
Proceedings of the 24th International Conference on Application-Specific Systems, 2013
Standard deviation of CPI: A new metric to evaluate architectural time predictability.
Proceedings of the 24th International Conference on Application-Specific Systems, 2013
2012
A Model Checking Based Approach to Bounding Worst-Case Execution Time for Multicore Processors.
ACM Trans. Embed. Comput. Syst., 2012
Architectural time-predictability factor (ATF): a metric to evaluate time predictability of processors.
SIGBED Rev., 2012
On-line Trace Based Automatic Parallelization of Java Programs on Multicore Platforms.
J. Comput. Sci. Eng., 2012
J. Comput. Sci. Eng., 2012
Multicore-Aware Code Co-Positioning to Reduce WCET on Dual-Core Processors with Shared Instruction Caches.
J. Comput. Sci. Eng., 2012
J. Comput. Sci. Eng., 2012
Proceedings of the 30th International IEEE Conference on Computer Design, 2012
Exploiting SPM-aware Scheduling on EPIC architectures for high-performance real-time systems.
Proceedings of the IEEE Conference on High Performance Extreme Computing, 2012
2011
An Interference Matrix Based Approach to Bounding Worst-Case Inter-Thread Cache Interferences and WCET for Multi-Core Processors.
J. Comput. Sci. Eng., 2011
J. Comput. Sci. Eng., 2011
Bounding Worst-Case Performance for Multi-Core Processors with Shared L2 Instruction Caches.
J. Comput. Sci. Eng., 2011
Exploiting Instruction Reuse to Improve the Performance of Dual Instruction Execution.
J. Circuits Syst. Comput., 2011
Proceedings of the 2011 ACM Symposium on Applied Computing (SAC), TaiChung, Taiwan, March 21, 2011
Proceedings of the 2011 ACM Symposium on Applied Computing (SAC), TaiChung, Taiwan, March 21, 2011
Proceedings of the 14th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, 2011
Proceedings of the 2011 IEEE/ACM International Conference on Green Computing and Communications (GreenCom), 2011
Work in progress - Course development of programming for general-purpose multicore processors.
Proceedings of the 2011 Frontiers in Education Conference, 2011
2010
IEEE Trans. Computers, 2010
J. Comput. Sci. Eng., 2010
Time-dependent density functional theory study on the hydrogen bonding-induced twisted intramolecular charge-transfer excited states of 2-(4'-N,N-dimethylaminophenyl)imidazo[4,5-b]pyridine.
J. Comput. Chem., 2010
Int. J. High Perform. Syst. Archit., 2010
Proceedings of the 16th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, 2010
Proceedings of the 48th Annual Southeast Regional Conference, 2010
Improving the static real-time scheduling on multicore processors by reducing worst-case inter-thread cache interferences.
Proceedings of the 48th Annual Southeast Regional Conference, 2010
2009
J. Comput. Sci. Eng., 2009
Optimizing Instruction Prefetching to Improve Worst-Case Performance for Real-Time Applications.
J. Comput. Sci. Eng., 2009
Boosting the Performance of Software-Based Transient Errors Tolerant Techniques through Compiler Optimizations.
J. Circuits Syst. Comput., 2009
J. Circuits Syst. Comput., 2009
IEEE Des. Test Comput., 2009
Des. Autom. Embed. Syst., 2009
Proceedings of the 2009 ACM Symposium on Applied Computing (SAC), 2009
Accurately Estimating Worst-Case Execution Time for Multi-core Processors with Shared Direct-Mapped Instruction Caches.
Proceedings of the 15th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, 2009
Exploiting Multi-core Processors to Improve Time Predictability for Real-Time Java Computing.
Proceedings of the 15th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, 2009
2008
ACM Trans. Embed. Comput. Syst., 2008
ACM Trans. Archit. Code Optim., 2008
Proceedings of the 14th IEEE Real-Time and Embedded Technology and Applications Symposium, 2008
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Proceedings of the 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2008), 2008
On the Energy Efficiency of Java Virtual Machine.
Proceedings of the 2008 International Conference on Embedded Systems & Applications, 2008
Efficient code caching to improve performance and energy consumption for Java applications.
Proceedings of the 2008 International Conference on Compilers, 2008
2007
ACM Trans. Embed. Comput. Syst., 2007
SIGARCH Comput. Archit. News, 2007
SIGARCH Comput. Archit. News, 2007
Proceedings of the 2007 ACM SIGPLAN/SIGBED Conference on Languages, 2007
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007
Proceedings of the High Performance Embedded Architectures and Compilers, 2007
Exploring Functional Unit Design Space of VLIW Processors for Optimizing Both Performance and Energy Consumption.
Proceedings of the 21st International Conference on Advanced Information Networking and Applications (AINA 2007), 2007
Proceedings of the 21st International Conference on Advanced Information Networking and Applications (AINA 2007), 2007
An Area-Efficient Approach to Improving Register File Reliability against Transient Errors.
Proceedings of the 21st International Conference on Advanced Information Networking and Applications (AINA 2007), 2007
2006
ACM Trans. Embed. Comput. Syst., 2006
Reducing Instruction Translation Look-Aside Buffer Energy Through Compiler-Directed Resizing.
J. Low Power Electron., 2006
Compiler-guided next sub-bank prediction for reducing instruction cache leakage energy.
J. Embed. Comput., 2006
The Impact of Cache Organization in Optimizing Microprocessor Power Consumption.
Proceedings of the 2006 International Conference on Computer Design & Conference on Computing in Nanotechnology, 2006
2005
ACM Trans. Embed. Comput. Syst., 2005
Replication Cache: A Small Fully Associative Cache to Improve Data Cache Reliability.
IEEE Trans. Computers, 2005
Exploiting the replication cache to improve performance for multiple-issue microprocessors.
SIGARCH Comput. Archit. News, 2005
Proceedings of the Advances in Neural Information Processing Systems 18 [Neural Information Processing Systems, 2005
The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search.
Proceedings of the Advances in Neural Information Processing Systems 18 [Neural Information Processing Systems, 2005
Proceedings of the EMSOFT 2005, 2005
Proceedings of the 20th IEEE International Symposium on Defect and Fault-Tolerance in VLSI Systems (DFT 2005), 2005
Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), 2005
2004
ACM Trans. Archit. Code Optim., 2004
Proceedings of the 2004 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2004), 2004
Enhancing data cache reliability by the addition of a small fully-associative replication cache.
Proceedings of the 18th Annual International Conference on Supercomputing, 2004
Proceedings of the 2004 International Conference on Compilers, 2004
Proceedings of the 2004 International Conference on Compilers, 2004
Proceedings of the Advances in Computer Systems Architecture, 9th Asia-Pacific Conference, 2004
2003
Proceedings of the 17th Annual International Conference on Supercomputing, 2003
Proceedings of the 2003 International Conference on Dependable Systems and Networks (DSN 2003), 2003
Proceedings of the 2003 Euromicro Symposium on Digital Systems Design (DSD 2003), 2003
Proceedings of the 2003 Design, 2003
Proceedings of the 2003 Design, 2003
Implementation and Evaluation of an On-Demand Parameter-Passing Strategy for Reducing Energy.
Proceedings of the 2003 Design, 2003
Proceedings of the 2003 Design, 2003
Interprocedural optimizations for improving data cache performance of array-intensive embedded applications.
Proceedings of the 40th Design Automation Conference, 2003
Proceedings of the International Conference on Compilers, 2003
Proceedings of the Embedded Software for SoC, 2003
2002
Proceedings of the 35th Annual International Symposium on Microarchitecture, 2002
Proceedings of the 2002 Joint Conference on Languages, 2002
2001
Proceedings of the 34th Annual International Symposium on Microarchitecture, 2001