Yongwen Wang

ORCID: 0009-0008-2514-2052

According to our database, Yongwen Wang authored at least 36 papers between 2004 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
RVAM16: a low-cost multiple-ISA processor based on RISC-V and ARM Thumb.
Frontiers Comput. Sci., January, 2025

2024
A Low-Cost Floating-Point FMA Unit Supporting Package Operations for HPC-AI Applications.
IEEE Trans. Circuits Syst. II Express Briefs, July, 2024

A survey of compute nodes with 100 TFLOPS and beyond for supercomputers.
CCF Trans. High Perform. Comput., June, 2024

MPRTA: An Efficient Multilevel Parallel Mobile Accelerator for High-Performance Ray Tracing.
IEEE Trans. Very Large Scale Integr. Syst., February, 2024

Server-Assisted Traffic Measurement for Programmable Data Center Networks.
IEEE Trans. Netw. Sci. Eng., 2024

Cost-Effective Value Predictor for ILP processors through Design Space Exploration.
Proceedings of the Great Lakes Symposium on VLSI 2024, 2024

ImSPU: Implicit Sharing of Computation Resources Between Vector and Scalar Processing Units.
Proceedings of the Euro-Par 2024: Parallel Processing, 2024

QuickTree: A Fast Hardware BVH Construction Engine.
Proceedings of the 21st ACM International Conference on Computing Frontiers, 2024

Out-of-Order and Recursive RAS: A Return Address Stack Design on High Performance Processor.
Proceedings of the 35th IEEE International Conference on Application-specific Systems, 2024

2023
MMsRT: A Hardware Architecture for Ray Tracing in the Mobile Domain.
J. Circuits Syst. Comput., July, 2023

Fast Approximate LUT-based Vector Multiplication in DRAM.
Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

A Multi-level Parallel Integer/Floating-Point Arithmetic Architecture for Deep Learning Instructions.
Proceedings of the Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, August 28, 2023

2022
RV16: An Ultra-Low-Cost Embedded RISC-V Processor Core.
J. Comput. Sci. Technol., 2022

RTA: an Efficient SIMD Architecture for Ray Tracing.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

Design and optimization of Issue queue in Out-of-Order superscalar microprocessor.
Proceedings of the Asia Conference on Algorithms, Computing and Machine Learning, 2022

2021
Bubble-Wall Plot: A New Tool for Data Visualization.
Proceedings of the Australasian Conference on Information Systems, 2021

2020
CSMO-DSE: Fast and Precise Application-driven DSE Guided by Criticality and Sensitivity Analysis.
ACM J. Emerg. Technol. Comput. Syst., 2020

2017
Effective Optimization of Branch Predictors through Lightweight Simulation.
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

2016
A Methodology for Performance Verification of Microprocessors.
Proceedings of the Computer Engineering and Technology - 20th CCF Conference, 2016

2015
The Improvement of March C+ Algorithm for Embedded Memory Test.
Proceedings of the Computer Engineering and Technology - 19th CCF Conference, 2015

Fast FPGA system for microarchitecture optimization on synthesizable modern processor design.
Proceedings of the 25th International Conference on Field Programmable Logic and Applications, 2015

2014
Integrated Coherence Prediction: Towards Efficient Cache Coherence on NoC-Based Multicore Architectures.
ACM Trans. Design Autom. Electr. Syst., 2014

A High-Dynamic Invocation Load Balancing Algorithm for Distributed Servers in the Cloud.
Proceedings of the Intelligent Computing Theory - 10th International Conference, 2014

2013
Dynamic Streamization Model Execution for SIMD Engines on Multicore Architectures.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2013

Adaptive communication mechanism for accelerating MPI functions in NoC-based multicore processors.
ACM Trans. Archit. Code Optim., 2013

Efficient multimedia coprocessor with enhanced SIMD engines for exploiting ILP and DLP.
Parallel Comput., 2013

Achieving Predictable Performance in SMT Processors by Instruction Fetch Policy.
Proceedings of the Computer Engineering and Technology - 17th CCF Conference, 2013

2011
BP-NUCA: Cache Pressure-Aware Migration for High-Performance Caching in CMPs.
Comput. Informatics, 2011

Characterizing Time-Varying Behavior and Predictability of Cache AVF.
Proceedings of the 2011 Third International Conference on Intelligent Networking and Collaborative Systems (INCoS), Fukuoka, Japan, November 30, 2011

A Formalization of an Emulation Based Co-designed Virtual Machine.
Proceedings of the Fifth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, 2011

A Novel Chaining Approach to Indirect Control Transfer Instructions.
Proceedings of the Availability, Reliability and Security for Business, Enterprise and Health Information Systems, 2011

2010
EPSP: Enhancing Network Protocol with Social-Aware Plane.
Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

Low Power Design for a Multi-core Multi-thread Microprocessor.
Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

2009
Real-Time Visualization of Virtual Huge Texture.
Proceedings of the 2009 International Conference on Digital Image Processing, 2009

2004
An Efficient Broadcast Algorithm Based on Connected Dominating Set in Unstructured Peer-to-Peer Network.
Proceedings of the Web Information Systems, 2004

WCBF: Efficient and High-Coverage Search Schema in Unstructured Peer-to-Peer Network.
Proceedings of the Grid and Cooperative Computing, 2004
