Xiaoyao Liang

Orcid: 0000-0002-2790-5884

According to our database, Xiaoyao Liang authored at least 121 papers between 2005 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
VSPIM: SRAM Processing-in-Memory DNN Acceleration via Vector-Scalar Operations.
IEEE Trans. Computers, October, 2024

ERA-BS: Boosting the Efficiency of ReRAM-Based PIM Accelerator With Fine-Grained Bit-Level Sparsity.
IEEE Trans. Computers, September, 2024

FastGL: A GPU-Efficient Framework for Accelerating Sampling-Based GNN Training at Large Scale.
CoRR, 2024

GNeRF: Accelerating Neural Radiance Fields Inference via Adaptive Sample Gating.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2024

UM-PIM: DRAM-based PIM with Uniform & Shared Memory Space.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

MEGA: A Memory-Efficient GNN Accelerator Exploiting Degree-Aware Mixed-Precision Quantization.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

Watt: A Write-Optimized RRAM-Based Accelerator for Attention.
Proceedings of the Euro-Par 2024: Parallel Processing, 2024

Sava: A Spatial- and Value-Aware Accelerator for Point Cloud Transformer.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

FusionArch: A Fusion-Based Accelerator for Point-Based Point Cloud Neural Networks.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

InterArch: Video Transformer Acceleration via Inter-Feature Deduplication with Cube-based Dataflow.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

TSAcc: An Efficient Tempo-Spatial Similarity Aware Accelerator for Attention Acceleration.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

MoC: A Morton-Code-Based Fine-Grained Quantization for Accelerating Point Cloud Neural Networks.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

CMC: Video Transformer Acceleration via CODEC Assisted Matrix Condensing.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
A Point Cloud Video Recognition Acceleration Framework Based on Tempo-Spatial Information.
IEEE Trans. Parallel Distributed Syst., December, 2023

Real-Time Video Recognition via Decoder-Assisted Neural Network Acceleration Framework.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., July, 2023

ClusterSeg: A crowd cluster pinpointed nucleus segmentation framework with cross-modality datasets.
Medical Image Anal., April, 2023

E²-VOR: An End-to-End En/Decoder Architecture for Efficient Video Object Recognition.
ACM Trans. Design Autom. Electr. Syst., January, 2023

A Federated Learning System for Histopathology Image Analysis With an Orchestral Stain-Normalization GAN.
IEEE Trans. Medical Imaging, 2023

SoBS-X: Squeeze-Out Bit Sparsity for ReRAM-Crossbar-Based Neural Network Accelerator.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2023

A²Q: Aggregation-Aware Quantization for Graph Neural Networks.
CoRR, 2023

A²Q: Aggregation-Aware Quantization for Graph Neural Networks.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

HyAcc: A Hybrid CAM-MAC RRAM-based Accelerator for Recommendation Model.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

RealArch: A Real-Time Scheduler for Mapping Multi-Tenant DNNs on Multi-Core Accelerators.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

DEQ: Dynamic Element-wise Quantization for Efficient Attention Architecture.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

ViTframe: Vision Transformer Acceleration via Informative Frame Selection for Video Recognition.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

PRADA: Point Cloud Recognition Acceleration via Dynamic Approximation.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

AdaS: A Fast and Energy-Efficient CNN Accelerator Exploiting Bit-Sparsity.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

PRADA: Point Cloud Recognition Acceleration via Dynamic Approximation.
Proceedings of the ACM Turing Award Celebration Conference - China 2023, 2023

2022
Integrated Power Anomaly Defense: Towards Oversubscription-Safe Data Centers.
IEEE Trans. Cloud Comput., 2022

RePAST: A ReRAM-based PIM Accelerator for Second-order Training of DNN.
CoRR, 2022

DNN Training Acceleration via Exploring GPGPU Friendly Sparsity.
CoRR, 2022

CP-ViT: Cascade Vision Transformer Pruning via Progressive Sparsity Prediction.
CoRR, 2022

Ristretto: An Atomized Processing Architecture for Sparsity-Condensed Stream Flow in CNN.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

GCNTrain: A Unified and Efficient Accelerator for Graph Convolutional Neural Network Training.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

Gzippo: Highly-Compact Processing-in-Memory Graph Accelerator Alleviating Sparsity and Redundancy.
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

N3H-Core: Neuron-designed Neural Network Accelerator via FPGA-based Heterogeneous Computing Cores.
Proceedings of the FPGA '22: The 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Virtual Event, USA, 27 February 2022, 2022

E²SR: an end-to-end video CODEC assisted system for super resolution acceleration.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

EBSP: evolving bit sparsity patterns for hardware-friendly inference of quantized deep neural networks.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
ITT-RNA: Imperfection Tolerable Training for RRAM-Crossbar-Based Deep Neural-Network Accelerator.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2021

SME: ReRAM-based Sparse-Multiplication-Engine to Squeeze-Out Bit Sparsity of Neural Network.
CoRR, 2021

Multiple-datasets and multiple-label based color normalization in histopathology with cGAN.
Proceedings of the Medical Imaging 2021: Digital Pathology, Online, February 15-19, 2021, 2021

Contrastive Learning Based Stain Normalization Across Multiple Tumor in Histopathology.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27, 2021

PIPArch: Programmable Image Processing Architecture Using Sliding Array.
Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, September 30, 2021

ReRAM-Sharing: Fine-Grained Weight Sharing for ReRAM-Based Deep Neural Network Accelerator.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

Improving Neural Network Efficiency via Post-training Quantization with Adaptive Floating-Point.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

SME: ReRAM-based Sparse-Multiplication-Engine to Squeeze-Out Bit Sparsity of Neural Network.
Proceedings of the 39th IEEE International Conference on Computer Design, 2021

Re2PIM: A Reconfigurable ReRAM-Based PIM Design for Variable-Sized Vector-Matrix Multiplication.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

Energy-Efficient Hybrid-RAM with Hybrid Bit-Serial based VMM Support.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

BayesFT: Bayesian Optimization for Fault Tolerant Neural Network Architecture.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

PIMGCN: A ReRAM-Based PIM Design for Graph Convolutional Network Acceleration.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

2020
VR-DANN: Real-Time Video Recognition via Decoder-Assisted Neural Network Acceleration.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

DRQ: Dynamic Region-based Quantization for Deep Neural Network Acceleration.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Fast Tumor Detector in Whole-Slide Image With Dynamic Programing Based Monte Carlo Sampling.
Proceedings of the IEEE International Conference on Image Processing, 2020

A Prediction Model of Microsatellite Status from Histology Images.
Proceedings of the 10th International Conference on Biomedical Engineering and Technology, 2020

PRArch: Pattern-Based Reconfigurable Architecture for Deep Neural Network Acceleration.
Proceedings of the 22nd IEEE International Conference on High Performance Computing and Communications; 18th IEEE International Conference on Smart City; 6th IEEE International Conference on Data Science and Systems, 2020

ESNreram: An Energy-Efficient Sparse Neural Network Based on Resistive Random-Access Memory.
Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

GPNPU: Enabling Efficient Hardware-Based Direct Convolution with Multi-Precision Support in GPU Tensor Cores.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

PIM-Prune: Fine-Grain DCNN Pruning for Crossbar-Based Process-In-Memory Architecture.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Identifying patch-level MSI from histological images of Colorectal Cancer by a Knowledge Distillation Model.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2020

A High-Throughput Tumor Location System with Deep Learning for Colorectal Cancer Histopathology Image.
Proceedings of the Artificial Intelligence in Medicine, 2020

FIMIL: A high-throughput deep learning model for abnormality detection with weak annotation in microscopy images.
Proceedings of the Australasian Computer Science Week, 2020

2019
Energy-Efficient and Quality-Assured Approximate Computing Framework Using a Co-Training Method.
ACM Trans. Design Autom. Electr. Syst., 2019

Selective Detection and Segmentation of Cervical Cells.
Proceedings of the 11th International Conference on Bioinformatics and Biomedical Technology, 2019

Approximate Random Dropout for DNN training acceleration in GPGPU.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

System-level hardware failure prediction using deep learning.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

A sharing-aware L1.5D cache for data reuse in GPGPUs.
Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

HUBPA: high utilization bidirectional pipeline architecture for neuromorphic computing.
Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

2018
IBOM: An Integrated and Balanced On-Chip Memory for High Performance GPGPUs.
IEEE Trans. Parallel Distributed Syst., 2018

CNFET-Based High Throughput SIMD Architecture.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

Approximate Random Dropout.
CoRR, 2018

Invocation-driven neural approximate computing with a multiclass-classifier and multiple approximators.
Proceedings of the International Conference on Computer-Aided Design, 2018

AXNet: approximate computing using an end-to-end trainable neural network.
Proceedings of the International Conference on Computer-Aided Design, 2018

A FPGA Friendly Approximate Computing Framework with Hybrid Neural Networks: (Abstract Only).
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

In-growth test for monolithic 3D integrated SRAM.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

2017
Bank Stealing for a Compact and Efficient Register File Architecture in GPGPU.
IEEE Trans. Very Large Scale Integr. Syst., 2017

A Hint Frequency Based Approach to Enhancing the I/O Performance of Multilevel Cache Storage Systems.
J. Comput. Sci. Technol., 2017

Incorporating selective victim cache into GPGPU for high-performance computing.
Concurr. Comput. Pract. Exp., 2017

Fault clustering technique for 3D memory BISR.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Accelerator-friendly neural-network training: Learning variations and defects in RRAM crossbar.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

On Quality Trade-off Control for Approximate Computing Using Iterative Training.
Proceedings of the 54th Annual Design Automation Conference, 2017

Sneak-Path Based Test and Diagnosis for 1R RRAM Crossbar Using Voltage Bias Technique.
Proceedings of the 54th Annual Design Automation Conference, 2017

2016
A Learning Algorithm for Bayesian Networks and Its Efficient Implementation on GPUs.
IEEE Trans. Parallel Distributed Syst., 2016

A Novel Test Method for Metallic CNTs in CNFET-Based SRAMs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

Energy-Efficient eDRAM-Based On-Chip Storage Architecture for GPGPUs.
IEEE Trans. Computers, 2016

Cache-emulated register file: An integrated on-chip memory architecture for high performance GPGPUs.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Defect tolerance for CNFET-based SRAMs.
Proceedings of the 2016 IEEE International Test Conference, 2016

Applying Victim Cache in High Performance GPGPU Computing.
Proceedings of the 15th International Symposium on Parallel and Distributed Computing, 2016

Power Attack Defense: Securing Battery-Backed Data Centers.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

CNFET-based high throughput register file architecture.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016

2015
Efficient graph computation on hybrid CPU and GPU systems.
J. Supercomput., 2015

Buddy SM: Sharing Pipeline Front-End for Improved Energy Efficiency in GPGPUs.
ACM Trans. Archit. Code Optim., 2015

Timing-driven placement for carbon nanotube circuits.
Proceedings of the 28th IEEE International System-on-Chip Conference, 2015

On microarchitectural modeling for CNFET-based circuits.
Proceedings of the 28th IEEE International System-on-Chip Conference, 2015

On diagnosable and tunable 3D clock network design for lifetime reliability enhancement.
Proceedings of the 2015 IEEE International Test Conference, 2015

CGSharing: Efficient content sharing in GPU-based cloud gaming.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

Bank stealing for conflict mitigation in GPGPU Register File.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

Towards sustainable in-situ server systems in the big data era.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Building Fuel Powered Supercomputing Data Center at Low Cost.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

Exploring Hardware Profile-Guided Green Datacenter Scheduling.
Proceedings of the 44th International Conference on Parallel Processing, 2015

A novel TSV probing technique with adhesive test interposer.
Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

Jump test for metallic CNTs in CNFET-based SRAM.
Proceedings of the 52nd Annual Design Automation Conference, 2015

2014
HFA: A Hint Frequency-based approach to enhance the I/O performance of multi-level cache storage systems.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Dynamic front-end sharing in graphics processing units.
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

2013
Compiler assisted dynamic register file in GPGPU.
Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013

An energy-efficient and scalable eDRAM-based register file architecture for GPGPU.
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

2012
AgileRegulator: A hybrid voltage regulator scheme redeeming dark silicon for power efficiency in a multicore architecture.
Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

2011
MicroFix: Using timing interpolation and delay sensors for power reduction.
ACM Trans. Design Autom. Electr. Syst., 2011

2010
Leveraging the core-level complementary effects of PVT variations to reduce timing emergencies in multi-core processors.
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

2009
Revival: A Variation-Tolerant Architecture Using Voltage Interpolation and Variable Latency.
IEEE Micro, 2009

MicroFix: exploiting path-grained timing adaptability for improving power-performance efficiency.
Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009

Empirical performance models for 3T1D memories.
Proceedings of the 27th International Conference on Computer Design, 2009

Design and test strategies for microarchitectural post-fabrication tuning.
Proceedings of the 27th International Conference on Computer Design, 2009

2008
Replacing 6T SRAMs with 3T1D DRAMs in the L1 Data Cache to Combat Process Variability.
IEEE Micro, 2008

A Process-Variation-Tolerant Floating-Point Unit with Voltage Interpolation and Variable Latency.
Proceedings of the 2008 IEEE International Solid-State Circuits Conference, 2008

Instruction-driven clock scheduling with glitch mitigation.
Proceedings of the 2008 International Symposium on Low Power Electronics and Design, 2008

2007
Process Variation Tolerant 3T1D-Based Cache Architectures.
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-40 2007), 2007

Architectural power models for SRAM and CAM structures based on hybrid analytical/empirical techniques.
Proceedings of the 2007 International Conference on Computer-Aided Design, 2007

2006
Mitigating the Impact of Process Variations on Processor Register Files and Execution Units.
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006

Microarchitecture parameter selection to optimize system performance under process variation.
Proceedings of the 2006 International Conference on Computer-Aided Design, 2006

2005
Dynamic coarse grain dataflow reconfiguration technique for real-time systems design.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

Equalizing data-path for processing speed determination in block level pipelining.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

