Sitao Huang

Orcid: 0000-0001-7669-1467

According to our database, Sitao Huang authored at least 41 papers between 2013 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
HyperDetect: A Real-Time Hyperdimensional Solution for Intrusion Detection in IoT Networks.
IEEE Internet Things J., April, 2024

Optimized Spatial Architecture Mapping Flow for Transformer Accelerators.
CoRR, 2024

Optimizing High-Level Synthesis Designs with Retrieval-Augmented Large Language Models.
CoRR, 2024

MONAS: Efficient Zero-Shot Neural Architecture Search for MCUs.
CoRR, 2024

TBA: Faster Large Language Model Training Using SSD-Based Activation Offloading.
CoRR, 2024

RSEND: Retinex-based Squeeze and Excitation Network with Dark Region Detection for Efficient Low Light Image Enhancement.
CoRR, 2024

TG-NAS: Leveraging Zero-Cost Proxies with Transformer and Graph Convolution Networks for Efficient Neural Architecture Search.
CoRR, 2024

Accelerating Autonomous Path Planning on FPGAs with Sparsity-Aware HW/SW Co-Optimizations.
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

MicroNAS: Zero-Shot Neural Architecture Search for MCUs.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

Hector: An Efficient Programming and Compilation Framework for Implementing Relational Graph Neural Networks in GPU Architectures.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
Sgap: towards efficient sparse tensor algebra compilation for GPU.
CCF Trans. High Perform. Comput., June, 2023

PIGEON: Optimizing CUDA Code Generator for End-to-End Training and Inference of Relational Graph Neural Networks.
CoRR, 2023

CARMA: Context-Aware Runtime Reconfiguration for Energy-Efficient Sensor Fusion.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2023

DistHD: A Learner-Aware Dynamic Encoding Method for Hyperdimensional Classification.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Late Breaking Results: Scalable and Efficient Hyperdimensional Computing for Network Intrusion Detection.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

2022
Efficient Machine Learning, Compilers, and Optimizations for Embedded Systems.
CoRR, 2022

2021
High-efficiency and high-usability heterogeneous hardware acceleration with FPGAs.
PhD thesis, 2021

PyLog: An Algorithm-Centric Python-Based FPGA Programming and Synthesis Flow.
IEEE Trans. Computers, 2021

Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture.
Proc. VLDB Endow., 2021

PyTorch-Direct: Enabling GPU Centric Data Access for Very Large Graph Neural Network Training with Irregular Accesses.
CoRR, 2021

A Python-based High-Level Programming Flow for CPU-FPGA Heterogeneous Systems (Invited Paper).
Proceedings of the IEEE/ACM Programming Environments for Heterogeneous Computing, 2021

Chimera: A Hybrid Machine Learning-Driven Multi-Objective Design Space Exploration Tool for FPGA High-Level Synthesis.
Proceedings of the Intelligent Data Engineering and Automated Learning - IDEAL 2021, 2021

Improved GPU Implementations of the Pair-HMM Forward Algorithm for DNA Sequence Alignment.
Proceedings of the 39th IEEE International Conference on Computer Design, 2021

Extending HLS with High-Level Descriptive Language for Configurable Algorithm-Level Spatial Structure Design.
Proceedings of the 29th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2021

Mind mappings: enabling efficient algorithm-accelerator mapping space search.
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

Mixed Precision Quantization for ReRAM-based DNN Inference Accelerators.
Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

2019
Analysis and Modeling of Collaborative Execution Strategies for Heterogeneous CPU-FPGA Architectures.
Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, 2019

Near-Memory and In-Storage FPGA Acceleration for Emerging Cognitive Computing Workloads.
Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI, 2019

Accelerating Sparse Deep Neural Networks on FPGAs.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

Analysis and Optimization of I/O Cache Coherency Strategies for SoC-FPGA Device.
Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

FPGA/DNN Co-Design: An Efficient Design Methodology for IoT Intelligence on the Edge.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

2018

Towards Neural Phrase-based Machine Translation.
Proceedings of the 6th International Conference on Learning Representations, 2018

Triangle Counting and Truss Decomposition using FPGA.
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

2017
Collaborative Computing for Heterogeneous Integrated Systems.
Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering, 2017

Hardware Acceleration of the Pair-HMM Algorithm for DNA Variant Calling.
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

2016
Acceleration of the Pair-HMM Algorithm for DNA Variant Calling.
Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

2014
Accelerating frequent item counting with FPGA.
Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

2013
DTW-Based Subsequence Similarity Search on AMD Heterogeneous Computing Platform.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

Accelerating subsequence similarity search based on dynamic time warping distance with FPGA.
Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

