Yangdong Deng

Orcid: 0000-0002-8257-693X

According to our database, Yangdong Deng authored at least 93 papers between 2001 and 2024.

Bibliography

2024
AlignedKV: Reducing Memory Access of KV-Cache with Precision-Aligned Quantization.
CoRR, 2024

CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers.
CoRR, 2024

A Multi-Level Framework for Accelerating Training Transformer Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
TCGAN: Convolutional Generative Adversarial Network for time series classification and clustering.
Neural Networks, August, 2023

Removing Backdoors in Pre-trained Models by Regularized Continual Pre-training.
Trans. Assoc. Comput. Linguistics, 2023

A spatiotemporal deep neural network for fine-grained multi-horizon wind prediction.
Data Min. Knowl. Discov., 2023

Beat LLMs at Their Own Game: Zero-Shot LLM-Generated Text Detection via Querying ChatGPT.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022
Duplicacy: A New Generation of Cloud Backup Tool Based on Lock-Free Deduplication.
IEEE Trans. Cloud Comput., 2022

Agglomerative Memory and Thread Scheduling for High-Performance Ray-Tracing on GPUs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Moderate-fitting as a Natural Backdoor Defender for Pre-trained Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Pass off Fish Eyes for Pearls: Attacking Model Selection of Pre-trained Models.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
A GPU Acceleration Framework for Motif and Discord Based Pattern Mining.
IEEE Trans. Parallel Distributed Syst., 2021

An Elastic Task Scheduling Scheme on Coarse-Grained Reconfigurable Architectures.
IEEE Trans. Parallel Distributed Syst., 2021

Knowledge Enhanced Fact Checking and Verification.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

PMBA: A Parallel MCMC Bayesian Computing Accelerator.
IEEE Access, 2021

2020
A Flattened-Priority Framework for Mixed-Criticality Systems.
IEEE Trans. Ind. Electron., 2020

Model-Based Adaptation of Mixed-Criticality Multiservice Systems for Extreme Physical Environments.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Time-Triggered Switch-Memory-Switch Architecture for Time-Sensitive Networking Switches.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

A Survey of Coarse-Grained Reconfigurable Architecture and Design: Taxonomy, Challenges, and Applications.
ACM Comput. Surv., 2020

Grid Cells Are Ubiquitous in Neural Networks.
CoRR, 2020

A resource-efficient priority scheduler for time-sensitive networking switches.
CCF Trans. Netw., 2020

GraphABCD: Scaling Out Graph Analytics with Asynchronous Block Coordinate Descent.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

2019
An Enhanced Reconfiguration for Deterministic Transmission in Time-Triggered Networks.
IEEE/ACM Trans. Netw., 2019

Dynamically Optimizing End-to-End Latency for Time-Triggered Networks.
Proceedings of the ACM SIGCOMM 2019 Workshop on Networking for Emerging Applications and Technologies, 2019

FPGA-Accelerated Optimistic Concurrency Control for Transactional Memory.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

2018
Triggered-Issuance and Triggered-Execution: A Control Paradigm to Minimize Pipeline Stalls in Distributed Controlled Coarse-Grained Reconfigurable Arrays.
IEEE Trans. Parallel Distributed Syst., 2018

DetNet: A Backbone network for Object Detection.
CoRR, 2018

Breaking the Synchronization Bottleneck with Reconfigurable Transactional Execution.
IEEE Comput. Archit. Lett., 2018

Work-in-Progress: A Flattened Priority Framework for Mixed-Criticality Real-Time Systems.
Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium, 2018

Model-based adaptation to extreme physical environments: a case study on mixed-criticality industrial ethernet.
Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings, 2018

DetNet: Design Backbone for Object Detection.
Proceedings of the Computer Vision - ECCV 2018, 2018

R-FCN++: Towards Accurate Region-Based Fully Convolutional Networks for Object Detection.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Energy-Efficient Automatic Train Driving by Learning Driving Patterns.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Toward Robust Vehicle Platooning With Bounded Spacing Error.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

Toward Real-Time Ray Tracing: A Survey on Hardware Acceleration and Microarchitecture Techniques.
ACM Comput. Surv., 2017

Light-Head R-CNN: In Defense of Two-Stage Object Detector.
CoRR, 2017

Path compression kd-trees with multi-layer parallel construction: a case study on ray tracing.
Proceedings of the 21st ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, 2017

Aggressive Pipelining of Irregular Applications on Reconfigurable Hardware.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Minimizing Pipeline Stalls in Distributed-Controlled Coarse-Grained Reconfigurable Arrays with Triggered Instruction Issue and Execution.
Proceedings of the 54th Annual Design Automation Conference, 2017

Human experience knowledge induction based intelligent train driving.
Proceedings of the 16th IEEE/ACIS International Conference on Computer and Information Science, 2017

2016
Time-Delay Neural Network for Continuous Emotional Dimension Prediction From Facial Expression Sequences.
IEEE Trans. Cybern., 2016

An Energy-Efficient Train Control Framework for Smart Railway Transportation.
IEEE Trans. Computers, 2016

2015
Design and Optimization of Multiclocked Embedded Systems Using Formal Techniques.
IEEE Trans. Ind. Electron., 2015

GPU accelerated sparse matrix-vector multiplication and sparse matrix-transpose vector multiplication.
Concurr. Comput. Pract. Exp., 2015

Apparent resolution enhancement for near-eye light field display.
Proceedings of the SIGGRAPH Asia 2015 Mobile Graphics and Interactive Applications, 2015

RadixBoost: A hardware acceleration structure for scalable radix sort on graphic processors.
Proceedings of the 2015 IEEE International Symposium on Circuits and Systems, 2015

FastTree: a hardware KD-tree construction acceleration engine for real-time ray tracing.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

2014
Orchestrating Cache Management and Memory Scheduling for GPGPU Applications.
IEEE Trans. Very Large Scale Integr. Syst., 2014

A Fast and Accurate Segmentation Method for Ordered LiDAR Point Cloud of Large-Scale Scenes.
IEEE Geosci. Remote. Sens. Lett., 2014

Toward Concurrent Lock-Free Queues on GPUs.
IEICE Trans. Inf. Syst., 2014

Performance Optimization for Sparse A^tAx in Parallel on Multicore CPU.
IEICE Trans. Inf. Syst., 2014

A feasibility study of ray tracing on mobile GPUs.
Proceedings of the SIGGRAPH Asia 2014 Mobile Graphics and Interactive Applications, 2014

Fully parallel kd-tree construction for real-time ray tracing.
Proceedings of the Symposium on Interactive 3D Graphics and Games, 2014

Atomic reduction based sparse matrix-transpose vector multiplication on GPUs.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Fast Radix: A Scalable Hardware Accelerator for Parallel Radix Sort.
Proceedings of the 12th International Conference on Frontiers of Information Technology, 2014

2013
Exploiting the Task-Pipelined Parallelism of Stream Programs on Many-Core GPUs.
IEICE Trans. Inf. Syst., 2013

Electronic Design Automation with Graphic Processors: A Survey.
Found. Trends Electron. Des. Autom., 2013

Design and optimization of multi-clocked embedded systems using formal technique.
Proceedings of the Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2013

Mining effective parallelism from hidden coherence for GPU based path tracing.
Proceedings of the SIGGRAPH Asia 2013, 2013

Robust conservative parallel HDL simulation on multi-core CPUs.
Proceedings of the International Conference on High Performance Computing & Simulation, 2013

Engineering a fully GPU-accelerated H.264 encoder.
Proceedings of the Fifth International Conference on Digital Image Processing, 2013

FastLanes: An FPGA accelerated GPU microarchitecture simulator.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

A facial expression based continuous emotional state monitoring system with GPU acceleration.
Proceedings of the 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, 2013

2012
A Framework for Layout-Dependent STI Stress Analysis and Stress-Aware Circuit Optimization.
IEEE Trans. Very Large Scale Integr. Syst., 2012

A Two-Hop Wireless Power Transfer System With an Efficiency-Enhanced Power Receiver for Motion-Free Capsule Endoscopy Inspection.
IEEE Trans. Biomed. Eng., 2012

Towards accelerating irregular EDA applications with GPUs.
Integr., 2012

A theoretical and empirical error analysis of mobile 3D data acquisition system.
Proceedings of the 2012 IEEE International Symposium on Circuits and Systems, 2012

A Polyhedral Modeling Based Source-to-Source Code Optimization Framework for GPGPU.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Design of Micro-Ball endoscopy system.
Proceedings of the 2012 IEEE Biomedical Circuits and Systems Conference, 2012

A Thermal-Driven Test Application Scheme for 3-Dimensional ICs.
Proceedings of the 21st IEEE Asian Test Symposium, 2012

2011
A High-Throughput, High-Accuracy System-Level Simulation Framework for System on Chips.
VLSI Design, 2011

CAD for Gigascale SoC Design and Verification Solutions.
VLSI Design, 2011

Massively Parallel Logic Simulation with GPUs.
ACM Trans. Design Autom. Electr. Syst., 2011

Exploiting graphics processors for high-performance IP lookup in software routers.
Proceedings of the INFOCOM 2011. 30th IEEE International Conference on Computer Communications, 2011

Accelerating RTL simulation with GPUs.
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011

Evaluating the potential of graphics processors for high performance embedded computing.
Proceedings of the Design, Automation and Test in Europe, 2011

Scalable packet classification via GPU metaprogramming.
Proceedings of the Design, Automation and Test in Europe, 2011

Hermes: an integrated CPU/GPU microarchitecture for IP routing.
Proceedings of the 48th Design Automation Conference, 2011

2010
Full-chip leakage analysis for 65 nm CMOS technology and beyond.
Integr., 2010

IP routing processing with graphic processors.
Proceedings of the Design, Automation and Test in Europe, 2010

Distributed time, conservative parallel logic simulation on GPUs.
Proceedings of the 47th Design Automation Conference, 2010

Massively Parallel Finite Element Simulator for Full-Chip STI Stress Analysis.
Proceedings of the 10th IEEE International Conference on Computer and Information Technology, 2010

GPU Accelerated VLSI Design Verification.
Proceedings of the 10th IEEE International Conference on Computer and Information Technology, 2010

2009
Layout-dependent STI stress analysis and stress-aware RF/analog circuit design optimization.
Proceedings of the 2009 International Conference on Computer-Aided Design, 2009

Taming irregular EDA applications on GPUs.
Proceedings of the 2009 International Conference on Computer-Aided Design, 2009

2006
Temperature-Aware Floorplanning of 3-D ICs Considering Thermally Dependent Leakage Power.
J. Low Power Electron., 2006

2005
2.5-dimensional VLSI system integration.
IEEE Trans. Very Large Scale Integr. Syst., 2005

Temperature-Dependent Optimization of Cache Leakage Power Dissipation.
Proceedings of the 23rd International Conference on Computer Design (ICCD 2005), 2005

2004
2.5D system integration: a design driven system implementation schema.
Proceedings of the 2004 Conference on Asia South Pacific Design Automation: Electronic Design and Solution Fair 2004, 2004

2003
Physical Design of the "2.5D" Stacked System.
Proceedings of the 21st International Conference on Computer Design (ICCD 2003), 2003

2002
System-Level Point-to-Point Communication Synthesis using Floorplanning Information.
Proceedings of the 7th Asia and South Pacific Design Automation Conference (ASP-DAC 2002), 2002

2001
Interconnect characteristics of 2.5-D system integration scheme.
Proceedings of the 2001 International Symposium on Physical Design, 2001

