2025
PATCHEDSERVE: A Patch Management Framework for SLO-Optimized Hybrid Resolution Diffusion Serving.
CoRR, January, 2025
2024
CNN-Based Reversible Data Hiding for JPEG Images.
IEEE Trans. Circuits Syst. Video Technol., November, 2024
Local or global? A novel transformer for Chinese named entity recognition based on multi-view and sliding attention.
Int. J. Mach. Learn. Cybern., June, 2024
CARE: Context-aware attention interest redistribution for session-based recommendation.
Expert Syst. Appl., 2024
OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Model.
Proceedings of the 2024 USENIX Annual Technical Conference, 2024
Predicting Credit Spreads of Chinese Municipal Bonds: A Hybrid Model of Wavelet Transform, Random Forest, and SAM-GRU.
Proceedings of the International Joint Conference on Neural Networks, 2024
A Method for Modeling Normal User Behavior Based on Security Risk Audit Elements.
Proceedings of the 16th IEEE International Conference on Advanced Infocomm Technology, 2024
RAP: Resource-aware Automated GPU Sharing for Multi-GPU Recommendation Model Training and Input Preprocessing.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
ZENO: A Type-based Optimization Framework for Zero Knowledge Neural Network Inference.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
Human anterior thalamic stimulation evoked cortical potentials align with intrinsic functional connectivity.
NeuroImage, August, 2023
A Cost-Effective Instrument of Distributed Functional Near-Infrared Spectroscopy for Hyperscanning Real-World Interactions.
IEEE Trans. Instrum. Meas., 2023
The Study of Perceptual Training of Chinese Mandarin Tones for Monolingual Speakers of English Using Adaptive Computer Based Training Software.
CoRR, 2023
TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023
MGG: Accelerating Graph Neural Networks with Fine-Grained Intra-Kernel Communication-Computation Pipelining on Multi-GPU Platforms.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023
ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023
OneQ: A Compilation Framework for Photonic One-Way Quantum Computation.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
Chinese named entity recognition based on Heterogeneous Graph and Dynamic Attention Network.
Proceedings of the 15th IEEE International Symposium on Autonomous Decentralized System, 2023
Analysis of the Influencing Factors of Urban Investment Bond Credit Spreads Based on Deep Forest.
Proceedings of the 2023 4th International Conference on Big Data Economy and Information Management, 2023
2022
STPAcc: Structural TI-Based Pruning for Accelerating Distance-Related Algorithms on CPU-FPGA Platforms.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022
Rubik: A Hierarchical Architecture for Efficient Graph Neural Network Training.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022
Polymorphic graph attention network for Chinese NER.
Expert Syst. Appl., 2022
Enabling Data Movement and Computation Pipelining in Deep Learning Compiler.
CoRR, 2022
Empowering GNNs with Fine-grained Communication-Computation Pipelining on Multi-GPU Platforms.
CoRR, 2022
GMI-DRL: Empowering Multi-GPU Deep Reinforcement Learning with GPU Spatial Multiplexing.
CoRR, 2022
Collision risk assessment and automatic obstacle avoidance strategy for teleoperation robots.
Comput. Ind. Eng., 2022
Faith: An Efficient Framework for Transformer Verification on GPUs.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022
EL-Rec: Efficient Large-Scale Recommendation Model Training via Tensor-Train Embedding Table.
Proceedings of the SC22: International Conference for High Performance Computing, 2022
QGTC: accelerating quantized graph neural networks via GPU tensor core.
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
BBAE: A Method for Few-Shot Charge Prediction with Data Augmentation and Neural Network.
Proceedings of the Chinese Lexical Semantics - 23rd Workshop, 2022
2021
Planetary Wave Spectrum in the Stratosphere-Mesosphere during Sudden Stratospheric Warming 2018.
Remote. Sens., 2021
TC-GNN: Accelerating Sparse Graph Neural Network Computation Via Dense Tensor Core on GPUs.
CoRR, 2021
Towards Efficient Ansatz Architecture for Variational Quantum Algorithms.
CoRR, 2021
QGTC: Accelerating Quantized GNN via GPU Tensor Core.
CoRR, 2021
Palleon: A Runtime System for Efficient Video Processing toward Dynamic Class Skew.
Proceedings of the 2021 USENIX Annual Technical Conference, 2021
APNN-TC: accelerating arbitrary precision neural networks on ampere GPU tensor cores.
Proceedings of the International Conference for High Performance Computing, 2021
Design of a Walking Assistive Robot Against Festination and Freezing of Gait.
Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2021
EGEMM-TC: accelerating scientific computing on tensor cores with extended precision.
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs.
Proceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation, 2021
DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021
Saga: Sparse Adversarial Attack on EEG-Based Brain Computer Interface.
Proceedings of the IEEE International Conference on Acoustics, 2021
An Efficient Quantitative Approach for Optimizing Convolutional Neural Networks.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021
TiAcc: Triangle-inequality based Hardware Accelerator for K-means on FPGAs.
Proceedings of the 21st IEEE/ACM International Symposium on Cluster, 2021
UAG: Uncertainty-aware Attention Graph Neural Network for Defending Adversarial Attacks.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Comparison of Major Sudden Stratospheric Warming Impacts on the Mid-Latitude Mesosphere Based on Local Microwave Radiometer CO Observations in 2018 and 2019.
Remote. Sens., 2020
Rubik: A Hierarchical Architecture for Efficient Graph Learning.
CoRR, 2020
Uncertainty-aware Attention Graph Neural Network for Defending Adversarial Attacks.
CoRR, 2020
Scalable Adversarial Attack on Graph Neural Networks with Alternating Direction Method of Multipliers.
CoRR, 2020
Optimizing Convolutional Neural Network Architecture via Information Field.
CoRR, 2020
GNNAdvisor: An Efficient Runtime System for GNN Acceleration on GPUs.
CoRR, 2020
SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization.
Proceedings of the 32nd IEEE International Conference on Tools with Artificial Intelligence, 2020
Boosting Deep Neural Network Efficiency with Dual-Module Inference.
Proceedings of the 37th International Conference on Machine Learning, 2020
2019
AccD: A Compiler-based Framework for Accelerating Distance-related Algorithms on CPU-FPGA Platforms.
CoRR, 2019
Shear Behavior of Wheat-Concrete Interface during Monotonic and Cyclic Loading.
Complex., 2019
The Shear Strength and Dilatancy Behavior of Wheat Stored in Silos.
Complex., 2019
KPynq: A Work-Efficient Triangle-Inequality Based K-Means on FPGA.
Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019
Automatic Threshold Selection Method for SAR Edge Detection.
Proceedings of the Advances in Brain Inspired Cognitive Systems, 2019
Dilated Convolutional Network for Road Extraction in Remote Sensing Images.
Proceedings of the Advances in Brain Inspired Cognitive Systems, 2019
2011
A novel fault diagnosis mechanism for wireless sensor networks.
Math. Comput. Model., 2011
2008
A Key Management Scheme for Hierarchical Access Control in Group Communication.
Int. J. Netw. Secur., 2008
Visualization of Clustered Directed Acyclic Graphs without Node Overlapping.
Proceedings of the 12th International Conference on Information Visualisation, 2008
2007
Novel Memory Reference Reduction Methods for FFT Implementations on DSP Processors.
IEEE Trans. Signal Process., 2007
2006
Parallel random number generators for sequences uniformly distributed over any range of integers.
IEEE Trans. Circuits Syst. I Regul. Pap., 2006
Algebraic Characterization of Reversible Logic Gates.
Theory Comput. Syst., 2006
A wavelength retuning scheme with no service interruption in survivable optical networks.
Proceedings of IEEE International Conference on Communications, 2006
A New Scalable Multicast Solution in MPLS Networks.
Proceedings of the Global Telecommunications Conference, 2006. GLOBECOM '06, San Francisco, CA, USA, 27 November, 2006
2005
Resource-aware conference key establishment for heterogeneous networks.
IEEE/ACM Trans. Netw., 2005
Token bucket based statistical regulator for S-BIND modeled on-line traffic.
Proceedings of IEEE International Conference on Communications, 2005
A new coordinated scheduling algorithm in distributed bandwidth broker QoS architecture.
Proceedings of IEEE International Conference on Communications, 2005
Analysis of TCP over optical burst-switched networks with burst retransmission.
Proceedings of the Global Telecommunications Conference, 2005. GLOBECOM '05, St. Louis, Missouri, USA, 28 November, 2005
Optimized software implementation of a full-rate IEEE 802.11a compliant digital baseband transmitter on a digital signal processor.
Proceedings of the Global Telecommunications Conference, 2005. GLOBECOM '05, St. Louis, Missouri, USA, 28 November, 2005
A new fair bandwidth allocation algorithm for multimedia multicasting in DiffServ.
Proceedings of the Global Telecommunications Conference, 2005. GLOBECOM '05, St. Louis, Missouri, USA, 28 November, 2005
Evaluation of burst retransmission in optical burst-switched networks.
Proceedings of the 2nd International Conference on Broadband Networks (BROADNETS 2005), 2005
2004
A novel multiplexer-based low-power full adder.
IEEE Trans. Circuits Syst. II Express Briefs, 2004
High-speed assembly FFT implementation with memory reference reduction on DSP processors.
Proceedings of the 2004 11th IEEE International Conference on Electronics, 2004
Novel disjoint graph based algorithm for multi-field range-based packet classification.
Proceedings of IEEE International Conference on Communications, 2004
High-performance implementation for graph-based packet classification algorithm on network processor.
Proceedings of IEEE International Conference on Communications, 2004
A centralized key management scheme for hierarchical access control.
Proceedings of the Global Telecommunications Conference, 2004. GLOBECOM '04, Dallas, Texas, USA, 29 November, 2004
A-Serv: a novel architecture providing scalable quality of service [Internet applications].
Proceedings of the Global Telecommunications Conference, 2004. GLOBECOM '04, Dallas, Texas, USA, 29 November, 2004
A new traffic model and statistical admission control algorithm for providing QoS guarantees to on-line traffic.
Proceedings of the Global Telecommunications Conference, 2004. GLOBECOM '04, Dallas, Texas, USA, 29 November, 2004
2003
Low hardware complexity parallel turbo decoder architecture.
Proceedings of the 2003 International Symposium on Circuits and Systems, 2003
A new memory reference reduction method for FFT implementation on DSP.
Proceedings of the 2003 International Symposium on Circuits and Systems, 2003
A method of generating uniformly distributed sequences over [0, K], where K+1 is not a power of two.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
An efficient implementation of multi-prime RSA on DSP processor.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
2002
A Parallel Residue-to-binary Converter for the Moduli Set {2<sup>m</sup>-1, 2<sup>2<sup>0</sup>m</sup>+1, 2<sup>2<sup>1</sup>m</sup>+1, ..., 2<sup>2<sup>k</sup>m</sup>+1}.
VLSI Design, 2002
Adder based residue to binary number converters for (2<sup>n</sup>-1, 2<sup>n</sup>, 2<sup>n</sup>+1).
IEEE Trans. Signal Process., 2002
Partitioning and Scheduling DSP Applications with Maximal Memory Access Hiding.
EURASIP J. Adv. Signal Process., 2002
A trace-back-free Viterbi decoder using a new survival path management algorithm.
Proceedings of the 2002 International Symposium on Circuits and Systems, 2002
Twiddle-Factor-Based FFT Algorithm with Reduced Memory Access.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002
Establishment of conference keys in heterogeneous networks.
Proceedings of the IEEE International Conference on Communications, 2002
Optimized scheduling and mapping of logarithm and arctangent functions on TI TMS320C67X processor.
Proceedings of the IEEE International Conference on Acoustics, 2002
Reduce FFT memory reference for low power applications.
Proceedings of the IEEE International Conference on Acoustics, 2002
A DSP-based turbo codec for 3G communication systems.
Proceedings of the IEEE International Conference on Acoustics, 2002
2001
Single-faced Boolean Functions and their Minimization.
Comput. J., 2001
On area-efficient low power array multipliers.
Proceedings of the 2001 8th IEEE International Conference on Electronics, 2001
Optimal partitioning and balanced scheduling with the maximal overlap of data footprints.
Proceedings of the 11th ACM Great Lakes Symposium on VLSI 2001, 2001
CAM-based label search engine for MPLS over ATM networks.
Proceedings of the Global Telecommunications Conference, 2001
Distributed Scaling Algorithm for FFT Computation Using Fixed-Point Arithmetic.
Proceedings of the ISCA 14th International Conference on Parallel and Distributed Computing Systems, 2001
2000
Wire space estimation and routability analysis.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2000
Explicit Cook-Toom algorithm for linear convolution.
Proceedings of the IEEE International Conference on Acoustics, 2000
1999
On the crossing distribution problem.
ACM Trans. Design Autom. Electr. Syst., 1999
Diagnosis of clustered faults for identical degree topologies.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1999
Lossy Compression of Images Using Logic Minimization.
Proceedings of the 12th International Conference on VLSI Design (VLSI Design 1999), 1999
A high-speed residue-to-binary converter and a scheme for its VLSI implementation.
Proceedings of the 1999 International Symposium on Circuits and Systems, ISCAS 1999, Orlando, Florida, USA, May 30, 1999
A parallel residue-to-binary converter.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999
A New Algorithm for RNS Magnitude Comparison Based on New Chinese Remainder Theorem II.
Proceedings of the 9th Great Lakes Symposium on VLSI (GLS-VLSI '99), 1999
1998
Solving Boolean Equations Using ROSOP Forms.
IEEE Trans. Computers, 1998
Residue to Binary Number Converters for (2<sup>n</sup>-1, 2<sup>n</sup>, 2<sup>n</sup>+1).
Proceedings of the 8th Great Lakes Symposium on VLSI (GLS-VLSI '98), 1998
1997
An Algorithm for Total Symmetric OBDD Detection.
IEEE Trans. Computers, 1997
1996
Negation Trees: A Unified Approach to Boolean Function Complementation.
IEEE Trans. Computers, 1996