Qiang Dou

According to our database, Qiang Dou authored at least 59 papers between 2004 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.





On csauthors.net:


Extending the RISC-V Instruction Set for High Performance Data Compression Hardware Acceleration.
Proceedings of the 35th IEEE International Conference on Application-specific Systems, 2024

Bidirectional generative transductive zero-shot learning.
Neural Comput. Appl., 2021

Laius: an energy-efficient FPGA CNN accelerator with the support of a fixed-point training framework.
Int. J. Comput. Sci. Eng., 2020

An Overhead-Free Max-Pooling Method for SNN.
IEEE Embed. Syst. Lett., 2020

Efficient architectural exploration of TAGE branch predictor for embedded processors.
Microelectron. J., 2019

A Method for Improving Controlling Factors Based on Information Fusion for Debris Flow Susceptibility Mapping: A Case Study in Jilin Province, China.
Entropy, 2019

A Systolic SNN Inference Accelerator and its Co-optimized Software Framework.
Proceedings of the 2019 on Great Lakes Symposium on VLSI, 2019

ASIE: An Asynchronous SNN Inference Engine for AER Events Processing.
Proceedings of the 25th IEEE International Symposium on Asynchronous Circuits and Systems, 2019

Systolic Array Based Accelerator and Algorithm Mapping for Deep Learning Algorithms.
Proceedings of the Network and Parallel Computing, 2018

A Parallel Algorithm for Instruction Dependence Graph Analysis Based on Multithreading.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2018

A Power Efficient Hardware Implementation of the IF Neuron Model.
Proceedings of the Advanced Computer Architecture - 12th Conference, 2018

Memory Bandwidth and Energy Efficiency Optimization of Deep Convolutional Neural Network Accelerators.
Proceedings of the Advanced Computer Architecture - 12th Conference, 2018

Research of Configurable Hybrid Memory Architecture for Big Data Processing.
Proceedings of the Computer Engineering and Technology - 21st CCF Conference, 2017

ACCDSE: A Design Space Exploration Framework for Convolutional Neural Network Accelerator.
Proceedings of the Computer Engineering and Technology - 21st CCF Conference, 2017

SimpleBP: A Lightweight Branch Prediction Simulator for Effective Design Exploration.
Proceedings of the 2017 International Conference on Networking, Architecture, and Storage, 2017

BPSim: An integrated missrate, area, and power simulator for branch predictor.
Proceedings of the 6th International Conference on Modern Circuits and Systems Technologies, 2017

Laius: An 8-Bit Fixed-Point CNN Hardware Inference Engine.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Effective Optimization of Branch Predictors through Lightweight Simulation.
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

Design Space Exploration of TAGE Branch Predictor with Ultra-Small RAM.
Proceedings of the on Great Lakes Symposium on VLSI 2017, 2017

FixCaffe: Training CNN with Low Precision Arithmetic Operations by Fixed Point Caffe.
Proceedings of the Advanced Parallel Processing Technologies, 2017

An Area-Efficient DAMQ Buffer with Congestion Control Support.
J. Circuits Syst. Comput., 2016

The Macro-DSE for HPC Processing Unit: The Physical Constraints Perspective.
Proceedings of the Green, Pervasive, and Cloud Computing - 11th International Conference, 2016

Coarse Granularity Data Migration Based Power Management Mechanism for 3D DRAM Cache.
Proceedings of the Advanced Computer Architecture - 11th Conference, 2016

Modeling and Analyzing of 3D DRAM as L3 Cache Based on DRAMSim2.
Proceedings of the Computer Engineering and Technology - 19th CCF Conference, 2015

A Scalable and Fast Microprocessor Design Space Exploration Methodology.
Proceedings of the IEEE 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2015

Fast FPGA system for microarchitecture optimization on synthesizable modern processor design.
Proceedings of the 25th International Conference on Field Programmable Logic and Applications, 2015

Integrated Coherence Prediction: Towards Efficient Cache Coherence on NoC-Based Multicore Architectures.
ACM Trans. Design Autom. Electr. Syst., 2014

Analysis of worst-case backlog bounds for Networks-on-Chip.
J. Syst. Archit., 2014

Efficient Utilization of SIMD Engines for General-Purpose Processors.
Comput. J., 2014

Dynamic Streamization Model Execution for SIMD Engines on Multicore Architectures.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2013

Adaptive communication mechanism for accelerating MPI functions in NoC-based multicore processors.
ACM Trans. Archit. Code Optim., 2013

Stationary Distribution for the Mobilities in Catastrophe Rescue Scenario.
KSII Trans. Internet Inf. Syst., 2013

Closed Walk Ferry Route Design for Wireless Sensor Networks.
KSII Trans. Internet Inf. Syst., 2013

An Energy-efficient routing Protocol Based on Hotspot-aware uneven clustering and Dynamic Path Selection.
Proceedings of the 22nd Wireless and Optical Communication Conference, 2013

Dedicated-path protection algorithm for preventing high-powered optical crosstalk in Transparent Optical Networks.
Proceedings of the 22nd Wireless and Optical Communication Conference, 2013

Mapping and Optimizing 2-D Scientific Applications on a Stream Processor.
Proceedings of the Multimedia and Ubiquitous Engineering, 2013

Bandwidth allocation design to guarantee qos of differentiated services for a novel OFDMA-PON.
Proceedings of the 18th Asia-Pacific Conference on Communications, 2012

An energy efficient routing protocol for Wireless Sensor Network.
Proceedings of the 18th Asia-Pacific Conference on Communications, 2012

A Formalization of an Emulation Based Co-designed Virtual Machine.
Proceedings of the Fifth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, 2011

A novel shared-buffer router for network-on-chip based on Hierarchical Bit-line Buffer.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011

Heat flux analysis of space camera with thermal door.
Proceedings of the International Conference on Electronic and Mechanical Engineering and Information Technology, 2011

A Novel Chaining Approach to Indirect Control Transfer Instructions.
Proceedings of the Availability, Reliability and Security for Business, Enterprise and Health Information Systems, 2011

Exploiting Loop-Carried Stream Reuse for Scientific Computing Applications on the Stream Processor.
Int. J. Commun. Netw. Syst. Sci., 2010

Towards Low Delay Sub-Stream Scheduling.
Int. J. Comput. Commun. Control, 2010

TH-1: China's first petaflop supercomputer.
Frontiers Comput. Sci. China, 2010

A Novel Chaining Approach for Direct Control Transfer Instructions.
Proceedings of the 16th IEEE International Conference on Parallel and Distributed Systems, 2010

QoS scheduling for NoCs: Strict Priority Queueing versus Weighted Round Robin.
Proceedings of the 28th International Conference on Computer Design, 2010

Low Power Design for a Multi-core Multi-thread Microprocessor.
Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

Analyzing Credit-Based Router-to-Router Flow Control for On-Chip Networks.
IEICE Trans. Electron., 2009

An efficient stream memory architecture for heterogeneous multicore processor.
Proceedings of the 14th IEEE Symposium on Computers and Communications (ISCC 2009), 2009

DTM: Decoupled Hardware Transactional Memory to Support Unbounded Transaction and Operating System.
Proceedings of the ICPP 2009, 2009

BOIN: A novel Bufferless Optical Interconnection Network for high performance computer.
Proceedings of the 7th IEEE/ACS International Conference on Computer Systems and Applications, 2009

A Fault Tolerant Bufferless Optical Interconnection Network.
Proceedings of the 8th IEEE/ACIS International Conference on Computer and Information Science, 2009

Mapping and Optimizing 2-D Jacobi Iteration on a Stream Processor.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008

Study on distributing pattern of shared data in CC-NUMA system.
Proceedings of the 8th ACIS International Conference on Software Engineering, 2007

A Two-Level Directory Organization Solution for CC-NUMA Systems.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2007

FPGA Accelerating Algorithms of Active Shape Model in People Tracking Applications.
Proceedings of the Tenth Euromicro Conference on Digital System Design: Architectures, 2007

A New Hybrid Directory Scheme for Shared Memory Multi-processors.
Proceedings of the Computer Science, 2006

Configuration of the Galaxy Grid Node Environment.
Proceedings of the Grid and Cooperative Computing, 2004
