Sheng Ma

Orcid: 0000-0003-1710-4060

According to our database, Sheng Ma authored at least 166 papers between 1997 and 2025.

Bibliography

2025
RVAM16: a low-cost multiple-ISA processor based on RISC-V and ARM Thumb.
Frontiers Comput. Sci., January, 2025

2024
SAL: Optimizing the Dataflow of Spin-based Architectures for Lightweight Neural Networks.
ACM Trans. Archit. Code Optim., September, 2024

Inventory and Spatial Distribution of Landslides on the Eastern Slope of Gongga Mountain, Southwest China.
Remote. Sens., September, 2024

EPHA: An Energy-efficient Parallel Hybrid Architecture for ANNs and SNNs.
ACM Trans. Design Autom. Electr. Syst., May, 2024

SparGD: A Sparse GEMM Accelerator with Dynamic Dataflow.
ACM Trans. Design Autom. Electr. Syst., March, 2024

SAC: An Ultra-Efficient Spin-based Architecture for Compressed DNNs.
ACM Trans. Archit. Code Optim., March, 2024

A Survey of Design and Optimization for Systolic Array-based DNN Accelerators.
ACM Comput. Surv., January, 2024

Spin-NeuroMem: A Low-Power Neuromorphic Associative Memory Design Based on Spintronic Devices.
CoRR, 2024

AP-assisted adaptive video streaming in wireless networks with high-density clients.
Comput. Commun., 2024

Understanding and Mitigating the Soft Error of Contrastive Language-Image Pre-training Models.
Proceedings of the IEEE International Test Conference in Asia, 2024

The Self-adaptive and Topology-aware MPI_Bcast leveraging Collective offload on Tianhe Express Interconnect.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Cost-Effective Value Predictor for ILP processors through Design Space Exploration.
Proceedings of the Great Lakes Symposium on VLSI 2024, 2024

Sparm: A Sparse Matrix Multiplication Accelerator Supporting Multiple Dataflows.
Proceedings of the 35th IEEE International Conference on Application-specific Systems, 2024

2023
RHS-TRNG: A Resilient High-Speed True Random Number Generator Based on STT-MTJ Device.
IEEE Trans. Very Large Scale Integr. Syst., October, 2023

A Hybrid Kernel Pruning Approach for Efficient and Accurate CNNs.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2023

Absorb: Deadlock Resolution for 2.5D Modular Chiplet Based Systems.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2023

Optimizing the Parallelism of Communication and Computation in Distributed Training Platform.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2023

2022
Heterogeneous Systolic Array Architecture for Compact CNNs Hardware Accelerators.
IEEE Trans. Parallel Distributed Syst., 2022

Optimizing convolutional neural networks on multi-core vector accelerator.
Parallel Comput., 2022

RV16: An Ultra-Low-Cost Embedded RISC-V Processor Core.
J. Comput. Sci. Technol., 2022

A novel systolic array processor with dynamic dataflows.
Integr., 2022

Medial compartment cartilage repair and lower extremity biomechanical changes after single-plane high tibial osteotomy of distal tibial tuberosity.
Comput. Methods Programs Biomed., 2022

Three-dimensional surgical planning and clinical evaluation of the efficacy of distal tibial tuberosity high tibial osteotomy in obese patients with varus knee osteoarthritis.
Comput. Methods Programs Biomed., 2022

Stride Equality Prediction for Value Speculation.
IEEE Comput. Archit. Lett., 2022

Adaptive Low-Cost Loop Expansion for Modulo Scheduling.
Proceedings of the Network and Parallel Computing, 2022

SADD: A Novel Systolic Array Accelerator with Dynamic Dataflow for Sparse GEMM in Deep Learning.
Proceedings of the Network and Parallel Computing, 2022

Optimizing Winograd Convolution on GPUs via Partial Kernel Fusion.
Proceedings of the Network and Parallel Computing, 2022

SparG: A Sparse GEMM Accelerator for Deep Learning Applications.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2022

PipeFB: An Optimized Pipeline Parallelism Scheme to Reduce the Peak Memory Usage.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2022

Full-credit Flow Control: A Novel Technique to Implement Deadlock-free Adaptive Routing.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

2021
Configurable Multi-directional Systolic Array Architecture for Convolutional Neural Networks.
ACM Trans. Archit. Code Optim., 2021

A Memory Saving Mechanism Based on Data Transferring for Pipeline Parallelism.
Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, September 30, 2021

Co-designing the Topology/Algorithm to Accelerate Distributed Training.
Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, September 30, 2021

MiniMCTAD: Minimalist Monte Carlo Transport Architecture Design.
Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, September 30, 2021

HeSA: Heterogeneous Systolic Array Architecture for Compact CNNs Hardware Accelerators.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

2020
A Dynamic and Proactive GPU Preemption Mechanism Using Checkpointing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

DancerFly: An Order-Aware Network-on-Chip Router On-the-Fly Mitigating Multi-path Packet Reordering.
Int. J. Parallel Program., 2020

Medical imaging and diagnosis of subpatellar vertebrae based on improved Laplacian image enhancement algorithm.
Comput. Methods Programs Biomed., 2020

Distal tibial tuberosity high tibial osteotomy using an image enhancement technique for orthopedic scans in the treatment of medial compartment knee osteoarthritis.
Comput. Methods Programs Biomed., 2020

Accelerating Large-Scale Deep Convolutional Neural Networks on Multi-core Vector Accelerators.
Proceedings of the Network and Parallel Computing, 2020

CMSA: Configurable Multi-directional Systolic Array for Convolutional Neural Networks.
Proceedings of the 38th IEEE International Conference on Computer Design, 2020

2019
Coordinated DMA: Improving the DRAM Access Efficiency for Matrix Multiplication.
IEEE Trans. Parallel Distributed Syst., 2019

CD-Xbar: A Converge-Diverge Crossbar Network for High-Performance GPUs.
IEEE Trans. Computers, 2019

SIMD stealing: Architectural support for efficient data parallel execution on multicores.
Microprocess. Microsystems, 2019

Priority-Based PCIe Scheduling for Multi-Tenant Multi-GPU Systems.
IEEE Comput. Archit. Lett., 2019

MT-DMA: A DMA Controller Supporting Efficient Matrix Transposition for Digital Signal Processing.
IEEE Access, 2019

An Efficient Direct Memory Access (DMA) Controller for Scientific Computing Accelerators.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

A Dynamic Bypass Approach to Realize Power Efficient Network-on-Chip.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

The Implementation and Optimization of Parallel Linpack on Multi-Core Vector Accelerator.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

EVC-Based Power Gating Approach to Achieve Low-Power and High Performance NoC.
Proceedings of the 22nd Euromicro Conference on Digital System Design, 2019

Improving the DRAM Access Efficiency for Matrix Multiplication on Multicore Accelerators.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Surf-Bless: A Confined-interference Routing for Energy-Efficient Communication in NoCs.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

2018
The Design of NoC-Side Memory Access Scheduling for Energy-Efficient GPGPUs.
Int. J. Parallel Program., 2018

MTGIpick allows robust identification of genomic islands from a single genome.
Briefings Bioinform., 2018

DyCache: Dynamic Multi-Grain Cache Management for Irregular Memory Accesses on GPU.
IEEE Access, 2018

Accelerating BFS via Data Structure-Aware Prefetching on GPU.
IEEE Access, 2018

Performance Analysis of Different Convolution Algorithms in GPU Environment.
Proceedings of the 2018 IEEE International Conference on Networking, 2018

Improving Branch Prediction Accuracy on Multi-Core Architectures for Big Data.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2018

Adaptive VC Partitioning for NoCs in GPGPUs.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Accelerating CNNs Using Optimized Scheduling Strategy.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2018

VISU: A Simple and Efficient Cache Coherence Protocol Based on Self-updating.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2018

2017
A Phytoplankton Class-Specific Marine Primary Productivity Model Using MODIS Data.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2017

A high performance reliable NoC router.
Integr., 2017

Improving Branch Prediction for Thread Migration on Multi-core Architectures.
Proceedings of the Network and Parallel Computing, 2017

2016
A runtime fault-tolerant routing algorithm based on region flooding in NoCs.
Microprocess. Microsystems, 2016

Monitoring of sinking flux of ocean particulate organic carbon using remote sensing methods.
Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium, 2016

A heterogeneous low-cost and low-latency Ring-Chain network for GPGPUs.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016

DLL: A dynamic latency-aware load-balancing strategy in 2.5D NoC architecture.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016

A low-cost conflict-free NoC for GPGPUs.
Proceedings of the 53rd Annual Design Automation Conference, 2016

A high performance reliable NoC router.
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016

Overcoming and Analyzing the Bottleneck of Interposer Network in 2.5D NoC Architecture.
Proceedings of the Advanced Computer Architecture - 11th Conference, 2016

2015
Leaving One Slot Empty: Flit Bubble Flow Control for Torus Cache-Coherent NoCs.
IEEE Trans. Computers, 2015

Merging Satellite Ocean Color Data With Bayesian Maximum Entropy Method.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2015

Express Ring: a multi-layer and non-blocking NoC architecture.
IEICE Electron. Express, 2015

A New Memory Address Transformation for Continuous-Flow FFT Processors with SIMD Extension.
Proceedings of the Computer Engineering and Technology - 19th CCF Conference, 2015

Adaptive remaining hop count flow control: Consider the interaction between packets.
Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015

2014
FPGA Implementation of a Special-Purpose VLIW Structure for Double-Precision Elementary Function.
ACM Trans. Reconfigurable Technol. Syst., 2014

Novel Flow Control for Fully Adaptive Routing in Cache-Coherent NoCs.
IEEE Trans. Parallel Distributed Syst., 2014

Bathymetry Retrieval From Hyperspectral Remote Sensing Data in Optical-Shallow Water.
IEEE Trans. Geosci. Remote. Sens., 2014

Holistic Routing Algorithm Design to Support Workload Consolidation in NoCs.
IEEE Trans. Computers, 2014

Estimation of Marine Primary Productivity From Satellite-Derived Phytoplankton Absorption Data.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2014

An Improved Model for L-Band Brightness Temperature Estimation Over Foam-Covered Seas Under Low and Moderate Winds.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2014

A comprehensive comparison between virtual cut-through and wormhole routers for cache coherent Network on-Chips.
IEICE Electron. Express, 2014

An efficient floating-point multiplier for digital signal processors.
IEICE Electron. Express, 2014

Selective Extension of Routing Algorithms Based on Turn Model.
Proceedings of the 22nd Euromicro International Conference on Parallel, 2014

2012
Low-Cost Binary128 Floating-Point FMA Unit Design with SIMD Support.
IEEE Trans. Computers, 2012

Whole packet forwarding: Efficient design of fully adaptive routing algorithms for networks-on-chip.
Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

Supporting efficient collective communication in NoCs.
Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

2011
A practical low-latency router architecture with wing channel for on-chip network.
Microprocess. Microsystems, 2011

DBAR: an efficient routing algorithm to support multiple concurrent applications in networks-on-chip.
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

Some Remarks on Convergence in Credibility Measure and Convergence in Credibility Distribution of Fuzzy Variable.
Proceedings of the 2nd International Symposium on Intelligence Information Processing and Trusted Computing, 2011

2010
Wavelet Methods in Data Mining.
Proceedings of the Data Mining and Knowledge Discovery Handbook, 2nd ed., 2010

An Integrated Data-Driven Framework for Computing System Management.
IEEE Trans. Syst. Man Cybern. Part A, 2010

On combining multiple clusterings: an overview and a new perspective.
Appl. Intell., 2010

SIF: Overcoming the limitations of SIMD devices via implicit permutation.
Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010

2009
On Clustering Techniques.
Proceedings of the Encyclopedia of Data Warehousing and Mining, Second Edition (4 Volumes), 2009

Implementation of OpenVG Path and Paint Algorithms on Synchronous Data Triggered Architecture with Optimization.
Proceedings of the International Conference on Networking, Architecture, and Storage, 2009

2008
Top-k Correlation Computation.
INFORMS J. Comput., 2008

Guest editorial: special issue on temporal data mining: theory, algorithms and applications.
Data Min. Knowl. Discov., 2008

Periodic and Subharmonic Solutions for a Class of Local Nonquadratic Second-Order Hamiltonian Systems.
Proceedings of the International Conference on Computer Science and Software Engineering, 2008

2007
Locally adaptive metrics for clustering high dimensional data.
Data Min. Knowl. Discov., 2007

2006
TOP-COP: Mining TOP-K Strongly Correlated Pairs in Large Databases.
Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 2006), 2006

Recommendation on Item Graphs.
Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 2006), 2006

Fast Relevance Discovery in Time Series.
Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 2006), 2006

2005
Adaptive diagnosis in distributed systems.
IEEE Trans. Neural Networks, 2005

Mining logs files for data-driven system management.
SIGKDD Explor., 2005

Demand-driven frequent itemset mining using pattern structures.
Knowl. Inf. Syst., 2005

Automated Problem Determination using Call-Stack Matching.
J. Netw. Syst. Manag., 2005

Statistical Models for Unequally Spaced Time Series.
Proceedings of the 2005 SIAM International Conference on Data Mining, 2005

An integrated framework on mining logs files for computing system management.
Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2005

Data-driven monitoring design of service level and resource utilization.
Proceedings of the Integrated Network Management, 2005

Test-based diagnosis: tree and matrix representations.
Proceedings of the Integrated Network Management, 2005

Mining Logs Files for Computing System Management.
Proceedings of the Second International Conference on Autonomic Computing (ICAC 2005), 2005

PICCIL: Interactive Learning to Support Log File Categorization.
Proceedings of the Second International Conference on Autonomic Computing (ICAC 2005), 2005

Quickly Finding Known Software Problems via Automated Symptom Matching.
Proceedings of the Second International Conference on Autonomic Computing (ICAC 2005), 2005

Wavelet Methods in Data Mining.
Proceedings of the Data Mining and Knowledge Discovery Handbook, 2005

2004
Modeling Network Traffic in Wavelet Domain.
Comput. Sci. J. Moldova, 2004

Document clustering via adaptive subspace iteration.
SIGIR 2004: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2004

IFD: Iterative Feature and Data Clustering.
Proceedings of the Fourth SIAM International Conference on Data Mining, 2004

Subspace Clustering of High Dimensional Data.
Proceedings of the Fourth SIAM International Conference on Data Mining, 2004

Real-time problem determination in distributed systems using active probing.
Proceedings of the Managing Next Generation Convergence Networks and Services, 2004

Entropy-based criterion in categorical clustering.
Proceedings of the Machine Learning, 2004

Mining Temporal Patterns Without Predefined Time Windows.
Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), 2004

Generic Adapter Logging Toolkit.
Proceedings of the 1st International Conference on Autonomic Computing (ICAC 2004), 2004

On combining multiple clusterings.
Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management, 2004

2003
Critical event prediction for proactive management in large-scale computer clusters.
Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 24, 2003

Data-driven validation, completion and construction of event relationship networks.
Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 24, 2003

Active Probing Strategies for Problem Diagnosis in Distributed Systems.
Proceedings of the IJCAI-03, 2003

Is random model better? On its accuracy and efficiency.
Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM 2003), 2003

Clustering gene expression data in SQL using locally adaptive metrics.
Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, 2003

2002
Discovery in multi-attribute data with user-defined constraints.
SIGKDD Explor., 2002

Mining mutually dependent patterns for system management.
IEEE J. Sel. Areas Commun., 2002

Fast ordering of large categorical datasets for visualization.
Intell. Data Anal., 2002

Applying machine learning to automated information graphics generation.
IBM Syst. J., 2002

Predictive algorithms in the management of computer systems.
IBM Syst. J., 2002

Discovering actionable patterns in event data.
IBM Syst. J., 2002

Intelligent probing: A cost-effective approach to fault diagnosis in computer networks.
IBM Syst. J., 2002

Discovering Fully Dependent Patterns.
Proceedings of the Second SIAM International Conference on Data Mining, 2002

A Classification Approach for Prediction of Target Events in Temporal Sequences.
Proceedings of the Principles of Data Mining and Knowledge Discovery, 2002

Mining Associations by Pattern Structure in Large Relational Tables.
Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), 2002

Predicting Rare Events In Temporal Domains.
Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), 2002

User-directed Exploration of Mining Space with Multiple Attributes.
Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), 2002

Progressive and Interactive Analysis of Event Data Using Event Miner.
Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), 2002

Web Application Performance: Realistic Work Load for Stress Test.
Proceedings of the 28th International Computer Measurement Group Conference, 2002

Accuracy vs. Efficiency Trade-offs in Probabilistic Diagnosis.
Proceedings of the Eighteenth National Conference on Artificial Intelligence and Fourteenth Conference on Innovative Applications of Artificial Intelligence, July 28, 2002

2001
Modeling heterogeneous network traffic in wavelet domain.
IEEE/ACM Trans. Netw., 2001

Fast ordering of large categorical datasets for better visualization.
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, 2001

Towards Discovery of Event Correlation Rules.
Proceedings of the 2001 IEEE/IFIP International Symposium on Integrated Network Management, 2001

FARM: A Framework for Exploring Mining Spaces with Multiple Attributes.
Proceedings of the 2001 IEEE International Conference on Data Mining, 29 November, 2001

Mining Mutually Dependent Patterns.
Proceedings of the 2001 IEEE International Conference on Data Mining, 29 November, 2001

Mining Partially Periodic Event Patterns with Unknown Periods.
Proceedings of the 17th International Conference on Data Engineering, 2001

Rule Induction of Computer Events.
Proceedings of the Operations & Management, 2001

Optimizing Probe Selection for Fault Localization.
Proceedings of the Operations & Management, 2001

2000
Comparison of the independent wavelet models to network traffic.
Proceedings of the Global Telecommunications Conference, 2000. GLOBECOM 2000, San Francisco, CA, USA, 27 November, 2000

Scalable Visualization of Event Data.
Proceedings of the Services Management in Intelligent Networks, 2000

Mining Event Data for Actionable Patterns.
Proceedings of the 26th International Computer Measurement Group Conference, 2000

1999
Performance and efficiency: recent advances in supervised learning.
Proc. IEEE, 1999

Approximation Capability of Independent Wavelet Models to Heterogeneous Network Traffic.
Proceedings of IEEE INFOCOM '99, 1999

EventBrowser: A Flexible Tool for Scalable Analysis of Event Data.
Proceedings of the Active Technologies for Network and Service Management, 1999

1998
A unified approach on fast training of feedforward and recurrent networks using EM algorithm.
IEEE Trans. Signal Process., 1998

Fast training of recurrent networks based on the EM algorithm.
IEEE Trans. Neural Networks, 1998

Modeling video traffic using wavelets.
IEEE Commun. Lett., 1998

Modeling Video Traffic in the Wavelet Domain.
Proceedings of IEEE INFOCOM '98, The Conference on Computer Communications, Seventeenth Annual Joint Conference of the IEEE Computer and Communications Societies, Gateway to the 21st Century, San Francisco, CA, USA, March 29, 1998

1997
Combinations of weak classifiers.
IEEE Trans. Neural Networks, 1997

An Efficient EM-based Training Algorithm for Feedforward Neural Networks.
Neural Networks, 1997

Wavelet Models for Video Time-Series.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

