Edwin H.-M. Sha

Orcid: 0000-0001-5605-5631

Affiliations:
  • Chongqing University, Key Laboratory of Dependable Service Computing in Cyber Physical Society, China


According to our database, Edwin H.-M. Sha authored at least 434 papers between 1991 and 2025.

Bibliography

2025
MuDP: multi-granularity data placement for uniform loops on SPM-DRAM architectures to minimize latency.
Frontiers Comput. Sci., May, 2025

2024
Revisiting TRIM on High-Density Flash-Based Hybrid Storage Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., May, 2024

QuanPath: achieving one-step communication for distributed quantum circuit simulation.
Quantum Inf. Process., January, 2024

Ensuring consistent recovery under power failure with minimal NVM write overhead.
J. Syst. Archit., 2024

An efficient flattened index structure with lazy restructuring and hotness awareness.
Future Gener. Comput. Syst., 2024

Eliminate Critical Fragmentation of F2FS in Mobile Devices with Controller Co-Design.
Proceedings of the 13th Non-Volatile Memory Systems and Applications Symposium, 2024

Sparrow: Flexible Memory Deduplication in Android Systems with Similar-Page Awareness.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

2023
Efficient algorithm for full-state quantum circuit simulation with DD compression while maintaining accuracy.
Quantum Inf. Process., November, 2023

IOSR: Improving I/O Efficiency for Memory Swapping on Mobile Devices Via Scheduling and Reshaping.
ACM Trans. Embed. Comput. Syst., October, 2023

V-WAFA: An Endurance Variation Aware Fine-Grained Allocator for Persistent Memory.
IEEE Trans. Computers, April, 2023

Optimizing Data Placement for Hybrid SRAM+Racetrack Memory SPM in Embedded Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., March, 2023

Loop interchange and tiling for multi-dimensional loops to minimize write operations on NVMs.
J. Syst. Archit., February, 2023

Hardware-aware neural architecture search for stochastic computing-based neural networks on tiny devices.
J. Syst. Archit., February, 2023

Rapid recovery of program execution under power failures for embedded systems with NVM.
Microprocess. Microsystems, 2023

A Prototype of Efficient Learning System for Objective-Driven Learners.
Proceedings of the 12th IEEE International Conference on Educational and Information Technology, 2023

FlashDAM: Flexible I/O Throttling for the User Experience of Mobile Systems.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

Optimizing Data Layout for Racetrack Memory in Embedded Systems.
Proceedings of the 28th Asia and South Pacific Design Automation Conference, 2023

2022
Tail Latency Optimization for LDPC-Based High-Density and Low-Cost Flash Memory Devices.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Transient computing for energy harvesting systems: A survey.
J. Syst. Archit., 2022

Read latency variation aware performance optimization on high-density NAND flash based storage systems.
CCF Trans. High Perform. Comput., 2022

Fairness Scheduling for Tasks with Different Real-time Level on Heterogeneous Systems.
Proceedings of the 28th IEEE International Conference on Parallel and Distributed Systems, 2022

Pseudo-Log: Restore Global Data Facing Power Failures with Minimum NVM Write.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

Efficient Checkpoint under Unstable Power Supplies on NVM based Devices.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

CDB: critical data backup design for consumer devices with high-density flash based hybrid storage.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Optimal Loop Tiling for Minimizing Write Operations on NVMs with Complete Memory Latency Hiding.
Proceedings of the 27th Asia and South Pacific Design Automation Conference, 2022

BSC: Block-based Stochastic Computing to Enable Accurate and Efficient TinyML.
Proceedings of the 27th Asia and South Pacific Design Automation Conference, 2022

2021
On the Design of Minimal-Cost Pipeline Systems Satisfying Hard/Soft Real-Time Constraints.
IEEE Trans. Emerg. Top. Comput., 2021

Exploring Efficient Architectures on Remote In-Memory NVM over RDMA.
ACM Trans. Embed. Comput. Syst., 2021

Contour: A Process Variation Aware Wear-Leveling Mechanism for Inodes of Persistent Memory File Systems.
IEEE Trans. Computers, 2021

Optimizing the data placement and scheduling on multi-port DWM in multi-core embedded system.
J. Syst. Archit., 2021

Performance optimization for parallel systems with shared DWM via retiming, loop scheduling, and data placement.
J. Syst. Archit., 2021

An Empirical Study of NVM-based File System.
Proceedings of the 10th IEEE Non-Volatile Memory Systems and Applications Symposium, 2021

Understanding and Optimizing Hybrid SSD with High-Density and Low-Cost Flash Memory.
Proceedings of the 39th IEEE International Conference on Computer Design, 2021

Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

Relaxed Placement: Minimizing Shift Operations for Racetrack Memory in Hybrid SPM.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

SFP: Smart File-Aware Prefetching for Flash based Storage Systems.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

Accommodating Transformer onto FPGA: Coupling the Balanced Model Compression and FPGA-Implementation Optimization.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

SAC: A Stream Aware Write Cache Scheme for Multi-Streamed Solid State Drives.
Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

2020
Hardware/Software Co-Exploration of Neural Architectures.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Multigranularity Space Management Scheme for Accelerating the Write Performance of In-Memory File Systems.
IEEE Syst. J., 2020

Optimizing synchronization mechanism for block-based file systems using persistent memory.
Future Gener. Comput. Syst., 2020

Towards the design of efficient hash-based indexing scheme for growing databases on non-volatile memory.
Future Gener. Comput. Syst., 2020

HydraFS: an efficient NUMA-aware in-memory file system.
Clust. Comput., 2020

Architectural Exploration on Racetrack Memories.
Proceedings of the 33rd IEEE International System-on-Chip Conference, 2020

A Zero-Energy Consumption Scheme for System Suspend to Limited NVM.
Proceedings of the 9th Non-Volatile Memory Systems and Applications Symposium, 2020

Optimizing Data Placement for Hybrid SPM with SRAM and Racetrack Memory.
Proceedings of the 38th IEEE International Conference on Computer Design, 2020

Unified-TP: A Unified TLB and Page Table Cache Structure for Efficient Address Translation.
Proceedings of the 38th IEEE International Conference on Computer Design, 2020

Latency Variation Aware Read Performance Optimization on 3D High Density NAND Flash Memory.
Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

Optimizing Performance of Persistent Memory File Systems using Virtual Superpages.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

Efficient Multi-Grained Wear Leveling for Inodes of Persistent Memory File Systems.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Access Characteristic Guided Partition for Read Performance Improvement on Solid State Drives.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Co-Exploring Neural Architecture and Network-on-Chip Design for Real-Time Artificial Intelligence.
Proceedings of the 25th Asia and South Pacific Design Automation Conference, 2020

2019
Achieving Super-Linear Speedup across Multi-FPGA for Real-Time DNN Inference.
ACM Trans. Embed. Comput. Syst., 2019

On the Design of Time-Constrained and Buffer-Optimal Self-Timed Pipelines.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

Hardware/Software Co-Exploration of Neural Architectures.
CoRR, 2019

Optimizing fragmentation and segment cleaning for CPS based storage devices.
Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, 2019

Optimizing Tail Latency of LDPC based Flash Memory Storage Systems Via Smart Refresh.
Proceedings of the 2019 IEEE International Conference on Networking, 2019

1+1>2: variation-aware lifetime enhancement for embedded 3D NAND flash systems.
Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, 2019

XFER: A Novel Design to Achieve Super-Linear Performance on Multiple FPGAs for Real-Time AI.
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

Accuracy vs. Efficiency: Achieving Both through FPGA-Implementation Aware Neural Architecture Search.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

A Wear-Leveling-Aware Fine-Grained Allocator for Non-Volatile Memory.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

2018
Write Energy Reduction for PCM via Pumping Efficiency Improvement.
ACM Trans. Storage, 2018

Exploiting Chip Idleness for Minimizing Garbage Collection-Induced Chip Access Conflict on SSDs.
ACM Trans. Design Autom. Electr. Syst., 2018

Heterogeneous FPGA-Based Cost-Optimal Design for Timing-Constrained CNNs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

Exploiting Parallelism for Access Conflict Minimization in Flash-Based Solid State Drives.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

Towards the Design of Efficient and Consistent Index Structure with Minimal Write Activities for Non-Volatile Memory.
IEEE Trans. Computers, 2018

In-page Wear-leveling Memory Management Based on Non-volatile Memory (in Chinese).
计算机科学 (Computer Science), 2018

UMFS: An efficient user-space file system for non-volatile memory.
J. Syst. Archit., 2018

Synthesizing distributed pipelining systems with timing constraints via optimal functional unit assignment and communication selection.
J. Comput. Sci., 2018

DWARM: A wear-aware memory management scheme for in-memory file systems.
Future Gener. Comput. Syst., 2018

Write-Aware Data Allocation on Heterogeneous Memory Architecture with Minimum Cost.
Proceedings of the 24th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, 2018

An Efficient File System for Hybrid In-Memory NVM and Block Devices.
Proceedings of the IEEE 7th Non-Volatile Memory Systems and Applications Symposium, 2018

On the Design of Reliable Heterogeneous Systems via Checkpoint Placement and Core Assignment.
Proceedings of the 2018 on Great Lakes Symposium on VLSI, 2018

An Efficient Cache Management Scheme for Capacitor Equipped Solid State Drives.
Proceedings of the 2018 on Great Lakes Symposium on VLSI, 2018

Efficient wear leveling for inodes of file systems on persistent memories.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Energy, latency, and lifetime improvements in MLC NVM with enhanced WOM code.
Proceedings of the 23rd Asia and South Pacific Design Automation Conference, 2018

2017
Building NVRAM-Aware Swapping Through Code Migration in Mobile Devices.
IEEE Trans. Parallel Distributed Syst., 2017

Durable Address Translation in PCM-Based Flash Storage Systems.
IEEE Trans. Parallel Distributed Syst., 2017

Optimal Functional-Unit Assignment for Heterogeneous Systems Under Timing Constraint.
IEEE Trans. Parallel Distributed Syst., 2017

FoToNoC: A Folded Torus-Like Network-on-Chip Based Many-Core Systems-on-Chip in the Dark Silicon Era.
IEEE Trans. Parallel Distributed Syst., 2017

CrowdDeliver: Planning City-Wide Package Delivery Paths Leveraging the Crowd of Taxis.
IEEE Trans. Intell. Transp. Syst., 2017

Asymmetric Error Rates of Cell States Exploration for Performance Improvement on Flash Memory Based Storage Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

Research on Data Consistency for In-memory File Systems (in Chinese).
计算机科学 (Computer Science), 2017

Revisiting swapping in mobile systems with SwapBench.
Future Gener. Comput. Syst., 2017

Hardware-software collaboration for dark silicon heterogeneous many-core systems.
Future Gener. Comput. Syst., 2017

Refinery swap: An efficient swap mechanism for hybrid DRAM-NVM systems.
Future Gener. Comput. Syst., 2017

Efficient assignment algorithms to minimize operation cost for supply chain networks in agile manufacturing.
Comput. Ind. Eng., 2017

BOSS: An Efficient Data Distribution Strategy for Object Storage Systems With Hybrid Devices.
IEEE Access, 2017

Towards the design of optimal range assignment for elevator groups under fluctuant traffic loads.
Proceedings of the 23rd IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, 2017

Improving read performance via selective Vpass reduction on high density 3D NAND flash memory.
Proceedings of the IEEE 6th Non-Volatile Memory Systems and Applications Symposium, 2017

UDORN: A design framework of persistent in-memory key-value database for NVM.
Proceedings of the IEEE 6th Non-Volatile Memory Systems and Applications Symposium, 2017

Optimal functional unit assignment and voltage selection for pipelined MPSoC with guaranteed probability on time performance.
Proceedings of the 18th ACM SIGPLAN/SIGBED Conference on Languages, 2017

An Efficient Racetrack Memory-Based Processing-in-Memory Architecture for Convolutional Neural Networks.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Efficient Task Assignment and Scheduling on MPSOC with STT-RAM Based Hybrid SPMs Considering Data Allocation.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Exploiting Process Variation for Read Performance Improvement on LDPC Based Flash Memory Storage Systems.
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

A PV aware data placement scheme for read performance improvement on LDPC based flash memory: work-in-progress.
Proceedings of the Twelfth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis Companion, 2017

Solving dynamic vehicle routing problem via evolutionary search with learning capability.
Proceedings of the 2017 IEEE Congress on Evolutionary Computation, 2017

Improving LDPC performance via asymmetric sensing level placement on flash memory.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

Dark silicon-aware hardware-software collaborated design for heterogeneous many-core systems.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

2016
Applying trust enhancements to reactive routing protocols in mobile ad hoc networks.
Wirel. Networks, 2016

Properties of Self-Timed Ring Architectures for Deadlock-Free and Consistent Configuration Reaching Maximum Throughput.
J. Signal Process. Syst., 2016

Data Allocation with Minimum Cost under Guaranteed Probability for Multiple Types of Memories.
J. Signal Process. Syst., 2016

Application Mapping and Scheduling for Network-on-Chip-Based Multiprocessor System-on-Chip With Fine-Grain Communication Optimization.
IEEE Trans. Very Large Scale Integr. Syst., 2016

Exploiting Process Variation for Write Performance Improvement on NAND Flash Memory Storage Systems.
IEEE Trans. Very Large Scale Integr. Syst., 2016

Efficient Data Placement for Improving Data Access Performance on Domain-Wall Memory.
IEEE Trans. Very Large Scale Integr. Syst., 2016

Quality-of-Experience-Oriented Autonomous Intersection Control in Vehicular Networks.
IEEE Trans. Intell. Transp. Syst., 2016

Energy-Efficient In-Memory Paging for Smartphones.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

Retention Trimming for Lifetime Improvement of Flash Memory Storage Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

Morphable Resistive Memory Optimization for Mobile Virtualization.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

A Time, Energy, and Area Efficient Domain Wall Memory-Based SPM for Embedded Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

A New Design of In-Memory File System Based on File Virtual Address Framework.
IEEE Trans. Computers, 2016

Write reconstruction for write throughput improvement on MLC PCM based main memory.
J. Syst. Archit., 2016

A compiler assisted wear leveling for morphable PCM in embedded systems.
J. Syst. Archit., 2016

A unified framework for designing high performance in-memory and hybrid memory file systems.
J. Syst. Archit., 2016

Light-weight trust-enhanced on-demand multi-path routing in mobile ad hoc networks.
J. Netw. Comput. Appl., 2016

Worst-Case Finish Time Analysis for DAG-Based Applications in the Presence of Transient Faults.
J. Comput. Sci. Technol., 2016

A Convex Optimization Based Autonomous Intersection Control Strategy in Vehicular Cyber-Physical Systems.
Proceedings of the 2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, 2016

Evolutionary multitasking in combinatorial search spaces: A case study in capacitated vehicle routing problem.
Proceedings of the 2016 IEEE Symposium Series on Computational Intelligence, 2016

Performance Optimization for In-Memory File Systems on NUMA Machines.
Proceedings of the 17th International Conference on Parallel and Distributed Computing, 2016

The design and implementation of an efficient user-space in-memory file system.
Proceedings of the 5th Non-Volatile Memory Systems and Applications Symposium, 2016

Minimizing cell-to-cell interference by exploiting differential bit impact characteristics of scaled MLC NAND flash memories.
Proceedings of the 5th Non-Volatile Memory Systems and Applications Symposium, 2016

Optimizing Data Placement of MapReduce on Ceph-Based Framework under Load-Balancing Constraint.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

Optimal Functional Assignment and Communication Selection under Timing Constraint for Self-Timed Pipelines.
Proceedings of the 13th International Conference on Embedded Software and Systems, 2016

The Design and Implementation of an Efficient Data Consistency Mechanism for In-Memory File Systems.
Proceedings of the 13th International Conference on Embedded Software and Systems, 2016

Cooperative Information Services Based on Predictable Trajectories in Bus-VANETs.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

Access Characteristic Guided Read and Write Cost Regulation for Performance Improvement on Flash Memory.
Proceedings of the 14th USENIX Conference on File and Storage Technologies, 2016

The design of an efficient swap mechanism for hybrid DRAM-NVM systems.
Proceedings of the 2016 International Conference on Embedded Software, 2016

Optimal functional-unit assignment and buffer placement for probabilistic pipelines.
Proceedings of the Eleventh IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2016

A preliminary study on distance selection in probabilistic memetic framework for capacitated arc routing problem.
Proceedings of the IEEE Congress on Evolutionary Computation, 2016

The Design and Implementation of a High-Performance Hybrid Memory File System.
Proceedings of the International Conference on Advanced Cloud and Big Data, 2016

ApproxMap: On task allocation and scheduling for resilient applications.
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016

FoToNoC: A hierarchical management strategy based on folded torus-like Network-on-Chip for dark silicon many-core systems.
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016

2015
Reliability-Guaranteed Task Assignment and Scheduling for Heterogeneous Multiprocessors Considering Timing Constraint.
J. Signal Process. Syst., 2015

Low Overhead Software Wear Leveling for Hybrid PCM + DRAM Main Memory on Embedded Systems.
IEEE Trans. Very Large Scale Integr. Syst., 2015

Optimizing Task and Data Assignment on Multi-Core Systems with Multi-Port SPMs.
IEEE Trans. Parallel Distributed Syst., 2015

Power Efficiency for Hardware/Software Partitioning with Time and Area Constraints on MPSoC.
Int. J. Parallel Program., 2015

Designing an efficient persistent in-memory file system.
Proceedings of the IEEE Non-Volatile Memory System and Applications Symposium, 2015

Mixer: software enabled wear leveling for morphable PCM in embedded systems.
Proceedings of the IEEE Non-Volatile Memory System and Applications Symposium, 2015

Vehicle Assisted Data Update for Temporal Information Service in Vehicular Networks.
Proceedings of the IEEE 18th International Conference on Intelligent Transportation Systems, 2015

Efficient Scheduling with Intensive In-Memory File Accesses Considering Bandwidth Constraint on Memory Bus.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

An Efficient Cluster-Based Data Sharing Algorithm for Bidirectional Road Scenario in Vehicular Ad-hoc Networks.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

Prevent Deadlock and Remove Blocking for Self-Timed Systems.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

SwapBench: The Easy Way to Demystify Swapping in Mobile Systems.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Traffic-Aware Application Mapping for Network-on-Chip Based Multiprocessor System-on-Chip.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

On the Design of High-Performance and Energy-Efficient Probabilistic Self-Timed Systems.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Realistic Task Parallelization of the H.264 Decoding Algorithm for Multiprocessors.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

An Efficient Technique for Chip Temperature Optimization of Multiprocessor Systems in the Dark Silicon Era.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

User Experience Enhanced Task Scheduling and Processor Frequency Scaling for Energy-Sensitive Mobile Devices.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

nCode: limiting harmful writes to emerging mobile NVRAM through code swapping.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Maximizing IO performance via conflict reduction for flash memory storage systems.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Area and performance co-optimization for domain wall memory in application-specific embedded systems.
Proceedings of the 52nd Annual Design Automation Conference, 2015

Optimizing data placement for reducing shift operations on domain wall memories.
Proceedings of the 52nd Annual Design Automation Conference, 2015

Balloonfish: Utilizing morphable resistive memory in mobile virtualization.
Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015

2014
Minimizing System Cost with Efficient Task Assignment on Heterogeneous Multicore Processors Considering Time Constraint.
IEEE Trans. Parallel Distributed Syst., 2014

Scheduling Temporal Data with Dynamic Snapshot Consistency Requirement in Vehicular Cyber-Physical Systems.
ACM Trans. Embed. Comput. Syst., 2014

Management and optimization for nonvolatile memory-based hybrid scratchpad memory on multicore embedded processors.
ACM Trans. Embed. Comput. Syst., 2014

Application-Specific Wear Leveling for Extending Lifetime of Phase Change Memory in Embedded Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2014

Scheduling to Optimize Cache Utilization for Non-Volatile Main Memories.
IEEE Trans. Computers, 2014

Hybrid particle swarm optimization for parameter estimation of Muskingum model.
Neural Comput. Appl., 2014

Applying link stability estimation mechanism to multicast routing in MANETs.
J. Syst. Archit., 2014

A space allocation and reuse strategy for PCM-based embedded systems.
J. Syst. Archit., 2014

Scan-Based Attack on Stream Ciphers: A Case Study on eSTREAM Finalists.
J. Comput. Sci. Technol., 2014

A Partition-based Mechanism for Reducing Energy in Phase Change Memory.
J. Comput., 2014

Optimizing Data Distribution for Loops on Embedded Multicore with Scratch-Pad Memory.
J. Comput., 2014

Estimating parameters of Muskingum Model using an Adaptive Hybrid PSO Algorithm.
Int. J. Pattern Recognit. Artif. Intell., 2014

Efficient fault-tolerant scheduling on multiprocessor systems via replication and deallocation.
Int. J. Embed. Syst., 2014

Research of trust model based on fuzzy theory in mobile ad hoc networks.
IET Inf. Secur., 2014

Efficient grouping-based mapping and scheduling on heterogeneous cluster architectures.
Comput. Electr. Eng., 2014

Taxi Exp: A Novel Framework for City-Wide Package Express Shipping via Taxi Crowd Sourcing.
Proceedings of the 2014 IEEE 11th Intl Conf on Ubiquitous Intelligence and Computing and 2014 IEEE 11th Intl Conf on Autonomic and Trusted Computing and 2014 IEEE 14th Intl Conf on Scalable Computing and Communications and Its Associated Workshops, 2014

Messages from the conference chairs.
Proceedings of the 2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications, 2014

Energy efficient routing techniques with guaranteed reliability based on multi-level uncertain graph.
Proceedings of the 2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications, 2014

On self-timed ring for consistent mapping and maximum throughput.
Proceedings of the 2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications, 2014

Minimum-cost data allocation with guaranteed probability on multiple types of memory.
Proceedings of the 2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications, 2014

Joint Convergecast and Power Allocation in Wireless Sensor Networks.
Proceedings of the 15th International Conference on Parallel and Distributed Computing, 2014

Exploiting parallelism in I/O scheduling for access conflict minimization in flash-based solid state drives.
Proceedings of the IEEE 30th Symposium on Mass Storage Systems and Technologies, 2014

An Improved Thermal Model for Static Optimization of Application Mapping and Scheduling in Multiprocessor System-on-Chip.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2014

DR. Swap: energy-efficient paging for smartphones.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014

Exploit asymmetric error rates of cell states to improve the performance of flash memory storage systems.
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

Building high-performance smartphones via non-volatile memory: The swap approach.
Proceedings of the 2014 International Conference on Embedded Software, 2014

Retention Trimming for Wear Reduction of Flash Memory Storage Systems.
Proceedings of the 51st Annual Design Automation Conference 2014, 2014

Efficient feasibility analysis of DAG scheduling with real-time constraints in the presence of faults.
Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

2013
Optimizing Data Placement of Loops for Energy Minimization with Multiple Types of Memories.
J. Signal Process. Syst., 2013

SAFE: A Source Deduplication Framework for Efficient Cloud Backup Services.
J. Signal Process. Syst., 2013

Algorithms to Minimize Data Transfer for Code Update on Wireless Sensor Network.
J. Signal Process. Syst., 2013

Efficient Loop Scheduling for Chip Multiprocessors with Non-Volatile Main Memory.
J. Signal Process. Syst., 2013

Data Allocation Optimization for Hybrid Scratch Pad Memory With SRAM and Nonvolatile Memory.
IEEE Trans. Very Large Scale Integr. Syst., 2013

Write activity reduction on non-volatile main memories for embedded chip multiprocessors.
ACM Trans. Embed. Comput. Syst., 2013

Data Placement and Duplication for Embedded Multicore Systems With Scratch Pad Memory.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2013

Energy-aware preemptive scheduling algorithm for sporadic tasks on DVS platform.
Microprocess. Microsystems, 2013

Minimizing accumulative memory load cost on multi-core DSPs with multi-level memory.
J. Syst. Archit., 2013

Accurate age counter for wear leveling on non-volatile based main memory.
Des. Autom. Embed. Syst., 2013

A content-aware writing mechanism for reducing energy on non-volatile memory based embedded storage systems.
Des. Autom. Embed. Syst., 2013

Effective file data-block placement for different types of page cache on hybrid main memory architectures.
Des. Autom. Embed. Syst., 2013

Impact of trust model on on-demand multi-path routing in mobile ad hoc networks.
Comput. Commun., 2013

Trust prediction and trust-based source routing in mobile ad hoc networks.
Ad Hoc Networks, 2013

Optimal data allocation algorithm for loop-centric applications on scratch-pad memories.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2013

Optimizing task assignment for heterogeneous multiprocessor system with guaranteed reliability and timing constraint.
Proceedings of the 2013 IEEE 19th International Conference on Embedded and Real-Time Computing Systems and Applications, 2013

A space-based wear leveling for PCM-based embedded systems.
Proceedings of the 2013 IEEE 19th International Conference on Embedded and Real-Time Computing Systems and Applications, 2013

Efficient task assignment and scheduling for MPSoC DSPS with VS-SPM considering concurrent accesses through data allocation.
Proceedings of the IEEE International Conference on Acoustics, 2013

Software enabled wear-leveling for hybrid PCM main memory on embedded systems.
Proceedings of the Design, Automation and Test in Europe, 2013

Curling-PCM: Application-specific wear leveling for phase change memory based embedded systems.
Proceedings of the 18th Asia and South Pacific Design Automation Conference, 2013

2012
Minimizing Access Cost for Multiple Types of Memory Units in Embedded Systems Through Data Allocation and Scheduling.
IEEE Trans. Signal Process., 2012

Randomized execution algorithms for smart cards to resist power analysis attacks.
J. Syst. Archit., 2012

Memory access schedule minimization for embedded systems.
J. Syst. Archit., 2012

A hierarchical reliability-driven scheduling algorithm in grid systems.
J. Parallel Distributed Comput., 2012

General Loop Fusion Technique with Improved Timing Performance and Minimal Code Size.
Int. J. Comput. Their Appl., 2012

Node trust evaluation in mobile ad hoc networks based on multi-dimensional fuzzy and Markov SCGM(1, 1) model.
Comput. Commun., 2012

Reducing the De-linearization of Data Placement to Improve Deduplication Performance.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Optimizing Data Allocation for Loops on Embedded Systems with Scratch-Pad Memory.
Proceedings of the 2012 IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, 2012

Optimal Assignment for Tree-Structure Task Graph on Heterogeneous Multicore Systems Considering Time Constraint.
Proceedings of the IEEE 6th International Symposium on Embedded Multicore/Manycore SoCs, 2012

Optimizing Data Allocation and Memory Configuration for Non-Volatile Memory Based Hybrid SPM on Embedded CMPs.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Loop scheduling optimization for chip-multiprocessors with non-volatile main memory.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Efficient Task Assignment on Heterogeneous Multicore Systems Considering Communication Overhead.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2012

PRR: A low-overhead cache replacement algorithm for embedded processors.
Proceedings of the 17th Asia and South Pacific Design Automation Conference, 2012

MGC: Multiple graph-coloring for non-volatile memory based hybrid Scratchpad Memory.
Proceedings of the 16th Workshop on Interaction between Compilers and Computer Architectures, 2012

2011
Loop Distribution and Fusion with Timing and Code Size Optimization.
J. Signal Process. Syst., 2011

Energy-Efficient Joint Scheduling and Application-Specific Interconnection Design.
IEEE Trans. Very Large Scale Integr. Syst., 2011

Overhead-aware energy optimization for real-time streaming applications on multiprocessor System-on-Chip.
ACM Trans. Design Autom. Electr. Syst., 2011

2011 ACM TODAES best paper award.
ACM Trans. Design Autom. Electr. Syst., 2011

Write Activity Minimization for Nonvolatile Main Memory Via Scheduling and Recomputation.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2011

Variable assignment and instruction scheduling for processor with multi-module memory.
Microprocess. Microsystems, 2011

Preface.
J. Comput. Sci. Technol., 2011

Optimal Data Placement for Memory Architectures with Scratch-Pad Memories.
Proceedings of the IEEE 10th International Conference on Trust, 2011

Adaptive and Cost-Optimal Parallel Algorithm for the 0-1 Knapsack Problem.
Proceedings of the 19th International Euromicro Conference on Parallel, 2011

Optimal Data Allocation for Scratch-Pad Memory on Embedded Multi-core Systems.
Proceedings of the International Conference on Parallel Processing, 2011

A Novel Energy-Aware Fault Tolerance Mechanism for Wireless Sensor Networks.
Proceedings of the 2011 IEEE/ACM International Conference on Green Computing and Communications (GreenCom), 2011

Towards energy efficient hybrid on-chip Scratch Pad Memory with non-volatile memory.
Proceedings of the Design, Automation and Test in Europe, 2011

2010
Variable Partitioning and Scheduling for MPSoC with Virtually Shared Scratch Pad Memory.
J. Signal Process. Syst., 2010

Variable Length Pattern Matching for Hardware Network Intrusion Detection System.
J. Signal Process. Syst., 2010

Dynamic and Leakage Energy Minimization With Soft Real-Time Loop Scheduling and Voltage Assignment.
IEEE Trans. Very Large Scale Integr. Syst., 2010

Iterational retiming with partitioning: Loop scheduling with complete memory latency hiding.
ACM Trans. Embed. Comput. Syst., 2010

Algorithms for Optimally Arranging Multicore Memory Structures.
EURASIP J. Embed. Syst., 2010

Optimal scheduling to minimize non-volatile memory access time with hardware cache.
Proceedings of the 18th IEEE/IFIP VLSI-SoC 2010, 2010

Minimizing write activities to non-volatile memory via scheduling and recomputation.
Proceedings of the IEEE 8th Symposium on Application Specific Processors, 2010

Write activity reduction on flash main memory via smart victim cache.
Proceedings of the 20th ACM Great Lakes Symposium on VLSI 2009, 2010

Reducing write activities on non-volatile memories in embedded CMPs via data migration and recomputation.
Proceedings of the 47th Design Automation Conference, 2010

Energy efficient joint scheduling and multi-core interconnect design.
Proceedings of the 15th Asia South Pacific Design Automation Conference, 2010

Co-optimization of memory access and task scheduling on MPSoC architectures with multi-level memory.
Proceedings of the 15th Asia South Pacific Design Automation Conference, 2010

2009
Optimizing scheduling and intercluster connection for application-specific DSP processors.
IEEE Trans. Signal Process., 2009

Cost minimization while satisfying hard/soft timing constraints for heterogeneous embedded systems.
ACM Trans. Design Autom. Electr. Syst., 2009

Loop scheduling and bank type assignment for heterogeneous multi-bank memory.
J. Parallel Distributed Comput., 2009

Optimizing parallelism for nested loops with iterational and instructional retiming.
J. Embed. Comput., 2009

Energy minimization for heterogeneous wireless sensor networks.
J. Embed. Comput., 2009

Fast and Noniterative Scheduling in Input-Queued Switches.
Int. J. Commun. Netw. Syst. Sci., 2009

Fast and noniterative scheduling in input-queued switches: Supporting QoS.
Comput. Commun., 2009

Heterogeneous real-time embedded software optimization considering hardware platform.
Proceedings of the 2009 ACM Symposium on Applied Computing (SAC), 2009

Reprogramming with Minimal Transferred Data on Wireless Sensor Network.
Proceedings of the IEEE 6th International Conference on Mobile Adhoc and Sensor Systems, 2009

Global Variable Partition with Virtually Shared Scratch Pad Memory to Minimize Schedule Length.
Proceedings of the ICPPW 2009, 2009

Energy Minimization and Latency Hiding for Heterogeneous Parallel Memory.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

Minimizing Memory Access Schedule for Memories.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

Joint Sleep Scheduling and Mode Assignment in Wireless Cyber-Physical Systems.
Proceedings of the 29th IEEE International Conference on Distributed Computing Systems Workshops (ICDCS 2009 Workshops), 2009

Rotation Scheduling and Voltage Assignment to Minimize Energy for SoC.
Proceedings of the 12th IEEE International Conference on Computational Science and Engineering, 2009

ILP optimal scheduling for multi-module memory.
Proceedings of the 7th International Conference on Hardware/Software Codesign and System Synthesis, 2009

Loop Fusion Technique with Minimal Memory Cost via Retiming.
Proceedings of the ISCA 24th International Conference on Computers and Their Applications, 2009

Computation and data transfer co-scheduling for interconnection bus minimization.
Proceedings of the 14th Asia South Pacific Design Automation Conference, 2009

2008
Guest Editorial: Special Issue on Design and Programming of Signal Processors for Multimedia Communication.
J. Signal Process. Syst., 2008

Optimized Address Assignment With Array and Loop Transformations for Minimizing Schedule Length.
IEEE Trans. Circuits Syst. I Regul. Pap., 2008

Timing optimization via nest-loop pipelining considering code size.
Microprocess. Microsystems, 2008

Energy minimization with loop fusion and multi-functional-unit scheduling for multidimensional DSP.
J. Parallel Distributed Comput., 2008

Adaptive attenuation factor model for localization in wireless sensor networks.
Int. J. Pervasive Comput. Commun., 2008

Minimizing Transferred Data for Code Update on Wireless Sensor Network.
Proceedings of the Wireless Algorithms, 2008

Energy Efficient Operating Mode Assignment for Real-Time Tasks in Wireless Embedded Systems.
Proceedings of the Fourteenth IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, 2008

Address assignment sensitive variable partitioning and scheduling for DSPs with multiple memory banks.
Proceedings of the IEEE International Conference on Acoustics, 2008

Failure Rate Minimization with Multiple Function Unit Scheduling for Heterogeneous WSNs.
Proceedings of the Global Communications Conference, 2008. GLOBECOM 2008, New Orleans, LA, USA, 30 November, 2008

Loop scheduling and assignment to minimize energy while hiding latency for heterogeneous multi-bank memory.
Proceedings of the FPL 2008, 2008

Dynamic and Leakage Power Minimization with Loop Voltage Scheduling and Assignment.
Proceedings of the 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2008), 2008

Effective Loop Partitioning and Scheduling under Memory and Register Dual Constraints.
Proceedings of the Design, Automation and Test in Europe, 2008

QoS for Networked Heterogeneous Real-Time Embedded Systems.
Proceedings of the ISCA 21st International Conference on Parallel and Distributed Computing and Communication Systems, 2008

2007
Maximize Parallelism Minimize Overhead for Nested Loops via Loop Striping.
J. VLSI Signal Process., 2007

Voltage Assignment with Guaranteed Probability Satisfying Timing Constraint for Real-time Multiprocessor DSP.
J. VLSI Signal Process., 2007

Real-Time Dynamic Voltage Loop Scheduling for Multi-Core Embedded Systems.
IEEE Trans. Circuits Syst. II Express Briefs, 2007

Universal Routing and Performance Assurance for Distributed Networks.
J. Interconnect. Networks, 2007

An efficient algorithm for dynamic shortest path tree update in network routing.
J. Commun. Networks, 2007

Analysis and algorithms design for the partition of large-scale adaptive mobile wireless networks.
Comput. Commun., 2007

Real-Time Loop Scheduling with Leakage Energy Minimization for Embedded VLIW DSP Processors.
Proceedings of the 13th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA 2007), 2007

Energy-Aware Online Algorithm to Satisfy Sampling Rates with Guaranteed Probability for Sensor Applications.
Proceedings of the High Performance Computing and Communications, 2007

Parallel Network Intrusion Detection on Reconfigurable Platforms.
Proceedings of the Embedded and Ubiquitous Computing, International Conference, 2007

Applying Situation Awareness to Mobile Proactive Information Delivery.
Proceedings of the Emerging Directions in Embedded and Ubiquitous Computing, 2007

Energy minimization with soft real-time and DVS for uniprocessor and multiprocessor embedded systems.
Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007

2006
Loop scheduling with timing and switching-activity minimization for VLIW DSP.
ACM Trans. Design Autom. Electr. Syst., 2006

Optimizing Address Assignment and Scheduling for DSPs With Multiple Functional Units.
IEEE Trans. Circuits Syst. II Express Briefs, 2006

Design Exploration With Imprecise Latency and Register Constraints.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2006

Security Protection and Checking for Embedded System Integration against Buffer Overflow Attacks via Hardware/Software.
IEEE Trans. Computers, 2006

Design optimization and space minimization considering timing and code size via retiming and unfolding.
Microprocess. Microsystems, 2006

Hardware/software optimization for array & pointer boundary checking against buffer overflow attacks.
J. Parallel Distributed Comput., 2006

The fat-stack and universal routing in interconnection networks.
J. Parallel Distributed Comput., 2006

Time-constrained loop scheduling with minimal resources.
J. Embed. Comput., 2006

Algorithms and analysis of scheduling for loops with minimum switching.
Int. J. Comput. Sci. Eng., 2006

Loop Scheduling with Complete Memory Latency Hiding on Multi-core Architecture.
Proceedings of the 12th International Conference on Parallel and Distributed Systems, 2006

Loop Striping: Maximize Parallelism for Nested Loops.
Proceedings of the Embedded and Ubiquitous Computing, International Conference, 2006

Efficient Algorithm of Energy Minimization for Heterogeneous Wireless Sensor Network.
Proceedings of the Embedded and Ubiquitous Computing, International Conference, 2006

Voltage Assignment and Loop Scheduling for Energy Minimization while Satisfying Timing Constraint with Guaranteed Probability.
Proceedings of the 2006 IEEE International Conference on Application-Specific Systems, 2006

Optimizing Timing and Code Size Using Maximum Direct Loop Fusion.
Proceedings of the ISCA 19th International Conference on Parallel and Distributed Computing Systems, 2006

2005
Combining Extended Retiming and Unfolding for Rate-Optimal Graph Transformation.
J. VLSI Signal Process., 2005

Efficient Assignment and Scheduling for Heterogeneous DSP Systems.
IEEE Trans. Parallel Distributed Syst., 2005

Optimal Assignment with Guaranteed Confidence Probability for Trees on Heterogeneous DSP Systems.
Proceedings of the International Conference on Parallel and Distributed Computing Systems, 2005

Static Scheduling of Split-Node Data-Flow Graphs.
Proceedings of the International Conference on Parallel and Distributed Computing Systems, 2005

Efficient Array & Pointer Bound Checking Against Buffer Overflow Attacks via Hardware/Software.
Proceedings of the International Symposium on Information Technology: Coding and Computing (ITCC 2005), 2005

Maximum Loop Distribution and Fusion for Two-level Loops Considering Code Size.
Proceedings of the 8th International Symposium on Parallel Architectures, 2005

A Fast Noniterative Scheduler for Input-Queued Switches with Unbuffered Crossbars.
Proceedings of the 8th International Symposium on Parallel Architectures, 2005

An Active Detecting Method Against SYN Flooding Attack.
Proceedings of the 11th International Conference on Parallel and Distributed Systems, 2005

Minimizing Energy via Loop Scheduling and DVS for Multi-Core Embedded Systems.
Proceedings of the 11th International Conference on Parallel and Distributed Systems, 2005

Universal Routing in Distributed Networks.
Proceedings of the 11th International Conference on Parallel and Distributed Systems, 2005

Optimizing DSP scheduling via address assignment with array and loop transformation.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Optimizing Nested Loops with Iterational and Instructional Retiming.
Proceedings of the Embedded and Ubiquitous Computing, 2005

Parallel Embedded Systems: Optimizations and Challenges.
Proceedings of the Embedded and Ubiquitous Computing, 2005

Loop Distribution and Fusion with Timing and Code Size Optimization for Embedded DSPs.
Proceedings of the Embedded and Ubiquitous Computing, 2005

Iterational retiming: maximize iteration-level parallelism for nested loops.
Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2005

High-level synthesis for DSP applications using heterogeneous functional units.
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005

Multi-level Loop Fusion with Minimal Code Size.
Proceedings of the ISCA 18th International Conference on Parallel and Distributed Computing Systems, 2005

A Feasible Baseline Architecture for Building and Evaluating Distributed Systems.
Proceedings of the ISCA 18th International Conference on Parallel and Distributed Computing Systems, 2005

2004
Efficient variable partitioning and scheduling for DSP processors with multiple memory modules.
IEEE Trans. Signal Process., 2004

A novel multiplexer-based low-power full adder.
IEEE Trans. Circuits Syst. II Express Briefs, 2004

Communication Scheduling With Re-Routing Based On Static And Hybrid Techniques.
J. Circuits Syst. Comput., 2004

Efficient Algorithms for Dynamic Update of Shortest Path Tree in Networking.
Int. J. Comput. Their Appl., 2004

Algorithms and analysis of scheduling for low-power high-performance DSP on VLIW processors.
Int. J. High Perform. Comput. Netw., 2004

Security Protection and Checking in Embedded System Integration Against Buffer Overflow Attacks.
Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'04), 2004

Dynamic Update of Shortest Path Tree in OSPF.
Proceedings of the 7th International Symposium on Parallel Architectures, 2004

Approximation Algorithms Design for Disk Partial Covering Problem.
Proceedings of the 7th International Symposium on Parallel Architectures, 2004

Assignment and Scheduling of Real-time DSP Applications for Heterogeneous Functional Units.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Timing Optimization of Nested Loops Considering Code Size for DSP Applications.
Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004

Dynamic shortest path tree update for multiple link state decrements.
Proceedings of the Global Telecommunications Conference, 2004. GLOBECOM '04, Dallas, Texas, USA, 29 November, 2004

Maintaining Comprehensive Resource Availability in P2P Networks.
Proceedings of the Grid and Cooperative Computing, 2004

Optimizing Address Assignment for Scheduling Embedded DSPs.
Proceedings of the Embedded and Ubiquitous Computing, 2004

Loop Scheduling for Real-Time DSPs with Minimum Switching Activities on Multiple-Functional-Unit Architectures.
Proceedings of the Embedded and Ubiquitous Computing, 2004

Efficient Scheduling for Design Exploration with Imprecise Latency and Register Constraints.
Proceedings of the Embedded and Ubiquitous Computing, 2004

General loop fusion technique for nested loops considering timing and code size.
Proceedings of the 2004 International Conference on Compilers, 2004

Design Exploration Framework Under Impreciseness Based on Register-Constrained Inclusion Scheduling.
Proceedings of the Advances in Computer Science, 2004

Switching-Activity Minimization on Instruction-Level Loop Scheduling for VLIW DSP Applications.
Proceedings of the 15th IEEE International Conference on Application-Specific Systems, 2004

Loop Fusion via Retiming for DSP Applications.
Proceedings of the ISCA 17th International Conference on Parallel and Distributed Computing Systems, 2004

2003
Code size reduction technique and implementation for software-pipelined DSP applications.
ACM Trans. Embed. Comput. Syst., 2003

Efficient Polynomial-Time Nested Loop Fusion with Full Parallelism.
Int. J. Comput. Their Appl., 2003

An Integrated Framework of Design Optimization and Space Minimization for DSP applications.
Proceedings of the 2003 International Symposium on Circuits and Systems, 2003

Loop scheduling for minimizing schedule length and switching activities.
Proceedings of the 2003 International Symposium on Circuits and Systems, 2003

Design space minimization with timing and code size optimization for embedded DSP.
Proceedings of the 1st IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2003

Register aware scheduling for distributed cache clustered architecture.
Proceedings of the 2003 Asia and South Pacific Design Automation Conference, 2003

Defending Embedded Systems Against Buffer Overflow via Hardware/Software.
Proceedings of the 19th Annual Computer Security Applications Conference (ACSAC 2003), 2003

Application-Specific Interconnection Network Design in Clustered DSP Processors.
Proceedings of the ISCA 16th International Conference on Parallel and Distributed Computing Systems, 2003

Design and Analysis of Improved Shortest Path Tree Update for Network Routing.
Proceedings of the ISCA 16th International Conference on Parallel and Distributed Computing Systems, 2003

2002
Partitioning and Scheduling DSP Applications with Maximal Memory Access Hiding.
EURASIP J. Adv. Signal Process., 2002

Analysis and Algorithms for Partitioning of Large-scale Adaptive Mobile Networks.
Proceedings of the International Conference on Parallel and Distributed Computing Systems, 2002

Unfolding a Split-node Data-flow Graph.
Proceedings of the International Conference on Parallel and Distributed Computing Systems, 2002

Optimal Code Size Reduction for Software-Pipelined and Unfolded Loops.
Proceedings of the 15th International Symposium on System Synthesis (ISSS 2002), 2002

Performance optimization of multiple memory architectures for DSP.
Proceedings of the 2002 International Symposium on Circuits and Systems, 2002

Variable Partitioning and Scheduling of Multiple Memory Architectures for DSP.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Optimal Code Size Reduction for Software-Pipelined Loops on DSP Applications.
Proceedings of the 31st International Conference on Parallel Processing (ICPP 2002), 2002

Minimizing resources in a repeating schedule for a split-node data-flow graph.
Proceedings of the 12th ACM Great Lakes Symposium on VLSI 2002, 2002

2001
Minimizing Average Schedule Length under Memory Constraints by Optimal Partitioning and Prefetching.
J. VLSI Signal Process., 2001

Estimating probabilistic timing performance for real-time embedded systems.
IEEE Trans. Very Large Scale Integr. Syst., 2001

Optimal loop scheduling for hiding memory latency based on two-level partitioning and prefetching.
IEEE Trans. Signal Process., 2001

Retiming synchronous data-flow graphs to reduce execution time.
IEEE Trans. Signal Process., 2001

Scheduling and partitioning for multiple loop nests.
Proceedings of the 14th International Symposium on Systems Synthesis, 2001

On area-efficient low power array multipliers.
Proceedings of the 2001 8th IEEE International Conference on Electronics, 2001

Implementing parallelism and scheduling data flow graphs on Java virtual machine.
Proceedings of the IEEE International Conference on Acoustics, 2001

Optimal partitioning and balanced scheduling with the maximal overlap of data footprints.
Proceedings of the 11th ACM Great Lakes Symposium on VLSI 2001, 2001

Minimum dynamic update for shortest path tree construction.
Proceedings of the Global Telecommunications Conference, 2001

Combined partitioning and data padding for scheduling multiple loop nests.
Proceedings of the 2001 International Conference on Compilers, 2001

Efficient Update of Shortest Path Algorithms for Network Routing.
Proceedings of the ISCA 14th International Conference on Parallel and Distributed Computing Systems, 2001

On Retiming Synchronous Data-Flow Graphs.
Proceedings of the ISCA 14th International Conference on Parallel and Distributed Computing Systems, 2001

Distributed Scaling Algorithm for FFT Computation Using Fixed-Point Arithmetic.
Proceedings of the ISCA 14th International Conference on Parallel and Distributed Computing Systems, 2001

2000
Properties and Algorithms for Unfolding of Probabilistic Data-Flow Graphs.
J. VLSI Signal Process., 2000

Communication Reduction in Multiple Multicasts Based on Hybrid Static-Dynamic Scheduling.
IEEE Trans. Parallel Distributed Syst., 2000

Optimizing Overall Loop Schedules Using Prefetching and Partitioning.
IEEE Trans. Parallel Distributed Syst., 2000

Efficient design exploration based on module utility selection.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2000

Probabilistic Loop Scheduling for Applications with Uncertain Execution Time.
IEEE Trans. Computers, 2000

Efficient module selections for finding highly acceptable designs based on inclusion scheduling.
J. Syst. Archit., 2000

Application Specific Image Compression for Virtual Conferencing.
Proceedings of the 2000 International Symposium on Information Technology (ITCC 2000), 2000

Efficient algorithms for acceptable design exploration.
Proceedings of the 10th ACM Great Lakes Symposium on VLSI 2000, 2000

Design and analysis of efficient application-specific on-line page replacement techniques.
Proceedings of the 10th ACM Great Lakes Symposium on VLSI 2000, 2000

Optimal two level partitioning and loop scheduling for hiding memory latency for DSP applications.
Proceedings of the 37th Conference on Design Automation, 2000

1999
Loop Scheduling and Partitions for Hiding Memory Latencies.
Proceedings of the 12th International Symposium on System Synthesis, 1999

Unfolding probabilistic data-flow graphs under different timing models.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

Extended retiming: optimal scheduling via a graph-theoretical approach.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

Efficient Algorithms for Finding Highly Acceptable Designs Based on Module-Utility Selections.
Proceedings of the 9th Great Lakes Symposium on VLSI (GLS-VLSI '99), 1999

A probabilistic performance metric for real-time system design.
Proceedings of the Seventh International Workshop on Hardware/Software Codesign, 1999

Rapid Prototyping Techniques for Fuzzy Controllers.
Proceedings of the Advances in Computing Science, 1999

1998
Reducing Data Hazards on Multi-pipelined DSP Architecture with Loop Scheduling.
J. VLSI Signal Process., 1998

Scheduling of uniform multidimensional systems under resource constraints.
IEEE Trans. Very Large Scale Integr. Syst., 1998

Special Section on Low-Power Electronics and Design.
IEEE Trans. Very Large Scale Integr. Syst., 1998

Probabilistic Loop Scheduling Considering Communication Overhead.
Proceedings of the Job Scheduling Strategies for Parallel Processing, 1998

Optimizing Data Scheduling on Processor-in-Memory Arrays.
Proceedings of the 12th International Parallel Processing Symposium / 9th Symposium on Parallel and Distributed Processing (IPPS/SPDP '98), March 30, 1998

Loop scheduling algorithms for power reduction.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

RCRS: A Framework for Loop Scheduling with Limited Number of Registers.
Proceedings of the 8th Great Lakes Symposium on VLSI (GLS-VLSI '98), 1998

1997
Communication-sensitive loop scheduling for DSP applications.
IEEE Trans. Signal Process., 1997

Scheduling Data-Flow Graphs via Retiming and Unfolding.
IEEE Trans. Parallel Distributed Syst., 1997

Multidimensional interleaving for synchronous circuit design optimization.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1997

Rotation scheduling: a loop pipelining algorithm.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1997

Hybrid static-dynamic communication scheduling for parallel systems.
Proceedings of the 1997 ACM symposium on Applied Computing, 1997

Probabilistic Rotation: Scheduling Graphs with Uncertain Execution Time.
Proceedings of the 1997 International Conference on Parallel Processing (ICPP '97), 1997

Algorithm and Hardware Support for Branch Anticipation.
Proceedings of the 7th Great Lakes Symposium on VLSI (GLS-VLSI '97), 1997

Scheduling with Confidence for Probabilistic Data-flow Graphs.
Proceedings of the 7th Great Lakes Symposium on VLSI (GLS-VLSI '97), 1997

1996
Hardware/Software co-design with the HMS framework.
J. VLSI Signal Process., 1996

Optimizing DSP flow graphs via schedule-based multidimensional retiming.
IEEE Trans. Signal Process., 1996

Achieving Full Parallelism Using Multidimensional Retiming.
IEEE Trans. Parallel Distributed Syst., 1996

Optimal Data Scheduling for Uniform Multidimensional Applications.
IEEE Trans. Computers, 1996

Polynomial-Time Nested Loop Fusion with Full Parallelism.
Proceedings of the 1996 International Conference on Parallel Processing, 1996

Synthesis of Multi-Dimensional Applications in VHDL.
Proceedings of the 1996 International Conference on Computer Design (ICCD '96), 1996

Optimal communication scheduling based on collision graph model.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

Hardware/software co-design for DSP applications via the HMS framework.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

A Parameterized Index-Generator for the Multi-Dimensional Interleaving Optimization.
Proceedings of the 6th Great Lakes Symposium on VLSI (GLS-VLSI '96), 1996

Rapid Prototyping for Fuzzy Systems.
Proceedings of the 6th Great Lakes Symposium on VLSI (GLS-VLSI '96), 1996

Fully Parallel Hardware/Software Codesign for Multi-Dimensional DSP Applications.
Proceedings of the Fourth International Workshop on Hardware/Software Codesign, 1996

Static Communication Scheduling for Minimizing Collisions in Application Specific Parallel Systems.
Proceedings of the 1996 International Conference on Application-Specific Systems, 1996

1995
Static scheduling for synthesis of DSP algorithms on various models.
J. VLSI Signal Process., 1995

Multi-level partitioning and scheduling under local memory constraint.
Proceedings of the Seventh IEEE Symposium on Parallel and Distributed Processing, 1995

Architecture-Dependent Loop Scheduling via Communication-Sensitive Remapping.
Proceedings of the 1995 International Conference on Parallel Processing, 1995

Memory Efficient Fully Parallel Nested Loop Pipelining.
Proceedings of the 1995 International Conference on Parallel Processing, 1995

Multi-dimensional interleaving for time-and-memory design optimization.
Proceedings of the 1995 International Conference on Computer Design (ICCD '95), 1995

Push-up scheduling: Optimal polynomial-time resource constrained scheduling for multi-dimensional applications.
Proceedings of the 1995 IEEE/ACM International Conference on Computer-Aided Design, 1995

Memory/time optimization of 2-D filters.
Proceedings of the 1995 International Conference on Acoustics, 1995

Rate-optimal scheduling for cyclo-static and periodic schedules.
Proceedings of the 1995 International Conference on Acoustics, 1995

Improving self-timed pipeline ring performance through the addition of buffer loops.
Proceedings of the 5th Great Lakes Symposium on VLSI (GLS-VLSI '95), 1995

Bus minimization and scheduling of multi-chip systems.
Proceedings of the 5th Great Lakes Symposium on VLSI (GLS-VLSI '95), 1995

Optimizing synchronous systems for multi-dimensional applications.
Proceedings of the 1995 European Design and Test Conference, 1995

1994
Global node reduction of linear systems using ratio analysis.
Proceedings of the 7th International Symposium on High Level Synthesis, 1994

Partitioning and Retiming of Multi-Dimensional Systems.
Proceedings of the 1994 IEEE International Symposium on Circuits and Systems, ISCAS 1994, London, England, UK, May 30, 1994

Retiming and Clock Skew for Synchronous Systems.
Proceedings of the 1994 IEEE International Symposium on Circuits and Systems, ISCAS 1994, London, England, UK, May 30, 1994

Schedule-Based Multi-Dimensional Retiming on Data Flow Graphs.
Proceedings of the 8th International Symposium on Parallel Processing, 1994

Full Parallelism in Uniform Nested Loops Using Multi-Dimensional Retiming.
Proceedings of the 1994 International Conference on Parallel Processing, 1994

Communication Sensitive Rotation Scheduling.
Proceedings of the 1994 IEEE International Conference on Computer Design: VLSI in Computer & Processors, 1994

Loop Pipelining for Scheduling Multi-Dimensional Systems via Rotation.
Proceedings of the 31st Conference on Design Automation, 1994

1993
Reconfigurability and Reliability of Systolic/Wavefront Arrays.
IEEE Trans. Computers, 1993

Maintaining bipartite matchings in the presence of failures.
Networks, 1993

Static Scheduling of Uniform Nested Loops.
Proceedings of the Seventh International Parallel Processing Symposium, 1993

Unified Static Scheduling on Various Models.
Proceedings of the 1993 International Conference on Parallel Processing, 1993

Efficient retiming and unfolding.
Proceedings of the IEEE International Conference on Acoustics, 1993

Rate-optimal static scheduling for DSP data-flow programs.
Proceedings of the Third Great Lakes Symposium on Design Automation of High Performance VLSI Systems, 1993

1992
Error detection in arrays via dependency graphs.
J. VLSI Signal Process., 1992

Retiming and Unfolding Data-Flow Graphs.
Proceedings of the 1992 International Conference on Parallel Processing, 1992

Run-time error detection in arrays based on the data-dependency graph.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

Unfolding and retiming data-flow DSP programs for RISC multiprocessor scheduling.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1991
Explicit construction for reliable reconfigurable array architectures.
Proceedings of the Third IEEE Symposium on Parallel and Distributed Processing, 1991

Design for Easily Applying Test Vectors to Improve Delay Fault Coverage.
Proceedings of the 1991 IEEE/ACM International Conference on Computer-Aided Design, 1991

