Hamid Sarbazi-Azad

Orcid: 0000-0003-4079-8603

Affiliations:
  • Sharif University of Technology, Department of Computer Engineering, Tehran, Iran


According to our database, Hamid Sarbazi-Azad authored at least 355 papers between 1999 and 2024.

Bibliography

2024
Performance analysis and modeling for quantum computing simulation on distributed GPU platforms.
Quantum Inf. Process., November, 2024

Cross-core Data Sharing for Energy-efficient GPUs.
ACM Trans. Archit. Code Optim., September, 2024

An Efficient FPGA Architecture with Turn-Restricted Switch Boxes.
ACM Trans. Design Autom. Electr. Syst., May, 2024

H3DM: A High-bandwidth High-capacity Hybrid 3D Memory Design for GPUs.
Proc. ACM Meas. Anal. Comput. Syst., 2024

Exploiting Direct Memory Operands in GPU Instructions.
IEEE Comput. Archit. Lett., 2024

Tulip: Turn-Free Low-Power Network-on-Chip.
IEEE Comput. Archit. Lett., 2024

A High-bandwidth High-capacity Hybrid 3D Memory for GPUs.
Proceedings of the Abstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, 2024

Blenda: Dynamically-Reconfigurable Stacked DRAM.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

2023
Fast and scalable quantum computing simulation on multi-core and many-core platforms.
Quantum Inf. Process., May, 2023

MANA: Microarchitecting a Temporal Instruction Prefetcher.
IEEE Trans. Computers, March, 2023

Energy Consumption Analysis of Instruction Cache Prefetching Methods.
Proceedings of the International Symposium on Computer Architecture and High Performance Computing Workshops, 2023

OCRA: An Oblivious Congested Region Avoiding Routing Algorithm for 3D NoCs.
Proceedings of the 16th International Workshop on Network on Chip Architectures, 2023

Snake: A Variable-length Chain-based Prefetching for GPUs.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

CoolDRAM: An Energy-Efficient and Robust DRAM.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2023

Venice: Improving Solid-State Drive Parallelism at Low Cost via Conflict-Free Accesses.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

2022
OSM: Off-Chip Shared Memory for GPUs.
IEEE Trans. Parallel Distributed Syst., 2022

Quick Generation of SSD Performance Models Using Machine Learning.
IEEE Trans. Emerg. Top. Comput., 2022

NURA: A Framework for Supporting Non-Uniform Resource Accesses in GPUs.
Proc. ACM Meas. Anal. Comput. Syst., 2022

A simple model for citation curve.
CoRR, 2022

Chapter Six - Evaluation of data prefetchers.
Adv. Comput., 2022

Chapter Five - State-of-the-art data prefetchers.
Adv. Comput., 2022

Chapter One - Traffic-load-aware virtual channel power-gating in network-on-chips.
Adv. Comput., 2022

Chapter Two - An efficient DVS scheme for on-chip networks.
Adv. Comput., 2022

Chapter Three - A power-performance balanced network-on-chip for mixed CPU-GPU systems.
Adv. Comput., 2022

Chapter Four - Beyond spatial or temporal prefetching.
Adv. Comput., 2022

Chapter Three - Temporal prefetching.
Adv. Comput., 2022

Chapter Two - Spatial prefetching.
Adv. Comput., 2022

Chapter One - Introduction to data prefetching.
Adv. Comput., 2022

Preface.
Adv. Comput., 2022

Morpheus: Extending the Last Level Cache Capacity in GPU Systems Using Idle GPU Core Resources.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

PIPF-DRAM: processing in precharge-free DRAM.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
Efficient Nearest-Neighbor Data Sharing in GPUs.
ACM Trans. Archit. Code Optim., 2021

MANA: Microarchitecting an Instruction Prefetcher.
CoRR, 2021

Data-Aware Compression of Neural Networks.
IEEE Comput. Archit. Lett., 2021

Linearization error in synchronization of Kuramoto oscillators.
Appl. Math. Comput., 2021

PF-DRAM: A Precharge-Free DRAM Structure.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Understanding Power Consumption and Reliability of High-Bandwidth Memory with Voltage Underscaling.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

2020
An Enhanced Dynamic Weighted Incremental Technique for QoS Support in NoC.
ACM Trans. Parallel Comput., 2020

Enabling High-Capacity, Latency-Tolerant, and Highly-Concurrent GPU Register Files via Software/Hardware Cooperation.
CoRR, 2020

A Survey on Recent Hardware Data Prefetching Approaches with An Emphasis on Servers.
CoRR, 2020

Harnessing Pairwise-Correlating Data Prefetching With Runahead Metadata.
IEEE Comput. Archit. Lett., 2020

Chapter Six - Addressing issues with MLC phase-change memory.
Adv. Comput., 2020

Chapter Five - Handling hard errors in PCMs by using intra-line level schemes.
Adv. Comput., 2020

Chapter Four - Inter-line level schemes for handling hard errors in PCMs.
Adv. Comput., 2020

Chapter Three - Phase-change memory architectures.
Adv. Comput., 2020

Chapter Two - The emerging phase change memory.
Adv. Comput., 2020

Chapter One - Introduction to non-volatile memory technologies.
Adv. Comput., 2020

Divide and Conquer Frontend Bottleneck.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration.
Proceedings of the 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2020

2019
Reducing Writebacks Through In-Cache Displacement.
ACM Trans. Design Autom. Electr. Syst., 2019

Highly Concurrent Latency-tolerant Register Files for GPUs.
ACM Trans. Comput. Syst., 2019

Energy-Efficient Permanent Fault Tolerance in Hard Real-Time Systems.
IEEE Trans. Computers, 2019

ITAP: Idle-Time-Aware Power Management for GPU Execution Units.
ACM Trans. Archit. Code Optim., 2019

Temperature-aware power consumption modeling in Hyperscale cloud data centers.
Future Gener. Comput. Syst., 2019

A Survey on PCM Lifetime Enhancement Schemes.
ACM Comput. Surv., 2019

Evaluation of Hardware Data Prefetchers on Server Processors.
ACM Comput. Surv., 2019

Code Layout Optimization for Near-Ideal Instruction Cache.
IEEE Comput. Archit. Lett., 2019

Focus on What is Needed: Area and Power Efficient FPGAs Using Turn-Restricted Switch Boxes.
Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI, 2019

Bingo Spatial Data Prefetcher.
Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

2018
BARAN: Bimodal Adaptive Reconfigurable-Allocator Network-on-Chip.
ACM Trans. Parallel Comput., 2018

Domino Cache: An Energy-Efficient Data Cache for Modern Applications.
ACM Trans. Design Autom. Electr. Syst., 2018

Express Read in MLC Phase Change Memories.
ACM Trans. Design Autom. Electr. Syst., 2018

Classified Round Robin: A Simple Prioritized Arbitration to Equip Best Effort NoCs With Effective Hard QoS.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

Fast Data Delivery for Many-Core Processors.
IEEE Trans. Computers, 2018

Improving MLC PCM Performance through Relaxed Write and Read for Intermediate Resistance Levels.
ACM Trans. Archit. Code Optim., 2018

ORIGAMI: A Heterogeneous Split Architecture for In-Memory Acceleration of Learning.
CoRR, 2018

Die-Stacked DRAM: Memory, Cache, or MemCache?
CoRR, 2018

Making Belady-Inspired Replacement Policies More Effective Using Expected Hit Count.
CoRR, 2018

Scale-Out Processors & Energy Efficiency.
CoRR, 2018

Parallelizing Bisection Root-Finding: A Case for Accelerating Serial Algorithms in Multicore Substrates.
CoRR, 2018

Cache Replacement Policy Based on Expected Hit Count.
IEEE Comput. Archit. Lett., 2018

Neda: Supporting Direct Inter-Core Neighbor Data Exchange in GPUs.
IEEE Comput. Archit. Lett., 2018

Chapter Six - Topology Specialization for Networks-on-Chip in the Dark Silicon Era.
Adv. Comput., 2018

Chapter One - Dark Silicon and the History of Computing.
Adv. Comput., 2018

Chapter Two - Revisiting Processor Allocation and Application Mapping in Future CMPs in Dark Silicon Era.
Adv. Comput., 2018

Domino Temporal Data Prefetcher.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

LTRF: Enabling High-Capacity Register Files for GPUs via Hardware/Software Cooperative Register Prefetching.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017
Efficient Mapping of Applications for Future Chip-Multiprocessors in Dark Silicon Era.
ACM Trans. Design Autom. Electr. Syst., 2017

Endurance-Aware Security Enhancement in Non-Volatile Memories Using Compression and Selective Encryption.
IEEE Trans. Computers, 2017

An Efficient Temporal Data Prefetcher for L1 Caches.
IEEE Comput. Archit. Lett., 2017

BiNoCHS: Bimodal Network-on-Chip for CPU-GPU Heterogeneous Systems.
Proceedings of the Eleventh IEEE/ACM International Symposium on Networks-on-Chip, 2017

Data Block Partitioning for Recovering Stuck-at Faults in PCMs.
Proceedings of the 2017 International Conference on Networking, Architecture, and Storage, 2017

Near-Ideal Networks-on-Chip for Servers.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

Effective cache bank placement for GPUs.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

POSTER: Elastic Reconfiguration for Heterogeneous NoCs with BiNoCHS.
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016
Sequoia: A High-Endurance NVM-Based Cache Architecture.
IEEE Trans. Very Large Scale Integr. Syst., 2016

Performance Evaluation of Dynamic Page Allocation Strategies in SSDs.
ACM Trans. Model. Perform. Evaluation Comput. Syst., 2016

Adaptive sparse matrix representation for efficient matrix-vector multiplication.
J. Supercomput., 2016

A Hybrid Non-Volatile Cache Design for Solid-State Drives Using Comprehensive I/O Characterization.
IEEE Trans. Computers, 2016

An Efficient Hybrid-Switched Network-on-Chip for Chip Multiprocessors.
IEEE Trans. Computers, 2016

Guest Editors' Introduction: Special Section on Emerging Memory Technologies in Very Large Scale Computing and Storage Systems.
IEEE Trans. Computers, 2016

SPCM: The Striped Phase Change Memory.
ACM Trans. Archit. Code Optim., 2016

Reconfigurable multicast routing for Networks on Chip.
Microprocess. Microsystems, 2016

Power- and performance-efficient cluster-based network-on-chip with reconfigurable topology.
Microprocess. Microsystems, 2016

Introduction: Special Section on Architecture of Future Many Core Systems.
Microprocess. Microsystems, 2016

ASHA: An adaptive shared-memory sharing architecture for multi-programmed GPUs.
Microprocess. Microsystems, 2016

Why Does Data Prefetching Not Work for Modern Workloads?
Comput. J., 2016

Introduction to the Special Section on On-chip parallel and network-based systems.
Comput. Electr. Eng., 2016

TBM: Twin Block Management Policy to Enhance the Utilization of Plane-Level Parallelism in SSDs.
IEEE Comput. Archit. Lett., 2016

Preface.
Adv. Comput., 2016

A Method to Improve Adaptivity of Odd-Even Routing Algorithm in Mesh NoCs.
Proceedings of the 24th Euromicro International Conference on Parallel, 2016

An efficient on-chip network with packet compression capability.
Proceedings of the International SoC Design Conference, 2016

Power-efficient partially-adaptive routing in on-chip mesh networks.
Proceedings of the International SoC Design Conference, 2016

Reducing Power Consumption of GPGPUs Through Instruction Reordering.
Proceedings of the 2016 International Symposium on Low Power Electronics and Design, 2016

Quantifying the difference in resource demand among classic and modern NoC workloads.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016

Efficient processor allocation in a reconfigurable CMP architecture for dark silicon era.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016

Tolerating more hard errors in MLC PCMs using compression.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016

Captopril: Reducing the pressure of bit flips on hot locations in non-volatile main memories.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

BLESS: a simple and efficient scheme for prolonging PCM lifetime.
Proceedings of the 53rd Annual Design Automation Conference, 2016

2015
Variable Resistance Spectrum Assignment in Phase Change Memory Systems.
IEEE Trans. Very Large Scale Integr. Syst., 2015

Architecting the Last-Level Cache for GPUs using STT-RAM Technology.
ACM Trans. Design Autom. Electr. Syst., 2015

Prolonging Lifetime of PCM-Based Main Memories through On-Demand Page Pairing.
ACM Trans. Design Autom. Electr. Syst., 2015

Advances in multicore systems architectures.
J. Supercomput., 2015

Leveraging dark silicon to optimize networks-on-chip topology.
J. Supercomput., 2015

Improving the performance of packet-switched networks-on-chip by SDM-based adaptive shortcut paths.
Integr., 2015

On-chip parallel and network-based systems.
Integr., 2015

P2R2: Parallel Pseudo-Round-Robin arbiter for high performance NoCs.
Integr., 2015

Special issue on on-chip parallel and network-based systems.
Computing, 2015

Traffic-aware buffer reconfiguration in on-chip networks.
Proceedings of the 2015 IFIP/IEEE International Conference on Very Large Scale Integration, 2015

DiskAccel: Accelerating Disk-Based Experiments by Representative Sampling.
Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2015

Using Intra-Line Level Pairing for Graceful Degradation Support in PCMs.
Proceedings of the 2015 IEEE Computer Society Annual Symposium on VLSI, 2015

An efficient DVS scheme for on-chip networks using reconfigurable Virtual Channel allocators.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

An energy-efficient virtual channel power-gating mechanism for on-chip networks.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

2014
A loss aware scalable topology for photonic on chip interconnection networks.
J. Supercomput., 2014

Adaptive prefetching using global history buffer in multicore processors.
J. Supercomput., 2014

Special Issue on Networks-on-Chip and Memories for Multicore Architectures.
Microprocess. Microsystems, 2014

A generic FPGA prototype for on-chip systems with network-on-chip communication infrastructure.
Comput. Electr. Eng., 2014

Unleashing the potentials of dynamism for page allocation strategies in SSDs.
Proceedings of the ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2014

An Opto-electrical NoC with Traffic Flow Prediction in Chip Multiprocessors.
Proceedings of the 22nd Euromicro International Conference on Parallel, 2014

Reducing access latency of MLC PCMs through line striping.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

A Reliable 3D MLC PCM Architecture with Resistance Drift Predictor.
Proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2014

An Efficient STT-RAM Last Level Cache Architecture for GPUs.
Proceedings of the 51st Annual Design Automation Conference 2014, 2014

OD3P: On-Demand Page Paired PCM.
Proceedings of the 51st Annual Design Automation Conference 2014, 2014

A reconfigurable network-on-chip architecture for heterogeneous CMPs in the dark-silicon era.
Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

A compression-based morphable PCM architecture for improving resistance drift tolerance.
Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

Design for scalability in enterprise SSDs.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013
Optimum hello interval for a connected homogeneous topology in mobile wireless sensor networks.
Telecommun. Syst., 2013

Designing best effort networks-on-chip to meet hard latency constraints.
ACM Trans. Embed. Comput. Syst., 2013

Computing Accurate Performance Bounds for Best Effort Networks-on-Chip.
IEEE Trans. Computers, 2013

A parallel clustering algorithm on the star graph and its performance.
Math. Comput. Model., 2013

Using task migration to improve non-contiguous processor allocation in NoC-based CMPs.
J. Syst. Archit., 2013

Multicore computing systems: Architecture, programming tools, and applications.
J. Comput. Syst. Sci., 2013

Efficient genetic based topological mapping using analytical models for on-chip networks.
J. Comput. Syst. Sci., 2013

Exploration of Temperature Constraints for Thermal-Aware Mapping of 3D Networks-on-Chip.
Int. J. Adapt. Resilient Auton. Syst., 2013

Network-on-SSD: A Scalable and High-Performance Communication Design Paradigm for SSDs.
IEEE Comput. Archit. Lett., 2013

Power and Performance Efficient Partial Circuits in Packet-Switched Networks-on-Chip.
Proceedings of the 21st Euromicro International Conference on Parallel, 2013

2012
The 2D digraph-based NoCs: attractive alternatives to the 2D mesh NoCs.
J. Supercomput., 2012

Editorial notes: Special issue on on-chip parallel and network-based systems.
Microprocess. Microsystems, 2012

Power-efficient deterministic and adaptive routing in torus networks-on-chip.
Microprocess. Microsystems, 2012

On-demand multicast routing protocol with efficient route discovery.
J. Netw. Comput. Appl., 2012

Supporting non-contiguous processor allocation in mesh-based chip multiprocessors using virtual point-to-point links.
IET Comput. Digit. Tech., 2012

Analysis of link lifetime in wireless mobile networks.
Ad Hoc Networks, 2012

A Game Theoretical Thermal-Aware Run-Time Task Synchronization Method for Multiprocessor Systems-on-Chip.
Proceedings of the 15th Euromicro Conference on Digital System Design, 2012

Reconfigurable Cluster-Based Networks-on-Chip for Application-Specific MPSoCs.
Proceedings of the 23rd IEEE International Conference on Application-Specific Systems, 2012

2011
Application-Aware Topology Reconfiguration for On-Chip Networks.
IEEE Trans. Very Large Scale Integr. Syst., 2011

Multispanning Tree Zone-Ordered Label-Based Routing Algorithms for Irregular Networks.
IEEE Trans. Parallel Distributed Syst., 2011

Task migration in three-dimensional meshes.
J. Supercomput., 2011

Special issue on: On-chip parallel and network-based systems.
J. Syst. Archit., 2011

Performance modeling of the LEACH protocol for mobile wireless sensor networks.
J. Parallel Distributed Comput., 2011

Performance modeling of Cartesian product networks.
J. Parallel Distributed Comput., 2011

On pancyclicity properties of OTIS-mesh.
Inf. Process. Lett., 2011

Pancyclicity of OTIS (swapped) networks based on properties of the factor graph.
Inf. Process. Lett., 2011

On the Topological Properties of Grid-Based Interconnection Networks: Surface Area and Volume of Radial Spheres.
Comput. J., 2011

Modeling the effects of hot-spot traffic load on the performance of wormhole-switched hypermeshes.
Comput. Electr. Eng., 2011

Evaluation and design of beaconing in mobile wireless networks.
Ad Hoc Networks, 2011

Multicast-Aware Mapping Algorithm for On-chip Networks.
Proceedings of the 19th International Euromicro Conference on Parallel, 2011

Task Migration in Mesh NoCs over Virtual Point-to-Point Connections.
Proceedings of the 19th International Euromicro Conference on Parallel, 2011

A Distributed Task Migration Scheme for Mesh-Based Chip-Multiprocessors.
Proceedings of the 12th International Conference on Parallel and Distributed Computing, 2011

Providing mobile Internet service using MOnetary Wireless NETworking (MOWNET).
Proceedings of the IEEE 36th Conference on Local Computer Networks, 2011

High-endurance and performance-efficient design of hybrid cache architectures through adaptive line replacement.
Proceedings of the 2011 International Symposium on Low Power Electronics and Design, 2011

A reconfigurable fault-tolerant routing algorithm to optimize the network-on-chip performance and latency in presence of intermittent and permanent faults.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011

A morphable phase change memory architecture considering frequent zero values.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011

Application-aware deadlock-free oblivious routing based on extended turn-model.
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011

Supporting non-contiguous processor allocation in mesh-based CMPs using virtual point-to-point links.
Proceedings of the Design, Automation and Test in Europe, 2011

Energy-Optimized On-Chip Networks Using Reconfigurable Shortcut Paths.
Proceedings of the Architecture of Computing Systems - ARCS 2011, 2011

2010
Special issue on network-based high performance computing.
J. Supercomput., 2010

Virtual Point-to-Point Connections for NoCs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2010

Power-Performance Analysis of Networks-on-Chip With Arbitrary Buffer Allocation Schemes.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2010

Performance modeling of n-dimensional mesh networks.
Perform. Evaluation, 2010

The 2D SEM: A novel high-performance and low-power mesh-based topology for networks-on-chip.
Int. J. Parallel Emergent Distributed Syst., 2010

Performance analysis of opportunistic broadcast for delay-tolerant wireless sensor networks.
J. Syst. Softw., 2010

Corrigendum to "A general methodology for direction-based irregular routing algorithms" [J. Parallel Distrib. Comput. 70 (2010) 363-370]
J. Parallel Distributed Comput., 2010

A general methodology for direction-based irregular routing algorithms.
J. Parallel Distributed Comput., 2010

Resource placement in Cartesian product of networks.
J. Parallel Distributed Comput., 2010

The triangular pyramid: Routing and topological properties.
Inf. Sci., 2010

Properties of a hierarchical network based on the star graph.
Inf. Sci., 2010

Special section on advances in computing systems science and engineering.
Comput. Electr. Eng., 2010

Voltage-Frequency Planning for Thermal-Aware, Low-Power Design of Regular 3-D NoCs.
Proceedings of the VLSI Design 2010: 23rd International Conference on VLSI Design, 2010

An efficient routing algorithm for irregular mesh NoCs.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

Improving the performance of deadlock recovery based routing in irregular mesh NoCs using added mesh-like links.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

An efficient dynamically reconfigurable on-chip network architecture.
Proceedings of the 47th Design Automation Conference, 2010

2009
Detecting Threats in Star Graphs.
IEEE Trans. Parallel Distributed Syst., 2009

Adaptive routing in wormhole-switched necklace-cubes: Analytical modelling and performance comparison.
Simul. Model. Pract. Theory, 2009

Resource placement in three-dimensional tori.
Parallel Comput., 2009

Some topological properties of star graphs: The surface area and volume.
Discret. Math., 2009

Analysis of k-Neigh topology control protocol for mobile wireless networks.
Comput. Networks, 2009

Chromatic sets of power graphs and their application to resource placement in multicomputer networks.
Comput. Math. Appl., 2009

A general mathematical performance model for wormhole-switched irregular networks.
Clust. Comput., 2009

A General Methodology for Routing in Irregular Networks.
Proceedings of the 17th Euromicro International Conference on Parallel, 2009

Performance and power efficient on-chip communication using adaptive virtual point-to-point connections.
Proceedings of the Third International Symposium on Networks-on-Chips, 2009

A Path-Based Broadcast Algorithm for Wormhole Hypercubes.
Proceedings of the 10th International Symposium on Pervasive Systems, 2009

Routing, data gathering, and neighbor discovery in delay-tolerant wireless sensor networks.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

A comprehensive power-performance model for NoCs with multi-flit channel buffers.
Proceedings of the 23rd international conference on Supercomputing, 2009

A method for calculating hard QoS guarantees for Networks-on-Chip.
Proceedings of the 2009 International Conference on Computer-Aided Design, 2009

A hybrid packet-circuit switched on-chip network based on SDM.
Proceedings of the Design, Automation and Test in Europe, 2009

2008
Parallel Lagrange interpolation on k-ary n-cubes with maximum channel utilization.
J. Supercomput., 2008

An accurate mathematical performance model of partially adaptive routing in binary n-cube multiprocessors.
Math. Comput. Model., 2008

Combinatorial performance modelling of toroidal cubes.
J. Syst. Archit., 2008

Some topological and combinatorial properties of WK-recursive mesh and WK-pyramid interconnection networks.
J. Syst. Archit., 2008

The Stretched Network: Properties, Routing, and Performance.
J. Inf. Sci. Eng., 2008

Analytic performance comparison of hypercubes and star graphs with implementation constraints.
J. Comput. Syst. Sci., 2008

Intruder Capturing in Mesh and Torus Networks.
Int. J. Found. Comput. Sci., 2008

An accurate mathematical performance model of adaptive routing in the star graph.
Future Gener. Comput. Syst., 2008

A Markovian Performance Model for Networks-on-Chip.
Proceedings of the 16th Euromicro International Conference on Parallel, 2008

Virtual Point-to-Point Links in Packet-Switched NoCs.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2008

Multi-Objective Genetic optimized multiprocessor SoC design.
Proceedings of the 2008 IEEE International Symposium on System-on-Chip, 2008

The Shuffle-Exchange Mesh Topology for 3D NoCs.
Proceedings of the 9th International Symposium on Parallel Architectures, 2008

Resource Placement in Cube-Connected Cycles.
Proceedings of the 9th International Symposium on Parallel Architectures, 2008

One-to-one and One-to-many node-disjoint Routing Algorithms for WK-Recursive networks.
Proceedings of the 9th International Symposium on Parallel Architectures, 2008

Mesh Connected Crossbars: A Novel NoC Topology with Scalable Communication Bandwidth.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2008

A General Approach for Analytical Modeling of Irregular NoCs.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2008

Performance Evaluation of Broadcast Algorithms in All-Port 2D Mesh Networks.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2008

Broadcast Algorithms on OTIS-Cubes.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2008

A novel high-performance and low-power mesh-based NoC.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

A Deadlock Free Shortest Path Routing Algorithm for WK-Recursive Meshes.
Proceedings of the Distributed Computing and Networking, 9th International Conference, 2008

Mathematical Performance Modelling of Stretched Hypercubes.
Proceedings of the Distributed Computing and Networking, 9th International Conference, 2008

An Adaptive and Fault-Tolerant Routing Algorithm for Meshes.
Proceedings of the Computational Science and Its Applications - ICCSA 2008, International Conference, Perugia, Italy, June 30, 2008

The 2D DBM: An attractive alternative to the simple 2D mesh topology for on-chip networks.
Proceedings of the 26th International Conference on Computer Design, 2008

The Effect of Network Topology and Channel Labels on the Performance of Label-Based Routing Algorithms.
Proceedings of the Computational Science, 2008

A Simple and Efficient Fault-Tolerant Adaptive Routing Algorithm for Meshes.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2008

Caspian: A Tunable Performance Model for Multi-core Systems.
Proceedings of the Euro-Par 2008, 2008

Efficient VLSI Layout of Edge Product Networks.
Proceedings of the 4th IEEE International Symposium on Electronic Design, 2008

Efficient VLSI Layout of WK-Recursive and WK-Pyramid Interconnection Networks.
Proceedings of the Advances in Computer Science and Engineering, 2008

Efficient Parallel Routing Algorithms for Cartesian and Composition Networks.
Proceedings of the Advances in Computer Science and Engineering, 2008

PERMAP: A performance-aware mapping for application-specific SoCs.
Proceedings of the 19th IEEE International Conference on Application-Specific Systems, 2008

Analysis of k-Neigh Topology Control Protocol for Wireless Networks.
Proceedings of the 22nd International Conference on Advanced Information Networking and Applications, 2008

Resource Placement in the Edge Product of Graphs.
Proceedings of the 22nd International Conference on Advanced Information Networking and Applications, 2008

An Adaptive Software-Based Deadlock Recovery Technique.
Proceedings of the 22nd International Conference on Advanced Information Networking and Applications, 2008

Empirical Performance Evaluation of Stretched Hypercubes.
Proceedings of the 22nd International Conference on Advanced Information Networking and Applications, 2008

2007
Perfect load balancing on the star interconnection network.
J. Supercomput., 2007

Comparative analytical performance evaluation of adaptivity in wormhole-switched hypercubes.
Simul. Model. Pract. Theory, 2007

An accurate performance model of fully adaptive routing in wormhole-switched two-dimensional mesh multicomputers.
Microprocess. Microsystems, 2007

Capturing an intruder in product networks.
J. Parallel Distributed Comput., 2007

Mathematical performance modelling of adaptive wormhole routing in optoelectronic hypercubes.
J. Parallel Distributed Comput., 2007

An empirical performance analysis of minimal and non-minimal routing in cube-based OTIS multicomputers.
J. High Speed Networks, 2007

Network-based computing.
J. Comput. Syst. Sci., 2007

The performance of synchronous parallel polynomial root extraction on a ring multicomputer.
Clust. Comput., 2007

The Edge Product of Networks.
Proceedings of the Eighth International Conference on Parallel and Distributed Computing, 2007

Distant-Based Resource Placement in Product Networks.
Proceedings of the Eighth International Conference on Parallel and Distributed Computing, 2007

Analysis of Time-Based Random Waypoint Mobility Model for Wireless Mobile Networks.
Proceedings of the Fourth International Conference on Information Technology: New Generations (ITNG 2007), 2007

Some Properties of WK-Recursive and Swapped Networks.
Proceedings of the Parallel and Distributed Processing and Applications, 2007

Performance Modelling of Necklace Hypercubes.
Proceedings of the 21th International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Accelerating 3-D capacitance extraction in deep sub-micron VLSI design using vector/parallel computing.
Proceedings of the 13th International Conference on Parallel and Distributed Systems, 2007

Lifetime analysis of the logical topology constructed by homogeneous topology control in wireless mobile networks.
Proceedings of the 13th International Conference on Parallel and Distributed Systems, 2007

Mathematical performance analysis of product networks.
Proceedings of the 13th International Conference on Parallel and Distributed Systems, 2007

On the Link Excess Life in Mobile Wireless Networks.
Proceedings of the 2007 International Conference on Computing: Theory and Applications (ICCTA 2007), 2007

Empirical Performance Evaluation of Adaptive Routing in Necklace Hypercubes: A Comparative Study.
Proceedings of the 2007 International Conference on Computing: Theory and Applications (ICCTA 2007), 2007

Power-aware mapping for reconfigurable NoC architectures.
Proceedings of the 25th International Conference on Computer Design, 2007

Improving a Fault-Tolerant Routing Algorithm Using Detailed Traffic Analysis.
Proceedings of the High Performance Computing and Communications, 2007

On Pancyclicity Properties of OTIS Networks.
Proceedings of the High Performance Computing and Communications, 2007

Performance Modeling of Wormhole Hypermeshes Under Hotspot Traffic.
Proceedings of the Computer Science, 2007

Resource Placement in Networks Using Chromatic Sets of Power Graphs.
Proceedings of the Computer Science, 2007

XMulator: A Listener-Based Integrated Simulation Platform for Interconnection Networks.
Proceedings of the First Asia International Conference on Modelling and Simulation, 2007

Parallel Numerical Interpolation on Necklace Hypercubes.
Proceedings of the First Asia International Conference on Modelling and Simulation, 2007

Simulation-Based Performance Evaluation of Deterministic Routing in Necklace Hypercubes.
Proceedings of the 2007 IEEE/ACS International Conference on Computer Systems and Applications (AICCSA 2007), 2007

Power Consumption and Performance Analysis of 3D NoCs.
Proceedings of the Advances in Computer Systems Architecture, 2007

Optimal Placement of Frequently Accessed IPs in Mesh NoCs.
Proceedings of the Advances in Computer Systems Architecture, 2007

2006
The Grid-Pyramid: A Generalized Pyramid Network.
J. Supercomput., 2006

Modelling and evaluation of adaptive routing in high-performance n-D tori networks.
Simul. Model. Pract. Theory, 2006

Performance evaluation of communication networks for parallel and distributed systems.
Parallel Comput., 2006

Multicast communication in OTIS-hypercube multi-computer systems.
Int. J. High Perform. Comput. Netw., 2006

Analytic Performance Evaluation of OTIS-Hypercubes.
IEICE Trans. Inf. Syst., 2006

Analytical performance modelling of partially adaptive routing in wormhole hypercubes.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Analytical performance modelling of adaptive wormhole routing in the star interconnection network.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

A comparative performance analysis of n-cubes and star graphs.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

A physical particle and plane framework for load balancing in multiprocessors.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

The impacts of timing constraints on virtual channels multiplexing in interconnect networks.
Proceedings of the 25th IEEE International Performance Computing and Communications Conference, 2006

Analytical Performance Comparison of Deterministic, Partially- and Fully-Adaptive Routing Algorithms in Binary n-Cubes.
Proceedings of the 12th International Conference on Parallel and Distributed Systems, 2006

A performance and power analysis of WK-Recursive and Mesh Networks for Network-on-Chips.
Proceedings of the 24th International Conference on Computer Design (ICCD 2006), 2006

Capturing an Intruder in the Pyramid.
Proceedings of the Computer Science, 2006

Analytic Modeling of Channel Traffic in n-Cubes.
Proceedings of the Computer Science, 2006

A heuristic routing mechanism using a new addressing scheme.
Proceedings of the 1st International ICST Conference on Bio Inspired Models of Network, 2006

Performance Comparison of Partially Adaptive Routing Algorithms.
Proceedings of the 20th International Conference on Advanced Information Networking and Applications (AINA 2006), 2006

Topological Properties of Stretched Graphs.
Proceedings of the 2006 IEEE/ACS International Conference on Computer Systems and Applications (AICCSA 2006), 2006

A Probability-Based Instruction Combining Method for Scheduling in VLIW Processors.
Proceedings of the 2006 IEEE/ACS International Conference on Computer Systems and Applications (AICCSA 2006), 2006

2005
Hierarchical Binary Set Partitioning in Cache Memories.
J. Supercomput., 2005

A Performance Model of Software-based Deadlock Recovery Routing Algorithm in Hypercubes.
Parallel Process. Lett., 2005

Performance modeling and evaluation of high-performance parallel and distributed systems.
Perform. Evaluation, 2005

Design and performance of networks for super-, cluster-, and grid-computing: Part II.
J. Parallel Distributed Comput., 2005

Design and performance of networks for super-, cluster-, and grid-computing: Part I.
J. Parallel Distributed Comput., 2005

Constraint-Based Performance Analysis of k-Ary n-Cube Networks.
Int. J. Comput. Their Appl., 2005

Design and performance of a pixel-level pipelined-parallel architecture for high speed wavelet-based image compression.
Comput. Electr. Eng., 2005

The necklace-hypercube: a well scalable hypercube-based interconnection network for multiprocessors.
Proceedings of the 2005 ACM Symposium on Applied Computing (SAC), 2005

The recursive transpose-connected cycles (RTCC) interconnection network for multiprocessors.
Proceedings of the 2005 ACM Symposium on Applied Computing (SAC), 2005

The Stretched-Hypercube: A VLSI Efficient Network Topology.
Proceedings of the 8th International Symposium on Parallel Architectures, 2005

Topological Properties of Necklace Networks.
Proceedings of the 8th International Symposium on Parallel Architectures, 2005

On Some Combinatorial Properties of the Star Graph.
Proceedings of the 8th International Symposium on Parallel Architectures, 2005

The WK-Recursive Pyramid: An Efficient Network Topology.
Proceedings of the 8th International Symposium on Parallel Architectures, 2005

Analytic Performance Modeling of a Fully Adaptive Routing Algorithm in the Torus.
Proceedings of the Parallel and Distributed Processing and Applications, 2005

Parallel Polynomial Root Extraction on A Ring of Processors.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

The Effect of Virtual Channel Organization on the Performance of Interconnection Networks.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

An Empirical Comparison of OTIS-Mesh and OTIS-Hypercube Multicomputer Systems under Deterministic Routing.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

A Cordic-Based Processor Extension for Scalar and Vector Processing.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Message from the Chairs.
Proceedings of the 34th International Conference on Parallel Processing Workshops (ICPP 2005 Workshops), 2005

Performance Evaluation of Fully Adaptive Routing Under Different Workloads and Constant Node Buffer Size.
Proceedings of the 11th International Conference on Parallel and Distributed Systems, 2005

Efficient Polynomial Root Finding Using SIMD Extensions.
Proceedings of the 11th International Conference on Parallel and Distributed Systems, 2005

Parallel Clustering on the Star Graph.
Proceedings of the Distributed and Parallel Computing, 2005

Efficient SIMD Numerical Interpolation.
Proceedings of the High Performance Computing and Communications, 2005

A Constraint-Based Performance Comparison of Hypercube and Star Multicomputers with Failures.
Proceedings of the 19th International Conference on Advanced Information Networking and Applications (AINA 2005), 2005

The Star-Pyramid Graph: An Attractive Alternative to the Pyramid.
Proceedings of the Advances in Computer Systems Architecture, 10th Asia-Pacific Conference, 2005

2004
Analysis of true fully adaptive routing with software-based deadlock recovery.
J. Syst. Softw., 2004

On The Combinatorial Properties Of k-Ary n-Cubes.
J. Interconnect. Networks, 2004

Algorithmic Construction of Hamiltonian Cycles in k-Ary n-Cubes.
Int. J. Comput. Their Appl., 2004

Towards a more realistic comparative analysis of multicomputer networks.
Concurr. Pract. Exp., 2004

Constraint-based performance comparison of multi-dimensional interconnection networks with deterministic and adaptive routing strategies.
Comput. Electr. Eng., 2004

The Effect of Adaptivity on the Performance of the OTIS-Hypercube Under Different Traffic Patterns.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2004

Performance Modeling of Fully Adaptive Wormhole Routing in 2-D Mesh-Connected Multiprocessors.
Proceedings of the 12th International Workshop on Modeling, 2004

On Some Combinatorial Properties of Meshes.
Proceedings of the 7th International Symposium on Parallel Architectures, 2004

Enhanced-Star: A New Topology Based on the Star Graph.
Proceedings of the Parallel and Distributed Processing and Applications, 2004

Fault Detection Enhancement in Cache Memories Using a High Performance Placement Algorithm.
Proceedings of the 10th IEEE International On-Line Testing Symposium (IOLTS 2004), 2004

An Accurate Combinatorial Model for Performance Prediction of Deterministic Wormhole Routing in Torus Multicomputer Systems.
Proceedings of the 22nd IEEE International Conference on Computer Design: VLSI in Computers & Processors (ICCD 2004), 2004

Fault-Tolerant Routing in the Star Graph.
Proceedings of the 18th International Conference on Advanced Information Networking and Applications (AINA 2004), 2004

Comparative Evaluation of Adaptive and Deterministic Routing in the OTIS-Hypercube.
Proceedings of the Advances in Computer Systems Architecture, 9th Asia-Pacific Conference, 2004

2003
Analytical modelling of wormhole-routed k-ary n-cubes in the presence of matrix-transpose traffic.
J. Parallel Distributed Comput., 2003

Analysis of k-ary n-cubes with dimension-ordered routing.
Future Gener. Comput. Syst., 2003

An analytical model of adaptive wormhole routing with time-out.
Future Gener. Comput. Syst., 2003

A class of ball-and-bin problems and its application to mesh networks.
Proceedings of the 2003 10th IEEE International Conference on Electronics, 2003

A BTC-based technique for improving image compression.
Proceedings of the 2003 10th IEEE International Conference on Electronics, 2003

2002
A Performance Model of Adaptive Wormhole Routing in k-Ary n-Cubes in the Presence of Digit-Reversal Traffic.
J. Supercomput., 2002

A Parallel Algorithm for Lagrange Interpolation on the Star Graph.
J. Parallel Distributed Comput., 2002

Performance Analysis of Deterministic Routing in Wormhole k-Ary n-Cubes with Virtual Channels.
J. Interconnect. Networks, 2002

A simple mathematical model of adaptive routing in wormhole k-ary n-cubes.
Proceedings of the 2002 ACM Symposium on Applied Computing (SAC), 2002

Comparative Analysis of Adaptive Wormhole Routing in Tori and Hypercubes in the Presence of Hotspot Traffic.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

2001
Performance analysis of wormhole routing in multicomputer interconnection networks.
PhD thesis, 2001

An Analytical Model of Adaptive Wormhole Routing in Hypercubes in the Presence of Hot Spot Traffic.
IEEE Trans. Parallel Distributed Syst., 2001

Analytical Modeling of Wormhole-Routed k-Ary n-Cubes in the Presence of Hot-Spot Traffic.
IEEE Trans. Computers, 2001

An accurate analytical model of adaptive wormhole routing in k-ary n-cubes interconnection networks.
Perform. Evaluation, 2001

Communication delay in hypercubes in the presence of bit-reversal traffic.
Parallel Comput., 2001

Employing k-ary n-cubes for parallel Lagrange interpolation.
Parallel Algorithms Appl., 2001

On the performance of adaptive wormhole routing in the bi-directional torus network: a hot spot analysis.
Microprocess. Microsystems, 2001

Algorithmic construction of Hamiltonians in pyramids.
Inf. Process. Lett., 2001

Analysis of Timeout-Based Adaptive Wormhole Routing.
Proceedings of the 9th International Workshop on Modeling, 2001

On Some Properties of k-Ary n-Cubes.
Proceedings of the Eighth International Conference on Parallel and Distributed Systems, 2001

Analysis of Deterministic Routing in k-Ary n-Cubes with Virtual Channels.
Proceedings of the Eighth International Conference on Parallel and Distributed Systems, 2001

A FIFO-based architecture for high speed image compression.
Proceedings of the 2001 8th IEEE International Conference on Electronics, 2001

2000
A parallel algorithm for Lagrange interpolation on the cube-connected cycles.
Microprocess. Microsystems, 2000

Performance Analysis of k-Ary n-Cube Networks with Pipelined Circuit Switching.
Int. J. High Speed Comput., 2000

Message Latency in Hypercubes in the Presence of Matrix-Transpose Traffic.
Comput. J., 2000

Modeling of Pipelined Circuit Switching in Multicomputer Networks.
Proceedings of MASCOTS 2000, the 8th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 29 August, 2000

An Analytic Model for Communication Latency in Wormhole-Switched k-ary n-Cube Interconnection Networks with Digit-Reversal Traffic.
Proceedings of the High Performance Computing, Third International Symposium, 2000

Parallel Lagrange Interpolation on the Star Graph.
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

An Analytical Model of Fully-Adaptive Wormhole-Routed k-Ary n-Cubes in the Presence of Hot Spot Traffic.
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

A Performance Model of Adaptive Routing in k-Ary n-Cubes with Matrix-Transpose Traffic.
Proceedings of the 2000 International Conference on Parallel Processing, 2000

Performance Analysis of k-Ary n-Cubes with Fully Adaptive Routing.
Proceedings of the Seventh International Conference on Parallel and Distributed Systems, 2000

1999
A Parallel Algorithm for Lagrange Interpolation on k-ary n-Cubes.
Proceedings of the Parallel Computation, 1999

