Guang R. Gao

ACM Trans. Archit. Code Optim., 2017

Parallel Turing Machine, a Proposal.

J. Comput. Sci. Technol., 2017

HAMR: A dataflow-based real-time in-memory cluster computing engine.

Int. J. High Perform. Comput. Appl., 2017

Verification of the Extended Roofline Model for Asynchronous Many Task Runtimes.

Proceedings of the Third International Workshop on Extreme Scale Programming Models and Middleware, 2017

Multigrain Parallelism: Bridging Coarse-Grain Parallel Programs and Fine-Grain Event-Driven Multithreading.

Jaime Arteaga Molina

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Leveraging access port positions to accelerate page table walk in DWM-based main memory.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Leveraging Compiler Optimizations to Reduce Runtime Fault Recovery Overhead.

Proceedings of the 54th Annual Design Automation Conference, 2017

Designing Scalable Distributed Memory Models: A Case Study.

Proceedings of the Computing Frontiers Conference, 2017

2016

The Design and Implementation of TIDeFlow: A Dataflow-Inspired Execution Model for Parallel Loops and Task Pipelining.

Int. J. Parallel Program., 2016

Toward a Parallel Turing Machine Model.

Peng Qu

Jin Yan

Proceedings of the Network and Parallel Computing, 2016

Energy Avoiding Matrix Multiply.

Proceedings of the Languages and Compilers for Parallel Computing, 2016

The Importance of Efficient Fine-Grain Synchronization for Many-Core Systems.

Proceedings of the Languages and Compilers for Parallel Computing, 2016

Asynchronous Runtimes in Action: An Introspective Framework for a Next Gen Runtime.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Extending the Roofline Model for Asynchronous Many-Task Runtimes.

Joshua D. Suetterlein

Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

Application characterization at scale: lessons learned from developing a distributed open community runtime system for high performance computing.

Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

2015

Author Rebuttal to Rocha et al. "Comments on Minimizing Buffer Requirements under Rate-Optimal Schedule in Regular Dataflow Networks".

J. Signal Process. Syst., 2015

Design and evaluation of a novel dataflow based bigdata solution.

Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores, 2015

Landing Containment Domains on SWARM: Toward a Robust Resiliency Solution on a Dynamic Adaptive Runtime Machine.

Proceedings of the Parallel Computing: On the Road to Exascale, 2015

FreshBreeze: A Data Flow Approach for Meeting DDDAS Challenges.

Proceedings of the International Conference on Computational Science, 2015

Gregarious Data Re-structuring in a Many Core Architecture.

Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Energy efficient multi-level tiling for dense matrix multiplication on many-core architecture.

Haitao Wei

Elkin Garcia

Proceedings of the Sixth International Green and Sustainable Computing Conference, 2015

Locality aware concurrent start for stencil applications.

Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

2014

TERAFLUX: Harnessing dataflow in next generation teradevices.

Microprocess. Microsystems, 2014

Jagged Tiling for Intra-tile Parallelism and Fine-Grain Multithreading.

Proceedings of the Languages and Compilers for Parallel Computing, 2014

Position Paper: Locality-Driven Scheduling of Tasks for Data-Dependent Multithreading.

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

ACDT: Architected Composite Data Types trading-in unfettered data access for improved execution.

Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

A Dataflow Programming Language and its Compiler for Streaming Systems.

Proceedings of the International Conference on Computational Science, 2014

2013

StreamTMC: Stream compilation for tiled multi-core architectures.

J. Parallel Distributed Comput., 2013

Automatic Locality Exploitation in the Codelet Model.

Proceedings of the 12th IEEE International Conference on Trust, 2013

Optimizing the LU Factorization for Energy Efficiency on a Many-Core Architecture.

Proceedings of the Languages and Compilers for Parallel Computing, 2013

Towards Memory-Load Balanced Fast Fourier Transformations in Fine-Grain Execution Models.

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

A dynamic schema to increase performance in many-core architectures through percolation operations.

Proceedings of the 20th Annual International Conference on High Performance Computing, 2013

An Implementation of the Codelet Model.

Joshua Suetterlein

Proceedings of the Euro-Par 2013 Parallel Processing, 2013

Toward a Self-aware System for Exascale Architectures.

Aaron Myles Landwehr

Proceedings of the Euro-Par 2013: Parallel Processing Workshops, 2013

The TERAFLUX Project: Exploiting the DataFlow Paradigm in Next Generation Teradevices.

Proceedings of the 2013 Euromicro Conference on Digital System Design, 2013

Strategies for improving performance and energy efficiency on a many-core.

Elkin Garcia

Proceedings of the Computing Frontiers Conference, 2013

2012

Software Pipelining for Stream Programs on Resource Constrained Multicore Architectures.

IEEE Trans. Parallel Distributed Syst., 2012

Toward high-throughput algorithms on many-core architectures.

ACM Trans. Archit. Code Optim., 2012

Massively parallel breadth first search using a tree-structured memory model.

Tom St. John

Proceedings of the 2012 PPOPP International Workshop on Programming Models and Applications for Multicores and Manycores, 2012

Demystifying Performance Predictions of Distributed FFT3D Implementations.

Proceedings of the Network and Parallel Computing, 9th IFIP International Conference, 2012

A Discussion in Favor of Dynamic Scheduling for Regular Applications in Many-core Architectures.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

The Role of Non-strict Fine-grain Synchronization.

Juergen Ributzka

Joseph B. Manzano

Proceedings of the Transition of HPC Towards Exascale Computing, 2012

Dynamic percolation: a case of study on the shortcomings of traditional optimization in many-core architectures.

Proceedings of the Computing Frontiers Conference, CF'12, 2012

2011

Analysis and performance results of computing betweenness centrality on IBM Cyclops64.

J. Supercomput., 2011

Experiments with the Fresh Breeze tree-based memory model.

Xiao X. Meng

Comput. Sci. Res. Dev., 2011

The Fresh Breeze Program Execution Model.

Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

Polytasks: A Compressed Task Representation for HPC Runtimes.

Proceedings of the Languages and Compilers for Parallel Computing, 2011

OPELL and PM: A Case Study on Porting Shared Memory Programming Models to Accelerators Architectures.

Proceedings of the Languages and Compilers for Parallel Computing, 2011

The elephant and the mice: the role of non-strict fine-grain synchronization for modern many-core architectures.

Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

Source Code Partitioning in Program Optimization.

Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

DEEP: an iterative FPGA-based many-core emulation system for chip verification and architecture research.

Proceedings of the ACM/SIGDA 19th International Symposium on Field Programmable Gate Arrays, 2011

Hardware and Software Tradeoffs for Task Synchronization on Manycore Architectures.

Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

Exploring Fine-Grained Task-Based Execution on Multi-GPU Systems.

Long Chen

Oreste Villa

Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

2010

Performance analysis of Cooley-Tukey FFT algorithms for a many-core architecture.

Long Chen

Proceedings of the 2010 Spring Simulation Multiconference, 2010

Locality Optimization of Stencil Applications Using Data Dependency Graphs.

Daniel A. Orozco

Elkin Garcia

Proceedings of the Languages and Compilers for Parallel Computing, 2010

TiNy threads on BlueGene/P: Exploring many-core parallelisms beyond the traditional OS.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Dynamic load balancing on single- and multi-GPU systems.

Long Chen

Oreste Villa

Sriram Krishnamoorthy

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Optimized Dense Matrix Multiplication on a Many-Core Architecture.

Proceedings of the Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31, 2010

A Study of a Software Cache Implementation of the OpenMP Memory Model for Multicore and Manycore Architectures.

Proceedings of the Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31, 2010

Minimizing communication in rate-optimal software pipelining for stream programs.

Proceedings of the CGO 2010, 2010

2009

Improving Performance of Dynamic Programming via Parallelism and Locality on Multicore Architectures.

Ninghui Sun

IEEE Trans. Parallel Distributed Syst., 2009

Tile Reduction: The First Step towards Tile Aware Parallelization in OpenMP.

Proceedings of the Evolving OpenMP in an Age of Extreme Parallelism, 2009

Iterative layer-based raytracing on CUDA.

Alejandro Segovia

Xiaoming Li

Proceedings of the 28th International Performance Computing and Communications Conference, 2009

Mapping the FDTD Application to Many-Core Chip Architectures.

Daniel A. Orozco

Proceedings of the ICPP 2009, 2009

Tile Percolation: An OpenMP Tile Aware Parallelization Technique for the Cyclops-64 Multicore Processor.

Proceedings of the Euro-Par 2009 Parallel Processing, 2009

Mapping the LU decomposition on a many-core architecture: challenges and solutions.

Ioannis E. Venetis

Proceedings of the 6th Conference on Computing Frontiers, 2009

2008

ACM Trans. Program. Lang. Syst., 2008

Engenius - Environmental genome Informational Utility System.

J. Bioinform. Comput. Biol., 2008

Guest Editors' Introduction: Special Issue on OpenMP.

Mitsuhisa Sato

Eduard Ayguadé

Int. J. Parallel Program., 2008

Experience on optimizing irregular computation for memory hierarchy in manycore architecture.

Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008

Minimum Lock Assignment: A Method for Exploiting Concurrency among Critical Sections.

Proceedings of the Languages and Compilers for Parallel Computing, 2008

Just-In-Time Locality and Percolation for Optimizing Irregular Applications on a Manycore Architecture.

Proceedings of the Languages and Compilers for Parallel Computing, 2008

Open64 compiler infrastructure for emerging multicore/manycore architecture All Symposium Tutorial.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2007

Single-dimension software pipelining for multidimensional loops.

Zhizhong Tang

ACM Trans. Archit. Code Optim., 2007

Performance portability on EARTH: a case study across several parallel architectures.

Yanwei Niu

Clust. Comput., 2007

A parallel dynamic programming algorithm on a multi-core architecture.

Ninghui Sun

SPAA 2007: Proceedings of the 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2007

Implementation of the Smith-Waterman algorithm on a reconfigurable supercomputing platform.

Peiheng Zhang

Proceedings of the 1st international workshop on High-performance reconfigurable computing technology and applications, 2007

Optimized lock assignment and allocation: a method for exploiting concurrency among critical sections.

Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

On Parallel Models of Computation.

Proceedings of the Network and Parallel Computing, IFIP International Conference, 2007

Concurrency Analysis for Shared Memory Programs with Textually Unaligned Barriers.

Yuan Zhang

Evelyn Duesterwald

Proceedings of the Languages and Compilers for Parallel Computing, 2007

Synchronization state buffer: supporting efficient fine-grain synchronization on many-core architectures.

Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

On the Role of Deterministic Fine-Grain Data Synchronization for Scientific Applications: A Revisit in the Emerging Many-Core Era.

Ziang Hu

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Automatic Program Segment Similarity Detection in Targeted Program Performance Improvement.

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Experience of Optimizing FFT on Intel Architectures.

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

ParalleX: A Study of A New Parallel Computation Model.

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Exploring a Multithreaded Methodology to Implement a Network Communication Protocol on the Cyclops-64 Multithreaded Architecture.

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Optimizing the Fast Fourier Transform on a Multi-core Architecture.

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Software-Pipelining on Multi-Core Architectures.

Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007

2006

User-Friendly Methodology for Automatic Exploration of Compiler Options: A Case Study on the Intel XScale Microarchitecture.

Proceedings of the International Conference on Software Engineering Research and Practice & Conference on Programming Languages and Compilers, 2006

A User-Friendly Methodology for Automatic Exploration of Compiler Options.

Proceedings of the International Conference on Software Engineering Research and Practice & Conference on Programming Languages and Compilers, 2006

Performance Characteristics of OpenMP Language Constructs on a Many-core-on-a-chip Architecture.

Juan del Cuvillo

Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2006

Exploring Financial Applications on Many-Core-on-a-Chip Architecture: A First Experiment.

Ruppa K. Thulasiram

Proceedings of the Frontiers of High Performance Computing and Networking, 2006

A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Hierarchical multithreading: programming model and system software.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Toward a Software Infrastructure for the Cyclops-64 Cellular Architecture.

Proceedings of the 20th Annual International Symposium on High Performance Computing Systems and Applications (HPCS 2006), 2006

Optimization of Dense Matrix Multiplication on IBM Cyclops-64: Challenges and Experiences.

Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

Multi-dimensional Kernel Generation for Loop Nest Software Pipelining.

Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

Landing OpenMP on Cyclops-64: an efficient mapping of OpenMP to a many-core system-on-a-chip.

Juan del Cuvillo

Proceedings of the Third Conference on Computing Frontiers, 2006

The Era of Multi-core Chips - A Fresh Look on Software Challenges.

Proceedings of the Advances in Computer Systems Architecture, 11th Asia-Pacific Conference, 2006

2005

Improving power efficiency with compiler-assisted cache replacement.

Ziang Hu

J. Embed. Comput., 2005

Madd Operation Aware Redundancy Elimination.

Int. J. Softw. Eng. Knowl. Eng., 2005

Quasi-consensus-based comparison of profile hidden Markov models for protein sequences.

Roland L. Dunbrack Jr.

Bioinform., 2005

An improved hidden Markov model for transmembrane protein detection and topology prediction and its applications to complete genomes.

Robel Y. Kahsay

Li Liao

Bioinform., 2005

Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, 2005

Sequential Consistency Revisit: The Sufficient Condition and Method to Reason the Consistency Model of a Multiprocessor-on-a-Chip Architecture.

Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Networks, 2005

Performance Modelling and Optimization of Memory Access on Cellular Computer Architecture Cyclops64.

Proceedings of the Network and Parallel Computing, IFIP International Conference, 2005

Proceedings of the Languages and Compilers for Parallel Computing, 2005

An energy efficient TLB design methodology.

Proceedings of the 2005 International Symposium on Low Power Electronics and Design, 2005

Sustained Petaflop and Beyond: Can Parallel Computing Systems Meet The Challenges?

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

TiNy Threads: A Thread Virtual Machine for the Cyclops64 Cellular Architecture.

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Discriminating transmembrane proteins from signal peptides using SVM-Fisher approach.

Li Liao

Robel Y. Kahsay

Proceedings of the Fourth International Conference on Machine Learning and Applications, 2005

Identifying Multiply-Add Operations in Kylin Compiler.

Proceedings of The 2005 International Conference on Embedded Systems and Applications, 2005

2004

A fine-grain load-adaptive algorithm of the 2D discrete wavelet transform for multithreaded architectures.

J. Parallel Distributed Comput., 2004

A cluster-based solution for high performance hmmpfam using EARTH execution model.

Int. J. High Perform. Comput. Netw., 2004

An Improved Hidden Markov Model for Transmembrane Topology Prediction.

Robel Y. Kahsay

Li Liao

Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004), 2004

If-Conversion in SSA Form.

Arthur Stoutchinin

Proceedings of the Euro-Par 2004 Parallel Processing, 2004

Implementing parallel conjugate gradient on the EARTH multithreaded architecture.

Fei Chen

Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004

Single-Dimension Software Pipelining for Multi-Dimensional Loops.

Zhizhong Tang

Proceedings of the 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 2004

Code Generation for Single-Dimension Software Pipelining of Multi-Dimensional Loops.

Proceedings of the 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 2004

2003

Special issue on compilers, architecture, and synthesis for embedded systems.

Trevor N. Mudge

ACM Trans. Embed. Comput. Syst., 2003

Minimum Register Instruction Sequencing to Reduce Register Spills in Out-of-Order Issue Superscalar Architectures.

IEEE Trans. Computers, 2003

Evaluation and Choice of Various Branch Predictors for Low-Power Embedded Processor.

J. Comput. Sci. Technol., 2003

Implementation of the EARTH programming model on SMP clusters: a multi-threaded language and runtime system.

Concurr. Comput. Pract. Exp., 2003

Compiler-Assisted Cache Replacement: Problem Formulation and Performance Evaluation.

Ziang Hu

Proceedings of the Languages and Compilers for Parallel Computing, 2003

CARE: Overview of an Adaptive Multithreaded Architecture.

Andrès Márquez

Proceedings of the High Performance Computing, 5th International Symposium, 2003

Performance Study of a Whole Genome Comparison Tool on a Hyper-Threading Multiprocessor.

Proceedings of the High Performance Computing, 5th International Symposium, 2003

An Executable Analytical Performance Evaluation Approach for Early Performance Prediction.

Thomas L. Sterling

Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Programming Models and System Software for Future High-End Computing Systems: Work-in-Progress.

Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Inter-procedural stacked register allocation for Itanium®-like architecture.

Proceedings of the 17th Annual International Conference on Supercomputing, 2003

DIMES: an iterative emulation platform for Multiprocessor-System-On-Chip designs.

Proceedings of the 2003 IEEE International Conference on Field-Programmable Technology, 2003

Implementing Parallel Hmm-pfam on the EARTH Multithreaded Architecture.

Proceedings of the 2nd IEEE Computer Society Bioinformatics Conference, 2003

2002

Minimizing Buffer Requirements under Rate-Optimal Schedule in Regular Dataflow Networks.

Palash Desai

J. VLSI Signal Process., 2002

Efficient Multithreaded Algorithms for the Fast Fourier Transform.

Parallel Distributed Comput. Pract., 2002

A Theory for Co-Scheduling Hardware and Software Pipelines in ASIPs and Embedded Processors.

Wellington Santos Martins

Des. Autom. Embed. Syst., 2002

Implementation and evaluation of a communication intensive application on the EARTH multithreaded system.

Concurr. Comput. Pract. Exp., 2002

CASA: a server for the critical assessment of protein sequence alignment accuracy.

Roland L. Dunbrack Jr.

Bioinform., 2002

TROLL - Tandem Repeat Occurrence Locator.

Adalberto T. Castelo

Bioinform., 2002

Fine-Grain Stacked Register Allocation for the Itanium Architecture.

José Nelson Amaral

Proceedings of the Languages and Compilers for Parallel Computing, 15th Workshop, 2002

Compiling Several Classes of Communication Patterns on a Multithreaded Architecture.

Rishi Kumar

Gagan Agrawal

Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Next Generation System Software for Future High-End Computing Systems.

Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Visualizing Biosequence Data Using Texture Mapping.

Praveen R. Thiagarajan

Proceedings of the 2002 IEEE Symposium on Information Visualization (InfoVis 2002), 27 October, 2002

Power-Performance Trade-Offs for Energy-Efficient Architectures: A Quantitative Study.

Proceedings of the 20th International Conference on Computer Design (ICCD 2002), 2002

An Adaptive Meta-Clustering Approach: Combining the Information from Different Clustering Results.

Proceedings of the 1st IEEE Computer Society Bioinformatics Conference, 2002

A Bayesian Modeling Framework for Genetic Regulation.

Proceedings of the 1st IEEE Computer Society Bioinformatics Conference, 2002

On achieving balanced power consumption in software pipelined loops.

Wellington Santos Martins

Clement Leung

Proceedings of the International Conference on Compilers, 2002

2001

Dynamic Load Balancers for a Multithreaded Multiprocessor System.

Parallel Process. Lett., 2001

Exploiting Locality in Single Assignment Data Structures Updated Through Split-Phase Transactions.

Clust. Comput., 2001

A Multithreaded Parallel Implementation of a Dynamic Programming Algorithm for Sequence Comparison.

Proceedings of the 6th Pacific Symposium on Biocomputing, 2001

New Design Paradigms: What Needs to be Standardized?

Proceedings of the 14th International Symposium on Systems Synthesis, 2001

Bridging the gap between ISA compilers and silicon compilers: a challenge for future SoC design.

Proceedings of the 14th International Symposium on Systems Synthesis, 2001

Multithreaded Algorithms for Pricing a Class of Complex Options.

Ruppa K. Thulasiram

Lubomir Litov

Hassan Nojumi

Christopher T. Downing

Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

Minimum Register Instruction Sequence Problem: Revisiting Optimal Code Generation for DAGs.

Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

Topic 08+13: Instruction-Level Parallelism and Computer Architecture.

Proceedings of the Euro-Par 2001: Parallel Processing, 2001

Speculative Prefetching of Induction Pointers.

Proceedings of the Compiler Construction, 10th International Conference, 2001

2000

Location Consistency - A New Memory Model and Cache Consistency Protocol.

IEEE Trans. Computers, 2000

Enhanced Co-Scheduling: A Software Pipelining Method Using Modulo-Scheduled Pipeline Theory.

N. S. S. Narasimha Rao

Int. J. Parallel Program., 2000

Self-Avoiding Walks over Adaptive Unstructured Grids.

Concurr. Pract. Exp., 2000

Multithreaded algorithms for the fast Fourier transform.

Proceedings of the Twelfth annual ACM Symposium on Parallel Algorithms and Architectures, 2000

Landing CG on EARTH: A Case Study of Fine-Grained Multithreading on an Evolutionary Path.

Proceedings of Supercomputing 2000, 2000

Recursive and Iterative Multithreaded Algorithms for Pricing American Securities.

Ruppa K. Thulasiram

Christopher T. Downing

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2000

Design and Implementation of an Efficient Thread Partitioning Algorithm.

Proceedings of the High Performance Computing, Third International Symposium, 2000

Caching Single-Assignment Structures to Build a Robust Fine-Grain Multi-Threading System.

Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

Parallel FEM Simulation of Crack Propagation - Challenges, Status, and Perspectives.

Proceedings of the Parallel and Distributed Processing, 2000

Automatic compiler techniques for thread coarsening for multithreaded architectures.

Proceedings of the 14th international conference on Supercomputing, 2000

Developing a Communication Intensive Application on the EARTH Multithreaded Architecture (Distinguished Paper).

Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

A Theory for Software-Hardware Co-Scheduling for ASIPs and Embedded Processors.

Proceedings of the 12th IEEE International Conference on Application-Specific Systems, 2000

1999

Advances in the dataflow computational model.

Walid A. Najjar

Edward A. Lee

Parallel Comput., 1999

Automatically Partitioning Threads for Multithreaded Architectures.

Xinan Tang

J. Parallel Distributed Comput., 1999

Self-Avoiding Walks Over Adaptive Triangular Grids.

Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

Minimum Register Instruction Scheduling: A New Approach for Dynamic Instruction Issue Processors.

Chihong Zhang

Proceedings of the Languages and Compilers for Parallel Computing, 1999

Coping with very High Latencies in Petaflop Computer Systems.

Proceedings of the High Performance Computing, Second International Symposium, 1999

Implementing a Non-Strict Functional Programming Language on a Threaded Architecture.

Proceedings of the Parallel and Distributed Processing, 1999

Load Adaptive Algorithms and Implementations for the 2D Discrete Wavelet Transform on Fine-Grain Multithreaded Architectures.

Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

A New Approach to Parallel Dynamic Partitioning for Adaptive Unstructured Meshes.

Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

From EARTH to HTMT: An Evolution of a Multithreaded Architecture Model (Abstract).

Proceedings of the Parallel and Distributed Processing, 1999

Multithreaded Execution Architecture and Compilation.

Dean M. Tullsen

Proceedings of the Fifth International Symposium on High-Performance Computer Architecture, 1999

Efficient State-Diagram Construction Methods for Software Pipelining.

[BibT_eX]

[DOI]

Chihong Zhang

Sean Ryan

Proceedings of the Compiler Construction, 8th International Conference, 1999

1998

A New Framework for Elimination-Based Data Flow Analysis Using DJ Graphs.

[BibT_eX]

[DOI]

ACM Trans. Program. Lang. Syst., 1998

A Unified Framework for Instruction Scheduling and Mapping for Function Units with Structural Hazards.

[BibT_eX]

[DOI]

J. Parallel Distributed Comput., 1998

Optimal Modulo Scheduling Through Enumeration.

[BibT_eX]

[DOI]

Int. J. Parallel Program., 1998

How "Hard" is Thread Partitioning and How "Bad" is a List Scheduling Based Partitioning Algorithm?

[BibT_eX]

[DOI]

Xinan Tang

Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures, 1998

Using Multithreading for the Automatic Load Balancing of Adaptive Finite Element Meshes.

[BibT_eX]

[DOI]

Proceedings of the Solving Irregularly Structured Problems in Parallel, 1998

An Enhanced Co-Scheduling Method Using Reduced MS-State Diagrams.

[BibT_eX]

[DOI]

N. S. S. Narasimha Rao

Proceedings of the 12th International Parallel Processing Symposium / 9th Symposium on Parallel and Distributed Processing (IPPS/SPDP '98), March 30, 1998

Automatically Partitioning Threads Based on Remote Paths.

[BibT_eX]

[DOI]

Xinan Tang

Proceedings of the International Conference on Parallel and Distributed Systems, 1998

Partial Sampling with Reverse State Reconstruction: A New Technique for Branch Predictor Performance Estimation.

[BibT_eX]

[DOI]

Darren Erik Vengroff

Proceedings of the Fourth International Symposium on High-Performance Computer Architecture, Las Vegas, Nevada, USA, January 31, 1998

A New Fast Algorithm for Optimal Register Allocation in Modulo Scheduled Loops.

[BibT_eX]

[DOI]

Sylvain Lelait

Christine Eisenbeis

Proceedings of the Compiler Construction, 7th International Conference, 1998

1997

Incremental Computation of Dominator Trees.

[BibT_eX]

[DOI]

ACM Trans. Program. Lang. Syst., 1997

Compiling C for the EARTH multithreaded architecture.

[BibT_eX]

[DOI]

Int. J. Parallel Program., 1997

Thread Partitioning and Scheduling Based on Cost Model.

[BibT_eX]

[DOI]

Proceedings of the 9th Annual ACM Symposium on Parallel Algorithms and Architectures, 1997

Experiences with Non-numeric Applications on Multithreaded Architectures.

[BibT_eX]

[DOI]

Proceedings of the Sixth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), 1997

On the Importance of an End-To-End View of Memory Consistency in Future Computer Systems.

[BibT_eX]

[DOI]

Proceedings of the High Performance Computing, International Symposium, 1997

Latency Tolerance: A Metric for Performance Analysis of Multithreaded Architectures.

[BibT_eX]

[DOI]

Proceedings of the 11th International Parallel Processing Symposium (IPPS '97), 1997

Elastic History Buffer: A Low-Cost Method to Improve Branch Prediction Accuracy.

[BibT_eX]

[DOI]

Maria-Dana Tarlescu

Proceedings of the 1997 International Conference on Computer Design: VLSI in Computers & Processors, 1997

Heap Analysis and Optimizations for Threaded Programs.

[BibT_eX]

[DOI]

Proceedings of the 1997 Conference on Parallel Architectures and Compilation Techniques (PACT '97), 1997

A Register Pressure Sensitive Instruction Scheduler for Dynamic Issue Processors.

[BibT_eX]

[DOI]

Rad Silvera

Jian Wang

Proceedings of the 1997 Conference on Parallel Architectures and Compilation Techniques (PACT '97), 1997

1996

A Framework for Resource-Constrained Rate-Optimal Software Pipelining.

[BibT_eX]

[DOI]

IEEE Trans. Parallel Distributed Syst., 1996

Identifying Loops Using DJ Graphs.

[BibT_eX]

[DOI]

ACM Trans. Program. Lang. Syst., 1996

A Study of the EARTH-MANNA Multithreaded System.

[BibT_eX]

[DOI]

Int. J. Parallel Program., 1996

A New Framework for Exhaustive and Incremental Data Flow Analysis Using DJ Graphs.

[BibT_eX]

[DOI]

Proceedings of the ACM SIGPLAN'96 Conference on Programming Language Design and Implementation (PLDI), 1996

Software Pipelining Showdown: Optimal vs. Heuristic Methods in a Production Compiler.

[BibT_eX]

[DOI]

Proceedings of the ACM SIGPLAN'96 Conference on Programming Language Design and Implementation (PLDI), 1996

Measurement and Modeling of EARTH-MANNA Multithreaded Architecture.

[BibT_eX]

[DOI]

Proceedings of the MASCOTS '96, 1996

Locality Analysis for Distributed Shared-Memory Multiprocessors.

[BibT_eX]

[DOI]

Shaohua Han

Proceedings of the Languages and Compilers for Parallel Computing, 1996

Polling Watchdog: Combining Polling and Interrupts for Efficient Message Handling.

[BibT_eX]

[DOI]

Proceedings of the 23rd Annual International Symposium on Computer Architecture, 1996

Co-Scheduling Hardware and Software Pipelines.

[BibT_eX]

[DOI]

Proceedings of the Second International Symposium on High-Performance Computer Architecture, 1996

Quantitative studies of data-locality sensitivity on the EARTH multithreaded architecture: preliminary results.

[BibT_eX]

[DOI]

Xinmin Tian

Proceedings of the 3rd International Conference on High Performance Computing, 1996

Multithreading implementation of a distributed shortest path algorithm on EARTH multiprocessor.

[BibT_eX]

[DOI]

Xinmin Tian

Proceedings of the 3rd International Conference on High Performance Computing, 1996

Optimal Software Pipelining Through Enumeration of Schedules.

[BibT_eX]

[DOI]

Proceedings of the Euro-Par '96 Parallel Processing, 1996

Pipelining-Dovetailing: A Transformation to Enhance Software Pipelining for Nested Loops.

[BibT_eX]

[DOI]

Jian Wang

Proceedings of the Compiler Construction, 6th International Conference, 1996

Data locality sensitivity of multithreaded computations on a distributed-memory multiprocessor.

[BibT_eX]

[DOI]

Xinmin Tian

Proceedings of the 1996 conference of the Centre for Advanced Studies on Collaborative Research, 1996

Compiling C for the EARTH multithreaded architecture.

[BibT_eX]

[DOI]

Proceedings of the Fifth International Conference on Parallel Architectures and Compilation Techniques, 1996

1995

Rate-optimal schedule for multi-rate DSP computations.

[BibT_eX]

[DOI]

J. VLSI Signal Process., 1995

Automatic Data and Computation Decomposition for Distributed-Memory Machines.

[BibT_eX]

[DOI]

Parallel Process. Lett., 1995

Computing phi-nodes in linear time using DJ graphs.

[BibT_eX]

[DOI]

J. Program. Lang., 1995

ABC++: Concurrency by Inheritance in C++.

[BibT_eX]

[DOI]

IBM Syst. J., 1995

On memory models and cache management for shared-memory multiprocessors.

[BibT_eX]

[DOI]

Proceedings of the Seventh IEEE Symposium on Parallel and Distributed Processing, 1995

A Linear Time Algorithm for Placing phi-nodes.

[BibT_eX]

[DOI]

Proceedings of the Conference Record of POPL'95: 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1995

Scheduling and Mapping: Software Pipelining in the Presence of Structural Hazards.

[BibT_eX]

[DOI]

Proceedings of the ACM SIGPLAN'95 Conference on Programming Language Design and Implementation (PLDI), 1995

Exploiting short-lived variables in superscalar processors.

[BibT_eX]

[DOI]

Luis A. Lozano

Proceedings of the 28th Annual International Symposium on Microarchitecture, Ann Arbor, Michigan, USA, November 29, 1995

An Experimental Study of an ILP-based Exact Solution Method for Software Pipelining.

[BibT_eX]

[DOI]

Proceedings of the Languages and Compilers for Parallel Computing, 1995

The Threaded Communication Library: Preliminary Experiences on a Multiprocessor with Dual-Processor Nodes.

[BibT_eX]

[DOI]

Nasser Elmasri

Proceedings of the 9th international conference on Supercomputing, 1995

Location Consistency: Stepping Beyond the Memory Coherence Barrier.

[BibT_eX]

Proceedings of the 1995 International Conference on Parallel Processing, 1995

A Design Frame for Hybrid Access Caches.

[BibT_eX]

[DOI]

Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture (HPCA 1995), 1995

Automatic data and computation decomposition for distributed memory machines.

[BibT_eX]

[DOI]

Proceedings of the 28th Annual Hawaii International Conference on System Sciences (HICSS-28), 1995

Costs and Benefits of Multithreading with Off-the-Shelf RISC Processors.

[BibT_eX]

[DOI]

Olivier Maquelin

Proceedings of the Euro-Par '95 Parallel Processing, 1995

A design study of the EARTH multiprocessor.

[BibT_eX]

[DOI]

Prakash Panangaden

Xun Xue

Yingchun Zhu

Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques, 1995

Advanced topics in dataflow computing and multithreading.

[BibT_eX]

Lubomir Bic

Jean-Luc Gaudiot

IEEE, ISBN: 978-0-8186-6542-4, 1995

1994

Performance of Interconnection Network in Multithreaded Architectures.

[BibT_eX]

[DOI]

Proceedings of the PARLE '94: Parallel Architectures and Languages Europe, 1994

Minimizing register requirements under resource-constrained rate-optimal software pipelining.

[BibT_eX]

[DOI]

Proceedings of the 27th Annual International Symposium on Microarchitecture, San Jose, California, USA, November 30, 1994

Building Multithreaded Architectures with Off-the-Shelf Microprocessors.

[BibT_eX]

[DOI]

Proceedings of the 8th International Symposium on Parallel Processing, 1994

A Comparative Study of Multiprocessor List Scheduling Heuristics.

[BibT_eX]

Proceedings of the 27th Annual Hawaii International Conference on System Sciences (HICSS-27), 1994

Automatic decomposition in EPPP compiler.

[BibT_eX]

[DOI]

Proceedings of the 1994 Conference of the Centre for Advanced Studies on Collaborative Research, October 31, 1994

FTL: a multithreaded environment for parallel computation.

[BibT_eX]

[DOI]

Proceedings of the 1994 Conference of the Centre for Advanced Studies on Collaborative Research, October 31, 1994

EPPP - an integrated environment for portable parallel programming.

[BibT_eX]

[DOI]

Gilles Hurteau

Proceedings of the 1994 Conference of the Centre for Advanced Studies on Collaborative Research, October 31, 1994

Data parallelism with high performance C.

[BibT_eX]

[DOI]

Christophe Bonello

Proceedings of the 1994 Conference of the Centre for Advanced Studies on Collaborative Research, October 31, 1994

Minimizing memory requirements in rate-optimal schedules.

[BibT_eX]

[DOI]

Palash Desai

Proceedings of the International Conference on Application Specific Array Processors, 1994

Concurrent Execution of Heterogeneous Threads in the Super-Actor Machine.

[BibT_eX]

[DOI]

Proceedings of the Multithreaded Computer Architecture, 1994

Multithreaded Architectures: Principles, Projects, and Issues.

[BibT_eX]

[DOI]

Proceedings of the Multithreaded Computer Architecture, 1994

1993

Special Issue on DataFlow and Multithreaded Architectures - Guest Editors' Introduction.

[BibT_eX]

[DOI]

Jean-Luc Gaudiot

Lubomir Bic

J. Parallel Distributed Comput., 1993

An Efficient Hybrid Dataflow Architecture Model.

[BibT_eX]

[DOI]

J. Parallel Distributed Comput., 1993

Designing Programming Languages for the Analyzability of Pointer Data Structures.

[BibT_eX]

[DOI]

Comput. Lang., 1993

Analysis of Multithreaded Multiprocessors with Distributed Shared Memory.

[BibT_eX]

[DOI]

Proceedings of the Fifth IEEE Symposium on Parallel and Distributed Processing, 1993

A Novel Framework of Register Allocation for Software Pipelining.

[BibT_eX]

[DOI]

Proceedings of the Conference Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1993

A Kahn Principle for Networks of Nonmonotonic Real-time Processes.

[BibT_eX]

[DOI]

Robert Kim Yates

Proceedings of the PARLE '93, 1993

Extending Software Pipelining Techniques for Scheduling Nested Loops.

[BibT_eX]

[DOI]

Proceedings of the Languages and Compilers for Parallel Computing, 1993

Speculative Execution and Branch Prediction on Parallel Machines.

[BibT_eX]

[DOI]

Proceedings of the 7th international conference on Supercomputing, 1993

A Novel Methodology Using Genetic Algorithms for the Design of Caches and Cache Replacement Policy.

[BibT_eX]

Proceedings of the 5th International Conference on Genetic Algorithms, 1993

A novel framework for multi-rate scheduling in DSP applications.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Application-Specific Array Processors, 1993

1992

Optimal loop storage allocation for argument-fetching dataflow machines.

[BibT_eX]

[DOI]

Int. J. Parallel Program., 1992

A high-speed memory organization for hybrid dataflow / von Neumann computing.

[BibT_eX]

[DOI]

Future Gener. Comput. Syst., 1992

Minimizing Loop Storage Allocation for An Argument-Fetching Dataflow Architecture Model.

[BibT_eX]

[DOI]

Proceedings of the PARLE '92: Parallel Architectures and Languages Europe, 1992

On the limits of program parallelism and its smoothability.

[BibT_eX]

[DOI]

Proceedings of the 25th Annual International Symposium on Microarchitecture, 1992

Designing the McCAT Compiler Based on a Family of Structured Intermediate Representations.

[BibT_eX]

[DOI]

Proceedings of the Languages and Compilers for Parallel Computing, 1992

Collective Loop Fusion for Array Contraction.

[BibT_eX]

[DOI]

Proceedings of the Languages and Compilers for Parallel Computing, 1992

Efficient Interprocessor Synchronization/Communication on a Dataflow Multiprocessor Architecture.

[BibT_eX]

Jean-Marc Monti

Proceedings of the 1992 International Conference on Parallel Processing, 1992

Designing programming languages for analyzability: a fresh look at pointer data structures.

[BibT_eX]

[DOI]

Proceedings of the ICCL'92, 1992

Performance Evaluation of Latency Tolerant Architectures.

[BibT_eX]

Proceedings of the Computing and Information, 1992

Well-behaved dataflow programs for DSP computation.

[BibT_eX]

[DOI]

Prakash Panangaden

Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

A Polynomial Time Method for Optimal Software Pipelining.

[BibT_eX]

[DOI]

Proceedings of the Parallel Processing: CONPAR 92, 1992

A Register Allocation Framework Based on Hierarchical Cyclic Interval Graphs.

[BibT_eX]

[DOI]

Proceedings of the Compiler Construction, 1992

1991

Efficient support of concurrent threads in a hybrid dataflow/von Neumann architecture.

[BibT_eX]

[DOI]

Proceedings of the Third IEEE Symposium on Parallel and Distributed Processing, 1991

An efficient parallel algorithm for all pairs examination.

[BibT_eX]

[DOI]

Proceedings of Supercomputing '91, 1991

A Timed Petri-Net Model for Fine-Grain Loop Scheduling.

[BibT_eX]

[DOI]

Yue-Bong Wong

Proceedings of the ACM SIGPLAN'91 Conference on Programming Language Design and Implementation (PLDI), 1991

A Novel High-Speed Memory Organization for Fine-Grain Multi-Thread Computing.

[BibT_eX]

[DOI]

Proceedings of the PARLE '91: Parallel Architectures and Languages Europe, 1991

Towards an Efficient Hybrid Dataflow Architecture Model.

[BibT_eX]

[DOI]

Jean-Marc Monti

Proceedings of the PARLE '91: Parallel Architectures and Languages Europe, 1991

Loop Storage Optimization for Dataflow Machines.

[BibT_eX]

[DOI]

Proceedings of the Languages and Compilers for Parallel Computing, 1991

Optimization of array accesses by collective loop transformations.

[BibT_eX]

[DOI]

Proceedings of the 5th international conference on Supercomputing, 1991

A code mapping scheme for dataflow software pipelining.

[BibT_eX]

The Kluwer international series in engineering and computer science 125, Kluwer, ISBN: 978-0-7923-9130-2, 1991

1990

Exploiting fine-grain parallelism on dataflow architectures.

[BibT_eX]

[DOI]

Parallel Comput., 1990

A strict monolithic array constructor.

[BibT_eX]

[DOI]

Proceedings of the Second IEEE Symposium on Parallel and Distributed Processing, 1990

Towards efficient fine-grain software pipelining.

[BibT_eX]

[DOI]

Yue-Bong Wong

Proceedings of the 4th international conference on Supercomputing, 1990

An Efficient Scheme for Fine-Grain Software Pipelining.

[BibT_eX]

[DOI]

Yue-Bong Wong

Proceedings of the CONPAR 90, 1990

1989

Algorithmic Aspects of Balancing Techniques for Pipelined Data Flow Code Generation.

[BibT_eX]

[DOI]

J. Parallel Distributed Comput., 1989

1988

Summary of the workshop on frontiers in functional programming and dataflow architecture.

[BibT_eX]

[DOI]

SIGARCH Comput. Archit. News, 1988

An efficient pipelined dataflow processor architecture.

[BibT_eX]

[DOI]

Proceedings of Supercomputing '88, Orlando, FL, USA, November 12-17, 1988

Design of an Efficient Dataflow Architecture without Data Flow.

[BibT_eX]

René Tio

Proceedings of the International Conference on Fifth Generation Computer Systems, 1988

1987

A stability classification method and its application to pipelined solution of linear recurrences.

[BibT_eX]

[DOI]

Parallel Comput., 1987

1986

A pipelined code mapping scheme for static data flow computers.

[BibT_eX]

[DOI]

Guang Rong Gao

PhD thesis, 1986

A Maximally Pipelined Tridiagonal Linear Equation Solver.

[BibT_eX]

[DOI]

J. Parallel Distributed Comput., 1986

Maximum pipelining linear recurrence on static data flow computers.

[BibT_eX]

[DOI]

Int. J. Parallel Program., 1986

A Pipelined Solution Method of Tridiagonal Linear Equation Systems.

[BibT_eX]

Proceedings of the International Conference on Parallel Processing, 1986

1984

Modeling the Weather with a Data Flow Supercomputer.

[BibT_eX]

[DOI]

Kenneth W. Todd

IEEE Trans. Computers, 1984

1983

Maximum Pipelining of Array Operations on Static Data Flow Machine.

[BibT_eX]