Guy Lemieux

ACM Trans. Embed. Comput. Syst., 2023

2022

Mixing Low-Precision Formats in Multiply-Accumulate Units for DNN Training.

Proceedings of the International Conference on Field-Programmable Technology, 2022

2020

Procrustes: a Dataflow and Accelerator for Sparse Deep Neural Network Training.

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

2019

TinBiNN: Tiny Binarized Neural Network Overlay in about 5,000 4-LUTs and 5mW.

CoRR, 2019

Full Deep Neural Network Training On A Pruned Weight Budget.

Maximilian Golub

Mieszko Lis

Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019

Software-based Dynamic Overlays Require Fast, Fine-grained Partial Reconfiguration.

Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2019

Low-Level Loop Analysis and Pipelining of Applications Mapped to Xilinx FPGAs.

Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

2018

DropBack: Continuous Pruning During Training.

Maximilian Golub

Mieszko Lis

CoRR, 2018

An Accelerated OpenVX Overlay for Pure Software Programmers.

Nick Ivanov

Proceedings of the International Conference on Field-Programmable Technology, 2018

Modular Block-RAM-Based Longest-Prefix Match Ternary Content-Addressable Memories.

Lesley Shannon

Proceedings of the 28th International Conference on Field Programmable Logic and Applications, 2018

2017

Exploring automated space/time tradeoffs for OpenVX compute graphs.

Proceedings of the International Conference on Field Programmable Technology, 2017

Real-time object detection in software with custom vector instructions and algorithm changes.

Joe Edwards

Proceedings of the 28th IEEE International Conference on Application-specific Systems, 2017

2016

Modular Switched Multiported SRAM-Based Memories.

ACM Trans. Reconfigurable Technol. Syst., 2016

Automated Space/Time Scaling of Streaming Task Graph.

CoRR, 2016

A Multi-ported Memory Compiler Utilizing True Dual-Port BRAMs.

Yehdhih Ould Mohammed Moctar

Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

2015

Fast and Memory-Efficient Routing Algorithms for Field Programmable Gate Arrays With Sparse Intracluster Routing Crossbars.

Philip Brisk

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2015

Area Optimization of Arithmetic Units by Component Sharing for FPGAs (Abstract Only).

Shao Lin S. T. Tang

Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

Wavefront Skipping using BRAMs for Conditional Algorithms on Vector Processors.

Joe Edwards

Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

Rapid Overlay Builder for Xilinx FPGAs.

Michael Xi Yue

Dirk Koch

Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

Modular SRAM-Based Binary Content-Addressable Memories.

Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

2014

Deep and narrow binary content-addressable memories using FPGA-based BRAMs.

Proceedings of the 2014 International Conference on Field-Programmable Technology, 2014

Soft vector processors with streaming pipelines.

Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

Modular multi-ported SRAM-based memories.

Ameer Abdelhadi

Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

2013

TputCache: High-frequency, multi-way cache for high-throughput FPGA applications.

Proceedings of the 23rd International Conference on Field Programmable Logic and Applications, 2013

An efficient FPGA overlay for portable custom instruction set extensions.

Dirk Koch

Christian Beckhoff

Proceedings of the 23rd International Conference on Field Programmable Logic and Applications, 2013

Safe Overclocking of Tightly Coupled CGRAs and Processor Arrays using Razor.

Proceedings of the 21st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2013

Embedded supercomputing in FPGAs with the VectorBlox MXP Matrix Processor.

Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2013

2012

Rapid Synthesis and Simulation of Computational Circuits in an MPPA.

J. Signal Process. Syst., 2012

VENICE: A compact vector processor for FPGA applications.

Yehdhih Ould Mohammed Moctar

Proceedings of the 2012 International Conference on Field-Programmable Technology, 2012

Pipeline frequency boosting: Hiding dual-ported block RAM latency using intentional clock skew.

Proceedings of the 2012 International Conference on Field-Programmable Technology, 2012

Routing algorithms for FPGAs with sparse intra-cluster routing crossbars.

Philip Brisk

Proceedings of the 22nd International Conference on Field Programmable Logic and Applications (FPL), 2012

Parallel FPGA placement based on individual LUT placement (abstract only).

Chris C. Wang

Yehdhih Ould Mohammed Moctar

Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

Reducing the cost of floating-point mantissa alignment and normalization in FPGAs.

Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

Accelerator compiler for the VENICE vector processor.

Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

ZUMA: An Open FPGA Overlay Architecture.

Alexander Brant

Proceedings of the 2012 IEEE 20th Annual International Symposium on Field-Programmable Custom Computing Machines, 2012

2011

Performance and Cost Tradeoffs in Metal-Programmable Structured ASICs (MPSAs).

Usman Ahmed

IEEE Trans. Very Large Scale Integr. Syst., 2011

Deterministic Timing-Driven Parallel Placement by Simulated Annealing Using Half-Box Window Decomposition.

Jeffrey B. Goeders

Proceedings of the 2011 International Conference on Reconfigurable Computing and FPGAs, 2011

Configuration Bitstream Reduction for SRAM-based FPGAs by Enumerating LUT Input Permutations.

Ameer Abdelhadi

Proceedings of the 2011 International Conference on Reconfigurable Computing and FPGAs, 2011

Scalable and deterministic timing-driven parallel placement for FPGAs.

Chris C. Wang

Proceedings of the ACM/SIGDA 19th International Symposium on Field Programmable Gate Arrays, 2011

The role of FPGAs in a converged future with heterogeneous programmable processors: pre-conference workshop.

Jonathan Rose

Proceedings of the ACM/SIGDA 19th International Symposium on Field Programmable Gate Arrays, 2011

A CAD framework for Malibu: an FPGA with time-multiplexed coarse-grained elements.

Chris C. Wang

Proceedings of the ACM/SIGDA 19th International Symposium on Field Programmable Gate Arrays, 2011

VEGAS: soft vector processor with scratchpad memory.

Christopher Han-Yu Chou

Proceedings of the ACM/SIGDA 19th International Symposium on Field Programmable Gate Arrays, 2011

2010

A 4 GHz Non-Resonant Clock Driver With Inductor-Assisted Energy Return to Power Grid.

IEEE Trans. Circuits Syst. I Regul. Pap., 2010

The impact of interconnect architecture on via-programmed structured ASICs (VPSAs).

Usman Ahmed

Proceedings of the ACM/SIGDA 18th International Symposium on Field Programmable Gate Arrays, 2010

2009

Vector Processing as a Soft Processor Accelerator.

Jason Yu

Christopher Eagleston

Christopher Han-Yu Chou

Maxime Perreault

ACM Trans. Reconfigurable Technol. Syst., 2009

Estimating reliability and throughput of source-synchronous wave-pipelined interconnect.

Paul Teehan

Mark R. Greenstreet

Proceedings of the Third International Symposium on Networks-on-Chips, 2009

PGR: Period and glitch reduction via clock skew scheduling, delay padding and GlitchLess.

Xiao Dong

Proceedings of the 2009 International Conference on Field-Programmable Technology, 2009

Replace: An incremental placement algorithm for field programmable gate arrays.

David Leong

Proceedings of the 19th International Conference on Field Programmable Logic and Applications, 2009

Towards reliable 5Gbps wave-pipelined and 3Gbps surfing interconnect in 65nm FPGAs.

Paul Teehan

Mark R. Greenstreet

Proceedings of the ACM/SIGDA 17th International Symposium on Field Programmable Gate Arrays, 2009

PERG-Rx: a hardware pattern-matching engine supporting limited regular expressions.

Johnny Tsung Lin Ho

Proceedings of the ACM/SIGDA 17th International Symposium on Field Programmable Gate Arrays, 2009

2008

Interconnect Driver Design for Long Wires in Field-Programmable Gate Arrays.

Edmund Lee

Shahriar Mirabbasi

J. Signal Process. Syst., 2008

GlitchLess: Dynamic Power Minimization in FPGAs Through Edge Alignment and Glitch Filtering.

Julien Lamoureux

IEEE Trans. Very Large Scale Integr. Syst., 2008

Perturb+mutate: Semisynthetic circuit generation for incremental placement and routing.

ACM Trans. Reconfigurable Technol. Syst., 2008

Energy Recovery from High-Frequency Clocks Using DC-DC Converters.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2008

PERG: A scalable FPGA-based pattern-matching engine with consolidated Bloomier filters.

Johnny Tsung Lin Ho

Proceedings of the 2008 International Conference on Field-Programmable Technology, 2008

Vector processing as a soft-core CPU accelerator.

Jason Yu

Christopher Eagleston

Proceedings of the ACM/SIGDA 16th International Symposium on Field Programmable Gate Arrays, 2008

Designing with extreme parallelism.

Tarek A. El-Ghazawi

Proceedings of the ACM/SIGDA 16th International Symposium on Field Programmable Gate Arrays, 2008

Extreme parallel architectures for the masses.

Tarek A. El-Ghazawi

Proceedings of the ACM/SIGDA 16th International Symposium on Field Programmable Gate Arrays, 2008

2007

A Survey and Taxonomy of GALS Design Styles.

Paul Teehan

Mark R. Greenstreet

IEEE Des. Test Comput., 2007

Congestion estimation and localization in FPGAs: a visual tool for interconnect prediction.

David Yeager

Darius Chiu

Proceedings of the Ninth International Workshop on System-Level Interconnect Prediction (SLIP 2007), 2007

A 3GHz Switching DC-DC Converter Using Clock-Tree Charge-Recycling in 90nm CMOS with Integrated Output Filter.

Proceedings of the 2007 IEEE International Solid-State Circuits Conference, 2007

A Case for Soft Vector Processors in FPGAs.

Jason Yu

Proceedings of the 2007 International Conference on Field-Programmable Technology, 2007

GlitchLess: an active glitch minimization technique for FPGAs.

Julien Lamoureux

Proceedings of the ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays, 2007

2006

System-on-Chip: Reuse and Integration.

Proc. IEEE, 2006

Un/DoPack: re-clustering of large system-on-chip designs with interconnect variation for low-cost FPGAs.

Marvin Tom

David Leong

Proceedings of the 2006 International Conference on Computer-Aided Design, 2006

Perturber: semi-synthetic circuit generation using ancestor control for testing incremental place and route.

Proceedings of the 2006 IEEE International Conference on Field Programmable Technology, 2006

Semi-Synthetic Circuit Generation Using Graph Monomorphism for Testing Incremental Placement and Incremental Routing Tools.

Scott Chin

Proceedings of the 2006 International Conference on Field Programmable Logic and Applications (FPL), 2006

2005

FPGA Defect Tolerance: Impact of Granularity.

Anthony J. Yu

Proceedings of the 2005 IEEE International Conference on Field-Programmable Technology, 2005

Defect-Tolerant FPGA Switch Block and Connection Block with Fine-Grain Redundancy for Yield Enhancement.

Anthony J. Yu

Proceedings of the 2005 International Conference on Field Programmable Logic and Applications (FPL), 2005

Logic block clustering of large designs for channel-width constrained FPGAs.

Marvin Tom

Proceedings of the 42nd Design Automation Conference, 2005

An improved "soft" eFPGA design and implementation strategy.

Victor O. Aken'Ova

Resve A. Saleh

Proceedings of the IEEE 2005 Custom Integrated Circuits Conference, 2005

2004

Directional and single-driver wires in FPGA interconnect.

Proceedings of the 2004 IEEE International Conference on Field-Programmable Technology, 2004

Design of interconnection networks for programmable logic.

David A. Lewis

Kluwer, ISBN: 978-1-4020-7700-5, 2004

2002

Analytical Framework for Switch Block Design.

Proceedings of the Field-Programmable Logic and Applications, 2002

Circuit design of routing switches.

Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2002

2001

Using sparse crossbars within LUT.

Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2001

2000

The NUMAchine Multiprocessor.

Proceedings of the 2000 International Conference on Parallel Processing, 2000

Generating highly-routable sparse crossbars for PLDs.

Paul Leventis

Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2000

1998

Design and Implementation of the NUMAchine Multiprocessor.

Proceedings of the 35th Conference on Design Automation, 1998

1997

On two-step routing for FPGAs.

Stephen Dean Brown

Daniel Vranesic

Proceedings of the 1997 International Symposium on Physical Design, 1997

1996

Segmented Routing for Speed-Performance and Routability in Field-Programmable Gate Arrays.

Stephen Dean Brown

Muhammad M. Khellah