Martin C. Herbordt

ORCID: 0000-0002-3443-9113

Affiliations:
  • Boston University, Department of Electrical and Computer Engineering, MA, USA


According to our database, Martin C. Herbordt authored at least 150 papers between 1990 and 2024.

Bibliography

2024
FPGA-Accelerated Range-Limited Molecular Dynamics.
IEEE Trans. Computers, June, 2024

AutoAnnotate: Reinforcement Learning based Code Annotation for High Level Synthesis.
Proceedings of the 25th International Symposium on Quality Electronic Design, 2024

Multi-Core Multi-Rule VeBPF Firewall for Secure FPGA IoT Device Deployments.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Further Optimizations and Analysis of Smith-Waterman with Vector Extensions.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Performance Evaluation of VirtIO Device Drivers for Host-FPGA PCIe Communication.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

SmartFuse: Reconfigurable Smart Switches to Accelerate Fused Collectives in HPC Applications.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024

2023
Introduction to the Special Section on FCCM 2022.
ACM Trans. Reconfigurable Technol. Syst., December, 2023

A Survey of Potential MPI Complex Collectives: Large-Scale Mining and Analysis of HPC Applications.
CoRR, 2023

FASDA: An FPGA-Aided, Scalable and Distributed Accelerator for Range-Limited Molecular Dynamics.
Proceedings of the International Conference for High Performance Computing, 2023

FLASH: FPGA-Accelerated Smart Switches with GCN Case Study.
Proceedings of the 37th International Conference on Supercomputing, 2023

Software-Hardware Co-design of Heterogeneous SmartNIC System for Recommendation Models Inference and Training.
Proceedings of the 37th International Conference on Supercomputing, 2023

Improved Models for Policy-Agent Learning of Compiler Directives in HLS.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2023

2022
The Future of FPGA Acceleration in Datacenters and the Cloud.
ACM Trans. Reconfigurable Technol. Syst., 2022

Reconfigurable switches for high performance and flexible MPI collectives.
Concurr. Comput. Pract. Exp., 2022

Reinforcement Learning Strategies for Compiler Optimization in High level Synthesis.
Proceedings of the Eighth IEEE/ACM Workshop on the LLVM Compiler Infrastructure in HPC, 2022

Distributed Hardware Accelerated Secure Joint Computation on the COPA Framework.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2022

The Viability of Using Online Prediction to Perform Extra Work while Executing BSP Applications.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2022

Enabling VirtIO Driver Support on FPGAs.
Proceedings of the IEEE/ACM International Workshop on Heterogeneous High-performance Reconfigurable Computing, 2022

Optimized Mappings for Symmetric Range-Limited Molecular Force Calculations on FPGAs.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

A Framework for Neural Network Inference on FPGA-Centric SmartNICs.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

H-GCN: A Graph Convolutional Network Accelerator on Versal ACAP Architecture.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

COPA Use Case: Distributed Secure Joint Computation.
Proceedings of the 30th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2022

FCsN: A FPGA-Centric SmartNIC Framework for Neural Networks.
Proceedings of the 30th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2022

2021
O3BNN-R: An Out-of-Order Architecture for High-Performance and Regularized BNN Inference.
IEEE Trans. Parallel Distributed Syst., 2021

I-GCN: A Graph Convolutional Network Accelerator with Runtime Locality Enhancement through Islandization.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

System-Level Modeling of GPU/FPGA Clusters for Molecular Dynamics Simulations.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

Survey and Future Trends for FPGA Cloud Architectures.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

Workload Imbalance in HPC Applications: Effect on Performance of In-Network Processing.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

A Survey: Handling Irregularities in Neural Network Acceleration with FPGAs.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

Upgrade of FPGA Range-Limited Molecular Dynamics to Handle Hundreds of Processors.
Proceedings of the 29th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2021

Particle Mesh Ewald for Molecular Dynamics in OpenCL on an FPGA Cluster.
Proceedings of the 29th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2021

The Open Cloud Testbed (OCT): A Platform for Research into new Cloud Technologies.
Proceedings of the 10th IEEE International Conference on Cloud Networking, CloudNet 2021, 2021

2020
FPDeep: Scalable Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters.
IEEE Trans. Computers, 2020

An OpenCL 3D FFT for Molecular Dynamics Simulations on Multiple FPGAs.
CoRR, 2020

AWB-GCN: A Graph Convolutional Network Accelerator with Runtime Workload Rebalancing.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

CSB-RNN: a faster-than-realtime RNN acceleration framework with compressed structured blocks.
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

A Reconfigurable Compute-in-the-Network FPGA Assistant for High-Level Collective Support with Distributed Matrix Multiply Case Study.
Proceedings of the International Conference on Field-Programmable Technology, 2020

A Communication-Efficient Multi-Chip Design for Range-Limited Molecular Dynamics.
Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

Arithmetic and Boolean Secret Sharing MPC on FPGAs in the Data Center.
Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

Execution of Complete Molecular Dynamics Simulations on Multiple FPGAs.
Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

FPGAs in the Network and Novel Communicator Support Accelerate MPI Collectives.
Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

CQNN: a CGRA-based QNN Framework.
Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

Secret Sharing MPC on FPGAs in the Datacenter.
Proceedings of the 30th International Conference on Field-Programmable Logic and Applications, 2020

Accelerating MPI Collectives with FPGAs in the Network and Novel Communicator Support.
Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020

FP-AMG: FPGA-Based Acceleration Framework for Algebraic Multigrid Solvers.
Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020

2019
UWB-GCN: Hardware Acceleration of Graph-Convolution-Network through Runtime Workload Rebalancing.
CoRR, 2019

Fully Integrated On-FPGA Molecular Dynamics Simulations.
CoRR, 2019

A Scalable Framework for Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters with Weight and Workload Balancing.
CoRR, 2019

Fully integrated FPGA molecular dynamics simulations.
Proceedings of the International Conference for High Performance Computing, 2019

BSTC: a novel binarized-soft-tensor-core design for accelerating bit-based approximated neural nets.
Proceedings of the International Conference for High Performance Computing, 2019

O3BNN: an out-of-order architecture for high-performance binarized neural network inference with fine-grained pruning.
Proceedings of the ACM International Conference on Supercomputing, 2019

GhostSZ: A Transparent FPGA-Accelerated Lossy Compression Framework.
Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019

FP-AMR: A Reconfigurable Fabric Framework for Adaptive Mesh Refinement Applications.
Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019

Molecular Dynamics Range-Limited Force Evaluation Optimized for FPGAs.
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019

Accelerating AP3M-Based Computational Astrophysics Simulations with Reconfigurable Clusters.
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019

LP-BNN: Ultra-low-Latency BNN Inference with Layer Parallelism.
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019

2018
Real-time data analysis for medical diagnosis using FPGA-accelerated neural networks.
BMC Bioinform., 2018

MPI Derived Datatypes: Performance and Portability Issues.
Proceedings of the 25th European MPI Users' Group Meeting, 2018

Tangram: Colocating HPC Applications with Oversubscription.
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

Soft-Core, Multiple-Lane, FPGA-based ADCs for a Liquid Helium Environment.
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

Application Aware Tuning of Reconfigurable Multi-Layer Perceptron Architectures.
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

Unlocking Performance-Programmability by Penetrating the Intel FPGA OpenCL Toolflow.
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

Benchmarking Heterogeneous HPC Systems Including Reconfigurable Fabrics: Community Aspirations for Ideal Comparisons.
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

An Access-Pattern-Aware On-Chip Vector Memory System with Automatic Loading for SIMD Architectures.
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

FPGA HPC using OpenCL: Case Study in 3D FFT.
Proceedings of the 9th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2018

An Empirically Guided Optimization Framework for FPGA OpenCL.
Proceedings of the International Conference on Field-Programmable Technology, 2018

Accelerating MPI Message Matching through FPGA Offload.
Proceedings of the 28th International Conference on Field Programmable Logic and Applications, 2018

High Performance Communication on Reconfigurable Clusters.
Proceedings of the 28th International Conference on Field Programmable Logic and Applications, 2018

A Framework for Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters with Work and Weight Load Balancing.
Proceedings of the 28th International Conference on Field Programmable Logic and Applications, 2018

High Performance Dynamic Communication on Reconfigurable Clusters.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

FPDeep: Acceleration and Load Balancing of CNN Training on FPGA Clusters.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

2017
An FPGA-based data acquisition system for directional dark matter detection.
Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

OpenCL for HPC with FPGAs: Case study in molecular electrostatics.
Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

HPC on FPGA clouds: 3D FFTs and implications for molecular dynamics.
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

Bonded Force Computations on FPGAs.
Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017

2016
Collective Communication on FPGA Clusters with Static Scheduling.
SIGARCH Comput. Archit. News, 2016

Communication and cooling aware job allocation in data centers for communication-intensive workloads.
J. Parallel Distributed Comput., 2016

GPU-accelerated charge mapping.
Proceedings of the 2016 IEEE High Performance Extreme Computing Conference, 2016

A hardware design for in-brain neural spike sorting.
Proceedings of the 2016 IEEE High Performance Extreme Computing Conference, 2016

Novo-G#: Large-scale reconfigurable computing with direct and programmable interconnects.
Proceedings of the 2016 IEEE High Performance Extreme Computing Conference, 2016

Application-Aware Collective Communication (Extended Abstract).
Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

FPGA-Accelerated Particle-Grid Mapping.
Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

2015
NCBI BLASTP on High-Performance Reconfigurable Computing Systems.
ACM Trans. Reconfigurable Technol. Syst., 2015

Hardware-efficient compressed sensing encoder designs for WBSNs.
Proceedings of the 2015 IEEE High Performance Extreme Computing Conference, 2015

2014
Design of 3D FFTs with FPGA clusters.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2014

An investigation of Unified Memory Access performance in CUDA.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2014

GPU optimizations for a production molecular docking code.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2014

3D FFTs on a Single FPGA.
Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014

Increasing Parallelism and Reducing Thread Contentions in Mapping Localized N-Body Simulations to GPUs.
Proceedings of the Numerical Computations with GPUs, 2014

2013
NCBI BLASTP on the convey HC1-EX.
SIGARCH Comput. Archit. News, 2013

3D FFT for FPGAs.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2013

Architecture/algorithm codesign of molecular dynamics processors.
Proceedings of the 2013 Asilomar Conference on Signals, 2013

2012
CAAD BLASTP 2.0: NCBI BLASTP accelerated with pipelined filters.
Proceedings of the 22nd International Conference on Field Programmable Logic and Applications (FPL), 2012

FMSA: FPGA-Accelerated ClustalW-Based Multiple Sequence Alignment through Pipelined Prefiltering.
Proceedings of the 2012 IEEE 20th Annual International Symposium on Field-Programmable Custom Computing Machines, 2012

2011
Parallel discrete molecular dynamics simulation with speculation and in-order commitment.
J. Comput. Phys., 2011

Software optimization for performance, energy, and thermal distribution: Initial case studies.
Proceedings of the 2011 International Green Computing Conference and Workshops, 2011

Efficient Calculation of Pairwise Nonbonded Forces.
Proceedings of the IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines, 2011

2010
Molecular Dynamics Simulations on High-Performance Reconfigurable Computing Systems.
ACM Trans. Reconfigurable Technol. Syst., 2010

FPGA acceleration of rigid-molecule docking codes.
IET Comput. Digit. Tech., 2010

Towards production FPGA-accelerated molecular dynamics: Progress and challenges.
Proceedings of the 2010 Fourth International Workshop on High-Performance Reconfigurable Computing Technology and Applications, 2010

CAAD BLASTn: Accelerated NCBI BLASTn with FPGA prefiltering.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

Fast binding site mapping using GPUs and CUDA.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Fast and accurate NCBI BLASTP: acceleration with multiphase FPGA-based prefiltering.
Proceedings of the 24th International Conference on Supercomputing, 2010

2009
Elements of High-Performance Reconfigurable Computing.
Adv. Comput., 2009

FPGA-based acceleration of CHARMM-potential minimization.
Proceedings of the Third International Workshop on High-Performance Reconfigurable Computing Technology and Applications, 2009

Efficient particle-pair filtering for acceleration of molecular dynamics simulation.
Proceedings of the 19th International Conference on Field Programmable Logic and Applications, 2009

CAAD BLASTP: NCBI BLASTP Accelerated with FPGA-Based Accelerated Pre-Filtering.
Proceedings of the FCCM 2009, 2009

GPU acceleration of a production molecular docking code.
Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, 2009

Parallel Discrete Event Simulation of Molecular Dynamics Through Event-Based Decomposition.
Proceedings of the 20th IEEE International Conference on Application-Specific Systems, 2009

2008
Explicit design of FPGA-based coprocessors for short-range force computations in molecular dynamics simulations.
Parallel Comput., 2008

Computing Models for FPGA-Based Accelerators.
Comput. Sci. Eng., 2008

Performance potential of molecular dynamics simulations on high performance reconfigurable computing systems.
Proceedings of the 2008 Second International Workshop on High-Performance Reconfigurable Computing Technology and Applications, 2008

Acceleration of a production rigid molecule docking code.
Proceedings of the FPL 2008, 2008

An Efficient O(1) Priority Queue for Large FPGA-Based Discrete Event Simulations of Molecular Dynamics.
Proceedings of the 16th IEEE International Symposium on Field-Programmable Custom Computing Machines, 2008

2007
Single pass streaming BLAST on FPGAs.
Parallel Comput., 2007

Families of FPGA-based accelerators for approximate string matching.
Microprocess. Microsystems, 2007

Achieving High Performance with FPGA-Based Computing.
Computer, 2007

Discrete Event Simulation of Molecular Dynamics with Configurable Logic.
Proceedings of the FPL 2007, 2007

FPGA-Based Multigrid Computation for Molecular Dynamics Simulations.
Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines, 2007

2006
Rigid Molecule Docking: FPGA Reconfiguration for Alternative Force Laws.
EURASIP J. Adv. Signal Process., 2006

Improved Interpolation and System Integration for FPGA-Based Molecular Dynamics Simulations.
Proceedings of the 2006 International Conference on Field Programmable Logic and Applications (FPL), 2006

Sizing of Processing Arrays for FPGA-Based Computation.
Proceedings of the 2006 International Conference on Field Programmable Logic and Applications (FPL), 2006

Application-Specific Memory Interleaving for FPGA-Based Grid Computations: A General Design Technique.
Proceedings of the 2006 International Conference on Field Programmable Logic and Applications (FPL), 2006

Single Pass, BLAST-Like, Approximate String Matching on FPGAs.
Proceedings of the 14th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2006), 2006

Integrating FPGA Acceleration into the Protomol Molecular Dynamics Code: Preliminary Report.
Proceedings of the 14th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2006), 2006

Application-Specific Memory Interleaving Enables High Performance in FPGA-based Grid Computations.
Proceedings of the 14th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2006), 2006

2005
Accelerating Molecular Dynamics Simulations With Configurable Circuits.
Proceedings of the 2005 International Conference on Field Programmable Logic and Applications (FPL), 2005

LAMP: A Tool Suite for Families of FPGA-Based Application Accelerators.
Proceedings of the 2005 International Conference on Field Programmable Logic and Applications (FPL), 2005

Preliminary Report: FPGA Acceleration of Molecular Dynamics Computations.
Proceedings of the 13th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2005), 2005

Three-Dimensional Template Correlation: Object Recognition in 3D Voxel Data.
Proceedings of the Seventh International Workshop on Computer Architectures for Machine Perception (CAMP 2005), 2005

2004
Case study of a functional genomics application for an FPGA-based coprocessor.
Microprocess. Microsystems, 2004

Array control for high-performance SIMD systems.
J. Parallel Distributed Comput., 2004

Processing Repetitive Sequence Structures with Mismatches at Streaming Rate.
Proceedings of the Field Programmable Logic and Application, 2004

FPGA Acceleration of Rigid Molecule Interactions.
Proceedings of the 12th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2004), 2004

Families of FPGA-Based Algorithms for Approximate String Matching.
Proceedings of the 15th IEEE International Conference on Application-Specific Systems, 2004

2003
Case Study of a Functional Genomics Application.
Proceedings of the Field Programmable Logic and Application, 13th International Conference, 2003

2000
A System for Evaluating Performance and Cost of SIMD Array Designs.
J. Parallel Distributed Comput., 2000

An Array Control Unit for High Performance SIMD Arrays.
Proceedings of the Fifth International Workshop on Computer Architectures for Machine Perception (CAMP 2000), 2000

Control for High-Speed PE Arrays.
Proceedings of the 12th IEEE International Conference on Application-Specific Systems, 2000

1999
Using Emulations to Enhance the Performance of Parallel Architectures.
IEEE Trans. Parallel Distributed Syst., 1999

1997
Preprototyping SIMD Coprocessors Using Virtual Machine Emulation and Trace Compilation.
Proceedings of the 1997 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 1997

1995
Enpassant: An Environment for Evaluating Massively Parallel Array Architectures for Spatially Mapped Applications.
Int. J. Pattern Recognit. Artif. Intell., 1995

Experimental Analysis of Some SIMD Array Memory Hierarchies.
Proceedings of the 1995 International Conference on Parallel Processing, 1995

An empirical study of datapath, memory hierarchy, and network in SIMD array architectures.
Proceedings of the 1995 International Conference on Computer Design (ICCD '95), 1995

1994
Practical Algorithms for Online Routing on Fixed and Reconfigurable Meshes.
J. Parallel Distributed Comput., 1994

1992
Nonuniform region processing on SIMD arrays using the coterie network.
Mach. Vis. Appl., 1992

1991
Message-passing algorithms for a SIMD torus with coteries.
SIGARCH Comput. Archit. News, 1991

Multi-associativity: A Framework for Solving Multiple Non-uniform Problem Instances Simultaneously on SIMD Arrays.
Proceedings of the International Conference on Parallel Processing, 1991

A computational framework and SIMD algorithms for low-level support of intermediate level vision processing.
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1991

1990
Routing on the CAAPP.
Proceedings of the 10th IAPR International Conference on Pattern Recognition, 1990

