Martha A. Kim

Orcid: 0000-0001-6243-5753

Affiliations:
  • Columbia University, New York City, USA


According to our database, Martha A. Kim authored at least 37 papers between 2010 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.



Bibliography

2024
Characterizing Training Performance and Energy for Foundation Models and Image Classifiers on Multi-Instance GPUs.
Proceedings of the 4th Workshop on Machine Learning and Systems, 2024

2022
Synthesized In-BRAM Garbage Collection for Accelerators with Immutable Memory.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

Synthesized Garbage Collection for FPGA Accelerators.
Proceedings of the FPGA '22: The 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Virtual Event, USA, 27 February 2022, 2022

2020
Catena: A Near-Threshold, Sub-0.4-mW, 16-Core Programmable Spatial Array Accelerator for the Ultralow-Power Mobile and Embedded Internet of Things.
IEEE J. Solid State Circuits, 2020

2019
Compositional Dataflow Circuits.
ACM Trans. Embed. Comput. Syst., 2019

Recursive Binary Neural Network Training Model for Efficient Usage of On-Chip Memory.
IEEE Trans. Circuits Syst. I Regul. Pap., 2019

Catena: A 0.5-V Sub-0.4-mW 16-Core Spatial Array Accelerator for Mobile and Embedded Computing.
Proceedings of the 2019 Symposium on VLSI Circuits, Kyoto, Japan, June 9-14, 2019, 2019

Master of none acceleration: a comparison of accelerator architectures for analytical query processing.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

2018
Hardware Acceleration.
IEEE Micro, 2018

vbench: Benchmarking Video Transcoding in the Cloud.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017
Address Translation Design Tradeoffs for Heterogeneous Systems.
CoRR, 2017

Compact and voltage-scalable sensor for accurate thermal sensing in dynamic thermal management.
Proceedings of the IEEE 60th International Midwest Symposium on Circuits and Systems, 2017

Pipelining a triggered processing element.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

Compositional dataflow circuits.
Proceedings of the 15th ACM-IEEE International Conference on Formal Methods and Models for System Design, 2017

Hotspot monitoring and Temperature Estimation with miniature on-chip temperature sensors.
Proceedings of the 2017 IEEE/ACM International Symposium on Low Power Electronics and Design, 2017

Deadlock-free joins in DB-mesh, an asynchronous systolic array accelerator.
Proceedings of the 13th International Workshop on Data Management on New Hardware, 2017

Network Synthesis for Database Processing Units.
Proceedings of the 54th Annual Design Automation Conference, 2017

From functional programs to pipelined dataflow circuits.
Proceedings of the 26th International Conference on Compiler Construction, 2017

2016
NRG-loops: adjusting power from within applications.
Proceedings of the 2016 International Symposium on Code Generation and Optimization, 2016

2015
The Q100 Database Processing Unit.
IEEE Micro, 2015

Implementing latency-insensitive dataflow blocks.
Proceedings of the 13th ACM/IEEE International Conference on Formal Methods and Models for Codesign, 2015

Fast Computational GPU Design with GT-Pin.
Proceedings of the 2015 IEEE International Symposium on Workload Characterization, 2015

Hardware synthesis from a recursive functional language.
Proceedings of the 2015 International Conference on Hardware/Software Codesign and System Synthesis, 2015

2014
Energy Analysis of Hardware and Software Range Partitioning.
ACM Trans. Comput. Syst., 2014

Hardware Partitioning for Big Data Analytics.
IEEE Micro, 2014

The Cache and Codec Model for Storing and Manipulating Data.
IEEE Micro, 2014

An experimental survey of energy management across the stack.
Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014

ParaShares: Finding the Important Basic Blocks in Multithreaded Programs.
Proceedings of the Euro-Par 2014 Parallel Processing, 2014

Q100: the architecture and design of a database processing unit.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014

2013
Parallel Block Vectors: Collection, Analysis, and Uses.
IEEE Micro, 2013

Parallel scaling properties from a basic block view.
Proceedings of the ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2013

Navigating big data with high-throughput, energy-efficient data partitioning.
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

2012
Cache Impacts of Datatype Acceleration.
IEEE Comput. Archit. Lett., 2012

Measuring interference between live datacenter applications.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Harmony: Collection and analysis of parallel block vectors.
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012

2011
Retinal Oximetry Based on Nonsimultaneous Image Acquisition Using a Conventional Fundus Camera.
IEEE Trans. Medical Imaging, 2011

2010
Computation vs. Memory Systems: Pinning Down Accelerator Bottlenecks.
Proceedings of the Computer Architecture, 2010

