Benjamin Coleman

Orcid: 0009-0001-8045-3717

According to our database, Benjamin Coleman authored at least 29 papers between 2018 and 2024.

Bibliography

2024
How to Train Data-Efficient LLMs.
CoRR, 2024

Improving Data Efficiency for Recommenders and LLMs.
Proceedings of the 18th ACM Conference on Recommender Systems, 2024

2023
Adaptive Sampling for Deep Learning via Efficient Nonparametric Proxies.
CoRR, 2023

CAPS: A Practical Partition Index for Filtered Similarity Search.
CoRR, 2023

CARAMEL: A Succinct Read-Only Lookup Table via Compressed Static Functions.
CoRR, 2023

BOLT: An Automated Deep Learning Framework for Training and Deploying Large-Scale Neural Networks on Commodity CPU Hardware.
CoRR, 2023

Efficient Data Representation Learning in Google-scale Systems.
Proceedings of the 17th ACM Conference on Recommender Systems, 2023

One-Pass Distribution Sketch for Measuring Data Heterogeneity in Federated Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DESSERT: An Efficient Algorithm for Vector Set Search with Vector Set Queries.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Unified Embedding: Battle-Tested Feature Representations for Web-Scale ML Systems.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

BOLT: An Automated Deep Learning Framework for Training and Deploying Large-Scale Search and Recommendation Models on Commodity CPU Hardware.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

2022
Retaining Knowledge for Learning with Dynamic Definition.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Graph Reordering for Cache-Efficient Near Neighbor Search.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

One-Pass Diversified Sampling with Application to Terabyte-Scale Genomic Sequence Streams.
Proceedings of the International Conference on Machine Learning, 2022

2021
Development of a Point-of-Care Assay for HIV-1 Viral Load Using Higher Refractive Index Antibody-Coated Microbeads.
Sensors, 2021

Efficient Inference via Universal LSH Kernel.
CoRR, 2021

Density Sketches for Sampling and Estimation.
CoRR, 2021

Fast Processing and Querying of 170TB of Genomics Data via a Repeated And Merged BloOm Filter (RAMBO).
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Practical Near Neighbor Search via Group Testing.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A One-Pass Distributed and Private Sketch for Kernel Sums with Applications to Machine Learning at Scale.
Proceedings of the CCS '21: 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual Event, Republic of Korea, November 15, 2021

Revisiting Consistent Hashing with Bounded Loads.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Bloom Origami Assays: Practical Group Testing.
CoRR, 2020

STORM: Foundations of End-to-End Empirical Risk Minimization on the Edge.
CoRR, 2020

A One-Pass Private Sketch for Most Machine Learning Tasks.
CoRR, 2020

Sub-linear RACE Sketches for Approximate Kernel Density Estimation on Streaming Data.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, 2020

Sub-linear Memory Sketches for Near Neighbor Search on Streaming Data.
Proceedings of the 37th International Conference on Machine Learning, 2020

2019
RAMBO: Repeated And Merged Bloom Filter for Multiple Set Membership Testing (MSMT) in Sub-linear time.
CoRR, 2019

RACE: Sub-Linear Memory Sketches for Approximate Near-Neighbor Search on Streaming Data.
CoRR, 2019

2018
SARNAclust: Semi-automatic detection of RNA protein binding motifs from immunoprecipitation data.
PLoS Comput. Biol., 2018
