James Zou

Orcid: 0000-0001-8880-4764

Affiliations:
  • Stanford University, Department of Electrical Engineering, CA, USA
  • Harvard University, School of Engineering and Applied Sciences, Cambridge, MA, USA


According to our database, James Zou authored at least 238 papers between 2010 and 2024.

Bibliography

2024
Provable Membership Inference Privacy.
Trans. Mach. Learn. Res., 2024

Author Correction: Bridging the literacy gap for surgical consents: an AI-human expert collaborative approach.
npj Digit. Medicine, 2024

Bridging the literacy gap for surgical consents: an AI-human expert collaborative approach.
npj Digit. Medicine, 2024

Generative AI for designing and validating easily synthesizable and structurally novel antibiotics.
Nat. Mach. Intell., 2024

Systematic analysis of 32,111 AI model cards characterizes documentation practice in AI.
Nat. Mach. Intell., 2024

Belief in the Machine: Investigating Epistemological Blind Spots of Language Models.
CoRR, 2024

Reducing Hallucinations in Vision-Language Models via Latent Space Steering.
CoRR, 2024

MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models.
CoRR, 2024

Locality Alignment Improves Vision-Language Models.
CoRR, 2024

Self-rationalization improves LLM as a fine-grained judge.
CoRR, 2024

TFG: Unified Training-Free Guidance for Diffusion Models.
CoRR, 2024

Generative AI Enables Medical Image Segmentation in Ultra Low-Data Regimes.
CoRR, 2024

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine.
CoRR, 2024

Regulating AI Adaptation: An Analysis of AI Medical Device Updates.
CoRR, 2024

Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models.
CoRR, 2024

Automated radiotherapy treatment planning guided by GPT-4Vision.
CoRR, 2024

AvaTaR: Optimizing LLM Agents for Tool-Assisted Knowledge Retrieval.
CoRR, 2024

TextGrad: Automatic "Differentiation" via Text.
CoRR, 2024

CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models.
CoRR, 2024

Truthful Dataset Valuation by Pointwise Mutual Information.
CoRR, 2024

Accelerating Transformers with Spectrum-Preserving Token Merging.
CoRR, 2024

STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases.
CoRR, 2024

Optimizing Calibration by Gaining Aware of Prediction Correctness.
CoRR, 2024

How faithful are RAG models? Quantifying the tug-of-war between RAG and LLMs' internal prior.
CoRR, 2024

Mapping the Increasing Use of LLMs in Scientific Papers.
CoRR, 2024

Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems.
CoRR, 2024

Simple linear attention language models balance the recall-throughput tradeoff.
CoRR, 2024

Large Language Models are Vulnerable to Bait-and-Switch Attacks for Generating Harmful Content.
CoRR, 2024

What's documented in AI? Systematic Analysis of 32K AI Model Cards.
CoRR, 2024

How well do LLMs cite relevant medical references? An evaluation framework and analyses.
CoRR, 2024

Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution.
CoRR, 2024

Navigating Dataset Documentations in AI: A Large-Scale Analysis of Dataset Cards on Hugging Face.
CoRR, 2024

TrustLLM: Trustworthiness in Large Language Models.
CoRR, 2024

Can AI Be as Creative as Humans?
CoRR, 2024

ADMET-AI: a machine learning ADMET platform for evaluation of large-scale chemical libraries.
Bioinform., 2024

Learning and Forgetting Unsafe Examples in Large Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

ArtWhisperer: A Dataset for Characterizing Human-AI Interactions in Artistic Creations.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Prospector Heads: Generalized Feature Attribution for Large Models & Data.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Selecting Large Language Model to Fine-tune via Rectified Scaling Law.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews.
Proceedings of the Forty-first International Conference on Machine Learning, 2024


Scaling Laws for the Value of Individual Data Points in Machine Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Simple linear attention language models balance the recall-throughput tradeoff.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Navigating Dataset Documentations in AI: A Large-Scale Analysis of Dataset Cards on HuggingFace.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Zoology: Measuring and Improving Recall in Efficient Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Model ChangeLists: Characterizing Updates to ML Models.
Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 2024

2023
A clinically applicable AI system for diagnosis of congenital heart diseases based on computed tomography images.
Medical Image Anal., December, 2023

GPT detectors are biased against non-native English writers.
Patterns, July, 2023

Machine learning modeling of RNA structures: methods, challenges and future perspectives.
Briefings Bioinform., July, 2023

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.
Trans. Mach. Learn. Res., 2023

Skin Tone Analysis for Representation in Educational Materials (STAR-ED) using machine learning.
npj Digit. Medicine, 2023

A deep learning-based electrocardiogram risk score for long term cardiovascular death and disease.
npj Digit. Medicine, 2023

Author Correction: Prostate cancer therapy personalization via multi-modal deep learning on randomized phase III clinical trials.
npj Digit. Medicine, 2023

Dynamic visualization of high-dimensional data.
Nat. Comput. Sci., 2023

The Power of Contrast for Feature Learning: A Theoretical Analysis.
J. Mach. Learn. Res., 2023

GraphMETRO: Mitigating Complex Distribution Shifts in GNNs via Mixture of Aligned Experts.
CoRR, 2023

ChatGPT Exhibits Gender and Racial Biases in Acute Coronary Syndrome Management.
CoRR, 2023

Data Acquisition: A New Frontier in Data-centric AI.
CoRR, 2023

DMLR: Data-centric Machine Learning Research - Past, Present and Future.
CoRR, 2023

In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering.
CoRR, 2023

Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges.
CoRR, 2023

Can large language models provide useful feedback on research papers? A large-scale empirical analysis.
CoRR, 2023

Large language models in medicine: the potentials and pitfalls.
CoRR, 2023

Is your data alignable? Principled and interpretable alignability testing and integration of single-cell data.
CoRR, 2023

How is ChatGPT's behavior changing over time?
CoRR, 2023

What Should Data Science Education Do with Large Language Models?
CoRR, 2023

Factorized Contrastive Learning: Going Beyond Multi-view Redundancy.
CoRR, 2023

FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance.
CoRR, 2023

Last-Layer Fairness Fine-tuning is Simple and Effective for Neural Networks.
CoRR, 2023

SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained model debugging and analysis.
CoRR, 2023

Beyond Confidence: Reliable Models Should Also Consider Atypicality.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023


Factorized Contrastive Learning: Going Beyond Multi-view Redundancy.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

OpenDataVal: a Unified Benchmark for Data Valuation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

TWIGMA: A dataset of AI-Generated Images with Metadata From Twitter.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Multi-Granularity Approach to Similarity Search in Multiplexed Immunofluorescence Images.
Proceedings of the Machine Learning in Computational Biology, November 30, 2023

TCR-BERT: learning the grammar of T-cell receptors for flexible antigen-binding analyses.
Proceedings of the Machine Learning in Computational Biology, November 30, 2023

Discover and Cure: Concept-aware Mitigation of Spurious Correlation.
Proceedings of the International Conference on Machine Learning, 2023

Accuracy on the Curve: On the Nonlinear Correlation of ML Performance Between Data Subpopulations.
Proceedings of the International Conference on Machine Learning, 2023

Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value.
Proceedings of the International Conference on Machine Learning, 2023

Data-Driven Subgroup Identification for Linear Regression.
Proceedings of the International Conference on Machine Learning, 2023

Diagnosing and Rectifying Vision Models using Language.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Post-hoc Concept Bottleneck Models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

When and Why Vision-Language Models Behave like Bags-Of-Words, and What to Do About It?
Proceedings of the Eleventh International Conference on Learning Representations, 2023

FaiREE: fair classification with finite-sample and distribution-free guarantee.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

FIFA: Making Fairness More Generalizable in Classifiers Trained on Imbalanced Data.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale.
Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 2023

Collecting data when missingness is unknown: a method for improving model performance given under-reporting in patient populations.
Proceedings of the Conference on Health, Inference, and Learning, 2023

Understanding and Predicting the Effect of Environmental Factors on People with Type 2 Diabetes.
Proceedings of the Conference on Health, Inference, and Learning, 2023

Freeze then Train: Towards Provable Representation Learning under Spurious Correlations and Feature Noise.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Understanding Multimodal Contrastive Learning and Incorporating Unpaired Data.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Beyond Positive Scaling: How Negation Impacts Scaling Trends of Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

HAPI Explorer: Comprehension, Discovery, and Explanation on History of ML APIs.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Author Correction: Advances, challenges and opportunities in creating data for trustworthy AI.
Nat. Mach. Intell., October, 2022

Competition over data: how does data purchase affect users?
Trans. Mach. Learn. Res., 2022

Systematic analysis of 50 years of Stanford University technology transfer and commercialization.
Patterns, 2022

Prostate cancer therapy personalization via multi-modal deep learning on randomized phase III clinical trials.
npj Digit. Medicine, 2022

Advances, challenges and opportunities in creating data for trustworthy AI.
Nat. Mach. Intell., 2022

AI reflections in 2021.
Nat. Mach. Intell., 2022

Beyond Importance Scores: Interpreting Tabular ML by Visualizing Feature Semantics.
Inf., 2022

A Spectral Method for Assessing and Combining Multiple Data Visualizations.
CoRR, 2022

SEAL: Interactive Tool for Systematic Error Analysis and Labeling.
CoRR, 2022

Knowledge-Driven New Drug Recommendation.
CoRR, 2022

Data Budgeting for Machine Learning.
CoRR, 2022

Protein structure generation via folding diffusion.
CoRR, 2022

Development and Clinical Evaluation of an AI Support Tool for Improving Telemedicine Photo Quality.
CoRR, 2022

DataPerf: Benchmarks for Data-Centric AI Development.
CoRR, 2022

GSCLIP: A Framework for Explaining Distribution Shifts in Natural Language.
CoRR, 2022

A Unified f-divergence Framework Generalizing VAE and GAN.
CoRR, 2022

Improving genetic risk prediction across diverse population by disentangling ancestry representations.
CoRR, 2022

Electrocardiographic Deep Learning for Predicting Post-Procedural Mortality.
CoRR, 2022

Disparities in Dermatology AI Performance on a Diverse, Curated Clinical Image Set.
CoRR, 2022

Submix: Practical Private Prediction for Large-Scale Language Models.
CoRR, 2022

C-Mixup: Improving Generalization in Regression.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Uncalibrated Models Can Improve Human-AI Collaboration.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

WeightedSHAP: analyzing and improving Shapley based feature attributions.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained debugging and analysis.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Estimating and Explaining Model Performance When Both Covariates and Labels Shift.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Predicting Immune Escape with Pretrained Protein Language Model Embeddings.
Proceedings of the Machine Learning in Computational Biology, 21-22 November 2022, Online, 2022

Ensembling improves stability and power of feature selection for deep learning models.
Proceedings of the Machine Learning in Computational Biology, 21-22 November 2022, Online, 2022

When and How Mixup Improves Calibration.
Proceedings of the International Conference on Machine Learning, 2022

Improving Out-of-Distribution Robustness via Selective Augmentation.
Proceedings of the International Conference on Machine Learning, 2022

Efficient Online ML API Selection for Multi-Label Classification Tasks.
Proceedings of the International Conference on Machine Learning, 2022

Meaningfully debugging model mistakes using conceptual counterfactual explanations.
Proceedings of the International Conference on Machine Learning, 2022

MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Domino: Discovering Systematic Errors with Cross-Modal Embeddings.
Proceedings of the Tenth International Conference on Learning Representations, 2022

How Did the Model Change? Efficiently Assessing Machine Learning API Shifts.
Proceedings of the Tenth International Conference on Learning Representations, 2022

SEAL: Interactive Tool for Systematic Error Analysis and Labeling.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

dcbench: a benchmark for data-centric AI systems.
Proceedings of DEEM '22: Sixth Workshop on Data Management for End-To-End Machine Learning, Philadelphia, 2022

Clustering Plotted Data by Image Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

How to Learn when Data Gradually Reacts to Your Model.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

MLDemon: Deployment Monitoring for Machine Learning Systems.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Do Humans Trust Advice More if it Comes from AI?: An Analysis of Human-AI Interactions.
Proceedings of the AIES '22: AAAI/ACM Conference on AI, Ethics, and Society, Oxford, United Kingdom, May 19, 2022

Data Sculpting: Interpretable Algorithm for End-to-End Cohort Selection.
Proceedings of the 56th Asilomar Conference on Signals, Systems, and Computers, ACSSC 2022, Pacific Grove, CA, USA, October 31, 2022

Data Shapley Valuation for Efficient Batch Active Learning.
Proceedings of the 56th Asilomar Conference on Signals, Systems, and Computers, ACSSC 2022, Pacific Grove, CA, USA, October 31, 2022

Grading of Prostate Whole-slide Images Using Weak Self-supervised Learning.
Proceedings of the 56th Asilomar Conference on Signals, Systems, and Computers, ACSSC 2022, Pacific Grove, CA, USA, October 31, 2022

2021
Large language models associate Muslims with violence.
Nat. Mach. Intell., 2021

Patient Experience Surveys Reveal Gender-Biased Descriptions of Their Care Providers.
J. Medical Syst., 2021

Explaining medical AI performance disparities across sites with confounder Shapley value analysis.
CoRR, 2021

Disparities in Dermatology AI: Assessments Using Diverse Clinical Images.
CoRR, 2021

Did the Model Change? Efficiently Assessing Machine Learning API Shifts.
CoRR, 2021

Do Humans Trust Advice More if it Comes from AI? An Analysis of Human-AI Interactions.
CoRR, 2021

Meaningfully Explaining a Model's Mistakes.
CoRR, 2021

High-Throughput Precision Phenotyping of Left Ventricular Hypertrophy with Cardiovascular Deep Learning.
CoRR, 2021

Group-Structured Adversarial Training.
CoRR, 2021

FrugalMCT: Efficient Online ML API Selection for Multi-Label Classification Tasks.
CoRR, 2021

TrueImage: A Machine Learning Algorithm to Improve the Quality of Telehealth Photos.
Proceedings of the Biocomputing 2021: Proceedings of the Pacific Symposium, 2021

Adversarial Training Helps Transfer Learning via Better Representations.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Neural Group Testing to Accelerate Deep Learning.
Proceedings of the IEEE International Symposium on Information Theory, 2021

Mixed Dimension Embeddings with Application to Memory-Efficient Recommendation Systems.
Proceedings of the IEEE International Symposium on Information Theory, 2021

Improving Generalization in Meta-learning via Task Augmentation.
Proceedings of the 38th International Conference on Machine Learning, 2021

How to Learn when Data Reacts to Your Model: Performative Gradient Descent.
Proceedings of the 38th International Conference on Machine Learning, 2021

How Does Mixup Help With Robustness and Generalization?
Proceedings of the 9th International Conference on Learning Representations, 2021

Racial Representation Analysis in Dermatology Academic Materials.
Proceedings of the AMIA 2021, American Medical Informatics Association Annual Symposium, San Diego, CA, USA, October 30, 2021, 2021

Efficient Computation and Analysis of Distributional Shapley Values.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Approximate Data Deletion from Machine Learning Models.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Competing AI: How does competition feedback affect machine learning?
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Improving Adversarial Robustness via Unlabeled Out-of-Domain Data.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Who's Responsible? Jointly Quantifying the Contribution of the Learning Algorithm and Data.
Proceedings of the AIES '21: AAAI/ACM Conference on AI, 2021

Persistent Anti-Muslim Bias in Large Language Models.
Proceedings of the AIES '21: AAAI/ACM Conference on AI, 2021

2020
How Much Does Your Data Exploration Overfit? Controlling Bias via Information Usage.
IEEE Trans. Inf. Theory, 2020

Deep learning interpretation of echocardiograms.
npj Digit. Medicine, 2020

Video-based AI for beat-to-beat assessment of cardiac function.
Nat., 2020

An online platform for interactive feedback in biomedical machine learning.
Nat. Mach. Intell., 2020

Data Valuation for Medical Imaging Using Shapley Value: Application on A Large-scale Chest X-ray Dataset.
CoRR, 2020

Competing AI: How competition feedback affects machine learning.
CoRR, 2020

Improving Training on Noisy Stuctured Labels.
CoRR, 2020

Approximate Data Deletion from Machine Learning Models: Algorithms and Evaluations.
CoRR, 2020

Predicting target genes of non-coding regulatory variants with IRT.
Bioinform., 2020

LitGen: Genetic Literature Recommendation Guided by Human Explanations.
Proceedings of the Pacific Symposium on Biocomputing 2020, 2020

MOPO: Model-based Offline Policy Optimization.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Neuron Shapley: Discovering the Responsible Neurons.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

FrugalML: How to use ML Prediction APIs more accurately and cheaply.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

A Distributional Framework For Data Valuation.
Proceedings of the 37th International Conference on Machine Learning, 2020

Learning transport cost from subset correspondence.
Proceedings of the 8th International Conference on Learning Representations, 2020

ALICE: Active Learning with Contrastive Natural Language Explanations.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
VetTag: improving automated veterinary diagnosis coding via large-scale language modeling.
npj Digit. Medicine, 2019

Sex and gender analysis improves science and engineering.
Nat., 2019

Feedback GAN for DNA optimizes protein functions.
Nat. Mach. Intell., 2019

Who's responsible? Jointly quantifying the contribution of the learning algorithm and training data.
CoRR, 2019

Gradio: Hassle-Free Sharing and Testing of ML Models in the Wild.
CoRR, 2019

Contrastive Variational Autoencoder Enhances Salient Features.
CoRR, 2019

AdaFDR: A Fast, Powerful and Covariate-Adaptive Approach to Multiple Hypothesis Testing.
Proceedings of the Research in Computational Molecular Biology, 2019

Making AI Forget You: Data Deletion in Machine Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Towards Automatic Concept-based Explanations.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Adaptive Monte Carlo Multiple Testing via Multi-Armed Bandits.
Proceedings of the 36th International Conference on Machine Learning, 2019

Discovering Conditionally Salient Features with Statistical Guarantees.
Proceedings of the 36th International Conference on Machine Learning, 2019

Data Shapley: Equitable Valuation of Data for Machine Learning.
Proceedings of the 36th International Conference on Machine Learning, 2019

Concrete Autoencoders: Differentiable Feature Selection and Reconstruction.
Proceedings of the 36th International Conference on Machine Learning, 2019

Contingent Payment Mechanisms for Resource Utilization.
Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

Contrastive Multivariate Singular Spectrum Analysis.
Proceedings of the 57th Annual Allerton Conference on Communication, 2019

Improving the Stability of the Knockoff Procedure: Multiple Simultaneous Knockoffs and Entropy Maximization.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Knockoffs for the Mass: New Feature Importance Statistics with False Discovery Guarantees.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Multiaccuracy: Black-Box Post-Processing for Fairness in Classification.
Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 2019

Interpretation of Neural Networks Is Fragile.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Word embeddings quantify 100 years of gender and ethnic stereotypes.
Proc. Natl. Acad. Sci. USA, 2018

DeepTag: inferring diagnoses from veterinary clinical notes.
npj Digit. Medicine, 2018

Large-scale Generative Modeling to Improve Automated Veterinary Disease Coding.
CoRR, 2018

Autowarp: Learning a Warping Distance from Unlabeled Time Series Using Sequence Autoencoders.
CoRR, 2018

DeepTag: inferring all-cause diagnoses from clinical notes in under-resourced medical domain.
CoRR, 2018

Feedback GAN (FBGAN) for DNA: a Novel Feedback-Loop Architecture for Optimizing Protein Functions.
CoRR, 2018

Stochastic EM for Shuffled Linear Regression.
CoRR, 2018

Learning a Warping Distance from Unlabeled Time Series Using Sequence Autoencoders.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

CoVeR: Learning Covariate-Specific Vector Representations with Tensor Decompositions.
Proceedings of the 35th International Conference on Machine Learning, 2018

The Effects of Memory Replay in Reinforcement Learning.
Proceedings of the 56th Annual Allerton Conference on Communication, 2018

Embedding for Informative Missingness: Deep Learning With Incomplete Data.
Proceedings of the 56th Annual Allerton Conference on Communication, 2018

A Stochastic Expectation-Maximization Approach to Shuffled Linear Regression.
Proceedings of the 56th Annual Allerton Conference on Communication, 2018

Why Adaptively Collected Data Have Negative Bias and How to Correct for It.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017
Contrastive Principal Component Analysis.
CoRR, 2017

Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context.
Proceedings of the 2nd Workshop on Representation Learning for NLP, 2017

NeuralFDR: Learning Discovery Thresholds from Hypothesis Features.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Learning Latent Space Models with Angular Constraints.
Proceedings of the 34th International Conference on Machine Learning, 2017

Estimating the unseen from multiple populations.
Proceedings of the 34th International Conference on Machine Learning, 2017

2016
Clustering with a Reject Option: Interactive Clustering as Bayesian Prior Elicitation.
CoRR, 2016

Contingent Payment Mechanisms to Maximize Resource Utilization.
CoRR, 2016

Quantifying and Reducing Stereotypes in Word Embeddings.
CoRR, 2016

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

2015
Inferring parental genomic ancestries using pooled semi-Markov processes.
Bioinform., 2015

Incentive-Compatible Experimental Design.
Proceedings of the Sixteenth ACM Conference on Economics and Computation, 2015

Crowdsourcing Feature Discovery via Adaptively Chosen Comparisons.
Proceedings of the Third AAAI Conference on Human Computation and Crowdsourcing, 2015

Strategic Voting Behavior in Doodle Polls.
Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, 2015

2013
Contrastive Learning Using Spectral Methods.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

2012
Mechanism Design for Time Critical and Cost Critical Task Execution via Crowdsourcing.
Proceedings of the Internet and Network Economics - 8th International Workshop, 2012

Priors for Diversity in Generative Latent Variable Models.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

A Slime Mold Solver for Linear Programming Problems.
Proceedings of the How the World Computes, 2012

Threats and Trade-Offs in Resource Critical Crowdsourcing Tasks Over Networks.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2010
Tolerable Manipulability in Dynamic Assignment without Money.
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

