Ce Zhang

ORCID: 0000-0002-8105-7505

Affiliations:
  • ETH Zurich, Institute for Computing Platforms, Switzerland
  • Stanford University, Computer Science Department, CA, USA (former)
  • University of Wisconsin-Madison, Department of Computer Science, Madison, WI, USA (former)
  • Peking University, Beijing, China (former)


According to our database, Ce Zhang authored at least 263 papers between 2008 and 2024.

Bibliography

2024
Stochastic gradient descent without full data shuffle: with applications to in-database machine learning and deep learning systems.
VLDB J., September, 2024

How good are machine learning clouds? Benchmarking two snapshots over 5 years.
VLDB J., May, 2024

A systematic evaluation of machine learning on serverless infrastructure.
VLDB J., 2024

OpenBox: A Python Toolkit for Generalized Black-box Optimization.
J. Mach. Learn. Res., 2024

Red Onions, Soft Cheese and Data: From Food Safety to Data Traceability for Responsible AI.
IEEE Data Eng. Bull., 2024

Semantic-Enhanced Indirect Call Analysis with Large Language Models.
CoRR, 2024

OAM-TCD: A globally diverse dataset of high-resolution tree cover maps.
CoRR, 2024

Analysis of Distributed Optimization Algorithms on a Real Processing-In-Memory System.
CoRR, 2024

TablePuppet: A Generic Framework for Relational Federated Learning.
CoRR, 2024


Improving Privacy-Preserving Vertical Federated Learning by Efficient Communication with ADMM.
Proceedings of the IEEE Conference on Secure and Trustworthy Machine Learning, 2024

Mechanistic Design and Scaling of Hybrid Architectures.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Effective and Efficient Federated Tree Learning on Hybrid Data.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Data Debugging with Shapley Importance over Machine Learning Pipelines.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

PIM-Opt: Demystifying Distributed Optimization Algorithms on a Real-World Processing-In-Memory System.
Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, 2024

2023
Convolution-Enhanced Evolving Attention Networks.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

VolcanoML: speeding up end-to-end AutoML via scalable search space decomposition.
VLDB J., March, 2023

Holistic Evaluation of Language Models.
Trans. Mach. Learn. Res., 2023

Lasagne: A Multi-Layer Graph Convolutional Network Framework via Node-Aware Deep Architecture.
IEEE Trans. Knowl. Data Eng., 2023

Towards General and Efficient Online Tuning for Spark.
Proc. VLDB Endow., 2023

DMLR: Data-centric Machine Learning Research - Past, Present and Future.
CoRR, 2023

In-Context Few-Shot Relation Extraction via Pre-Trained Language Models.
CoRR, 2023

BenchTemp: A General Benchmark for Evaluating Temporal Graph Neural Networks.
CoRR, 2023

Improving Retrieval-Augmented Large Language Models via Data Importance Learning.
CoRR, 2023

OpenBox: A Python Toolkit for Generalized Black-box Optimization.
CoRR, 2023

High-throughput Generative Inference of Large Language Models with a Single GPU.
CoRR, 2023

Hierarchical Classification of Research Fields in the "Web of Science" Using Deep Learning.
CoRR, 2023

RAB: Provable Robustness Against Backdoor Attacks.
Proceedings of the 44th IEEE Symposium on Security and Privacy, 2023

Proactively Screening Machine Learning Pipelines with ARGUSEYES.
Proceedings of the Companion of the 2023 International Conference on Management of Data, 2023

CARE: Certifiably Robust Learning with Reasoning via Variational Inference.
Proceedings of the 2023 IEEE Conference on Secure and Trustworthy Machine Learning, 2023

WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023


Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Skill-it! A data-driven skills framework for understanding and training language models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks.
Proceedings of the International Conference on Machine Learning, 2023

FedHPO-Bench: A Benchmark Suite for Federated Hyperparameter Optimization.
Proceedings of the International Conference on Machine Learning, 2023

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time.
Proceedings of the International Conference on Machine Learning, 2023

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU.
Proceedings of the International Conference on Machine Learning, 2023

Contrastive Learning for Unsupervised Domain Adaptation of Time Series.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

DSG: An End-to-End Document Structure Generator.
Proceedings of the IEEE International Conference on Data Mining, 2023

DBCatcher: A Cloud Database Online Anomaly Detection System based on Indicator Correlation.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

Automatic Feasibility Study via Data Quality Analysis for ML: A Case-Study on Label Noise.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

2022
Author Correction: Advances, challenges and opportunities in creating data for trustworthy AI.
Nat. Mach. Intell., October, 2022


Deep Learning for Recommender Systems.
Proceedings of the Recommender Systems Handbook, 2022

Data Science Through the Looking Glass: Analysis of Millions of GitHub Notebooks and ML.NET Pipelines.
SIGMOD Rec., 2022

SHiFT: An Efficient, Flexible Search Engine for Transfer Learning.
Proc. VLDB Endow., 2022

Hyper-Tune: Towards Efficient Hyper-parameter Tuning at Scale.
Proc. VLDB Endow., 2022

Advances, challenges and opportunities in creating data for trustworthy AI.
Nat. Mach. Intell., 2022

Bringing artificial intelligence to business management.
Nat. Mach. Intell., 2022

Number-Adaptive Prototype Learning for 3D Point Cloud Semantic Segmentation.
CoRR, 2022

New primitives for bounded degradation in network service.
CoRR, 2022

Improving Privacy-Preserving Vertical Federated Learning by Efficient Communication with ADMM.
CoRR, 2022

DataPerf: Benchmarks for Data-Centric AI Development.
CoRR, 2022

GraphFramEx: Towards Systematic Evaluation of Explainability Methods for Graph Neural Networks.
CoRR, 2022

Efficient End-to-End AutoML via Scalable Search Space Decomposition.
CoRR, 2022

Stochastic Gradient Descent without Full Data Shuffle.
CoRR, 2022

FedHPO-B: A Benchmark Suite for Federated Hyperparameter Optimization.
CoRR, 2022

Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees.
CoRR, 2022

Data Debugging with Shapley Importance over End-to-End Machine Learning Pipelines.
CoRR, 2022

Modelling graph dynamics in fraud detection with "Attention".
CoRR, 2022

Experiments as Code: A Concept for Reproducible, Auditable, Debuggable, Reusable, & Scalable Experiments.
CoRR, 2022

A Deep Markov Model for Clickstream Analytics in Online Shopping.
Proceedings of the WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25, 2022

Keyword Extraction in Scientific Documents.
Proceedings of the Swiss Text Analytics Conference 2022, Lugano, 2022

LINKTELLER: Recovering Private Edges from Graph Neural Networks via Influence Analysis.
Proceedings of the 43rd IEEE Symposium on Security and Privacy, 2022

In-Database Machine Learning with CorgiPile: Stochastic Gradient Descent without Full Data Shuffle.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

HUNTER: An Online Cloud Database Hybrid Tuning System for Personalized Requirements.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

Decentralized Training of Foundation Models in Heterogeneous Environments.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Improving Certified Robustness via Statistical Learning with Logical Reasoning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Certifying Some Distributional Fairness with Subpopulation Decomposition.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

VF-PS: How to Select Important Participants in Vertical Federated Learning, Efficiently and Securely?
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Dynamic Human Evaluation for Relative Model Comparisons.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

GraphFramEx: Towards Systematic Evaluation of Explainability Methods for Graph Neural Networks.
Proceedings of the Learning on Graphs Conference, 2022

Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

TransBO: Hyperparameter Optimization via Two-Phase Transfer Learning.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Transfer Learning based Search Space Design for Hyperparameter Tuning.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Certifying Out-of-Domain Generalization for Blackbox Functions.
Proceedings of the International Conference on Machine Learning, 2022

iFlood: A Stable and Effective Regularizer.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Neural Methods for Logical Reasoning over Knowledge Graphs.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Lasagne: A Multi-Layer Graph Convolutional Network Framework via Node-aware Deep Architecture (Extended Abstract).
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

User-Centered Information Architecture of Vehicle AR-HUD Interface.
Proceedings of the HCI in Mobility, Transport, and Automotive Systems, 2022

dcbench: a benchmark for data-centric AI systems.
Proceedings of the DEEM '22: Sixth Workshop on Data Management for End-To-End Machine Learning, Philadelphia, 2022

Which Model to Transfer? Finding the Needle in the Growing Haystack.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

BRIGHT - Graph Neural Networks in Real-time Fraud Detection.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

Screening Native Machine Learning Pipelines with ArgusEyes.
Proceedings of the 12th Conference on Innovative Data Systems Research, 2022


ReforesTree: A Dataset for Estimating Tropical Forest Carbon Stock with Deep Learning and Aerial Imagery.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets.
Proceedings of the Workshop on Scientific Document Understanding co-located with 36th AAAI Conference on Artificial Intelligence, 2022

Distributed Machine Learning and Gradient Optimization.
Springer, ISBN: 978-981-16-3419-2, 2022

2021
Model averaging in distributed machine learning: a case study with Apache Spark.
VLDB J., 2021

A Large-Scale Study of Android Malware Development Phenomenon on Public Malware Submission and Scanning Platform.
IEEE Trans. Big Data, 2021

WindTunnel: Towards Differentiable ML Pipelines Beyond a Single Model.
Proc. VLDB Endow., 2021

xFraud: Explainable Fraud Transaction Detection.
Proc. VLDB Endow., 2021

VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition.
Proc. VLDB Endow., 2021

Federated Matrix Factorization with Privacy Guarantee.
Proc. VLDB Endow., 2021

BAGUA: Scaling up Distributed Learning with System Relaxations.
Proc. VLDB Endow., 2021

A Data Quality-Driven View of MLOps.
IEEE Data Eng. Bull., 2021

RumbleML: program the lakehouse with JSONiq.
CoRR, 2021

Evaluating Bayes Error Estimators on Real-World Datasets with FeeBee.
CoRR, 2021

Tackling the Overestimation of Forest Carbon with Deep Learning and Aerial Imagery.
CoRR, 2021

TRS: Transferability Reduced Ensemble via Encouraging Gradient Diversity and Model Smoothness.
CoRR, 2021

Switch Spaces: Learning Product Spaces with Sparse Gating.
CoRR, 2021

Decoding EEG Brain Activity for Multi-Modal Natural Language Processing.
CoRR, 2021

Learning User Representations with Hypercuboids for Recommender Systems.
Proceedings of the WSDM '21, 2021

Towards Demystifying Serverless Machine Learning Training.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Towards understanding end-to-end learning in the context of data: machine learning dancing over semirings & Codd's table.
Proceedings of the Fifth Workshop on Data Management for End-To-End Machine Learning, 2021

TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Evaluating Bayes Error Estimators on Real-World Datasets with FeeBee.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Knowledge Router: Learning Disentangled Representations for Knowledge Graphs.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Multilingual Language Models Predict Human Reading Behavior.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions.
Proceedings of the Fourth Conference on Machine Learning and Systems, 2021

FIVES: Feature Interaction Via Edge Search for Large-Scale Tabular Data.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

DeGNN: Improving Graph Neural Networks with Graph Decomposition.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

AutoML: A Perspective where Industry Meets Academy.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

OpenBox: A Generalized Black-box Optimization Service.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

FleetRec: Large-Scale Recommendation Inference on Hybrid GPU-FPGA Clusters.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Evolving Attention with Residual Convolutions.
Proceedings of the 38th International Conference on Machine Learning, 2021

1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed.
Proceedings of the 38th International Conference on Machine Learning, 2021

Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial Attacks.
Proceedings of the 38th International Conference on Machine Learning, 2021

CleanML: A Study for Evaluating the Impact of Data Cleaning on ML Classification Tasks.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

Scalability vs. Utility: Do We Have To Sacrifice One for the Other in Data Importance Quantification?
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

AutoML: From Methodology to Application.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

Ease.ML: A Lifecycle Management System for Machine Learning.
Proceedings of the 11th Conference on Innovative Data Systems Research, 2021

DataLens: Scalable Privacy Preserving Training via Gradient Compression and Aggregation.
Proceedings of the CCS '21: 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual Event, Republic of Korea, November 15, 2021

TSS: Transformation-Specific Smoothing for Robustness Certification.
Proceedings of the CCS '21: 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual Event, Republic of Korea, November 15, 2021

Online Active Model Selection for Pre-trained Classifiers.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

DocParser: Hierarchical Document Structure Parsing from Renderings.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

MFES-HB: Efficient Hyperband with Multi-Fidelity Quality Measurements.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
A Principled Approach to Data Valuation for Federated Learning.
Proceedings of the Federated Learning - Privacy and Incentive, 2020

Compressive Sensing Using Iterative Hard Thresholding With Low Precision Data Representation: Theory and Applications.
IEEE Trans. Signal Process., 2020

Ease.ml/snoopy in Action: Towards Automatic Feasibility Analysis for Machine Learning Application Development.
Proc. VLDB Endow., 2020

Nearest Neighbor Classifiers over Incomplete Information: From Certain Answers to Certain Predictions.
Proc. VLDB Endow., 2020

Neural dynamics of sentiment processing during naturalistic sentence reading.
NeuroImage, 2020

RosENet: Improving Binding Affinity Prediction by Leveraging Molecular Mechanics Energies with an Ensemble of 3D Convolutional Neural Networks.
J. Chem. Inf. Model., 2020

Distributed Learning Systems with First-Order Methods.
Found. Trends Databases, 2020

Suspicious Massive Registration Detection via Dynamic Heterogeneous Graph Neural Networks.
CoRR, 2020

xFraud: Explainable Fraud Transaction Detection on Heterogeneous Graphs.
CoRR, 2020

On Automatic Feasibility Study for Machine Learning Application Development with ease.ml/snoopy.
CoRR, 2020

MicroRec: Accelerating Deep Recommendation Systems to Microseconds by Hardware and Data Structure Solutions.
CoRR, 2020

Optimal Provable Robustness of Quantum Classification via Quantum Hypothesis Testing.
CoRR, 2020

A Principled Approach to Data Valuation for Federated Learning.
CoRR, 2020

APMSqueeze: A Communication Efficient Adam-Preconditioned Momentum SGD Algorithm.
CoRR, 2020

Interactive Feature Generation via Learning Adjacency Tensor of Feature Graph.
CoRR, 2020

TrueBranch: Metric Learning-based Verification of Forest Conservation Projects.
CoRR, 2020

Improving BERT with Self-Supervised Attention.
CoRR, 2020

End-to-end Robustness for Sensing-Reasoning Machine Learning Pipelines.
CoRR, 2020

Provable Robust Learning Based on Transformation-Specific Smoothing.
CoRR, 2020

Learning to Mutate with Hypergradient Guided Population.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

On Convergence of Nearest Neighbor Classifiers over Feature Transformations.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Spectral Temporal Graph Neural Network for Multivariate Time-series Forecasting.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

ZuCo 2.0: A Dataset of Physiological Recordings During Natural Reading and Annotation.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Building Continuous Integration Services for Machine Learning.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Don't Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TinyScript.
Proceedings of the 37th International Conference on Machine Learning, 2020

ColumnSGD: A Column-oriented Framework for Distributed Stochastic Gradient Descent.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

Control, Generate, Augment: A Scalable Framework for Multi-Attribute Text Generation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Observer Dependent Lossy Image Compression.
Proceedings of the Pattern Recognition - 42nd DAGM German Conference, DAGM GCPR 2020, Tübingen, Germany, September 28, 2020

CogniVal in Action: An Interface for Customizable Cognitive Word Embedding Evaluation.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

TextNAS: A Neural Architecture Search Space Tailored for Text Representation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Efficient Automatic CASH via Rising Bandits.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-precision Learning.
Proc. VLDB Endow., 2019

Ease.ml/ci and Ease.ml/meter in Action: Towards Data Management for Statistical Generalization.
Proc. VLDB Endow., 2019

Opportunities for Data Management Research in the Era of Horizontal AI/ML.
Proc. VLDB Endow., 2019

doppioDB 2.0: Hardware Techniques for Improved Integration of Machine Learning into Databases.
Proc. VLDB Endow., 2019

Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms.
Proc. VLDB Endow., 2019

Data Science through the looking glass and what we found there.
CoRR, 2019

An Empirical and Comparative Analysis of Data Valuation with Scalable Algorithms.
CoRR, 2019

DocParser: Hierarchical Structure Parsing of Document Renderings.
CoRR, 2019

An Anatomy of Graph Neural Networks Going Deep via the Lens of Mutual Information: Exponential Decay vs. Full Preservation.
CoRR, 2019

Lossy Image Compression with Recurrent Neural Networks: from Human Perceived Visual Quality to Classification Accuracy.
CoRR, 2019

DeepSqueeze: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression.
CoRR, 2019

Quantitative Overfitting Management for Human-in-the-loop ML Application Development with ease.ml/meter.
CoRR, 2019

CleanML: A Benchmark for Joint Data Cleaning and Machine Learning [Experiments and Analysis].
CoRR, 2019

SysML: The New Frontier of Machine Learning Systems.
CoRR, 2019

Advancing NLP with Cognitive Language Processing Signals.
CoRR, 2019

Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-precision Learning (Technical Report).
CoRR, 2019

Sensing Social Media Signals for Cryptocurrency News.
Proceedings of the Companion of The 2019 World Wide Web Conference, 2019

Is advance knowledge of flow sizes a plausible assumption?
Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation, 2019

Entity Recognition at First Sight: Improving NER with Eye Movement Information.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Continuous Integration of Machine Learning Models with ease.ml/ci: Towards a Rigorous Yet Practical Treatment.
Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019

Distributed Learning over Unreliable Networks.
Proceedings of the 36th International Conference on Machine Learning, 2019

MLlib*: Fast Training of GLMs Using Spark MLlib.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

CogniVal: A Framework for Cognitive Word Embedding Evaluation.
Proceedings of the 23rd Conference on Computational Natural Language Learning, 2019

AutoML from Service Provider's Perspective: Multi-device, Multi-tenant Model Selection with GP-EI.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Towards Efficient Data Valuation Based on the Shapley Value.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
MLBench: Benchmarking Machine Learning Services Against Human Experts.
Proc. VLDB Endow., 2018

Ease.ml: Towards Multi-tenant Resource Sharing for Machine Learning Workloads.
Proc. VLDB Endow., 2018

Ease.ml in Action: Towards Multi-tenant Declarative Learning Services.
Proc. VLDB Endow., 2018

ColumnML: Column-Store Machine Learning with On-The-Fly Data Transformation.
Proc. VLDB Endow., 2018

Exploring galaxy evolution with generative models.
CoRR, 2018

Using transfer learning to detect galaxy mergers.
CoRR, 2018

Multi-device, Multi-tenant Model Selection with GP-EI.
CoRR, 2018

Decentralization Meets Quantization.
CoRR, 2018

Compressive Sensing with Low Precision Data Representation: Radio Astronomy and Beyond.
CoRR, 2018

DataBright: Towards a Global Exchange for Decentralized Data Ownership and Trusted Computation.
CoRR, 2018

DimBoost: Boosting Gradient Boosting Decision Tree to Higher Dimensions.
Proceedings of the 2018 International Conference on Management of Data, 2018

ETH-DS3Lab at SemEval-2018 Task 7: Effectively Combining Recurrent and Convolutional Neural Networks for Relation Classification and Extraction.
Proceedings of The 12th International Workshop on Semantic Evaluation, 2018

Communication Compression for Decentralized Training.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

D²: Decentralized Training over Decentralized Data.
Proceedings of the 35th International Conference on Machine Learning, 2018

Asynchronous Decentralized Parallel Stochastic Gradient Descent.
Proceedings of the 35th International Conference on Machine Learning, 2018

Synchronous Multi-GPU Training for Deep Learning with Low-Precision Communications: An Empirical Study.
Proceedings of the 21st International Conference on Extending Database Technology, 2018

Inferring Short-Term Volatility Indicators from the Bitcoin Blockchain.
Proceedings of the Complex Networks and Their Applications VII, 2018

Network Scheduling in the Dark.
Proceedings of the ACM Symposium on Cloud Computing, 2018

Layerwise Systematic Scan: Deep Boltzmann Machines and Beyond.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

Patient Risk Assessment and Warning Symptom Detection Using Deep Attention-Based Neural Networks.
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis, 2018

2017
Incremental knowledge base construction using DeepDive.
VLDB J., 2017

An Experimental Evaluation of SimRank-based Similarity Search Algorithms.
Proc. VLDB Endow., 2017

LDA*: A Robust and Large-scale Topic Modeling System.
Proc. VLDB Endow., 2017

MLog: Towards Declarative In-Database Machine Learning.
Proc. VLDB Endow., 2017

How Good Are Machine Learning Clouds for Binary Classification with Good Features?
CoRR, 2017

Generative Adversarial Networks recover features in astrophysical images of galaxies beyond the deconvolution limit.
CoRR, 2017

DeepDive: declarative knowledge base construction.
Commun. ACM, 2017

An Overreaction to the Broken Machine Learning Abstraction: The ease.ml Vision.
Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics, 2017

Heterogeneity-aware Distributed Parameter Servers.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning.
Proceedings of the 34th International Conference on Machine Learning, 2017

TencentBoost: A Gradient Boosting Tree System with Parameter Server.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

Scalable inference of decision tree ensembles: Flexible design for CPU-FPGA platforms.
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

FPGA-Accelerated Dense Linear Machine Learning: A Precision-Convergence Trade-Off.
Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017

How good are machine learning clouds for binary classification with good features?: extended abstract.
Proceedings of the 2017 Symposium on Cloud Computing, SoCC 2017, Santa Clara, CA, USA, 2017

Predicting Non-Small Cell Lung Cancer Diagnosis and Prognosis by Fully Automated Microscopic Pathology Image Features.
Proceedings of the AMIA 2017, 2017

2016
Materialization Optimizations for Feature Selection Workloads.
ACM Trans. Database Syst., 2016

DeepDive: Declarative Knowledge Base Construction.
SIGMOD Rec., 2016

ZipML: An End-to-end Bitwise Framework for Dense Generalized Linear Models.
CoRR, 2016

CYCLADES: Conflict-free Asynchronous Machine Learning.
CoRR, 2016

Omnivore: An Optimizer for Multi-device Deep Learning on CPUs and GPUs.
CoRR, 2016

Large-scale extraction of gene interactions from full-text literature using DeepDive.
Bioinform., 2016

Extracting Databases from Dark Data with DeepDive.
Proceedings of the 2016 International Conference on Management of Data, 2016

Android malware development on public malware scanning platforms: A large-scale data-driven study.
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

Asynchrony begets momentum, with an application to deep learning.
Proceedings of the 54th Annual Allerton Conference on Communication, 2016

2015
Incremental Knowledge Base Construction Using DeepDive.
Proc. VLDB Endow., 2015

Building a Large-scale Multimodal Knowledge Base for Visual Question Answering.
CoRR, 2015

Incremental Knowledge Base Construction Using DeepDive.
CoRR, 2015

Caffe con Troll: Shallow Ideas to Speed Up Deep Learning.
Proceedings of the Fourth Workshop on Data analytics in the Cloud, 2015

Taming the Wild: A Unified Analysis of Hogwild-Style Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

2014
DimmWitted: A Study of Main-Memory Statistical Analytics.
Proc. VLDB Endow., 2014

Feature Engineering for Knowledge Base Construction.
IEEE Data Eng. Bull., 2014

A machine-compiled macroevolutionary history of Phanerozoic life.
CoRR, 2014

Tradeoffs in Main-Memory Statistical Analytics from Impala to DimmWitted.
Proceedings of the 2nd International Workshop on In Memory Data Management and Analytics, 2014

Parallel Feature Selection Inspired by Group Testing.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013
An Approximate, Efficient Solver for LP Rounding.
CoRR, 2013

Bootstrapping Knowledge Base Acceleration.
Proceedings of The Twenty-Second Text REtrieval Conference, 2013

Evaluating Stream Filtering for Entity Profile Updates for TREC 2013.
Proceedings of The Twenty-Second Text REtrieval Conference, 2013

Towards high-throughput gibbs sampling at scale: a study across storage managers.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

GeoDeepDive: statistical inference using familiar data-processing languages.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

An Approximate, Efficient LP Solver for LP Rounding.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

A Markov logic framework for recognizing complex events from multimodal data.
Proceedings of the 2013 International Conference on Multimodal Interaction, 2013

Brainwash: A Data System for Feature Engineering.
Proceedings of the Sixth Biennial Conference on Innovative Data Systems Research, 2013

Understanding Tables in Context Using Standard NLP Toolkits.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

2012
Elementary: Large-Scale Knowledge-Base Construction via Machine Learning and Statistical Inference.
Int. J. Semantic Web Inf. Syst., 2012

DeepDive: Web-scale Knowledge-base Construction using Statistical Learning and Inference.
Proceedings of the Second International Workshop on Searching and Integrating New Web Data Sources, 2012

Building an Entity-Centric Stream Filtering Test Collection for TREC 2012.
Proceedings of The Twenty-First Text REtrieval Conference, 2012

Scaling Inference for Markov Logic via Dual Decomposition.
Proceedings of the 12th IEEE International Conference on Data Mining, 2012

Big Data versus the Crowd: Looking for Relationships in All the Right Places.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

2011
Felix: Scaling Inference for Markov Logic with an Operator-based Approach.
CoRR, 2011

Modeling User Expertise in Folksonomies by Fusing Multi-type Features.
Proceedings of the Database Systems for Advanced Applications, 2011

2010
Multiple feature fusion for social media applications.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2010

Content-enriched classifier for web video classification.
Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

2009
A Revisit of Query Expansion with Different Semantic Levels.
Proceedings of the Database Systems for Advanced Applications, 2009

Video Annotation System Based on Categorizing and Keyword Labelling.
Proceedings of the Database Systems for Advanced Applications, 2009

The use of categorization information in language models for question retrieval.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

2008
Semantic similarity based on compact concept ontology.
Proceedings of the 17th International Conference on World Wide Web, 2008

