Bingsheng He

Orcid: 0000-0001-8618-4581

Affiliations:
  • National University of Singapore, Department of Computer Science, Singapore
  • Nanyang Technological University (NTU), School of Computer Engineering, Singapore
  • Microsoft Research Asia (MSRA), Beijing, China
  • Hong Kong University of Science & Technology (HKUST), Hong Kong (PhD)


According to our database, Bingsheng He authored at least 424 papers between 1986 and 2024.

Bibliography

2024
A Survey on Concurrent Processing of Graph Analytical Queries: Systems and Algorithms.
IEEE Trans. Knowl. Data Eng., November, 2024

Spade+: A Generic Real-Time Fraud Detection Framework on Dynamic Graphs.
IEEE Trans. Knowl. Data Eng., November, 2024

Optimizing the Number of Clusters for Billion-Scale Quantization-Based Nearest Neighbor Search.
IEEE Trans. Knowl. Data Eng., November, 2024

Large-Scale Graph Label Propagation on GPUs.
IEEE Trans. Knowl. Data Eng., October, 2024

GPU-based butterfly counting.
VLDB J., September, 2024

Enabling Adaptive Sampling for Intra-Window Join: Simultaneously Optimizing Quantity and Quality.
Proc. ACM Manag. Data, September, 2024

Spade: A Real-Time Fraud Detection Framework.
Proc. VLDB Endow., August, 2024

OFL-W3: A One-shot Federated Learning System on Web 3.0.
Proc. VLDB Endow., August, 2024

LLM-PBE: Assessing Data Privacy in Large Language Models.
Proc. VLDB Endow., July, 2024

RUSH: Real-time Burst Subgraph Discovery in Dynamic Graphs.
Proc. VLDB Endow., July, 2024

FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework.
Proc. VLDB Endow., April, 2024

OEBench: Investigating Open Environment Challenges in Real-World Relational Data Streams.
Proc. VLDB Endow., February, 2024

Introduction to distributed and parallel processing of big spatiotemporal data.
Future Gener. Comput. Syst., February, 2024

uBlade: Efficient Batch Processing for Uncertainty Graph Queries.
Proc. ACM Manag. Data, 2024

Aggressive Post-Training Compression on Extremely Large Language Models.
CoRR, 2024

MegaAgent: A Practical Framework for Autonomous Cooperation in Large-Scale LLM Agent Systems.
CoRR, 2024

Revisiting, Benchmarking and Understanding Unsupervised Graph Domain Adaptation.
CoRR, 2024

A Reflective LLM-based Agent to Guide Zero-shot Cryptocurrency Trading.
CoRR, 2024

BuffGraph: Enhancing Class-Imbalanced Node Classification via Buffer Nodes.
CoRR, 2024

Collaborate to Adapt: Source-Free Graph Domain Adaptation via Bi-directional Adaptation.
Proceedings of the ACM on Web Conference 2024, 2024

ModelGo: A Practical Tool for Machine Learning License Analysis.
Proceedings of the ACM on Web Conference 2024, 2024

Bandwidth Expansion via CXL: A Pathway to Accelerating In-Memory Analytical Processing.
Proceedings of Workshops at the 50th International Conference on Very Large Data Bases, 2024

Practical Hybrid Gradient Compression for Federated Learning Systems.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Partitioning Message Passing for Graph Fraud Detection.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

VertiBench: Advancing Feature Distribution Diversity in Vertical Federated Learning Benchmarks.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

EX-Graph: A Pioneering Dataset Bridging Ethereum and X.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Effective and Efficient Federated Tree Learning on Hybrid Data.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Consistency Training with Learnable Data Augmentation for Graph Anomaly Detection with Limited Supervision.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

CryptoTrade: A Reflective LLM-based Agent to Guide Zero-shot Cryptocurrency Trading.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

TaC: An Anti-Caching Key-Value Store on Heterogeneous Memory Architectures.
Proceedings of the 27th International Conference on Extending Database Technology, 2024

FaaSGraph: Enabling Scalable, Efficient, and Cost-Effective Graph Processing with Serverless Computing.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

Exploiting Label Skews in Federated Learning with Model Concatenation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
RACE: An Efficient Redundancy-aware Accelerator for Dynamic Graph Neural Network.
ACM Trans. Archit. Code Optim., December, 2023

HongTu: Scalable Full-Graph GNN Training on Multiple GPUs.
Proc. ACM Manag. Data, December, 2023

GraphTune: An Efficient Dependency-Aware Substrate to Alleviate Irregularity in Concurrent Graph Processing.
ACM Trans. Archit. Code Optim., September, 2023

Welcome.
Commun. ACM, July, 2023

NIOT: A Novel Inference Optimization of Transformers on Modern CPUs.
IEEE Trans. Parallel Distributed Syst., June, 2023

A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection.
IEEE Trans. Knowl. Data Eng., April, 2023

Efficient Decomposition Selection for Multi-class Classification.
IEEE Trans. Knowl. Data Eng., April, 2023

A rank-two relaxed parallel splitting version of the augmented Lagrangian method with step size in (0,2) for separable convex programming.
Math. Comput., February, 2023

FEBench: A Benchmark for Real-Time Relational Data Feature Extraction.
Proc. VLDB Endow., 2023

A Design Space Exploration and Evaluation for Main-Memory Hash Joins in Storage Class Memory.
Proc. VLDB Endow., 2023

Parallel Colorful h-star Core Maintenance in Dynamic Graphs.
Proc. VLDB Endow., 2023

DeltaBoost: Gradient Boosting Decision Trees with Efficient Machine Unlearning.
Proc. ACM Manag. Data, 2023

LightRW: FPGA Accelerated Graph Dynamic Random Walks.
Proc. ACM Manag. Data, 2023

Sequence-Based Target Coin Prediction for Cryptocurrency Pump-and-Dump.
Proc. ACM Manag. Data, 2023

HongTu: Scalable Full-Graph GNN Training on Multiple GPUs (via communication-optimized CPU data offloading).
CoRR, 2023

Efficient Heterogeneous Graph Learning via Random Projection.
CoRR, 2023

Reliable and Efficient In-Memory Fault Tolerance of Large Language Model Pretraining.
CoRR, 2023

ETGraph: A Pioneering Dataset Bridging Ethereum and Twitter.
CoRR, 2023

FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs.
CoRR, 2023

AI-powered Fraud Detection in Decentralized Finance: A Project Life Cycle Perspective.
CoRR, 2023

A Survey of Imbalanced Learning on Graphs: Problems, Techniques, and Future Directions.
CoRR, 2023

BERT4ETH: A Pre-trained Transformer for Ethereum Fraud Detection.
Proceedings of the ACM Web Conference 2023, 2023

Live Graph Lab: Towards Open, Dynamic and Real Transaction Graphs with NFT.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

FedTree: A Federated Learning System For Trees.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

Practical Edge Kernels for Integer-Only Vision Transformers Under Post-training Quantization.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

Real Time Index and Search Across Large Quantities of GNN Experts for Low Latency Online Learning.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Communication-Efficient Generalized Neuron Matching for Federated Learning.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

Adversarial Collaborative Learning on Non-IID Features.
Proceedings of the International Conference on Machine Learning, 2023

Towards Addressing Label Skews in One-Shot Federated Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Leveraging Data Density and Sparsity for Efficient SVM Training on GPUs.
Proceedings of the IEEE International Conference on Data Mining, 2023

EdgeNN: Efficient Neural Network Inference for CPU-GPU Integrated Edge Devices.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

A High-Performance Index for Real-Time Matrix Retrieval (Extended Abstract).
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

A Survey on Spark Ecosystem: Big Data Processing Infrastructure, Machine Learning, and Applications (Extended abstract).
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

OpenEmbedding: A Distributed Parameter Server for Deep Learning Recommendation Models using Persistent Memory.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

2022
A Generalized Primal-Dual Algorithm with Improved Convergence Condition for Saddle Point Problems.
SIAM J. Imaging Sci., September, 2022

The Serverless Computing Survey: A Technical Primer for Design Architecture.
ACM Comput. Surv., January, 2022

Payment behavior prediction on shared parking lots with TR-GCN.
VLDB J., 2022

ThunderGP: Resource-Efficient Graph Processing Framework on FPGAs with HLS.
ACM Trans. Reconfigurable Technol. Syst., 2022

Taming System Dynamics on Resource Optimization for Data Processing Workflows: A Probabilistic Approach.
IEEE Trans. Parallel Distributed Syst., 2022

Leveraging Code Snippets to Detect Variations in the Performance of HPC Systems.
IEEE Trans. Parallel Distributed Syst., 2022

Parallel and Distributed Structured SVM Training.
IEEE Trans. Parallel Distributed Syst., 2022

Periodic Weather-Aware LSTM With Event Mechanism for Parking Behavior Prediction.
IEEE Trans. Knowl. Data Eng., 2022

A High-Performance Index for Real-Time Matrix Retrieval.
IEEE Trans. Knowl. Data Eng., 2022

A Survey on Spark Ecosystem: Big Data Processing Infrastructure, Machine Learning, and Applications.
IEEE Trans. Knowl. Data Eng., 2022

The OARF Benchmark Suite: Characterization and Implications for Federated Learning Systems.
ACM Trans. Intell. Syst. Technol., 2022

A Structure-Aware Storage Optimization for Out-of-Core Concurrent Graph Processing.
IEEE Trans. Computers, 2022

A Stack-Centric Processing Model for Iterative Processing.
IEEE Trans. Big Data, 2022

Efficient Load-Balanced Butterfly Counting on GPU.
Proc. VLDB Endow., 2022

An In-Depth Study of Continuous Subgraph Matching.
Proc. VLDB Endow., 2022

RapidFlow: An Efficient Approach to Continuous Subgraph Matching.
Proc. VLDB Endow., 2022

Spade: A Real-Time Fraud Detection Framework on Evolving Graphs.
Proc. VLDB Endow., 2022

Hardware-software co-exploration with racetrack memory based in-memory computing for CNN inference in embedded systems.
J. Syst. Archit., 2022

On Convergence of the Arrow-Hurwicz Method for Saddle Point Problems.
J. Math. Imaging Vis., 2022

Privacy-preserving workflow scheduling in geo-distributed data centers.
Future Gener. Comput. Syst., 2022

Spade: A Real-Time Fraud Detection Framework on Evolving Graphs (Complete Version).
CoRR, 2022

Practical Vertical Federated Learning with Unsupervised Representation Learning.
CoRR, 2022

An In-Depth Study of Continuous Subgraph Matching (Complete Version).
CoRR, 2022

A Simulation Platform for Multi-tenant Machine Learning Services on Thousands of GPUs.
CoRR, 2022

A Coupled Design of Exploiting Record Similarity for Practical Vertical Federated Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

ReGraph: Scaling Graph Processing on HBM-enabled FPGAs with Heterogeneous Pipelines.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Dynamic Graph Segmentation for Deep Graph Neural Networks.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

TDGraph: a topology-driven accelerator for high-performance streaming graph processing.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler for Neural Networks.
Proceedings of the 51st International Conference on Parallel Processing, 2022

Adaptive Partitioning for Large-Scale Graph Analytics in Geo-Distributed Data Centers.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Federated Learning on Non-IID Data Silos: An Experimental Study.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

A System for Time Series Feature Extraction in Federated Learning.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

Micro-architectural Analysis of OLAP Systems on Persistent Memory.
Proceedings of the 12th Conference on Innovative Data Systems Research, 2022

2021
Fine-Grained Multi-Query Stream Processing on Integrated Architectures.
IEEE Trans. Parallel Distributed Syst., 2021

iMLBench: A Machine Learning Benchmark Suite for CPU-GPU Integrated Architectures.
IEEE Trans. Parallel Distributed Syst., 2021

YuenyeungSpTRSV: A Thread-Level and Warp-Level Fusion Synchronization-Free Sparse Triangular Solve.
IEEE Trans. Parallel Distributed Syst., 2021

Automatic Irregularity-Aware Fine-Grained Workload Partitioning on Integrated Architectures.
IEEE Trans. Knowl. Data Eng., 2021

Understanding and Optimizing Conjunctive Predicates Under Memory-Efficient Storage Layouts.
IEEE Trans. Knowl. Data Eng., 2021

HGP4CNN: an efficient parallelization framework for training convolutional neural networks on modern GPUs.
J. Supercomput., 2021

LargeGraph: An Efficient Dependency-Aware GPU-Accelerated Large-Scale Graph Processing.
ACM Trans. Archit. Code Optim., 2021

ThunderRW: An In-Memory Graph Random Walk Engine.
Proc. VLDB Endow., 2021

Optimizing An In-memory Database System For AI-powered On-line Decision Augmentation Using Persistent Memory.
Proc. VLDB Endow., 2021

Database Systems on GPUs.
Found. Trends Databases, 2021

VColor*: a practical approach for coloring large graphs.
Frontiers Comput. Sci., 2021

ThunderRW: An In-Memory Graph Random Walk Engine (Complete Version).
CoRR, 2021

DBL: Efficient Reachability Queries on Dynamic Graphs (Complete Version).
CoRR, 2021

A parallel splitting ALM-based algorithm for separable convex programming.
Comput. Optim. Appl., 2021

GPU-Accelerated Graph Label Propagation for Real-Time Fraud Detection.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

PathEnum: Towards Real-Time Hop-Constrained s-t Path Enumeration.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

MG-Join: A Scalable Join for Massively Parallel Multi-GPU Architectures.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Cache-Efficient Fork-Processing Patterns on Large Graphs.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Efficient Deep Learning Pipelines for Accurate Cost Estimations Over Large Scale Query Workload.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Parallelizing Intra-Window Join on Multicores: An Experimental Study.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

LCCG: a locality-centric hardware accelerator for high throughput of concurrent graph processing.
Proceedings of the International Conference for High Performance Computing, 2021

Enhancing SVMs with Problem Context Aware Pipeline.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Challenges and Opportunities of Building Fast GBDT Systems.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Practical One-Shot Federated Learning for Cross-Silo Setting.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

ThundeRiNG: generating multiple independent random number sequences on FPGAs.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

Gengar: An RDMA-based Distributed Hybrid Memory Pool.
Proceedings of the 41st IEEE International Conference on Distributed Computing Systems, 2021

TransMask: A Compact and Fast Speech Separation Model Based on Transformer.
Proceedings of the IEEE International Conference on Acoustics, 2021

DepGraph: A Dependency-Driven Accelerator for Efficient Iterative Graph Processing.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

ThunderGP: HLS-based Graph Processing Framework on FPGAs.
Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

DBL: Efficient Reachability Queries on Dynamic Graphs.
Proceedings of the Database Systems for Advanced Applications, 2021

Skew-Oblivious Data Routing for Data Intensive Applications on FPGAs with HLS.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

Model-Contrastive Federated Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

From Community Search to Community Understanding: A Multimodal Community Query Engine.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

2020
Cost-Aware Partitioning for Efficient Large Graph Processing in Geo-Distributed Datacenters.
IEEE Trans. Parallel Distributed Syst., 2020

gMig: Efficient vGPU Live Migration with Overlapped Software-Based Dirty Page Verification.
IEEE Trans. Parallel Distributed Syst., 2020

Adaptive Kernel Value Caching for SVM Training.
IEEE Trans. Neural Networks Learn. Syst., 2020

Performance Modeling and Directives Optimization for High-Level Synthesis on FPGA.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Accelerating Generative Neural Networks on Unmodified Deep Learning Processors - A Software Approach.
IEEE Trans. Computers, 2020

Object-Level Memory Allocation and Migration in Hybrid Memory Systems.
IEEE Trans. Computers, 2020

AsynGraph: Maximizing Data Parallelism for Efficient Iterative Graph Processing on GPUs.
ACM Trans. Archit. Code Optim., 2020

RapidMatch: A Holistic Approach to Subgraph Query Processing.
Proc. VLDB Endow., 2020

Improving Execution Efficiency of Just-in-time Compilation based Query Processing on GPUs.
Proc. VLDB Endow., 2020

Accelerating Exact Constrained Shortest Paths on GPUs.
Proc. VLDB Endow., 2020

G3: When Graph Neural Networks Meet Parallel Graph Processing Systems on GPUs.
Proc. VLDB Endow., 2020

ThunderGBM: Fast GBDTs and Random Forests on GPUs.
J. Mach. Learn. Res., 2020

Revisiting hash join on graphics processors: a decade later.
Distributed Parallel Databases, 2020

A Survey of Non-Volatile Main Memory Technologies: State-of-the-Arts, Practices, and Future Directions.
CoRR, 2020

Model-Agnostic Round-Optimal Federated Learning via Knowledge Transfer.
CoRR, 2020

Optimally linearizing the alternating direction method of multipliers for convex programming.
Comput. Optim. Appl., 2020

FineStream: Fine-Grained Window-Based Stream Processing on CPU-GPU Integrated Architectures.
Proceedings of the 2020 USENIX Annual Technical Conference, 2020

GPU-Accelerated Subgraph Enumeration on Partitioned Graphs.
Proceedings of the 2020 International Conference on Management of Data, 2020

X-TaSNet: Robust and Accurate Time-Domain Speaker Extraction Network.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

GAZEV: GAN-Based Zero-Shot Voice Conversion Over Non-Parallel Speech Corpus.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

PewLSTM: Periodic LSTM with Weather-Aware Gating Mechanism for Parking Behavior Prediction.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

CapelliniSpTRSV: A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

Towards Concurrent Stateful Stream Processing on Multicore Processors.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

PA-Tree: Polled-Mode Asynchronous B+ Tree for NVMe.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

Maxson: Reduce Duplicate Parsing Overhead on Raw Data.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

Energy Efficient In-memory Integer Multiplication Based on Racetrack Memory.
Proceedings of the 40th IEEE International Conference on Distributed Computing Systems, 2020

GradSA: Gradient Sparsification and Accumulation for Communication-Efficient Distributed Deep Learning.
Proceedings of the Green, Pervasive, and Cloud Computing - 15th International Conference, 2020

Poet: an Interactive Spatial Query Processing System in Grab.
Proceedings of the SIGSPATIAL '20: 28th International Conference on Advances in Geographic Information Systems, 2020

ByteSeries: an in-memory time series database for large-scale monitoring systems.
Proceedings of the SoCC '20: ACM Symposium on Cloud Computing, 2020

Is FPGA Useful for Hash Joins?
Proceedings of the 10th Conference on Innovative Data Systems Research, 2020

Privacy-Preserving Gradient Boosting Decision Trees.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Practical Federated Gradient Boosting Decision Trees.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Big Data and Exascale Computing.
Encyclopedia of Big Data Technologies, 2019

Search and Query Accelerators.
Encyclopedia of Big Data Technologies, 2019

GPU-Based Hardware Platforms.
Encyclopedia of Big Data Technologies, 2019

Storage Technologies for Big Data.
Encyclopedia of Big Data Technologies, 2019

Emerging Hardware Technologies.
Encyclopedia of Big Data Technologies, 2019

An Adaptive Efficiency-Fairness Meta-Scheduler for Data-Intensive Computing.
IEEE Trans. Serv. Comput., 2019

Fairness-Efficiency Allocation of CPU-GPU Heterogeneous Resources.
IEEE Trans. Serv. Comput., 2019

Privacy Regulation Aware Process Mapping in Geo-Distributed Cloud Data Centers.
IEEE Trans. Parallel Distributed Syst., 2019

Exploiting GPUs for Efficient Gradient Boosting Decision Tree Training.
IEEE Trans. Parallel Distributed Syst., 2019

CGraph: A Distributed Storage and Processing System for Concurrent Iterative Graph Analysis Jobs.
ACM Trans. Storage, 2019

Efficient Multi-Class Probabilistic SVMs on GPUs.
IEEE Trans. Knowl. Data Eng., 2019

Towards Declarative and Data-Centric Virtual Machine Image Management in IaaS Clouds.
IEEE Trans. Cloud Comput., 2019

Guest Editors' Introduction: Special Issue on Big Data Systems on Emerging Architectures.
IEEE Trans. Big Data, 2019

Supporting Superpages and Lightweight Page Migration in Hybrid Memory Systems.
ACM Trans. Archit. Code Optim., 2019

Hardware-Conscious Stream Processing: A Survey.
SIGMOD Rec., 2019

A Survey on Graph Processing Accelerators: Challenges and Opportunities.
J. Comput. Sci. Technol., 2019

A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection.
CoRR, 2019

Scaling Stream Processing with Transactional State Management on Multicores.
CoRR, 2019

Efficient Memory Management for GPU-based Deep Learning Systems.
CoRR, 2019

BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures.
Proceedings of the 2019 International Conference on Management of Data, 2019

GraphM: an efficient storage system for high throughput of concurrent graph processing.
Proceedings of the International Conference for High Performance Computing, 2019

Incorporating Probabilistic Optimizations for Resource Provisioning of Data Processing Workflows.
Proceedings of the 48th International Conference on Parallel Processing, 2019

OBFS: OpenCL Based BFS Optimizations on Software Programmable FPGAs.
Proceedings of the International Conference on Field-Programmable Technology, 2019

Aucher: Multi-modal Queries on Live Audio Streams in Real-Time.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

On-The-Fly Parallel Data Shuffling for Graph Processing on OpenCL-Based FPGAs.
Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

Deploying Hash Tables on Die-Stacked High Bandwidth Memory.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019

TraV: An Interactive Exploration System for Massive Trajectory Data.
Proceedings of the Fifth IEEE International Conference on Multimedia Big Data, 2019

DiGraph: An Efficient Path-based Iterative Directed Graph Processing System on Multiple GPUs.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018
Fair Resource Allocation for Data-Intensive Computing in the Cloud.
IEEE Trans. Serv. Comput., 2018

Efficient Disk-Based Directed Graph Processing: A Strongly Connected Component Approach.
IEEE Trans. Parallel Distributed Syst., 2018

Scalable GPU Virtualization with Dynamic Sharing of Graphics Memory Space.
IEEE Trans. Parallel Distributed Syst., 2018

Long-Term Multi-Resource Fairness for Pay-as-you Use Computing Systems.
IEEE Trans. Parallel Distributed Syst., 2018

Frog: Asynchronous Graph Processing on GPU with Hybrid Coloring Model.
IEEE Trans. Knowl. Data Eng., 2018

Towards Efficient Resource Allocation for Heterogeneous Workloads in IaaS Clouds.
IEEE Trans. Cloud Comput., 2018

JouleMR: Towards Cost-Effective and Green-Aware Data Processing Frameworks.
IEEE Trans. Big Data, 2018

Layer-Centric Memory Reuse and Data Migration for Extreme-Scale Deep Learning on Many-Core Architectures.
ACM Trans. Archit. Code Optim., 2018

Many-core needs fine-grained scheduling: A case study of query processing on Intel Xeon Phi processors.
J. Parallel Distributed Comput., 2018

ThunderSVM: A Fast SVM Library on GPUs and CPUs.
J. Mach. Learn. Res., 2018

Database Architectures for Modern Hardware (Dagstuhl Seminar 18251).
Dagstuhl Reports, 2018

A Survey on Spark Ecosystem for Big Data Processing.
CoRR, 2018

A class of ADMM-based algorithms for three-block separable convex programming.
Comput. Optim. Appl., 2018

gMig: Efficient GPU Live Migration Optimized by Software Dirty Page for Full Virtualization.
Proceedings of the 14th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2018

CGraph: A Correlations-aware Approach for Efficient Concurrent Iterative Graph Processing.
Proceedings of the 2018 USENIX Annual Technical Conference, 2018

vSensor: leveraging fixed-workload snippets of programs for performance variance detection.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

G-NET: Effective GPU Sharing in NFV Systems.
Proceedings of the 15th USENIX Symposium on Networked Systems Design and Implementation, 2018

Efficient Gradient Boosted Decision Tree Training on GPUs.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Energy-Efficient Speculative Execution using Advanced Reservation for Heterogeneous Clusters.
Proceedings of the 47th International Conference on Parallel Processing, 2018

GLP4NN: A Convergence-invariant and Network-agnostic Light-weight Parallelization Framework for Deep Neural Networks on Modern GPUs.
Proceedings of the 47th International Conference on Parallel Processing, 2018

Query Processing on OpenCL-Based FPGAs: Challenges and Opportunities.
Proceedings of the 24th IEEE International Conference on Parallel and Distributed Systems, 2018

RTSI: An Index Structure for Multi-Modal Real-Time Search on Live Audio Streaming Services.
Proceedings of the 34th IEEE International Conference on Data Engineering, 2018

Hebe: An Order-Oblivious and High-Performance Execution Scheme for Conjunctive Predicates.
Proceedings of the 34th IEEE International Conference on Data Engineering, 2018

FCN-engine: accelerating deconvolutional layers in classic CNN processors.
Proceedings of the International Conference on Computer-Aided Design, 2018

Efficient Support Vector Machine Training Algorithm on GPUs.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

An efficient graph accelerator with parallel data conflict management.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

Towards concurrency race debugging: an integrated approach for constraint solving and dynamic slicing.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2017
A distributed in-memory key-value store system on heterogeneous CPU-GPU cluster.
VLDB J., 2017

Multikernel Data Partitioning With Channel on OpenCL-Based FPGAs.
IEEE Trans. Very Large Scale Integr. Syst., 2017

A Variation-Aware Adaptive Fuzzy Control System for Thermal Management of Microprocessors.
IEEE Trans. Very Large Scale Integr. Syst., 2017

A Declarative Optimization Engine for Resource Provisioning of Scientific Workflows in Geo-Distributed Clouds.
IEEE Trans. Parallel Distributed Syst., 2017

Understanding Co-Running Behaviors on Integrated CPU/GPU Architectures.
IEEE Trans. Parallel Distributed Syst., 2017

Building an Efficient Put-Intensive Key-Value Store with Skip-Tree.
IEEE Trans. Parallel Distributed Syst., 2017

Analysis of Minimum Interaction Time for Continuous Distributed Interactive Computing.
IEEE Trans. Parallel Distributed Syst., 2017

QoS-Aware Resource Allocation for Video Transcoding in Clouds.
IEEE Trans. Circuits Syst. Video Technol., 2017

A Hybrid Logic Block Architecture in FPGA for Holistic Efficiency.
IEEE Trans. Circuits Syst. II Express Briefs, 2017

Network Performance Aware Optimizations on IaaS Clouds.
IEEE Trans. Computers, 2017

Accelerating Dynamic Graph Analytics on GPUs.
Proc. VLDB Endow., 2017

Convergence Rate Analysis for the Alternating Direction Method of Multipliers with a Substitution Procedure for Separable Convex Programming.
Math. Oper. Res., 2017

On the Iteration Complexity of Some Projection Methods for Monotone Linear Variational Inequalities.
J. Optim. Theory Appl., 2017

An Algorithmic Framework of Generalized Primal-Dual Hybrid Gradient Methods for Saddle Point Problems.
J. Math. Imaging Vis., 2017

Technical Report: Accelerating Dynamic Graph Analytics on GPUs.
CoRR, 2017

Efficient process mapping in geo-distributed cloud data centers.
Proceedings of the International Conference for High Performance Computing, 2017

Hardware/software cooperative caching for hybrid DRAM/NVM memory architectures.
Proceedings of the International Conference on Supercomputing, 2017

Multi-objective Optimizations in Geo-Distributed Data Analytics Systems.
Proceedings of the 23rd IEEE International Conference on Parallel and Distributed Systems, 2017

Multi-Query Optimization for Complex Event Processing in SAP ESP.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

DIDO: Dynamic Pipelines for In-Memory Key-Value Stores on Coupled CPU-GPU Architectures.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

Revisiting the Design of Data Stream Processing Systems on Multi-Core Processors.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

AdaStorm: Resource Efficient Storm with Adaptive Configuration.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

Data Management Systems on Future Hardware: Challenges and Opportunities.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

On Achieving Efficient Data Transfer for Graph Processing in Geo-Distributed Datacenters.
Proceedings of the 37th IEEE International Conference on Distributed Computing Systems, 2017

COMBA: A comprehensive model-based analysis framework for high level synthesis of real applications.
Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017

A novel two-stage modular multiplier based on racetrack memory for asymmetric cryptography.
Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017

Dynamic Partitioning for Library based Placement on Heterogeneous FPGAs (Abstract Only).
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

Dynamic Module Partitioning for Library Based Placement on Heterogeneous FPGAs.
Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017

A Study of Main-Memory Hash Joins on Many-core Processor: A Case with Intel Knights Landing Architecture.
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

FinePar: irregularity-aware fine-grained workload partitioning on integrated architectures.
Proceedings of the 2017 International Symposium on Code Generation and Optimization, 2017

2016
Decentralized Thermal-Aware Task Scheduling for Large-Scale Many-Core Systems.
IEEE Trans. Very Large Scale Integr. Syst., 2016

Dynamic Job Ordering and Slot Configurations for MapReduce Workloads.
IEEE Trans. Serv. Comput., 2016

Melia: A MapReduce Framework on OpenCL-Based FPGAs.
IEEE Trans. Parallel Distributed Syst., 2016

F2C: Enabling Fair and Fine-Grained Resource Sharing in Multi-Tenant IaaS Clouds.
IEEE Trans. Parallel Distributed Syst., 2016

A Performance Debugging Framework for Unnecessary Lock Contentions with Record/Replay Techniques.
IEEE Trans. Parallel Distributed Syst., 2016

Library-Based Placement and Routing in FPGAs with Support of Partial Reconfiguration.
ACM Trans. Design Autom. Electr. Syst., 2016

Monetary Cost Optimizations for Hosting Workflow-as-a-Service in IaaS Clouds.
IEEE Trans. Cloud Comput., 2016

Rotated Logging Storage Architectures for Data Centers: Models and Optimizations.
IEEE Trans. Computers, 2016

NV-Tree: A Consistent and Workload-Adaptive Tree Structure for Non-Volatile Memory.
IEEE Trans. Computers, 2016

Rank-Aware Dynamic Migrations and Adaptive Demotions for DRAM Power Management.
IEEE Trans. Computers, 2016

Convergence Study on the Symmetric Version of ADMM with Larger Step Sizes.
SIAM J. Imaging Sci., 2016

The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent.
Math. Program., 2016

On the Proximal Jacobian Decomposition of ALM for Multiple-Block Separable Convex Minimization Problems and Its Relationship to ADMM.
J. Sci. Comput., 2016

Thermal-Aware Task Scheduling for 3D-Network-on-Chip: A Bottom to Top Scheme.
J. Circuits Syst. Comput., 2016

gScale: Scaling up GPU Virtualization with Dynamic Sharing of Graphics Memory Space.
Proceedings of the 2016 USENIX Annual Technical Conference, 2016

GPL: A GPU-based Pipelined Query Processing Engine.
Proceedings of the 2016 International Conference on Management of Data, 2016

Efficient Query Processing on Many-core Architectures: A Case Study with Intel Xeon Phi Processor.
Proceedings of the 2016 International Conference on Management of Data, 2016

A Study of Sorting Algorithms on Approximate Memory.
Proceedings of the 2016 International Conference on Management of Data, 2016

Elastic multi-resource fairness: balancing fairness and efficiency in coupled CPU-GPU architectures.
Proceedings of the International Conference for High Performance Computing, 2016

VColor: A practical vertex-cut based approach for coloring large graphs.
Proceedings of the 32nd IEEE International Conference on Data Engineering, 2016

Not All Joules are Equal: Towards Energy-Efficient and Green-Aware Data Processing Frameworks.
Proceedings of the 2016 IEEE International Conference on Cloud Engineering, 2016

A Study of Big Data Computing Platforms: Fairness and Energy Consumption.
Proceedings of the 2016 IEEE International Conference on Cloud Engineering Workshop, 2016

A performance analysis framework for optimizing OpenCL applications on FPGAs.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Modular Placement for Interposer based Multi-FPGA Systems.
Proceedings of the 26th edition on Great Lakes Symposium on VLSI, 2016

Relational query processing on OpenCL-based FPGAs.
Proceedings of the 26th International Conference on Field Programmable Logic and Applications, 2016

Accelerating Database Query Processing on OpenCL-based FPGAs (Abstract Only).
Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

A discrete thermal controller for chip-multiprocessors.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

A racetrack memory based in-memory Booth multiplier for cryptography application.
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016

2015
A Combined SDC-SDF Architecture for Normal I/O Pipelined Radix-2 FFT.
IEEE Trans. Very Large Scale Integr. Syst., 2015

MrPhi: An Optimized MapReduce Framework on Intel Xeon Phi Coprocessors.
IEEE Trans. Parallel Distributed Syst., 2015

Hotplug or Ballooning: A Comparative Study on Dynamic Memory Management Techniques for Virtual Machines.
IEEE Trans. Parallel Distributed Syst., 2015

VMbuddies: Coordinating Live Migration of Multi-Tier Applications in Cloud Environments.
IEEE Trans. Parallel Distributed Syst., 2015

Willow: Saving Data Center Network Energy for Network-Limited Flows.
IEEE Trans. Parallel Distributed Syst., 2015

Improving Update-Intensive Workloads on Flash Disks through Exploiting Multi-Chip Parallelism.
IEEE Trans. Parallel Distributed Syst., 2015

Network Performance Aware MPI Collective Communication Operations in the Cloud.
IEEE Trans. Parallel Distributed Syst., 2015

Sensor Placement and Measurement of Wind for Water Quality Studies in Urban Reservoirs.
ACM Trans. Sens. Networks, 2015

PCMLogging: Optimizing Transaction Logging and Recovery Performance with PCM.
IEEE Trans. Knowl. Data Eng., 2015

Guest Editors' Introduction: Special Issue on Economics and Market Mechanisms for Cloud Computing.
IEEE Trans. Cloud Comput., 2015

Synergy of Dynamic Frequency Scaling and Demotion on DRAM Power Management: Models and Optimizations.
IEEE Trans. Computers, 2015

On Full Jacobian Decomposition of the Augmented Lagrangian Method for Separable Convex Programming.
SIAM J. Optim., 2015

Improving Main Memory Hash Joins on Intel Xeon Phi Processors: An Experimental Approach.
Proc. VLDB Endow., 2015

On non-ergodic convergence rate of Douglas-Rachford alternating direction method of multipliers.
Numerische Mathematik, 2015

Generalized alternating direction method of multipliers: new theoretical insights and applications.
Math. Program. Comput., 2015

On the convergence rate of Douglas-Rachford operator splitting method.
Math. Program., 2015

Monetary cost optimizations for MPI-based HPC applications on Amazon clouds: checkpoints and replicated execution.
Proceedings of the International Conference for High Performance Computing, 2015

Optimization of asynchronous graph processing on GPU with hybrid coloring model.
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

Hierarchical Library Based Power Estimator for Versatile FPGAs.
Proceedings of the IEEE 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2015

To Co-run, or Not to Co-run: A Performance Study on Integrated Architectures.
Proceedings of the 23rd IEEE International Symposium on Modeling, 2015

Real-Time In-Memory Checkpointing for Future Hybrid Memory Systems.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

A Declarative Optimization Engine for Resource Provisioning of Scientific Workflows in IaaS Clouds.
Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, 2015

A study of data partitioning on OpenCL-based FPGAs.
Proceedings of the 25th International Conference on Field Programmable Logic and Applications, 2015

Improving Data Partitioning Performance on OpenCL-Based FPGAs.
Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

NV-Tree: Reducing Consistency Cost for NVM-based Single Level Systems.
Proceedings of the 13th USENIX Conference on File and Storage Technologies, 2015

Fast Subgraph Matching on Large Graphs using Graphics Processors.
Proceedings of the Database Systems for Advanced Applications, 2015

Energy-Efficient Query Processing on Embedded CPU-GPU Architectures.
Proceedings of the 11th International Workshop on Data Management on New Hardware, 2015

Gemini: An Adaptive Performance-Fairness Scheduler for Data-Intensive Cluster Computing.
Proceedings of the 7th IEEE International Conference on Cloud Computing Technology and Science, 2015

On performance debugging of unnecessary lock contentions on multicore processors: a replay-based approach.
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

2014
Medusa: Simplified Graph Processing on GPUs.
IEEE Trans. Parallel Distributed Syst., 2014

Kernelet: High-Throughput GPU Kernel Executions with Dynamic Slicing and Scheduling.
IEEE Trans. Parallel Distributed Syst., 2014

Transformation-Based Monetary Cost Optimizations for Workflows in the Cloud.
IEEE Trans. Cloud Comput., 2014

DynamicMR: A Dynamic Slot Allocation Optimization Framework for MapReduce Clusters.
IEEE Trans. Cloud Comput., 2014

FD-Buffer: A Cost-Based Adaptive Buffer Replacement Algorithm for Flash Memory Devices.
IEEE Trans. Computers, 2014

Medusa: A Parallel Graph Processing System on Graphics Processors.
SIGMOD Rec., 2014

A Strictly Contractive Peaceman-Rachford Splitting Method for Convex Programming.
SIAM J. Optim., 2014

On the Convergence of Primal-Dual Hybrid Gradient Algorithm.
SIAM J. Imaging Sci., 2014

In-Cache Query Co-Processing on Coupled CPU-GPU Architectures.
Proc. VLDB Endow., 2014

When Data Management Systems Meet Approximate Hardware: Challenges and Opportunities.
Proc. VLDB Endow., 2014

Inexact Alternating-Direction-Based Contraction Methods for Separable Linearly Constrained Convex Optimization.
J. Optim. Theory Appl., 2014

A Taxonomy and Survey on eScience as a Service in the Cloud.
CoRR, 2014

A Survey of Resource Management in Multi-Tier Web Applications.
IEEE Commun. Surv. Tutorials, 2014

Customized proximal point algorithms for linearly constrained convex minimization and saddle-point problems: a unified approach.
Comput. Optim. Appl., 2014

On the O(1/t) convergence rate of the projection and contraction methods for variational inequalities with Lipschitz continuous monotone operators.
Comput. Optim. Appl., 2014

Demo Abstract: Wind measurements for water quality studies in urban reservoirs.
Proceedings of the Eleventh Annual IEEE International Conference on Sensing, 2014

Reciprocal Resource Fairness: Towards Cooperative Multiple-Resource Fair Sharing in IaaS Clouds.
Proceedings of the International Conference for High Performance Computing, 2014

Finding Constant from Change: Revisiting Network Performance Aware Optimizations on IaaS Clouds.
Proceedings of the International Conference for High Performance Computing, 2014

Optimal sensor placement and measurement of wind for water quality studies in urban reservoirs.
Proceedings of the IPSN'14, 2014

Pipelined Compaction for the LSM-Tree.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Long-term resource fairness: towards economic fairness on pay-as-you-use computing systems.
Proceedings of the 2014 International Conference on Supercomputing, 2014

Towards multi-resource physical machine provisioning for IaaS clouds.
Proceedings of the IEEE International Conference on Communications, 2014

A novel authenticated multi-party key agreement for private cloud.
Proceedings of the IEEE International Conference on Communications, 2014

Towards automatic partial reconfiguration in FPGAs.
Proceedings of the 2014 International Conference on Field-Programmable Technology, 2014

Simplified Resource Provisioning for Workflows in IaaS Clouds.
Proceedings of the IEEE 6th International Conference on Cloud Computing Technology and Science, 2014

Towards Economic Fairness for Big Data Processing in Pay-as-You-Go Cloud Computing.
Proceedings of the IEEE 6th International Conference on Cloud Computing Technology and Science, 2014

GPU-Accelerated Cloud Computing for Data-Intensive Applications.
Proceedings of the Cloud Computing for Data-Intensive Applications, 2014

Network Performance Aware Graph Partitioning for Large Graph Processing Systems in the Cloud.
Proceedings of the Large Scale and Big Data - Processing and Management, 2014

2013
Parallel Graph Processing on Graphics Processors Made Easy.
Proc. VLDB Endow., 2013

OmniDB: Towards Portable and Efficient Query Processing on Parallel CPU/GPU Architectures.
Proc. VLDB Endow., 2013

Revisiting Co-Processing for Hash Joins on the Coupled CPU-GPU Architecture.
Proc. VLDB Endow., 2013

Handling partitioning skew in MapReduce using LEEN.
Peer-to-Peer Netw. Appl., 2013

Forward-backward-based descent methods for composite variational inequalities.
Optim. Methods Softw., 2013

Probabilistic Scheduling of Scientific Workflows in Dynamic Cloud Environments.
CoRR, 2013

A customized proximal point algorithm for convex minimization with linear constraints.
Comput. Optim. Appl., 2013

Simulation studies of viral advertisement diffusion on multi-GPU.
Proceedings of the Winter Simulations Conference: Simulation Making Decisions in a Complex World, 2013

A Framework for Analyzing Monetary Cost of Database Systems in the Cloud.
Proceedings of the Web-Age Information Management - 14th International Conference, 2013

Brief announcement: on minimum interaction time for continuous distributed interactive computing.
Proceedings of the ACM Symposium on Principles of Distributed Computing, 2013

Spectral Decomposition for Optimal Graph Index Prediction.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2013

MROrder: Flexible Job Ordering Optimization for Online MapReduce Workloads.
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

Simulation of Information Propagation over Complex Networks: Performance Studies on Multi-GPU.
Proceedings of the 17th IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications, 2013

Dynamic slot allocation technique for MapReduce clusters.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

Towards GPU-Accelerated Large-Scale Graph Processing in the Cloud.
Proceedings of the IEEE 5th International Conference on Cloud Computing Technology and Science, 2013

Green Databases Through Integration of Renewable Energy.
Proceedings of the Sixth Biennial Conference on Innovative Data Systems Research, 2013

Optimizing the MapReduce framework on Intel Xeon Phi coprocessor.
Proceedings of the 2013 IEEE International Conference on Big Data (IEEE BigData 2013), 2013

2012
Flag Commit: Supporting Efficient Transaction Recovery in Flash-Based DBMSs.
IEEE Trans. Knowl. Data Eng., 2012

On the O(1/n) Convergence Rate of the Douglas-Rachford Alternating Direction Method.
SIAM J. Numer. Anal., 2012

Alternating Direction Method with Gaussian Back Substitution for Separable Convex Programming.
SIAM J. Optim., 2012

Convergence Analysis of Primal-Dual Algorithms for a Saddle-Point Problem: From Contraction Perspective.
SIAM J. Imaging Sci., 2012

HPC Simulations of Information Propagation Over Social Networks.
Proceedings of the International Conference on Computational Science, 2012

An Accelerated Inexact Proximal Point Algorithm for Convex Minimization.
J. Optim. Theory Appl., 2012

Proximal-like contraction methods for monotone variational inequalities in a unified framework II: general methods and numerical experiments.
Comput. Optim. Appl., 2012

Proximal-like contraction methods for monotone variational inequalities in a unified framework I: Effective quadruplet and primary methods.
Comput. Optim. Appl., 2012

RAMZzz: rank-aware DRAM power management with dynamic migrations and demotions.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

An overview of Medusa: simplified graph processing on GPUs.
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

An overview of CMPI: network performance aware MPI in the cloud.
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

Speedup for Multi-Level Parallel Computing.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

GPGPU for Real-Time Data Analytics.
Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

Green-aware workload scheduling in geographically distributed data centers.
Proceedings of the 4th IEEE International Conference on Cloud Computing Technology and Science Proceedings, 2012

Improving large graph processing on partitioned graphs in the cloud.
Proceedings of the ACM Symposium on Cloud Computing, SOCC '12, 2012

A Map-Reduce Based Framework for Heterogeneous Processing Element Cluster Environments.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

Maestro: Replica-Aware Map Scheduling for MapReduce.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

2011
Mars: Accelerating MapReduce with Graphics Processors.
IEEE Trans. Parallel Distributed Syst., 2011

Solving Large-Scale Least Squares Semidefinite Programming by Alternating Direction Methods.
SIAM J. Matrix Anal. Appl., 2011

High-throughput transaction executions on graphics processors.
Proc. VLDB Endow., 2011

GPU-Assisted Buffer Management.
Proceedings of the International Conference on Computational Science, 2011

GViewer: GPU-Accelerated Graph Visualization and Mining.
Proceedings of the Social Informatics - Third International Conference, SocInfo 2011, 2011

Operation-aware buffer management in flash-based systems.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2011

Adaptive Disk I/O Scheduling for MapReduce in Virtualized Environment.
Proceedings of the International Conference on Parallel Processing, 2011

PCMLogging: reducing transaction logging overhead with PCM.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Towards Pay-As-You-Consume Cloud Computing.
Proceedings of the IEEE International Conference on Services Computing, 2011

2010
Tree Indexing on Solid State Drives.
Proc. VLDB Endow., 2010

Database Compression on Graphics Processors.
Proc. VLDB Endow., 2010

Steplengths in the extragradient type methods.
J. Comput. Appl. Math., 2010

Solving a class of constrained 'black-box' inverse variational inequalities.
Eur. J. Oper. Res., 2010

Large graph processing in the cloud.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2010

Distributed Systems Meet Economics: Pricing in the Cloud.
Proceedings of the 2nd USENIX Workshop on Hot Topics in Cloud Computing, 2010

Supporting extended precision on graphics processors.
Proceedings of the Sixth International Workshop on Data Management on New Hardware, 2010

LEEN: Locality/Fairness-Aware Key Partitioning for MapReduce in the Cloud.
Proceedings of the Cloud Computing, Second International Conference, 2010

Comet: batched stream processing for data intensive distributed computing.
Proceedings of the 1st ACM Symposium on Cloud Computing, 2010

FD-buffer: a buffer manager for databases on flash disks.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

2009
Relational query coprocessing on graphics processors.
ACM Trans. Database Syst., 2009

Self-adaptive projection method for co-coercive variational inequalities.
Eur. J. Oper. Res., 2009

Parallel splitting augmented Lagrangian methods for monotone structured variational inequalities.
Comput. Optim. Appl., 2009

Stack-based parallel recursion on graphics processors.
Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2009

Tree Indexing on Flash Disks.
Proceedings of the 25th International Conference on Data Engineering, 2009

Wave Computing in the Cloud.
Proceedings of HotOS'09: 12th Workshop on Hot Topics in Operating Systems, 2009

A Uniform Framework for Ad-Hoc Indexes to Answer Reachability Queries on Large Graphs.
Proceedings of the Database Systems for Advanced Applications, 2009

Frequent itemset mining on graphics processors.
Proceedings of the Fifth International Workshop on Data Management on New Hardware, 2009

2008
Cache-oblivious databases: Limitations and opportunities.
ACM Trans. Database Syst., 2008

Relational joins on graphics processors.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

Mars: a MapReduce framework on graphics processors.
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008

2007
Adaptive Index Utilization in Memory-Resident Structural Joins.
IEEE Trans. Knowl. Data Eng., 2007

An inexact logarithmic-quadratic proximal augmented Lagrangian method for a class of constrained variational inequalities.
Math. Methods Oper. Res., 2007

EaseDB: a cache-oblivious in-memory query processor.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

GPUQP: query co-processing using graphics processors.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

Efficient gather and scatter operations on graphics processors.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

In-memory grid files on graphics processors.
Proceedings of the Workshop on Data Management on New Hardware, 2007

A general framework for improving query processing performance on multi-level memory hierarchies.
Proceedings of the Workshop on Data Management on New Hardware, 2007

Cache-Oblivious Query Processing.
Proceedings of the Third Biennial Conference on Innovative Data Systems Research, 2007

2006
Cache-Conscious Automata for XML Filtering.
IEEE Trans. Knowl. Data Eng., 2006

A Logarithmic-Quadratic Proximal Prediction-Correction Method for Structured Monotone Variational Inequalities.
Comput. Optim. Appl., 2006

A Quantitative Summary of XML Structures.
Proceedings of the Conceptual Modeling, 2006

Cache-oblivious nested-loop joins.
Proceedings of the 2006 ACM CIKM International Conference on Information and Knowledge Management, 2006

2005
A Relaxed Approximate Proximal Point Algorithm.
Ann. Oper. Res., 2005

2004
A modified augmented Lagrangian method for a class of monotone variational inequalities.
Eur. J. Oper. Res., 2004

Comparison of Two Kinds of Prediction-Correction Methods for Monotone Variational Inequalities.
Comput. Optim. Appl., 2004

The HKUST Frog Pond - A Case Study of Sensory Data Analysis.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2004

Accurate Emulation of Wireless Sensor Networks.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2004

MEADOWS: modeling, emulation, and analysis of data of wireless sensor networks.
Proceedings of the 1st Workshop on Data Management for Sensor Networks, 2004

2003
Self-adaptive operator splitting methods for monotone variational inequalities.
Numerische Mathematik, 2003

2002
A new inexact alternating directions method for monotone variational inequalities.
Math. Program., 2002

2000
A neural network model for monotone linear asymmetric variational inequalities.
IEEE Trans. Neural Networks Learn. Syst., 2000

A modified alternating direction method for convex minimization problems.
Appl. Math. Lett., 2000

1999
Inexact implicit methods for monotone general variational inequalities.
Math. Program., 1999

1998
Some convergence properties of a method of multipliers for linearly constrained monotone variational inequalities.
Oper. Res. Lett., 1998

1994
A new method for a class of linear variational inequalities.
Math. Program., 1994

1986
Grosse nichtlineare Optimierungsprobleme mit "Box-Constraints" und Gleichungsrestriktionen.
PhD thesis, 1986

