Depei Qian

ORCID: 0000-0002-5382-1473

According to our database, Depei Qian authored at least 314 papers between 2000 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
AtRec: Accelerating Recommendation Model Training on CPUs.
IEEE Trans. Parallel Distributed Syst., June, 2024

Towards optimized tensor code generation for deep learning on sunway many-core processor.
Frontiers Comput. Sci., April, 2024

Adaptive Auto-Tuning Framework for Global Exploration of Stencil Optimization on GPUs.
IEEE Trans. Parallel Distributed Syst., January, 2024

ElasticBatch: A Learning-Augmented Elastic Scheduling System for Batch Inference on MIG.
IEEE Trans. Parallel Distributed Syst., 2024

FDLoRA: Personalized Federated Learning of Large Language Model via Dual LoRA Tuning.
CoRR, 2024

INSPIRIT: Optimizing Heterogeneous Task Scheduling through Adaptive Priority in Task-based Runtime Systems.
CoRR, 2024

Minions: Accelerating Large Language Model Inference with Adaptive and Collective Speculative Decoding.
CoRR, 2024

Building a domain-specific compiler for emerging processors with a reusable approach.
Sci. China Inf. Sci., 2024

Gloss: Guiding Large Language Models to Answer Questions from System Logs.
Proceedings of the IEEE International Conference on Software Analysis, 2024

Tetris: Accelerating Sparse Convolution by Exploiting Memory Reuse on GPU.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

Jigsaw: Accelerating SpMM with Vector Sparsity on Sparse Tensor Core.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

PRoof: A Comprehensive Hierarchical Profiling Framework for Deep Neural Networks with Roofline Analysis.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

2023
HAOTuner: A Hardware Adaptive Operator Auto-Tuner for Dynamic Shape Tensor Compilers.
IEEE Trans. Computers, November, 2023

CoFB: latency-constrained co-scheduling of flows and batches for deep learning inference service on the CPU-GPU system.
J. Supercomput., September, 2023

Adapting combined tiling to stencil optimizations on sunway processor.
CCF Trans. High Perform. Comput., September, 2023

Software approaches for resilience of high performance computing systems: a survey.
Frontiers Comput. Sci., August, 2023

Input-Aware Sparse Tensor Storage Format Selection for Optimizing MTTKRP.
Computer, August, 2023

LogEncoder: Log-Based Contrastive Representation Learning for Anomaly Detection.
IEEE Trans. Netw. Serv. Manag., June, 2023

swSpAMM: optimizing large-scale sparse approximate matrix multiplication on Sunway Taihulight.
Frontiers Comput. Sci., 2023

LogGPT: Exploring ChatGPT for Log-Based Anomaly Detection.
CoRR, 2023

TrivialSpy: Identifying Software Triviality via Fine-grained and Dataflow-based Value Profiling.
Proceedings of the International Conference for High Performance Computing, 2023

EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs.
Proceedings of the International Conference for High Performance Computing, 2023

Exploiting Input Tensor Dynamics in Activation Checkpointing for Efficient Training on GPU.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

BiRFIA: Selective Binary Rewriting for Function Interception on ARM.
Proceedings of the 37th International Conference on Supercomputing, 2023

Exploiting Subgraph Similarities for Efficient Auto-tuning of Tensor Programs.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

Accelerating Big Data Application by Eliminating Redundancy on Hadoop Cluster.
Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

Efficient Deep Molecular Dynamic Model Training on Heterogeneous System.
Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

gGMED: Towards GPU Accelerated Geometric Modeling Evaluation and Derivative Processes.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2023

LogGPT: Exploring ChatGPT for Log-Based Anomaly Detection.
Proceedings of the IEEE International Conference on High Performance Computing & Communications, 2023

Towards Optimized Hydrological Forecast Prediction of WRF-Hydro on GPU.
Proceedings of the IEEE International Conference on High Performance Computing & Communications, 2023

Large-Scale Parallelization and Optimization of Lattice QCD on Tianhe New Generation Supercomputer.
Proceedings of the IEEE International Conference on High Performance Computing & Communications, 2023

VClinic: A Portable and Efficient Framework for Fine-Grained Value Profilers.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
REVAL: Recommend Which Variables to Log With Pretrained Model and Graph Neural Network.
IEEE Trans. Netw. Serv. Manag., December, 2022

Efficient detection of silent data corruption in HPC applications with synchronization-free message verification.
J. Supercomput., 2022

Magas: matrix-based asynchronous graph analytics on shared memory systems.
J. Supercomput., 2022

Accelerating approximate matrix multiplication for near-sparse matrices on GPUs.
J. Supercomput., 2022

Input-Aware Sparse Tensor Storage Format Selection for Optimizing MTTKRP.
IEEE Trans. Computers, 2022

QoS-aware dynamic resource allocation with improved utilization and energy efficiency on GPU.
Parallel Comput., 2022

Accelerating the cryo-EM structure determination in RELION on GPU cluster.
Frontiers Comput. Sci., 2022

Mimose: An Input-Aware Checkpointing Planner for Efficient Training on GPU.
CoRR, 2022

EasyScale: Accuracy-consistent Elastic Training for Deep Learning.
CoRR, 2022

FamilySeer: Towards Optimized Tensor Codes by Exploiting Computation Subgraph Similarity.
CoRR, 2022

CoGNN: Efficient Scheduling for Concurrent GNN Training on GPUs.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

Adanomaly: Adaptive Anomaly Detection for System Logs with Adversarial Learning.
Proceedings of the 2022 IEEE/IFIP Network Operations and Management Symposium, 2022

PowerSpector: Towards Energy Efficiency with Calling-Context-Aware Profiling.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

StencilMART: Predicting Optimization Selection for Stencil Computations across GPUs.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Toward accelerated stencil computation by adapting tensor core unit on GPU.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

Vectorizing SpMV by Exploiting Dynamic Regular Patterns.
Proceedings of the 51st International Conference on Parallel Processing, 2022

Towards Optimized Streaming Tensor Completion on multiple GPUs.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

2021
The Deep Learning Compiler: A Comprehensive Survey.
IEEE Trans. Parallel Distributed Syst., 2021

ELS: Emulation system for debugging and tuning large-scale parallel programs on small clusters.
J. Supercomput., 2021

Towards efficient tile low-rank GEMM computation on sunway many-core processors.
J. Supercomput., 2021

swMR: A Framework for Accelerating MapReduce Applications on Sunway Taihulight.
IEEE Trans. Emerg. Top. Comput., 2021

Towards efficient canonical polyadic decomposition on sunway many-core processor.
Inf. Sci., 2021

User-level failure detection and auto-recovery of parallel programs in HPC systems.
Frontiers Comput. Sci., 2021

Accelerating Sparse Approximate Matrix Multiplication on GPUs.
CoRR, 2021

CompactNet: Platform-Aware Automatic Optimization for Convolutional Neural Networks.
Proceedings of the PMAM@PPoPP 2021: Proceedings of the Twelfth International Workshop on Programming Models and Applications for Multicores and Manycores, 2021

dgQuEST: Accelerating Large Scale Quantum Circuit Simulation through Hybrid CPU-GPU Memory Hierarchies.
Proceedings of the Network and Parallel Computing, 2021

An optimized tensor completion library for multiple GPUs.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors.
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

DRStencil: Exploiting Data Reuse within Low-order Stencil on GPU.
Proceedings of the 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, 2021

csTuner: Scalable Auto-tuning Framework for Complex Stencil Computation on GPUs.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

PriPro: Towards Effective Privacy Protection on Edge-Cloud System running DNN Inference.
Proceedings of the 21st IEEE/ACM International Symposium on Cluster, 2021

2020
Accelerating Sparse Cholesky Factorization on Sunway Manycore Architecture.
IEEE Trans. Parallel Distributed Syst., 2020

Massively Scaling Seismic Processing on Sunway TaihuLight Supercomputer.
IEEE Trans. Parallel Distributed Syst., 2020

Thread-Level Locking for SIMT Architectures.
IEEE Trans. Parallel Distributed Syst., 2020

Temperature-Aware DRAM Cache Management - Relaxing Thermal Constraints in 3-D Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

The Deep Learning Compiler: A Comprehensive Survey.
CoRR, 2020

Privacy for Rescue: A New Testimony Why Privacy is Vulnerable In Deep Models.
CoRR, 2020

swGBDT: Efficient Gradient Boosted Decision Tree on Sunway Many-Core Processor.
Proceedings of the Supercomputing Frontiers - 6th Asian Conference, 2020

ZeroSpy: exploring software inefficiency with redundant zeros.
Proceedings of the International Conference for High Performance Computing, 2020

SpTFS: sparse tensor format selection for MTTKRP via deep learning.
Proceedings of the International Conference for High Performance Computing, 2020

SympleGraph: distributed graph processing with precise loop-carried dependency guarantee.
Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2020

Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

Towards GPU Acceleration of Phonon Computation with ShengBTE.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2020

swRodinia: A Benchmark Suite for Exploiting Architecture Properties of Sunway Processor.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2020

2019
Distributed Graph Processing System and Processing-in-memory Architecture with Precise Loop-carried Dependency Guarantee.
ACM Trans. Comput. Syst., 2019

Accelerating in-memory transaction processing using general purpose graphics processing units.
Future Gener. Comput. Syst., 2019

A novel index system describing program runtime characteristics for workload consolidation.
Frontiers Comput. Sci., 2019

High Performance Computing Development in China: A Brief Review and Perspectives.
Comput. Sci. Eng., 2019

Intelligent-Unrolling: Exploiting Regular Patterns in Irregular Applications.
CoRR, 2019

Massively Scaling Seismic Processing on Sunway TaihuLight Supercomputer.
CoRR, 2019

CompactNet: Platform-Aware Automatic Optimization for Convolutional Neural Networks.
CoRR, 2019

swTVM: Exploring the Automated Compilation for Deep Learning on Sunway Architecture.
CoRR, 2019

swTensor: accelerating tensor decomposition on Sunway architecture.
CCF Trans. High Perform. Comput., 2019

FPowerTool: A Function-Level Power Profiling Tool.
IEEE Access, 2019

SunwayImg: A Parallel Image Processing Library for the Sunway Many-Core Processor.
IEEE Access, 2019

Performance Evaluation and Analysis of Linear Algebra Kernels in the Prototype Tianhe-3 Cluster.
Proceedings of the Supercomputing Frontiers - 5th Asian Conference, 2019

Modeling Power Consumption of The Code Execution Using Performance Counters Statistics.
Proceedings of the 20th International Conference on Parallel and Distributed Computing, 2019

Multiple Algorithms Against Multiple Hardware Architectures: Data-Driven Exploration on Deep Convolution Neural Network.
Proceedings of the Network and Parallel Computing, 2019

Improving the Parallelism of CESM on GPU.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2019

Structure Characteristic-Aware Pruning Strategy for Convolutional Neural Networks.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

Towards a General and Efficient Linked-List Hash Table on GPUs.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

FLONet: Fewer Labeling Cost Active Learning for Deep Neural Network.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

swCPD: Optimizing Canonical Polyadic Decomposition on Sunway Manycore Architecture.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

Anomaly Detection Models Based on Context-Aware Sequential Long Short-Term Memory Learning.
Proceedings of the 2019 IEEE Global Communications Conference, 2019

L-DAG: Enabling Loopy Workflow in Scientific Application with Automatic DAG Transformation.
Proceedings of the 2019 IEEE Intl Conf on Dependable, 2019

SMQoS: Improving Utilization and Energy Efficiency with QoS Awareness on GPUs.
Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019

Accelerating tile low-rank GEMM on sunway architecture: POSTER.
Proceedings of the 16th ACM International Conference on Computing Frontiers, 2019

LADet: A Light-weight and Adaptive Network for Multi-scale Object Detection.
Proceedings of The 11th Asian Conference on Machine Learning, 2019

2018
SMGuard: A Flexible and Fine-Grained Resource Management Framework for GPUs.
IEEE Trans. Parallel Distributed Syst., 2018

SRAM- and STT-RAM-based hybrid, shared last-level cache for on-chip CPU-GPU heterogeneous architectures.
J. Supercomput., 2018

T1000: Mitigating the memory footprint of convolution neural networks with decomposition and re-fusion.
Future Gener. Comput. Syst., 2018

Will supercomputers be super-data and super-AI machines?
Commun. ACM, 2018

A Lightweight and Flexible Tool for Distinguishing Between Hardware Malfunctions and Program Bugs in Debugging Large-Scale Programs.
IEEE Access, 2018

Sparsing Deep Neural Network Using Semi-Discrete Matrix Decomposition.
IEEE Access, 2018

HPC-SFI: System-Level Fault Injection for High Performance Computing Systems.
Proceedings of the Network and Parallel Computing, 2018

Estimating Software Energy Consumption with Machine Learning Approach by Software Performance Feature.
Proceedings of the IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, 2018

Re-Running Large-Scale Parallel Programs Using Two Nodes.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2018

Block-Checksum-Based Fault Tolerance for Matrix Multiplication on Large-Scale Parallel Systems.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

Mitigating I/O Impact of Checkpointing on Large Scale Parallel Systems.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

Multi-role SpTRSV on Sunway Many-Core Architecture.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

Outlier Detection for Distributed Services using Multi-Frequency Patterns.
Proceedings of the 14th International Conference on Network and Service Management, 2018

Performance Analysis and Optimization of Cyro-EM Structure Determination in RELION-2.
Proceedings of the Advanced Computer Architecture - 12th Conference, 2018

EffectFace: A Fast and Efficient Deep Neural Network Model for Face Recognition.
Proceedings of the Advanced Computer Architecture - 12th Conference, 2018

2017
Achieving Versatile and Simultaneous Cache Optimizations With Nonvolatile SRAM.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

ParaFlow: Fine-grained parallel SDN controller for large-scale networks.
J. Netw. Comput. Appl., 2017

Controller-proxy: Scaling network management for large-scale SDN networks.
Comput. Commun., 2017

Flow Stealer: lightweight load balancing by stealing flows in distributed SDN controllers.
Sci. China Inf. Sci., 2017

A survey of P2P content sharing in MANETs.
Comput. Electr. Eng., 2017

PSOM: Periodic Self-Organizing Maps for unsupervised anomaly detection in periodic time series.
Proceedings of the 25th IEEE/ACM International Symposium on Quality of Service, 2017

iDPL: A scalable and flexible inter-continental testbed for data placement research and experiment.
Proceedings of the 2017 IEEE Symposium on Computers and Communications, 2017

Arena: Adaptive real-time update anomaly prediction in cloud systems.
Proceedings of the 13th International Conference on Network and Service Management, 2017

PFSI.sw: A programming framework for sea ice model algorithms based on Sunway many-core processor.
Proceedings of the 28th IEEE International Conference on Application-specific Systems, 2017

Improving Performance for Geo-Distributed Data Process in Wide-Area.
Proceedings of the 2017 IEEE International Conference on Computer and Information Technology, 2017

2016
Managing Server Clusters on Renewable Energy Mix.
ACM Trans. Auton. Adapt. Syst., 2016

A cross-layer approach for partition detection at overlay layer for structured P2P in MANETs.
Peer-to-Peer Netw. Appl., 2016

BFT: a placement algorithm for non-rectangle task model in reconfigurable computing system.
IET Comput. Digit. Tech., 2016

Coordinating workload balancing and power switching in renewable energy powered data center.
Frontiers Comput. Sci., 2016

IBB: Improved K-Resource Aware Backfill Balanced Scheduling for HTCondor.
Proceedings of the Network and Parallel Computing, 2016

Using recurrent neural networks toward black-box system anomaly prediction.
Proceedings of the 24th IEEE/ACM International Symposium on Quality of Service, 2016

Scheduling Tasks with Mixed Timing Constraints in GPU-Powered Real-Time Systems.
Proceedings of the 2016 International Conference on Supercomputing, 2016

DScheduler: Dynamic Network Scheduling Method for MapReduce in Distributed Controllers.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

Parallel Image Processing on the Sunway Many-Core Processor.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

Performance Events Based Full System Estimation on Application Power Consumption.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

Restricted Boltzmann Machines and Deep Belief Networks on Sunway Cluster.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

China's HPC Development in the Next 5 Years.
Proceedings of the 23rd IEEE International Conference on High Performance Computing, 2016

Lock-based synchronization for GPU architectures.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

Efficient Power Allocation under Global Power Cap and Application-Level Power Budget.
Proceedings of the 7th International Conference on Cloud Computing and Big Data, 2016

2015
Robust Design Space Modeling.
ACM Trans. Design Autom. Electr. Syst., 2015

SEIP: System for Efficient Image Processing on Distributed Platform.
J. Comput. Sci. Technol., 2015

Reducing DRAM refreshing in an error correction manner.
Sci. China Inf. Sci., 2015

Improving multiprocessor performance with fine-grain coherence bypass.
Sci. China Inf. Sci., 2015

Leveraging Non-Volatile Storage to Achieve Versatile Cache Optimizations.
IEEE Comput. Archit. Lett., 2015

Merging of P2P Overlays Over Mobile Ad Hoc Network: Evaluation of Three Approaches.
Ad Hoc Sens. Wirel. Networks, 2015

A-DRM: Architecture-aware Distributed Resource Management of Virtualized Clusters.
Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2015

A methodology for root-cause analysis in component based systems.
Proceedings of the 23rd IEEE International Symposium on Quality of Service, 2015

Online Replacement of Distributed Controllers in Software Defined Networks.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

JellyFish: Online Performance Tuning with Adaptive Configuration and Elastic Container in Hadoop Yarn.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

Request Squeezer: Mitigating Tail Latency through Pruned Request Replication.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Adaptive Assignment for Quality-Aware Mobile Sensing Network with Strategic Users.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Revisit network anomaly ranking in datacenter network using re-ranking.
Proceedings of the 4th IEEE International Conference on Cloud Networking, 2015

Q-Detector: A Quorum-Based Byzantine Fault Detector.
Proceedings of the International Conference on Cloud Computing and Big Data, 2015

Data Analysis and Synchronization on Inter-Continent Data Placement Laboratory.
Proceedings of the International Conference on Cloud Computing and Big Data, 2015

2014
An Efficient and Scalable Routing for MANETs.
Wirel. Pers. Commun., 2014

Lightweight dynamic partitioning for last-level cache of multicore processor on real system.
J. Supercomput., 2014

Towards Automated Provisioning and Emergency Handling in Renewable Energy Powered Datacenters.
J. Comput. Sci. Technol., 2014

iMeter: An integrated VM power model based on performance profiling.
Future Gener. Comput. Syst., 2014

Scalable hierarchical scheduling for malleable parallel jobs on multiprocessor-based systems.
Comput. Syst. Sci. Eng., 2014

Software Transactional Memory for GPU Architectures.
IEEE Comput. Archit. Lett., 2014

Speedup Critical Stage of Machine Learning with Batch Scheduling in GPU.
Proceedings of the Network and Parallel Computing, 2014

Memory Centric Hardware Prefetching in Multi-core Processors.
Proceedings of the Trustworthy Computing and Services - International Conference, 2014

An Active Approach for Automatic Rule Discovery in Rule-Based Monitoring Systems.
Proceedings of the Trustworthy Computing and Services - International Conference, 2014

Pacifier: Record and replay for relaxed-consistency multiprocessors with distributed directory protocol.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

Paraio: A scalable network I/O framework for many-core systems.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Managing Green Datacenters Powered by Hybrid Renewable Energy Systems.
Proceedings of the 11th International Conference on Autonomic Computing, 2014

Remapping NUCA: Improving NUCA Cache's Power Efficiency.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

Lessons from Experimental Methodology of Cache Hierarchy Changes with the Memory Technology.
Proceedings of the 17th IEEE International Conference on Computational Science and Engineering, 2014

Dual Power: Integrating Renewable Energy into Green Datacenters without Grid Tie Inverter.
Proceedings of the 17th IEEE International Conference on Computational Science and Engineering, 2014

Software Transactional Memory for GPU Architectures.
Proceedings of the 12th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2014

2013
Optimizing the Joint Source/Network Coding for Video Streaming over Multi-hop Wireless Networks.
KSII Trans. Internet Inf. Syst., 2013

Partition-Based Hardware Transactional Memory for Many-Core Processors.
Proceedings of the Network and Parallel Computing - 10th IFIP International Conference, 2013

BulkCommit: scalable and fast commit of atomic blocks in a lazy multiprocessor environment.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

Chameleon: Adapting throughput server to time-varying green power budget using online learning.
Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013

Energy Efficiency Evaluation of Workload Execution on Intel Xeon Phi Coprocessor.
Proceedings of the Trustworthy Computing and Services, 2013

A Black-Box Approach for Detecting the Failure Traces.
Proceedings of the Trustworthy Computing and Services, 2013

Differentiating data collection for cloud environment monitoring.
Proceedings of the 2013 IFIP/IEEE International Symposium on Integrated Network Management (IM 2013), 2013

POIGEM: A Programming-Oriented Instruction Level GPU Energy Model for CUDA Program.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2013

M&C: A Software Solution to Reduce Errors Caused by Incoherent Caches on GPUs in Unstructured Graphic Algorithm.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2013

Interference-Aware Program Scheduling for Multicore Processors.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2013

Load Balancing in Heterogeneous MapReduce Environments.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

Rainbow: Efficient memory dependence recording with high replay parallelism for relaxed memory model.
Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013

A Framework for Earth System Model Application Monitoring.
Proceedings of the 16th IEEE International Conference on Computational Science and Engineering, 2013

Elastic Resource Allocation in the Cloud.
Proceedings of the 16th IEEE International Conference on Computational Science and Engineering, 2013

Research and Implementation of MapReduce Programming Oriented Graphical Modeling System.
Proceedings of the 16th IEEE International Conference on Computational Science and Engineering, 2013

Volition: scalable and precise sequential consistency violation detection.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

Ceaser: Exploring the Monitoring Rules in the Time Dimension.
Proceedings of the 2013 International Conference on Cloud and Service Computing, 2013

Pipeline-Based Parallel Framework for Mass File Processing.
Proceedings of the 2013 International Conference on Cloud and Service Computing, 2013

Research of MapReduce Oriented Graphical Programming.
Proceedings of the 2013 International Conference on Cloud and Service Computing, 2013

Empowering Designers to Estimate Function-Level Power for Developing Green Applications.
Proceedings of the 2013 International Conference on Cloud and Service Computing, 2013

2012
MANET adaptive structured P2P overlay.
Peer-to-Peer Netw. Appl., 2012

Online Anomaly Prediction for Real-Time Stream Processing.
IEICE Trans. Commun., 2012

Practical Distributed Location Service for Wireless Sensor Networks with Mobile Sinks.
IEICE Trans. Commun., 2012

Stable Adaptive Work-Stealing for Concurrent Many-Core Runtime Systems.
IEICE Trans. Inf. Syst., 2012

MapReduce Workload Modeling with Statistical Approach.
J. Grid Comput., 2012

Joint Source-Network Coding Optimization for Video Streaming over Wireless Multi-Hop Networks.
Proceedings of the 75th IEEE Vehicular Technology Conference, 2012

Measuring and Visualizing Thread Communications for Pthread Applications.
Proceedings of the 13th International Conference on Parallel and Distributed Computing, 2012

LPFSC: A Light Weight Parallel Framework for Super Computing.
Proceedings of the 13th International Conference on Parallel and Distributed Computing, 2012

Efficient Statistical Computing on Multicore and MultiGPU Systems.
Proceedings of the 15th International Conference on Network-Based Information Systems, 2012

Providing High Availability for Distributed Stream Processing Application with Replica Placement.
Proceedings of the 15th International Conference on Network-Based Information Systems, 2012

MOLTS: Mobile Object Localization and Tracking System Based on Wireless Sensor Networks.
Proceedings of the Seventh IEEE International Conference on Networking, 2012

Statistics-based Workload Modeling for MapReduce.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

UVMPM: A Unitary Approach for VM Power Metering Based on Performance Profiling.
Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

CPOP: Component Design and Parallelization towards POP Ocean Model Based on ESMF.
Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

Predictive Data and Energy Management under Budget.
Proceedings of the 2012 IEEE International Conference on Cluster Computing Workshops, 2012

ERMS: An Elastic Replication Management System for HDFS.
Proceedings of the 2012 IEEE International Conference on Cluster Computing Workshops, 2012

Network Coding-Based Rate Allocation and Bursty Loss Protection for Video Streaming over Wireless Multi-hop Networks.
Proceedings of the 12th IEEE International Conference on Computer and Information Technology, 2012

2011
An Efficient Unstructured P2P Overlay over MANET Using Underlying Proactive Routing.
Proceedings of the Seventh International Conference on Mobile Ad-hoc and Sensor Networks, 2011

Enhancing cooperation with multiple stage auctions in opportunistic routing for wireless mesh networks.
Proceedings of the 12th IFIP/IEEE International Symposium on Integrated Network Management, 2011

Stable Adaptive Work-Stealing for Concurrent Multi-core Runtime Systems.
Proceedings of the 13th IEEE International Conference on High Performance Computing & Communication, 2011

Energy Prediction for MapReduce Workloads.
Proceedings of the IEEE Ninth International Conference on Dependable, 2011

Deployment Oriented Role Destined Integrated Protocol for Event-driven Wireless Sensor Networks.
Proceedings of the 14th IEEE International Conference on Computational Science and Engineering, 2011

Operator placement with QoS constraints for distributed stream processing.
Proceedings of the 7th International Conference on Network and Service Management, 2011

NEPnet: A scalable monitoring system for anomaly detection of network service.
Proceedings of the 7th International Conference on Network and Service Management, 2011

CDebugger: A scalable parallel debugger with dynamic communication topology configuration.
Proceedings of the 2011 International Conference on Cloud and Service Computing, 2011

Virtual machine mapping policy based on load balancing in private cloud environment.
Proceedings of the 2011 International Conference on Cloud and Service Computing, 2011

2010
Congestion avoidance, detection and alleviation in wireless sensor networks.
J. Zhejiang Univ. Sci. C, 2010

An Efficient Overlay for Unstructured P2P File Sharing over MANET using Underlying Cluster-based Routing.
KSII Trans. Internet Inf. Syst., 2010

Throughput maximization with bargaining game in cognitive radio networks.
Proceedings of the 3rd IFIP Wireless Days Conference 2010, 2010

Energy-Efficient Coded Routing with Selective Transmission Power for Wireless Sensor Networks.
Proceedings of the 72nd IEEE Vehicular Technology Conference, 2010

A New Cross-Layer Unstructured P2P File Sharing Protocol over Mobile Ad Hoc Network.
Proceedings of the Advances in Computer Science and Information Technology, 2010

Malleable-Lab: A Tool for Evaluating Adaptive Online Schedulers on Malleable Jobs.
Proceedings of the 18th Euromicro Conference on Parallel, 2010

An efficient structured P2P overlay over MANET.
Proceedings of the Ninth ACM International Workshop on Data Engineering for Wireless and Mobile Access, 2010

Scalable Hierarchical Scheduling for Multiprocessor Systems Using Adaptive Feedback-Driven Policies.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2010

Cross-Layer Design to Merge Structured P2P Networks over MANET.
Proceedings of the 16th IEEE International Conference on Parallel and Distributed Systems, 2010

A Fair Thread-Aware Memory Scheduling Algorithm for Chip Multiprocessor.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2010

Accelerating Dock6's Amber Scoring with Graphic Processing Unit.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2010

A Novel Scheme for High Performance Finite-Difference Time-Domain (FDTD) Computations Based on GPU.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2010

IndexTree: An Efficient Tamper-Evidence Logging.
Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications, 2010

ORSP: An Efficient Resource Acquisition Policy for Peer-to-Peer Mesh Streaming Systems.
Proceedings of the GCC 2010, 2010

I/O Feature-based File Prefetching for Multi-Applications.
Proceedings of the GCC 2010, 2010

Video Streaming over Wireless Mesh Networks with Multi-Gateway Support.
Proceedings of the IEEE/IFIP 8th International Conference on Embedded and Ubiquitous Computing, 2010

AIM: An Auction Incentive Mechanism in Wireless Networks with Opportunistic Routing.
Proceedings of the 13th IEEE International Conference on Computational Science and Engineering, 2010

Efficient Transaction Nesting in Hardware Transactional Memory.
Proceedings of the Architecture of Computing Systems, 2010

2009
Design Web Services: Towards Service Reuse at the Design Level.
J. Comput., 2009

Link Availability Based Mobility-Aware Max-Min Multi-Hop Clustering (M⁴C) for Mobile Ad Hoc Networks.
IEICE Trans. Commun., 2009

Challenges and possible approaches: towards the petaflops computers.
Frontiers Comput. Sci. China, 2009

Re-exploring the Potential of Using Tree Structure in P2P Live Streaming Networks.
Proceedings of the NPC 2009, 2009

Optimizing Transmission in Multi-Flow Streaming Overlay Networks.
Proceedings of the NPC 2009, 2009

Data Currency in Replicated Distributed Storage System.
Proceedings of the International Conference on Networking, Architecture, and Storage, 2009

Context-Aware Routing for Peer-to-Peer Network on MANETs.
Proceedings of the International Conference on Networking, Architecture, and Storage, 2009

Reducing Communication Overhead in Threshold Monitoring with Arithmetic Aggregation.
Proceedings of the International Conference on Networking, Architecture, and Storage, 2009

Intra-flow Network Coding Based Multipath Routing Protocol for Event-Driven Wireless Sensor Networks.
Proceedings of the MSN 2009, 2009

RRDD: Receiver-oriented Robust Data Delivery in Mobile Sensor Networks.
Proceedings of the IEEE 6th International Conference on Mobile Adhoc and Sensor Systems, 2009

A Two-Phase Log-Based Fault Recovery Mechanism in Master/Worker Based Computing Environment.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2009

R-ECS: reliable elastic computing services for building virtual computing environment.
Proceedings of the 2nd International Conference on Interaction Sciences: Information Technology, 2009

Tuning Performance of P2P Mesh Streaming System Using a Network Evolution Approach.
Proceedings of the Scalable Information Systems, 4th International ICST Conference, 2009

A Heuristic Energy-aware Scheduling Algorithm for Heterogeneous Clusters.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

Embedded Processors in Heterogeneous Architectures for Web Servers.
Proceedings of the 2009 International Conference on Internet Computing, 2009

Employing Data Driven Random Membership Subset Algorithm for QoS-Aware Peer-to-Peer Streaming.
Proceedings of the Future Multimedia Networking, Second International Workshop, 2009

Link-Aware Geographic Routing in Wireless Sensor Networks.
Proceedings of the 12th IEEE International Conference on Computational Science and Engineering, 2009

Cesar-FD: An Effective Stateful Fault Detection Mechanism in Drug Discovery Grid.
Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2009

An Energy-Efficient, Application-Oriented Control Algorithm for MAC Protocols in WSN.
Proceedings of the Ad Hoc Networks, First International Conference, 2009

2008
Link Availability Prediction in Ad Hoc Networks.
Proceedings of the 14th International Conference on Parallel and Distributed Systems, 2008

An Adaptive Network Node Architecture for Evolutionary Networks.
Proceedings of the IEEE International Conference on Networking, Sensing and Control, 2008

A fast handoff scheme for MPLS-based Mobile IPv6 Network.
Proceedings of the IEEE International Conference on Networking, Sensing and Control, 2008

Auto-configuration of Shared Network-layer Address in Cluster-based Wireless Sensor Network.
Proceedings of the IEEE International Conference on Networking, Sensing and Control, 2008

A Component-Oriented Development Approach to E-Business Applications.
Proceedings of the 2008 IEEE International Conference on e-Business Engineering, 2008

Hardware Transactional Memory Supporting I/O Operations within Transactions.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008

EOMT: A Master-Slave Task Scheduling Strategy for Grid Environment.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008

Architecture Centric System Design for Supporting Reconfiguration of Service Oriented Systems.
Proceedings of the 11th IEEE High Assurance Systems Engineering Symposium, 2008

Reducing the Cluster Monitoring Workload by Identifying Application Characteristics.
Proceedings of the Seventh International Conference on Grid and Cooperative Computing, 2008

DDGrid: A Grid Computing Environment with Massive Concurrency and Fault-Tolerance Support.
Proceedings of the Seventh International Conference on Grid and Cooperative Computing, 2008

Whirlpool: Structuring Mesh-Based Protocol for QoS-Aware Peer-to-Peer Streaming.
Proceedings of the Seventh International Conference on Grid and Cooperative Computing, 2008

GSON: A Group Based Hierarchically Structured Overlay Network.
Proceedings of the 12th IEEE International Workshop on Future Trends of Distributed Computing Systems, 2008

An Energy Efficient Weight-Clustering Algorithm in Wireless Sensor Networks.
Proceedings of the Japan-China Joint Workshop on Frontier of Computer Science and Technology, 2008

Cross-Domain Middlewares Interoperability for Distributed Aircraft Design Optimization.
Proceedings of the Fourth International Conference on e-Science, 2008

Mobile e-Lab: A Mobile Personalized Virtual Research Computing Environment.
Proceedings of the Fourth International Conference on e-Science, 2008

An Architecture for Distributed Controllable Networks and Manageable Node Based on Network Processor.
Proceedings of the Progress in WWW Research and Development, 2008

An evolutionary node architecture and performance optimization.
Proceedings of the 6th ACS/IEEE International Conference on Computer Systems and Applications, 2008

2007
Adaptive Call Admission Control Based on Reward-Penalty Model in Wireless/Mobile Network.
J. Comput. Sci. Technol., 2007

Handover for Seamless Stream Media in Mobile IPv6 Network.
Proceedings of the Wired/Wireless Internet Communications, 5th International Conference, 2007

Spatial Map Data Share and Parallel Dissemination System Based on Distributed Network Services and Digital Watermark.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2007

MidCASE: A Service Oriented Middleware Enabling Context Awareness for Smart Environment.
Proceedings of the 2007 International Conference on Multimedia and Ubiquitous Engineering (MUE 2007), 2007

A Novel Distributed Wireless VoIP Server Based on SIP.
Proceedings of the 2007 International Conference on Multimedia and Ubiquitous Engineering (MUE 2007), 2007

Prediction Algorithms and Grid Based Architecture for Streaming System.
Proceedings of the 2007 International Conference on Multimedia Systems and Applications, 2007

An Approach of End-to-End DiffServ/MPLS QoS Context Transfer in HMIPv6 Net.
Proceedings of the International Symposium on Autonomous Decentralized Systems (ISADS 2007), 2007

An On-demand Address Allocation Scheme for Query based Sensor Networks.
Proceedings of the International Symposium on Autonomous Decentralized Systems (ISADS 2007), 2007

Service Forest: Enabling Dynamic Service Composition in Mobile Ad Hoc Networks.
Proceedings of the 2007 International Conference on Intelligent Pervasive Computing, 2007

A New Attributes-Priority Matching Watermarking Algorithm Satisfying Topological Conformance for Vector Map.
Proceedings of the 3rd International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2007), 2007

Building a Simple and Effective Text Categorization System using Relative Importance in Category.
Proceedings of the Third International Conference on Natural Computation, 2007

EDDS: An Efficient Data Delivery Scheme for Address-Free Wireless Sensor Networks.
Proceedings of the Sixth International Conference on Networking (ICN 2007), 2007

Research on Planning and Deployment Platform for Wireless Sensor Networks.
Proceedings of the Advances in Grid and Pervasive Computing, 2007

Architecture Driven Grid Application Development.
Proceedings of the Grid and Cooperative Computing, 2007

A QoS Oriented Network Service Architecture for Grid Applications.
Proceedings of the Future Generation Communication and Networking, 2007

An Architecture of Policy-Based Application-aware Network QoS Management for Large-scale Heterogeneous Networks.
Proceedings of the Future Generation Communication and Networking, 2007

Agent-Based MADM Approach to the Dynamic Web Service Selection.
Proceedings of The 2nd IEEE Asia-Pacific Services Computing Conference, 2007

Experiences with the EUChinaGrid Project - Implementing Interoperation between gLite and GOS.
Proceedings of The 2nd IEEE Asia-Pacific Services Computing Conference, 2007

Context-Aware Web Service Selection Based on Multi-aspects Regulating.
Proceedings of The 2nd IEEE Asia-Pacific Services Computing Conference, 2007

Semantics Based Enterprise Modeling for Automated Service Discovery and Service Composition.
Proceedings of The 2nd IEEE Asia-Pacific Services Computing Conference, 2007

Study on Embedded Vehicle Dynamic Location Navigation Supported by Network and Route Availability Model.
Proceedings of the Advanced Parallel Processing Technologies, 7th International Symposium, 2007

A Study on Data Placement of Extensible Parallel Storage System.
Proceedings of the 6th Annual IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007), 2007

2006
RSVP Context Extraction in IP Mobility Environments.
Proceedings of the 63rd IEEE Vehicular Technology Conference, 2006

Prediction Algorithms in Large Scale VOD Services on Grid Infrastructure.
Proceedings of the Advances in Multimedia Information Processing, 2006

Architecture-based problem frames constructing for software reuse.
Proceedings of the 2006 International Workshop on Advances and Applications of Problem Frames, 2006

A New P2P-like Architecture for Large Scale End to End Network Measurement.
Proceedings of the Fifth International Conference on Networking and the International Conference on Systems (ICN / ICONS / MCL 2006), 2006

Prediction Algorithms in Large Scale VOD Network Collaborations.
Proceedings of the Computational Intelligence and Bioinformatics, 2006

POWER: Planning and Deployment Platform for Wireless Sensor Networks.
Proceedings of the Grid and Cooperative Computing Workshops, 2006

Supplier Categorization with K-Means Type Subspace Clustering.
Proceedings of the Frontiers of WWW Research and Development, 2006

2005
The PARNEM: Using Network Emulation to Predict the Correctness and Performance of Applications.
Proceedings of the Grid and Cooperative Computing - GCC 2005, 4th International Conference, Beijing, China, November 30, 2005

A High Availability Mechanism for Parallel File System.
Proceedings of the Advanced Parallel Processing Technologies, 6th International Workshop, 2005

2004
A Framework for End-to-End QoS Context Transfer in Mobile IPv6.
Proceedings of the Personal Wireless Communications, IFIP TC6 9th International Conference, 2004

A New Grid Security Framework with Dynamic Access Control.
Proceedings of the Grid and Cooperative Computing, 2004

Digital Library Application Grid - An Opportunity to Open Cultural Infrastructure.
Proceedings of the Grid and Cooperative Computing, 2004

A Grid Security Infrastructure Based on Behaviors and Trusts.
Proceedings of the Grid and Cooperative Computing, 2004

CNGrid: A Test-Bed for Grid Technologies in China.
Proceedings of the 10th IEEE International Workshop on Future Trends of Distributed Computing Systems (FTDCS 2004), 2004

A Grid Middleware for Aggregating Scientific Computing Libraries and Parallel Programming Environments.
Proceedings of the Advanced Web Technologies and Applications, 2004

Study on the Behavior-based Trust Model in Grid Security System.
Proceedings of the 2004 IEEE International Conference on Services Computing (SCC 2004), 2004

2003
Site-Role Based GreedyDual-Size Replacement Algorithm.
Proceedings of the Advances in Web-Age Information Management, 2003

A Practical Approach for Constructing a Parallel Network Simulator.
Proceedings of the Computer and Information Sciences, 2003

To Manage Grid Using Dynamically Constructed Network Management Concept: An Early Thought.
Proceedings of the Grid and Cooperative Computing, Second International Workshop, 2003

A Novel Model and Architecture on NMS - Dynamically Constructed Network Management.
Proceedings of the Advanced Parallel Programming Technologies, 5th International Workshop, 2003

2001
Active Network Supports for Mobile IP.
J. Comput. Sci. Technol., 2001

2000
An Object-Oriented Middleware for our Metasystem on Internet.
Proceedings of the TOOLS Asia 2000: 36th International Conference on Technology of Object-Oriented Languages and Systems, Xi'an, China, 30 October, 2000

