Xipeng Shen
ORCID: 0000-0003-3599-8010
Affiliations:
- North Carolina State University, USA
According to our database, Xipeng Shen authored at least 203 papers between 2000 and 2024.
Online presence:
- on orcid.org
- on dl.acm.org
Bibliography
2024
ESG: Pipeline-Conscious Efficient Scheduling of DNN Workflows on Serverless Platforms with Shareable GPUs.
Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, 2024
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Proceedings of the Nineteenth European Conference on Computer Systems, 2024
DACO: Pursuing Ultra-low Power Consumption via DNN-Adaptive CPU-GPU CO-optimization on Mobile Devices.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024
SmartMem: Layout Transformation Elimination and Adaptation for Efficient DNN Execution on Mobile.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
Expanding the Edge: Enabling Efficient Winograd CNN Inference With Deep Reuse on Edge Device.
IEEE Trans. Knowl. Data Eng., October, 2023
Accelerating matrix-centric graph processing on GPUs through bit-level optimizations.
J. Parallel Distributed Comput., July, 2023
Proc. ACM Program. Lang., April, 2023
Proc. ACM Manag. Data, 2023
ACM Comput. Surv., 2023
Decentralized Application-Level Adaptive Scheduling for Multi-Instance DNNs on Open Mobile Devices.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023
BitGNN: Unleashing the Performance Potential of Binary Graph Neural Networks on GPUs.
Proceedings of the 37th International Conference on Supercomputing, 2023
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
SpecPMT: Speculative Logging for Resolving Crash Consistency Overhead of Persistent Memory.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
2022
IEEE Trans. Software Eng., 2022
IEEE Trans. Software Eng., 2022
POCLib: A High-Performance Framework for Enabling Near Orthogonal Processing on Compression.
IEEE Trans. Parallel Distributed Syst., 2022
IEEE Trans. Parallel Distributed Syst., 2022
ACM Trans. Design Autom. Electr. Syst., 2022
ACM Trans. Archit. Code Optim., 2022
Proceedings of the WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25, 2022
Brief Industry Paper: Enabling Level-4 Autonomous Driving on a Single $1k Off-the-Shelf Card.
Proceedings of the 28th IEEE Real-Time and Embedded Technology and Applications Symposium, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022
Proceedings of the 44th IEEE/ACM International Conference on Software Engineering: Companion Proceedings, 2022
Proceedings of the IEEE/ACM International Workshop on HPC User Support Tools, 2022
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022
Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines.
Proceedings of the Software Architecture. ECSA 2022 Tracks and Workshops, 2022
Enabling Near Real-Time NLU-Driven Natural Language Programming through Dynamic Grammar Graph-Based Translation.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022
2021
IEEE Trans. Parallel Distributed Syst., 2021
Proc. ACM Program. Lang., 2021
Proc. ACM Program. Lang., 2021
Faster SAT Solving for Software with Repeated Structures (with Case Studies on Software Test Suite Minimization).
CoRR, 2021
CoCoPIE: enabling real-time AI on off-the-shelf mobile devices via compression-compilation co-design.
Commun. ACM, 2021
Proceedings of the ESEC/FSE '21: 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2021
Proceedings of the 2021 International Symposium on Secure and Private Execution Environment Design (SEED), 2021
Brief Industry Paper: Towards Real-Time 3D Object Detection for Autonomous Vehicles with Pruning Search.
Proceedings of the 27th IEEE Real-Time and Embedded Technology and Applications Symposium, 2021
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
Proceedings of the IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, 2021
HPC Ontology: Towards a Unified Ontology for Managing Training Datasets and AI Models for High-Performance Computing.
Proceedings of the IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, 2021
PCCS: Processor-Centric Contention-aware Slowdown Model for Heterogeneous System-on-Chips.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
Proceedings of the International Joint Conference on Neural Networks, 2021
Proceedings of the 9th International Conference on Learning Representations, 2021
Proceedings of the IEEE International Conference on Data Mining, 2021
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021
Proceedings of the 35th European Conference on Object-Oriented Programming, 2021
Proceedings of the CC '21: 30th ACM SIGPLAN International Conference on Compiler Construction, 2021
RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
IEEE Trans. Parallel Distributed Syst., 2020
ACM Trans. Embed. Comput. Syst., 2020
CoRR, 2020
CoCoPIE: Making Mobile AI Sweet As PIE - Compression-Compilation Co-Design Goes a Long Way.
CoRR, 2020
Proceedings of the ESEC/FSE '20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020
Proceedings of the Third Conference on Machine Learning and Systems, 2020
Hardware-Based Domain Virtualization for Intra-Process Isolation of Persistent Memory Objects.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020
Proceedings of the ICSE '20: 42nd International Conference on Software Engineering, Seoul, South Korea, 27 June, 2020
MKPipe: a compiler framework for optimizing multi-kernel workloads in OpenCL for FPGA.
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020
MERR: Improving Security of Persistent Memory Objects via Efficient Memory Exposure Reduction and Randomization.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020
GOPipe: A Granularity-Oblivious Programming Framework for Pipelined Stencil Executions on GPU.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020
2019
Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2019
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, 2019
Deep reuse: streamline CNN inference on the fly via coarse-grained computation reuse.
Proceedings of the ACM International Conference on Supercomputing, 2019
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019
HiWayLib: A Software Framework for Enabling High Performance Communications for Heterogeneous Pipeline Computations.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019
2018
Efficient Document Analytics on Compressed Data: Method, Challenges, Algorithms, Insights.
Proc. VLDB Endow., 2018
Neural Networks, 2018
J. Parallel Distributed Comput., 2018
Frontiers Comput. Sci., 2018
Proceedings of the International Conference for High Performance Computing, 2018
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018
Proceedings of the International Symposium on Memory Systems, 2018
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018
Taming the "Monster": Overcoming Program Optimization Challenges on SW26010 Through Precise Performance Modeling.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018
Zwift: A Programming Framework for High Performance Text Analytics on Compressed Data.
Proceedings of the 32nd International Conference on Supercomputing, 2018
Proceedings of the IEEE International Conference on Data Mining, 2018
Proceedings of the 34th IEEE International Conference on Data Engineering, 2018
Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018
Proceedings of the 27th International Conference on Compiler Construction, 2018
2017
IEEE Trans. Computers, 2017
Proc. ACM Program. Lang., 2017
Understanding co-run performance on CPU-GPU integrated processors: observations, insights, directions.
Frontiers Comput. Sci., 2017
Egeria: a framework for automatic synthesis of HPC advising tools through multi-layered natural language processing.
Proceedings of the International Conference for High Performance Computing, 2017
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017
Generalizations of the theory and deployment of triangular inequality for compiler-based strength reduction.
Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2017
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
Bridging the gap between memory performance and massive parallelism: the critical role of programming systems innovations (keynote).
Proceedings of the 2017 ACM SIGPLAN International Symposium on Memory Management, 2017
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017
Sweet KNN: An Efficient KNN on GPU through Reconciliation between Redundancy Removal and Regularity.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017
POSTER: Cutting the Fat: Speeding Up RBM for Fast Deep Learning Through Generalized Redundancy Elimination.
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017
2016
Examining and Reducing the Influence of Sampling Errors on Feedback-Driven Optimizations.
ACM Trans. Archit. Code Optim., 2016
Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2016
Proceedings of the 2016 International Conference on Supercomputing, 2016
Proceedings of the 30th European Conference on Object-Oriented Programming, 2016
Proceedings of the 26th Annual International Conference on Computer Science and Software Engineering, 2016
Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016
2015
TOP: A Framework for Enabling Algorithmic Optimizations for Distance-Related Problems.
Proc. VLDB Endow., 2015
Proceedings of the 5th International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, 2015
Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2015
Proceedings of the 48th International Symposium on Microarchitecture, 2015
Enabling and Exploiting Flexible Task Assignment on GPU through SM-Centric Program Transformations.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015
Yinyang K-Means: A Drop-In Replacement of the Classic K-Means with Consistent Speedup.
Proceedings of the 32nd International Conference on Machine Learning, 2015
Proceedings of the 15th Workshop on Hot Topics in Operating Systems, 2015
Proceedings of 25th Annual International Conference on Computer Science and Software Engineering, 2015
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015
2014
Space-efficient multi-versioning for input-adaptive feedback-driven program optimizations.
Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014
Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014
Proceedings of the Languages and Compilers for Parallel Computing, 2014
Proceedings of the ACM/IEEE International Conference on Automated Software Engineering, 2014
SatScore: uncovering and avoiding a principled pitfall in responsiveness measurements of app launches.
Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, 2014
Challenging the "embarrassingly sequential": parallelizing finite state machine-based computations through principled speculation.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014
Finding the limit: examining the potential and complexity of compilation scheduling for JIT-based runtime systems.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014
SM-centric transformation: circumventing hardware restrictions for flexible GPU scheduling.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014
2013
HPar: A practical parallel parser for HTML-taming HTML complexities for parallel parsing.
ACM Trans. Archit. Code Optim., 2013
Int. J. Parallel Program., 2013
Complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on GPU.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013
Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, 2013
Do computer programs have to be as dumb as they are?: input-centric dynamic program optimizations.
Proceedings of the VMIL@SPLASH '13: Proceedings of the 7th ACM workshop on Virtual machines and intermediate languages, 2013
Proceedings of the 2013 IEEE 21st International Symposium on Modelling, 2013
Simple Profile Rectifications Go a Long Way - Statistically Exploring and Alleviating the Effects of Sampling Errors for Program Optimizations.
Proceedings of the ECOOP 2013 - Object-Oriented Programming, 2013
Profmig: A framework for flexible migration of program profiles across software versions.
Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013
Exploring hybrid memory for GPU energy efficiency through software-hardware co-design.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013
2012
IEEE Trans. Parallel Distributed Syst., 2012
Proceedings of the 2012 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '12, 2012
Proceedings of the 27th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2012
Proceedings of the Job Scheduling Strategies for Parallel Processing, 2012
One stone two birds: synchronization relaxation and redundancy removal in GPU-CPU translation.
Proceedings of the International Conference on Supercomputing, 2012
Speculative parallelization needs rigor: probabilistic analysis for optimal speculation of finite-state machine applications.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
2011
The Complexity of Optimal Job Co-Scheduling on Chip Multiprocessors and Heuristics-Based Solutions.
IEEE Trans. Parallel Distributed Syst., 2011
A step towards transparent integration of input-consciousness into dynamic program optimizations.
Proceedings of the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2011
Proceedings of the Languages and Compilers for Parallel Computing, 2011
Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems, 2011
Enhancing Data Locality for Dynamic Simulations through Asynchronous Data Transformations and Adaptive Control.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011
Correctly Treating Synchronizations in Compiling Fine-Grained SPMD-Threaded Programs for CPU.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011
2010
Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs?
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010
Proceedings of the 25th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2010
LU Decomposition on Cell Broadband Engine: An Empirical Study to Exploit Heterogeneous Chip Multiprocessors.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2010
Proceedings of the Languages and Compilers for Parallel Computing, 2010
Streamlining GPU applications on the fly: thread divergence elimination through runtime thread-data remapping.
Proceedings of the 24th International Conference on Supercomputing, 2010
Combining Locality Analysis with Online Proactive Job Co-scheduling in Chip Multiprocessors.
Proceedings of the High Performance Embedded Architectures and Compilers, 2010
Proceedings of the CGO 2010, 2010
Proceedings of the Compiler Construction, 19th International Conference, 2010
2009
ACM Trans. Program. Lang. Syst., 2009
ACM SIGOPS Oper. Syst. Rev., 2009
Proceedings of the 5th International Conference on Virtual Execution Environments, 2009
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009
Speculation with Little Wasting: Saving Cost in Software Speculation through Transparent Learning.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009
Proceedings of the CGO 2009, 2009
A study on optimally co-scheduling jobs of different lengths on chip multiprocessors.
Proceedings of the 6th Conference on Computing Frontiers, 2009
2008
Proceedings of the Languages and Compilers for Parallel Computing, 2008
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Adaptive Software Speculation for Enhancing the Cost-Efficiency of Behavior-Oriented Parallelization.
Proceedings of the 2008 International Conference on Parallel Processing, 2008
Proceedings of the Euro-Par 2008, 2008
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008
2007
IEEE Trans. Computers, 2007
J. Parallel Distributed Comput., 2007
Proceedings of the 34th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2007
Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, 2007
Proceedings of the Languages and Compilers for Parallel Computing, 2007
Proceedings of the 21th International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007
Proceedings of the Workshop on Experimental Computer Science, 2007
Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007
2006
Proceedings of the 5th International Symposium on Memory Management, 2006
2005
Proceedings of the Languages and Compilers for Parallel Computing, 2005
Proceedings of the 19th Annual International Conference on Supercomputing, 2005
Proceedings of the 2005 workshop on Memory System Performance, 2005
2004
Proceedings of the Storage and Retrieval Methods and Applications for Multimedia 2004, 2004
Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation 2004, 2004
Proceedings of the Languages and Compilers for High Performance Computing, 2004
Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004
Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004
2003
Proceedings of the Languages and Compilers for Parallel Computing, 2003
2001
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
2000
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000