Lu Peng

ORCID: 0000-0003-3545-286X

Affiliations:
  • Tulane University, Department of Computer Science, New Orleans, LA, USA
  • Louisiana State University, Division of Electrical and Computer Engineering, Baton Rouge, LA, USA (former)
  • University of Florida, Department of Computer Science, Gainesville, FL, USA (former)


According to our database, Lu Peng authored at least 98 papers between 2001 and 2024.

Bibliography

2024
Smart Agriculture: Current State, Opportunities, and Challenges.
IEEE Access, 2024

Soft Error Resilience Analysis of LSTM Networks.
Proceedings of the Great Lakes Symposium on VLSI 2024, 2024

A New Routing Strategy to Improve Success Rates of Quantum Computers.
Proceedings of the Great Lakes Symposium on VLSI 2024, 2024

2023
Boosting Performance and QoS for Concurrent GPU B+trees by Combining-Based Synchronization.
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

Comprehensive Transformer-Based Model Architecture for Real-World Storm Prediction.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track, 2023

MMST-ViT: Climate Change-aware Crop Yield Prediction via Multi-Modal Spatial-Temporal Vision Transformer.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Graph Neural Network Assisted Quantum Compilation for Qubit Allocation.
Proceedings of the Great Lakes Symposium on VLSI 2023, 2023

Stochastic Computing for Reliable Memristive In-Memory Computation.
Proceedings of the Great Lakes Symposium on VLSI 2023, 2023

SCU: A Hardware Accelerator for Smart Contract Execution.
Proceedings of the IEEE International Conference on Blockchain, 2023

2022
Protecting Synchronization Mechanisms of Parallel Big Data Kernels via Logging.
IEEE Trans. Computers, 2022

Ensemble of fast learning stochastic gradient boosting.
Commun. Stat. Simul. Comput., 2022

High performance GPU concurrent B+tree.
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022

Cascade Variational Auto-Encoder for Hierarchical Disentanglement.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

GeauxTrace: A Scalable Privacy-Protecting Contact Tracing App Design Using Blockchain.
Proceedings of the IEEE/ACM International Conference on Big Data Computing, 2022

2021
iChain: Peer-To-Peer Machine Learning Powered by Blockchain Technology.
Frontiers Blockchain, 2021

Evaluation of Algorithms for Randomizing Key Item Locations in Game Worlds.
IEEE Access, 2021

Precise Weather Parameter Predictions for Target Regions via Neural Networks.
Proceedings of the Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track, 2021

GPU-Assisted Memory Expansion.
Proceedings of the IEEE International Conference on Networking, Architecture and Storage, 2021

BifurKTM: Approximately Consistent Distributed Transactional Memory for GPUs.
Proceedings of the 12th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and 10th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms, 2021

2020
A High Throughput B+tree for SIMD Architectures.
IEEE Trans. Parallel Distributed Syst., 2020

Architectural Support for NVRAM Persistence in GPUs.
IEEE Trans. Parallel Distributed Syst., 2020

Computer comparisons in the presence of performance variation.
Frontiers Comput. Sci., 2020

Robust Cache-Aware Quantum Processor Layout.
Proceedings of the International Symposium on Reliable Distributed Systems, 2020

ATT: A Fault-Tolerant ReRAM Accelerator for Attention-based Neural Networks.
Proceedings of the 38th IEEE International Conference on Computer Design, 2020

BPU: A Blockchain Processing Unit for Accelerated Smart Contract Execution.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

2019
Hierarchical Hybrid Memory Management in OS for Tiered Memory Systems.
IEEE Trans. Parallel Distributed Syst., 2019

Long Short-Term Memory Network Design for Analog Computing.
ACM J. Emerg. Technol. Comput. Syst., 2019

Harmonia: a high throughput B+tree for GPUs.
Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019

CUDA-DTM: Distributed Transactional Memory for GPU Clusters.
Proceedings of the Networked Systems - 7th International Conference, 2019

Exploiting Model-Level Parallelism in Recurrent Neural Network Accelerators.
Proceedings of the 13th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2019

Hardware Accelerator for Adversarial Attacks on Deep Learning Neural Networks.
Proceedings of the Tenth International Green and Sustainable Computing Conference, 2019

Efficient GPU NVRAM Persistence with Helper Warps.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Fooling AI with AI: An Accelerator for Adversarial Attacks on Deep Learning Visual Classification.
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019

Thinking about A New Mechanism for Huge Page Management.
Proceedings of the 10th ACM SIGOPS Asia-Pacific Workshop on Systems, 2019

2018
qSwitch: Dynamical Off-Chip Bandwidth Allocation Between Local and Remote Accesses.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

A Multiple Input Floating Gate Based Arithmetic Logic Unit with a Feedback Loop for Digital Calibration.
J. Low Power Electron., 2018

Calibration method to reduce the error in logarithmic conversion with its circuit implementation.
IET Circuits Devices Syst., 2018

2017
Using Switchable Pins to Increase Off-Chip Bandwidth in Chip-Multiprocessors.
IEEE Trans. Parallel Distributed Syst., 2017

Exploring Energy-Efficient Cache Design in Emerging Mobile Platforms.
ACM Trans. Design Autom. Electr. Syst., 2017

Soft error resilience of Big Data kernels through algorithmic approaches.
J. Supercomput., 2017

QoS Management on Heterogeneous Architecture for Multiprogrammed, Parallel, and Domain-Specific Applications.
IEEE Syst. J., 2017

Emerging technology enabled energy-efficient GPGPUs register file.
Microprocess. Microsystems, 2017

A novel switchable pin method for regulating power in chip-multiprocessor.
Integr., 2017

Compact Modeling of Graphene Barristor for Digital Integrated Circuit Design.
Proceedings of the 2017 IEEE Computer Society Annual Symposium on VLSI, 2017

Accelerating GPU Hardware Transactional Memory with Snapshot Isolation.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Carpool: a bufferless on-chip network supporting adaptive multicast and hotspot alleviation.
Proceedings of the International Conference on Supercomputing, 2017

2016
Performance Analysis of Multimedia Retrieval Workloads Running on Multicores.
IEEE Trans. Parallel Distributed Syst., 2016

Soft error resilience in Big Data kernels through modular analysis.
J. Supercomput., 2016

Operational Cost Optimization for Cloud Computing Data Centers Using Renewable Energy.
IEEE Syst. J., 2016

Design space exploration for device and architectural heterogeneity in chip-multiprocessors.
Microprocess. Microsystems, 2016

Parallelizing image feature extraction algorithms on multi-core platforms.
J. Parallel Distributed Comput., 2016

A Low-Cost Mixed Clock Generator for High Speed Adiabatic Logic.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2016

Modeling of Graphene Nanoribbon Tunnel Field Effect Transistor in Verilog-A for Digital Circuit Design.
Proceedings of the IEEE International Symposium on Nanoelectronic and Information Systems, 2016

Efficient GPU hardware transactional memory through early conflict resolution.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

2015
A framework for evaluating comprehensive fault resilience mechanisms in numerical programs.
J. Supercomput., 2015

Powering Up Dark Silicon: Mitigating the Limitation of Power Delivery via Dynamic Pin Switching.
IEEE Trans. Emerg. Top. Comput., 2015

Cross-architecture prediction based scheduling for energy efficient execution on single-ISA heterogeneous chip-multiprocessors.
Microprocess. Microsystems, 2015

NBTI alleviation on FinFET-made GPUs by utilizing device heterogeneity.
Integr., 2015

Precise computer comparisons via statistical resampling methods.
Proceedings of the 2015 IEEE International Symposium on Performance Analysis of Systems and Software, 2015

Circuit Implementation of Switchable Pins in Chip Multiprocessor.
Proceedings of the IEEE International Symposium on Nanoelectronic and Information Systems, 2015

2014
Comprehensive and Efficient Design Parameter Selection for Soft Error Resilient Processors via Universal Rules.
IEEE Trans. Computers, 2014

Design configuration selection for hard-error reliable processors via statistical rules.
Microprocess. Microsystems, 2014

Mitigating NBTI Degradation on FinFET GPUs through Exploiting Device Heterogeneity.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2014

Energy efficient job scheduling in single-ISA heterogeneous chip-multiprocessors.
Proceedings of the Fifteenth International Symposium on Quality Electronic Design, 2014

Increasing off-chip bandwidth in multi-core processors with switchable pins.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

QoS management on heterogeneous architecture for parallel applications.
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

2013
Predicting Architectural Vulnerability on Multithreaded Processors under Resource Contention and Sharing.
IEEE Trans. Dependable Secur. Comput., 2013

Effective thermal control techniques for liquid-cooled 3D multi-core processors.
Proceedings of the International Symposium on Quality Electronic Design, 2013

Lighting the dark silicon by exploiting heterogeneity on future processors.
Proceedings of the 50th Annual Design Automation Conference 2013, 2013

Optimization of Electricity and Server Maintenance Costs in Hybrid Cooling Data Centers.
Proceedings of the 2013 IEEE Sixth International Conference on Cloud Computing, Santa Clara, CA, USA, June 28, 2013

2012
Model guided adaptive design and analysis in computer experiment.
Stat. Anal. Data Min., 2012

Optimal microarchitectural design configuration selection for processor hard-error reliability.
Proceedings of the Thirteenth International Symposium on Quality Electronic Design, 2012

2011
Efficient Prefetching with Hybrid Schemes and Use of Program Feedback to Adjust Prefetcher Aggressiveness.
J. Instr. Level Parallelism, 2011

Enhancements for Accurate and Timely Streaming Prefetcher.
J. Instr. Level Parallelism, 2011

Performance and Power Analysis of ATI GPU: A Statistical Approach.
Proceedings of the Sixth International Conference on Networking, Architecture, and Storage, 2011

Universal rules guided design parameter selection for soft error resilient processors.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2011

Architecture comparisons between Nvidia and ATI GPUs: Computation parallelism and data communications.
Proceedings of the 2011 IEEE International Symposium on Workload Characterization, 2011

Two-level soft error vulnerability prediction on SMT/CMP architectures.
Proceedings of the 2011 IEEE International Symposium on Workload Characterization, 2011

Tree structured analysis on GPU power study.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011

Statistical GPU power analysis using tree-based methods.
Proceedings of the 2011 International Green Computing Conference and Workshops, 2011

2010
Efficient Microarchitectural Vulnerabilities Prediction Using Boosted Regression Trees and Patient Rule Inductions.
IEEE Trans. Computers, 2010

A Host-Based Intrusion Detection System Using Architectural Features to Improve Sophisticated Denial-of-Service Attack Detections.
Int. J. Inf. Secur. Priv., 2010

Expediting IP lookups with reduced power via TBM and SST supernode caching.
Comput. Commun., 2010

Weak execution ordering - exploiting iterative methods on many-core GPUs.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2010

2009
Accurate and efficient processor performance prediction via regression tree based modeling.
J. Syst. Archit., 2009

Versatile prediction and fast estimation of Architectural Vulnerability Factor from processor performance metrics.
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

A case study: Using architectural features to improve sophisticated denial-of-service attack detections.
Proceedings of the 2009 IEEE Symposium on Computational Intelligence in Cyber Security, 2009

2008
Memory hierarchy performance measurement of commercial dual-core desktop processors.
J. Syst. Archit., 2008

SecCMP: Enhancing Critical Secrets Protection in Chip-Multiprocessors.
Int. J. Inf. Secur. Priv., 2008

Efficient MART-aided modeling for microarchitecture design space exploration and performance prediction.
Proceedings of the 2008 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2008

2007
Memory Performance and Scalability of Intel's and AMD's Dual-Core Processors: A Case Study.
Proceedings of the 26th IEEE International Performance Computing and Communications Conference, 2007

Power Efficient IP Lookup with Supernode Caching.
Proceedings of the Global Communications Conference, 2007

2006
Coterminous locality and coterminous group data prefetching on chip-multiprocessors.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

SecCMP: a secure chip-multiprocessor architecture.
Proceedings of the 1st Workshop on Architectural and System Support for Improving Software Dependability, 2006

2004
A New Address-Free Memory Hierarchy Layer for Zero-Cycle Load.
J. Instr. Level Parallelism, 2004

Signature Buffer: Bridging Performance Gap between Registers and Caches.
Proceedings of the 10th International Conference on High-Performance Computer Architecture (HPCA-10 2004), 2004

2003
Address-free memory access based on program syntax correlation of loads and stores.
IEEE Trans. Very Large Scale Integr. Syst., 2003

2001
Symbolic Cache: Fast Memory Access Based on Program Syntax Correlation of Loads and Stores.
Proceedings of the 19th International Conference on Computer Design (ICCD 2001), 2001

