Mahmut T. Kandemir
Orcid: 0000-0002-9940-9951
Affiliations:
- Penn State, University Park, USA
According to our database, Mahmut T. Kandemir authored at least 774 papers between 1997 and 2024.
Awards
IEEE Fellow 2016, "For contributions to compiler support for performance and energy optimization of computer architectures".
Links
Online presence:
- on orcid.org
- on cse.psu.edu
Bibliography
2024
An Efficient Edge-Cloud Partitioning of Random Forests for Distributed Sensor Networks.
IEEE Embed. Syst. Lett., March, 2024
Proc. ACM Meas. Anal. Comput. Syst., 2024
CoRR, 2024
GameStreamSR: Enabling Neural-Augmented Game Streaming on Commodity Mobile Platforms.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
Paldia: Enabling SLO-Compliant and Cost-Effective Serverless Computing on Heterogeneous Hardware.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
Proceedings of the 38th ACM International Conference on Supercomputing, 2024
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops, 2024
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024
Proceedings of the 2024 ACM Symposium on Cloud Computing, 2024
Proceedings of the 2024 ACM Symposium on Cloud Computing, 2024
2023
Proc. ACM Program. Lang., October, 2023
Dataset, June, 2023
Clust. Comput., 2023
Proceedings of the International Conference for High Performance Computing, 2023
Proceedings of the IEEE International Conference on Quantum Software, 2023
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Proceedings of the 31st International Symposium on Modeling, 2023
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2023
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
Quantization for Bayesian Deep Learning: Low-Precision Characterization and Robustness.
Proceedings of the IEEE International Symposium on Workload Characterization, 2023
Stash: A Comprehensive Stall-Centric Characterization of Public Cloud VMs for Distributed Deep Learning.
Proceedings of the 43rd IEEE International Conference on Distributed Computing Systems, 2023
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023
Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023
MicroBlend: An Automated Service-Blending Framework for Microservice-Based Cloud Applications.
Proceedings of the 16th IEEE International Conference on Cloud Computing, 2023
Proceedings of the 16th IEEE International Conference on Cloud Computing, 2023
2022
J. Supercomput., 2022
Data Convection: A GPU-Driven Case Study for Thermal-Aware Data Placement in 3D DRAMs.
Proc. ACM Meas. Anal. Comput. Syst., 2022
Proc. ACM Meas. Anal. Comput. Syst., 2022
J. Chem. Inf. Model., 2022
Seeker: Synergizing Mobile and Energy Harvesting Wearable Sensors for Human Activity Recognition.
CoRR, 2022
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022
Proceedings of the Middleware '22: 23rd International Middleware Conference, Quebec, QC, Canada, November 7, 2022
Skipper: Enabling efficient SNN training through activation-checkpointing and time-skipping.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
An architecture interface and offload model for low-overhead, near-data, distributed accelerators.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Learn Locally, Correct Globally: A Distributed Algorithm for Training Graph Neural Networks.
Proceedings of the Tenth International Conference on Learning Representations, 2022
Proceedings of the 42nd IEEE International Conference on Distributed Computing Systems, 2022
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022
Proceedings of the GLSVLSI '22: Great Lakes Symposium on VLSI 2022, Irvine CA USA, June 6, 2022
Proceedings of the Euro-Par 2022: Parallel Processing, 2022
Proceedings of the IEEE Intl. Conf. on Dependable, 2022
Cypress: input size-sensitive container provisioning and request scheduling for serverless platforms.
Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022
Splice: An Automated Framework for Cost- and Performance-Aware Blending of Cloud Services.
Proceedings of the 22nd IEEE International Symposium on Cluster, 2022
Proceedings of the IEEE International Conference on Big Data, 2022
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022
2021
MaxTracker: Continuously Tracking the Maximum Computation Progress for Energy Harvesting ReRAM-based CNN Accelerators.
ACM Trans. Embed. Comput. Syst., 2021
Proc. ACM Meas. Anal. Comput. Syst., 2021
Proc. ACM Program. Lang., 2021
Exploiting Activation based Gradient Output Sparsity to Accelerate Backpropagation in CNNs.
CoRR, 2021
CoRR, 2021
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
Proceedings of the PLDI '21: 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2021
Proceedings of the PLDI '21: 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2021
Structured in Space, Randomized in Time: Leveraging Dropout in RNNs for Efficient Training.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Proceedings of the IEEE International Conference on Networking, Architecture and Storage, 2021
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
GYAN: Accelerating Bioinformatics Tools in Galaxy with GPU-Aware Computation Mapping.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021
Proceedings of the IEEE International Symposium on Workload Characterization, 2021
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021
Origin: Enabling On-Device Intelligence for Human Activity Recognition Using Energy Harvesting Wireless Sensor Networks.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021
Proceedings of the CODASPY '21: Eleventh ACM Conference on Data and Application Security and Privacy, 2021
Kraken: Adaptive Container Provisioning for Deploying Dynamic DAGs in Serverless Platforms.
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021
2020
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020
Optimization of Intercache Traffic Entanglement in Tagless Caches With Tiling Opportunities.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020
Proc. ACM Meas. Anal. Comput. Syst., 2020
Proc. ACM Meas. Anal. Comput. Syst., 2020
Guiding Conventional Protein-Ligand Docking Software with Convolutional Neural Networks.
J. Chem. Inf. Model., 2020
Towards Designing a Self-Managed Machine Learning Inference Serving System in Public Cloud.
CoRR, 2020
IEEE Comput. Archit. Lett., 2020
Proceedings of the 28th Euromicro International Conference on Parallel, 2020
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Proceedings of the Middleware '20: 21st International Middleware Conference, 2020
Proceedings of the Middleware '20: 21st International Middleware Conference, 2020
Proceedings of the WoSC@Middleware 2020: Proceedings of the 2020 Sixth International Workshop on Serverless Computing, 2020
Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020
Proceedings of the 21st International Symposium on Quality Electronic Design, 2020
Déjà View: Spatio-Temporal Compute Reuse for Energy-Efficient 360° VR Video Streaming.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020
Proceedings of the IEEE International Symposium on Workload Characterization, 2020
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020
Proceedings of the 40th IEEE International Conference on Distributed Computing Systems, 2020
ResiRCA: A Resilient Energy Harvesting ReRAM Crossbar-Based Accelerator for Intelligent Embedded Processors.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020
Multiverse: Dynamic VM Provisioning for Virtualized High Performance Computing Clusters.
Proceedings of the 20th IEEE/ACM International Symposium on Cluster, 2020
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020
2019
J. Parallel Distributed Comput., 2019
Proceedings of the 2019 IEEE Symposium on Security and Privacy, 2019
Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2019
Distilling the Essence of Raw Video to Reduce Memory Usage and Energy at Edge Devices.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
CASH: compiler assisted hardware design for improving DRAM energy efficiency in CNN inference.
Proceedings of the International Symposium on Memory Systems, 2019
Proceedings of the 46th International Symposium on Computer Architecture, 2019
Proceedings of the 39th IEEE International Conference on Distributed Computing Systems, 2019
Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019
Proceedings of the 11th USENIX Workshop on Hot Topics in Storage and File Systems, 2019
Proceedings of the 26th IEEE International Conference on High Performance Computing, 2019
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Kube-Knots: Resource Harvesting through Dynamic Container Orchestration in GPU-based Datacenters.
Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019
Proceedings of the ACM Symposium on Cloud Computing, SoCC 2019, 2019
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019
Spock: Exploiting Serverless Functions for SLO and Cost Aware Resource Procurement in Public Cloud.
Proceedings of the 12th IEEE International Conference on Cloud Computing, 2019
2018
IEEE Trans. Computers, 2018
ACM Trans. Archit. Code Optim., 2018
Proc. ACM Meas. Anal. Comput. Syst., 2018
IAA: Incidental Approximate Architectures for Extremely Energy-Constrained Energy Harvesting Scenarios using IoT Nonvolatile Processors.
IEEE Micro, 2018
Holistic Management of the GPGPU Memory Hierarchy to Manage Warp-level Latency Tolerance.
CoRR, 2018
Comput. Lang. Syst. Struct., 2018
IEEE Comput. Archit. Lett., 2018
Proceedings of the 31st IEEE International System-on-Chip Conference, 2018
Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2018
FlashShare: Punching Through Server Storage Stack from Kernel to Firmware for Ultra-Low Latency SSDs.
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Amber*: Enabling Precise Full-System Simulation with Detailed Modeling of All SSD Resources.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Invalid Data-Aware Coding to Enhance the Read Performance of High-Density Flash Memories.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Proceedings of the 26th IEEE International Symposium on Modeling, 2018
Proceedings of the 26th IEEE International Symposium on Modeling, 2018
Tolerating Write Disturbance Errors in PCM: Experimental Characterization, Analysis, and Mechanisms.
Proceedings of the 26th IEEE International Symposium on Modeling, 2018
Proceedings of the 26th IEEE International Symposium on Modeling, 2018
Efficient K nearest neighbor algorithm implementations for throughput-oriented architectures.
Proceedings of the 19th International Symposium on Quality Electronic Design, 2018
Proceedings of the 19th International Symposium on Quality Electronic Design, 2018
Proceedings of the 2018 IEEE International Symposium on Workload Characterization, 2018
Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, 2018
Proceedings of the 25th IEEE International Conference on High Performance Computing, 2018
Proceedings of the 16th USENIX Conference on File and Storage Technologies, 2018
Proceedings of the 2018 IEEE 16th Intl Conf on Dependable, 2018
Proceedings of the 55th Annual Design Automation Conference, 2018
Proceedings of the ACM Symposium on Cloud Computing, 2018
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018
2017
IEEE Trans. Parallel Distributed Syst., 2017
IEEE Trans. Computers, 2017
Exploiting Data Longevity for Enhancing the Lifetime of Flash-based Storage Class Memory.
Proc. ACM Meas. Anal. Comput. Syst., 2017
J. Syst. Archit., 2017
Proceedings of the 25th High Performance Computing Symposium, Virginia Beach, VA, USA, April 23, 2017
A Study on Performance and Power Efficiency of Dense Non-Volatile Caches in Multi-Core Systems.
Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, Urbana-Champaign, IL, USA, June 05, 2017
Proceedings of the 25th Euromicro International Conference on Parallel, 2017
Race-to-sleep + content caching + display caching: a recipe for energy-efficient video streaming on handhelds.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
Proceedings of the International Symposium on Memory Systems, 2017
Proceedings of the 25th IEEE International Symposium on Modeling, 2017
Quantifying the Potential Benefits of On-chip Near-Data Computing in Manycore Processors.
Proceedings of the 25th IEEE International Symposium on Modeling, 2017
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017
Exploring the impact of memory block permutation on performance of a crossbar ReRAM main memory.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017
TraceTracker: Hardware/software co-evaluation for large-scale I/O workload reconstruction.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017
Proceedings of the 37th IEEE International Conference on Distributed Computing Systems, 2017
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017
Leveraging value locality for efficient design of a hybrid cache in multicore processors.
Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017
Exploring the Potential for Collaborative Data Compression and Hard-Error Tolerance in PCM Memories.
Proceedings of the 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2017
Hardware-Software Co-design to Mitigate DRAM Refresh Overheads: A Case for Refresh-Aware Process Scheduling.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017
2016
ACM Trans. Storage, 2016
CoRR, 2016
Asymmetrically reliable caches for multicore architectures under performance and energy constraints.
Clust. Comput., 2016
Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Science, 2016
Exploring the potentials of parallel garbage collection in SSDs for enterprise storage systems.
Proceedings of the International Conference for High Performance Computing, 2016
Proceedings of the 5th Non-Volatile Memory Systems and Applications Symposium, 2016
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016
Proceedings of the 2016 IEEE International Symposium on Performance Analysis of Systems and Software, 2016
Proceedings of the 2016 IEEE International Symposium on Performance Analysis of Systems and Software, 2016
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016
Proceedings of the 2016 International Symposium on Code Generation and Optimization, 2016
Proceedings of the Architecture of Computing Systems - ARCS 2016, 2016
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016
2015
IOPro: a parallel I/O profiling and visualization framework for high-performance storage systems.
J. Supercomput., 2015
EECache: A Comprehensive Study on the Architectural Design for Energy-Efficient Last-Level Caches in Chip Multiprocessors.
ACM Trans. Archit. Code Optim., 2015
Proceedings of the 28th International Conference on VLSI Design, 2015
Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2015
Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2015
Proceedings of the 2015 International Symposium on Memory Systems, 2015
Proceedings of the 2015 IEEE International Symposium on Performance Analysis of Systems and Software, 2015
A case for core-assisted bottleneck acceleration in GPUs: enabling flexible data compression with assist warps.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
Performance and Energy Efficient Asymmetrically Reliable Caches for Multicore Architectures.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015
Evaluating the Combined Impact of Node Architecture and Cloud Workload Characteristics on Network Traffic and Performance/Cost.
Proceedings of the 2015 IEEE International Symposium on Workload Characterization, 2015
Proceedings of the 5th International Conference on Energy Aware Computing Systems & Applications, 2015
Proceedings of the 35th IEEE International Conference on Distributed Computing Systems, 2015
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
Network footprint reduction through data access and computation placement in NoC-based manycores.
Proceedings of the 52nd Annual Design Automation Conference, 2015
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015
TaPEr: tackling power emergencies in the dark silicon era by exploiting resource scalability.
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015
NVMMU: A Non-volatile Memory Management Unit for Heterogeneous GPU-SSD Architectures.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015
2014
ACM Trans. Archit. Code Optim., 2014
Exploring the future of out-of-core computing with compute-local non-volatile memory.
Sci. Program., 2014
Improved cache utilization and preconditioner efficiency through use of a space-filling curve mesh element- and vertex-reordering technique.
Eng. Comput., 2014
Proceedings of the ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2014
Proceedings of the ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2014
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014
Proceedings of the IEEE 22nd International Symposium on Modelling, 2014
Quantifying and Optimizing the Impact of Victim Cache Line Selection in Manycore Systems.
Proceedings of the IEEE 22nd International Symposium on Modelling, 2014
EECache: exploiting design choices in energy-efficient last-level caches for chip multiprocessors.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014
Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014
Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014
Proceedings of the IEEE 34th International Conference on Distributed Computing Systems, 2014
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014
Triple-A: a Non-SSD based autonomic all-flash array for high performance storage systems.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014
Application-aware Memory System for Fair and Efficient Execution of Concurrent GPGPU Applications.
Proceedings of the Seventh Workshop on General Purpose Processing Using GPUs, 2014
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014
2013
Compiler-Directed Energy Reduction Using Dynamic Voltage Scaling and Voltage Islands for Embedded Systems.
IEEE Trans. Computers, 2013
Sci. Program., 2013
Proceedings of the 21st IEEE/IFIP International Conference on VLSI and System-on-Chip, 2013
Proceedings of the ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2013
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2013
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013
Interference Resolver in Shared Storage Systems to Provide Fairness to I/O Intensive Applications.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013
Disk-Cache and Parallelism Aware I/O Scheduling to Improve Storage System Performance.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013
Proceedings of the International Conference on Supercomputing, 2013
Proceedings of the 5th USENIX Workshop on Hot Topics in Storage and File Systems, 2013
Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013
OWL: cooperative thread array aware scheduling techniques for improving GPGPU performance.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013
2012
J. Circuits Syst. Comput., 2012
Proceedings of the 8th International Conference on Virtual Execution Environments, 2012
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2012
Proceedings of the 20th Euromicro International Conference on Parallel, 2012
NANDFlashSim: Intrinsic latency variation aware NAND flash memory system modeling and simulation at microarchitecture level.
Proceedings of the IEEE 28th Symposium on Mass Storage Systems and Technologies, 2012
Proceedings of the Middleware 2012, 2012
Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012
Proceedings of the International Symposium on Low Power Electronics and Design, 2012
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012
Proceedings of the 2012 IEEE 32nd International Conference on Distributed Computing Systems, 2012
Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012
Proceedings of the 4th USENIX Workshop on Hot Topics in Storage and File Systems, 2012
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012
Proceedings of the 49th Annual Design Automation Conference 2012, 2012
Proceedings of the 49th Annual Design Automation Conference 2012, 2012
Performance enhancement under power constraints using heterogeneous CMOS-TFET multicores.
Proceedings of the 10th International Conference on Hardware/Software Codesign and System Synthesis, 2012
Proceedings of the 10th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2012
Proceedings of the Computing Frontiers Conference, CF'12, 2012
Improving the performance of k-means clustering through computation skipping and data locality optimizations.
Proceedings of the Computing Frontiers Conference, CF'12, 2012
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
MROrchestrator: A Fine-Grained Resource Orchestration Framework for MapReduce Clusters.
Proceedings of the 2012 IEEE Fifth International Conference on Cloud Computing, 2012
2011
Trans. High Perform. Embed. Archit. Compil., 2011
SIGMETRICS Perform. Evaluation Rev., 2011
Proceedings of the SIGMETRICS 2011, 2011
Proceedings of the SIGMETRICS 2011, 2011
Proceedings of the Conference on High Performance Computing Networking, 2011
Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011
Proceedings of the 19th International Euromicro Conference on Parallel, 2011
Proceedings of the 19th International Euromicro Conference on Parallel, 2011
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011
Reducing memory interference in multicore systems via application-aware memory channel partitioning.
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011
Proceedings of the 12th International Symposium on Quality Electronic Design, 2011
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2011
Improving energy efficiency of multi-threaded applications using heterogeneous CMOS-TFET multicores.
Proceedings of the 2011 International Symposium on Low Power Electronics and Design, 2011
Proceedings of the 2011 International Conference on Distributed Computing Systems, 2011
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011
Improving shared cache behavior of multithreaded object-oriented applications in multicores.
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011
Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, 2011
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011
Proceedings of the 48th Design Automation Conference, 2011
A helper thread based dynamic cache partitioning scheme for multithreaded applications.
Proceedings of the 48th Design Automation Conference, 2011
Proceedings of the CGO 2011, 2011
Proceedings of the CGO 2011, 2011
Adaptive QoS Decomposition and Control for Storage Cache Management in Multi-server Environments.
Proceedings of the 11th IEEE/ACM International Symposium on Cluster, 2011
APP: Minimizing Interference Using Aggressive Pipelined Prefetching in Multi-level Buffer Caches.
Proceedings of the 11th IEEE/ACM International Symposium on Cluster, 2011
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011
2010
J. Signal Process. Syst., 2010
IET Comput. Digit. Tech., 2010
Proceedings of the Annual IEEE International SoC Conference, SoCC 2010, 2010
Proceedings of the SIGMETRICS 2010, 2010
Proceedings of the Conference on High Performance Computing Networking, 2010
Proceedings of the Recent Advances in the Message Passing Interface, 2010
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010
Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation, 2010
Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010
Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, 2010
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010
T-NUCA - a novel approach to non-uniform access latency cache architectures for 3D CMPs.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010
Proceedings of the 24th International Conference on Supercomputing, 2010
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010
Proceedings of the High Performance Embedded Architectures and Compilers, 2010
Scalable Parallelization Strategies to Accelerate NuFFT Data Translation on Multicores.
Proceedings of the Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31, 2010
Proceedings of the Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31, 2010
A special-purpose compiler for look-up table and code generation for function evaluation.
Proceedings of the Design, Automation and Test in Europe, 2010
Proceedings of the Design, Automation and Test in Europe, 2010
2009
ACM Trans. Embed. Comput. Syst., 2009
Compiler-assisted soft error detection under performance and energy constraints in embedded systems.
ACM Trans. Embed. Comput. Syst., 2009
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2009
IEEE Trans. Computers, 2009
An Automated Framework for Accelerating Numerical Algorithms on Reconfigurable Platforms Using Algorithmic/Architectural Optimization.
IEEE Trans. Computers, 2009
J. Parallel Distributed Comput., 2009
Int. J. Embed. Syst., 2009
Int. J. Distributed Sens. Networks, 2009
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009
Proceedings of the IEEE 7th Symposium on Application Specific Processors, 2009
Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2009
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009
Proceedings of the High Performance Embedded Architectures and Compilers, 2009
Adapting Application Mapping to Systematic Within-Die Process Variations on Chip Multiprocessors.
Proceedings of the High Performance Embedded Architectures and Compilers, 2009
Proceedings of the Euro-Par 2009 Parallel Processing, 2009
Proceedings of the 9th ACM & IEEE International conference on Embedded software, 2009
Using dynamic compilation for continuing execution under reduced memory availability.
Proceedings of the Design, Automation and Test in Europe, 2009
Proceedings of the Design, Automation and Test in Europe, 2009
Proceedings of the Design, Automation and Test in Europe, 2009
Proceedings of the 46th Design Automation Conference, 2009
Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009
Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2009
Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2009
Proceedings of the 2009 International Conference on Compilers, 2009
Topology-Aware I/O Caching for Shared Storage Systems.
Proceedings of the 22nd International Conference on Parallel and Distributed Computing and Communication Systems, 2009
Power Aware Disk Allocation.
Proceedings of the 22nd International Conference on Parallel and Distributed Computing and Communication Systems, 2009
Dynamic Storage Cache Partitioning Using Feedback Control Theory.
Proceedings of the 22nd International Conference on Parallel and Distributed Computing and Communication Systems, 2009
2008
IEEE Trans. Very Large Scale Integr. Syst., 2008
IEEE Trans. Parallel Distributed Syst., 2008
ACM Trans. Design Autom. Electr. Syst., 2008
ACM Trans. Design Autom. Electr. Syst., 2008
ACM SIGOPS Oper. Syst. Rev., 2008
Capturing and optimizing the interactions between prefetching and cache line turnoff.
Microprocess. Microsystems, 2008
J. Softw., 2008
Implementation and evaluation of a migration-based NUCA design for chip multiprocessors.
Proceedings of the 2008 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2008
Proceedings of the 2008 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2008
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008
Enhancing the performance of MPI-IO applications by overlapping I/O, computation and communication.
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008
Proceedings of the 9th International Symposium on Quality of Electronic Design (ISQED 2008), 2008
Evaluating the role of scratchpad memories in chip multiprocessors for sparse matrix computations.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Improving I/O performance through compiler-directed code restructuring and adaptive prefetching.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
A helper thread based EDP reduction scheme for adapting application execution in CMPs.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Proceedings of the 26th International Conference on Computer Design, 2008
Proceedings of the 2008 International Conference on Computer-Aided Design, 2008
Integrated code and data placement in two-dimensional mesh based chip multiprocessors.
Proceedings of the 2008 International Conference on Computer-Aided Design, 2008
Improving I/O Performance of Applications through Compiler-Directed Code Restructuring.
Proceedings of the 6th USENIX Conference on File and Storage Technologies, 2008
Proceedings of the 45th Design Automation Conference, 2008
Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, 2008
A Systematic Approach to Automatically Generate Multiple Semantically Equivalent Program Versions.
Proceedings of the Reliable Software Technologies, 2008
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008
2007
IEEE Trans. Syst. Man Cybern. Part C, 2007
IEEE Trans. Parallel Distributed Syst., 2007
Reducing energy consumption of parallel sparse matrix applications through integrated link/CPU voltage scaling.
J. Supercomput., 2007
Trans. High Perform. Embed. Archit. Compil., 2007
Trans. High Perform. Embed. Archit. Compil., 2007
Solving the Register Allocation Problem for Embedded Systems Using a Hybrid Evolutionary Algorithm.
IEEE Trans. Evol. Comput., 2007
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2007
IET Comput. Digit. Tech., 2007
Proceedings of the 20th International Conference on VLSI Design (VLSI Design 2007), 2007
Proceedings of the 20th International Conference on VLSI Design (VLSI Design 2007), 2007
Proceedings of the 20th International Conference on VLSI Design (VLSI Design 2007), 2007
Enhancing Locality in Two-Dimensional Space through Integrated Computation and Data Mappings.
Proceedings of the 20th International Conference on VLSI Design (VLSI Design 2007), 2007
Proceedings of the Fourth International IEEE Security in Storage Workshop, 2007
Proceedings of the IEEE Workshop on Signal Processing Systems, 2007
Proceedings of the 2007 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2007
Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, 2007
Proceedings of the 2007 ACM SIGPLAN/SIGBED Conference on Languages, 2007
Proceedings of the 2007 International Symposium on Low Power Electronics and Design, 2007
Proceedings of the 2007 International Symposium on Low Power Electronics and Design, 2007
Proceedings of the 2007 International Symposium on Low Power Electronics and Design, 2007
Improving MPI Independent Write Performance Using A Two-Stage Write-Behind Buffering Method.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007
Proceedings of the 2007 International Conference on Computer-Aided Design, 2007
Proceedings of the FPL 2007, 2007
Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007
Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007
Proceedings of the 44th Design Automation Conference, 2007
Reducing Off-Chip Memory Access Costs Using Data Recomputation in Embedded Chip Multi-processors.
Proceedings of the 44th Design Automation Conference, 2007
Proceedings of the 2007 IEEE International Conference on Cluster Computing, 2007
Proceedings of the Fifth International Symposium on Code Generation and Optimization (CGO 2007), 2007
Integrated Data Reorganization and Disk Mapping for Reducing Disk Energy Consumption.
Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), 2007
Lightweight barrier-based parallelization support for non-cache-coherent MPSoC platforms.
Proceedings of the 2007 International Conference on Compilers, 2007
Proceedings of the Regarding the Intelligence in Distributed Intelligent Systems, 2007
Proceedings of the Regarding the Intelligence in Distributed Intelligent Systems, 2007
Reducing Energy Consumption of On-Chip Networks Through a Hybrid Compiler-Runtime Approach.
Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007
Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007
2006
Estimating and reducing the memory requirements of signal processing codes for embedded systems.
IEEE Trans. Signal Process., 2006
ACM Trans. Storage, 2006
ACM Trans. Design Autom. Electr. Syst., 2006
Reducing energy consumption of multiprocessor SoC architectures by exploiting memory bank locality.
ACM Trans. Design Autom. Electr. Syst., 2006
ACM Trans. Embed. Comput. Syst., 2006
ACM Trans. Embed. Comput. Syst., 2006
Reducing memory energy consumption of embedded applications that process dynamically allocated data.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2006
J. Syst. Archit., 2006
Int. J. Distributed Sens. Networks, 2006
Proceedings of the 2006 IEEE International SOC Conference, Austin, Texas, USA, 2006
Energy-Aware Code Replication for Improving Reliability in Embedded Chip Multiprocessors.
Proceedings of the 2006 IEEE International SOC Conference, Austin, Texas, USA, 2006
Proceedings of the 2006 IEEE International SOC Conference, Austin, Texas, USA, 2006
Proceedings of the 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2006
Proceedings of the ACM SIGPLAN 2006 Conference on Programming Language Design and Implementation, 2006
Proceedings of the 2006 ACM SIGPLAN/SIGBED Conference on Languages, 2006
An Integer Linear Programming Based Approach to Simultaneous Memory Space Partitioning and Data Allocation for Chip Multiprocessors.
Proceedings of the 2006 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2006), 2006
Proceedings of the 2006 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2006), 2006
Reducing Memory Requirements through Task Recomputation in Embedded Multi-CPU Systems.
Proceedings of the 2006 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2006), 2006
Proceedings of the 2006 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2006), 2006
Proceedings of the 2006 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2006), 2006
Proceedings of the 7th International Symposium on Quality of Electronic Design (ISQED 2006), 2006
Proceedings of the 7th International Symposium on Quality of Electronic Design (ISQED 2006), 2006
Proceedings of the 7th International Symposium on Quality of Electronic Design (ISQED 2006), 2006
Proceedings of the 2006 International Symposium on Low Power Electronics and Design, 2006
Proceedings of the 2006 International Symposium on Low Power Electronics and Design, 2006
Proceedings of the Computer and Information Sciences, 2006
Proceedings of the 33rd International Symposium on Computer Architecture (ISCA 2006), 2006
Integrated link/CPU voltage scaling for reducing energy consumption of parallel sparse matrix applications.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006
Proceedings of the 12th International Conference on Parallel and Distributed Systems, 2006
Proceedings of the 12th International Conference on Parallel and Distributed Systems, 2006
Proceedings of the 2006 International Conference on Computer-Aided Design, 2006
Proceedings of the 16th ACM Great Lakes Symposium on VLSI 2006, Philadelphia, PA, USA, April 30, 2006
Selective code/data migration for reducing communication energy in embedded MpSoC architectures.
Proceedings of the 16th ACM Great Lakes Symposium on VLSI 2006, Philadelphia, PA, USA, April 30, 2006
Proceedings of the 14th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2006), 2006
Proceedings of the 2006 International Conference on Dependable Systems and Networks (DSN 2006), 2006
Dynamic partitioning of processing and memory resources in embedded MPSoC architectures.
Proceedings of the Conference on Design, Automation and Test in Europe, 2006
Proceedings of the Conference on Design, Automation and Test in Europe, 2006
Proceedings of the Conference on Design, Automation and Test in Europe, 2006
Proceedings of the 43rd Design Automation Conference, 2006
A Compiler-Guided Approach for Reducing Disk Power Consumption by Exploiting Disk Access Locality.
Proceedings of the Fourth IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2006), 2006
Proceedings of the Third Conference on Computing Frontiers, 2006
Proceedings of the Third Conference on Computing Frontiers, 2006
Using Task Recomputation During Application Mapping in Parallel Embedded Architectures.
Proceedings of the 2006 International Conference on Computer Design & Conference on Computing in Nanotechnology, 2006
Proceedings of the 2006 Conference on Asia South Pacific Design Automation: ASP-DAC 2006, 2006
Proceedings of the 2006 Conference on Asia South Pacific Design Automation: ASP-DAC 2006, 2006
Proceedings of the 2006 Conference on Asia South Pacific Design Automation: ASP-DAC 2006, 2006
Compiler-Guided data compression for reducing memory consumption of embedded applications.
Proceedings of the 2006 Conference on Asia South Pacific Design Automation: ASP-DAC 2006, 2006
Proceedings of the 2006 Conference on Asia South Pacific Design Automation: ASP-DAC 2006, 2006
Proceedings of the 2006 Conference on Asia South Pacific Design Automation: ASP-DAC 2006, 2006
Proceedings of the 2006 Conference on Asia South Pacific Design Automation: ASP-DAC 2006, 2006
Energy-aware computation duplication for improving reliability in embedded chip multiprocessors.
Proceedings of the 2006 Conference on Asia South Pacific Design Automation: ASP-DAC 2006, 2006
Proceedings of the Reliable Software Technologies, 2006
2005
IEEE Trans. Very Large Scale Integr. Syst., 2005
IEEE Trans. Very Large Scale Integr. Syst., 2005
IEEE Trans. Parallel Distributed Syst., 2005
ACM Trans. Design Autom. Electr. Syst., 2005
ACM Trans. Embed. Comput. Syst., 2005
ACM Trans. Embed. Comput. Syst., 2005
ACM Trans. Embed. Comput. Syst., 2005
ACM Trans. Embed. Comput. Syst., 2005
IEEE Trans. Computers, 2005
Improving whole-program locality using intra-procedural and inter-procedural transformations.
J. Parallel Distributed Comput., 2005
J. Parallel Distributed Comput., 2005
Processor-embedded distributed smart disks for I/O-intensive workloads: architectures, performance models and evaluation.
J. Parallel Distributed Comput., 2005
Int. J. Embed. Syst., 2005
Int. J. Embed. Syst., 2005
Exploiting frequent field values in java objects for reducing heap memory requirements.
Proceedings of the 1st International Conference on Virtual Execution Environments, 2005
Proceedings of the 2005 IEEE International SOC Conference, 2005
On-Chip Memory Management for Embedded MpSoC Architectures Based on Data Compression.
Proceedings of the 2005 IEEE International SOC Conference, 2005
Proceedings of the 2005 IEEE International SOC Conference, 2005
Proceedings of the Static Analysis, 12th International Symposium, 2005
Proceedings of the 11th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS 2005), 2005
Exposing disk layout to compiler for reducing energy consumption of parallel disk based systems.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2005
Fault Recovery Designs for Processor-Embedded Distributed Storage Architectures with I/O-Intensive DB Workloads.
Proceedings of the 22nd IEEE / 13th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST 2005), 2005
Proceedings of the 2005 ACM SIGPLAN/SIGBED Conference on Languages, 2005
Proceedings of the Languages and Compilers for Parallel Computing, 2005
Proceedings of the 2005 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2005), 2005
Proceedings of the 2005 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2005), 2005
Exploiting Inter-Processor Data Sharing for Improving Behavior of Multi-Processor SoCs.
Proceedings of the 2005 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2005), 2005
Proceedings of the 6th International Symposium on Quality of Electronic Design (ISQED 2005), 2005
Proceedings of the 6th International Symposium on Quality of Electronic Design (ISQED 2005), 2005
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005
Proceedings of the 2005 International Symposium on Low Power Electronics and Design, 2005
Proceedings of the 2005 International Symposium on Low Power Electronics and Design, 2005
Proceedings of the 2005 International Symposium on Low Power Electronics and Design, 2005
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005
Reliability-Conscious Process Scheduling under Performance Constraints in FPGA-Based Embedded Systems.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005
Proceedings of the 19th Annual International Conference on Supercomputing, 2005
Proceedings of the 23rd International Conference on Computer Design (ICCD 2005), 2005
Improving scratch-pad memory reliability through compiler-guided data block duplication.
Proceedings of the 2005 International Conference on Computer-Aided Design, 2005
Compiler-directed voltage scaling on communication links for reducing power consumption.
Proceedings of the 2005 International Conference on Computer-Aided Design, 2005
Proceedings of the 2005 International Conference on Computer-Aided Design, 2005
Integrating loop and data optimizations for locality within a constraint network based framework.
Proceedings of the 2005 International Conference on Computer-Aided Design, 2005
Proceedings of the 2005 International Conference on Computer-Aided Design, 2005
Proceedings of the 2005 International Conference on Computer-Aided Design, 2005
Proceedings of the 15th ACM Great Lakes Symposium on VLSI 2005, 2005
Proceedings of the 15th ACM Great Lakes Symposium on VLSI 2005, 2005
Proceedings of the 15th ACM Great Lakes Symposium on VLSI 2005, 2005
Proceedings of the 15th ACM Great Lakes Symposium on VLSI 2005, 2005
Proceedings of the EMSOFT 2005, 2005
Proceedings of the EMSOFT 2005, 2005
Proceedings of the 2005 International Conference on Dependable Systems and Networks (DSN 2005), 28 June, 2005
Proceedings of the 2005 Design, 2005
Proceedings of the 2005 Design, 2005
Proceedings of the 2005 Design, 2005
Proceedings of the 2005 Design, 2005
Proceedings of the 2005 Design, 2005
Proceedings of the 2005 Design, 2005
Proceedings of the 2005 Design, 2005
Proceedings of the 2005 Design, 2005
Locality-conscious workload assignment for array-based computations in MPSOC architectures.
Proceedings of the 42nd Design Automation Conference, 2005
Proceedings of the 42nd Design Automation Conference, 2005
Increasing on-chip memory space utilization for embedded chip multiprocessors through data compression.
Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2005
Proceedings of the 3rd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2005), 2005
Proceedings of the Compiler Construction, 14th International Conference, 2005
Proceedings of the 2005 International Conference on Compilers, 2005
Proceedings of the 2005 International Conference on Compilers, 2005
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005
FD-HGAC: a hybrid heuristic/genetic algorithm hardware/software co-synthesis framework with fault detection.
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005
2004
IEEE Trans. Very Large Scale Integr. Syst., 2004
IEEE Trans. Parallel Distributed Syst., 2004
IEEE Trans. Parallel Distributed Syst., 2004
Studying Energy Trade Offs in Offloading Computation/Compilation in Java-Enabled Mobile Devices.
IEEE Trans. Parallel Distributed Syst., 2004
A compiler-based approach for dynamically managing scratch-pad memories in embedded systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2004
IEEE Trans. Computers, 2004
ACM Trans. Archit. Code Optim., 2004
Des. Autom. Embed. Syst., 2004
Proceedings of the 12th Euromicro Workshop on Parallel, 2004
Proceedings of the 2004 ACM SIGPLAN/SIGBED Conference on Languages, 2004
Proceedings of the Languages and Compilers for High Performance Computing, 2004
Proceedings of the 4th International Symposium on Memory Management, 2004
Proceedings of the 2004 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2004), 2004
Proceedings of the 2004 IEEE International Symposium on Performance Analysis of Systems and Software, 2004
Proceedings of the 2004 International Symposium on Low Power Electronics and Design, 2004
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004
Improving Memory Performance of Embedded Java Applications by Dynamic Layout Modifications.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004
Proceedings of the 8th International Database Engineering and Applications Symposium (IDEAS 2004), 2004
Proceedings of the 2004 International Conference on Computer-Aided Design, 2004
Proceedings of the 2004 International Conference on Computer-Aided Design, 2004
Proceedings of the 10th International Conference on High-Performance Computer Architecture (HPCA-10 2004), 2004
Proceedings of the 14th ACM Great Lakes Symposium on VLSI 2004, 2004
Proceedings of the Field Programmable Logic and Application, 2004
Proceedings of the ACM/SIGDA 12th International Symposium on Field Programmable Gate Arrays, 2004
Proceedings of the Evolutionary Computation in Combinatorial Optimization, 2004
Proceedings of the Euro-Par 2004 Parallel Processing, 2004
Proceedings of the Euro-Par 2004 Parallel Processing, 2004
Proceedings of the Euro-Par 2004 Parallel Processing, 2004
Data Windows: A Data-Centric Approach for Query Execution in Memory-Resident Databases.
Proceedings of the 2004 Design, 2004
Proceedings of the 2004 Design, 2004
Proceedings of the 2004 Design, 2004
Exploiting Processor Workload Heterogeneity for Reducing Energy Consumption in Chip Multiprocessors.
Proceedings of the 2004 Design, 2004
Tuning In-Sensor Data Filtering to Reduce Energy Consumption in Wireless Sensor Networks.
Proceedings of the 2004 Design, 2004
Proceedings of the 2004 Design, 2004
Proceedings of the 2004 Design, 2004
Proceedings of the 41st Design Automation Conference, 2004
Proceedings of the 41st Design Automation Conference, 2004
Proceedings of the 2nd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2004
Proceedings of the 2nd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2004
Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management, 2004
Proceedings of the 2004 International Conference on Compilers, 2004
Proceedings of the 2004 International Conference on Compilers, 2004
Proceedings of the Ultra Low-Power Electronics and Design, 2004
2003
A high-performance application data environment for large-scale scientific computations.
IEEE Trans. Parallel Distributed Syst., 2003
Reducing False Sharing and Improving Spatial Locality in a Unified Compilation Framework.
IEEE Trans. Parallel Distributed Syst., 2003
ACM Trans. Embed. Comput. Syst., 2003
Evaluating Integrated Hardware-Software Optimizations Using a Unified Energy Estimation Framework.
IEEE Trans. Computers, 2003
Loop Transformations for Reducing Data Space Requirements of Resource-Constrained Applications.
Proceedings of the Static Analysis, 10th International Symposium, 2003
Proceedings of the 2003 ACM SIGPLAN Conference on Object-Oriented Programming Systems, 2003
Proceedings of the 2003 Conference on Languages, 2003
Proceedings of the Languages and Compilers for Parallel Computing, 2003
Proceedings of the 2003 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2003), 2003
Interplay of energy and performance for disk arrays running transaction processing workloads.
Proceedings of the 2003 IEEE International Symposium on Performance Analysis of Systems and Software, 2003
Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003
Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003
Exploiting program hotspots and code sequentiality for instruction cache leakage management.
Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003
Proceedings of the 30th International Symposium on Computer Architecture (ISCA 2003), 2003
Energy and Performance Considerations in Work Partitioning for Mobile Spatial Queries.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003
Proceedings of the 17th Annual International Conference on Supercomputing, 2003
Proceedings of the 21st International Conference on Computer Design (ICCD 2003), 2003
Proceedings of the 2003 International Conference on Computer-Aided Design, 2003
Proceedings of the 2003 International Conference on Computer-Aided Design, 2003
An Energy-Oriented Evaluation of Communication Optimizations for Microsensor Networks.
Proceedings of the Euro-Par 2003. Parallel Processing, 2003
Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors.
Proceedings of the Euro-Par 2003. Parallel Processing, 2003
Energy-Conscious Memory Allocation and Deallocation for Pointer-Intensive Applications.
Proceedings of the Embedded Software, Third International Conference, 2003
Proceedings of the 2003 International Conference on Dependable Systems and Networks (DSN 2003), 2003
CCC: Crossbar Connected Caches for Reducing Energy Consumption of On-Chip Multiprocessors.
Proceedings of the 2003 Euromicro Symposium on Digital Systems Design (DSD 2003), 2003
Proceedings of the 2003 Euromicro Symposium on Digital Systems Design (DSD 2003), 2003
Proceedings of the 2003 Design, 2003
Proceedings of the 2003 Design, 2003
Proceedings of the 2003 Design, 2003
Proceedings of the 2003 Design, 2003
Implementation and Evaluation of an On-Demand Parameter-Passing Strategy for Reducing Energy.
Proceedings of the 2003 Design, 2003
Proceedings of the 2003 Design, 2003
Interprocedural optimizations for improving data cache performance of array-intensive embedded applications.
Proceedings of the 40th Design Automation Conference, 2003
Proceedings of the 1st IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2003
Proceedings of the 1st IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2003
Proceedings of the Compiler Construction, 12th International Conference, 2003
Proceedings of the International Conference on Compilers, 2003
Proceedings of the International Conference on Compilers, 2003
Proceedings of the Embedded Software for SoC, 2003
Proceedings of the Embedded Software for SoC, 2003
2002
VLDB J., 2002
IEEE Trans. Parallel Distributed Syst., 2002
Tuning garbage collection for reducing memory system energy in an embedded java environment.
ACM Trans. Embed. Comput. Syst., 2002
J. Circuits Syst. Comput., 2002
Proceedings of the 7th Asia and South Pacific Design Automation Conference (ASP-DAC 2002), 2002
Proceedings of the 7th Asia and South Pacific Design Automation Conference (ASP-DAC 2002), 2002
Proceedings of the 7th Asia and South Pacific Design Automation Conference (ASP-DAC 2002), 2002
Proceedings of the 7th Asia and South Pacific Design Automation Conference (ASP-DAC 2002), 2002
Proceedings of the 35th Annual International Symposium on Microarchitecture, 2002
Proceedings of the 35th Annual International Symposium on Microarchitecture, 2002
Proceedings of the 35th Annual International Symposium on Microarchitecture, 2002
Proceedings of the 2002 Joint Conference on Languages, 2002
Proceedings of the 2002 Joint Conference on Languages, 2002
A Hybrid Strategy Based on Data Distribution and Migration for Optimizing Memory Locality.
Proceedings of the Languages and Compilers for Parallel Computing, 15th Workshop, 2002
Proceedings of the 2nd Java Virtual Machine Research and Technology Symposium, 2002
Proceedings of the 2002 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2002), 2002
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002
Proceedings of the 2002 IEEE/ACM International Conference on Computer-aided Design, 2002
Using Complete Machine Simulation for Software Power Estimation: The SoftWatt Approach.
Proceedings of the Eighth International Symposium on High-Performance Computer Architecture (HPCA'02), 2002
Proceedings of the Eighth International Symposium on High-Performance Computer Architecture (HPCA'02), 2002
Proceedings of the FAST '02 Conference on File and Storage Technologies, 2002
Proceedings of the Programming Languages and Systems, 2002
Proceedings of the Embedded Software, Second International Conference, 2002
Proceedings of the 2002 Design, 2002
Proceedings of the 2002 Design, 2002
Proceedings of the 2002 Design, 2002
Automatic data migration for reducing energy consumption in multi-bank memory systems.
Proceedings of the 39th Design Automation Conference, 2002
Proceedings of the 39th Design Automation Conference, 2002
Proceedings of the 39th Design Automation Conference, 2002
An integer linear programming based approach for parallelizing applications in on-chip multiprocessors.
Proceedings of the 39th Design Automation Conference, 2002
Proceedings of the 39th Design Automation Conference, 2002
Proceedings of the 39th Design Automation Conference, 2002
Proceedings of the Tenth International Symposium on Hardware/Software Codesign, 2002
Proceedings of the Tenth International Symposium on Hardware/Software Codesign, 2002
Kernel-Level Caching for Optimizing I/O by Exploiting Inter-Application Data Sharing.
Proceedings of the 2002 IEEE International Conference on Cluster Computing (CLUSTER 2002), 2002
Proceedings of the Compiler Construction, 11th International Conference, 2002
Proceedings of the International Conference on Compilers, 2002
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques (PACT 2002), 2002
The Compiler Design Handbook: Optimizations and Machine Code Generation, 2002
2001
Investigating Memory System Energy Behavior Using Software and Hardware Optimizations.
VLSI Design, 2001
IEEE Trans. Very Large Scale Integr. Syst., 2001
IEEE Trans. Parallel Distributed Syst., 2001
IEEE Trans. Computers, 2001
IEEE Trans. Computers, 2001
IEEE Trans. Computers, 2001
J. Parallel Distributed Comput., 2001
Proceedings of the 14th International Conference on VLSI Design (VLSI Design 2001), 2001
Formulation and Validation of an Energy Dissipation Model for the Clock Generation Circuitry and Distribution Networks.
Proceedings of the 14th International Conference on VLSI Design (VLSI Design 2001), 2001
Proceedings of the VLDB 2001, 2001
Proceedings of the 2001 ACM Symposium on Applied Computing (SAC), 2001
Proceedings of the Conference Record of POPL 2001: The 28th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2001
Proceedings of the 2001 ACM SIGPLAN Workshop on Optimization of Middleware and Distributed Systems, 2001
Proceedings of the 2001 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis For Software Tools and Engineering, 2001
Proceedings of the 34th Annual International Symposium on Microarchitecture, 2001
Improving Off-Chip Memory Energy Behavior in a Multi-processor, Multi-bank Environment.
Proceedings of the Languages and Compilers for Parallel Computing, 2001
Proceedings of the 1st Java Virtual Machine Research and Technology Symposium, 2001
Proceedings of the 14th International Symposium on Systems Synthesis, 2001
Proceedings of the 2001 International Symposium on Low Power Electronics and Design, 2001
Proceedings of the 2001 International Symposium on Low Power Electronics and Design, 2001
Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001
Proceedings of the 19th International Conference on Computer Design (ICCD 2001), 2001
Proceedings of the 19th International Conference on Computer Design (ICCD 2001), 2001
Proceedings of the 2001 IEEE/ACM International Conference on Computer-Aided Design, 2001
Proceedings of the Seventh International Symposium on High-Performance Computer Architecture (HPCA'01), 2001
Proceedings of the 38th Design Automation Conference, 2001
Proceedings of the 38th Design Automation Conference, 2001
Proceedings of the Ninth International Symposium on Hardware/Software Codesign, 2001
Proceedings of the Compiler Construction, 10th International Conference, 2001
Proceedings of the 2001 International Conference on Compilers, 2001
2000
A Unified Framework for Optimizing Locality, Parallelism, and Communication in Out-of-Core Computations.
IEEE Trans. Parallel Distributed Syst., 2000
IEEE Trans. Parallel Distributed Syst., 2000
Compiler Algorithms for Optimizing Locality and Parallelism on Shared and Distributed-Memory Machines.
J. Parallel Distributed Comput., 2000
Data management for large-scale scientific computations in high performance distributed systems.
Clust. Comput., 2000
Proceedings of the Integrated Circuit Design, 2000
APRIL: A Run-Time Library for Tape-Resident Data.
Proceedings of the Eighth NASA Goddard Space Flight Center Conference on Mass Storage Systems and Technologies in cooperation with Seventeenth IEEE Symposium on Mass Storage Systems, 2000
Proceedings of the Languages, 2000
Proceedings of the Languages and Compilers for Parallel Computing, 2000
Proceedings of the Languages and Compilers for Parallel Computing, 2000
Proceedings of the 2000 International Symposium on Low Power Electronics and Design, 2000
Proceedings of the 27th International Symposium on Computer Architecture (ISCA 2000), 2000
Proceedings of the 14th International Conference on Supercomputing, 2000
Proceedings of the 2000 International Conference on Parallel Processing, 2000
Proceedings of the High Performance Computing, 2000
Proceedings of the High Performance Computing, 2000
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000
Proceedings of the 37th Conference on Design Automation, 2000
Proceedings of the 2000 International Conference on Compilers, 2000
1999
IEEE Trans. Parallel Distributed Syst., 1999
A global communication optimization technique based on data-flow analysis and linear algebra.
ACM Trans. Program. Lang. Syst., 1999
IEEE Trans. Computers, 1999
J. Parallel Distributed Comput., 1999
Improving Locality Using a Graph-Based Technique for Detecting Memory Layouts of Arrays.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999
Proceedings of the 3rd IEEE Metadata Conference 1999, MD 1999, Bethesda, 1999
A Graph Based Framework to Detect Optimal Memory Layouts for Improving Data Locality.
Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999
Proceedings of the 13th International Conference on Supercomputing, 1999
A Framework for Interprocedural Locality Optimization Using Both Loop and Data Layout Transformations.
Proceedings of the International Conference on Parallel Processing 1999, 1999
Proceedings of the International Conference on Parallel Processing 1999, 1999
Data Management for Large-Scale Scientific Computations in High Performance Distributed Systems.
Proceedings of the Eighth IEEE International Symposium on High Performance Distributed Computing, 1999
Proceedings of the High-Performance Computing and Networking, 7th International Conference, 1999
Proceedings of the Euro-Par '99 Parallel Processing, 5th International Euro-Par Conference, Toulouse, France, August 31, 1999
Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques, 1999
1998
J. Inf. Sci. Eng., 1998
An Experimental Study to Analyze and Optimize Hartree-Fock Application's I/O with Passion.
Int. J. High Perform. Comput. Appl., 1998
Proceedings of the 31st Annual IEEE/ACM International Symposium on Microarchitecture, 1998
Proceedings of the Languages, 1998
A Loop Transformation Algorithm Based on Explicit Data Layout Representation for Optimizing Locality.
Proceedings of the Languages and Compilers for Parallel Computing, 1998
Proceedings of the 12th International Parallel Processing Symposium / 9th Symposium on Parallel and Distributed Processing (IPPS/SPDP '98), March 30, 1998
Proceedings of the 12th International Conference on Supercomputing, 1998
Performance Implications of Architectural and Software Techniques on I/O-Intensive Applications.
Proceedings of the 1998 International Conference on Parallel Processing (ICPP '98), 1998
Proceedings of the Euro-Par '98 Parallel Processing, 1998
Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques, 1998
1997
Proceedings of the ACM/IEEE Conference on Supercomputing, 1997
I/O Optimizations for Compiling Out-of-Core Programs on Distributed-Memory Machines.
Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997
Data Access Reorganizations in Compiling Out-of-Core Data Parallel Programs on Distributed Memory Machines.
Proceedings of the 11th International Parallel Processing Symposium (IPPS '97), 1997
A Unified Compiler Algorithm for Optimizing Locality, Parallelism and Communication in Out-of-Core Computations.
Proceedings of the Fifth Workshop on I/O in Parallel and Distributed Systems, 1997
Proceedings of the 11th International Conference on Supercomputing, 1997
Proceedings of the 1997 International Conference on Parallel Processing (ICPP '97), 1997
Proceedings of the Fourth International Conference on High-Performance Computing, 1997
Proceedings of the Euro-Par '97 Parallel Processing, 1997