Josué Feliu

Orcid: 0000-0003-3017-4266

According to our database, Josué Feliu authored at least 29 papers between 2012 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
SYNPA: SMT Performance Analysis and Allocation of Threads to Cores in ARM Processors.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

2023
Speculative inter-thread store-to-load forwarding in SMT architectures.
J. Parallel Distributed Comput., March, 2023

Cloud White: Detecting and Estimating QoS Degradation of Latency-Critical Workloads in the Public Cloud.
Future Gener. Comput. Syst., 2023

Rebasing Microarchitectural Research with Industry Traces.
Proceedings of the IEEE International Symposium on Workload Characterization, 2023

CELLO: Compiler-Assisted Efficient Load-Load Ordering in Data-Race-Free Regions.
Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023

Thread-to-Core Allocation in ARM Processors Building Synergistic Pairs.
Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023

2022
DeepP: Deep Learning Multi-Program Prefetch Configuration for the IBM POWER 8.
IEEE Trans. Computers, 2022

VMT: Virtualized Multi-Threading for Accelerating Graph Workloads on Commodity Processors.
IEEE Trans. Computers, 2022

The Forward Slice Core: A High-Performance, Yet Low-Complexity Microarchitecture.
ACM Trans. Archit. Code Optim., 2022

Effect of Hyper-Threading in Latency-Critical Multithreaded Cloud Applications and Utilization Analysis of the Major System Resources.
Future Gener. Comput. Syst., 2022

A Neural Network to Estimate Isolated Performance from Multi-Program Execution.
Proceedings of the 30th Euromicro International Conference on Parallel, 2022

2021
ITSLF: Inter-Thread Store-to-Load Forwarding in Simultaneous Multithreading.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

2020
Bandwidth-Aware Dynamic Prefetch Configuration for IBM POWER8.
IEEE Trans. Parallel Distributed Syst., 2020

Thread Isolation to Improve Symbiotic Scheduling on SMT Multicore Processors.
IEEE Trans. Parallel Distributed Syst., 2020

Understanding Cloud Workloads Performance in a Production like Environment.
CoRR, 2020

The Forward Slice Core Microarchitecture.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2019
Precise Runahead Execution.
IEEE Comput. Archit. Lett., 2019

2018
Designing lab sessions focusing on real processors for computer architecture courses: A practical perspective.
J. Parallel Distributed Comput., 2018

A Workload Generator for Evaluating SMT Real-Time Systems.
Proceedings of the 2018 International Conference on High Performance Computing & Simulation, 2018

2017
Improving IBM POWER8 Performance Through Symbiotic Job Scheduling.
IEEE Trans. Parallel Distributed Syst., 2017

Perf&Fair: A Progress-Aware Scheduler to Enhance Performance and Fairness in SMT Multicores.
IEEE Trans. Computers, 2017

2016
Bandwidth-Aware On-Line Scheduling in SMT Multicores.
IEEE Trans. Computers, 2016

Symbiotic job scheduling on the IBM POWER8.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

2015
Addressing Fairness in SMT Multicores with a Progress-Aware Scheduler.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

2014
Cache-Hierarchy Contention-Aware Scheduling in CMPs.
IEEE Trans. Parallel Distributed Syst., 2014

Addressing bandwidth contention in SMT multicores through scheduling.
Proceedings of the 2014 International Conference on Supercomputing, 2014

2013
Using Huge Pages and Performance Counters to Determine the LLC Architecture.
Proceedings of the International Conference on Computational Science, 2013

L1-bandwidth aware thread allocation in multicore SMT processors.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
Understanding Cache Hierarchy Contention in CMPs to Improve Job Scheduling.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

