Márcio Castro
Orcid: 0000-0002-9992-8540
Affiliations:
- Federal University of Santa Catarina (UFSC), Informatics and Statistics Department
According to our database, Márcio Castro authored at least 56 papers between 2005 and 2025.
Links
Online presence:
on orcid.org
on inf.ufsc.br
on dl.acm.org
On csauthors.net:
Dynamic Load Balancing in Kubernetes Environments With Kubernetes Scheduling Extension (KSE).
Concurr. Comput. Pract. Exp., February, 2025
Improving edge AI for industrial IoT applications with distributed learning using consensus.
Des. Autom. Embed. Syst., March, 2024
Enabling the execution of HPC applications on public clouds with HPC@Cloud toolkit.
Concurr. Comput. Pract. Exp., 2024
Improving concurrency and memory usage in distributed operating systems for lightweight manycores via cooperative time-sharing lightweight tasks.
J. Parallel Distributed Comput., April, 2023
LWMPI: An MPI library for NoC-based lightweight manycore processors with on-chip memory constraints.
Concurr. Comput. Pract. Exp., 2023
A Performance Comparison of HPC Workloads on Traditional and Cloud-Based HPC Clusters.
Proceedings of the International Symposium on Computer Architecture and High Performance Computing Workshops, 2023
Proceedings of the International Symposium on Computer Architecture and High Performance Computing Workshops, 2023
Proceedings of the XII Brazilian Symposium on Computing Systems Engineering, 2022
Strategies for Fault-Tolerant Tightly-Coupled HPC Workloads Running on Low-Budget Spot Cloud Infrastructures.
Proceedings of the 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2022
ARTful: A model for user-defined schedulers targeting multiple high-performance computing runtime systems.
Softw. Pract. Exp., 2021
Dynamic power management under the RUN scheduling algorithm: a slack filling approach.
Real Time Syst., 2021
Inter-kernel communication facility of a distributed operating system for NoC-based lightweight manycores.
J. Parallel Distributed Comput., 2021
PackStealLB: A scalable distributed load balancer based on work stealing and workload discretization.
J. Parallel Distributed Comput., 2021
Co-Designing Clusters of Lightweight Manycores and Asymmetric Operating System Kernels.
IEEE Embed. Syst. Lett., 2021
A Task-based Execution Engine for Distributed Operating Systems Tailored to Lightweight Manycores with Limited On-Chip Memory.
Proceedings of the 33rd IEEE International Symposium on Computer Architecture and High Performance Computing, 2021
A trace-driven methodology to evaluate and optimize memory management services of distributed operating systems for lightweight manycores.
Proceedings of the SAC '21: The 36th ACM/SIGAPP Symposium on Applied Computing, 2021
Adaptive Load Balancing based on Machine Learning for Iterative Parallel Applications.
Proceedings of the 28th Euromicro International Conference on Parallel, 2020
Real-time video denoising on multicores and GPUs with Kalman-based and Bilateral filters fusion.
J. Real Time Image Process., 2019
Foreword to the special issue of the workshop on high performance computing systems (XVIII Simpósio em Sistemas Computacionais de Alto Desempenho, WSCAD 2017).
Concurr. Comput. Pract. Exp., 2019
Concurr. Comput. Pract. Exp., 2019
On the Performance and Isolation of Asymmetric Microkernel Design for Lightweight Manycores.
Proceedings of the IX Brazilian Symposium on Computing Systems Engineering, 2019
Distributed Memory Graph Representation for Load Balancing Data: Accelerating Data Structure Generation for Decentralized Scheduling.
Proceedings of the 17th International Conference on High Performance Computing & Simulation, 2019
Proceedings of the Symposium on High Performance Computing Systems, 2018
Proceedings of the 30th International Symposium on Computer Architecture and High Performance Computing, 2018
Proceedings of the Euro-Par 2018: Parallel Processing, 2018
CAP Bench: a benchmark suite for performance and energy evaluation of low-power many-core processors.
Concurr. Comput. Pract. Exp., 2017
Design methodology for workload-aware loop scheduling strategies based on genetic algorithm and simulation.
Concurr. Comput. Pract. Exp., 2017
Proceedings of the VII Brazilian Symposium on Computing Systems Engineering, 2017
Towards the Use of LITMUS^RT as a Testbed for Multiprocessor Scheduling in Energy Harvesting Real-Time Systems.
Proceedings of the VII Brazilian Symposium on Computing Systems Engineering, 2017
Proceedings of the 2017 International Symposium on Computer Architecture and High Performance Computing Workshops, 2017
Extending OpenACC for Efficient Stencil Code Generation and Execution by Skeleton Frameworks.
Proceedings of the 2017 International Conference on High Performance Computing & Simulation, 2017
Proceedings of the International Conference on Computational Science, 2017
Proceedings of the International Conference on Computational Science, 2017
Performance Improvement of Stencil Computations for Multi-core Architectures based on Machine Learning.
Proceedings of the International Conference on Computational Science, 2017
Proceedings of the 30th IEEE International Symposium on Computer-Based Medical Systems, 2017
Parallel Comput., 2016
Proceedings of the 2016 IEEE International Conference on Electronics, Circuits and Systems, 2016
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016
Proceedings of the High Performance Computing - Third Latin American Conference, 2016
On the energy efficiency and performance of irregular application executions on multicore, NUMA and manycore platforms.
J. Parallel Distributed Comput., 2015
Performance/energy trade-off in scientific computing: the case of ARM big.LITTLE and Intel Sandy Bridge.
IET Comput. Digit. Tech., 2015
J. Parallel Distributed Comput., 2014
Int. J. Parallel Program., 2014
Energy Efficient Seismic Wave Propagation Simulation on a Low-Power Manycore Processor.
Proceedings of the 26th IEEE International Symposium on Computer Architecture and High Performance Computing, 2014
Evaluating the Impact of Transactional Characteristics on the Performance of Transactional Memory Applications.
Proceedings of the 22nd Euromicro International Conference on Parallel, 2014
Proceedings of the 21st International Conference on High Performance Computing, 2014
Analysis of computing and energy performance of multicore, NUMA, and manycore platforms for an irregular application.
Proceedings of the 3rd Workshop on Irregular Applications - Architectures and Algorithms, 2013
Optimisation de la performance des applications de mémoire transactionnelle sur des plates-formes multicoeurs : une approche basée sur l'apprentissage automatique. (Improving the Performance of Transactional Memory Applications on Multicores : A Machine Learning-based Approach).
PhD thesis, 2012
Dynamic Thread Mapping Based on Machine Learning for Transactional Memory Applications.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012
Analysis and Tracing of Applications Based on Software Transactional Memory on Multicore Architectures.
Proceedings of the 19th International Euromicro Conference on Parallel, 2011
A machine learning-based approach for thread mapping on transactional memory applications.
Proceedings of the 18th International Conference on High Performance Computing, 2011
Proceedings of the High Performance Computing for Computational Science - VECPAR 2010, 2010
Proceedings of the 21st International Symposium on Computer Architecture and High Performance Computing, 2009
NUMA-ICTM: A parallel version of ICTM exploiting memory placement strategies for NUMA machines.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009
Proceedings of the 2006 ACM Symposium on Applied Computing (SAC), 2006
Proceedings of the Parallel Computing Technologies, 2005