Paolo D'Alberto

Orcid: 0000-0002-1584-1270

According to our database, Paolo D'Alberto authored at least 31 papers between 2000 and 2024.

Bibliography

2024
Weight Block Sparsity: Training, Compilation, and AI Engine Accelerators.
CoRR, 2024

2023
Strassen's Matrix Multiplication Algorithm Is Still Faster.
CoRR, 2023

Entropy Maximization in Sparse Matrix by Vector Multiplication (max_E SpMV).
CoRR, 2023

Digital Advertising: the Measure of Mobile Visits Lifts.
CoRR, 2023

2022
xDNN: Inference for Deep Convolutional Neural Networks.
ACM Trans. Reconfigurable Technol. Syst., 2022

A Heterogeneous Solution to the All-pairs Shortest Path Problem using FPGAs.
Proceedings of the 23rd International Symposium on Quality Electronic Design, 2022

2021
DPUV3INT8: A Compiler View to programmable FPGA Inference Engines.
CoRR, 2021

2018
Quantizing Convolutional Neural Networks for Low-Power High-Throughput Inference Engines.
CoRR, 2018

2015
Mapping and Matching Algorithms: Data Mining by Adaptive Graphs.
CoRR, 2015

Binary and Polytomous Responses Modeling: Multiple-Campaign Ad-Targeting Without Personal User Information.
CoRR, 2015

2013
Improving numerical accuracy for non-negative matrix multiplication on GPUs using recursive algorithms.
Proceedings of the International Conference on Supercomputing, 2013

2012
A Heterogeneous Accelerated Matrix Multiplication: OpenCL + APU + GPU + Fast Matrix Multiply.
CoRR, 2012

2011
Exploiting parallelism in matrix-computation kernels for symmetric multiprocessor systems: Matrix-multiplication and matrix-addition algorithm optimizations by software pipelining and threads allocation.
ACM Trans. Math. Softw., 2011

On the Weakenesses of Correlation.
CoRR, 2011

Improving the Accuracy of High Performance BLAS Implementations Using Adaptive Blocked Algorithms.
Proceedings of the 23rd International Symposium on Computer Architecture and High Performance Computing, 2011

Pruning hardware evaluation space via correlation-driven application similarity analysis.
Proceedings of the 8th Conference on Computing Frontiers, 2011

2009
Adaptive Winograd's matrix multiplications.
ACM Trans. Math. Softw., 2009

Non-parametric Information-Theoretic Measures of One-Dimensional Distribution Functions from Continuous Time Series.
Proceedings of the SIAM International Conference on Data Mining, 2009

Automatic retrieval of similar content using search engine query interface.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

2008
Domain-specific library generation for parallel software and hardware platforms.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2007
R-Kleene: A High-Performance Divide-and-Conquer Algorithm for the All-Pair Shortest Path for Densely Connected Networks.
Algorithmica, 2007

Adaptive Strassen's matrix multiplication.
Proceedings of the 21st Annual International Conference on Supercomputing, 2007

Performance/Energy Optimization of DSP Transforms on the XScale Processor.
Proceedings of the High Performance Embedded Architectures and Compilers, 2007

Generating FPGA-Accelerated DFT Libraries.
Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines, 2007

2005
Line Size Adaptivity Analysis of Parameterized Loop Nests for Direct Mapped Data Cache.
IEEE Trans. Computers, 2005

Using Recursion to Boost ATLAS's Performance.
Proceedings of the High-Performance Computing - 6th International Symposium, 2005

2004
A Geometric Approach for Partitioning N-Dimensional Non-rectangular Iteration Spaces.
Proceedings of the Languages and Compilers for High Performance Computing, 2004

JuliusC: A Practical Approach for the Analysis of Divide-and-Conquer Algorithms.
Proceedings of the Languages and Compilers for High Performance Computing, 2004

2003
A Data Cache with Dynamic Mapping.
Proceedings of the Languages and Compilers for Parallel Computing, 2003

2001
Fractal Matrix Multiplication: A Case Study on Portability of Cache Performance.
Proceedings of the Algorithm Engineering, 2001

2000
On the Space and Access Complexity of Computation DAGs.
Proceedings of the Graph-Theoretic Concepts in Computer Science, 2000

