S. Lennart Johnsson

Affiliations:
  • University of Houston, USA


According to our database, S. Lennart Johnsson authored at least 95 papers between 1981 and 2021.

Bibliography

2021
Using CNNs for AD classification based on spatial correlation of BOLD signals during the observation.
CoRR, 2021

Analysis of Factors Affecting Power Consumption and Energy Efficiency of SGEMM on the Low-Power Myriad-2 VPU.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021

2020
A Highly Efficient SGEMM Implementation using DMA on the Intel/Movidius Myriad-2.
Proceedings of the 32nd IEEE International Symposium on Computer Architecture and High Performance Computing, 2020

An Adaptive Space-Filling Curve Trajectory for Ordering 3D Datasets to 1D: Application to Brain Magnetic Resonance Imaging Data for Classification.
Proceedings of the Computational Science - ICCS 2020, 2020

Squeeze U-Net: A Memory and Energy Efficient Image Segmentation Network.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Scalable machine learning computing a data summarization matrix with a parallel array DBMS.
Distributed Parallel Databases, 2019

2018
A performance spectrum for parallel computational frameworks that solve PDEs.
Concurr. Comput. Pract. Exp., 2018

2017
A Cloud System for Machine Learning Exploiting a Parallel Array DBMS.
Proceedings of the 28th International Workshop on Database and Expert Systems Applications, 2017

2016
Lifetime and Deployment Limits for Mobile, 3D-Perceptual Applications.
Proceedings of the Virtual, Augmented and Mixed Reality, 2016

2014
Instrumentation for accurate energy-to-solution measurements of a Texas Instruments TMS320C6678 digital signal processor and its DDR3 memory.
Proceedings of the 2nd International Workshop on Energy Efficient Supercomputing, 2014

Exploiting DMA for Performance and Energy Optimized STREAM on a DSP.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

2012
Overview of Data Centers Energy Efficiency Evolution.
Handbook of Energy-Aware and Green Computing - Two Volume Set, 2012

2010
The SNIC/KTH PRACE prototype: Achieving high energy efficiency with commodity technology without acceleration.
Proceedings of the International Green Computing Conference 2010, 2010

2008
Automatic Generation of FFT for Translations of Multipole Expansions in Spherical Harmonics.
Int. J. High Perform. Comput. Appl., 2008

Scalable Grid-wide capacity allocation with the SweGrid Accounting System (SGAS).
Concurr. Comput. Pract. Exp., 2008

2007
Scheduling FFT computation on SMP and multicore systems.
Proceedings of the 21st Annual International Conference on Supercomputing, 2007

Adaptive Computation of Self Sorting In-Place FFTs on Hierarchical Memory Architectures.
Proceedings of the High Performance Computing and Communications, 2007

Dynamic, context-aware, least-privilege grid delegation.
Proceedings of the 8th IEEE/ACM International Conference on Grid Computing (GRID 2007), 2007

Developing Assays for the Detection of Influenza in Human Samples.
Proceedings of the International Conference on Bioinformatics & Computational Biology, 2007

2006
A Service-Oriented Approach to Enforce Grid Resource Allocations.
Int. J. Cooperative Inf. Syst., 2006

Toward an On-Demand Restricted Delegation Mechanism for Grids.
Proceedings of the 7th IEEE/ACM International Conference on Grid Computing (GRID 2006), 2006

Two Challenges in Genomics That Can Benefit from Petascale Platforms.
Proceedings of the Euro-Par 2006 Workshops: Parallel Processing, 2006

2005
New Grid Scheduling and Rescheduling Methods in the GrADS Project.
Int. J. Parallel Program., 2005

Scheduling strategies for mapping application workflows onto the grid.
Proceedings of the 14th IEEE International Symposium on High Performance Distributed Computing, 2005

2004
Automatic Performance Tuning for Fast Fourier Transforms.
Int. J. High Perform. Comput. Appl., 2004

An OGSA-based accounting system for allocation enforcement across HPC centers.
Proceedings of the Service-Oriented Computing, 2004

Scheduling workflow applications in GrADS.
Proceedings of the 4th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2004), 2004

2003
CODELAB: A Developers' Tool for Efficient Code Generation and Optimization.
Proceedings of the Computational Science - ICCS 2003, 2003

2002
Toward a Framework for Preparing and Executing Adaptive Grid Programs.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

2001
Telescoping Languages: A Strategy for Automatic Generation of Scientific Problem-Solving Systems from Annotated Libraries.
J. Parallel Distributed Comput., 2001

The GrADS Project: Software Support for High-Level Grid Application Development.
Int. J. High Perform. Comput. Appl., 2001

2000
HPFBench: a high performance Fortran benchmark suite.
ACM Trans. Math. Softw., 2000

An adaptive software library for fast Fourier transforms.
Proceedings of the 14th international conference on Supercomputing, 2000

1999
Some Metacomputing Experiences for Scientific Applications.
Parallel Process. Lett., 1999

Large scale distributed data repository: design of a molecular dynamics trajectory database.
Future Gener. Comput. Syst., 1999

1997
Hierarchical Load Balancing for Parallel Fast Legendre Transforms.
Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

A Data-Parallel Implementation of the Geometric Partitioning Algorithm.
Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

A Data-Parallel Adaptive N-body Method.
Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

On the Accuracy of Anderson's Fast N-body Algorithm.
Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

High Performance FORTRAN for Highly Unstructured Problems.
Proceedings of the Sixth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), 1997

DPF: A Data Parallel Fortran Benchmark Suite.
Proceedings of the 11th International Parallel Processing Symposium (IPPS '97), 1997

1996
Implementing O(N) N-Body Algorithms Efficiently in Data-Parallel Languages.
Sci. Program., 1996

Local Basic Linear Algebra Subroutines (LBLAS) for the CM-5/5E.
Int. J. High Perform. Comput. Appl., 1996

A Data-Parallel Implementation of Hierarchical N-Body Methods.
Int. J. High Perform. Comput. Appl., 1996

A Data-Parallel Implementation of O(N) Hierarchical N-Body Methods.
Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, 1996

1995
On the Conversion Between Binary Code and Binary-Reflected Gray Code on Binary Cubes.
IEEE Trans. Computers, 1995

All-to-All Communication on the Connection Machine CM-200.
Sci. Program., 1995

ROMM Routing on Mesh and Torus Networks.
Proceedings of the 7th Annual ACM Symposium on Parallel Algorithms and Architectures, 1995

1994
Index Transformation Algorithms in a Linear Algebra Framework.
IEEE Trans. Parallel Distributed Syst., 1994

POLYSHIFT Communications Software for the Connection Machine System CM-200.
Sci. Program., 1994

Multiplication of Matrices of Arbitrary Shape on a Data Parallel Computer.
Parallel Comput., 1994

Binary Cube Emulation of Butterfly Networks Encoded by Gray Code.
J. Parallel Distributed Comput., 1994

An Efficient Algorithm for Gray-to-Binary Permutation on Hypercubes.
J. Parallel Distributed Comput., 1994

Embedding hyperpyramids into hypercubes.
IBM J. Res. Dev., 1994

Optimal communication channel utilization for matrix transposition and related permutations on binary cubes.
Discret. Appl. Math., 1994

ROMM Routing: A Class of Efficient Minimal Routing Algorithms.
Proceedings of the Parallel Computer Routing and Communication, 1994

Scientific Software Libraries for Scalable.
Proceedings of the Parallel Scientific Computing, First International Workshop, 1994

Mesh Decomposition and Communication Procedures for Finite Element Applications on the Connection Machine CM-5 System.
Proceedings of the High-Performance Computing and Networking, 1994

1993
Block-Cyclic Dense Linear Algebra.
SIAM J. Sci. Comput., 1993

Minimizing the Communication Time for Matrix Multiplication on Multiprocessors.
Parallel Comput., 1993

The Connection Machine Systems CM-5.
Proceedings of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures, 1993

1992
Cooley-Tukey FFT on the Connection Machine.
Parallel Comput., 1992

Generalized Shuffle Permutations on Boolean Cubes.
J. Parallel Distributed Comput., 1992

Local Basic Linear Algebra Subroutines (Lblas) for Distributed Memory Architectures and Languages With Array Syntax.
Int. J. High Perform. Comput. Appl., 1992

All-To-All Broadcast and Applications On the Connection Machine.
Int. J. High Perform. Comput. Appl., 1992

Massively Parallel Computing: Data Distribution and Communication.
Proceedings of the Parallel Architectures and Their Efficient Use, 1992

1991
The Parallel Multipole Method on the Connection Machine.
SIAM J. Sci. Comput., 1991

Performance Modeling of Distributed Memory Architectures.
J. Parallel Distributed Comput., 1991

1990
Optimizing Tridiagonal Solvers for Alternating Direction Methods on Boolean Cube Multiprocessors.
SIAM J. Sci. Comput., 1990

A dataparallel implementation of an explicit method for the three-dimensional compressible Navier-Stokes equations.
Parallel Comput., 1990

Embedding Meshes in Boolean Cubes by Graph Decomposition.
J. Parallel Distributed Comput., 1990

Embedding Three-Dimensional Meshes in Boolean Cubes by Graph Decomposition.
Proceedings of the 1990 International Conference on Parallel Processing, 1990

1989
Optimum Broadcasting and Personalized Communication in Hypercubes.
IEEE Trans. Computers, 1989

The Finite Element Method on a Data Parallel Computing System.
Int. J. High Speed Comput., 1989

Histogram Computation on Distributed Memory Architectures.
Concurr. Pract. Exp., 1989

A study of dissipation operators for the Euler equations and a three-dimensional channel flow.
Proceedings of Supercomputing '89, Reno, NV, USA, November 12-17, 1989

Element order and convergence rate of the conjugate gradient method for data parallel stress analysis.
Proceedings of Supercomputing '89, Reno, NV, USA, November 12-17, 1989

A radix-2 FFT on connection machine.
Proceedings of Supercomputing '89, Reno, NV, USA, November 12-17, 1989

Matrix multiplication on the connection machine.
Proceedings of Supercomputing '89, Reno, NV, USA, November 12-17, 1989

Dilation d embedding of a hyper-pyramid into a hypercube.
Proceedings of Supercomputing '89, Reno, NV, USA, November 12-17, 1989

QCD with dynamical fermions on the connection machine.
Proceedings of Supercomputing '89, Reno, NV, USA, November 12-17, 1989

Data Parallel Algorithms for the Finite Element Method.
Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, 1989

1988
Expressing Boolean cube matrix algorithms in shared memory primitives.
Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, 1988

Optimal algorithms for stable dimension permutations on Boolean cubes.
Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, 1988

QED on the connection machine.
Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, 1988

1987
Solving banded systems on a parallel processor.
Parallel Comput., 1987

Communication Efficient Basic Linear Algebra Computations on Hypercube Architectures.
J. Parallel Distributed Comput., 1987

The Communication Efficiency of Meshes, Boolean Cubes and Cube Connected Cycles for Wafer Scale Integration.
Proceedings of the International Conference on Parallel Processing, 1987

Algorithms for Matrix Transposition on Boolean n-Cube Configured Ensemble Architectures.
Proceedings of the International Conference on Parallel Processing, 1987

On the Embedding of Arbitrary Meshes in Boolean Cubes With Expansion Two Dilation Two.
Proceedings of the International Conference on Parallel Processing, 1987

1986
Distributed Routing Algorithms for Broadcasting and Personalized Communication in Hypercubes.
Proceedings of the International Conference on Parallel Processing, 1986

1985
Solving Narrow Banded Systems on Ensemble Architectures.
ACM Trans. Math. Softw., 1985

Generation of layouts from MOS circuit schematics: a graph theoretic approach.
Proceedings of the 22nd ACM/IEEE conference on Design automation, 1985

1983
The Tree Machine: An Evaluation of Strategies for Reducing Program Loading Time.
Proceedings of the International Conference on Parallel Processing, 1983

1981
A Mathematical Approach to the Design of VLSI Networks for Real-Time Computation Problems.
Proceedings of the IEEE Real-Time Systems Symposium, 1981

