Michael Gschwind

Affiliations:
  • IBM Research


According to our database, Michael Gschwind authored at least 55 papers between 1994 and 2024.

Awards

IEEE Fellow 2008, "For contributions to high-performance computer architecture and compilation technology".

Bibliography

2024

2022
ScaDL 2022 Invited Talk 4: Sustainable AI @ Scale: Accelerating AI models for billions of users.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

2021
Sustainable AI: Environmental Implications, Challenges and Opportunities.
CoRR, 2021

First-Generation Inference Accelerator Deployment at Facebook.
CoRR, 2021

2018
IBM POWER9 system software.
IBM J. Res. Dev., 2018

Reengineering a server ecosystem for enhanced portability and performance.
IBM J. Res. Dev., 2018

2017
Optimizing the efficiency of deep learning through accelerator virtualization.
IBM J. Res. Dev., 2017

2016
Workload acceleration with the IBM POWER vector-scalar architecture.
IBM J. Res. Dev., 2016

2015
IBM POWER8 processor core microarchitecture.
IBM J. Res. Dev., 2015

The SIMD accelerator for business analytics on the IBM z13.
IBM J. Res. Dev., 2015

PLC Keynote.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

I/O virtualization and system acceleration in POWER8.
Proceedings of the 2015 IEEE Hot Chips 27 Symposium (HCS), 2015

2014
OpenPOWER: Reengineering a server ecosystem for large-scale data centers.
Proceedings of the 2014 IEEE Hot Chips 26 Symposium (HCS), 2014

2012
The IBM Blue Gene/Q Compute Chip.
IEEE Micro, 2012

Guest Editorial: Parallel Systems and Compilers.
Int. J. Parallel Program., 2012

Blue Gene/Q: design for sustained multi-petaflop computing.
Proceedings of the International Conference on Supercomputing, 2012

2011
SoftBeam: Precise tracking of transient faults and vulnerability analysis at processor design time.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011

2010
Application Acceleration with the Cell Broadband Engine.
Comput. Sci. Eng., 2010

2009
High Performance Computing with the Cell Broadband Engine.
Sci. Program., 2009

Integrated execution: A programming model for accelerators.
IBM J. Res. Dev., 2009

64-bit prefix adders: Power-efficient topologies and design solutions.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2009

2008
Cell GC: using the cell synergistic processor as a garbage collection coprocessor.
Proceedings of the 4th International Conference on Virtual Execution Environments, 2008

Next-Generation Performance Counters: Towards Monitoring Over Thousand Concurrent Events.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2008

Optimizing data sharing and address translation for the Cell BE Heterogeneous Chip Multiprocessor.
Proceedings of the 26th International Conference on Computer Design, 2008

2007
The Cell Broadband Engine: Exploiting Multiple Levels of Parallelism in a Chip Multiprocessor.
Int. J. Parallel Program., 2007

An Open Source Environment for Cell Broadband Engine System Software.
Computer, 2007

2006
Synergistic Processing in Cell's Multicore Architecture.
IEEE Micro, 2006

Using advanced compiler technology to exploit the performance of the Cell Broadband Engine™ architecture.
IBM Syst. J., 2006

Chip multiprocessing and the cell broadband engine.
Proceedings of the Third Conference on Computing Frontiers, 2006

2005
Optimizing Compiler for the CELL Processor.
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT 2005), 2005

2004
Integrated Analysis of Power and Performance for Pipelined Microprocessors.
IEEE Trans. Computers, 2004

2003
New methodology for early-stage, microarchitecture-level power-performance analysis of microprocessors.
IBM J. Res. Dev., 2003

2002
Early-Stage Definition of LPX: A Low Power Issue-Execute Processor.
Proceedings of the Power-Aware Computer Systems, Second International Workshop, 2002

Optimizing pipelines for power and performance.
Proceedings of the 35th Annual International Symposium on Microarchitecture, 2002

Precise Exception Semantics in Dynamic Compilation.
Proceedings of the Compiler Construction, 11th International Conference, 2002

2001
FPGA prototyping of a RISC processor core for embedded applications.
IEEE Trans. Very Large Scale Integr. Syst., 2001

Dynamic Binary Translation and Optimization.
IEEE Trans. Computers, 2001

Optimization and precise exceptions in dynamic compilation.
SIGARCH Comput. Archit. News, 2001

Advances and future challenges in binary translation and optimization.
Proc. IEEE, 2001

2000
Dynamic and Transparent Binary Translation.
Computer, 2000

Binary translation and architecture convergence issues for IBM system/390.
Proceedings of the 14th International Conference on Supercomputing, 2000

1999
Optimizations and Oracle Parallelism with Dynamic Translation.
Proceedings of the 32nd Annual IEEE/ACM International Symposium on Microarchitecture, 1999

Execution-Based Scheduling for VLIW Architectures.
Proceedings of the Euro-Par '99 Parallel Processing, 5th International Euro-Par Conference, Toulouse, France, August 31, 1999

Instruction set selection for ASIP design.
Proceedings of the Seventh International Workshop on Hardware/Software Codesign, 1999

1998
An eight-issue tree-VLIW processor for dynamic binary translation.
Proceedings of the International Conference on Computer Design: VLSI in Computers and Processors, 1998

Hardware/Software Co-Design of a Fuzzy RISC Processor.
Proceedings of the 1998 Design, 1998

1996
Migration from Schematic-Based Designs to a VHDL Synthesis Environment.
Proceedings of the Field-Programmable Logic, 1996

An extendable MIPS-I processor kernel in VHDL for hardware/software co-design.
Proceedings of the conference on European design automation, 1996

1995
A VHDL Design Methodology for FPGAs.
Proceedings of the Field-Programmable Logic and Applications, 5th International Workshop, 1995

1994
FTP Access As a User-defined File System.
ACM SIGOPS Oper. Syst. Rev., 1994

Reprogrammable hardware for educational purposes.
Proceedings of the 25th SIGCSE Technical Symposium on Computer Science Education, 1994

A Fast FPGA Implementation of a General Purpose Neuron.
Proceedings of the Field-Programmable Logic, 1994

The Design of a Stack-Based Microprocessor.
Proceedings of the Field-Programmable Logic, 1994

