Ping Tak Peter Tang

According to our database, Ping Tak Peter Tang authored at least 52 papers between 1989 and 2022.

Bibliography

2022
Efficient Soft-Error Detection for Low-precision Deep Learning Recommendation Models.
Proceedings of the IEEE International Conference on Big Data, 2022

2021
Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale.
IEEE Micro, 2021

SecNDP: Secure Near-Data Processing with Untrusted Memory.
IACR Cryptol. ePrint Arch., 2021

Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

2020
Mixed-Precision Embedding Using a Cache.
CoRR, 2020

Fast Distributed Training of Deep Neural Networks: Dynamic Communication Thresholding for Model and Data Parallelism.
CoRR, 2020

Appropriate Evaluation of Diagnostic Utility of Machine Learning Algorithm Generated Images.
Proceedings of the Machine Learning for Health Workshop, 2020

2019
Sparse Dictionary Learning by Dynamical Neural Networks.
Proceedings of the 7th International Conference on Learning Representations, 2019

Leveraging the bfloat16 Artificial Intelligence Datatype For Higher-Precision Computations.
Proceedings of the 26th IEEE Symposium on Computer Arithmetic, 2019

2018
Dictionary Learning by Dynamical Neural Networks.
CoRR, 2018

A Progressive Batching L-BFGS Method for Machine Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
Sparse Coding by Spiking Neural Networks: Convergence Theory and Computational Results.
CoRR, 2017

Enabling Sparse Winograd Convolution by Native Pruning.
CoRR, 2017

Faster CNNs with Direct Sparse Convolutions and Guided Pruning.
Proceedings of the 5th International Conference on Learning Representations, 2017

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima.
Proceedings of the 5th International Conference on Learning Representations, 2017

2016
FEAST Eigensolver for Non-Hermitian Problems.
SIAM J. Sci. Comput., 2016

A New Multiplication Algorithm for Extended Precision Using Floating-Point Expansions.
Proceedings of the 23rd IEEE Symposium on Computer Arithmetic, 2016

2015
Efficient Calculations of Faithfully Rounded l2-Norms of n-Vectors.
ACM Trans. Math. Softw., 2015

Zolotarev Quadrature Rules and Load Balancing for the FEAST Eigensolver.
SIAM J. Sci. Comput., 2015

2014
Guest Editors' Introduction: Special Section on Computer Arithmetic.
IEEE Trans. Computers, 2014

FEAST As A Subspace Iteration Eigensolver Accelerated By Approximate Spectral Projection.
SIAM J. Matrix Anal. Appl., 2014

A new highly parallel non-Hermitian eigensolver.
Proceedings of the 2014 Spring Simulation Multiconference, 2014


2013
A framework for low-communication 1-D FFT.
Sci. Program., 2013

Efficient backprojection-based synthetic aperture radar computation with many-core processors.
Sci. Program., 2013

Subspace Iteration with Approximate Spectral Projection.
CoRR, 2013

Tera-scale 1D FFT with low-communication algorithm and Intel® Xeon Phi™ coprocessors.
Proceedings of the International Conference for High Performance Computing, 2013

2011
Tight Certification Techniques for Digit-by-Rounding Algorithms with Application to a New 1/sqrt(x) Design.
Proceedings of the 20th IEEE Symposium on Computer Arithmetic, 2011

Radix-8 Digit-by-Rounding: Achieving High-Performance Reciprocals, Square Roots, and Reciprocal Square Roots.
Proceedings of the 20th IEEE Symposium on Computer Arithmetic, 2011

2009
A Software Implementation of the IEEE 754R Decimal Floating-Point Arithmetic Using the Binary Encoding Format.
IEEE Trans. Computers, 2009

2007
Modular Multiplication using Redundant Digit Division.
Proceedings of the 18th IEEE Symposium on Computer Arithmetic (ARITH-18 2007), 2007

A Software Implementation of the IEEE 754R Decimal Floating-Point Arithmetic Using the Binary Encoding Format.
Proceedings of the 18th IEEE Symposium on Computer Arithmetic (ARITH-18 2007), 2007

2005
DFTI - a new interface for fast Fourier transform libraries.
ACM Trans. Math. Softw., 2005

2003
Intel® Itanium® floating-point architecture.
Proceedings of the 2003 workshop on Computer architecture education, 2003

An Overview of Floating-Point Support and Math Library on the Intel XScale™ Architecture.
Proceedings of the 16th IEEE Symposium on Computer Arithmetic (Arith-16 2003), 2003

2002
Scientific computing on the Itanium® processor.
Sci. Program., 2002

2001
Scientific computing on the Itanium processor.
Proceedings of the 2001 ACM/IEEE conference on Supercomputing, 2001

2000
A Comprehensive DFT API for Scientific Computing.
Proceedings of the Architecture of Scientific Software, 2000

1999
A Proposal for a Comprehensive Package for Discrete Fourier Transform.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

New Algorithms for Improved Transcendental Functions on IA-64.
Proceedings of the 14th IEEE Symposium on Computer Arithmetic (Arith-14 '99), 1999

1997
Implementing the Complex Arcsine and Arccosine Functions Using Exception Handling.
ACM Trans. Math. Softw., 1997

1995
It Takes Six Ones To Reach a Flaw.
Proceedings of the 12th Symposium on Computer Arithmetic (ARITH-12 '95), 1995

1994
Dynamic Condition Estimation and Rayleigh-Ritz Approximation.
SIAM J. Matrix Anal. Appl., January, 1994

Implementing complex elementary functions using exception handling.
ACM Trans. Math. Softw., 1994

Fast Band-Toeplitz Preconditioners for Hermitian Toeplitz Systems.
SIAM J. Sci. Comput., 1994

1993
A Cholesky Up- and Downdating Algorithm for Systolic and SIMD Architectures.
SIAM J. Sci. Comput., 1993

Constrained minimax approximation and optimal preconditioners for Toeplitz matrices.
Numer. Algorithms, 1993

1992
Table-driven implementation of the Expm1 function in IEEE floating-point arithmetic.
ACM Trans. Math. Softw., 1992

1991
Table-lookup algorithms for elementary functions and their error analysis.
Proceedings of the 10th IEEE Symposium on Computer Arithmetic, 1991

1990
Table-driven implementation of the logarithm function in IEEE floating-point arithmetic.
ACM Trans. Math. Softw., 1990

Accurate and efficient testing of the exponential and logarithm functions.
ACM Trans. Math. Softw., 1990

1989
Table-driven implementation of the exponential function in IEEE floating-point arithmetic.
ACM Trans. Math. Softw., 1989

