2024
Tadashi: Enabling AI-Based Automated Code Generation With Guaranteed Correctness.
CoRR, 2024
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order.
CoRR, 2024
Welcome Message from LLMxHPC Workshop.
Proceedings of the IEEE International Conference on Cluster Computing, 2024
2023
At the Locus of Performance: Quantifying the Effects of Copious 3D-Stacked Cache on HPC Workloads.
ACM Trans. Archit. Code Optim., December, 2023
Myths and legends in high-performance computing.
Int. J. High Perform. Comput. Appl., July, 2023
2022
Preparing for the Future - Rethinking Proxy Applications.
Comput. Sci. Eng., 2022
Outliers Dimensions that Disrupt Transformers Are Driven by Frequency.
CoRR, 2022
Preparing for the Future - Rethinking Proxy Apps.
CoRR, 2022
At the Locus of Performance: A Case Study in Enhancing CPUs with Copious 3D-Stacked Cache.
CoRR, 2022
Why Globally Re-shuffle? Revisiting Data Shuffling in Large Scale Deep Learning.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022
Outlier Dimensions that Disrupt Transformers are Driven by Frequency.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022
2021
MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems.
CoRR, 2021
Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics.
CoRR, 2021
MLPerf™ HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems.
Proceedings of the IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, 2021
Matrix Engines for High Performance Computing: A Paragon of Performance or Grasping at Straws?
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021
2020
Scaling distributed deep learning workloads beyond the memory capacity with KARMA.
Proceedings of the International Conference for High Performance Computing, 2020
2019
Scaling Word2Vec on Big Corpus.
Data Sci. Eng., 2019
Learning Neural Representations for Predicting GPU Performance.
Proceedings of the High Performance Computing - 34th International Conference, 2019
2018
Proceedings of the 2018 TREC Video Retrieval Evaluation, 2018
Predicting Performance Using Collaborative Filtering.
Proceedings of the IEEE International Conference on Cluster Computing, 2018
2017
The (too Many) Problems of Analogical Reasoning with Word Vectors.
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics, 2017
Investigating Different Syntactic Context Types and Context Representations for Learning Word Embeddings.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017
2016
GPU-Accelerated Large-Scale Distributed Sorting Coping with Device Memory Capacity.
IEEE Trans. Big Data, 2016
Critical mass in the emergence of collective intelligence: a parallelized simulation of swarms in noisy environments.
Artif. Life Robotics, 2016
Migrating Legacy Fortran to Python While Retaining Fortran-Level Performance through Transpilation and Type Hints.
Proceedings of the 6th Workshop on Python for High-Performance and Scientific Computing, 2016
Intrinsic Evaluations of Word Embeddings: What Can We Do Better?
Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, 2016
Analogy-based detection of morphological and semantic relations with word embeddings: what works and what doesn't.
Proceedings of the Student Research Workshop, 2016
Word Embeddings, Analogies, and Machine Learning: Beyond king - man + woman = queen.
Proceedings of the COLING 2016, 2016
2015
Python, performance, and natural language processing.
Proceedings of the 5th Workshop on Python for High-Performance and Scientific Computing, 2015
Discovering Aspectual Classes of Russian Verbs in Untagged Large Corpora.
Proceedings of the IEEE International Conference on Data Science and Data Intensive Systems, 2015
2014
Large-scale distributed sorting for GPU-based heterogeneous supercomputers.
Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014
Efficient String Sorting on Multi- and Many-Core Architectures.
Proceedings of the 2014 IEEE International Congress on Big Data, Anchorage, AK, USA, June 27, 2014
2012
A Multi GPU Read Alignment Algorithm with Model-Based Performance Optimization.
Proceedings of the High Performance Computing for Computational Science, 2012
Sequence Alignment on Massively Parallel Heterogeneous Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012
2011
Poster: fast GPU read alignment with burrows wheeler transform based index.
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011