Ruoming Pang

According to our database, Ruoming Pang authored at least 86 papers between 2000 and 2024.

Bibliography

2024
Improve Vision Language Model Chain-of-thought Reasoning.
CoRR, 2024

EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing.
CoRR, 2024

Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo.
CoRR, 2024

ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities.
CoRR, 2024

Apple Intelligence Foundation Language Models.
CoRR, 2024

MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains.
CoRR, 2024

Large Language Model-guided Document Selection.
CoRR, 2024

Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training.
CoRR, 2024

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training.
CoRR, 2024


2023
Instruction-Following Speech Recognition.
CoRR, 2023

Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts.
CoRR, 2023

Practical Conformer: Optimizing size, speed and flops of Conformer for on-Device and cloud ASR.
CoRR, 2023

STAIR: Learning Sparse Text and Image Representation in Grounded Tokens.
CoRR, 2023

STAIR: Learning Sparse Text and Image Representation in Grounded Tokens.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Hyperscale Hardware Optimized Neural Architecture Search.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition.
IEEE J. Sel. Top. Signal Process., 2022

Pathways: Asynchronous Distributed Dataflow for ML.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

A Language Agnostic Multilingual Streaming On-Device ASR System.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Sentence-Select: Large-Scale Language Model Data Selection for Rare-Word Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Vector-quantized Image Modeling with Improved VQGAN.
Proceedings of the Tenth International Conference on Learning Representations, 2022


Massively Multilingual ASR: A Lifelong Learning Solution.
Proceedings of the IEEE International Conference on Acoustics, 2022

Transducer-Based Streaming Deliberation for Cascaded Encoders.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Co-training Transformer with Videos and Images Improves Action Recognition.
CoRR, 2021

GSPMD: General and Scalable Parallelization for ML Computation Graphs.
CoRR, 2021

Scaling End-to-End Models for Large-Scale Multilingual ASR.
CoRR, 2021

Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models.
CoRR, 2021

Transformer Based Deliberation for Two-Pass Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Unsupervised Learning of Disentangled Speech Content and Style Representation.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Bridging the Gap Between Streaming and Non-Streaming ASR Systems by Distilling Ensembles of CTC and RNN-T Models.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling.
Proceedings of the 9th International Conference on Learning Representations, 2021

FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization.
Proceedings of the IEEE International Conference on Acoustics, 2021

Dynamic Sparsity Neural Networks for Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Cascaded Encoders for Unifying Streaming and Non-Streaming ASR.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Better and Faster end-to-end Model for Streaming ASR.
Proceedings of the IEEE International Conference on Acoustics, 2021

Improving Streaming Automatic Speech Recognition with Non-Streaming Model Distillation on Unsupervised Data.
Proceedings of the IEEE International Conference on Acoustics, 2021

Searching for Fast Model Families on Datacenter Accelerators.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Scaling End-to-End Models for Large-Scale Multilingual ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

w2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition.
CoRR, 2020

Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling.
CoRR, 2020

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency.
CoRR, 2020

Emitting Word Timings with End-to-End Models.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Conformer: Convolution-augmented Transformer for Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

An Attention-Based Joint Acoustic and Text on-Device End-To-End Model.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020


Towards Fast and Accurate Streaming End-To-End ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Deliberation Model Based Two-Pass End-To-End Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

BigNAS: Scaling up Neural Architecture Search with Big Single-Stage Models.
Proceedings of the Computer Vision - ECCV 2020, 2020

EfficientDet: Scalable and Efficient Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection.
CoRR, 2019

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling.
CoRR, 2019

Zanzibar: Google's Consistent, Global Authorization System.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019

Shallow-Fusion End-to-End Contextual Biasing.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Two-Pass End-to-End Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Hierarchical Generative Modeling for Controllable Speech Synthesis.
Proceedings of the 7th International Conference on Learning Representations, 2019

Searching for MobileNetV3.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019


Semi-supervised Training for End-to-end Models via Weak Distillation.
Proceedings of the IEEE International Conference on Acoustics, 2019

MnasNet: Platform-Aware Neural Architecture Search for Mobile.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

A Comparison of End-to-End Models for Long-Form Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Monotonic Infinite Lookback Attention for Simultaneous Machine Translation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Domain Adaptive Transfer Learning with Specialist Models.
CoRR, 2018

MnasNet: Platform-Aware Neural Architecture Search for Mobile.
CoRR, 2018

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Compression of End-to-End Models.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2006
The devil and packet trace anonymization.
Comput. Commun. Rev., 2006

Rethinking Hardware Support for Network Analysis and Intrusion Prevention.
Proceedings of the 1st USENIX Workshop on Hot Topics in Security, 2006

binpac: a yacc for writing application protocol parsers.
Proceedings of the 6th ACM SIGCOMM Internet Measurement Conference, 2006

2005
Part I: A Theory for Deadlock-Free Dynamic Network Reconfiguration.
IEEE Trans. Parallel Distributed Syst., 2005

A First Look at Modern Enterprise Traffic.
Proceedings of the 5th Internet Measurement Conference, 2005

2004
The dark side of the Web: an open proxy's view.
Comput. Commun. Rev., 2004

Reliability and Security in the CoDeeN Content Distribution Network.
Proceedings of the General Track: 2004 USENIX Annual Technical Conference, June 27, 2004

Characteristics of internet background radiation.
Proceedings of the 4th ACM SIGCOMM Internet Measurement Conference, 2004

2003
Deadlock-Free Dynamic Reconfiguration Schemes for Increased Network Dependability.
IEEE Trans. Parallel Distributed Syst., 2003

A high-level programming environment for packet trace anonymization and transformation.
Proceedings of the ACM SIGCOMM 2003 Conference on Applications, 2003

2002
Defensive Programming: Using an Annotation Toolkit to Build DoS-Resistant Software.
Proceedings of the 5th Symposium on Operating System Design and Implementation (OSDI 2002), 2002

Oblivious Hashing: A Stealthy Software Integrity Verification Primitive.
Proceedings of the Information Hiding, 5th International Workshop, 2002

2000
The Double Scheme: Deadlock-Free Dynamic Reconfiguration of Cut-Through Networks.
Proceedings of the 2000 International Conference on Parallel Processing, 2000

