Andros Tjandra

ORCID: 0000-0003-1246-5908

According to our database, Andros Tjandra authored at least 54 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Scaling Speech Technology to 1,000+ Languages.
J. Mach. Learn. Res., 2024

Movie Gen: A Cast of Media Foundation Models.
CoRR, 2024

Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning.
CoRR, 2024

MusicFlow: Cascaded Flow Matching for Text Guided Music Generation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Generative Pre-training for Speech with Flow Matching.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of a Multilingual ASR Model.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Audiobox: Unified Audio Generation with Natural Language Prompts.
CoRR, 2023

SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain.
CoRR, 2023

Learning ASR Pathways: A Sparse Multilingual ASR Model.
Proceedings of the IEEE International Conference on Acoustics, 2023

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities.
Proceedings of the IEEE International Conference on Acoustics, 2023

Voice-Preserving Zero-Shot Multiple Accent Conversion.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Learning ASR pathways: A sparse multilingual ASR model.
CoRR, 2022

NIX-TTS: Lightweight and End-to-End Text-to-Speech Via Module-Wise Distillation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improved Language Identification Through Cross-Lingual Self-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, 2022

Conformer-Based Self-Supervised Learning For Non-Speech Audio Tasks.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Code-Switching ASR and TTS Using Semisupervised Learning with Machine Speech Chain.
IEICE Trans. Inf. Syst., 2021

Improved Language Identification Through Cross-Lingual Self-Supervised Learning.
CoRR, 2021

Multimodal Chain: Cross-Modal Collaboration Through Listening, Speaking, and Visualizing.
IEEE Access, 2021

Unsupervised Learning of Disentangled Speech Content and Style Representation.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020
Corrections to "Machine Speech Chain".
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Machine Speech Chain.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Recurrent Neural Network Compression Based on Low-Rank Tensor Representation.
IEICE Trans. Inf. Syst., 2020

Cross-Lingual Machine Speech Chain for Javanese, Sundanese, Balinese, and Bataks Speech Recognition and Synthesis.
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages and Collaboration and Computing for Under-Resourced Languages, 2020

Transformer VQ-VAE for Unsupervised Unit Discovery and Speech Synthesis: ZeroSpeech 2020 Challenge.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Incremental Machine Speech Chain Towards Enabling Listening While Speaking in Real-Time.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Augmenting Images for ASR and TTS Through Single-Loop and Dual-Loop Multimodal Chain Framework.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Transformer-Based Acoustic Modeling for Hybrid Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

DEJA-VU: Double Feature Presentation and Iterated Loss in Deep Transformer Networks.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Deja-vu: Double Feature Presentation in Deep Transformer Networks.
CoRR, 2019

From Speech Chain to Multimodal Chain: Leveraging Cross-modal Data Augmentation for Semi-supervised Learning.
CoRR, 2019

End-to-End Speech Recognition Sequence Training With Reinforcement Learning.
IEEE Access, 2019

Recognition and translation of code-switching speech utterances.
Proceedings of the 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2019

VQVAE Unsupervised Unit Discovery and Multi-Scale Code2Spec Inverter for Zerospeech Challenge 2019.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Sequence-to-Sequence Learning via Attention Transfer for Incremental Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-end Feedback Loss in Speech Chain Framework via Straight-through Estimator.
Proceedings of the IEEE International Conference on Acoustics, 2019

Speech-to-Speech Translation Between Untranscribed Unknown Languages.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Zero-Shot Code-Switching ASR and TTS with Multilingual Machine Speech Chain.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Listening While Speaking and Visualizing: Improving ASR Through Multimodal Chain.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Multi-Scale Alignment and Contextual History for Attention Mechanism in Sequence-to-Sequence Model.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Speech Chain for Semi-Supervised Learning of Japanese-English Code-Switching ASR and TTS.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Machine Speech Chain with One-shot Speaker Adaptation.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Compressing End-to-end ASR Networks by Tensor-Train Decomposition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Tensor Decomposition for Compressing Recurrent Neural Network.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Sequence-to-Sequence ASR Optimization Via Reinforcement Learning.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Local Monotonic Attention Mechanism for End-to-End Speech Recognition.
CoRR, 2017

Speech recognition features based on deep latent Gaussian models.
Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017

Compressing recurrent neural network with tensor train.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Local Monotonic Attention Mechanism for End-to-End Speech And Language Processing.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

Attention-based Wav2Text with feature transfer learning.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Listening while speaking: Speech chain by deep learning.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016
Gated Recurrent Neural Tensor Network.
Proceedings of the 2016 International Joint Conference on Neural Networks, 2016

2015
Combination of two-dimensional cochleogram and spectrogram features for deep learning-based ASR.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Stochastic Gradient Variational Bayes for deep learning-based ASR.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

