About Me
I am a Research Scientist at Zoom working on Spoken Language Models and Speech AI.
I obtained a Master of Science in Intelligent Information Systems from the Language Technologies Institute at Carnegie Mellon University, where I was fortunate to be advised by Dr. Shinji Watanabe. Under his supervision, I worked on speech processing problems and contributed to ESPnet. Before that, I obtained a joint Bachelor of Science in Data Science from Duke Kunshan University and Duke University.
I am driven by an endless curiosity for how things work and how to make them better and cooler. Outside of work, my hobby projects range from writing compilers and developing video games, to 3D printing custom parts and wrenching on my car. I love the challenge of mastering new skills to bring complex projects to life. I pride myself on my technical agility: the ability to rapidly assimilate new concepts and thrive in an ever-evolving field.
Publications
- [1] Improving ASR Contextual Biasing with Guided Attention
  Jiyang Tang, Kwangyoun Kim, Suwon Shon, Felix Wu, Prashant Sridhar, Shinji Watanabe
  Accepted to IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, 2024
- [2] A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning
  Jiyang Tang, William Chen, Xuankai Chang, Shinji Watanabe, Brian MacWhinney
  Annual Conference of the International Speech Communication Association (INTERSPEECH), Dublin, Ireland, 2023
- [3] A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
  Yifan Peng, Kwangyoun Kim, Felix Wu, Brian Yan, Siddhant Arora, William Chen, Jiyang Tang, Suwon Shon, Prashant Sridhar, Shinji Watanabe
  Annual Conference of the International Speech Communication Association (INTERSPEECH), Dublin, Ireland, 2023
- [4] End-to-End Mandarin Tone Classification with Short Term Context Information
  Jiyang Tang, Ming Li
  Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Tokyo, Japan, 2021
Machine Learning
| Project | Description |
|---|---|
| espnet | Contributions to ESPnet2, including the MAGICDATA ASR recipe and the Aphasia English ASR recipe for [2] |
| speech-recognition | A from-scratch speech recognition system for the ten spoken English digits, written in Python and NumPy |
| asr-ctc | Implementation of the Conformer CTC speech recognition architecture |
Software Engineering
| Project | Description |
|---|---|
| tan | A compiler for my programming language called tan using LLVM+Clang |
| tos | A toy operating system called TOS that supports paging, APIC, ACPI, VBE console, and keyboard input with a custom libc |
| NO-tifications | Purge any notifications on Android |
Game Dev
| Project | Description |
|---|---|
| ExtendedCharacterMovement | An Unreal Engine plugin that provides an extended character movement component for FPS/TPS games |
| tjy_vic3_fix | A collection of my Victoria 3 quality-of-life mods |
| dynamic_road_gen | A procedural road mesh generator like the one used in Cities: Skylines |
