About Me
Jiyang (Mark) Tang
I am a Software Engineer at Intel Corporation.
I obtained a Master of Science in Intelligent Information Systems from the Language Technologies Institute at Carnegie Mellon University, where I was fortunate to be advised by Dr. Shinji Watanabe. Under his supervision, I worked on speech processing problems and contributed to ESPnet.
Before that, I obtained a joint Bachelor of Science degree in Data Science from Duke Kunshan University and Duke University.
Publications
- [1] Improving ASR Contextual Biasing with Guided Attention
  Jiyang Tang, Kwangyoun Kim, Suwon Shon, Felix Wu, Prashant Sridhar, Shinji Watanabe
  Accepted to IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, 2024
- [2] A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning
  Jiyang Tang, William Chen, Xuankai Chang, Shinji Watanabe, Brian MacWhinney
  Annual Conference of the International Speech Communication Association (INTERSPEECH), Dublin, Ireland, 2023
- [3] A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
  Yifan Peng, Kwangyoun Kim, Felix Wu, Brian Yan, Siddhant Arora, William Chen, Jiyang Tang, Suwon Shon, Prashant Sridhar, Shinji Watanabe
  Annual Conference of the International Speech Communication Association (INTERSPEECH), Dublin, Ireland, 2023
- [4] End-to-End Mandarin Tone Classification with Short Term Context Information
  Jiyang Tang, Ming Li
  Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Tokyo, Japan, 2021
ML/AI (mainly speech processing)
Project | Description |
---|---|
tjysdsg/espnet | Contributions to ESPnet2, including the MAGICDATA ASR recipe and the Aphasia English ASR recipe for [2] |
tjysdsg/speech-recognition | A hand-written speech recognition system for the ten spoken English digits, using Python+NumPy |
tjysdsg/asr-ctc | Implementation of the Conformer CTC speech recognition architecture |
tjysdsg/hippo | An AI-driven pronunciation coaching app called Hippo |
tjysdsg/capt-public | …and its server backend |
tjysdsg/kaldi | My Kaldi fork for CAPT using Goodness of Pronunciation (GOP) |
tjysdsg/std-mandarin-kaldi | A training recipe for a Standard Mandarin (no accent) acoustic model |
tjysdsg/tone_classifier | Mandarin Tone Classification experiments |
tjysdsg/aidatatang_force_align | Kaldi phone-level forced alignment scripts for [4] |
tjysdsg/ml | Implementation of some fundamental ML algorithms |
tjysdsg/dance-classifier | A video dance style classifier |
tjysdsg/pytorch-projects | Some old PyTorch ASR experiments |
tjysdsg/birds | An image dataset of birds endemic to China |
tjysdsg/ali_to_phone | Some scripts for extracting phone alignment from ali.*.gz alignment files generated by Kaldi |
System Programming
Project | Description |
---|---|
tjysdsg/tan | A compiler for my programming language called tan using LLVM+Clang |
tjysdsg/tos | A toy operating system called TOS that supports paging, APIC, ACPI, VBE console, and keyboard input with a custom libc |
tjysdsg/newlib | Working on a TOS port of newlib |
tjysdsg/acpica | A TOS port of ACPICA |
tjysdsg/float_repr | A tool for visualizing IEEE float32 representation |
tjysdsg/test-bench | C++ experiments, tests, and benchmarks for learning and future reference |
tjysdsg/tjy_vic3_fix | A collection of my Victoria 3 quality-of-life mods (1 (1,075 subscribers), 2 (18 subscribers)) |
tjysdsg/dynamic_road_gen | A procedural road mesh generator like the one used in Cities: Skylines |
tjysdsg/cmu-15513 | THE best computer science course I have ever taken |
tjysdsg/cs308-slogo | A Logo programming environment created with my wonderful teammates. By far THE best teamwork experience |
tjysdsg/notification_remover | An Android app to remove annoying notifications, published on |
Misc
Project | Description |
---|---|
tjysdsg/blender-projects | Some 3D models I made with Blender |
tjysdsg/dotfiles | Dotfiles |
tjysdsg/nvim | A minimal Neovim config |
tjysdsg/ohmyzsh | zsh configs |
Posts
I sometimes write posts about programming and ML.