About Me

Jiyang (Mark) Tang

Speech Recognition, Natural Language Processing, Research, Compiler, Game Dev

C++, Python, Java, PyTorch, ESPnet, Kaldi, LLVM, SLURM, AWS, Docker, Linux, Flutter, Git, Unity

Vim, JetBrains

I am a Software Engineer at Intel Corporation.

I obtained a Master of Science in Intelligent Information Systems from the Language Technologies Institute at Carnegie Mellon University. I was fortunate to be advised by Dr. Shinji Watanabe. Under his supervision, I worked on speech processing problems and contributed to ESPnet.

Before that, I obtained a joint Bachelor of Science degree in Data Science from Duke Kunshan University and Duke University.

Publications

  • [1] Improving ASR Contextual Biasing with Guided Attention
    Jiyang Tang, Kwangyoun Kim, Suwon Shon, Felix Wu, Prashant Sridhar, Shinji Watanabe
    Accepted to IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, 2024
  • [2] A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning
    Jiyang Tang, William Chen, Xuankai Chang, Shinji Watanabe, Brian MacWhinney
    Annual Conference of the International Speech Communication Association (INTERSPEECH), Dublin, Ireland, 2023
  • [3] A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
    Yifan Peng, Kwangyoun Kim, Felix Wu, Brian Yan, Siddhant Arora, William Chen, Jiyang Tang, Suwon Shon, Prashant Sridhar, Shinji Watanabe
    Annual Conference of the International Speech Communication Association (INTERSPEECH), Dublin, Ireland, 2023
  • [4] End-to-End Mandarin Tone Classification with Short Term Context Information
    Jiyang Tang, Ming Li
    Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Tokyo, Japan, 2021

ML/AI (mainly speech processing)

Project Description
tjysdsg/espnet Contributions to ESPnet2, including the MAGICDATA ASR recipe and the Aphasia English ASR recipe for [2]
tjysdsg/speech-recognition A hand-written speech recognition system for English pronunciation of the 10 digits, using Python+NumPy
tjysdsg/asr-ctc Implementation of the Conformer CTC speech recognition architecture
tjysdsg/hippo An AI-driven pronunciation coaching app called Hippo
tjysdsg/capt-public …and its server backend
tjysdsg/kaldi My Kaldi fork for CAPT using Goodness of Pronunciation (GOP)
tjysdsg/std-mandarin-kaldi A training recipe for a Standard Mandarin (no accent) acoustic model
tjysdsg/tone_classifier Mandarin Tone Classification experiments
tjysdsg/aidatatang_force_align Kaldi phone-level forced alignment scripts for [4]
tjysdsg/ml Implementation of some fundamental ML algorithms
tjysdsg/dance-classifier A video dance style classifier
tjysdsg/pytorch-projects Some old PyTorch ASR experiments
tjysdsg/birds A Chinese endemic bird image dataset
tjysdsg/ali_to_phone Some scripts for extracting phone alignment from ali.*.gz alignment files generated by Kaldi

System Programming

Project Description
tjysdsg/tan A compiler for my programming language called tan using LLVM+Clang
tjysdsg/tos A toy operating system called TOS that supports paging, APIC, ACPI, VBE console, and keyboard input with a custom libc
tjysdsg/newlib A work-in-progress TOS port of newlib
tjysdsg/acpica A TOS port of ACPICA
tjysdsg/float_repr A tool for visualizing IEEE float32 representation
tjysdsg/test-bench C++ experiments, tests, and benchmarks for learning and future reference
tjysdsg/tjy_vic3_fix A collection of my Victoria 3 quality-of-life mods (1 (1,075 subscribers), 2 (18 subscribers))
tjysdsg/dynamic_road_gen A procedural road mesh generator like the one used in Cities: Skylines
tjysdsg/cmu-15513 THE best computer science course I have ever taken
tjysdsg/cs308-slogo A Logo programming environment created with my wonderful teammates. By far THE best teamwork experience
tjysdsg/notification_remover An Android app to remove annoying notifications, published on Google Play

Misc

Project Description
tjysdsg/blender-projects Some 3D models I made with Blender
tjysdsg/dotfiles Dotfiles
tjysdsg/nvim A minimal Neovim config
tjysdsg/ohmyzsh zsh configs

Posts

I sometimes write posts about programming and ML.