Computer Science student (AI focus) at Stanford University; passionate about languages, classical studies, machine intelligence + knowledge & healthcare; bridging the quantitative and the humane.
Experiences
Graduate Research Assistant
Sep 2025 -
- Working on the Marin Project: an open lab for all aspects of building foundation models
Machine Learning Engineer (internship)
Jun 2025 - Aug 2025
- Developed observability & metrics to evaluate and improve deep NLU product features & enable live monitoring during the 10x scaling of voice agents' outbound calling capacity
Undergraduate Researcher & Developer
Dec 2024 - May 2025
- Worked on the My Heart Counts AI dataset and created time-series models at the Ashley Lab for precision medicine
- Led building Stanford Feedbridge, an iOS + Firebase baby tracker to prevent physiologic poor feeding in newborns with data-driven timely interventions
- Fullstack DevOps for DoctorWell physician well-being platform & study
Software Engineer (part-time)
Sep 2024 - Mar 2025
- Building and testing core product (collaborative semantic knowledge graph and editor) in TypeScript and Next.js
- Conducting data science and NLP tasks; spec-ed and shipped graph query system
- Supporting early researcher users; pitching and some biz dev
Undergraduate Researcher
Jan 2024 –
- Researched generative genetic models for perturbation prediction using PubMed-derived causal graphs
- Developed an experiment framework for synthetic bootstrapping and evaluation
- Initiated collaboration with Ideaflow Inc.
Some Writings
Delere imperium and animi imperio: the semantics of imperium in Cic. Cat. and Sall. Cat. Latin Philology, 2021 (link)
Things They Said: A Narrative — Societal Rhetoric of Vision-Impairment in Modern Japanese History. Rhetoric, Japanese History, History of Medicine, 2022 (link)
The Dawn of Modern Probability: A New Partial Translation of & Commentary on Christiaan Huygens’ De Ratiociniis in Ludo Aleae (1657). Latin, History of Science, 2023 (link)
About 🇨🇦
I am currently a coterminal MSCS student at Stanford University (I received my undergraduate degree with the Class of 2025). Reach me at pinlinxu [at] stanford [dot] edu, or any of the other links above. You can usually find me in the San Francisco Bay Area or British Columbia, Canada.
I have also been the vice president of the Stanford Kendo Club (Japanese sword-fighting martial art), and a member of the Stanford Storyboard Club and the Stanford Amateur Radio Club (Call Sign KN6YCY, FCC General Class). I speak English, Mandarin Chinese, and Japanese (Latin not really).
Some Projects
FLFL: Japanese Furigana Generation using Aligned Whisper Audiobook Transcription
2024
HuggingFace Trainer, axolotl, wandb, Modal.com
- Released finetuning datasets from parsing 4000 hours of public-domain audiobook data released by the Japanese National Diet Library
- Finetuned stockmark/gpt-neox-japanese-1.4b for furigana generation
- Evaluated performance against few-shot GPT-4, MeCab/Unidic (writeup incoming)
Non-Greedy Furigana String Generation
The Shades of Meaning: Investigating LLMs' Cross-lingual Representation of Grounded Structures
2024, CS 224N: Natural Language Processing with Deep Learning
Python, PyTorch, HuggingFace transformers, SciPy
- Led an outstanding CS 224N custom project
- Collected and built a dataset of cross-lingual cultural color words
- Performed a series of controlled experiments on how the quality of representations varies with language, model, context, and fine-tuning
- Introduced a novel color-mapping experiment that visualizes languages’ color representation
Predicting Hospital Length of Stay from Imbalanced Data
2024, CS 229: Machine Learning
Python, scikit-learn, XGBoost
- Built and presented a strong classification-regression pipeline using synthetic minority oversampling (SMOTE) and ensemble learning
Allegorical Lisp Machine
2023, CS107E: Computer Systems from the Ground Up
C, Lisp, ARMv6 Assembly
- A freestanding graphical Lisp environment on Raspberry Pi A+
- Implemented Lisp interpreter, system calls, exception handling, REPL, etc. from relevant papers
- Implemented memory allocation, bitmapped graphics, serial IO, math library, etc. in bare-metal C
- Wrote specifications, tracked progress, and assigned tasks as co-dev and project manager
- Name is a play on Symbolics
7GUIs with React + TypeScript + MobX
2024
React, TypeScript, MobX, Node.js
- A concise and accurate implementation of the 7GUIs challenge
- Fully reactive with state management & value derivation in MobX
Flow Tree-Style-Tab Browser
2019
SwiftUI, UIKit, WKWebView, Combine
- The first tree-style tab browser on iOS & iPadOS
- Used the then-latest native APIs to enable features such as iCloud sync, an ad blocker, dark and light mode, drag and drop, and multiwindow interactions
- 89.8K impressions and 2.6K downloads while it was on the App Store
Hikari Ray Tracer
2022
Typed Racket
- Implementation of The Ray Tracer Challenge in Typed Racket (PLT Scheme, Lisp dialect)
- Has the features of the book up to chapter 11, plus multithreaded rendering and focal blur
- Source code is tangled from the (not very) literate raytracer-challenge.org document.
math.c
2021
C, some Calculus
- Naive freestanding implementation of math.h
Other older things I worked on include solutions to SICP exercises (like 2.58) and all kinds of scripts: indexing / tagging local media, automatically building Anki decks from Kindle Vocabulary Builder using MeCab and MDict, scanning dactylic hexameter for homework, etc.