~/xirui-huang $ run portfolio.sh

Xirui
Huang

BASc Systems Design Eng · UWaterloo · AI Specialization

01 — About

Building at the
edge of AI.

Hey — I'm Xirui, a 21-year-old engineering student at the University of Waterloo, pursuing a BASc in Systems Design Engineering with a Minor in AI and an Intelligent Systems Specialization.

I've done back-to-back co-ops at companies from Tesla to AI startups, working on everything from production ML pipelines and LLM fine-tuning to battery cell quality systems. My research has been accepted to NeurIPS and cited by Google DeepMind researchers.

I love working at the intersection of rigorous engineering and applied AI — systems that actually scale, models that actually ship.

school: University of Waterloo
gpa: 3.93 / 4.0 · First in Class
focus: AI/ML · Systems · Full-Stack
honours: Dean's Honor List 4×
status: Open to Fall 2026 roles
Xirui Huang
Campus and life
02 — Projects & Research

Selected
Work.

RESEARCH · NEURIPS 2024
Chopping Trees — SSDP paper
NeurIPS Efficient Reasoning Workshop
SSDP: Tree-of-Thought Pruning Framework

Co-authored a novel pruning framework for LLM reasoning chains. Achieves significant speedup by reducing search nodes while maintaining answer quality across mathematical benchmarks.

2.3× speedup · 85–90% node reduction · Cited by Google DeepMind
LLaMA · Qwen · HuggingFace · RunPod · FastAPI
RESEARCH · SICKKIDS INSTITUTE
Brain MRI views in 3D Slicer for AVM segmentation
ML Research
Brain AVM Segmentation for Pediatric Neurosurgery

Trained 2D U-Net and 3D V-Net on 1,000+ MRI/CT slices to segment brain arteriovenous malformations in pediatric patients. Automates a critical step in neurosurgical radiation planning.

85% Dice score · 1,000+ MRI/CT slices · HPC cluster deployment
PyTorch · U-Net · V-Net · HPC
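
For context, the Dice score quoted above measures the overlap between a predicted segmentation mask and the ground-truth mask: twice the size of their intersection divided by the sum of their sizes. Below is a minimal sketch, assuming both masks are flattened into equal-length sequences of 0/1 values. The function name `dice_score` and the empty-mask convention are illustrative assumptions, not taken from the project code.

```python
def dice_score(pred, target):
    """Dice coefficient 2*|A ∩ B| / (|A| + |B|) for two flat binary masks.

    Hypothetical helper for illustration; assumes 0/1 (or bool) entries.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")

    # Count voxels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Count foreground voxels in each mask separately.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    ground_truth = [1, 0, 0, 0]
    print(dice_score(predicted, ground_truth))
```

A score of 1.0 means the two masks match exactly and 0.0 means they share no foreground voxels, so 85% indicates substantial but imperfect overlap with the reference annotations.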
PROJECT 03 · SEQUENCE HOLDINGS
Chat mining and conversation analytics
AI Pipeline
Chat Mining & Conversation Analytics Platform

Built a pipeline that clusters 40K+ client conversations via HDBSCAN on vector embeddings, surfacing recurring and failing workflows. Includes real-time Slack alerts, a client leaderboard, and BigQuery RBAC.

40K+ conversations · Adopted by 80%+ of users at a portfolio company
Python · HDBSCAN · BigQuery · Trigger.dev · Databricks
PROJECT 04 · WISEDOCS
LoRA-style fine-tuning diagram
LLM Fine-tuning
LLM Entity Extraction & Gemini Fine-tuning

LLM-powered entity extraction for OCR-scanned PDFs using the Claude API with structured JSON output. Also fine-tuned Google Gemini on multimodal tasks, outperforming prompt engineering baselines.

96% extraction accuracy · 91% inference accuracy after fine-tuning
Claude API · Gemini · FastAPI · Kubernetes · Snowflake
03 — Experience

Where I've
Worked.

Jan 2026 — Apr 2026
Sequence Holdings
AI Software Engineering Intern — Forward-Deployed
New York, New York
  • Architected a chat mining pipeline clustering 40K+ client conversations via HDBSCAN on vector embeddings through Trigger.dev & Databricks, surfacing failing workflows for template creation
  • Built a conversational template generator agent with multimodal input and multi-turn refinement via LangChain tools, adopted by 80%+ of users at a portfolio company
  • Rebuilt the templates platform into a marketplace with folders, sharing, AI suggestions, and a zero-downtime feature-flag rollout
  • Rebuilt a portfolio company's user directory, consolidating 3 data sources with cross-env sync pipelines and migrating 4 services with zero downtime
Python · LangChain · BigQuery · Databricks · MongoDB · Terraform · Auth0
Sequence — late night team dinner and build
Sequence — desk, GitHub, and shipping
Sequence — New York office
May 2025 — Sep 2025
Wisedocs
Machine Learning Intern
Toronto, Ontario
  • Developed an LLM-powered entity extraction system for OCR-scanned PDFs using the Claude API with structured JSON output, reaching 96% accuracy on a held-out test set
  • Fine-tuned Google Gemini on a multimodal task, reaching 91% inference accuracy and outperforming prompt engineering baselines; included a cost-benefit analysis against GPT-4 and Claude
  • Designed a FastAPI + PostgreSQL prompt store with versioning, deployed to Kubernetes via CI/CD with Terraform-based Grafana dashboards and E2E pytests
PyTorch · Claude API · Gemini · FastAPI · Kubernetes · Snowflake
Toronto skyline from Wisedocs co-op
Wisedocs team
Sep 2024 — Dec 2024
Tesla
TPM Cell Engineering Intern
Austin, Texas · Gigafactory
  • Coordinated 20+ engineers to install high-speed vision upgrades on a Cybertruck battery cell production line, decreasing wasted cathode foil by 5.1%
  • Led on-time install of 3,000+ vision parts and drove RFQ & bidding processes for next-gen machine upgrades, acquiring $2.6M in materials and services
Program Management · Manufacturing Systems · Data Analysis
Tesla Gigafactory
Tesla factory floor
04 — Stack

Tech I
Work With.

Languages & Frameworks

Python · TypeScript · C++ · React · Next.js · FastAPI · Flask · Tailwind · SQL

AI / ML

PyTorch · TensorFlow · LangChain · HuggingFace · scikit-learn · pgvector · Pinecone · RunPod

Cloud & Systems

Docker · Kubernetes · AWS · GCP · Terraform · Auth0 · Trigger.dev · Doppler

Data

PostgreSQL · Snowflake · Databricks · BigQuery · MongoDB · DynamoDB · Supabase · Neon
06 — Beyond the Terminal

Other
Interests.

Sports

Soccer since I was young, hockey because I'm Canadian, golf for the suffering.

On the ice — intramural hockey
Soccer and golf
Mind games

Chess for structure and long-term calculation; poker for reading people, variance, and making the right call when information is incomplete. Same love of strategy, different clock speeds.

Chess
Poker
🥾
Hiking

Best way to get off a screen. Ontario trails, weekend getaways to the Rockies, and the occasional "this seemed like a good idea" elevation gain.

Trail and view
Summit hike
07 — Contact

Let's build
something great.

Open to Fall 2026 internship opportunities, interesting projects, and good conversations.

xirui@contact — bash
~ $ cat contact.json
{
  "github": "github.com/xrhuang10",
  "resume": "[download pdf] ↗"
}
~ $