Founding Engineer
SoulBio
- Client-facing systems
- Developed and deployed open-source drug targetability and selectivity models, reducing hypothesis validation time by ~40% for oncology-focused drug discovery teams.
- Re-architected bulk and single-cell RNA-seq ingestion and computation workflows on a ~1 TB Postgres database, optimizing large-scale table operations involving millions of daily row updates and reducing processing time by ~10× (from ~1 day to a few hours).
- Built and productionized RNA-seq pipelines, internal APIs, and CI/CD automation to support large-scale biological data workflows and downstream analysis.
- Built and deployed an internal platform for visualizing and analyzing in-house data, supporting computation-heavy downstream analyses.
- Maintained AWS cloud infrastructure for all deployed applications, managing reliability, scaling, and cost.
- Authored technical documentation and PRDs for internal platforms and new client projects.
- Internal platform & research
- Co-authored a research paper analyzing GPT-based cell type annotations and highlighting their limitations in real biological settings.
- Co-developed and deployed an LLM-based agent enabling natural language search across ~250,000 GEO datasets.
- Proposed and built an AI agent to automate complex RNA-seq bioinformatics workflows, reducing manual overhead and allowing scientists to focus on interpretation rather than pipeline orchestration.
- Wrote technical blog posts and whitepapers for external communication and outreach.