Improving the Notebook Agent with Error Clustering
How we used clustering on the Notebook Agent's reasoning traces to figure out where it was getting stuck, and what we did about it.
LLM-powered applications, particularly those with chat interfaces, appear to suffer from significant user churn. For many reasons, including not understanding how to use the app, getting bad or unreliable results, or simply finding the app too tedious to use, users often abandon these tools nearly as quickly as they adopted them. Knowledge workers are tired of juggling multiple apps, and are looking for ways to pare down their tool stack to the bare minimum needed to get their work done.
Anyone in software engineering knows by now that being able to learn and get up to speed on new topics, libraries, or languages is a critical skill, both for onboarding onto a new team and for continuing to provide value on an existing one.
Recently someone I know needed to install Python, and as is completely normal and expected for newcomers, was confused by the process. In this post I explain what's actually happening when you install Python and add a few miscellaneous tips for development environment management in general.
Charlene Chambliss is a senior software engineer at Aquarium Learning, where she's working on tooling to help ML teams improve their model performance by improving their data. In addition to being an incredible engineer with an inspiring backstory, Charlene previously worked on NLP applications at Primer.AI. In this blog post, we interview Charlene about her experiences working with older models like BERT, and the perspective this gives her on the more recent wave of generative, RLHF-based LLMs (e.g. GPT-4 and LLaMA).
I built a token classification model using DistilBERT to provide a lightweight and fast method for extracting foods and ingredients from structured and unstructured text. This model can aid analysis of how foods are talked about and represented in various sources, in both research and commercial contexts.
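To make the token-classification setup concrete: a model like this tags each token with a BIO label, and a small decoding step groups those labels into entity spans. Below is a minimal sketch of that decoding step; the `B-FOOD`/`I-FOOD`/`O` label scheme and the example sentence are illustrative assumptions, not taken from the actual model.

```python
# Sketch: decode BIO-tagged tokens (as a DistilBERT token-classification
# head would emit) into (entity_text, label) spans. Labels here are
# assumed to follow the common "B-FOOD" / "I-FOOD" / "O" scheme.

def bio_to_spans(tokens, labels):
    """Group BIO-tagged tokens into (entity_text, label) spans."""
    spans, current, current_label = [], [], None
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A new entity begins; flush any entity in progress.
            if current:
                spans.append((" ".join(current), current_label))
            current, current_label = [token], label[2:]
        elif label.startswith("I-") and current_label == label[2:]:
            # Continuation of the current entity.
            current.append(token)
        else:
            # "O" tag (or a stray "I-") ends any entity in progress.
            if current:
                spans.append((" ".join(current), current_label))
            current, current_label = [], None
    if current:
        spans.append((" ".join(current), current_label))
    return spans

tokens = ["I", "love", "green", "tea", "and", "rice"]
labels = ["O", "O", "B-FOOD", "I-FOOD", "O", "B-FOOD"]
print(bio_to_spans(tokens, labels))
# [('green tea', 'FOOD'), ('rice', 'FOOD')]
```

In practice the labels would come from the model's per-token predictions (with subword tokens merged back into words first), but the span-grouping logic is the same.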
This post is part of a series in which I highlight a few of the questions I get asked most often about DS/ML, what it takes to get in, and what it's like once you're there.
This post is part of a series in which I highlight a few of the questions I get asked most often about DS/ML, what it takes to get in, and what it's like once you're there.
This post is part of a series in which I highlight a few of the questions I get asked most often about DS/ML, what it takes to get in, and what it's like once you're there.
I'm an alum of the SharpestMinds data science mentorship program, and one of my fellow mentees recently interviewed me for her Women in Technology Series.
A two-part series I wrote on how to fine-tune BERT for named-entity recognition, a core information extraction task.
A tutorial using `pandas`, `matplotlib`, and `seaborn` to produce digestible insights from dirty customer survey data.
As part of my efforts to learn in public earlier on in my data science journey, I wrote this article on an end-to-end analysis I did on a dataset of news headlines.