Creating a segmentation strategy with Python

Analyze purchase patterns to create potential customer segments
  • Identify Instacart's busiest and least busy days and times, including whether certain times correlate with higher spending

  • Evaluate the popularity of different product categories, looking at differences overall and by demographic

  • Compare ordering behaviors based on buyer characteristics, such as loyalty status, purchase frequency, and demographics

  • Use demographic data to create buyer segments with different purchase patterns

Context

Grocery delivery app Instacart serves a range of customers from all over the United States. In this student project for CareerFoundry, I used Python to identify commonalities in how people shop, from the most popular purchase days to which demographics spend more on the platform.

Professional Competencies

  • Data profiling & cleaning

  • Data wrangling & subsetting

  • Variable derivation

  • Excel reporting

  • Storytelling with data

  • Data visualization with Python

Data

This project uses real Instacart order data alongside fabricated consumer data developed for educational purposes. Instacart data sets are publicly available on Kaggle.

Phase 1: Preparing the Data

Before conducting the required analysis, I needed to learn how to clean and wrangle a dataset in Python. Because I had never done any kind of coding, this required patience and a commitment to problem-solving. Using the resources at my disposal, including mentor support and online documentation, I was able to clean and check the individual dataframes and merge them to create the final version I needed.
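
The checking, cleaning, and merging steps described above could look something like the following minimal sketch. The two small tables and their column names (order_id, user_id, days_since_prior_order, income) are hypothetical stand-ins, not the project's actual schema.

```python
import pandas as pd

# Hypothetical miniature tables; names and values are illustrative assumptions.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "user_id": [10, 11, 11, 12],
    "days_since_prior_order": [float("nan"), 7.0, 7.0, 30.0],
})
customers = pd.DataFrame({
    "user_id": [10, 11, 13],
    "income": [52000, 87000, 41000],
})

# Check: count missing values per column and fully duplicated rows.
missing_counts = orders.isnull().sum()
duplicate_rows = orders.duplicated().sum()

# Clean: drop exact duplicate rows. Missing values are left in place here,
# on the assumption that they carry meaning rather than being errors.
orders_clean = orders.drop_duplicates()

# Merge: keep only users present in both tables. indicator=True adds a
# "_merge" column showing where each row was matched.
merged = orders_clean.merge(customers, on="user_id", how="inner", indicator=True)

print(missing_counts)
print("Duplicate rows:", duplicate_rows)
print(merged)
```

An inner join is only one option. A left or outer join would keep unmatched rows, and the _merge column makes it easy to see how many rows each choice would affect.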

Phase 2: Deriving and Analyzing New Variables

To answer Instacart's key business questions, I needed to derive new variables and analyze the basic distribution within those groupings. This required a working knowledge of:

  • If-statements

  • For-loops

  • User-defined functions

  • The pandas .loc indexer

The .loc indexer proved the most useful in this case, and I was able to analyze the distribution of individual variables.
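
As a rough illustration of deriving a variable with pandas' .loc and then checking its distribution, here is a minimal sketch. The max_order column, the loyalty_flag labels, and the thresholds are all assumptions made for the example, not the values used in the project.

```python
import pandas as pd

# Hypothetical data: the highest order number recorded for each customer.
df = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "max_order": [45, 12, 8, 41],
})

# Derive a new column by assigning a label to the rows that match each
# condition. The cut-offs (10 and 40) are illustrative assumptions.
df.loc[df["max_order"] > 40, "loyalty_flag"] = "Loyal customer"
df.loc[(df["max_order"] > 10) & (df["max_order"] <= 40), "loyalty_flag"] = "Regular customer"
df.loc[df["max_order"] <= 10, "loyalty_flag"] = "New customer"

# Basic distribution of the derived variable.
distribution = df["loyalty_flag"].value_counts()
print(distribution)
```

Because each condition is applied to the whole column at once, this approach avoids looping over rows, which matters on a dataset with millions of orders.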

Phase 3: Identify Ordering Differences Across Customer Profiles

Once I had the variables I needed, it was time to analyze buying patterns across customer segments. I drew on my experience in personalized digital marketing and considered what combinations of customer characteristics might impact buying patterns.

The coding aspect of this task was particularly challenging, as I needed to go beyond the provided instructions within the course to conduct more advanced segmentation. Combining the advice I received from peers and the documentation on Matplotlib's home site, I learned to produce multi-bar graphs. I used this strategy to analyze customer behavior.
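
A multi-bar (grouped bar) chart of this kind can be sketched as follows. The income_group and parent_status columns and their values are hypothetical, chosen only to show the technique of offsetting one set of bars per segment.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical customer profile data; names and values are assumptions.
df = pd.DataFrame({
    "income_group": ["Low", "Low", "Middle", "Middle", "High", "High", "Middle"],
    "parent_status": ["Parent", "Non-parent", "Parent", "Non-parent",
                      "Parent", "Non-parent", "Parent"],
})

# Count customers for every combination of the two characteristics.
counts = pd.crosstab(df["income_group"], df["parent_status"])

x = np.arange(len(counts.index))   # one slot per income group
width = 0.8 / len(counts.columns)  # share each slot among the bar series

fig, ax = plt.subplots()
for i, column in enumerate(counts.columns):
    # Shift each series sideways so the bars sit next to each other.
    ax.bar(x + i * width, counts[column], width, label=column)

# Centre the tick labels under each group of bars.
ax.set_xticks(x + width * (len(counts.columns) - 1) / 2)
ax.set_xticklabels(counts.index)
ax.set_xlabel("Income group")
ax.set_ylabel("Number of customers")
ax.legend(title="Parent status")
```

Calling fig.savefig with a file name, or plt.show() under an interactive backend, would then render the chart.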

I was less satisfied with the aesthetics of the Python visualizations. Ultimately, I decided to present these visualizations as-is to provide a complete picture of my results. I was confident in my ability to transfer the information to Tableau should the need arise for a more polished presentation.

Phase 4: Present Recommendations for Audience Segmentation

My analysis showed middle-income parents to be the biggest generators of revenue for Instacart, with most customers buying lower-ticket items across demographic groups. However, the data showed minimal difference in preferred product types and ordering cadence between demographic groups.

Instead of seeing this as a failure, I presented these insights as a money-saving opportunity: Instacart did not need to create complex product-focused strategies for each segment.

The company could focus on broader trends, such as the overall preference for mid-priced products at the middle- and low-income levels. Instacart had a starting point for its strategy and could use many of its current non-segmented resources.

Recommendations for Further Analysis

  • How does buying data change throughout the year? Understanding seasonal buying patterns can inform a segmented strategy. For example, do parents spend more during the holiday season?

  • Does past buying history predict future needs? Customers often order from the same categories. Would it be more cost-efficient for Instacart to segment based on history versus demographics?