Creating a segmentation strategy with Python
Analyze purchase patterns to create potential customer segments
Objectives
Identify Instacart's busiest and least busy days and times, including whether certain times correlate with higher spending
Evaluate the popularity of different product categories, looking at differences overall and by demographic
Compare ordering behaviors based on buyer characteristics, such as loyalty status, purchase frequency, and demographics
Use demographic data to create buyer segments with different purchase patterns
Context
Grocery delivery app Instacart serves a range of customers from all over the United States. In this student project for CareerFoundry, I used Python to identify commonalities in how people shop, from the most popular purchase days to which demographics spend more on the platform.
Data
This project uses real Instacart order data alongside fabricated consumer data developed for educational purposes. Instacart data sets are publicly available on Kaggle.
Professional Competencies
Data profiling & cleaning
Data wrangling & subsetting
Variable derivation
Excel reporting
Storytelling with data
Data visualization with Python
Phase 1: Preparing the Data
Before conducting the required analysis, I needed to learn how to clean and wrangle a dataset in Python. As I had never done any kind of coding, this required patience and a commitment to problem-solving. Using the resources at my disposal, including mentor support and online documentation, I cleaned and checked the individual dataframes, then merged them to create the final version I needed.
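A minimal sketch of what a check, clean, and merge pass can look like in pandas. The two small dataframes, their column names, and their values are illustrative assumptions, not the project's actual files.

```python
import pandas as pd

# Toy stand-ins for the order and customer files (hypothetical values).
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "user_id": [10, 11, 11, 12],
    "order_dow": [0, 3, 3, 6],
})
customers = pd.DataFrame({
    "user_id": [10, 11, 13],
    "income": [52000.0, 61000.0, None],
})

# Check: count fully duplicated rows and missing values.
duplicate_orders = int(orders.duplicated().sum())
missing_income = int(customers["income"].isna().sum())

# Clean: drop exact duplicates and customers with no income recorded.
orders_clean = orders.drop_duplicates()
customers_clean = customers.dropna(subset=["income"])

# Merge: an outer join with indicator=True adds a "_merge" column that
# shows which rows matched in both dataframes.
merged = orders_clean.merge(
    customers_clean, on="user_id", how="outer", indicator=True
)

# Keep only the rows found in both sources for the final dataframe.
final = merged[merged["_merge"] == "both"].drop(columns="_merge")
```

Inspecting the "_merge" column before filtering is one way to confirm that the join behaved as expected and that no large group of orders was silently dropped.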
Phase 2: Deriving and Analyzing New Variables
To answer Instacart's key business questions, I needed to derive new variables and analyze the basic distribution within those groupings. This required a working knowledge of:
If-statements
For-loops
User-defined functions
The pandas .loc[] indexer
The .loc[] indexer proved the most useful in this case: it selects rows by condition, which let me derive each new variable and then analyze its distribution.
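As an illustration of the techniques listed above, here is a hedged sketch that derives a price-range label in two ways: with condition-based .loc assignment, and with a user-defined function built from if-statements. The column names, the labels, and the price thresholds are assumptions chosen for the example.

```python
import pandas as pd

# Toy product data (hypothetical values).
df = pd.DataFrame({
    "product_id": [1, 2, 3, 4],
    "prices": [3.5, 9.0, 14.2, 22.0],
})

# Approach 1: .loc with boolean conditions. Each line labels every row
# that meets the condition; the thresholds 5 and 15 are assumptions.
df.loc[df["prices"] <= 5, "price_range"] = "Low-range product"
df.loc[(df["prices"] > 5) & (df["prices"] <= 15), "price_range"] = "Mid-range product"
df.loc[df["prices"] > 15, "price_range"] = "High-range product"


# Approach 2: a user-defined function with if-statements, applied row by row.
def label_price(price):
    if price <= 5:
        return "Low-range product"
    elif price <= 15:
        return "Mid-range product"
    return "High-range product"


df["price_range_fn"] = df["prices"].apply(label_price)

# Basic distribution of the derived variable.
distribution = df["price_range"].value_counts(dropna=False)
```

Both approaches produce the same labels here. The .loc version works on whole columns at once, which tends to matter on a dataset with millions of rows.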
Phase 3: Identify Ordering Differences Across Customer Profiles
Once I had the variables I needed, it was time to analyze buying patterns across customer segments. I drew on my experience in personalized digital marketing and considered what combinations of customer characteristics might impact buying patterns.
The coding aspect of this task was particularly challenging, as I needed to go beyond the course instructions to conduct more advanced segmentation. Combining advice from peers with the official Matplotlib documentation, I learned to produce grouped (multi-bar) charts, which I used to compare customer behavior across segments.
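A minimal sketch of a multi-bar chart in Matplotlib, assuming a crosstab of two hypothetical derived variables (income_group and price_range). The data is invented for the example and does not reproduce the project's results.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Toy order-level data with two derived segment variables (hypothetical).
orders = pd.DataFrame({
    "income_group": ["Low", "Low", "Middle", "Middle", "Middle", "High"],
    "price_range": ["Low-range", "Mid-range", "Low-range",
                    "Mid-range", "Mid-range", "Mid-range"],
})

# Count orders for every combination of the two variables.
counts = pd.crosstab(orders["income_group"], orders["price_range"])

x = np.arange(len(counts.index))      # one slot per income group
width = 0.8 / len(counts.columns)     # split each slot between the bars

fig, ax = plt.subplots()
for i, column in enumerate(counts.columns):
    # Shift each series sideways so the bars sit next to each other.
    ax.bar(x + i * width, counts[column], width=width, label=column)

# Center the tick labels under each group of bars.
ax.set_xticks(x + width * (len(counts.columns) - 1) / 2)
ax.set_xticklabels(counts.index)
ax.set_xlabel("Income group")
ax.set_ylabel("Number of orders")
ax.legend(title="Price range")
```

Calling fig.savefig with a file name, or plt.show() in an interactive session, would display the result. pandas also offers counts.plot.bar() as a shorter route to a similar chart.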
I was less satisfied with the aesthetics of the Python visualizations. Ultimately, I decided to present these visualizations as-is to provide a complete picture of my results. I was confident in my ability to transfer the information to Tableau should the need arise for a more polished presentation.
Phase 4: Present Recommendations for Audience Segmentation
My analysis showed middle-income parents to be the biggest generators of revenue for Instacart, with most customers across demographic groups buying lower-ticket items. However, the data showed minimal difference in preferred product types and ordering cadence between demographic groups.
Instead of seeing this as a failure, I presented these insights as a money-saving opportunity: Instacart did not need to create complex product-focused strategies for each segment.
The company could focus on broader trends, such as the overall preference for mid-priced products at the middle- and low-income levels. Instacart had a starting point for its strategy and could use many of its current non-segmented resources.
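The revenue comparison across segments mentioned in this phase can be sketched with a groupby. The column names (income_group, has_dependents, order_total) and all values are hypothetical and only show the shape of the calculation, not the project's figures.

```python
import pandas as pd

# Toy order totals per customer segment (invented values).
orders = pd.DataFrame({
    "income_group": ["Low", "Middle", "Middle", "High", "Middle"],
    "has_dependents": [False, True, True, False, False],
    "order_total": [20.0, 45.0, 55.0, 60.0, 20.0],
})

# Total revenue per segment, then each segment's share of all revenue.
revenue = orders.groupby(["income_group", "has_dependents"])["order_total"].sum()
revenue_share = (revenue / revenue.sum()).sort_values(ascending=False)

# The first index entry is the segment with the largest share.
top_segment = revenue_share.index[0]
```

Ranking segments by share of revenue rather than by raw totals makes it easier to judge whether a segment is large enough to justify its own strategy.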
Recommendations for Further Analysis
How does buying data change throughout the year? Understanding seasonal buying patterns can inform a segmented strategy. For example, do parents spend more during the holiday season?
Does past buying history predict future needs? Customers often order from the same categories. Would it be more cost-efficient for Instacart to segment based on history versus demographics?