This project combines Qualtrics survey data with geocoding and U.S. Census demographics to understand why Peloton customers reduce their usage, and which geographic, income, and age patterns predict churn.
Peloton experienced rapid growth during the COVID-19 pandemic, followed by a significant decline in subscriber retention. This project investigates the factors behind customer churn using survey responses from 317 respondents, collected via Qualtrics and enriched with geographic and demographic data.
The final dataset was assembled from three sources through a multi-step process, combining survey responses with geocoded location data and U.S. Census demographics.
Qualtrics survey with lat/long coordinates, usage behavior, and churn status
GeoPy (using the Nominatim geocoder) reverse-geocodes the 317 coordinate pairs to ZIP codes
ZIP code join with 40,959-row census dataset for city, state, population, density
Demographics (income, age, gender) and behavior data merged into final dataset
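The four steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the field names (`lat`, `lon`, `churned`), the helper names, and the toy census row are all assumptions. The GeoPy lookup is imported lazily and swapped for an offline stand-in in the demo, so the sketch runs without network access.

```python
from typing import Callable, Dict, List, Optional, Tuple

Coord = Tuple[float, float]
Lookup = Callable[[Coord], Optional[str]]


def make_nominatim_lookup(user_agent: str) -> Lookup:
    """Build a (lat, lon) -> postcode lookup backed by GeoPy's Nominatim geocoder.

    geopy is imported here, not at module level, so the rest of the
    sketch runs even when geopy is not installed.
    """
    from geopy.geocoders import Nominatim
    from geopy.extra.rate_limiter import RateLimiter

    geolocator = Nominatim(user_agent=user_agent)
    # The public Nominatim server expects at most one request per second.
    reverse = RateLimiter(geolocator.reverse, min_delay_seconds=1)

    def lookup(coord: Coord) -> Optional[str]:
        location = reverse(coord, exactly_one=True)
        if location is None:
            return None
        return location.raw.get("address", {}).get("postcode")

    return lookup


def normalize_zip(postcode: Optional[str]) -> Optional[str]:
    """Reduce a postcode such as '94103-1234' to a 5-digit ZIP string."""
    if not postcode:
        return None
    digits = postcode.strip().split("-")[0]
    return digits.zfill(5) if digits.isdigit() else None


def enrich(respondents: List[Dict], census_by_zip: Dict[str, Dict], lookup: Lookup) -> List[Dict]:
    """Left-join survey rows to census rows on ZIP; unmatched rows are kept."""
    enriched = []
    for row in respondents:
        zip_code = normalize_zip(lookup((row["lat"], row["lon"])))
        census = census_by_zip.get(zip_code, {})
        enriched.append({**row, "zip": zip_code, **census})
    return enriched


# Toy stand-ins so the sketch runs offline. The population and density
# numbers are placeholders, not real census figures.
respondents = [
    {"id": 1, "lat": 37.77, "lon": -122.42, "churned": True},
    {"id": 2, "lat": 0.0, "lon": 0.0, "churned": False},
]
census_by_zip = {
    "94103": {"city": "San Francisco", "state": "CA", "population": 1000, "density": 50.0},
}
offline_lookup = {(37.77, -122.42): "94103-1234"}.get

enriched = enrich(respondents, census_by_zip, offline_lookup)
for row in enriched:
    print(row)
```

To run against the live service, `offline_lookup` would be replaced by `make_nominatim_lookup("your-app-name")`. Keeping ZIP codes as zero-padded strings rather than integers avoids silently dropping leading zeros before the join.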
Survey responses were distributed across the United States, with the heaviest concentration in California, New York, and other major coastal states.
A donut chart showing which states had the most survey respondents. California and New York dominate, reflecting Peloton's urban-coastal customer base.
The same distribution as a bar chart for precise comparison. The long tail reveals that Peloton has customers spread across nearly every state.
Most respondents live in cities with moderate population sizes. The distribution is right-skewed, with a tail of respondents in very large metro areas.
Customer churn — defined as a reduction in Peloton usage over the past 6 months — was the central variable of interest. The visualizations below explore churn patterns across geography and age.
Stacked bar chart showing churn (orange) vs. retained (blue) customers in each state. California has both the highest total responses and the highest absolute churn count.
Customer churn broken down by age group. Churn appears across all age brackets, but certain age groups show disproportionately higher churn rates.
Histogram of the overall age distribution, providing context for interpreting the churn-by-age chart above. The majority of respondents fall in the 30-50 age range.
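Absolute churn counts and churn rates can rank groups differently, which matters when reading the state and age charts. The sketch below computes both per age bracket. The 10-year bracket width, the field names, and the toy rows are assumptions for illustration, not the survey data.

```python
from collections import defaultdict


def age_bracket(age: int, width: int = 10) -> str:
    """Map an age to a label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def churn_by_bracket(rows):
    """Return churned count, total count, and churn rate per age bracket."""
    counts = defaultdict(lambda: {"churned": 0, "total": 0})
    for row in rows:
        cell = counts[age_bracket(row["age"])]
        cell["total"] += 1
        cell["churned"] += int(row["churned"])
    return {
        bracket: {**c, "rate": c["churned"] / c["total"]}
        for bracket, c in sorted(counts.items())
    }


# Made-up rows, chosen so that count and rate disagree.
rows = [
    {"age": 25, "churned": True},
    {"age": 28, "churned": False},
    {"age": 34, "churned": True},
    {"age": 36, "churned": True},
    {"age": 38, "churned": False},
    {"age": 41, "churned": False},
    {"age": 45, "churned": False},
    {"age": 62, "churned": True},
]

summary = churn_by_bracket(rows)
most_churned = max(summary, key=lambda b: summary[b]["churned"])
highest_rate = max(summary, key=lambda b: summary[b]["rate"])

for bracket, stats in summary.items():
    print(bracket, stats)
print("largest churn count:", most_churned)
print("highest churn rate:", highest_rate)
```

In this toy data the 30-39 bracket has the most churned respondents simply because it is the largest group, while the single respondent aged 60-69 gives that bracket the highest rate. Small brackets produce unstable rates, so the age histogram is needed to judge how much weight each rate deserves.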
To understand whether any numerical variables share linear relationships, pairwise plots and correlation matrices were generated for population, density, ZIP code, age, and income.
Scatterplot matrix exploring relationships between population, density, ZIP code, age, and income. No strong linear relationships are apparent between these variables.
The strongest correlation is between population and density (0.52), which is moderate; age and income show only a minor correlation. The matrix does not indicate a strong relationship among ZIP code, income, and age.
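For reference, a correlation matrix of this kind holds the Pearson coefficient for every pair of columns: the covariance of the two columns divided by the product of their standard deviations. The sketch below computes it with the standard library only. The column values are toy numbers, not the project's data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {name_a: {name_b: r}} for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy columns for illustration only.
columns = {
    "population": [1, 2, 3, 4, 5],
    "density": [2, 1, 4, 3, 5],
    "age": [5, 4, 3, 2, 1],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name.ljust(10), [round(r, 2) for r in row.values()])
```

As a rough guide to scale, a coefficient of 0.52 corresponds to an r² of about 0.27, meaning roughly a quarter of the variance in one variable is linearly associated with the other.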
This pivot table heatmap reveals two distinct churn patterns: (1) strong churn at lower income and higher age, and (2) a milder churn pattern at higher income and younger age (~29.5 years).
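A pivot table like the one behind this heatmap can be built by binning income and age, then taking the churn rate in each cell. The sketch below is one way to do that under stated assumptions: the band edges, labels, field names, and rows are all hypothetical, and the project may have used different bins or a library such as pandas (`pivot_table`) instead.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical band edges; the original bins are not given in the write-up.
AGE_EDGES = [35, 50]
AGE_LABELS = ["<35", "35-49", "50+"]
INCOME_EDGES = [50_000, 100_000]
INCOME_LABELS = ["<50k", "50k-100k", "100k+"]


def label(value, edges, labels):
    """Return the label of the band that contains value."""
    return labels[bisect_right(edges, value)]


def churn_pivot(rows):
    """Map (income band, age band) -> churn rate for every non-empty cell."""
    cells = defaultdict(lambda: [0, 0])  # [churned, total]
    for row in rows:
        key = (
            label(row["income"], INCOME_EDGES, INCOME_LABELS),
            label(row["age"], AGE_EDGES, AGE_LABELS),
        )
        cells[key][0] += int(row["churned"])
        cells[key][1] += 1
    return {key: churned / total for key, (churned, total) in cells.items()}


# Made-up rows for illustration, not survey results.
rows = [
    {"age": 62, "income": 40_000, "churned": True},
    {"age": 58, "income": 45_000, "churned": True},
    {"age": 29, "income": 120_000, "churned": True},
    {"age": 30, "income": 130_000, "churned": False},
    {"age": 40, "income": 70_000, "churned": False},
]

pivot = churn_pivot(rows)

# Print the grid with income bands as rows and age bands as columns.
print("income".ljust(9), AGE_LABELS)
for income in INCOME_LABELS:
    rates = [pivot.get((income, age)) for age in AGE_LABELS]
    print(income.ljust(9), ["n/a" if r is None else f"{r:.2f}" for r in rates])
```

Empty cells are reported as `n/a` rather than zero, so a missing combination of income and age is not mistaken for a segment with no churn.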
Two distinct churn profiles emerged from the analysis — suggesting that Peloton faces different retention challenges across customer segments.