Turtle Games - Data Analytics Portfolio

William MacDonald

Executive Summary

This analysis provides strategic insights to boost Turtle Games' sales performance through data-driven decision making. Key findings include:

Identified 5 distinct customer segments for targeted marketing
Developed a highly accurate model (98% R-squared) for predicting loyalty points
Uncovered sentiment trends in customer reviews to inform product strategy
Created tailored product recommendations for each customer segment

These insights can drive significant improvements in customer engagement, loyalty, and overall sales performance.

Introduction

Turtle Games, a leading game manufacturer and retailer, aims to enhance its sales performance through data-driven strategies. This analysis addresses four key areas:

Customer engagement with loyalty points
Customer segmentation for targeted marketing
Utilisation of customer review data
Predictive modelling of loyalty points

Loyalty Points Modelling

I developed and compared various regression models to predict customer loyalty points, a key indicator of customer engagement and potential sales.

Methodology

After data cleaning and normalisation, I tested several regression models:

Simple and Multiple Linear Regression
Decision Trees and Random Forests
Support Vector Regression (SVR)
Gradient Boosting

Results

Gradient Boosting with cross-validation was the best-performing model, achieving an R-squared value of 98% and a Mean Squared Error (MSE) of 33,311. The SVR model followed closely with an MSE of 36,593.
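As a reminder of what the two reported metrics measure, here is a minimal sketch in plain Python. It does not reproduce the Gradient Boosting model itself. The function names, the contiguous fold layout, and the sample values are assumptions made for illustration, not taken from the original analysis.

```python
from statistics import mean


def mean_squared_error(y_true, y_pred):
    """Average of the squared residuals."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot: the share of variance the model explains."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


# Hypothetical loyalty point values, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
example_mse = mean_squared_error(actual, predicted)
example_r2 = r_squared(actual, predicted)
```

In a cross-validated comparison, each candidate model would be fitted on the training indices of every fold and scored on the held-out indices, with the fold scores averaged. Note that MSE is expressed in squared loyalty points, so its size depends on the scale of the target.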

Feature Importance

Spending score and remuneration were identified as the most significant predictors of loyalty points. While other variables had minimal individual impact, their collective contribution improved overall model accuracy.

Customer Segmentation

I employed K-means clustering to segment customers based on their spending patterns and remuneration levels. This segmentation allows for more targeted marketing strategies.

Methodology

K-means clustering is an unsupervised machine learning technique that groups similar data points around a chosen number of cluster centres. I used the elbow and silhouette methods to determine the optimal number of clusters.
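The elbow and silhouette checks can be sketched in plain Python as follows. This is an illustrative implementation of Lloyd's algorithm rather than the code used in the analysis. The sample (remuneration, spending score) pairs, the function names, and the seed are all hypothetical.

```python
import math
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm. Returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        converged = new_labels == labels and new_centroids == centroids
        labels, centroids = new_labels, new_centroids
        if converged:
            break
    # Inertia: within-cluster sum of squared distances (the elbow-plot quantity).
    inertia = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, inertia


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical (remuneration, spending score) pairs, for illustration only.
customers = [
    (80, 15), (85, 20), (90, 10),
    (20, 15), (25, 20), (18, 10),
    (50, 50), (55, 45), (48, 52),
    (20, 80), (25, 75), (18, 85),
    (80, 85), (85, 80), (90, 90),
]

# Scan candidate values of k and record both diagnostics.
diagnostics = {}
for k in range(2, 8):
    labels, _, inertia = kmeans(customers, k)
    diagnostics[k] = (inertia, silhouette_score(customers, labels))
```

The elbow method looks for the value of k after which inertia stops falling sharply, while the silhouette method favours the k with the highest mean score. Because a single random initialisation can settle in a local optimum, a real analysis would normally repeat each fit with several seeds and keep the lowest-inertia result.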

Cluster Analysis

Key Insights and Recommendations

Cluster 0: High income, low spenders. Focus on increasing their spending scores through targeted promotions and loyalty programmes.
Cluster 1: Low income, low spenders. Recommend affordable, high-value products to increase engagement.
Cluster 2: Middle-income earners. Suggest a mix of mid-range products to maintain consistency.
Cluster 3: Low income, moderate spenders. Provide a mix of affordable and aspirational products.
Cluster 4: High income, high spenders. Prioritise retention efforts, as they represent the most loyal and likely the most profitable customers.

SVR Model and Cluster Analysis Visualisation

This visualisation combines the SVR model for loyalty points prediction with the K-means clustering analysis, providing a comprehensive view of customer segments and their predicted loyalty.

Sentiment Analysis

I analysed customer reviews to gain insights into product perception and overall customer satisfaction. This information is crucial for product development and marketing strategies.

Methodology

I cleaned the review data and compared two popular Natural Language Processing (NLP) libraries: TextBlob and VADER (Valence Aware Dictionary and Sentiment Reasoner).

VADER classified a higher proportion of reviews as positive than TextBlob did. After manually inspecting the reviews on which the two libraries disagreed, I chose VADER because its sentiment labels were more accurate.
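The comparison step can be sketched as follows, assuming each library has already produced one polarity score between -1 and 1 per review (TextBlob's polarity and VADER's compound score both fall in that range). The ±0.05 cut-off follows the convention suggested in VADER's documentation. Applying it to both libraries is an assumption, and the function names, reviews, and scores below are hypothetical placeholders rather than real library output.

```python
def label(score, threshold=0.05):
    """Map a polarity score in [-1, 1] to a sentiment label."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def positive_share(scores, threshold=0.05):
    """Fraction of scores labelled positive."""
    return sum(label(s, threshold) == "positive" for s in scores) / len(scores)


def divergent_reviews(reviews, scores_a, scores_b, threshold=0.05):
    """Return (text, label_a, label_b) for reviews where the two labels differ."""
    flagged = []
    for text, a, b in zip(reviews, scores_a, scores_b):
        label_a, label_b = label(a, threshold), label(b, threshold)
        if label_a != label_b:
            flagged.append((text, label_a, label_b))
    return flagged


# Hypothetical reviews and scores, for illustration only.
sample_reviews = ["Great game", "It arrived on time", "Pieces were missing"]
sample_textblob = [0.6, 0.0, -0.4]
sample_vader = [0.7, 0.3, -0.5]

to_inspect = divergent_reviews(sample_reviews, sample_textblob, sample_vader)
```

Only the flagged reviews need to be read by hand, which keeps the manual inspection manageable even for a large review set.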

Key Findings

Overall positive sentiment towards Turtle Games products
Identification of specific products with consistently positive or negative sentiment
Insights into common themes in customer feedback

Product Recommendations

By combining insights from the loyalty points modelling, customer segmentation, and sentiment analysis, I developed personalised product recommendations for each customer cluster.

Overall Product Performance

This scatter plot shows the overall performance of products based on sentiment and popularity, helping identify top-performing and underperforming products.
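The two axes of such a plot can be derived with a simple aggregation, sketched below. It assumes that popularity is measured by the number of reviews per product, which the original does not state. The product IDs, scores, and the minimum-review filter are likewise hypothetical.

```python
from collections import defaultdict


def product_performance(scored_reviews):
    """Aggregate (product_id, sentiment_score) pairs.

    Returns {product_id: (mean_sentiment, review_count)}.
    """
    totals = defaultdict(lambda: [0.0, 0])  # product -> [sum of scores, count]
    for product_id, score in scored_reviews:
        totals[product_id][0] += score
        totals[product_id][1] += 1
    return {pid: (total / count, count) for pid, (total, count) in totals.items()}


def rank_products(performance, min_reviews=2):
    """Rank products by mean sentiment, then review count, best first.

    Products with fewer than min_reviews reviews are left out so that a
    single enthusiastic review cannot top the ranking.
    """
    eligible = [
        (pid, mean_s, n) for pid, (mean_s, n) in performance.items() if n >= min_reviews
    ]
    return sorted(eligible, key=lambda row: (row[1], row[2]), reverse=True)


# Hypothetical scored reviews, for illustration only.
sample = [("P1", 0.8), ("P1", 0.6), ("P2", -0.2), ("P2", 0.0), ("P3", 0.9)]
performance = product_performance(sample)
ranking = rank_products(performance)
```

Plotting mean sentiment against review count for every product gives the scatter described above, and the head of the ranking gives a candidate top-products list.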

Top 10 Recommended Products

These are the top 10 products recommended across all customer segments.

Cluster-Specific Recommendations

Cluster 0: High Income, Low Spenders

Recommend top products in Cluster 4, as they have similar remuneration but differing spending scores.

Cluster 1: Low Income, Low Spenders

Recommend top products in Cluster 3, as they have similar remuneration but differing spending scores.

Cluster 2: Moderate Income and Spending

Suggest a mix of mid-range products to maintain consistency. Recommend top-rated products found within this cluster.

Cluster 3: Low Income, Moderate Spenders

Recommend top products found within this cluster.

Cluster 4: High Income, High Spenders

Recommend top products found within this cluster.
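The cluster-specific rules above amount to a small lookup: Clusters 0 and 1 borrow the top products of the higher-spending cluster with similar remuneration, and the remaining clusters use their own. A minimal sketch, in which the product IDs and function name are hypothetical:

```python
# Cluster whose top products are recommended to each target cluster,
# following the rules listed above.
RECOMMEND_FROM = {0: 4, 1: 3, 2: 2, 3: 3, 4: 4}


def recommend(cluster, top_products_by_cluster, n=3):
    """Return up to n recommended products for the given customer cluster."""
    source = RECOMMEND_FROM[cluster]
    return top_products_by_cluster[source][:n]


# Hypothetical top-product lists per cluster, for illustration only.
top_products = {
    0: ["A1", "A2", "A3"],
    1: ["B1", "B2", "B3"],
    2: ["C1", "C2", "C3"],
    3: ["D1", "D2", "D3"],
    4: ["E1", "E2", "E3"],
}

cluster_0_picks = recommend(0, top_products, n=2)
```

Keeping the mapping in one dictionary makes it easy to revise if the segments are re-clustered.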
