BUSINESS BACKGROUND¶

  • Pizza Hut was the first international pizza brand in Vietnam, present since 2006. With dynamic positioning targeting young customers, it has become the market leader in both revenue and store count, with 130 stores.

  • During 2019-2023, Covid significantly changed consumer behavior in the F&B industry. Customers now seek not only a quality product but also a richer culinary experience. To maintain its leading position, Pizza Hut has implemented many renovation campaigns to adapt to these changing patterns.

  • In 2023, Pizza Hut opened two signature stores with new design concepts to improve the dine-in experience. It also reduces customers' queuing time with a Bring Your Own Device (BYOD) feature for ordering at the table, which in turn encourages customers to engage with online ordering on the website or through the in-house app.

  • In terms of R&D, the brand regularly introduces product innovations such as ‘Hot pot pizza’, ‘Crispy shrimp pancake pizza’, or ‘Silkworm pizza’. This unique fusion practice has helped a Western-origin dish become more familiar to Vietnamese tastes.

  • Pizza Hut's communication activities also attract vibrant discussion on social networks thanks to their youthful language, unique approach, and novel ideas, in a market where fast-food brands' messaging focuses mainly on promotions.

  • However, the market is becoming more and more competitive due to the emergence of new competitors and the diversity of culinary options. Beyond the renovation activities mentioned above, gaining additional insights from data, such as Customer lifetime value (CLV) and customer churn rate, is critical to sustaining business growth.

  • The Business Analyst team is required to report to the Head of Department an accurate evaluation of the company, a forecast of its future performance, and feasible recommendations.

OUTLINE¶

  1. Import Data, EDA and RFM Segmentation
  • Import Data
  • EDA
  • RFM Segmentation
  2. Churn and CLV Prediction
  • Churn Risk Modeling
  • CLV Model
  3. Final Model for CLV and Churn Probability
  • Churn Probability and CLV Model for Clusters 1 and 3
  • Churn Probability and CLV Model for Clusters 0 and 2
  4. Additional Insights
  • Retention Rate
  • New Customers by Month
  5. Recommendations

1. Import Data, EDA and RFM Segmentation¶

Import Data¶

In [ ]:
import numpy as np
import pandas as pd
from scipy import stats
%matplotlib inline
import datetime as dt
In [ ]:
from IPython.display import display, HTML

# Function to display dataframes side by side
def display_side_by_side(*args):
    html_str = ''
    for df in args:
        html_str += df.to_html() + '\xa0\xa0\xa0'
    display(HTML('<div style="display: flex;">' + html_str + '</div>'))
In [ ]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.float_format', lambda x: '%.3f' % x)
import warnings
warnings.filterwarnings('ignore')
In [ ]:
# Import Data
from config import DATA_PATH

data = pd.read_csv(DATA_PATH, parse_dates=["TransactionDate"])
# Normalize timestamps to midnight while keeping the datetime64 dtype
data['TransactionDate'] = data['TransactionDate'].dt.normalize()

# Exclude data from July (the dataset only runs up to 1st July)
data = data[data['TransactionDate'] < '2023-07-01']

EDA¶

In [ ]:
data.drop(columns=['Unnamed: 0'], inplace=True)

data.info() # there are no missing values
data[['TransactionDate', 'SalesAmount']].describe()
<class 'pandas.core.frame.DataFrame'>
Index: 1397202 entries, 0 to 1397201
Data columns (total 9 columns):
 #   Column           Non-Null Count    Dtype         
---  ------           --------------    -----         
 0   BillID           1397202 non-null  int64         
 1   Channel          1397202 non-null  object        
 2   OrderFrom        1397202 non-null  object        
 3   TransactionDate  1397202 non-null  datetime64[ns]
 4   SalesAmount      1397202 non-null  float64       
 5   CustomerID       1397202 non-null  int64         
 6   CustomerGender   1397202 non-null  object        
 7   VoucherStatus    1397202 non-null  object        
 8   Province         1397202 non-null  object        
dtypes: datetime64[ns](1), float64(1), int64(2), object(5)
memory usage: 106.6+ MB
Out[ ]:
TransactionDate SalesAmount
count 1397202 1397202.000
mean 2022-08-01 12:10:43.237555968 310330.403
min 2021-10-01 00:00:00 -31605.000
25% 2022-02-22 00:00:00 178089.000
50% 2022-07-21 00:00:00 261690.000
75% 2023-01-09 00:00:00 364380.750
max 2023-06-30 00:00:00 57731681.000
std NaN 317534.538

We see that the minimum SalesAmount is negative, so we need to inspect the negative records and their values to decide how to handle them.

In [ ]:
data[data['SalesAmount'] <= 0].head()
Out[ ]:
BillID Channel OrderFrom TransactionDate SalesAmount CustomerID CustomerGender VoucherStatus Province
1052712 1052712 Take Away STORE 2023-01-12 -31605.000 1996642 Female No Nothern Provinces
1212340 1212340 Take Away STORE 2023-03-29 -1104.000 768491 Male No Hanoi
In [ ]:
display_side_by_side(data[data['CustomerID'] == 1996642][['CustomerID', 'SalesAmount']], 
                        data[data['CustomerID'] == 768491][['CustomerID', 'SalesAmount']])
CustomerID SalesAmount
289260 1996642 255165.000
1048367 1996642 3821364.000
1052712 1996642 -31605.000
   
CustomerID SalesAmount
14910 768491 480696.000
258382 768491 527982.000
359281 768491 395003.000
515940 768491 390227.000
658327 768491 450617.000
790788 768491 483823.000
1133172 768491 666475.000
1186516 768491 232791.000
1212096 768491 279389.000
1212340 768491 -1104.000
1233398 768491 242667.000
   

Based on the data, these two records probably have the wrong sales amount: even ignoring the minus sign, the values are very low compared to the normal SalesAmount of these two customers. So we'll drop these two records.

In [ ]:
data = data[data['SalesAmount'] > 0]
In [ ]:
data.nunique() # There is no duplicate value in BillID
Out[ ]:
BillID             1397200
Channel                  3
OrderFrom                4
TransactionDate        638
SalesAmount         529684
CustomerID          718050
CustomerGender           3
VoucherStatus            2
Province                 4
dtype: int64
In [ ]:
import matplotlib.pyplot as plt
import seaborn as sns

group = ["CustomerGender", "VoucherStatus", "Province", "OrderFrom", "Channel"]

for x in group:
    plt.figure(figsize=(7,4))
    
    # Plotting the average sales amount
    avg_sales = data.groupby(x)['SalesAmount'].mean()
    plt.subplot(1, 2, 1)
    sns.barplot(x=avg_sales.index, y=avg_sales.values, palette='viridis')  # Change the color palette here
    plt.title('Average Sales Amount per ' + x)
    plt.xticks(rotation=90)
    
    # Plotting the total sales amount
    total_sales = data.groupby(x)['SalesAmount'].sum()
    plt.subplot(1, 2, 2)
    sns.barplot(x=total_sales.index, y=total_sales.values, palette='viridis')  # Change the color palette here
    plt.title('Total Sales Amount per ' + x)
    plt.xticks(rotation=90)
    
    plt.tight_layout()
    plt.show()
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
  • Certain customer segments are more likely to order pizza than others, with a slight majority of male customers; nearly 60% of bills don't reveal the gender.
  • Two cities provide 63.3% of sales: Hanoi accounts for the largest proportion of sales amount, followed by Ho Chi Minh City.
  • Most of the revenue in Hanoi comes from stores, while the website is the “golden source” for Ho Chi Minh City.
  • A large number of customers use vouchers through the app and website, but most still order from stores, and store orders have the lowest average amount.
  • Takeaway and delivery account for nearly 98% of orders, but “Dine In” and “Take away” show opposite tendencies.
  • Customers tend to spend more when dining in.
In [ ]:
# Count the number of orders per customer
order_counts = data.groupby('CustomerID')['BillID'].count()

# Filter customers with only one order
customers_with_one_order = order_counts[order_counts == 1]

# Get the count of customers with only one order
number_of_customers_with_one_order = len(customers_with_one_order)

print("Number of customers with only one order:", number_of_customers_with_one_order)
Number of customers with only one order: 495255
In [ ]:
from datetime import timedelta
# Find the final date in the 'TransactionDate' column
final_date = data['TransactionDate'].max()

# Calculate the cutoff date for the last 6 months from the final date
cutoff_date = final_date - timedelta(days=180)

# Count customers with at least one order before the cutoff date
# (some of these may also have ordered within the last 6 months)
customers_before_cutoff = data[data['TransactionDate'] < cutoff_date].groupby('CustomerID').size()

# Get the count of these customers
number_of_customers_before_cutoff = len(customers_before_cutoff)

print("Number of customers with at least one order before the cutoff date:", number_of_customers_before_cutoff)
Number of customers with at least one order before the cutoff date: 575736
In [ ]:
# Sort values to calculate time since last order
data_sorted = data.sort_values(['CustomerID', 'TransactionDate'])

# Lag transaction time
data_sorted['lag'] = data_sorted.groupby('CustomerID')['TransactionDate'].shift(1)

# Convert timedelta to days
data_sorted['diff'] = (data_sorted['TransactionDate'] - data_sorted['lag'])/np.timedelta64(1, 'D')

# Calculate the average time between orders
average_time_between_orders = data_sorted['diff'].mean()

print("Average time between customer orders:", average_time_between_orders)
Average time between customer orders: 79.64693513951262
In [ ]:
# Filter customers who made at least one order in 2023
active_customers_2023 = data[data['TransactionDate'].dt.year == 2023].groupby('CustomerID').size()

# Get the count of active customers in 2023
number_of_active_customers_2023 = len(active_customers_2023)

print("Number of customers who made an order in 2023:", number_of_active_customers_2023)
Number of customers who made an order in 2023: 240184
In [ ]:
sns.histplot(data_sorted['diff'].dropna(), kde=True)
plt.show()
No description has been provided for this image
  • Here we get a glimpse of the distribution of the gap between purchases: the majority of customers have a gap of less than 100 days, with an average of about 80 days.
  • The majority of Pizza Hut's customers are one-time customers (495255 customers).

Other Conclusion (From analysis in PowerBI):¶

  • Pizza Hut is a “hot spot” during weekends and holiday seasons: customers buy more as the weekend approaches, with most purchases on Saturday and Sunday, and daily sales on weekends and holidays are higher than on normal days.
  • The number of customers choosing Dine In tends to increase, while the number choosing Takeaway decreases.

RFM Segmentation¶

In [ ]:
today = max(data['TransactionDate']) + timedelta(days=1)

# Calculate RFM values
RFM = data.groupby(['CustomerID']).agg({
    'TransactionDate': lambda x: (today - x.max()).days,
    'BillID': 'count',
    'SalesAmount': 'sum'})

# Rename column
RFM.columns = ['Recency', 'Frequency', 'Monetary']
In [ ]:
fig, ax = plt.subplots(3, 1, figsize=(8, 12))

# Plot Monetary
sns.kdeplot(data=RFM, x='Monetary', ax=ax[0], fill=True, palette="crest", linewidth=3)
ax[0].set_title('Monetary Distribution')

# Plot Frequency
sns.kdeplot(data=RFM, x='Frequency', ax=ax[1], fill=True, palette="crest", linewidth=3)
ax[1].set_title('Frequency Distribution')

# Plot Recency
sns.kdeplot(data=RFM, x='Recency', ax=ax[2], fill=True, palette="crest", linewidth=3)
ax[2].set_title('Recency Distribution')

plt.tight_layout()
plt.show()
No description has been provided for this image

Here we can see that Monetary and Frequency are highly right-skewed, so we will need to try some transformation techniques to reduce the skewness.

In [ ]:
from numpy import sqrt, log10

# Define transformation methods
transformations = {
    'none': lambda x: x,
    'sqrt': sqrt,
    'log10': log10,
    'boxcox': lambda x: stats.boxcox(x)[0]
}

# Calculate and print skewness for each transformation
for name, transform in transformations.items():
    recency_transformed = transform(RFM['Recency'])
    recency_skew = stats.skew(recency_transformed)

    frequency_transformed = transform(RFM['Frequency'])
    frequency_skew = stats.skew(frequency_transformed)

    monetary_transformed = transform(RFM['Monetary'])
    monetary_skew = stats.skew(monetary_transformed)

    print(f"\n{name.capitalize()} transformation:")
    print(f"Skewness of Recency: {recency_skew}")
    print(f"Skewness of Frequency: {frequency_skew}")
    print(f"Skewness of Monetary: {monetary_skew}")
None transformation:
Skewness of Recency: 0.04356166369846459
Skewness of Frequency: 12.369499427471565
Skewness of Monetary: 18.954224583976302

Sqrt transformation:
Skewness of Recency: -0.49097751362347036
Skewness of Frequency: 3.4505122033956948
Skewness of Monetary: 2.83530947499132

Log10 transformation:
Skewness of Recency: -1.632735789608205
Skewness of Frequency: 1.8799794010067354
Skewness of Monetary: 0.6165928123199858

Boxcox transformation:
Skewness of Recency: -0.3102933245609721
Skewness of Frequency: 0.8771981678957986
Skewness of Monetary: -0.01735377665060057

Based on the results, we'll choose:

  • None for Recency
  • Boxcox for Frequency and Monetary
In [ ]:
# Preview Skewness of data
sns.displot(RFM['Recency'])
sns.displot(stats.boxcox(RFM['Frequency'])[0])
sns.displot(stats.boxcox(RFM['Monetary'])[0])
plt.show()
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image

After the transformation, we can see that Monetary is very symmetric, while Frequency is still highly right-skewed because the number of one-time customers is so large.
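The persistence of this skew can be reproduced with a small synthetic sketch (invented data, not the actual Frequency column): a monotonic transform such as log (Box-Cox with λ = 0) only relocates the point mass of one-time buyers; it cannot spread it out.

```python
import numpy as np

def skewness(x):
    """Sample skewness: third central moment over cubed standard deviation."""
    x = np.asarray(x, dtype=float)
    return ((x - x.mean()) ** 3).mean() / x.std() ** 3

# Synthetic frequency-like data: ~70% one-time customers, plus a right tail
rng = np.random.default_rng(0)
freq = np.where(rng.random(10_000) < 0.7, 1, rng.integers(2, 30, 10_000))

# The log transform moves the spike at 1 to a spike at 0,
# so the distribution stays right-skewed
print(skewness(freq), skewness(np.log(freq)))  # both positive
```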

In [ ]:
from sklearn.preprocessing import MinMaxScaler

# Box-cox Transformation
RFM['Frequency_boxcox'], _ = stats.boxcox(RFM['Frequency'])
RFM['Monetary_boxcox'], _ = stats.boxcox(RFM['Monetary'])

# Data for transformation
fit_RFM = RFM[['Recency', 'Frequency_boxcox', 'Monetary_boxcox']]
fit_RFM.columns = ['Recency', 'Frequency', 'Monetary']

# Scale Data
scaler = MinMaxScaler()
scaler.fit(fit_RFM)
transformed_RFM = scaler.transform(fit_RFM)
In [ ]:
import logging
from sklearn.cluster import KMeans

# Suppress warnings
logging.getLogger('matplotlib.font_manager').disabled = True

from yellowbrick.cluster import KElbowVisualizer
Elbow_M = KElbowVisualizer(KMeans(), k=10, metric='distortion')
Elbow_M.fit(transformed_RFM)
Elbow_M.show() # 4 clusters is the optimal number
No description has been provided for this image
Out[ ]:
<Axes: title={'center': 'Distortion Score Elbow for KMeans Clustering'}, xlabel='k', ylabel='distortion score'>
In [ ]:
from sklearn.metrics import silhouette_score

# Fitting KMeans Model
model = KMeans(n_clusters=4, init='k-means++', random_state=3, n_init=30)
model.fit(transformed_RFM)

labeled_RFM = pd.DataFrame(transformed_RFM, columns=['Recency', 'Frequency', 'Monetary'], index=fit_RFM.index)
labeled_RFM['Cluster'] = model.predict(labeled_RFM)

# Evaluate model
silhouette = silhouette_score(labeled_RFM[['Recency', 'Frequency', 'Monetary']], labeled_RFM['Cluster'], sample_size=40000)

print("Silhouette Coefficient:", silhouette) # The silhouette score is quite good
Silhouette Coefficient: 0.5688068708100638
In [ ]:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure(figsize=(12, 10))
ax = fig.add_subplot(111, projection='3d')

scatter = ax.scatter(labeled_RFM['Recency'], 
                    labeled_RFM['Frequency'], 
                    labeled_RFM['Monetary'], 
                    c=labeled_RFM['Cluster'].astype(int), 
                    cmap='viridis',
                    alpha=0.7)

ax.set_title('Clusters', fontsize=16, pad=20)
ax.set_xlabel('Recency', fontsize=14, labelpad=10)
ax.set_ylabel('Frequency', fontsize=14, labelpad=10)
ax.set_zlabel('Monetary', fontsize=14, labelpad=10)

ax.view_init(elev=20, azim=120)

cbar = plt.colorbar(scatter, shrink=0.5)
cbar.set_label('Cluster Number', rotation=270, labelpad=15)

ax.tick_params(axis='both', which='major', labelsize=12)

plt.show()
No description has been provided for this image

From this angle, it is quite hard to see the difference between clusters in terms of Monetary, but we can clearly see that the four clusters are separated by differences in Recency and Frequency.

In [ ]:
# Create new df
transformed_RFM_df = pd.DataFrame(transformed_RFM,
                                    index=RFM.index,
                                    columns=RFM[['Recency', 'Frequency', 'Monetary']].columns)
transformed_RFM_df['Cluster'] = labeled_RFM.Cluster

# Create long table
RFM_melt = pd.melt(transformed_RFM_df.reset_index(), 
                    id_vars=['CustomerID', 'Cluster'],
                    value_vars=['Recency', 'Frequency', 'Monetary'],
                    var_name='Attribute',
                    value_name='Value')
plt.title("Snake plot")
sns.lineplot(x='Attribute', y='Value', hue='Cluster', data=RFM_melt)
plt.show()
No description has been provided for this image

The Snake plot is distorted in the Monetary section: applying MinMaxScaler to this column, combined with a number of outliers with very large total purchase values, has compressed most customers' values to near zero.
To check whether the model has clustered effectively, we will use the original Monetary values.
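The compressing effect is easy to reproduce with a toy example (synthetic values, not the actual Monetary column): one extreme outlier drives the min-max-scaled values of typical customers toward zero.

```python
import numpy as np

# Synthetic monetary values: three typical customers plus one extreme outlier
monetary = np.array([300_000.0, 500_000.0, 800_000.0, 57_000_000.0])

# Min-max scaling: (x - min) / (max - min)
scaled = (monetary - monetary.min()) / (monetary.max() - monetary.min())

print(scaled.round(4))  # the three typical customers land below 0.01
```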

In [ ]:
# Assign Cluster number
RFM['Cluster'] = labeled_RFM.Cluster

# Calculate Average RFM Values of each Cluster
print(RFM.groupby('Cluster')[['Recency', 'Frequency', 'Monetary']].mean())
         Recency  Frequency    Monetary
Cluster                                
0        162.068      1.000  307799.016
1         97.538      4.851 1565071.450
2        486.570      1.000  289320.191
3        394.498      2.704  816652.064

We can see differences between the 4 clusters across all three RFM values. Specifically, Cluster 0 contains recent customers who have purchased only once, with an average Monetary value.
Meanwhile, Cluster 1 has the best scores on all three dimensions.
Cluster 2 is the customer group with the three lowest indexes among the four groups and has a very high possibility of churn.
Cluster 3 is a group of customers with high purchasing frequency and high value who have not returned for a long time; there is a high risk of losing this group to competitors.

In [ ]:
# Building Relative Importance Heatmap
cluster_avg = RFM.groupby(['Cluster']).agg({'Recency':'mean',
                                            'Frequency':'mean',
                                            'Monetary':'mean'})
population_avg = RFM[['Recency','Frequency','Monetary']].mean()
relative_imp = cluster_avg/population_avg - 1
plt.figure(figsize=(8,2))
plt.title('Relative Importance of Attributes')
sns.heatmap(data=relative_imp, annot=True, fmt='.2f', cmap='RdYlGn')
plt.show()
No description has been provided for this image

The farther the values are from 0, the more effective the grouping is. As we can see, all clusters have values quite far from 0, showing that KMeans has effectively divided the data into 4 groups.
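As a quick numeric illustration of how to read the heatmap (the population average below is an invented figure, used only to show the calculation): a value of -0.35 means the cluster sits 35% below the population average on that attribute.

```python
# Relative importance = cluster average / population average - 1.
# Cluster 0's average Recency comes from the table above; the
# population average here is hypothetical, for illustration only.
cluster_avg_recency = 162.068
population_avg_recency = 250.0  # assumed value

relative_importance = cluster_avg_recency / population_avg_recency - 1
print(f"{relative_importance:.2f}")  # -0.35, i.e. 35% below the population average
```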

2. Churn and CLV Prediction¶

In this section, we will use the statistical Beta Geometric/Negative Binomial Distribution (BG/NBD) model to calculate customer churn probability and predict the expected purchase frequency of each customer, and the Gamma-Gamma model to predict the average monetary value. We'll then predict customer CLV based on these two models.

Because this model only applies to customers with repeat purchases, we will test and calculate with Clusters 1 and 3 only.
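The CLV estimate used later combines the two models multiplicatively: BG/NBD supplies the expected number of transactions over the horizon, and Gamma-Gamma supplies the expected value per transaction. A minimal numeric sketch with invented figures:

```python
# Hypothetical model outputs for one customer over a 180-day horizon
expected_purchases = 2.4        # from BG/NBD (illustrative value)
expected_order_value = 350_000  # from Gamma-Gamma, in VND (illustrative value)

# Simple (undiscounted) CLV over the horizon
clv = expected_purchases * expected_order_value
print(clv)  # approximately 840,000 VND
```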

A. Churn Risk Modeling¶

In [ ]:
# Import Model
from lifetimes import BetaGeoFitter, GammaGammaFitter
from lifetimes.utils import summary_data_from_transaction_data, calibration_and_holdout_data
from lifetimes.plotting import (plot_period_transactions, plot_cumulative_transactions,
                                plot_incremental_transactions,
                                plot_calibration_purchases_vs_holdout_purchases)
In [ ]:
# Create data and auto-calculate RFM scores (the BetaGeo model's formula differs from the RFM calculation above)
churn_RFM = summary_data_from_transaction_data(data.reset_index(drop=True), 'CustomerID', 'TransactionDate', monetary_value_col='SalesAmount')

bgf = BetaGeoFitter(penalizer_coef=0.0) 
bgf.fit(churn_RFM['frequency'], churn_RFM['recency'], churn_RFM['T'])
t = 180 # Forecast  for the next 180 days
churn_RFM['expected_purchases'] = bgf.conditional_expected_number_of_purchases_up_to_time(t, churn_RFM['frequency'], churn_RFM['recency'], churn_RFM['T'])

# Alive Probability of Customers
churn_RFM['Retention'] = bgf.conditional_probability_alive(churn_RFM['frequency'], churn_RFM['recency'], churn_RFM['T'])
churn_RFM['Churn'] = 1 - churn_RFM['Retention']

Next we want to evaluate how well the model performs out of sample. We'll split the data into a training (calibration) period and a holdout (observation) period, train the BG/NBD model, and evaluate performance with four plots:

  1. Calibration period histogram: does the model fit the training data?
  2. Cumulative transaction plot: does the model predict cumulative sales well?
  3. Incremental transaction plot: does the model capture the overall trend in transactions?
  4. Conditional expectations plot: can the model predict the number of purchases a customer will make based on the training data?

Note: except for plot 1, in this section we will only consider the model's performance on data from customers in Clusters 1 and 3, the two groups with repeat purchases.

1. Calibration Period Histogram¶

In [ ]:
plot_period_transactions(bgf).set_yscale('log')
No description has been provided for this image

As we can see, the model predicts the number of customers per repeat-transaction count quite accurately; the difference grows at 5 and 6 repeat purchases because there is little data for those customers.

2. Cumulative Transaction Plot¶

In [ ]:
# Split Calibration and Holdout data
summary_cal_holdout = calibration_and_holdout_data(data.reset_index(drop=True), 'CustomerID', 'TransactionDate',
                                        calibration_period_end='2023-01-01',
                                        observation_period_end='2023-06-30',
                                        monetary_value_col='SalesAmount')
summary_cal_holdout = summary_cal_holdout[summary_cal_holdout['frequency_cal'] > 0]
In [ ]:
# Add Frequency column to data
data.index = data['CustomerID']
data['Frequency'] = RFM.Frequency

# Plot Cumulative transaction of Model vs Actual data
bgf.fit(summary_cal_holdout['frequency_cal'], summary_cal_holdout['recency_cal'], summary_cal_holdout['T_cal'])
plot_cumulative_transactions(bgf, data[data['Frequency'] > 1].reset_index(drop=True), 'TransactionDate', 'CustomerID', 637, 456)
Out[ ]:
<Axes: title={'center': 'Tracking Cumulative Transactions'}, xlabel='day', ylabel='Cumulative Transactions'>
No description has been provided for this image

The orange line marks the boundary between the calibration period on the left and the holdout period on the right. As we can see, the model's predicted cumulative transactions track the actual values closely over time, with no significant gap between the two lines.

3. Incremental Transaction Plot¶

In [ ]:
plot_incremental_transactions(bgf, data[data['Frequency'] > 1].reset_index(drop=True), 'TransactionDate', 'CustomerID', 637, 456)
Out[ ]:
<Axes: title={'center': 'Tracking Daily Transactions'}, xlabel='day', ylabel='Transactions'>
No description has been provided for this image

The model captures the general trend in the number of purchases per day: as we can see, the line trends downward and captures the recent slight decline.

4. Conditional Expectations Plot¶

In [ ]:
plot_calibration_purchases_vs_holdout_purchases(bgf, summary_cal_holdout)
Out[ ]:
<Axes: title={'center': 'Actual Purchases in Holdout Period vs Predicted Purchases'}, xlabel='Purchases in calibration period', ylabel='Average of Purchases in Holdout Period'>
No description has been provided for this image

The distance between the two lines tends to decrease, showing that the model predicts the number of purchases quite well.

In [ ]:
from sklearn.metrics import mean_absolute_error

predicted_purchases = bgf.conditional_expected_number_of_purchases_up_to_time(t, 
                                                                            summary_cal_holdout['frequency_cal'], 
                                                                            summary_cal_holdout['recency_cal'], 
                                                                            summary_cal_holdout['T_cal'])
actual_purchases = summary_cal_holdout['frequency_holdout']
mae_frequency = mean_absolute_error(actual_purchases, predicted_purchases)
print(f'Mean Absolute Error: {mae_frequency}')
Mean Absolute Error: 0.8605611140417139

The MAE of the frequency prediction is below 1 purchase, which is very good given that Clusters 1 and 3 have very high average purchase frequency.

5. Alive Probability¶

  • The BG/NBD model assumes that “death” can only occur after a repeat purchase, since churn is modeled as occurring at a purchase and the first purchase is reserved to signal a customer's “birth”.
  • Because of this, customers with only one transaction will have a 100% probability of being alive, which is misleading. To account for this limitation, we'll only predict churn risk for customers who have made at least one repeat transaction, i.e. customers in Clusters 1 and 3.
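This behaviour follows from the closed-form BG/NBD expression for the probability that a customer is still “alive”. A minimal sketch (the parameter values r, alpha, a, b below are invented for illustration; lifetimes estimates them from the data by maximum likelihood):

```python
def bgnbd_prob_alive(x, t_x, T, r, alpha, a, b):
    """P(alive | x, t_x, T) under BG/NBD; x = repeat-purchase frequency,
    t_x = recency, T = customer age (all in the same time unit)."""
    if x == 0:
        # Death can only follow a repeat purchase, so one-time
        # customers are always "alive" under the model.
        return 1.0
    return 1.0 / (1.0 + (a / (b + x - 1))
                  * ((alpha + T) / (alpha + t_x)) ** (r + x))

# Illustrative parameter values (not fitted to the actual data)
r, alpha, a, b = 0.25, 4.0, 0.8, 2.5

print(bgnbd_prob_alive(0, 0, 365, r, alpha, a, b))   # 1.0 regardless of age
print(bgnbd_prob_alive(5, 100, 365, r, alpha, a, b)) # small when t_x << T
```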
In [ ]:
# Keep only repeat customers (here frequency counts repeat purchases, so one-time customers have frequency = 0)
repeat_customers = churn_RFM[churn_RFM['frequency'] > 0].copy()
repeat_customers['Cluster'] = labeled_RFM.Cluster
In [ ]:
# Plot alive probability
repeat_customers['prob_alive'] = bgf.conditional_probability_alive(repeat_customers['frequency'], repeat_customers['recency'], repeat_customers['T'])
sns.histplot(repeat_customers['prob_alive'], kde=True)
plt.show()
No description has been provided for this image
In [ ]:
print(repeat_customers.groupby("Cluster")["prob_alive"].mean())
Cluster
1   0.608
3   0.101
Name: prob_alive, dtype: float64

We can see that there is a clear difference in average prob_alive between the two Clusters, with Cluster 1 having much higher prob_alive than Cluster 3.

B. CLV Model¶

The Gamma-Gamma model's central assumption is that transaction frequency (how often a customer purchases) is independent of monetary value (how much they spend per transaction). To test this assumption we'll use the Pearson correlation.

In [ ]:
repeat_customers[['frequency', 'monetary_value']].corr()
Out[ ]:
frequency monetary_value
frequency 1.000 -0.012
monetary_value -0.012 1.000
  • We see that -0.01225 is a very low correlation --> we can use the model.
  • Besides the above assumption, this model also assumes that monetary values are gamma-distributed. To check this, we'll plot the histogram of the Box-Cox-transformed monetary value with the fitted gamma PDF overlaid.
In [ ]:
from scipy.stats import gamma

repeat_customers['monetary_value_boxcox'], _ = stats.boxcox(repeat_customers['monetary_value'])

# Fit a gamma distribution to monetary values
ag, loc, scale = gamma.fit(repeat_customers['monetary_value_boxcox'])

# Generate values from fitted gamma distribution
gamma_fitted = gamma.pdf(np.linspace(min(repeat_customers['monetary_value_boxcox']), max(repeat_customers['monetary_value_boxcox']), 100), ag, loc, scale)

# Plot histogram and fitted gamma PDF
plt.figure()
plt.hist(repeat_customers['monetary_value_boxcox'], bins=30, density=True, alpha=0.75, label='Observed data')
plt.plot(np.linspace(min(repeat_customers['monetary_value_boxcox']), max(repeat_customers['monetary_value_boxcox']), 100), gamma_fitted, 'r-', label='Gamma fit')
plt.xlabel('Monetary Value')
plt.ylabel('Density')
plt.title('Fit of Gamma Distribution')
plt.legend()
plt.show()
No description has been provided for this image

The gamma distribution fits the data quite well, meaning we can proceed with this model to predict customers' CLV.

In [ ]:
ggf = GammaGammaFitter(penalizer_coef=0.0)

summary_cal_holdout['monetary_value_cal_boxcox'], mon_val_lmbda = stats.boxcox(summary_cal_holdout['monetary_value_cal'])

ggf.fit(summary_cal_holdout['frequency_cal'], summary_cal_holdout['monetary_value_cal_boxcox'])
Out[ ]:
<lifetimes.GammaGammaFitter: fitted with 166916 subjects, p: 525.60, q: 540.76, v: 16.96>
In [ ]:
# Predict the expected number of transactions in the next 180 days since 2023-01-01
days = (dt.datetime(2023, 6, 30) - dt.datetime(2023, 1, 1)).days

predicted_bgf = bgf.predict(days,
                        summary_cal_holdout['frequency_cal'], 
                        summary_cal_holdout['recency_cal'], 
                        summary_cal_holdout['T_cal'])

# Predict the average order value
monetary_pred_boxcox = ggf.conditional_expected_average_profit(
    summary_cal_holdout['frequency_cal'],
    summary_cal_holdout['monetary_value_cal_boxcox'])
In [ ]:
# Inverse Box-Cox transformation to map predictions back to the original scale
def inverse_boxcox(y, lambda_):
    if lambda_ == 0:
        return np.exp(y)
    else:
        return (y * lambda_ + 1) ** (1 / lambda_)
    
monetary_pred = inverse_boxcox(monetary_pred_boxcox, mon_val_lmbda)

# Calculate the predicted sales
sales_pred = predicted_bgf * monetary_pred

# Actual values calculation
actual = summary_cal_holdout['monetary_value_holdout'] * summary_cal_holdout['frequency_holdout']
In [ ]:
import matplotlib.pyplot as plt
from sklearn.metrics import mean_absolute_error

# Calculate the mean of the training set
train_actual = summary_cal_holdout['monetary_value_cal'] * summary_cal_holdout['frequency_cal']
median_sales = train_actual.median()

def evaluate(actual, sales_prediction, median_sales):
    actual_log = np.log1p(actual)  # Log-transforming the actual values
    sales_pred_log = np.log1p(sales_prediction)  # Log-transforming the predictions
    
    print(f"Total Sales Actual: {np.round(actual.sum())}")
    print(f"Total Sales Predicted: {np.round(sales_prediction.sum())}")
    
    # Calculate and print the percentage difference
    percentage_difference = ((sales_prediction.sum() - actual.sum()) / actual.sum()) * 100
    print(f"Percentage Difference between Actual Sales and Predicted Sales: {percentage_difference:.2f}%")
    
    print(f"Individual Mean Absolute Error: {mean_absolute_error(actual, sales_prediction)}")

    # Predict the mean sales for all data points
    naive_predictions = np.full(shape=actual.shape, fill_value=median_sales)

    # Calculate the MAE of the naive model
    naive_mae = mean_absolute_error(actual, naive_predictions)
    print(f"Naive Mean Absolute Error: {naive_mae}")
    
    plt.scatter(sales_pred_log, actual_log, alpha=0.5)
    plt.xlabel('Log of Prediction')
    plt.ylabel('Log of Actual')
    plt.title('Log-Log Plot of Actual vs. Prediction')
    plt.show()

# Apply the function with actual and predicted values
evaluate(actual, sales_pred, median_sales)
Total Sales Actual: 47080105051.0
Total Sales Predicted: 45781350436.0
Percentage Difference between Actual Sales and Predicted Sales: -2.76%
Individual Mean Absolute Error: 291539.45987536316
Naive Mean Absolute Error: 469809.1762456828
No description has been provided for this image
In [ ]:
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split

# Prepare the features and target
X = summary_cal_holdout[['frequency_cal', 'recency_cal', 'T_cal', 'monetary_value_cal']]
y = summary_cal_holdout['monetary_value_holdout'] * summary_cal_holdout['frequency_holdout']

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the XGBRegressor
xgb = XGBRegressor(objective='reg:squarederror', random_state=42)

# Fit the model
xgb.fit(X_train, y_train)

# Make predictions
xgb_predictions = xgb.predict(X_test)

# Calculate the MAE of the XGBRegressor
xgb_mae = mean_absolute_error(y_test, xgb_predictions)

print(f"XGBRegressor Mean Absolute Error: {xgb_mae}")
XGBRegressor Mean Absolute Error: 302864.0472673454
  • The MAE of the Gamma-Gamma model is lower than both benchmarks, XGBoost and the naive prediction. (We use the median for the naive baseline because the raw monetary data is skewed; a mean baseline yields an MAE above 700,000.) This indicates the model is suitable for predicting the CLV of customers in these two Clusters.
  • Besides that, Total Sales Actual and Total Sales Predicted are very close, off by only 2.8%. Many customers with 0 actual sales have a predicted sales value greater than 0; the holdout period is likely not long enough for those customers to make their next transaction.
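The median-baseline choice above can be sanity-checked in isolation: for a constant predictor, the median always achieves an MAE less than or equal to that of the mean, since the median minimizes total absolute deviation, and the gap widens on skewed data. A minimal sketch with synthetic data (the log-normal parameters are made up for illustration, not fitted to the real monetary values):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
# Hypothetical right-skewed sales values (log-normal), standing in for monetary data
sales = rng.lognormal(mean=10, sigma=1.5, size=10_000)

# Two naive baselines: predict the same constant for every customer
mae_mean = mean_absolute_error(sales, np.full_like(sales, sales.mean()))
mae_median = mean_absolute_error(sales, np.full_like(sales, np.median(sales)))

print(f"MAE of mean baseline:   {mae_mean:,.0f}")
print(f"MAE of median baseline: {mae_median:,.0f}")  # never exceeds the mean baseline
```

On heavily skewed data the mean is dragged up by a few very large orders, so a mean baseline sets an artificially easy bar; the median baseline is the fairer comparison.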

3. Final Model for CLV and Churn Probability¶

Here, we will calculate the churn probability of a customer for each Cluster and along with that customer's CLV in the next 6 months.
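The CLV estimate below follows a simple decomposition: the expected number of transactions in the next 180 days (BG/NBD) multiplied by the expected average order value (Gamma-Gamma). A minimal sketch of the arithmetic with made-up inputs (both values are hypothetical, not real model outputs):

```python
# Hypothetical model outputs for one customer (illustrative values only)
expected_transactions_6mo = 2.4        # e.g. a BG/NBD prediction over 180 days
expected_avg_order_value = 150_000.0   # e.g. a Gamma-Gamma prediction, in VND

# CLV over the next 6 months = expected transactions * expected average order value
clv_6mo = expected_transactions_6mo * expected_avg_order_value
print(f"{clv_6mo:,.0f} VND")  # 360,000 VND
```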

Churn Probability and CLV Model for Cluster 1 and 3¶

In [ ]:
# Data for fitting model
temp = summary_data_from_transaction_data(data.reset_index(drop=True), 'CustomerID', 'TransactionDate', monetary_value_col='SalesAmount')
cluster_1_3 = temp.loc[temp.frequency > 0, :]

# BG/NBD - Churn Probability
bgf = BetaGeoFitter(penalizer_coef=0.0)
bgf.fit(cluster_1_3['frequency'], cluster_1_3['recency'], cluster_1_3['T'])
cluster_1_3['Churn'] = 1 - bgf.conditional_probability_alive(cluster_1_3['frequency'], cluster_1_3['recency'], cluster_1_3['T'])

# Gamma-Gamma
ggf = GammaGammaFitter(penalizer_coef=0.0)

cluster_1_3['monetary_value_boxcox'], monetary_lmbda = stats.boxcox(cluster_1_3['monetary_value'])

# Fitting ggf model
ggf.fit(cluster_1_3['frequency'],
        cluster_1_3['monetary_value_boxcox'])

# Avg Monetary predict
expected_avg_profit = ggf.conditional_expected_average_profit(
    cluster_1_3['frequency'],
    cluster_1_3['monetary_value_boxcox']
)

expected_avg_profit_true = inverse_boxcox(expected_avg_profit, monetary_lmbda)

# Avg frequency predict
expected_transactions = bgf.predict(
    180, # 6 months
    cluster_1_3['frequency'],
    cluster_1_3['recency'],
    cluster_1_3['T']
)

cluster_1_3['clv'] = expected_transactions * expected_avg_profit_true

cluster_1_3['Cluster'] = labeled_RFM.Cluster
print(cluster_1_3.groupby("Cluster")["Churn"].mean())
print(cluster_1_3.groupby("Cluster")["clv"].mean())
Cluster
1   0.364
3   0.862
Name: Churn, dtype: float64
Cluster
1   355426.359
3    38039.774
Name: clv, dtype: float64

Here we can clearly see the difference in average CLV and churn probability between the two Clusters: Cluster 1 has a much higher average CLV than Cluster 3 and a lower average churn probability, showing that customers in this group are very valuable.

Churn Probability and CLV Model for Cluster 0 and 2¶

In [ ]:
data_sorted.index = data_sorted['CustomerID']
data_sorted['Cluster'] = RFM.Cluster
average_diff = data_sorted[(data_sorted['Cluster'] == 1) | (data_sorted['Cluster'] == 3)].reset_index(drop=True).groupby('CustomerID')['diff'].mean()  # Series named 'diff'
count, bins_count = np.histogram(average_diff, bins=20) 

# using numpy np.cumsum to calculate the CDF 
pdf = count / sum(count) 
cdf = np.cumsum(pdf) 

# plotting the CDF 
plt.plot(bins_count[1:], cdf, label="CDF")
plt.xlabel("Days")
plt.ylabel("Percentage of Customers Buying Again")
plt.legend() 
Out[ ]:
<matplotlib.legend.Legend at 0x7fc4cff3c2f0>
[Figure: empirical CDF of average days between purchases]

Following the approach described on Hackernoon, we use the CDF (Cumulative Distribution Function) to choose the threshold number of days since the last order beyond which a customer is considered churned.
A good rule of thumb is to find the interval containing an inflection point in the curve --> we choose a threshold at the 93% level (0.93 quantile).
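The quantile-based threshold can be illustrated on its own: given a distribution of average days between purchases, the 0.93 quantile is the gap length that about 93% of observed repurchase gaps fall under. A minimal sketch with synthetic gap data (the exponential distribution is an assumption for illustration, not the real `average_diff`):

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical average days between purchases, standing in for 'average_diff'
gaps = rng.exponential(scale=30, size=5_000)

# Threshold such that ~93% of observed repurchase gaps are shorter
churn_threshold = np.quantile(gaps, 0.93)

# The empirical CDF evaluated at the threshold should be ~0.93
cdf_at_threshold = (gaps <= churn_threshold).mean()
print(round(churn_threshold, 1), round(cdf_at_threshold, 3))
```

A customer whose time since last purchase exceeds this threshold has waited longer than 93% of the typical repurchase gaps, which is the basis for flagging them as churned.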

In [ ]:
#Filter for cluster 0 and 2
cluster_0_2 = RFM[(RFM['Cluster'] == 0) | (RFM['Cluster'] == 2)]

#Define threshold for defining churn
churn_threshold = average_diff.reset_index()['diff'].quantile(0.93)

#Probability of churn
cluster_0_2['Churn_1'] = cluster_0_2['Recency']/churn_threshold
cluster_0_2['Churn'] = np.where(cluster_0_2['Churn_1']>1, 1, cluster_0_2['Churn_1'])
In [ ]:
cluster_0_2.groupby('Cluster')['Churn'].mean()
Out[ ]:
Cluster
0   0.517
2   1.000
Name: Churn, dtype: float64

Cluster 2 has a churn probability of 100%, indicating that no customer in this group has made a purchase in the past year. However, this recency-based method is only approximate and should not be treated as fully reliable.
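The recency-based churn proxy used above can be written compactly as churn = min(Recency / threshold, 1): churn risk grows linearly with days since last purchase and saturates at 1 once the threshold is exceeded. A tiny sketch with illustrative numbers (both the recency values and the threshold are made up):

```python
import numpy as np

# Illustrative recency values (days since last purchase) and a hypothetical threshold
recency = np.array([30.0, 90.0, 150.0, 400.0])
churn_threshold = 150.0  # in the notebook this is the 0.93 quantile of repurchase gaps

# Churn proxy: linear in recency, capped at 1 once recency exceeds the threshold
churn = np.minimum(recency / churn_threshold, 1.0)
print(churn)  # [0.2 0.6 1.  1. ]
```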

Because Cluster 2 has a churn probability of 100%, we will assume the CLV of this Cluster in the next 6 months is 0.
For Cluster 0, we will use XGBRegressor to predict the CLV of this Cluster for the next 6 months.

In [ ]:
# Calculate cutoff date for target variable
n_days = 180
max_date = data['TransactionDate'].max()
cutoff = max_date - pd.to_timedelta(n_days, unit='d')

# Split data
temporal_in_df = data[data['TransactionDate'] <= cutoff]
temporal_out_df = data[data['TransactionDate'] > cutoff]

# Making target data
targets_df = temporal_out_df[['CustomerID', 'SalesAmount']].reset_index(drop=True).\
    groupby('CustomerID').sum().rename({'SalesAmount':'spend_180_total'}, axis=1).assign(spend_180_flag=1)

# RFM
max_day = temporal_in_df['TransactionDate'].max() + timedelta(days=1)

# Calculate RFM values
features_df = temporal_in_df.reset_index(drop=True).groupby(['CustomerID']).agg({
    'TransactionDate': lambda x: (max_day - x.max())/pd.to_timedelta(1,"day"),
    'BillID': 'count',
    'SalesAmount': 'sum'}).merge(targets_df, left_index=True, right_index=True, how='left').fillna(0)

features_df['Cluster'] = RFM.Cluster

# Rename column
features_df.rename(columns={'TransactionDate': 'Recency', 'BillID': 'Frequency', 'SalesAmount': 'Monetary'}, inplace=True)
In [ ]:
# MACHINE LEARNING FOR REGRESSION
X = features_df[['Recency', 'Frequency', 'Monetary']]
y_spend = features_df['spend_180_total']

##Train and test split
from sklearn.model_selection import KFold, cross_val_score
X_train_reg, X_test_reg, y_train_reg, y_test_reg = train_test_split(X, y_spend, test_size=0.2, random_state=3)

#Create XGB Regression model and Tuning Model
xgb_reg = XGBRegressor(n_estimators=100, learning_rate=0.05, random_state=2, n_jobs=12)

kf = KFold(n_splits=5, random_state=1, shuffle=True)

cv_scores = cross_val_score(xgb_reg, X_train_reg, y_train_reg, cv=kf, scoring='neg_mean_absolute_error')

# Fit the model
xgb_reg.fit(X_train_reg, y_train_reg)

# Print the cross-validation scores
print('Cross-validation scores:', cv_scores)

#Predict X
predictions_reg = xgb_reg.predict(X_test_reg)
predictions_reg = np.where(predictions_reg < 0, 0, predictions_reg)

#Mean Absolute Error
print('Mean Absolute Error of XGB Regression Model: {}'.format(mean_absolute_error(y_test_reg, predictions_reg)))
Cross-validation scores: [-141070.05543361 -141681.34271143 -143925.69511609 -139769.06362863
 -144515.03960475]
Mean Absolute Error of XGB Regression Model: 141027.53955605143
In [ ]:
import optuna
from optuna.samplers import TPESampler

def objective(trial):
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 1.0),
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
        "max_depth": trial.suggest_int("max_depth", 1, 15),
        "min_child_weight": trial.suggest_int("min_child_weight", 1, 10),
        "gamma": trial.suggest_float("gamma", 0.0, 1.0),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.1, 1.0),
        "subsample": trial.suggest_float("subsample", 0.1, 1.0),
        "reg_alpha": trial.suggest_float("reg_alpha", 1e-8, 10.0, log=True),
        "reg_lambda": trial.suggest_float("reg_lambda", 1e-8, 10.0, log=True),
        "scale_pos_weight": trial.suggest_float("scale_pos_weight", 1e-6, 500.0, log=True)
    }
    
    model = XGBRegressor(**params, n_jobs=12)
    model.fit(X_train_reg, y_train_reg)
    
    # Use the validation set for evaluation
    preds = model.predict(X_test_reg)
    mae = mean_absolute_error(y_test_reg, preds)
    
    return -mae  # return negative MAE for maximization

optuna.logging.set_verbosity(optuna.logging.WARNING)

sampler = TPESampler(seed=1)
study = optuna.create_study(direction="maximize", sampler=sampler)
study.optimize(objective, n_trials=100)

best_params = study.best_params
best_score = study.best_value

print(f"Best parameters: {best_params}")
print(f"Best negative MAE score: {best_score}")
Best parameters: {'learning_rate': 0.8044705559265699, 'n_estimators': 343, 'max_depth': 1, 'min_child_weight': 8, 'gamma': 0.5212677432318678, 'colsample_bytree': 0.46482910817480444, 'subsample': 0.10165275993209091, 'reg_alpha': 1.412096705991424e-06, 'reg_lambda': 0.8675148598594582, 'scale_pos_weight': 3.3352054545636855e-05}
Best negative MAE score: -140164.43003257317
In [ ]:
# FINAL MODEL AND CLUSTER 0 CLV
# Fit model (note: these parameters are set manually and differ from the Optuna best_params above)
XGB_reg_tuned = XGBRegressor(reg_lambda=16, min_child_weight=12, max_depth=10, learning_rate=0.25, random_state=3)
XGB_reg_tuned.fit(X, y_spend)

# Predict
clv_predict = XGB_reg_tuned.predict(RFM[RFM['Cluster'] == 0][['Recency', 'Frequency', 'Monetary']])
clv_predict_tune = np.where(clv_predict < 0, 0, clv_predict)  # Setting CLV = 0 for negative values

result_df = pd.concat([pd.DataFrame(clv_predict_tune).set_axis(['clv'], axis=1),
                        RFM[RFM['Cluster'] == 0][['Recency', 'Frequency', 'Monetary']].reset_index()], 
                        axis=1)
result_df.index = result_df['CustomerID']
result_df['Cluster'] = RFM.Cluster
cluster_0_clv = result_df.groupby('Cluster')['clv'].mean()
cluster_0_clv.head()
Out[ ]:
Cluster
0   50440.219
Name: clv, dtype: float32
In [ ]:
# Concat 2 df
cluster_0_2 = cluster_0_2.merge(result_df, left_index=True, right_index=True, how='left', suffixes=('', '_x'))
cluster_0_2 = cluster_0_2[['Frequency', 'Recency', 'Monetary', 'Churn', 'clv', 'Cluster']].fillna(0)
cluster_1_3.rename(columns={'frequency': 'Frequency', 'recency': 'Recency', 'monetary_value': 'Monetary'}, inplace=True)
cluster_1_3 = cluster_1_3.drop(['T'], axis=1)
final_clv = pd.concat([cluster_0_2, cluster_1_3], axis=0)
final_clv.sort_index(ascending=True, inplace=True)
In [ ]:
print("Total CLV of each Cluster: ", final_clv.groupby('Cluster')['clv'].sum()) # Total CLV
print("Average CLV of each Cluster: ", final_clv.groupby('Cluster')['clv'].mean()) # Average CLV
print("Churn Probability of each Cluster: ", final_clv.groupby('Cluster')['Churn'].mean()) # Average Churn Probability
Total CLV of each Cluster:  Cluster
0   10771205926.307
1   49285196011.201
2             0.000
3    3115913941.070
Name: clv, dtype: float64
Average CLV of each Cluster:  Cluster
0    50440.218
1   355426.359
2        0.000
3    38039.774
Name: clv, dtype: float64
Churn Probability of each Cluster:  Cluster
0   0.517
1   0.364
2   1.000
3   0.862
Name: Churn, dtype: float64

4. Additional Insights¶

In this section, we will look at the company's Retention Rate and its number of new customers by month over time.

Retention Rate¶

In [ ]:
# Get the first month of each Customer
data = data.reset_index(drop=True)
def get_month(x): 
    return dt.datetime(x.year, x.month, 1)

data['transaction_month'] = data['TransactionDate'].apply(get_month)
first_month = data.groupby("CustomerID").agg({'transaction_month': 'min'}).reset_index().rename(columns={'transaction_month': 'first_month'})
cohort = data.merge(first_month, how='left', on='CustomerID')

# Calculate the months since Customer's first transaction month
def month_diff(a, b):
    return 12*(a.dt.year - b.dt.year) + (a.dt.month - b.dt.month)

cohort['diff'] = month_diff(cohort.transaction_month, cohort.first_month)
cohort_pivot = cohort.groupby(['first_month', 'diff'])['CustomerID'].nunique().reset_index()\
                    .pivot_table(values="CustomerID", index="first_month", columns="diff")

# Draw the heatmap of MoM retention rate
mom_retention_rate = cohort_pivot.divide(cohort_pivot.iloc[:, 0], axis=0).round(3) * 100

plt.figure(figsize=(18,14))
plt.title("MoM Retention Rate for Customer Transaction Data")

ax = sns.heatmap(data=mom_retention_rate, annot=True, vmin=0, vmax=10, cmap='crest', fmt=".1f")
ax = ax.set_yticklabels(mom_retention_rate.index.strftime('%Y-%m-%d'))

plt.xlabel('Months since first Order')
plt.ylabel('Month')
plt.show()
[Figure: heatmap of MoM retention rate by cohort]
  • The company's retention rate generally declines across cohorts. Of the customers acquired in 2021-10, 14.5% purchased again in the month after their first order, but retention for that cohort dropped to 6.8% after 20 months.
  • First-month retention has also declined across cohorts: it was 14.5% for the 2021-10 cohort but only 7% for the 2023-05 cohort.

New Customer by Month¶

In [ ]:
#Draw the number of new customer by month overtime
plt.figure(figsize=(12, 6))
sns.lineplot(cohort.groupby('first_month')['CustomerID'].nunique())
plt.xlabel('Month')
plt.ylabel('Number of Customer')
plt.show()
[Figure: number of new customers by month]

Pizza Hut's number of new customers decreased sharply between 2021-10 and 2023-06, from nearly 70,000 new customers in October 2021 to a little more than 20,000 in June 2023.

RECOMMENDATIONS¶

Cluster 0¶

BEHAVIOR:

  • Nearly 60% of customers use the “Take away” channel.
  • 57.6% of them order at the Store and 22.71% via the Website.
  • Most customers in Hanoi and the Northern & Southern Provinces use “Take away”, while customers in Ho Chi Minh City choose “Delivery”.

RECOMMENDATION:

  • Increase retention through loyalty programs, discounts, and promotions that encourage repeat purchases.
  • Focus on optimizing the in-store customer experience, and streamline the online ordering process and website interface.
  • Ask customers for feedback or reviews on their first purchase and use their insights to improve products and services.
  • Use geo-targeting or location-based marketing to send timely, relevant messages when customers are near a store or in the delivery area.

Cluster 1¶

BEHAVIOR:

  • 53.58% of customers use the “Delivery” channel.
  • 40.9% order from the Store and 21.27% from the App. Over time, customers have tended to order more via the App, while orders via the Call Center have decreased.
  • Although this cluster ranks only third in number of customers, it has the highest number of voucher users.
  • Most customers in the Northern & Southern Provinces use “Take away”, while customers in Hanoi & Ho Chi Minh City choose “Delivery”.

RECOMMENDATION:

  • Optimize delivery services: ensure timely deliveries and packaging that keeps food at the right temperature.
  • Provide attractive, personalized voucher offers to increase customer loyalty.
  • Use up-selling strategies to grow revenue while enhancing the customer experience.
  • Improve the ordering experience on the mobile app, advertise app-exclusive benefits, and build an appropriate promotion-code program.

Cluster 2¶

BEHAVIOR:

  • 58.45% of customers prefer “Take Away”, while 40.69% prefer “Delivery”.
  • 59.1% of customers ordered at the Store and 22.03% via the Website.
  • Most customers in Hanoi and the Northern & Southern Provinces use “Take Away”, while customers in Ho Chi Minh City choose “Delivery”.
  • Customers last purchased in August 2022.

RECOMMENDATION:

  • Deprioritize this group of customers; with an estimated churn probability of 100%, reactivation spending is unlikely to pay off.

Cluster 3¶

BEHAVIOR:

  • Most customers in the Northern & Southern Provinces use “Take Away”, while customers in Hanoi & Ho Chi Minh City choose “Delivery”.
  • These customers used to be frequent buyers, but they have not purchased in a long time.
  • These customers prefer “Delivery” (49.97%), though they also often order in-store (48.96%) or via the Website (22.68%).

RECOMMENDATION:

  • Craft personalized win-back campaigns for inactive but previously frequent buyers, offering compelling incentives or discounts to encourage their return.
  • Consider reaching out via phone calls with exclusive reactivation offers to reconnect with these customers personally.
  • Showcase any service improvements or enhancements made since their last purchase to entice them back.
  • Promote testimonials from returning customers to develop trust in the brand's quality and service.

General Recommendations¶

MARKETING & PROMOTIONS:

  • Provide a culturally immersive experience by deploying pizza food trucks on weekends or holidays at famous locations in Hanoi or Ho Chi Minh City, such as Hoan Kiem Lake or Nguyen Hue Walking Street.
  • Use KOLs and KOCs to expand brand recognition.
  • Open another signature restaurant in Hanoi to boost customers' Dine-in experience.
  • Use trending TV shows as a marketing channel.
  • Promote marketing campaigns through engaging activities and challenges on social media.

--> By taking the above actions, Pizza Hut can attract new customers at a time when the number of new customers has been declining month after month.

ENHANCING CUSTOMER LOYALTY:

  • Strengthen the loyalty program with tiered rewards and exclusive benefits such as vouchers, discounts, and coupons.
  • Optimize the Pizza Hut mobile app for a seamless and engaging user experience, offering exclusive promotions and features.

--> Since Pizza Hut's retention rate tends to decrease over time, this is an important step in maintaining a consistent base of loyal customers.

INNOVATION IN SERVICES & PRODUCTS:

  • Continue introducing innovative menu items to cater to diverse tastes, and consider limited-time releases to create excitement and encourage exploration of the menu (e.g. introducing new flavors for the current L1MO - 1 meter pizza).
  • For customizable products (My Box), create an interactive, digital experience where customers can customize their pizza with their favorite toppings, crust, and size.

FEEDBACK & CONTINUOUS DEVELOPMENT:

  • Establish easy-to-access feedback mechanisms for continuous improvement, and reward customers for submitting feedback.
  • Analyze customer feedback to identify areas for improvement in service, menu, or overall experience.
  • Improve information systems to reduce unknown values in the Gender field, helping the company gain specific insights for marketing plans and selling strategies around holidays such as International Women’s Day, Vietnamese Women’s Day, or Men’s Day.