Now that we've created the ML model and an endpoint to serve up the inferences, we'd like to connect that endpoint to Amazon Aurora. That way we can request a prediction on whether this customer will churn at the same time that we retrieve information about this customer.
In addition, we'll call Amazon Comprehend to Amazon Aurora. That way, we can also request an assessment of the customer's sentiment when they send a message to customer service.
With both of these pieces of information in hand, we can then make an on-the-fly decision about whether to offer the customer an incentive program of some kind. Of course, the details of that incentive and the rules on when to offer it must come from Marketing.
Note that for simplicity we're using a predefined Amazon SageMaker endpoint_name here. The AWS CloudFormation template created this endpoint (together with an endpoint configuration), added it to an IAM role (this role authorizes the users of Aurora database to access AWS ML services), and assigned the Aurora Database cluster parameter group value 'aws_default_sagemaker_role' to this IAM role. This combination of settings gives Aurora permission to call the Amazon SageMaker endpoint.
If you'd like to read further on this setup, documentation on how to create the policy and a role can be found here. Details on how to create a custom database parameter group are described here.
If the Python module mysql.connector
is not installed, install it with pip.
import sys
# upgrade pip
!{sys.executable} -m pip install --upgrade pip
!{sys.executable} -m pip install mysql.connector
import mysql.connector as mysql
For this use case, we've created the S3 bucket and appropriate IAM roles for you during the launch of the AWS CloudFormation template. The bucket name was saved in a parameter file called "cloudformation_values.py" during creation of the notebook instance, along with the DB secret name and ML endpoint name.
# import installed module
import mysql.connector as mysql
import os
import pandas as pd
import numpy as np
import boto3
import json
import cloudformation_values as cfvalues
# get the session information
session = boto3.Session()
# extract the region and account id
region = cfvalues.REGION
# AWS Secrets stores our database credentials.
db_secret_name = cfvalues.DBSECRET
# Get the secret from AWS Secrets manager. Extract user, password, host.
from utilities import get_secret
get_secret_value_response = get_secret(db_secret_name, region)
creds = json.loads(get_secret_value_response['SecretString'])
db_user = creds['username']
db_password = creds['password']
# Writer endpoint
db_host = creds['host']
# This is the Amazon SageMaker preset endpoint_name created by the Cloud Formation
endpoint_name = cfvalues.ENDPOINT
print(endpoint_name)
# Define the database and table names
database_name = "telecom_customer_churn"
churn_table = "customers"
customer_msgs_table = "customer_message"
customer_churn_results = "customer_churn_results"
Connect to the database using the credentials retrieved above.
# create connection to the database
cnx = mysql.connect(user = db_user,
password = db_password,
host = db_host,
database = database_name)
dbcursor = cnx.cursor(buffered = True)
Let's first test that we can call Amazon Comprehend from our SQL query, and return the sentiment for a customer message. We'll use the messages we inserted into our "customer call history" table in the part 1 to test this capability.
sql = """SELECT message,
aws_comprehend_detect_sentiment(message, 'en') AS sentiment,
aws_comprehend_detect_sentiment_confidence(message, 'en') AS confidence
FROM {};""".format(customer_msgs_table)
dbcursor.execute(sql)
dbcursor.fetchall()
Here we can see the customer's sentiment, based on the text of their customer service contact text. We have an overall assessment, such as 'POSITIVE', and a numeric confidence. We can use the assessment and the score to make a decision on what to offer the customer.
Now we need to set up the Aurora database to call the Amazon SageMaker endpoint and pass the data it needs to return an inference.
Our original data contained numeric variables as well as several categorical variables (such as area_code
and int_plan
) which are needed for prediction. During creation of the ML model, the categorical variables were converted to one-hot vectors.
In the final model, we used only 1 of these values: int_plan_no
. There are two ways to approach this problem:
Here we will demonstrate the second option.
Below, we've listed the features used by our final model. If this list has changed in content or in order for your run, you will need to modify the steps below so that they match your list.
cols_used = ['acc_length', 'vmail_msg', 'day_mins', 'day_calls', 'eve_mins', 'night_mins', 'night_calls', 'int_calls', 'int_charge', 'cust_service_calls', 'int_plan_no']
cols_used = ['acc_length', 'vmail_msg', 'day_mins', 'day_calls', 'eve_mins', 'night_mins',
'night_calls', 'int_calls', 'int_charge', 'cust_service_calls', 'int_plan_no']
dbcursor.execute("DESCRIBE {churn_table};".format(churn_table=churn_table))
dbcursor.fetchall()
Create functions to perform one-hot encoding:
# one-hot encoding for int_plan
dbcursor.execute("DROP FUNCTION IF EXISTS IntPlanOneHot;")
sql = """CREATE FUNCTION IntPlanOneHot(int_plan varchar(2048))
RETURNS INT
BEGIN
DECLARE int_plan_no INT;
IF int_plan = 'no' THEN SET int_plan_no = 1;
ELSE SET int_plan_no = 0;
END IF;
RETURN int_plan_no;
END
;"""
dbcursor.execute(sql)
# one-hot encoding for area_code to generate area_code_510
# While this function is not used for this model run, we provide it as an additional demonstration,
# and in case a similar feature is used in a later model run
dbcursor.execute("DROP FUNCTION IF EXISTS AreaCode510;")
sql = """CREATE FUNCTION AreaCode510(area_code bigint(20))
RETURNS INT
BEGIN
DECLARE area_code_510 INT;
IF area_code = 510 THEN SET area_code_510 = 1;
ELSE SET area_code_510 = 0;
END IF;
RETURN area_code_510;
END
;"""
dbcursor.execute(sql)
# one-hot encoding for area_code to generate area_code_510
# While this function is not used for this model run, we provide it as an additional demonstration,
# and in case a similar feature is used in a later model run
dbcursor.execute("DROP FUNCTION IF EXISTS stateTX;")
sql = """CREATE FUNCTION stateTX(state varchar(2048))
RETURNS INT
BEGIN
DECLARE state_TX INT;
IF state = 'TX' THEN SET state_TX = 1;
ELSE SET state_TX = 0;
END IF;
RETURN state_TX;
END
;"""
dbcursor.execute(sql)
Quick demonstration that the functions have been created and work correctly:
dbcursor.execute("""SELECT IntPlanOneHot(int_plan), AreaCode510(area_code), stateTX(state),
int_plan, area_code, state FROM {} LIMIT 5;""".format(churn_table))
dbcursor.fetchall()
We need to create a function that passes all the information needed by the Amazon SageMaker endpoint as described here (see section "Enabling Aurora Machine Learning"). Here we will create a function will_churn
that will use the variables needed by the model. Notice that we are now including the columns created by one-hot encoding in the previous section. The endpoint name was declared in the beginning of the notebook.
If the list of columns printed below has changed in content or in order for your run, you will need to modify the steps below so that they match your list.
print(cols_used)
# REMEMBER! to modify the columns in the SQL below to match the cols_used (if it doesn't already)
dbcursor.execute("DROP FUNCTION IF EXISTS will_churn;")
sql = """CREATE FUNCTION will_churn (
acc_length bigint(20),
vmail_msg bigint(20),
day_mins double,
day_calls bigint(20),
eve_mins double,
night_mins double,
night_calls bigint(20),
int_calls bigint(20),
int_charge double,
cust_service_calls bigint(20),
int_plan_no int)
RETURNS float
alias aws_sagemaker_invoke_endpoint
endpoint name '{endpoint}' ; """.format(endpoint = endpoint_name)
dbcursor.execute(sql)
cnx.commit()
Now we can call the function with data from our table and ask for the results of the prediction.
# To make it easier to carry across SQL without error, we'll capture the parameters in a function here.
will_churn_fn = """will_churn(acc_length, vmail_msg, day_mins, day_calls, eve_mins, night_mins,
night_calls, int_calls, int_charge, cust_service_calls, IntPlanOneHot(int_plan) )"""
sql = """SELECT state, area_code, phone, round(day_charge + eve_charge + night_charge + int_charge, 2),
int_plan, cust_service_calls,
round({will_churn_fn},3) will_churn FROM {customers}
LIMIT 5;""".format(will_churn_fn = will_churn_fn, customers = churn_table)
dbcursor.execute(sql)
dbcursor.fetchall()
If the previous command returns a list of entries, then the request from Amazon Aurora for predictions from the model worked!
The last value in the tuple is the model's prediction score for whether this customer will churn.
Currently the model returns a prediction value before assigning a cutoff (since we deployed it to return such a value). We could choose to convert this value to a Boolean True or False either via a modification to the SageMaker endpoint, or via an additional transformation on the DB side. However, in this case we'll leave it, so at some later time we can explore these values in greater detail. For example, there is likely a large range of "maybe churn", between "will" and will not churn". From a Marketing perspective, these are the customers we'd ideally likely to identify and target.
Now let's add sentiment detection into the SQL request.
dbcursor.execute("""SELECT day_mins, cust_service_calls, int_plan,
round({will_churn_fn},3) will_churn,
aws_comprehend_detect_sentiment('You morons! You charged me extra again!', 'en') AS sentiment,
round(aws_comprehend_detect_sentiment_confidence('You morons! You charged me extra again!', 'en'),3)
AS confidence
FROM customers
WHERE area_code=415 AND phone='358-1921';""".format(will_churn_fn = will_churn_fn))
dbcursor.fetchall()
The values returned are:
Now we're finally ready to put all the pieces together in our campaign to prevent customer churn!
We've received our first round of proposed incentives from Marketing. We've coded their rules into a function, suggest_incentive, shown below.
After the function, we'll send it some test requests.
# Create a select with a join: customer_message, customers; with a call on the fly to do the one-hot encoding AND call comprehend
import random
import json
def suggest_incentive(day_mins, cust_service_calls, int_plan_no, will_churn, sentiment, confidence):
# Returns a suggestion of what to offer as a rebate to this customer, based on their churn characteristics and this interaction
if sentiment == 'POSITIVE' and confidence>0.5:
if will_churn < 0.5: # Basically happy customer
return "Sentiment POSITIVE and will_churn<0.5: No incentive."
else: # Good interaction, but at-risk-of churn; let's offer something
return "Sentiment POSITIVE and will_churn>0.5: $5 credit"
elif sentiment == 'NEGATIVE' and confidence>0.7:
if will_churn > 0.8: # oh-oh! High odds! Pull out all stops
return "Sentiment NEGATIVE and will_churn>0.8: $25 credit"
elif will_churn > 0.4: # Not so bad, but still need to offer something. But what?
if random.choice([0,1]) == 1:
return "Will_churn confidence > 0.4, experiment: $15 credit"
else:
return "Will_churn confidence > 0.4, experiment: $5 credit"
else: # Pretty happy customer, we'll trust it's just a blip
return "Will_churn confidence <= 0.4: No incentive."
elif cust_service_calls > 2 and not int_plan_no:
return "cust_service_calls > 4 and not int_plan_no: 1000 free minutes of international calls"
else:
return "NOT (cust_service_calls > 4 and not int_plan_no): No incentive."
return "No incentive."
def assess_and_recommend_incentive(area_code, phone, message):
sql = """SELECT day_mins, cust_service_calls, IntPlanOneHot(int_plan) as int_plan_no,
round({will_churn_fn},3) as will_churn,
aws_comprehend_detect_sentiment('{message}', 'en') AS sentiment,
round(aws_comprehend_detect_sentiment_confidence('{message}', 'en'),3)
AS confidence
FROM {customers}
WHERE area_code={area_code}
AND phone='{phone}';""".format(will_churn_fn = will_churn_fn,
customers = churn_table,
message = message,
area_code = area_code,
phone = phone)
dbcursor.execute(sql)
result = dbcursor.fetchone()
incentive = suggest_incentive(result[0], result[1], result[2], result[3], result[4].decode(), result[5])
ret = {"area_code": area_code,
"phone": phone,
"service_calls": result[1],
"international_plan": 1 - result[2],
"churn_prob": result[3],
"msg_sentiment": result[4].decode(),
"msg_confidence": result[5],
"incentive": incentive
}
return ret
print(assess_and_recommend_incentive(408, '375-9999' , "You morons! You charged me extra again!"), "\n")
print(assess_and_recommend_incentive(415, '358-1921', "How do I dial Morocco?"), "\n")
print(assess_and_recommend_incentive(415, '329-6603', "Thank you very much for resolving the issues with my account"), "\n")
Now we have the ability to detect, during a customer interaction, whether this is an "at-risk" customer, and to intervene - to provide them with an incentive to stay. We can choose the factors on which we want to base our response: the sentiment of this interaction, combined with the customer's current characteristics, interaction history, and estimated risk. We can randomize our responses, too, in order to begin to gather more data on what effective incentives are.
We have the beginnings of understanding the key factors, and the beginnings of an experimentation platform.
We can now begin to add experimentation and nuance to our incentive program:
As we explore these alternatives, we move from prediction, to providing tangible and actionable business value. Over to you, Marketing!