0:05

So, this is the result of hierarchical clustering done using XLSTAT. XLSTAT uses agglomerative clustering, and you can see the y-axis in the dendrogram on the left. The higher up the connection, the more different the groups being connected are. So this branch connects two very different groups. This branch connects two groups that are still fairly different, but not as different as that initial branch. And as we get further toward the bottom, the differences between the individuals who are being connected get smaller.
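To make that reading of the dendrogram concrete, here is a minimal sketch in Python of agglomerative clustering. It is not what XLSTAT does internally: the average-linkage rule, the Euclidean distance, the `agglomerate` function, and the six made-up respondents are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import dist


def agglomerate(points):
    """Repeatedly merge the two closest clusters.

    Returns one (height, members) pair per merge, where height is the
    distance at which the two clusters were joined.
    """
    clusters = [[i] for i in range(len(points))]  # every individual starts alone

    def linkage(pair):
        a, b = pair
        # Average linkage (an assumption): mean distance over all cross-cluster pairs.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((linkage((a, b)), sorted(a + b)))
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    return merges


# Hypothetical scores for six respondents: three tight pairs, one pair far from the rest.
scores = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0), (5.2, 5.0)]
merges = agglomerate(scores)
for height, members in merges:
    print(f"joined at height {height:.2f}: {members}")
```

The early merges join near-identical individuals at small heights, and the last merge, the top branch of the dendrogram, joins the two most different groups at the largest height. Stopping before the last two merges leaves three clusters, which is what cutting the tree at a dashed line amounts to.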

We can see where the software has indicated the dashed line for us, and we can also see it on the other graph, which is just a zoomed-in view of this. On that dashed line, if we think about how many clusters that represents, you can see the labeling here: the one here in blue is one large cluster, and then it looks like we've got two smaller clusters on the other side, one represented in red and one represented in green.

So, this is a visual way of seeing it, based on the variables that we're using. What we used were the factor scores from our previous exercise. Based on those nine factor scores, what we've determined is that there appear to be three different segments, based on the factors that were put into the analysis. So that's nice: it tells us that we might want to look to have three clusters in our data. And what you can do is go through and see which individuals each of these are, and assign the individuals to clusters based on this approach. A slightly easier way of doing it is to take the results from the hierarchical clustering, where we say there are three clusters.

2:21

So, what K-means clustering does: this is going to be an iterative method. I'm going to tell the algorithm how many clusters I want, and I'm going to begin with a random assignment of individuals to the different clusters. Iteration by iteration, we're going to reshuffle that mix. What we're seeking, again, is that individuals, or respondents, who are put within the same cluster have relatively small differences from each other, and that the cluster centers are more different from each other. So within-cluster differences are small and between-cluster differences are large, right?
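The iterative reshuffling described here can be sketched in a few lines of Python. This is a bare-bones illustration rather than XLSTAT's implementation: the `kmeans` and `within_ss` helpers, the Euclidean distance, the fixed seed, and the nine made-up points are assumptions, and many packages start from random centers instead of a random assignment.

```python
import random
from math import dist


def mean(points):
    """Column-wise average of a list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def kmeans(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Begin with a random assignment of individuals to clusters
    # (dealt round-robin after a shuffle so that no cluster starts empty).
    order = list(range(len(points)))
    rng.shuffle(order)
    labels = [0] * len(points)
    for position, idx in enumerate(order):
        labels[idx] = position % k

    centers = []
    for _ in range(max_iter):
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the previous center if a cluster has lost all its members.
            new_centers.append(mean(members) if members else centers[c])
        centers = new_centers
        # Reshuffle: move every individual to its nearest cluster center.
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # nobody moved, so the assignment is stable
        labels = new_labels
    return labels, centers


def within_ss(points, labels, centers):
    """Sum of squared distances from each individual to its own cluster center."""
    return sum(dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical factor scores for nine respondents.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (3.0, 3.1), (3.2, 2.9),
          (2.9, 3.0), (0.1, 4.0), (0.0, 4.2), (0.3, 3.9)]
labels, centers = kmeans(points, k=3)
print(labels, round(within_ss(points, labels, centers), 3))
```

Because the starting assignment is random, different seeds can settle on different solutions, so it is common to run the procedure several times and keep the run with the smallest within-cluster sum of squares.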

So, within XLSTAT, under Analyze Data, you choose K-means clustering and specify where the variables are that you want to use as the basis for segmentation. Here, the basis for segmentation is the factor scores for the nine different factors we got out of that automotive survey. We also specify how many segments, or how many classes, we want. So I've told the computer: here are my nine items, the nine factors, the nine themes that we identified, and how each respondent scored on each of those themes. And I want you to allocate my respondents into three different classes.

3:40

And so one of the outputs that we get from K-means clustering is this table that tells us the center of each cluster. Think of this almost as the average score within the cluster. What I've done in this table is put the high and low values for the different scores in bold. So, look at cluster one, which has 122 respondents in it; you can notice that our clusters are relatively evenly sized. What does cluster one look like? They score higher on the financial freedom dimension. They score lower on the optimism dimension and higher on the societal indifference dimension. They're the highest scoring on the family dimension.

Â 4:27

And they scored low on environmental indifference. So those were the nine factors that were put in as the basis for segmentation. Purchase intent I've added here after the fact; that's not one of our bases for segmentation. So, what we've done is build profiles of the three segments using these nine scores.
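As a rough illustration of profiling, the sketch below averages the segmentation variables within each segment and then averages purchase intent separately, since it was not a basis for the clustering. The `profile` helper and all of the numbers are hypothetical, with two factor scores standing in for the nine.

```python
from collections import defaultdict


def profile(labels, rows):
    """Average each column of `rows` within each segment label."""
    grouped = defaultdict(list)
    for label, row in zip(labels, rows):
        grouped[label].append(row)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in sorted(grouped.items())
    }


# Hypothetical data: segment membership, two factor scores, and a purchase-intent rating.
segments = [1, 1, 2, 2, 3, 3]
factor_scores = [(0.9, -0.5), (1.1, -0.7), (-0.2, 1.0),
                 (0.0, 1.2), (-0.8, -0.4), (-1.0, -0.6)]
purchase_intent = [(3,), (4,), (6,), (7,), (2,), (2,)]

centers = profile(segments, factor_scores)   # the variables the segments were built on
intent = profile(segments, purchase_intent)  # added after the fact, only to describe them

for seg in centers:
    print(seg, [round(v, 2) for v in centers[seg]], intent[seg])
```

Keeping the two calls separate mirrors the distinction in the table: the centers describe what defines a segment, while the purchase-intent average is an outcome laid on top of it.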

So, segment one seems to have financial freedom, but they're not optimistic and they're not image oriented. Segment two is very optimistic, very patriotic, image oriented, and adventurous, and not focused on the family.

Â 5:03

And then segment three seems to be the least patriotic and also the least adventurous. Well, these segments are relatively equal in size. We can look at the average response, in terms of purchase intention, for each of these three segments. And that's where it gets interesting. What we see is that, far and away, class two, or segment two, has the highest purchase intent. These are the people who are the most interested in this particular product. Well, the people who are most interested are optimistic, they're patriotic, they're image conscious, and they're adventurous. So, when we're going about building our marketing campaign and coming up with the ad creative, these are the individuals that we want to appeal to.

Â 5:45

The next most interested in the product is segment one, and the least interested in the product is segment three. So, based on the survey, we've conducted the factor analysis to get a better understanding of those underlying themes. We've used those underlying themes to form three different market segments. And we've identified which segment is the most interested in the product and how we want to communicate with them. So, that's what segmentation allows us to do.

Â 6:25

that psychographic profile using factor analysis. Now, we've used cluster analysis to build our segments. We've identified which segment is the most interested in the product. That's going to allow us to build communications material to target those individuals. Now, the challenge is: how do we reach that segment? What media are they using? Who are these people?

Â 6:49

And that's where another technique comes into play. It's referred to as discriminant analysis, and essentially this is the mirror image of cluster analysis. Cluster analysis said: I have information about customers, but I need to organize them in such a way that I have similar segments. What discriminant analysis does is say: you tell me the segments that your customers belong to and who your customers currently are, and I'm going to tell you the most important criteria for reaching those customers. Which are the factors that you can look at to say, someone who scores high on this dimension, that's what puts them in a particular cluster?

So, we can summarize the general idea of discriminant analysis. I have a set of individuals for whom I know which segment they belong to, and I have additional information about those individuals. I want to figure out which of that information is the most informative, so that when a new person walks in and I have those demographics or psychographics available to me, I know which ones are the most diagnostic for assigning them to a particular segment.

Â 8:24

That is, discriminant analysis and linear regression share the idea of using a kind of linear combination of your predictors as part of the algorithm. One place where they differ is the objective function. We're not trying to minimize our sum of squared errors, like we do in linear regression. What discriminant analysis is trying to do is maximize the hit rate. That is, I want to assign customers to the right segment. If I get it right, that's a success. If I get it wrong, that's a failure. So the hit rate is the average; think of it as your success rate: how frequently am I accurately assigning people to segments? So, we want to maximize that.

The other place where discriminant analysis and regression differ is the nature of the outcome variable. The outcome that we are using with discriminant analysis is group assignment. Think of that as a categorical outcome, whereas with linear regression we are dealing with continuous measures.

Â 9:23

Just to show you the screenshot within XLSTAT, if you are using that particular tool: your outcome variable is going to be qualitative. That is the group membership number. If you have three segments, you belong to segment one, segment two, or segment three. The X's are the explanatory variables. Those could be quantitative measures or qualitative measures, but they are the predictors that we're going to use to try to assign individuals to different segments. Now, the important thing that we want to look at with discriminant analysis is just how good a job we are doing at predicting where someone belongs.

Â 10:16

So within SPSS, and within most packages, you can look at your original, or calibration, data. This is the actual group membership versus the predicted group membership. Of the respondents who were actually in segment one, we predicted 75.8% to go to segment one. Of the respondents who were in segment two, we predicted about 90% to go to segment two. Of the respondents who were in segment three, we predicted about 83% to go to segment three. And the overall hit rate is given to us in the footnote at the bottom: 83.3% of the respondents were classified correctly. So, that diagonal tells us the accurate classifications. The off-diagonals are where we screwed up. So that's for the calibration data.
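The arithmetic behind that table is simple enough to sketch in Python. The `hit_rates` helper is hypothetical, and the counts below are made up for illustration; they are not the figures from the SPSS output.

```python
def hit_rates(confusion):
    """confusion[i][j] = respondents actually in segment i who were predicted into segment j."""
    # Per-segment hit rate: the diagonal count divided by that row's total.
    per_segment = [row[i] / sum(row) for i, row in enumerate(confusion)]
    total = sum(sum(row) for row in confusion)
    # Overall hit rate: everything on the diagonal divided by everyone.
    overall = sum(confusion[i][i] for i in range(len(confusion))) / total
    return per_segment, overall


# Hypothetical counts: rows are actual segments, columns are predicted segments.
confusion = [
    [40, 6, 4],
    [3, 45, 2],
    [5, 7, 38],
]
per_segment, overall = hit_rates(confusion)
print([f"{rate:.1%}" for rate in per_segment], f"overall {overall:.1%}")
```

Each row is read across: the diagonal entry is the share classified correctly, and the other entries in the row show which segments the misclassified respondents were sent to.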

One of the nice things that some software packages do for you is cross-validation. Let me omit some of my data, and let me see how accurately I'm able to predict the membership for those particular observations, even though they're not used for calibrating the model. And you see we do not a bad job here: in the cross-validation case, 81.8% were classified correctly. So, we don't drop too much.
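Here is one way the hold-out idea could look in Python. Two things are assumptions made to keep the sketch short: it holds out one respondent at a time (leave-one-out), which is only one form of cross-validation, and it swaps in a simple nearest-centroid rule in place of discriminant analysis. The function names and the single made-up predictor are hypothetical.

```python
from math import dist


def centroids(rows, labels):
    """Average predictor values for each segment in the calibration data."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return {g: tuple(sum(c) / len(m) for c in zip(*m)) for g, m in groups.items()}


def predict(row, cents):
    """Assign a respondent to the segment with the nearest centroid."""
    return min(cents, key=lambda g: dist(row, cents[g]))


def calibration_hit_rate(rows, labels):
    # Fit and score on the same respondents.
    cents = centroids(rows, labels)
    return sum(predict(r, cents) == y for r, y in zip(rows, labels)) / len(rows)


def loo_hit_rate(rows, labels):
    # Hold each respondent out, fit on the rest, then classify the one held out.
    hits = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        hits += predict(row, centroids(train_rows, train_labels)) == label
    return hits / len(rows)


# Hypothetical data: one predictor per respondent and two known segments.
rows = [(0.0,), (1.0,), (2.0,), (1.0,), (2.9,), (5.0,), (6.0,), (5.0,)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

calibration = calibration_hit_rate(rows, labels)
cross_validated = loo_hit_rate(rows, labels)
print(calibration, cross_validated)
```

In this toy data the borderline respondent at 2.9 is classified correctly only when it helps define its own segment's centroid, so the cross-validated hit rate comes out below the calibration hit rate, which is the kind of drop being checked for here.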

Now, one of the things that I've done in this exercise: we had the factor analysis, which ultimately gave us nine underlying behavioral themes. And so what I had done was to say, okay, those nine factors are the result of a survey, and it was a pretty lengthy survey, around 30 or so items. Suppose that I don't get a chance to give that survey to someone every time they come to a car dealership, but I'd like to know which segment they're in. Instead of asking them 30 questions, what if I was only able to ask them one question that corresponds to each of the factors? How accurately would I be able to classify people? So, I've gone from 30 questions down to just nine questions, and we see that if we can get the answers to just those nine questions, we do a pretty good job of classifying people into these different segments. So that might be one approach: your sales associates may not be able to get all the information from the survey out of consumers, but they can get some of that information.

12:47

Now, another way we can think about discriminant analysis being used: suppose I have very detailed surveys, with hundreds of questions, and I've run them through the cluster analysis. Well, I'd like to be able to identify those people as they come into my store or my dealership. But in order to do that, I have to rely only on demographic information. We can run discriminant analysis using just demographics. So, even if our survey was all based on psychographic, attitudinal responses, we can still see how good a job demographics do at capturing the differences that exist across these segments.

All right, so, as far as takeaways from the session: customer segmentation is a fundamental task within marketing. We're doing segmentation, we're doing targeting, we're doing positioning. As far as forming those segments, we get a lot of the data that we need from surveys. You can also form market segments if you're doing marketing analytics such that you're getting customer-level coefficients. Where you're doing conjoint analysis and getting customer preferences for different product attributes, we can form market segments based on those coefficients. But conducting factor analysis and then forming market segments based on the factor scores is a very common way of dealing with survey data.

So, what we've talked about in this session is how we move from having those factor scores to assigning individuals to the different segments. We've built profiles for those segments. We can describe them, and we can say which of these segments is the most likely to be interested in the product. All right, but what are the next steps? We've said that there are different market segments, and some segments are more interested in the product than others. We still have to know: how big are those segments relative to each other? Which ones can we more easily reach, and what's the appropriate way of reaching them? In which segments might we face more competition when we go after them?

So, this is by no means the end of the road; at best, we're in the middle. We understand our consumers better than we did without doing the segmentation, and we've identified the more homogeneous groups of consumers within each of the segments. The next task is to take these results and figure out the marketing mix and the media mix that are going to be appropriate for those segments that we ultimately decide are worth going after.
