but what happens in month eight?

What happens in month nine?

Well, it turns out that in a lot of those cases,

those forecasting models don't do a very good job.

We have some models that, even though we're dealing with survival curve projections,

start to go up again because of the functional form that was chosen.

We have others, like the linear model, that keep on going down at that same rate.

And so, none of these do a particularly good job of being able to

forecast out what customer retention looks like in the future.

Even though all of them have good R-squared values,

none of them did a good job at forecasting the future performance or

the future decisions of this cohort of customers.
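
To make that concrete, here is a minimal sketch in Python. The retention numbers are made up for illustration and are not the cohort from the lecture, and the choice of a linear and a quadratic curve, the eight observed months, and the 24-month horizon are all assumptions. On this made-up data both curves fit the observed months well, yet the line eventually drops below zero and the quadratic turns back up, which a survival curve can never do.

```python
import numpy as np

# Hypothetical data (not from the lecture): share of a cohort still active
# at the end of months 0 through 7.
months = np.arange(8)
surviving = np.array([1.00, 0.87, 0.74, 0.65, 0.59, 0.54, 0.50, 0.47])


def r_squared(y, fitted):
    """In-sample R-squared: 1 - residual sum of squares / total sum of squares."""
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Fit two simple functional forms to the observed months only.
linear = np.polyfit(months, surviving, deg=1)
quadratic = np.polyfit(months, surviving, deg=2)

r2_linear = r_squared(surviving, np.polyval(linear, months))
r2_quadratic = r_squared(surviving, np.polyval(quadratic, months))

# Extrapolate both curves beyond the data (months 8 through 24).
future = np.arange(8, 25)
linear_forecast = np.polyval(linear, future)
quadratic_forecast = np.polyval(quadratic, future)

print(f"R-squared, linear:    {r2_linear:.3f}")
print(f"R-squared, quadratic: {r2_quadratic:.3f}")
for m, lin, quad in zip(future, linear_forecast, quadratic_forecast):
    print(f"month {m:2d}: linear {lin:6.3f}  quadratic {quad:6.3f}")
```

A good in-sample fit says nothing about whether the functional form behaves sensibly once you leave the range of the data.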

So can we build something based on a simple model that, when we start putting

all of the pieces together, allows us to get very good forecasts?

Well, that's going to be the goal.

All right,

a couple of things to keep in mind when we're dealing with timing models.

And this does not come up in any of the other forms of data that we

have talked about.

Well, for timing models,

we only observe actions taken during a specific period of time.

So for example,

let's say that we're looking at customer retention for a 12-month period.

Well, we observe all the customers who dropped service

during that 12-month period.

We also observe a set of customers at the end of 12 months who still have service.

Well, that issue is referred to as right-censoring, all right?

I observe data only during this particular window, 0 to T.

What happens after T?

I have no idea.
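
One common way to write right-censored data down is a duration plus a flag that says whether the event was actually seen. The sketch below assumes a 12-month window and a handful of made-up customers; the names `Observation` and `observe` are hypothetical, chosen just for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    duration: int   # months we actually watched this customer
    churned: bool   # True if we saw them drop service inside the window


def observe(churn_month: Optional[int], window: int) -> Observation:
    """Turn a customer's true churn month into what we get to see in 0..window.

    churn_month is None for a customer who never drops service.
    """
    if churn_month is not None and churn_month <= window:
        return Observation(duration=churn_month, churned=True)
    # Still a customer at the end of the window: right-censored at T.
    return Observation(duration=window, churned=False)


# Hypothetical true churn months (not from the lecture).
true_churn_months = [3, 15, None, 12, 7]
observed = [observe(m, window=12) for m in true_churn_months]

for obs in observed:
    print(obs)
```

The customer who leaves in month 15 and the one who never leaves look identical in the data: both are simply "still here at month 12".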

Left-censoring is a different problem:

we only start observing at a particular point in time.

We observe everything that happens after that.

Well, we don't get to observe what happened before that.

So suppose that we're looking at a queue.

People lined up at a customer service window, and

we have some people who are in that line and we know what time we got there.

We have no idea what time the people who were there before us got there.

So we have a lower bound on how long they've been there, but

we don't know the exact time. That's the issue of left-censoring.
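
The queue example can be sketched the same way. Here times are minutes on a made-up clock, and the function name `waiting_time` and the use of `None` for "already in line when we arrived" are assumptions for illustration only.

```python
from typing import Optional, Tuple


def waiting_time(arrival: Optional[float],
                 observation_start: float,
                 now: float) -> Tuple[float, bool]:
    """Return (minutes waited, is_exact) for one person in the line.

    arrival is None for someone who was already in line when we started
    observing, so all we have for them is a lower bound.
    """
    if arrival is None:
        # Left-censored: they have waited at least as long as we have watched.
        return now - observation_start, False
    return now - arrival, True


# Hypothetical line: we arrived at minute 0 and it is now minute 30.
ahead_of_us = waiting_time(None, observation_start=0, now=30)
behind_us = waiting_time(10, observation_start=0, now=30)

print("person ahead of us: at least", ahead_of_us[0], "minutes")
print("person behind us:  exactly", behind_us[0], "minutes")
```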

With interval-censoring, we know that something happened in a particular interval of time.

Let's say within a particular hour or within a particular 15-minute chunk,

but we don't have it down to the exact second.

That's going to be more common for us to have to deal with,

just in terms of the nature of the data that's coming in.

If you're dealing with clickstream data, that's something that we might have.

We might intentionally group observations together into

more coarse units to simplify our analysis.

And in fact, in the examples that we're going to be looking at,

that's what we're going to do: we're going to assume

discrete intervals of time rather than continuous time.
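
Grouping exact times into coarser units is a one-line operation. This sketch assumes 15-minute intervals and a few made-up event times in minutes; `to_interval` is a hypothetical helper name, and the half-open convention (start included, end excluded) is one choice among several.

```python
from collections import Counter
from typing import Tuple


def to_interval(t_minutes: float, width: float = 15.0) -> Tuple[float, float]:
    """Map an exact time to the half-open interval [start, end) that contains it."""
    k = int(t_minutes // width)
    return (k * width, (k + 1) * width)


# Hypothetical event times in minutes (not from the lecture).
event_times = [3.2, 14.9, 15.0, 31.5, 44.0, 59.99]

intervals = [to_interval(t) for t in event_times]
counts = Counter(intervals)

for (start, end), n in sorted(counts.items()):
    print(f"[{start:4.0f}, {end:4.0f}): {n} event(s)")
```

After this step each event is known only up to its interval, which is exactly the interval-censored, discrete-time form of the data.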

But with general timing models, you can account for all of these forms of censoring.