Data repositories in which cases are related to subcases are identified as hierarchical. This course covers the representation schemes of hierarchies and the algorithms that enable analysis of hierarchical data, and provides opportunities to apply several methods of analysis.

Associate Professor at Arizona State University in the School of Computing, Informatics & Decision Systems Engineering and Director of the Center for Accelerating Operational Efficiency

K. Selcuk Candan

Professor of Computer Science and Engineering and Director of ASU’s Center for Assured and Scalable Data Engineering (CASCADE)

In our previous modules,

we talked about visualization design,

we talked about multivariate data,

and how to compute statistics for those.

We talked about different sorts of data types and data structures,

and a lot of data that we're collecting currently has time built-in.

If you think about driving your car around town,

you're driving at a certain time of day,

as you move you're moving through traffic over time,

you may be collecting stock market trends and wanting

to predict how much the stock market might make tomorrow.

You have your calendar of events of things that you're doing,

you might want to be able to visualize your calendar,

you might want to be able to understand

different patterns of the days of the week of how you're going about your day.

All of these things require techniques for temporal analysis and visualization.

In this module, we want to talk about what we mean by

temporal analysis and set the stage for data exploration and visualization.

So time is an outstanding dimension

reflected in Shneiderman's task by data type taxonomy.

Time becomes a critical element that we want to be able to think about how we're going to

visualize to help people understand the progression of events when they occur,

relationships, and patterns in between them.

Time oriented data is ubiquitous.

Like I said, we have stock markets, movie trends, medicine.

Each data case may be an event of some kind, with one variable being the date and time.

So we may be interested in what time of day more crime occurs.

We may want to visualize a sports match,

where we have four quarters and things are happening over time,

we want to try to summarize the plays.

All of these have time oriented elements underlying the data structures.

So, we want to think about how we can quickly summarize data,

how we can find motifs and examples of related elements in those.

So, the ubiquity of time series data

can be shown simply by just looking at the newspaper.

If I take a random selection of 4,000 graphs from

50 newspapers and magazines worldwide from 1974 to 1980,

75 percent of these graphs were time series.

So, what's great about time series data is

that general audiences are familiar with time series.

They understand, they sort of have

been inundated with these graphs so they're used to looking at time series graphs,

where essentially we may have some time on the x-axis and value on y.

So this might be our stock market trend for example, right?

So looking at how the stock market changes over time.

So we can start thinking about what questions we want to ask about time series data,

not necessarily the questions of the visualization,

but if I'm giving you a time series data set,

what things do you want to know?

So some questions might be,

does a data object exist at a certain time?

So, for example, are there any robberies in my neighborhood at night?

Or, when does a certain object exist in the data?

If there are robberies,

what time of day do they occur?

How long does a data object exist?

So, for example, if we're tracking people at

an amusement park, trying to think about how long they're waiting in line for a ride,

how long is that person at that location waiting to get on the ride?

How fast and how much does that object change?

So think about trajectories and cars.

What order did the objects appear in?

Is there a cyclical pattern?

Which objects exist at the same time?

So, all of these are questions we can ask about our time series data,

and we want to start thinking about how we can explore and ask these questions.
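To make these questions concrete, here is a minimal sketch of how a few of them could be asked of a small event list. The records, the event names, and the definition of "night" (21:00 to 05:00) are all made-up assumptions for illustration, not data from the lecture.

```python
from collections import Counter
from datetime import datetime

# Hypothetical event records: (event type, start time, end time).
events = [
    ("robbery", datetime(2023, 5, 1, 23, 15), datetime(2023, 5, 1, 23, 40)),
    ("robbery", datetime(2023, 5, 3, 2, 5), datetime(2023, 5, 3, 2, 20)),
    ("ride_queue", datetime(2023, 5, 3, 14, 0), datetime(2023, 5, 3, 14, 45)),
]


def is_night(t):
    """Assumed definition of night: 21:00 up to (not including) 05:00."""
    return t.hour >= 21 or t.hour < 5


# Does a data object exist at a certain time?
any_night_robbery = any(
    kind == "robbery" and is_night(start) for kind, start, _ in events
)

# When does it exist? Count robberies by the hour of day they start in.
robbery_hours = Counter(
    start.hour for kind, start, _ in events if kind == "robbery"
)

# How long does it exist? Length of each queue wait in minutes.
wait_minutes = [
    (end - start).total_seconds() / 60
    for kind, start, end in events
    if kind == "ride_queue"
]

# What order did the objects appear in? Sort by start time.
order = [kind for kind, _, _ in sorted(events, key=lambda e: e[1])]
```

Each question turns into a filter, a count, a difference, or a sort over the time variable, which is roughly what an interactive visualization would be doing behind the scenes.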

Time has a lot of different elements to it.

Time can be and is ordered.

We think about time in a progression,

so we have yesterday, today, and tomorrow.

We split this up and think about what occurred before,

what occurred after, and we try to forecast what might come next.

In that sense, it's continuous.

But we also need to think about time as cyclical.

So we have hours of the day,

days of the week,

months of the year, seasons, and those sorts of things.

So things might repeat based on these underlying cyclical patterns of the environment.

Time can also be independent of location.

So, we can think about linear time versus cyclical time.

In linear time, one point precedes another,

and time being ordered is closely bound to the notion of causality.

So, we're going to talk in future modules

about space and how to explore data over space and time.

In this module, we want to focus primarily on time,

where one event happens and then another and so forth.

An event could just be a measurement in the stock market.

How much money did the stock market make this second,

then the next second, then the next.

It could be a measure of how much did temperature change

over time, or what the average temperature was from one year to the next.

In cyclical time,

the ordering of points in the time domain is somewhat meaningless.

Winter comes before summer,

but winter also comes after summer, depending on how we organize this.

Is Monday the first day of the week?

Is Sunday the first day of the week?

Sure, Monday comes after Sunday,

but it also could come before Sunday depending on how we organize this.
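One way to see the difference between linear and cyclical time is to compare distances. The sketch below is only an illustration; the function name `cyclic_distance` and the numeric encodings of hours and weekdays are assumptions, not something defined in the lecture.

```python
def cyclic_distance(a, b, period):
    """Shortest distance between two positions on a cycle of length `period`."""
    d = abs(a - b) % period
    return min(d, period - d)


# Hours of the day: on a linear axis 23:00 and 01:00 look 22 hours apart,
# but on the 24-hour cycle they are only 2 hours apart.
linear_gap = abs(23 - 1)
hour_gap = cyclic_distance(23, 1, 24)

# Days of the week under two assumed numbering conventions.
monday_first = {"Mon": 0, "Sun": 6}   # week starts on Monday
sunday_first = {"Sun": 0, "Mon": 1}   # week starts on Sunday

gap_monday_first = cyclic_distance(monday_first["Sun"], monday_first["Mon"], 7)
gap_sunday_first = cyclic_distance(sunday_first["Sun"], sunday_first["Mon"], 7)
```

Under either convention Sunday and Monday come out one day apart, which is the point: on a cycle, the choice of which day is "first" changes the labels but not how close two points are.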

So we want to think about how to combine these ideas of cyclical with linear,

and we want to think about what questions we should start asking our time series data.

So, given a large chunk of time series data, in this module,

we're going to discuss different methods

for extracting and exploring interesting features in time series data.

Remember our visual analytics mantra: analyze first, show the important.

So if we have a whole bunch of time series data,

how do we determine what's important?

There may be a lot of different ways,

and there may be questions you as the analyst want to ask of time series data.

You may say, hey,

I'm interested in when my data has an upward trend.

Show me all the times where we have this upward trend,

or you may even want to say,

show me the times when we have an upward trend followed by a downward trend.
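A query like "an upward trend followed by a downward trend" could be sketched as follows. This is one simple approach under stated assumptions (a trend is a maximal run of consecutive increases or decreases, and the `prices` values are made up); it is not the method used later in the course.

```python
def trend_runs(values):
    """Split a series into maximal runs of 'up', 'down', or 'flat' steps.

    Returns a list of (direction, start_index, end_index) tuples.
    """
    runs = []
    for i in range(1, len(values)):
        diff = values[i] - values[i - 1]
        direction = "up" if diff > 0 else "down" if diff < 0 else "flat"
        if runs and runs[-1][0] == direction:
            # Extend the current run to include this step.
            runs[-1] = (direction, runs[-1][1], i)
        else:
            runs.append((direction, i - 1, i))
    return runs


def up_then_down(values):
    """Find each upward run that is immediately followed by a downward run.

    Returns (start_index, peak_index, end_index) for every match.
    """
    runs = trend_runs(values)
    return [
        (a[1], a[2], b[2])
        for a, b in zip(runs, runs[1:])
        if a[0] == "up" and b[0] == "down"
    ]


# Hypothetical series of values, e.g. daily closing prices.
prices = [10, 12, 15, 14, 11, 13, 16, 16, 12]
matches = up_then_down(prices)  # [(0, 2, 4)]
```

Here the series rises from index 0 to 2 and then falls to index 4, so that span is reported. The second rise (index 4 to 6) is followed by a flat step before the drop, so under this strict definition it is not a match; a real tool would likely add smoothing or a tolerance.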

So we can start thinking about how to ask questions of

changes in our data through interactive visualizations.

So we want to combine

different data analysis techniques with different data visualizations,

and think about how to wrap these in an interactive package to

allow for advanced exploration,

knowledge discovery, and reasoning. Thank you.
