In this video we will be covering NumPy in 1D,

in particular ND arrays.

NumPy is a library for scientific computing.

It has many useful functions.

There are many other advantages like speed and memory.

NumPy is also the basis for pandas.

So check out our pandas video.

In this video we will be covering

the basics and array creation,

indexing and slicing,

basic operations, universal functions.

Let's go over how to create a NumPy array.

A Python list is a container that

allows you to store and access data.

Each element is associated with an index.

We can access each element using

a square bracket as follows.

A NumPy array or ND array is similar to a list.

It's usually fixed in size

and each element is of the same type,

in this case integers.

We can cast a list to

a NumPy array by first importing NumPy.

We then cast the list as follows;

we can access the data via an index.

As with the list, we can access

each element with an integer and a square bracket.

The value of a is stored as follows.

If we check the type of the array we get, Numpy.ndarray.

As NumPy arrays contain data of the same type,

we can use the attribute

dtype to obtain the data type of the array's elements.

In this case a 64-bit integer.

Let's review some basic array attributes

using the array a.

The attribute size is

the number of elements in the array.

As there are five elements the result is five.

The next two attributes will make

more sense when we get to higher dimensions,

but let's review them.

The attribute ndim represents

the number of array dimensions or the rank of the array,

in this case one.

The attribute shape is a tuple of

integers indicating the size of

the array in each dimension.

We can create a NumPy array with real numbers.

When we check the type of the array,

we get numpy.ndarray.

If we examine the attribute D type,

we see float64 as the elements are not integers.

There were many other attributes, check out numpy.org.

Let's review some indexing and slicing methods.

We can change the first element of

the array to 100 as follows.

The array's first value is now 100.

We can change the fifth element of the array as follows.

The fifth element is now zero.

Like lists and tuples we can slice a NumPy array.

The elements of the array correspond

to the following index.

We can select the elements from one to three and assign

it to a new NumPy array d as follows.

The elements in d correspond to the index.

Like lists, we do not count

the element corresponding to the last index.

We can assign the corresponding indices

to new values as follows.

The array c now has new values.

See the labs or numpy.org for

more examples of what you can do with NumPy.

NumPy makes it easier to do

many operations that are

commonly performed in data science.

The same operations are

usually computationally faster and

require less memory in NumPy compared to regular Python.

Let's review some of these operations

on one-dimensional arrays.

We will look at many of the operations in the context of

Euclidian vectors to make things more interesting.

Vector addition is a widely used operation

in data science.

Consider the vector u with two elements,

the elements are distinguished by the different colors.

Similarly, consider the vector v with two components.

In vector addition, we create

a new vector in this case z.

The first component of z

is the addition of the first component

of vectors u and v. Similarly,

the second component is

the sum of the second components of

u and v. This new vector z is now

a linear combination of the vector u and

v. Representing vector addition

with line segment or arrows is helpful.

The first vector is represented in red,

the vector will point in

the direction of the two components.

The first component of the vector is one.

As a result the arrow is offset

one unit from the origin in the horizontal direction.

The second component is zero,

we represent this component in the vertical direction.

As this component is zero,

the vector does not point in the horizontal direction.

We represent the second vector in blue.

The first component is zero,

therefore the arrow does not

point to the horizontal direction.

The second component is one.

As a result the vector points in

the vertical direction one unit.

When we add the vector u and v,

we get the new vector z.

We add the first component,

this corresponds to the horizontal direction.

We also add the second component.

It's helpful to use the tip to

tail method when adding vectors,

placing the tail of the vector v on the tip of vector u.

The new vector z is constructed by connecting

the base of the first vector u with the tail of the

second v. The following three lines of code

we'll add the two lists and place

the result in the list z.

We can also perform

vector addition with one line of NumPy code.

It would require multiple lines to perform

vector subtraction on two lists

as shown on the right side of the screen.

In addition, the NumPy code will run much faster.

This is important if you have lots of data.

We can also perform vector subtraction by changing

the addition sign to a subtraction sign.

It would require multiple lines

perform vector subtraction

on two lists as shown on the right side of the screen.

Vector multiplication with a scalar is

another commonly performed operation.

Consider the vector y,

each component is specified by a different color.

We simply multiply the vector by

a scalar value in this case two.

Each component of the vector is multiplied by two,

in this case each component is doubled.

We can use the line segment or

arrows to visualize what's going on.

The original vector y is in purple.

After multiplying it by a scalar value of two,

the vector is stretched out by two units as shown in red.

The new vector is twice as long in each direction.

Vector multiplication with a scalar only

requires one line of code using NumPy.

It would require multiple lines

to perform the same task as

shown with Python lists

as shown on the right side of the screen.

In addition, the operation would also be much slower.

Hadamard product is

another widely used operation in data science.

Consider the following two vectors,

u and v. The Hadamard product

of u and v is a new vector z.

The first component of z is the product of

the first element of u and v. Similarly,

the second component is

the product of the second element of

u and v. The resultant vector consists of

the entry wise product of u and v. We can

also perform hadamard product

with one line of code in NumPy.

It would require multiple lines to perform

hadamard product on two lists

I shown on the right side of the screen.

The dot product is

another widely used operation in data science.

Consider the vector u and v,

the dot product is a single number given by

the following term and

represents how similar two vectors are.

We multiply the first component from v and u,

we then multiply the second component

and add the result together.

The result is a number that represents

how similar the two vectors are.

We can also perform dot product using the NumPy function

dot and assign it with the variable result as follows.

Consider the array u,

the array contains the following elements.

If we add a scalar value to the array,

NumPy will add that value to each element.

This property is known as broadcasting.

A universal function is a function that

operates on ND arrays.

We can apply a universal function to a NumPy array.

Consider the arrays a,

we can calculate the mean or average value of

all the elements in a using the method mean.

This corresponds to the average of all the elements.

In this case the result is zero.

There are many other functions.

For example, consider the NumPy arrays b.

We can find the maximum value using the method five.

We see the largest value is five,

therefore the method max returns a five.

We can use NumPy to create functions that

map NumPy arrays to new NumPy arrays.

Let's implement some code on the left side of the screen

and use the right side of

the screen to demonstrate what's going on.

We can access the value of pie in NumPy as follows.

We can create the following NumPy array in radians.

This array corresponds to the following vector.

We can apply the function sin to the array

x and assign the values to the array y.

This applies the sine function

to each element in the array,

this corresponds to applying

the sine function to each component of the vector.

The result is a new array y,

where each value corresponds to

a sine function being applied to

each element in the array x.

A useful function for plotting

mathematical functions is line space.

Line space returns evenly

spaced numbers over specified interval.

We specify the starting point of the sequence,

the ending point of the sequence.

The parameter num indicates

the number of samples to generate,

in this case five.

The space between samples is one.

If we change the parameter num to nine,

we get nine evenly spaced numbers

over the integral from negative two to two.

The result is the difference

between subsequent samples is

0.5 as opposed to one as before.

We can use the function line space to generate 100

evenly spaced samples from the interval zero to two pie.

We can use the NumPy function sine to

map the array x to a new array y.

We can import the library pyplot as

plt to help us plot the function.

As we are using a Jupiter notebook,

we use the command matplotlib inline to display the plot.

The following command plots a graph.

The first input corresponds to

the values for the horizontal or x-axis.

The second input corresponds to

the values for the vertical or y-axis.

There's a lot more you can do with NumPy.

Check out the labs and NumPy.org for more.

Thanks for watching this video.