The numpy module

The numpy module is one of the most useful packages for doing math and physics in Python. Numpy contains several tools, including new data types, functions for common mathematical tasks, and tools for working with special functions, series, and linear algebra. For us, the most important tool that numpy provides is the array data type. We will spend this lesson working extensively with arrays and a few tools that allow us to create and manipulate them.

Arrays

So far in this course, we’ve worked extensively with Python lists. You should remember that lists are great because they are very generic: each element in a list can be any data type. This has many practical uses. However, that flexibility means lists can’t be used as easily for mathematical operations as might otherwise be possible. The numpy array data type fixes this. Numpy arrays allow us to create N-dimensional variables so that we can do things like vector and matrix operations efficiently. For example:

[1]:
import numpy as np

myarray = np.array([[4,5,6],[1,3,2],[9,1,7],[8,2,5]])
print(myarray)
[[4 5 6]
 [1 3 2]
 [9 1 7]
 [8 2 5]]

Here I’ve created a 2D array (or matrix) that has 3 columns and 4 rows. I did this using numpy’s array() function, which takes an “array like” object as a required argument. Here, the array like object that I passed was a list of normal Python lists, one per row. Once created, we can do things like index our array similarly to how we would index a list. Since our array is 2D, we need 2 indices to select a single element:

[2]:
print(myarray[2,1])
1

Note that the first index corresponds to the row number and the second index corresponds to the column. I can also grab a slice of the array:

[3]:
print(myarray[:,1])
print(myarray[0,0:2])
[5 3 1 2]
[4 5]

The first example grabs all elements from the middle column (column index 1). The second example grabs the first two elements from the first row (row index 0).
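To make the comparison with lists concrete, here is a small sketch that stores the same numbers both as a nested list and as an array. The names nested, column_from_list, and column_from_array are introduced only for this illustration.

```python
import numpy as np

# The same data stored two ways.
nested = [[4, 5, 6], [1, 3, 2], [9, 1, 7], [8, 2, 5]]  # plain list of lists
myarray = np.array(nested)

# A nested list needs one pair of brackets per level.
print(nested[2][1])      # row 2, column 1
# An array takes both indices inside a single pair of brackets.
print(myarray[2, 1])

# Pulling a column out of a nested list takes a loop (here a list comprehension)...
column_from_list = [row[1] for row in nested]
# ...while an array can be sliced along any dimension directly.
column_from_array = myarray[:, 1]

print(column_from_list)
print(column_from_array)
```

Both approaches return the same four numbers, but only the array lets us ask for a whole column in one indexing step.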

Arrays can have any number of dimensions. So, let’s see what a 3D array looks like.

[4]:
threeDarray = np.array([[[1,2],[4,6]],[[3,6],[9,5]],[[5,2],[7,8]]])
print(threeDarray)
[[[1 2]
  [4 6]]

 [[3 6]
  [9 5]]

 [[5 2]
  [7 8]]]

This might be a little hard to visualize, since you are probably looking at a 2D monitor, but I’ve created a 3x2x2 array here. You can tell from how I defined it: the first pair of two-element lists is enclosed in its own set of square brackets, as are the second and third pairs. Python is showing us 3 slices of our 3D array, and each slice is itself a 2D matrix. If you think of this data as being in the shape of a cube, the top 2D matrix is the front face of the cube:

[5]:
print(threeDarray[0,:,:])
[[1 2]
 [4 6]]

Then we have the middle face:

[6]:
print(threeDarray[1,:,:])
[[3 6]
 [9 5]]

and finally the back face:

[7]:
print(threeDarray[2,:,:])
[[5 2]
 [7 8]]

Of course, I can slice in any dimension:

[8]:
print(threeDarray[:,1,:])
print("")
print(threeDarray[:,1,0])
[[4 6]
 [9 5]
 [7 8]]

[4 9 7]

Slicing with one index fixed to a single value (top example) gives me an object with N-1 dimensions, where N is the number of dimensions of the array itself. Slicing with two indices fixed (second example) gives me an object with N-2 dimensions.
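The dimension counting described above can be checked directly. This sketch uses the ndim attribute, which reports how many dimensions an array has. The last line is an extra observation that the text above does not cover: using a range such as 0:1 in place of a single integer keeps that dimension instead of dropping it.

```python
import numpy as np

threeDarray = np.array([[[1, 2], [4, 6]], [[3, 6], [9, 5]], [[5, 2], [7, 8]]])

print(threeDarray.ndim)              # the full array: N = 3
print(threeDarray[:, 1, :].ndim)     # one index fixed: N-1 = 2
print(threeDarray[:, 1, 0].ndim)     # two indices fixed: N-2 = 1
print(threeDarray[:, 0:1, :].ndim)   # a range instead of an integer: still 3
```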

Building arrays

You will probably never create an array in the way that I demonstrated above: by calling the array() function with actual numbers. Instead, you will more likely build arrays by filling them with the results of some mathematical operation, using variables instead of hard-coded values as in those examples. In this course, we’ve made extensive use of the append() method of the list data type to do this sort of thing. For example, we might calculate the trajectory of a baseball pitch by looping over some number of milliseconds and calculating the position of the ball during each time step. We would use the append() method to add an element to some position list and in that way build the list element by element as our code runs.

Numpy arrays don’t have access to this method, so we can’t append to them in the same way. For this reason, it is common to convert between lists and arrays when working with data. However, this can get confusing when working with arrays that have 3 or more dimensions. So, a reasonable question is: how do we create arrays if we can’t hard code the data that we are working with? If we can’t append to an array, how can we add elements to it when we don’t know their values before the code runs?

The usual answer is that, before the main part of our code runs, we create an array of the appropriate size filled with placeholders. We don’t have to append to the array because all of the elements already exist. Instead, we just change the elements as our code runs. The zeros() function helps us do this by creating an array of a specific shape filled with, you guessed it, zeros.

[9]:
newarray = np.zeros((5,3))
print(newarray)
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

Just like that I have a 5x3 array ready to be manipulated. Note the strange looking format of the call to zeros(). The first argument of zeros() is a tuple that contains the shape of the array to be created. That’s why there are two sets of parentheses right next to each other. zeros() can take other arguments, so if that extra set of parens weren’t there, the “3” would look like argument #2 instead of part of argument #1.
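As a short illustration of why the tuple matters, the sketch below shows the second argument that zeros() accepts, dtype, which selects the data type of the elements. The variable names floats, ints, and vector are chosen only for this example.

```python
import numpy as np

floats = np.zeros((2, 3))             # default: floating point zeros
ints = np.zeros((2, 3), dtype=int)    # the second argument selects the data type
vector = np.zeros(4)                  # a 1D array can take a plain integer

print(floats)
print(ints)
print(vector)
```

The trailing dots in the printed output (0. rather than 0) are how numpy marks floating point values. With dtype=int the dots disappear.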

Now that the array is created, I can loop through it and modify each value:

[10]:
for i in range(3):
    for j in range(5):
        newarray[j,i] = i+j

print(newarray)
[[0. 1. 2.]
 [1. 2. 3.]
 [2. 3. 4.]
 [3. 4. 5.]
 [4. 5. 6.]]

This is just one simple example. Normally, the body of the loop would hold some physics calculation.
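To connect this back to the baseball example, here is a minimal sketch of the two ways of building a trajectory: growing a list with append() and converting it, or preallocating with zeros() and filling it in place. The physical numbers (g, v0, dt, nsteps) and the variable names are assumptions made up for illustration, and the motion is a simple drag-free throw.

```python
import numpy as np

# Hypothetical numbers for illustration: a ball thrown horizontally, no drag.
g = 9.8        # gravitational acceleration, m/s^2
v0 = 40.0      # initial horizontal speed, m/s
dt = 0.001     # time step: one millisecond
nsteps = 5     # number of time steps to record

# Approach 1: grow a list with append(), then convert it to an array.
pos_list = []
for n in range(nsteps):
    t = n * dt
    pos_list.append([v0 * t, -0.5 * g * t**2])
from_list = np.array(pos_list)

# Approach 2: preallocate with zeros() and fill the rows in place.
trajectory = np.zeros((nsteps, 2))   # column 0 holds x, column 1 holds y
for n in range(nsteps):
    t = n * dt
    trajectory[n, 0] = v0 * t
    trajectory[n, 1] = -0.5 * g * t**2

print(trajectory)
print(np.array_equal(from_list, trajectory))   # True if both approaches agree
```

The preallocated version requires knowing the number of time steps in advance, which is usually the case when looping over a fixed time range.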

Copying arrays (and lists)

Making copies of a variable is a common task in programming. It is not unusual to see something like the following:

[11]:
lista = [5,6,2]
listb = lista
print("listb: ",listb)
listb[0] = listb[0]+3
print("listb: ",listb)
print("lista: ",lista)
listb:  [5, 6, 2]
listb:  [8, 6, 2]
lista:  [8, 6, 2]

That should seem weird to you. We made lista and then tried to make a copy called listb. Then, we changed one element of listb and found that lista had also been changed! In Python, assigning with = in this way does not make a real copy. Instead, listb becomes a second name for the same list, so a change made through one name shows up under the other. Numpy arrays work the same:

[12]:
arraya = np.array([4,2])
arrayb = arraya
arrayb[0] = 10
print(arraya)
[10  2]

Again, I made arrayb by assigning arraya to it. Then, when I change arrayb, arraya is automatically changed as a result. What if we don’t want this behavior? If we make a copy and then modify the copy, can the original stay the same? Sure! Both lists and arrays have access to the copy() method!

[13]:
arrayc = arraya.copy()
print(arrayc)
print(arraya)
arrayc[0] = 100
print(arrayc)
print(arraya)
[10  2]
[10  2]
[100   2]
[10  2]

Now when we change the copy (arrayc) the original stays the same.
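The same idea can be sketched for lists, along with a way to check whether two names refer to the same object. The is operator returns True only when both names point at one and the same object. The names alias, listc, arrayb, and arrayc below are used just for this illustration.

```python
import numpy as np

lista = [5, 6, 2]
alias = lista            # a second name for the same list
listc = lista.copy()     # an independent copy

print(alias is lista)    # True: one object, two names
print(listc is lista)    # False: a separate object

listc[0] = 100           # modify only the copy
print(lista)             # the original is unchanged
print(listc)

# Arrays behave the same way.
arraya = np.array([4, 2])
arrayb = arraya          # same array, second name
arrayc = arraya.copy()   # independent copy
print(arrayb is arraya)  # True
print(arrayc is arraya)  # False
```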

Array shape

When we are working with an array, one piece of information that we often need is its shape. In other words, how many dimensions does the array have, and how many elements are there along each dimension? We can get that information quickly using the shape attribute.

[14]:
print(newarray.shape)
(5, 3)

Note that shape isn’t a method, it’s an attribute of the array object, so we don’t use parens. Anyway, the attribute tells us that our array is 2D and is 5x3. Shape also works on slices:

[15]:
print(threeDarray.shape)
print(threeDarray[:,1,:].shape)
(3, 2, 2)
(3, 2)
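One common use of shape, sketched here under the assumption that the array is 2D, is to set loop limits from the array itself instead of hard coding them. The names nrows and ncols are chosen for this example.

```python
import numpy as np

newarray = np.zeros((5, 3))

nrows, ncols = newarray.shape    # unpack the tuple: 5 rows, 3 columns
for i in range(nrows):
    for j in range(ncols):
        newarray[i, j] = i + j   # same fill rule as the earlier loop

print(newarray)
print(len(newarray.shape))       # the length of the tuple is the number of dimensions
```

If the array is later created with a different size, the loops adjust automatically.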