Data structures

Until now all the variables we have used have contained a single piece of information, for example, a = 4 makes a variable a containing a single number, 4. It’s very common to want to refer to collections of data. You can think, for example, of a bank statement that contains the list of expenses you had last month.

Python has several build-in data structures that facilitate working with this common kind of data. In this beginners course we will only see list, but keep in mind there are other built-in data structures.

A list is a data type that stores an ordered sequence of elements and can be created using square brackets [], with elements separated by commas ,. Let’s create a new script named list.py with the following code:

list.py
my_list = ["cat", "dog", "horse"]

print(my_list)

This will create a Python list with three elements and assign it to the variable my_list. The square brackets [ and ] in this case mean “create a list” and the elements of the list are separated by commas. As with previous variable types, you can print lists by passing their name to the print() function. Run this script in the terminal with python list.py and look at the output.

['cat', 'dog', 'horse']

You can have as many items in a list as you like, even zero items as in:

my_list = []

You can even have a mixture of different type of data types:

my_list = [5, 34.6, "Hello", -6, False]
Exercise

Edit the script list.py so that my_list has some more items in it. Try adding some different data types or rearranging items. Make sure you save the file and rerun it in the terminal and check that the output matches what you expect.

We have changed our list by adding two more items to the end of it. We add an integer, 7 and a new string, "quail". Each is still separated by a comma:

list.py
my_list = ["cat", "dog", "horse", 7, "quail"]

print(my_list)
Terminal/Command Prompt
python list.py
['cat', 'dog', 'horse', 7, 'quail']

Here we edit our list so that the items are all in a different order:

list.py
my_list = ["quail", "cat", 7, "dog", "horse"]
print(my_list)
['quail', 'cat', 7, 'dog', 'horse']

Indexing

The power of Python’s lists comes not simply from being able to hold many pieces of data but from being able to get specific pieces of data out. The primary method of this is called indexing. Indexing a list in Python is done using the square brackets []. This is a different use of the square brackets to that which we saw above for making a list.

To get a single element out of a list you write the name of the variable followed by a pair of square brackets with a single number between them:

list.pys
my_list = ["cat", "dog", "horse"]

print(my_list[1])
dog

The code my_list[1] means “give me the number 1 element of the list my_list”. Is the above output what you expected?

Probably you noticed that it prints dog whereas you may have expected it to print cat. This is because in Python you count from zero when indexing lists and so index 1 refers to the second item in the list. To get the first item you must use the index 0. This “zero-indexing” is very common and is used in many programming languages.

Exercise

Try accessing some different elements from the list by putting in different number between the square brackets.

As well as setting my_element to the “0th” element of the list, we also print the value of the element at index 2:

my_list = ["cat", "dog", "horse"]

my_element = my_list[0]
print(my_element)

print(my_list[2])
cat
horse

Reverse indexing

Putting a single positive number in the square brackets gives us back the element which is at that distance from the start of the list, but what if we want the last element? If we know the length of the list (in our case here, 3 elements) then we can use that to know the index of the last element (in this case, 2), but perhaps we don’t know how long the list is (or we don’t want to check it).

In this case we can use Python’s reverse indexing by placing a negative integer in the square brackets:

list.py
my_list = ["cat", "dog", "horse"]

print(my_list[-1])
horse

If you run this code then you will see that it prints horse which is the last item in the list. Using negative numbers allows you to count backwards from the end of the list so that -1 is the last item, -2 is the second-last item etc.

Slicing

As well as being able to select individual elements from a list, you can also grab sections of it at once. This process of asking for subsections of a list of called slicing. Slicing starts out the same way as standard indexing (i.e. with square brackets) but instead of putting a single number between them, you put multiple numbers separated by a colon.

Between the square brackets you put two numbers, the starting index and the ending index. So, to get the elements from index 2 to index 4, you do:

my_list = [3, 5, "green", 5.3, "house", 100, 1]

my_slice = my_list[2:5]

print(my_slice)
['green', 5.3, 'house']

You see that is printed ['green', 5.3, 'house'] which is index 2 ('green'), index 3 (5.3) and index 4 ('house'). Notice that it did not give us the element at index 5 and that is because with slicing, Python will give you the elements from the starting index up to, but not including, the end index.

Errors while working with lists

It is very likely that indexing lists is the first time you will see a Python error. Seing Python errors (also sometimes called exceptions) is not a sign that you’re a bad programmer or that you’re doing something terrible. Even experienced programmers see Python errors on their screen.

Error messages are in fact a very useful feedback mechanism for the programmer but that can be a bit daunting when you first see them. Let’s recreate a typical error message: a list with three elements will not have an element at index 6 (the highest index in that case would be 2) and produce an error if we ask for it.

list.py
my_list = ["cat", "dog", "horse"]

my_element = my_list[6]

print(my_element)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[12], line 3
      1 my_list = ["cat", "dog", "horse"]
----> 3 my_element = my_list[6]
      5 print(my_element)

IndexError: list index out of range

Running this you will see the following printed to the screen:

which a very dense collection of information. Usually is the the last line of an error as that is where the most useful information is. In this case, the last line reads IndexError: list index out of range which has two parts to it. The first is the word before the colon which tells you the type of the exception is an IndexError, i.e. an error when indexing. The second part of that line is usually a slightly more descriptive message, in this case telling us that the specific problem was that the index was “out of range”, i.e. too high or too low.

Moving to the lines above that, we see printed the line of code at which the exception occured along with the file name and line number within the script. These are essential in larger scripts to track down where the problem came from.

Take your time to read the error messages when they are printied to the screen, they will most likely help you solve the issue. If you think that you’ve fixed the problem but the error persists, make sure that you’ve saved the script file and rerun your code afterwards.

Exercise

Edit your script to print various slices of a list. If you get an error printed, make sure you understand what it is telling you.

We make a list and then select a few slices of it:

list.py
my_list = [3, 5, "green", 5.3, "house", 100, 1]

print(my_list[2:4])
print(my_list[3:-2])
print(my_list[-4:-1])
print(my_list[33:37])

The first print selects every element from index-2 up to, but not including, the index-4, i.e. the 2nd and 3rd elements.

The second print starts at index 3 and goe as far as index -2 (which is the same as index 5 in this list).

The next print starts at index -4 (i.e. index 3) and goes until index -1 (i.e. index 6).

Last, print(my_list[33:37]) tries to access elements of a range that do not exist, but the behaviour of Python is different than when accessing a single element as we saw before. In this case, it returns and empty list instead of showing an error message. This wrapping of slicing calls give us some flexibility when accessing a list, but will requiere us to check, in most cases, that the output is not an empty list.

Terminal/Command Prompt
python list.py
['green', 5.3]
[5.3, 'house']
[5.3, 'house', 100]
[]

Adding elements to lists

Lists in Python are dynamic, meaning that they can change in size during your script execution. You can add items to the end of your list by using the append function. The append function is a little different to other functions that we have used so far (like print and range) in that it is a part of the list data type so we use it in a slighlty different way.

my_list = ["cat", "dog", "horse"]

my_list.append("quokka")

print(my_list)
['cat', 'dog', 'horse', 'quokka']

Here you see we gave the name of our list (my_list) followed it by a dot (.) and then the name of the function that we wanted to call (append). Functions which are part of data types like this are sometimes called methods. We might describe the middle line here as “calling the append method on the object my_list”.