Python Cheat Sheet

Python is a strongly typed, dynamically typed, interpreted, general-purpose programming language. The language is widely used for teaching, data science and analytics, web development, and scripting.

This cheat sheet reviews core programming language concepts and vocabulary. This should get you back up to speed if it’s been a while since you’ve written in Python, or if you’re familiar with another language and need a rapid succession of examples.

Designing Programs

Programming languages are built from five essential components.¹


Variables	store a value for later use. `x = 1`
Conditionals	choose a behavior based on an observation. `if`, `elif`, `else`
Repetition	repeat a procedure until some condition is met. `for`, `while`
Abstraction	encapsulate a behavior; hide the details. `def`, `class`, `import`
Application	invoke an abstraction to return a result. `x + 1`

Every complex program—operating systems, video games, machine learning models, space shuttles—is at some low level of abstraction doing all five of these things. Major innovations happened over the last fifty years that made computers faster, smaller, and more affordable; but the core operation of transforming data is still here.
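As a small sketch, all five components can appear in a handful of lines. The function name `count_evens` and the sample list are made up for illustration:

```python
def count_evens(numbers):           # abstraction: name and hide a behavior
    count = 0                       # variable: store a value for later use
    for n in numbers:               # repetition: visit every item
        if n % 2 == 0:              # conditional: choose based on an observation
            count = count + 1
    return count


result = count_evens([1, 2, 3, 4])  # application: invoke the abstraction
print(result)                       # 2
```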

In “How to Design Programs”, Felleisen et al. define a “systematic program design” approach as the following six steps. When you’re working alone, these can guide you toward a solution. When you’re working with other agents—prompting large language models (LLMs) or asking someone for guidance—these can communicate where your thoughts are and how you organize ideas.

The Function Design Recipe

The “How to Design Programs” systematic design steps:²

From Problem Analysis to Data Definitions. Identify the information that must be represented and how it is represented in the chosen programming language. Formulate data definitions and illustrate them with examples.

Signature, Purpose Statement, Header. State what kind of data the desired function consumes and produces. Formulate a concise answer to the question what the function computes. Define a stub that lives up to the signature.

Functional Examples. Work through examples that illustrate the function’s purpose.

Function Template. Translate the data definitions into an outline of the function.

Function Definition. Fill in the gaps in the function template. Exploit the purpose statement and the examples.

Testing. Articulate the examples as tests and ensure the function passes all. Doing so discovers mistakes. Tests also supplement examples in that they help others read and understand the definition when the need arises—and it will arise for any serious problems.
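One way the six steps might look in Python, as a minimal sketch. The temperature-conversion problem and the name `celsius_to_fahrenheit` are assumptions chosen for illustration, not an example from the book:

```python
# 1. Data definition: a temperature is a float, in degrees Celsius or Fahrenheit.

# 2. Signature, purpose statement, header.
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    # 3. Functional examples: 0 -> 32, 100 -> 212, -40 -> -40
    # 4-5. Template and definition: one number in, one arithmetic expression out.
    return celsius * 9 / 5 + 32


# 6. Testing: articulate the examples as tests.
assert celsius_to_fahrenheit(0) == 32
assert celsius_to_fahrenheit(100) == 212
assert celsius_to_fahrenheit(-40) == -40
```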

Starting and Stopping Python

In a terminal, we can start a Python REPL by running python3:

python3

The version numbers, dates, and platform information will look slightly different on different machines. But in general: the universal sign of a Python REPL is the triple greater-than signs: >>>

$ python3
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

REPL is an acronym for “Read-Eval-Print-Loop.” A REPL can be a helpful location for testing out our ideas, because its four steps give us instant feedback on everything we do:

Read: Read an input expression from the user
Eval: Evaluate the expression
Print: Print the result of evaluating the expression, or show nothing
Loop: Jump back to Read

When you are finished, calling exit() quits the Python REPL and returns you to your shell.

$ python3
>>> exit()
$

Primitive Types

A type or data type is a noun: types are the things or objects that we talk about in a language. A primitive type is the lowest level in a type hierarchy: it cannot be broken down into smaller units.³

Type	Examples
`int`	`-10` `5` `0` `300`
`float`	`0.1` `0.2` `-10.5` `1e5` `1e-3`
`bool`	`True` `False`
`str`	`"0"` `"5"` `"xyz"` `'hello'`
`NoneType`	`None`

There is another word you may encounter at this level: the object. We will use the words type and object interchangeably. This is because defining a new object is really defining a new type of data: a data modeling problem.

Type Casting

Type casting happens when we convert something from one type to another.

Sometimes this change is lossless when there is a one-to-one relationship between the data types:

>>> int(False)
0
>>> int(True)
1

>>> int("123")
123
>>> str(123)
'123'

Other times changing the data type is lossy. Information about the underlying data is lost when we convert from one representation to another:

>>> int(2.5)
2
>>> float(2)
2.0

>>> float(int(2.5)) == 2.5
False

>>> float(str(2.5)) == 2.5
True
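Not every cast succeeds: `int("abc")` raises a `ValueError` because the string does not describe a number. A minimal sketch of guarding against that, where the helper name `to_int_or_none` is hypothetical:

```python
def to_int_or_none(text):
    """Cast text to an int, or return None if it does not parse as one."""
    try:
        return int(text)
    except ValueError:
        return None


print(to_int_or_none("123"))    # 123
print(to_int_or_none("abc"))    # None
print(to_int_or_none("2.5"))    # None: int() does not parse decimal strings
```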

Truthiness

Truthiness is the idea that some values are treated as True and others as False wherever a condition is checked. A value’s truthiness can be checked by casting it to a bool:

>>> bool(0)
False
>>> bool(1)
True

As a rule: Falsey values correspond with emptiness, nothingness, or zero-ness.

>>> bool(0)
False
>>> bool(None)
False
>>> bool("")        # the empty string is `False`
False
>>> bool([])        # the empty list is `False`
False

Everything that is not falsy is truthy. Truthy values therefore correspond with full-ness, something-ness, or existence. For example, every non-zero number is truthy:

>>> [i for i in range(-3, 3)]
[  -3,   -2,   -1,     0,    1,    2]

>>> [bool(i) for i in range(-3, 3)]
[True, True, True, False, True, True]
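Truthiness is most often used implicitly inside a condition rather than through an explicit `bool()` call. A short sketch, with a made-up function name `describe`:

```python
def describe(items):
    # An empty list is falsy, so `if items:` reads as "if there is anything in items".
    if items:
        return f"{len(items)} item(s)"
    return "nothing here"


print(describe([]))             # nothing here
print(describe(["a", "b"]))     # 2 item(s)
```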

Identifiers, Variables, and Names

A variable binds an identifier to a value through assignment with the equal sign =:

>>> x = 1
>>> x
1

Variables vary in that re-assigning an identifier to a new value changes its value:

>>> x = 1       # assign `x` to `1`
>>> x = 2       # re-assign `x` to contain `2`
>>> x
2

An identifier is a combination of letters, digits, and underscores:

>>> x1 = -1
>>> x2 = "a"

Identifiers must start with a letter or an underscore, and there are many symbols that the language does not accept as valid parts of an identifier.

>>> 📦 = 1          # SyntaxError
>>> $ = 1           # SyntaxError
>>> 1c = 1          # SyntaxError: starts with a number
>>> one! = 1        # SyntaxError

Some identifiers are reserved by the language. This barrier prevents potentially dangerous side effects, like changing the meanings of True and False.

The full list of Python’s reserved keywords is maintained in Python’s lexical analysis documentation.

False      await      else       import     pass
None       break      except     in         raise
True       class      finally    is         return
and        continue   for        lambda     try
as         def        from       nonlocal   while
assert     del        global     not        with
async      elif       if         or         yield

Finally, a defined identifier is given a special title, a name. Trying to invoke a name that does not exist is therefore a NameError:

>>> v
NameError: name 'v' is not defined

Expressions, Math, and Operators

Wikipedia phrases an expression as “A syntactic entity in a programming language that may be evaluated to determine its value.”⁴ Translating from Wikipediese, we have two things: a syntactic entity, and evaluation. A syntactic entity for our purposes means “a valid piece of Python code”.

The simplest expressions are the primitive types, and the simplest rule of evaluation is that every primitive type evaluates to itself:

>>> 0
0
>>> 1
1
>>> 'foo'
'foo'
>>> True
True
>>> None            # the REPL shows nothing when the result is None
>>> print(None)
None

More interesting expressions involve combining primitive types with operators and operands. By example: in the expression 0 + 4, the plus + symbol is an operator, while 0 and 4 are the operands in the expression.

>>> 0 + 4
4
>>> 0 - 4
-4

In concert, operators and operands answer the question: what action is being carried out, and what is it being carried out upon?

Understanding evaluation in full quickly devolves into trying to comprehend “how does Python actually work?” So the simple definition that we will stick with is that “evaluation is the 2nd step in REPL, where a piece of code turns into a result”.

Since operators (+, -) act upon types/objects/operands, we’ll extend our analogy to say that types are to nouns as operators are to verbs.

$$ \text{type} : \text{noun} :: \text{operator} : \text{verb} $$

This gives us the logical operators, math operators, and binary relations:

Symbol	Operator Name	Usage
`+`	addition	`(2 + 5) == 7`
`-`	subtraction	`(5 - 2) == 3`
`*`	multiplication	`(5 * 7) == 35`
`//`	floor division	`(36 // 7) == 5`
`%`	modulo (remainder)	`(10 % 9) == 1`
`**`	exponentiation	`(2 ** 3) == 8`
`/`	(float) division	`(6 / 4) == 1.5`
`and`	logical and	`True and True`
`or`	logical or	`True or False`
`==`	equal	`1 == 1`
`<`	less than	`2 < 3`
`>`	greater than	`3 > 2`
`<=`	less than or equal	`2 <= 3`
`>=`	greater than or equal	`3 >= 3`
`!=`	not equal	`2 != 3`

Expressions themselves may contain other expressions. Evaluation must therefore act on tree structures, which for math operators follows the PEMDAS rules (parentheses, exponentiation, multiplication, division, addition, subtraction). Or one may be precise and add parentheses to specify a particular order:

>>> (0 + 4) + (0 - 4)
0

graph TD
    A["+"]
    A-->B["+"]
    B-->C[0]
    B-->D[4]
    A-->E["-"]
    E-->F[0]
    E-->G[4]

versus the case without parentheses:

>>> 0 + 4 + 0 - 4
0

graph TD
    A["+"]
    A-->B[0]
    A-->C[4]
    D["+"]
    D-->A
    D-->E[0]
    F["-"]
    F-->D
    F-->G[4]
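A few expressions where the order of evaluation changes the result, as a quick sketch of the precedence rules:

```python
print(2 + 3 * 4)        # 14: multiplication binds tighter than addition
print((2 + 3) * 4)      # 20: parentheses are evaluated first
print(2 * 3 ** 2)       # 18: exponentiation binds tighter than multiplication
print(10 - 4 - 3)       # 3: same-level operators group left to right, (10 - 4) - 3
```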

Finally, evaluation is done with respect to an environment. In this context,⁵ an environment is the set of all valid names when evaluation happens. Therefore an environment is a kind of mapping between identifiers and their value, allowing us to express ideas which require storing data and retrieving it later.

>>> ZERO = 0
>>> ONE = 1
>>> ZERO + ONE
1

graph TD
    subgraph environment
    ZERO-->0
    ONE-->1
    end

    subgraph evaluate
    A["+"]
    A-->B[ZERO]
    A-->C[ONE]
    end

From Operators to Functions

Operators are implemented in Python using functions. So what is the difference between an operator and a function? In theory: nothing. In Python: how we use them. Peruse your keyboard: is there a symbol on it that represents the concept of maximum or minimum? There isn’t an agreed-upon standard, so the symbol for maximum is usually the word max.

>>> max(1, 3, 5, 2, 4, 7, 6)
7
>>> min(1, 3, 5, 2, 4, 7, 6)
1
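To see that operators really are functions underneath, the standard library's `operator` module exposes them under ordinary names. A brief sketch:

```python
import operator

print(operator.add(0, 4))   # 4, the same result as 0 + 4
print(operator.sub(0, 4))   # -4, the same result as 0 - 4
print((0).__add__(4))       # 4, the method that + calls on an int
```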

The Python language developers built common functions into the language for many of the routine operations that programmers need to accomplish. Types and control flow around types:

bool()
dict()
float()
hex()
int()
len()
list()
set()
str()
tuple()
type()
isinstance()

Logic and math functions:

all()
any()
abs()
hash()
max()
min()
pow()
round()
sum()

Debugging and input/output control:

breakpoint()
dir()
format()
help()
id()
input()
open()
print()

And finally, iteration controls and higher-order functions:

enumerate()
filter()
map()
next()
range()
reversed()
sorted()
zip()

A function is a verb: and verbs accomplish goals. A function takes some arguments and returns some outcome.

$$ \text{type} : \text{noun} :: \text{function} : \text{verb} $$

In Python and most other programming languages, subject-verb phrases must be written so as to be explicit about which verbs act on which nouns. “Unload the couch from the truck” is valid English, but we must be precise and express that we have a truck (noun), and we receive the couch (noun) when we unload (verb) the truck.

couch = unload(truck)

Defining New Functions

A function is created through definition with the def statement, and every function returns something when it completes. Since functions must always be explicit about the objects they act upon: zero or more objects are passed into the function, and one or more values are returned at the end.

def _____():                    # 0x1
    return _____


def _____(_____):               # 1x1
    return _____


def _____(_____, _____):        # 2x1
    return _____


def _____(_____, _____):        # 2x2
    return _____, _____
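As one possible filling-in of the 2x2 template, here is a sketch with a made-up function `min_and_max`. The two returned values travel together and can be unpacked into two names:

```python
def min_and_max(a, b):          # 2x2: two values in, two values out
    return min(a, b), max(a, b)


low, high = min_and_max(7, 3)   # unpack the returned pair into two names
print(low, high)                # 3 7
print(min_and_max(7, 3))        # (3, 7)
```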

Every function returns something. Functions that do not explicitly return something will return None:

>>> def does_nothing():
...     pass
...
>>> print(does_nothing())
None

Local and Global Scoping

Scoping rules govern the relationship between where a name gets defined and what that name means.

Names fall into one of two categories: local and global. For example, if we bind the value 1 to the name x in a global scope, then that variable will also be available from within a function:

x = 1

def returns_x():
    return x

print(returns_x())
# 1

But the inverse is not true. Functions are like Vegas: names defined in the function stay in the function.

def returns_1():
    v = 3
    return 1

print(v)
# NameError: name 'v' is not defined

Scoping rules in Python obey a specific set of behaviors called lexical scoping or lexical addressing. In the formal study of programming languages, one would learn the relationship between the context in which a name is defined and the context in which that name is evaluated. Puzzles for a niche audience: why does the following print 1?

y = 3

def y(x):
    def y(x):
        y = 1
        return y
    return y(x)

print(y(3))

The strategy we recommend is to minimize global state, and to treat any global values as immutable or constant data: data that are declared once and never modified. A convention is to name these variables in “screaming snake case”, where all letters are capitalized and words are separated by underscores. For example, a program that uses a comma-separated value (CSV) file might declare a global list of strings representing column names. This global state can then be used to enforce consistency when reading, writing, and performing error handling:

import csv

EXPECT_COLUMNS = ["id", "name", "phone"]


def inspect_csv(file_name: str) -> bool:
    """Does the first line of a .csv file have the correct header?"""
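    with open(file_name, newline="") as csvf:
        # next() with a default returns None for an empty file,
        # so the comparison below always yields a bool
        first = next(csv.reader(csvf), None)
        return first == EXPECT_COLUMNS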


def load_people(file_name: str) -> list | None:
    if not inspect_csv(file_name):
        return None

    with open(file_name) as csvf:
        return list(csv.DictReader(csvf))


if __name__ == "__main__":
    print(load_people("people3.csv"))

⚠️ Danger: Forcing Local or Global Behavior

Python reserves two keywords: nonlocal and global, which allow programmers to switch between local and global contexts on demand.

We mention this because you should avoid this. Consider the difference between this program, which raises a NameError since x is a local variable:

def foo(y):
    x = y
    return y

foo(3)
print(x)      # NameError: name 'x' is not defined

Contrast it with this, which will print 3:

def foo(y):
    global x
    x = y
    return y

foo(3)        # Calling `foo` changes the value of `x`
print(x)      # 3

Left unchecked, this is a slippery beetle into bugs. Programs that use global mutable state, such as assigning to global variables, become difficult to reason about as they grow. Instead: keep most variables scoped inside of functions, and minimize how much global data there is.
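A sketch of the alternative: pass state in as an argument and hand the new state back with `return`, so that nothing changes behind the caller's back. The name `add_to_total` is hypothetical:

```python
def add_to_total(total, amount):
    # No `global`: everything the function needs arrives as an argument,
    # and everything it produces leaves through `return`.
    return total + amount


total = 0
total = add_to_total(total, 3)
total = add_to_total(total, 4)
print(total)    # 7
```

Each change to `total` is now visible at the call site as an explicit assignment.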

Dynamic Typing and Function Polymorphism

Python is dynamically typed: every value in the language has a type, but a variable can be re-bound to a value of a different type at runtime. This also means that Python functions behave differently according to the data that are passed into them. A function like:

def sum_three(x, y, z):
    return x + y + z

… should have an obvious interpretation when x, y, and z are integers:

>>> sum_three(1, 2, 3)
6

But this interpretation may be less obvious when x, y, and z are strings:

>>> sum_three("a", "b", "c")
'abc'

… or lists:

>>> sum_three([1], [2], [3])
[1, 2, 3]

This is called polymorphism: operations like plus + behave differently depending on the data type. When we have two variables $x$ and $y$, and we know they contain numbers $(x, y) \in \mathbb{Z}^{2}$ then we call the + operator “addition”. If instead $x$ and $y$ are strings, then we call the + operator “concatenation”.

A polymorphic function is therefore a function that behaves differently depending on what gets passed into it. Often this is advantageous, but may also be a source of unexpected bugs. How might we be explicit about the data types that we expect our functions to work with?
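One such bug, as a sketch: mixing types in a single call. Python does not guess whether + should mean addition or concatenation, so it raises a `TypeError` at runtime:

```python
def sum_three(x, y, z):
    return x + y + z


try:
    sum_three(1, "b", 3)        # 1 + "b" has no defined meaning
except TypeError as err:
    print(f"TypeError: {err}")
```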

Functions with Type Annotations

When declaring a function, one may use the name of a variable, a colon :, and a type to declare the types of values that the function expects. This can make it more clear to ourselves, other programmers, or other entities how we expect parts of the program to behave.

def sum_three_nums(x: int, y: int, z: int) -> int:
    return x + y + z

def _____(_____: ___, ...) -> ___:
    return _____

Note that current versions of Python treat type annotations like guidelines. Other tools do exist to validate types through various approaches collectively called static analysis. One can declare and call functions that clearly violate the type signatures:

def bar(x: int) -> int:
    return x

print(bar("str, not int"))

But tools like mypy, or the Pylance language server’s typeCheckingMode setting in Visual Studio Code, treat type errors as actual errors:

$ mypy bad_typing.py
bad_typing.py:4: error: Argument 1 to "bar" has incompatible type "str"; expected "int"  [arg-type]
Found 1 error in 1 file (checked 1 source file)

Statements: Conjunction and Control Flow

So far we have types (nouns) and operators/functions (verbs), but the ideas we may express are limited without some way to link different clauses together.

Python, and many languages derived from C, follow a procedural programming view. In it: most core program behavior should be defined inside of types and functions that call and refer to each other, all mediated via control flow mechanisms. The English words if, for, and while connect clauses—but in Python these words affect our interpretation on how the types and functions relate to overall program behavior.⁶

$$ \begin{align} \text{type} &: \text{noun} \cr \text{function} &: \text{verb} \cr \text{statement} &: \text{conjunction} \end{align} $$

Python defines simple statements as statements that fit on a single logical line. The ones shown here take zero or one arguments:

Statement	Example	Result
`return`	`def foo(): return 1`	`foo() == 1`
`del`	`x = {0: 1}; del x[0]`	`x == {}`
`pass`	`def bar(): pass`	`bar() is None`
`continue`	`total = 0 for i in [3, 2, 1]: if i == 2: continue total += i`	`total == 4`
`break`	`total = 0 for i in [1, 2, 3]: if i == 1: break total += i`	`total == 0`
`import`	`import csv`	(csv module available)
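The `continue` and `break` rows in the table are squeezed onto one line. Spread across several lines they read as follows; the variable names `total_continue` and `total_break` are changed here only to keep the two loops apart:

```python
total_continue = 0
for i in [3, 2, 1]:
    if i == 2:
        continue            # skip the rest of this pass, move on to the next item
    total_continue += i
print(total_continue)       # 4, because 3 + 1

total_break = 0
for i in [1, 2, 3]:
    if i == 1:
        break               # leave the loop entirely
    total_break += i
print(total_break)          # 0, because the loop ended before adding anything
```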

Compound statements refer to everything else, and you can recognize one because they are always accompanied by a colon :.

def
if, elif, else
for
while
with
try, except, except*, else, finally

Data Structures and Collections

A data structure is a particular way to arrange a collection of objects such that efficient algorithms may be built on top of them. Algorithm design and analysis is an advanced topic in computer science that we will not cover. However: many smart people already did that work, and you can benefit from their knowledge.

The three fundamental data structures in Python are lists, tuples, and dictionaries. Many more exist, but the core language and all other data structures may be explained in terms of these three.

Lists are ordered sequences of items, represented with square brackets: [, ].

>>> lst1 = [0, 1, 2, 3, 4]
>>> lst2 = [4, 3, 2, 1, 0]

Dictionaries are unordered mappings that implement an association between a key and a value. These act like physical dictionaries where each word has a meaning: making each word a key and the meaning its value.

vocabulary = {
    "python": "a programming language",
    "list": "an ordered sequence",
    "dictionary": "an unordered mapping",
}

Tuples are ordered sequences of items. Unlike lists: they are immutable. Tuples are often mistaken as being represented by parentheses (, )—the reality is that the parentheses are convenient, but the comma , is all that one needs to represent a tuple:

red = 255, 0, 0
green = 0, 255, 0
blue = 0, 0, 255
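A short sketch of both claims: the comma is what makes the tuple, and a tuple cannot be modified once created. The names `also_red`, `single`, and `not_a_tuple` are made up for illustration:

```python
red = 255, 0, 0             # the commas make the tuple
also_red = (255, 0, 0)      # parentheses are optional here
print(red == also_red)      # True

single = (255,)             # one-element tuple: the trailing comma is required
not_a_tuple = (255)         # just the integer 255 in parentheses
print(type(single).__name__, type(not_a_tuple).__name__)    # tuple int

try:
    red[0] = 0              # tuples are immutable, so this raises TypeError
except TypeError as err:
    print(err)
```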

Understanding these three data structures gives enough mental scaffolding to understand most other data structures. For example, a set is an unordered collection which can answer whether an element is a member of the set or not. In other words: a set is like a dictionary that only has keys.⁷

>>> some_set = {"Alexander", "Erika"}
>>> like_a_set = {
...     "Alexander": 0,
...     "Erika": 0,
... }
>>> some_set == like_a_set.keys()
True

Dictionaries

Recall that dictionaries are collections of key, value pairs. We’ll usually recommend keeping dictionary types simple: such as mapping from strings to integers dict[str, int], or strings to strings dict[str, str]. Also recall that dictionary keys must be unique and immutable (e.g. str, int, tuple), but their values can be any data type: including other lists or other dictionaries.

Dictionary values are accessed via their key:

>>> fruit = {"apple": 1, "orange": 3, "pear": 2}
>>> fruit["pear"]
2

Attempting to access a key that doesn’t exist in the dictionary is a KeyError:

>>> fruit = {"apple": 1, "orange": 3, "pear": 2}
>>> fruit["kiwi"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'kiwi'

… unless one uses a dictionary’s .get method, which returns None to indicate absence, or returns a default value if one is provided:

>>> fruit = {"apple": 1, "orange": 3, "pear": 2}
>>> print(fruit.get("kiwi"))
None

>>> fruit.get("kiwi", 0)
0

Updating a (key, value) pair uses assignment = to assign a key to a new value:

>>> fruit = {"apple": 1, "orange": 3, "pear": 2}
>>> fruit["apple"] = 1000
>>> fruit
{'apple': 1000, 'orange': 3, 'pear': 2}

… or if one assigns a value to a key that does not exist, the pair will be added:

>>> fruit = {"apple": 1, "orange": 3, "pear": 2}
>>> fruit["tangerine"] = 75
>>> fruit
{'apple': 1, 'orange': 3, 'pear': 2, 'tangerine': 75}

Removing something from a dictionary may be done using the del keyword:

>>> fruit = {"apple": 1, "orange": 3, "pear": 2}
>>> del fruit["orange"]
>>> fruit
{'apple': 1, 'pear': 2}
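Access, default values, and assignment often work together. A common pattern, sketched here with a hypothetical function `count_words`, is counting how often each item appears:

```python
def count_words(text):
    counts = {}
    for word in text.split():
        # .get supplies 0 the first time a word is seen, avoiding a KeyError
        counts[word] = counts.get(word, 0) + 1
    return counts


print(count_words("pear apple pear"))   # {'pear': 2, 'apple': 1}
```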

Nested and Composite Data Structures

Continuing the vocabulary analogy, the Merriam-Webster English dictionary lists multiple meanings for some words. A dictionary whose values are lists can represent this:

mw = {
    "guardrail": [
        "a railing guarding usually against danger",
    ],
    "balustrade": [
        "a row of balusters topped by a rail",
        "a low parapet or barrier",
    ]
}
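Reaching into a nested structure means applying one lookup after another: first the key, then the position within the list. A minimal sketch using the same data:

```python
mw = {
    "guardrail": ["a railing guarding usually against danger"],
    "balustrade": [
        "a row of balusters topped by a rail",
        "a low parapet or barrier",
    ],
}

print(mw["balustrade"][1])      # a low parapet or barrier
print(len(mw["guardrail"]))     # 1

for word, meanings in mw.items():
    print(word, "has", len(meanings), "meaning(s)")
```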

Indexing, Selecting, Slicing, and Attributes

Selecting data out of a data structure is one of the most routine operations used across programming. Selecting data requires some definition of an index to exist: where in the data structure is the information that one needs? The exact nature of how indexing works is a topic for another time, but the three most common flavors to be aware of are

integer-based indexing: used by lists
key-based indexing: used in dictionaries
attribute-based indexing: used in everything else

Lists are indexed using integers. A list has some fixed number of items in it, and each item therefore must have an ordered position in the list. If one has a list of tasks that they want to accomplish:

tasks = ["write", "edit", "get feedback"]

One can visually inspect the code to see that the list contains three things. Python’s syntax for selecting data out of a list involves square brackets [] and the index position of the item in the list:

>>> tasks[0]
'write'
>>> tasks[1]
'edit'
>>> tasks[2]
'get feedback'

Dictionaries behave similarly, in the way one would look up a word in a physical or online dictionary: each item in the dictionary is a $(key, value)$ pair, so one may look up the value by looking up the key. For example, if you choose to represent the workouts you do on each day of the week as a dictionary, the keys could be the name of each weekday and the values could be the associated exercise for that day:

workout_routine = {
    "Monday": "Cardio",
    "Tuesday": "Core",
    "Wednesday": "Rest",
    "Thursday": "Leg Day",
    "Friday": "Upper Body",
}

Selecting the weekday from the dictionary will therefore result in the value for what should be done on that day:

>>> workout_routine["Wednesday"]
'Rest'
>>> workout_routine["Thursday"]
'Leg Day'

Integer or key-based indexing is sufficient to extract one item at a time, but what if we need to handle multiple items at a time? Imagine we’ve been keeping track of our heart rate, but we want to know what the average heart rate is over some period of time. If we measure our heart rate every minute for five minutes, then we’ll have a list of five heart rates:

heart_rates = [74, 77, 78, 77, 75]

Slicing represents extracting consecutive elements in a list—as if you have a Nerds Rope in front of you and you want to split the candy into three pieces, then imagine you have a knife and make a few cuts:

heart_rates = [74,  77,  78,  77,   75]
              ---- --------------  ----
              [74] [77,  78,  77]  [75]

One can slice from $(0, 1)$ to get a list containing the first item in the list, or slice from $(1, 4)$ to get the middle three elements, or slice from $(4, 5)$ to get a list representing the last thing in the list.

>>> heart_rates[0:1]
[74]
>>> heart_rates[1:4]
[77, 78, 77]
>>> heart_rates[4:5]
[75]

The underlying object that accomplishes this in Python is the slice object, which has a start and a stop, and optionally a step: a stride through the list, so that slice(None, None, 2) selects every other element. For now, understanding the start and stop points in a list is more than sufficient.
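To make the connection concrete, a slice object can be built by hand and used inside the square brackets. This sketch shows that it selects the same items as the colon syntax:

```python
heart_rates = [74, 77, 78, 77, 75]

middle = slice(1, 4)
print(heart_rates[middle])                      # [77, 78, 77]
print(heart_rates[middle] == heart_rates[1:4])  # True

every_other = slice(None, None, 2)
print(heart_rates[every_other])                 # [74, 78, 75]
print(heart_rates[::2])                         # [74, 78, 75]
```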

Slicing can therefore be used as a way to represent concepts like the first two items:

>>> heart_rates[:2]
[74, 77]

… or the last two items:

>>> heart_rates[-2:]
[77, 75]

… or everything between the first and last element:

>>> heart_rates[1:-1]
[77, 78, 77]

To round this out: many data structures in Python are implemented in terms of objects, usually defined with a class. We mentioned earlier that we use the words type and object interchangeably. If we define a new type to represent some point in two-dimensional space:

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __repr__(self):
        return f"Point({self.x}, {self.y})"

… then we’ve defined a new noun in our language. From the __init__ definition (sometimes called a constructor or initializer), we can see that a Point has an x and y coordinate. The names x and y are available to anyone who uses this type, finally bringing us to attribute-based indexing. Attribute-based indexing looks similar to the key indexing we saw with dictionaries,⁸ but now the indexing is performed using a period or dot and the name of the attribute one intends to access.

For example, the origin in a coordinate system is $(0, 0)$. If we instantiate a variable named origin, then we may later access origin.x for the $x$-coordinate and origin.y for the $y$-coordinate:

>>> origin = Point(0, 0)
>>> origin.x
0
>>> origin.y
0

Even if you aren’t defining your own types, you might often be working with a type that is built into the language, and therefore may need to know how to look up the value of an attribute defined on that type. Remember those slices we just mentioned? The start and stop values are available as attributes after initializing a slice:

>>> slc = slice(0, 3)
>>> slc.start
0
>>> slc.stop
3

Or if you’re diving into how some of the built-in types actually work, you might find out that every integer also has some attributes defined on it: a numerator and denominator:

>>> x = 7
>>> x.numerator
7
>>> x.denominator
1

To review: indexing is how Python represents where something is, and indexing comes in three varieties (integers, keys, and attributes). The three approaches are mixed and matched in order to select data out of composite data structures by following the access logic. If one represents a triangle as a list of three points, then one can access the $x$-coordinate of the first point with the integer [0], then with the attribute .x.

>>> triangle = [Point(0, 1), Point(3, 1), Point(5, 4)]
>>> triangle[0].x
0
>>> triangle[1].x
3
>>> triangle[2].x
5

Iterables and Ordering

Iteration is a single step in a sequence—progressing toward completion. One may iterate on their current draft in order to make it better. We already mentioned loops (while, for) and said that there were built-in Python functions related to iteration:

for i in range(3):
    print(i)
# 0
# 1
# 2

An iterable is therefore any type, data structure, or object which may be iterated with a loop. Many objects which can be thought of as an ordered collection of smaller objects—like strings, lists, or tuples—are also iterable. For example, we might iterate over a list of words (strings), then iterate over each letter in each word:

words = ["foo", "bar", "baz"]

for word in words:
    for letter in word:
        print(letter, end=" ")
# f o o b a r b a z

However, one should be mindful that there do exist things which are not ordered, but are iterable. We said earlier that sets and dictionaries are unordered collections of objects. Despite not having an obvious ordering, both data structures may be iterated over with a loop.

The important point to keep in mind is that the order that one may expect may not be the one that Python uses. In the workout dictionary, the English names “Monday” through “Friday” may have some semantic meaning when a person reads them:

workout_routine = {
    "Monday": "Cardio",
    "Tuesday": "Core",
    "Wednesday": "Rest",
    "Thursday": "Leg Day",
    "Friday": "Upper Body",
}

But will Python iterate through the keys in that order?

>>> for day in workout_routine:
...     print(day)
Monday
Tuesday
Wednesday
Thursday
Friday

In this case: yes 😉 Since Python 3.7, the language guarantees that dictionaries iterate in insertion order, even though mappings are otherwise thought of as “unordered”. Sets make no such guarantee. Since workout_routine was initially defined with "Monday" at the beginning and "Friday" at the end, that order is preserved when we iterate later.

This means that if we wanted to write a program to assign a random exercise goal to each day of the week, we might preserve the weekday order by preserving the order of keys going into the dictionary:

from random import shuffle

weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
workouts = ["Cardio", "Core", "Rest", "Leg Day", "Upper Body"]

shuffle(workouts)

this_week = dict(zip(weekdays, workouts))

for day, exercise in this_week.items():
    print(f"{day} -- {exercise}")
# Monday -- Leg Day     (random outputs)
# Tuesday -- Cardio
# Wednesday -- Core
# Thursday -- Rest
# Friday -- Upper Body

Functions as Tuples or Dictionaries

Knowing about tuples and dictionaries provides one more way to think about functions. So far we’ve treated functions as a name (like foo) accompanied by an ordered set of arguments.

def foo(x, y):
    return x + y

An ordered, immutable set of arguments is equivalent to how we defined tuples.

>>> args = (3, 5)
>>> foo(*args)
8

Similarly, keyword arguments resemble how we thought about dictionaries: each argument name is a key, and the argument is its value. When defining functions, we can define keyword arguments that take on default values when the caller omits them:

def bar(x, y, base=0):
    return x + y + base

def baz(x, y, debug=False):
    if debug:
        print(x, y)
    return x + y
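Just as `*args` unpacks a tuple into positional arguments, `**` unpacks a dictionary into keyword arguments. A brief sketch using `bar` from above; the dictionary name `kwargs` is only a convention:

```python
def bar(x, y, base=0):
    return x + y + base


print(bar(1, 2))                # 3: base falls back to its default of 0
print(bar(1, 2, base=10))       # 13: base is passed by keyword

kwargs = {"x": 1, "y": 2, "base": 10}
print(bar(**kwargs))            # 13: the dictionary is unpacked into keyword arguments
```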

Methods

Methods are special kinds of verbs: reflexive verbs. Reflexive verbs happen in English when an agent does something to itself. For example:

One can “self-describe” - only you can self-describe you
You can “self-evaluate” - but no person can self-evaluate you
One can “perjure” - but one cannot perjure someone else

$$ \text{type} : \text{noun} :: \text{method} : \text{reflexive verb} $$

A method is a function defined on a type. We can access methods with dot notation: a call of the form value.method() asks that value to do something.

Methods that take an argument often modify the underlying object in some way, such as appending something to a list.

>>> lst = []
>>> lst.append(3)     # "append 3 to yourself"
>>> lst
[3]

A method can also answer a question about the underlying data. What keys are in a dictionary? We can check by querying .keys():

>>> dct = {0: 1, 1: 2}
>>> dct.keys()
dict_keys([0, 1])
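
Because a method is a function defined on a type, the same function can also be looked up on the type itself and handed the value explicitly. This is rarely written in practice; the sketch below is only meant to show what the dot notation is doing:

```python
lst = []
lst.append(3)        # dot notation on the value
list.append(lst, 4)  # same function, looked up on the type

dct = {0: 1, 1: 2}
keys_via_value = list(dct.keys())
keys_via_type = list(dict.keys(dct))

print(lst)                              # [3, 4]
print(keys_via_value == keys_via_type)  # True
```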

Modules

Names are also assigned on a per-module (that is, per-file) basis. If one has two Python scripts, printer.py and writer.py, then neither can use a function from the other without first importing it.

# writer.py
def make_title_case(title: str) -> str:
    """Convert a space-separated string to titlecase.

    Unlike `.title()`, this leaves everything after the first character of
    each word unchanged (so "1st" stays "1st" rather than becoming "1St").
    """
    title_words = []
    for word in title.split():
        title_words.append(word[0].upper() + word[1:])
    return " ".join(title_words)

# printer.py
from writer import make_title_case

print(make_title_case("autobiography of mark twain"))
# Autobiography Of Mark Twain
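
The same rule can be tried in a single file by using a standard-library module in place of writer.py. This sketch assumes nothing beyond the built-in `math` module: the name `sqrt` is unknown until it is imported, and there are two common ways to bring it in.

```python
# Before importing, the name sqrt is not defined in this module.
try:
    sqrt(16)
    defined_before_import = True
except NameError:
    defined_before_import = False

# Option 1: import the module and reach the function through its name.
import math
via_module = math.sqrt(16)

# Option 2: import just the function into this module's namespace.
from math import sqrt
via_name = sqrt(16)

print(defined_before_import, via_module, via_name)  # False 4.0 4.0
```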

Data Representation

Let’s wrap up this cheat sheet by talking about design choices. As programmers, software engineers, or developers, we’re often making decisions about how we write our code in order to best maintain the software over time, or to meet some external criteria (readability, scalability, testability, reliability, and a whole Scrabble board of words ending in -ility). The code we write, and the data that the code operates on, are therefore subject to decisions about how everything in our representation of the universe should work.

Let’s use color as an example.

Colors in HTML and CSS are red/green/blue (RGB) triples defined with three integers between 0 and 255. This true color, or 24-bit color depth, is used on just about every mainstream computer display and is capable of rendering 16,777,216 colors.
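
The 16,777,216 figure follows directly from the three-channel layout. A quick check of the arithmetic (the variable names are only for illustration):

```python
levels_per_channel = 256   # integers 0 through 255
channels = 3               # red, green, blue
bits_per_channel = 8       # 2 ** 8 == 256

total_colors = levels_per_channel ** channels
total_bits = bits_per_channel * channels

print(total_colors)                     # 16777216
print(total_bits)                       # 24
print(total_colors == 2 ** total_bits)  # True
```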

But let’s stick with five colors that we may believe are sufficient for a problem we’re working on. Should we store colors like this:

colors = {
    "white": "ffffff",
    "black": "000000",
    "red":   "ff0000",
    "green": "00ff00",
    "blue":  "0000ff",
}

or should we represent the colors like this?

colors = {
    "white": (255, 255, 255),
    "black": (0, 0, 0),
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
}

One could say “well, it depends” because it always does, but that advice is general to the point of being useless. A more interesting answer is that the two representations are actually the same: the tuple (0, 0, 0) and the string "000000" both represent the concept "black". (0, 0, 0) is more explicit about the view that true color is composed of three components R/G/B. The hexadecimal number "000000" might be less transparent about this fact at first, but this representation could be ideal for (1) readers who are already aware of the hexadecimal representation, or (2) end users or downstream programs which will eventually need an HTML-like hexadecimal number anyway.

Here’s the advice: don’t overthink these decisions, but don’t underthink them either. Software is meant to be soft—one may only figure out much later which decision was ultimately correct. Should one be paralyzed by indecision trying to reason through all possible decisions and the downstream effects of all possible decisions? No! Time is better spent designing a prototype and iterating on it as new feedback and new information comes in.

Since this color example shows two equivalent representations, there’s another option: store the data in one way, and if the other representation is needed at any point, convert between the two with a function:

def color_to_hex(r: int, g: int, b: int) -> str:
    return "".join((c).to_bytes(1, "big").hex() for c in (r, g, b))
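
Going the other direction is just as short. The function `hex_to_color` below is a hypothetical counterpart, not something defined elsewhere in this cheat sheet; it assumes a six-digit string with no leading `#` and uses built-in tuple type hints (Python 3.9+):

```python
def color_to_hex(r: int, g: int, b: int) -> str:
    return "".join((c).to_bytes(1, "big").hex() for c in (r, g, b))

def hex_to_color(hex_code: str) -> tuple[int, int, int]:
    """Hypothetical inverse of color_to_hex: "ff0000" -> (255, 0, 0)."""
    if len(hex_code) != 6:
        raise ValueError(f"expected 6 hex digits, got {hex_code!r}")
    # Each channel is two hex digits; int(..., 16) parses base 16.
    r, g, b = (int(hex_code[i:i + 2], 16) for i in (0, 2, 4))
    return r, g, b

print(hex_to_color("00ff00"))                 # (0, 255, 0)
print(color_to_hex(*hex_to_color("0000ff")))  # 0000ff
```

A round trip through both functions returns the original value, which is one way to confirm that the two representations carry the same information.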

One could even define a new data type representing the TrueColor, and define methods and properties on this type to build these behaviors into the representation:

class TrueColor:
    def __init__(self, r: int, g: int, b: int):
        self.r, self.g, self.b = r, g, b

    @property
    def hex(self) -> str:
        return "".join((c).to_bytes(1, "big").hex() for c in (self.r, self.g, self.b))

    def __repr__(self):
        return f"TrueColor({self.r}, {self.g}, {self.b})"

But herein lies the chief tension: building new levels of abstraction comes with an intellectual cost. When something needs to change in the future (and tech changes quickly: it will need to change, and the due dates may approach rapidly), one may need to traverse mountains of abstractions even to make what feel like simple changes.

So as parting advice: aim for a kind of minimalism in the code you write. Flexibility and the ease with which one can read, understand, and modify code should be its own reward.⁹

Footnotes

¹

These five follow from a procedural approach to programming and programming languages. Other paradigms exist which may appear to bend these rules, such as structured query language (SQL), which is an instance of a declarative language. A lambda calculus approach to studying languages would tell you that all computation can actually be done with three rules: definition, abstraction, and application. The astute reader may wonder where concepts like conditionals and repetition went. The answer is that those concepts can just as easily be defined in terms of abstraction and application.

²

From: Felleisen et al. 2014, “How to Design Programs”. Used under the terms of the Creative Commons CC BY-NC-ND license. Online: HTDP, Preface, Systematic Program Design

³

Lower levels do exist: in Python at least, every type has an associated class or metaclass. Furthermore, the float or int types in particular have binary representations, and at the lowest level computers are moving bits around. Awareness of these details (that technically there is something lower than the primitive types) makes for a more accurate representation. But we can be productive without this detail, whereas digging into this footnote further would quickly lead us down the path of: “but how exactly does Python work?” Our goal is to eventually build web applications; a theoretical study of programming languages and how they actually work is outside our scope.

⁴

https://en.wikipedia.org/wiki/Expression_(computer_science)

⁵

As we’ll see later when we talk about virtual environments, the word environment is overloaded in informatics, computing, and engineering. But even when the word is used differently, the texture is the same: an environment always represents a set of assumptions that get passed along with the code we write. The nature of the environment will grow more complex at higher levels of abstraction though: at a low level an environment represents all the valid variables, and at a high level the environment will refer to the state of entire computers or networks of computers working together.

⁶

But one may also elide conjunctions altogether using functions by making more λs.

⁷

In fact, sets in Python originally were dictionaries. It’s wasteful to store values that aren’t needed though, so after a few releases the Python developers optimized away the extraneous values.

⁸

The similar appearance of key indexing (foo['x']) and attribute access (foo.x) has a deeper reason: most objects in Python store their attributes in a dictionary, so an object can be thought of as what we call a “thin wrapper” around one. With a few steps, one could even define data structures that further blur the lines between objects and dictionaries by automatically making attributes available as keys and vice versa (for example, see the scikit-learn Bunch object: https://scikit-learn.org/stable/modules/generated/sklearn.utils.Bunch.html).

⁹

Chris Hanson and Gerald Jay Sussman, “Software Design for Flexibility: How to Avoid Programming Yourself into a Corner”. The MIT Press, 2021-03-09. ISBN 978-0-262-04549-0

An Introduction to Information Infrastructure II