A Crash Course in Python¶

The Basics¶

Whitespace Formatting¶

Many languages use curly braces to delimit blocks of code. Python uses indentation:

In [1]:

for i in [1, 2, 3, 4, 5]:
    print(i)
    for j in [1, 2, 3, 4, 5]:
        print(j)
        print(i + j)
    print(i)
print("done looping")

1
1
2
2
3
3
4
4
5
5
6
1
2
1
3
2
4
3
5
4
6
5
7
2
3
1
4
2
5
3
6
4
7
5
8
3
4
1
5
2
6
3
7
4
8
5
9
4
5
1
6
2
7
3
8
4
9
5
10
5
done looping

Whitespace is ignored inside parentheses and brackets, which can be helpful for long-winded computations:

In [2]:

long_winded_computation = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10
                           + 11 + 12 +
                           13 + 14 + 15 + 16 + 17 + 18 + 19 + 20)

It can also make code easier to read:

In [3]:

list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
easier_to_read_list_of_lists = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]

Use a backslash to indicate that a statement continues onto the next line

In [4]:

two_plus_three = 2 + \
                 3

Modules¶

Certain features of Python are not loaded by default. To use them, import the modules that contain them.

One approach is to import the module itself. Here re is the module containing functions and constants for working with regular expressions:

In [5]:

import re
my_regex = re.compile("[0-9]+", re.I)

You may also import a module under an alias:

In [6]:

import re as regex
my_regex = regex.compile("[0-9]+", regex.I)

In [7]:

import matplotlib.pyplot as plt

If you need only a few specific values from a module, you can import them explicitly and use them without qualification:

In [8]:

from collections import defaultdict, Counter
lookup = defaultdict(int)
my_counter = Counter()

You could import the entire contents of a module into your namespace, which might inadvertently overwrite variables you’ve already defined:

In [9]:

match = 10
from re import *  # uh oh, re has a match function
print(match)  # "<function re.match>"

<function match at 0x7fb5d73a8160>

Arithmetic¶

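Remember the quotient-remainder theorem: $$ n = d \cdot q + r $$

Here $ n $ is the dividend, $ d $ is the divisor, $ q $ is the quotient, and $ r $ is the remainder.
In Python, $ 0 \leq r < d $ when $ d $ is positive and $ d < r \leq 0 $ when $ d $ is negative, so the remainder takes the sign of the divisor.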

In [10]:

print(2 ** 10)  # 1024
print(2 ** 0.5)  # 1.414...
print(2 ** -0.5)  # 0.707...
print(5 / 2)  # 2.5
print(5 % 3)  # 2
print(5 // 3)  # 1
print((-5) % 3)  # 1
print((-5) // 3)  # -2
print(5 % (-3))  # -1
print((-5) // (-3))  # 1
print((-5) % (-3))  # -2
print(7.2 // 3.5)  # 2.0
print(7.2 % 3.5)  # roughly 0.2 (floating-point rounding)

1024
1.4142135623730951
0.7071067811865476
2.5
2
1
1
-2
-1
1
-2
2.0
0.20000000000000018
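
The identity $ n = d \cdot q + r $ can be checked directly with the built-in divmod, which returns the pair (n // d, n % d). The helper name check_quotient_remainder below is only an illustration, not part of the original notebook:

```python
def check_quotient_remainder(n, d):
    """Return (q, r) for n divided by d and verify that n == d * q + r."""
    q, r = divmod(n, d)  # same as (n // d, n % d)
    assert n == d * q + r
    # the remainder takes the sign of the divisor
    assert (0 <= r < d) if d > 0 else (d < r <= 0)
    return q, r

print(check_quotient_remainder(5, 3))    # (1, 2)
print(check_quotient_remainder(-5, 3))   # (-2, 1)
print(check_quotient_remainder(5, -3))   # (-2, -1)
print(check_quotient_remainder(-5, -3))  # (1, -2)
```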

Functions¶

A function is a rule for taking zero or more inputs and returning a corresponding output

In [11]:

def double(x):
    """this is where you put an optional docstring
    that explains what the function does.
    for example, this function multiplies its input by 2"""
    return x * 2
double(2)

Out[11]:

4

Python functions are first-class, which means that we can assign them to variables and pass them into functions just like any other arguments:

In [12]:

def apply_to_one(f):
    """calls the function f with 1 as its argument"""
    return f(1)

my_double = double
x = apply_to_one(my_double)

print(x)  # prints 2

It is also easy to create short anonymous functions, or lambdas:

In [13]:

y = apply_to_one(lambda x: x + 4)
print(y)  # prints 5

In [14]:

another_double = lambda x: 2 * x
def another_double(x): return 2 * x  # more readable

Function parameters can also be given default arguments

In [15]:

def my_print(message="my default message"):
    print(message)
    
my_print("hello") # prints 'hello'
my_print() # prints 'my default message'

hello
my default message

In [16]:

def subtract(a=0, b=0):
    return a - b

subtract(10, 5) # returns 5
subtract(0, 5) # returns -5
subtract(b=5) # same as previous

Out[16]:

-5

Strings¶

Strings can be delimited by single or double quotation marks

In [17]:

single_quoted_string = 'data science'
double_quoted_string = "data science"

In [18]:

tab_string = "\t"  # represents the tab character
len(tab_string)  # is 1

Out[18]:

1

You can create multiline strings using triple double quotes:

In [19]:

multi_line_string = """This is the first line.
and this is the second line
and this is the third line"""

Lists¶

The list is probably the most fundamental data structure in Python:

In [20]:

integer_list = [1, 2, 3]
heterogeneous_list = ["string", 0.1, True]
list_of_lists = [ integer_list, heterogeneous_list, [] ]

list_length = len(integer_list) # equals 3
list_sum = sum(integer_list) # equals 6

You can get or set the nth element of a list with square brackets

In [21]:

x = list(range(10))  # is the list [0, 1, ..., 9]
zero = x[0]  # equals 0, lists are 0-indexed
one = x[1]  # equals 1
nine = x[-1]  # equals 9, 'Pythonic' for last element
eight = x[-2]  # equals 8, 'Pythonic' for next-to-last element
x[0] = -1  # now x is [-1, 1, 2, 3, ..., 9]

You can also use square brackets to “slice” lists:

In [22]:

first_three = x[:3]  # [-1, 1, 2]
three_to_end = x[3:]  # [3, 4, ..., 9]
one_to_four = x[1:5]  # [1, 2, 3, 4]
last_three = x[-3:]  # [7, 8, 9]
without_first_and_last = x[1:-1]  # [1, 2, ..., 8]
copy_of_x = x[:]  # [-1, 1, 2, ..., 9]

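Python has an in operator to check for list membership. It examines the elements one at a time, so use it on a list only when the list is small or the running time does not matter: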

In [23]:

1 in [1, 2, 3] # True
0 in [1, 2, 3] # False

Out[23]:

False

To concatenate lists together:

In [24]:

x = [1, 2, 3]
x.extend([4, 5, 6])  # x is now [1,2,3,4,5,6]

In [25]:

x = [1, 2, 3]
y = x + [4, 5, 6]  # y is [1, 2, 3, 4, 5, 6]; x is unchanged

To append to lists one item at a time:

In [26]:

x = [1, 2, 3]
x.append(0) # x is now [1, 2, 3, 0]
y = x[-1] # equals 0
z = len(x) # equals 4

It is convenient to unpack lists:

In [27]:

x, y = [1, 2] # now x is 1, y is 2

In [28]:

_, y = [1, 2] # now y == 2, didn't care about the first element
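
Unpacking only works when the number of names matches the number of elements; otherwise Python raises a ValueError. The sketch below shows that failure and, as an addition not covered in the original notebook, the starred form that collects any leftover elements:

```python
try:
    x, y = [1, 2, 3]  # three values but only two names
except ValueError as error:
    message = str(error)
    print("cannot unpack:", message)

first, *rest = [1, 2, 3]  # the starred name collects the leftovers
print(first, rest)  # 1 [2, 3]
```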

Tuples¶

Tuples are lists’ immutable cousins.

In [29]:

my_list = [1, 2]
my_tuple = (1, 2)
other_tuple = 3, 4
my_list[1] = 3  # my_list is now [1, 3]

try:
    my_tuple[1] = 3
except TypeError:
    print("cannot modify a tuple")

cannot modify a tuple

Tuples are a convenient way to return multiple values from functions:

In [30]:

def sum_and_product(x, y):
    return (x + y),(x * y)

sp = sum_and_product(2, 3)  # equals (5, 6)
s, p = sum_and_product(5, 10)  # s is 15, p is 50

Tuples (and lists) can also be used for multiple assignment:

In [31]:

x, y = 1, 2  # now x is 1, y is 2
x, y = y, x  # Pythonic way to swap variables; now x is 2, y is 1

Dictionaries¶

Another fundamental data structure is the dictionary, which associates values with keys.
It allows you to quickly retrieve the value corresponding to a given key:

In [32]:

empty_dict = {}  # Pythonic
empty_dict2 = dict()  # less Pythonic
grades = { "Joel" : 80, "Tim" : 95 }  # dictionary literal

You can look up the value for a key using square brackets:

In [33]:

joels_grade = grades["Joel"]  # equals 80

You will get a KeyError if you ask for a key that’s not in the dictionary:

In [34]:

try:
    kates_grade = grades["Kate"]
except KeyError:
    print("no grade for Kate!")

no grade for Kate!

You can check for the existence of a key using in :

In [35]:

joel_has_grade = "Joel" in grades  # True
kate_has_grade = "Kate" in grades  # False

Dictionaries have a get method that returns a default value (instead of raising an exception) when you look up a key that’s not in the dictionary:

In [36]:

joels_grade = grades.get("Joel", 0)  # equals 80
kates_grade = grades.get("Kate", 0)  # equals 0
no_ones_grade = grades.get("No One")  # default default is None

You assign key-value pairs using the same square brackets:

In [37]:

grades["Tim"] = 99  # replaces the old value
grades["Kate"] = 100   # adds a third entry
num_students = len(grades)  # equals 3

In [38]:

tweet = {
    "user" : "joelgrus",
    "text" : "Data Science is Awesome",
    "retweet_count" : 100,
    "hashtags" : ["#data", "#science", "#datascience", "#awesome", "#yolo"]
}

Besides looking up specific keys, we can look at all of them:

In [39]:

tweet_keys = tweet.keys()  # view of the keys
tweet_values = tweet.values()  # view of the values
tweet_items = tweet.items()  # view of the (key, value) tuples

"user" in tweet_keys  # True; in Python 3 the keys view supports fast lookup
"user" in tweet  # more Pythonic, uses the fast dict in
"joelgrus" in tweet_values  # True, but slow: has to scan every value

Out[39]:

True
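
In Python 3, keys(), values(), and items() return view objects rather than lists. A minimal sketch of what that means in practice, using a smaller dictionary named sample_tweet that is made up for this illustration:

```python
sample_tweet = {"user": "joelgrus", "retweet_count": 100}

keys_view = sample_tweet.keys()
print(type(keys_view).__name__)  # dict_keys

# a view reflects later changes to the dictionary
sample_tweet["text"] = "Data Science is Awesome"
print("text" in keys_view)  # True

# key views support set operations
print(keys_view & {"user", "missing"})  # {'user'}

# take a snapshot when you need a real list
keys_snapshot = list(sample_tweet)
print(keys_snapshot)  # ['user', 'retweet_count', 'text']
```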

WordCount Example: Create a dictionary in which the keys are words and the values are counts.

In [40]:

document = ['I', 'am', 'a', 'boy', 'I', 'love', 'you']

First Approach:

In [41]:

word_counts = {}
for word in document:
    if word in word_counts:
        word_counts[word] += 1
    else:
        word_counts[word] = 1

Second Approach:

In [42]:

word_counts = {}
for word in document:
    try:
        word_counts[word] += 1
    except KeyError:
        word_counts[word] = 1

Third Approach:

In [43]:

word_counts = {}
for word in document:
    previous_count = word_counts.get(word, 0)
    word_counts[word] = previous_count + 1

In [44]:

word_counts

Out[44]:

{'I': 2, 'am': 1, 'a': 1, 'boy': 1, 'love': 1, 'you': 1}

defaultdict¶

A defaultdict is like a regular dictionary, except that when you try to look up a key it doesn’t contain, it first adds a value for it using a zero-argument function you provided when you created it.

In [45]:

from collections import defaultdict

word_counts = defaultdict(int)  # int() produces 0
# word_counts = defaultdict(lambda: 100)  # returns 100
for word in document:
    word_counts[word] += 1
    
print(word_counts)

defaultdict(<class 'int'>, {'I': 2, 'am': 1, 'a': 1, 'boy': 1, 'love': 1, 'you': 1})

In [46]:

int()

Out[46]:

0

In [47]:

dd_list = defaultdict(list)  # list() produces an empty list
dd_list[2].append(1)  # now dd_list contains {2:[1]}

dd_dict = defaultdict(dict)  # dict() produces an empty dict
dd_dict["Joel"]["City"] = "Seattle"  # { "Joel" : { "City" : "Seattle" } }
dd_pair = defaultdict(lambda: [0, 0])
dd_pair[2][1] = 1  # now dd_pair contains {2:[0,1]}
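
A common use of defaultdict(list) is grouping. The sketch below groups words by their first letter and also shows that merely looking up a missing key inserts it; the word list and variable names are invented for the example:

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

words_by_letter = defaultdict(list)
for word in words:
    words_by_letter[word[0]].append(word)  # missing keys start as []

print(dict(words_by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# a lookup alone is enough to add the key
print("z" in words_by_letter)  # False
words_by_letter["z"]
print("z" in words_by_letter)  # True
```

If you want to test for a key without adding it, use in or get instead of square brackets.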

Counter¶

A Counter turns a sequence of values into a defaultdict(int)-like object mapping keys to counts.
We will primarily use it to create histograms

In [48]:

from collections import Counter
c = Counter([0, 1, 2, 0])  # c is (basically) { 0 : 2, 1 : 1, 2 : 1 }

In [49]:

word_counts = Counter(document)

Use the help function to see the documentation for an object:

In [50]:

help(word_counts)

Help on Counter in module collections object:

class Counter(builtins.dict)
 |  Counter(iterable=None, /, **kwds)
 |  
 |  Dict subclass for counting hashable items.  Sometimes called a bag
 |  or multiset.  Elements are stored as dictionary keys and their counts
 |  are stored as dictionary values.
 |  
 |  >>> c = Counter('abcdeabcdabcaba')  # count elements from a string
 |  
 |  >>> c.most_common(3)                # three most common elements
 |  [('a', 5), ('b', 4), ('c', 3)]
 |  >>> sorted(c)                       # list all unique elements
 |  ['a', 'b', 'c', 'd', 'e']
 |  >>> ''.join(sorted(c.elements()))   # list elements with repetitions
 |  'aaaaabbbbcccdde'
 |  >>> sum(c.values())                 # total of all counts
 |  15
 |  
 |  >>> c['a']                          # count of letter 'a'
 |  5
 |  >>> for elem in 'shazam':           # update counts from an iterable
 |  ...     c[elem] += 1                # by adding 1 to each element's count
 |  >>> c['a']                          # now there are seven 'a'
 |  7
 |  >>> del c['b']                      # remove all 'b'
 |  >>> c['b']                          # now there are zero 'b'
 |  0
 |  
 |  >>> d = Counter('simsalabim')       # make another counter
 |  >>> c.update(d)                     # add in the second counter
 |  >>> c['a']                          # now there are nine 'a'
 |  9
 |  
 |  >>> c.clear()                       # empty the counter
 |  >>> c
 |  Counter()
 |  
 |  Note:  If a count is set to zero or reduced to zero, it will remain
 |  in the counter until the entry is deleted or the counter is cleared:
 |  
 |  >>> c = Counter('aaabbc')
 |  >>> c['b'] -= 2                     # reduce the count of 'b' by two
 |  >>> c.most_common()                 # 'b' is still in, but its count is zero
 |  [('a', 3), ('c', 1), ('b', 0)]
 |  
 |  Method resolution order:
 |      Counter
 |      builtins.dict
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __add__(self, other)
 |      Add counts from two counters.
 |      
 |      >>> Counter('abbb') + Counter('bcc')
 |      Counter({'b': 4, 'c': 2, 'a': 1})
 |  
 |  __and__(self, other)
 |      Intersection is the minimum of corresponding counts.
 |      
 |      >>> Counter('abbb') & Counter('bcc')
 |      Counter({'b': 1})
 |  
 |  __delitem__(self, elem)
 |      Like dict.__delitem__() but does not raise KeyError for missing values.
 |  
 |  __iadd__(self, other)
 |      Inplace add from another counter, keeping only positive counts.
 |      
 |      >>> c = Counter('abbb')
 |      >>> c += Counter('bcc')
 |      >>> c
 |      Counter({'b': 4, 'c': 2, 'a': 1})
 |  
 |  __iand__(self, other)
 |      Inplace intersection is the minimum of corresponding counts.
 |      
 |      >>> c = Counter('abbb')
 |      >>> c &= Counter('bcc')
 |      >>> c
 |      Counter({'b': 1})
 |  
 |  __init__(self, iterable=None, /, **kwds)
 |      Create a new, empty Counter object.  And if given, count elements
 |      from an input iterable.  Or, initialize the count from another mapping
 |      of elements to their counts.
 |      
 |      >>> c = Counter()                           # a new, empty counter
 |      >>> c = Counter('gallahad')                 # a new counter from an iterable
 |      >>> c = Counter({'a': 4, 'b': 2})           # a new counter from a mapping
 |      >>> c = Counter(a=4, b=2)                   # a new counter from keyword args
 |  
 |  __ior__(self, other)
 |      Inplace union is the maximum of value from either counter.
 |      
 |      >>> c = Counter('abbb')
 |      >>> c |= Counter('bcc')
 |      >>> c
 |      Counter({'b': 3, 'c': 2, 'a': 1})
 |  
 |  __isub__(self, other)
 |      Inplace subtract counter, but keep only results with positive counts.
 |      
 |      >>> c = Counter('abbbc')
 |      >>> c -= Counter('bccd')
 |      >>> c
 |      Counter({'b': 2, 'a': 1})
 |  
 |  __missing__(self, key)
 |      The count of elements not in the Counter is zero.
 |  
 |  __neg__(self)
 |      Subtracts from an empty counter.  Strips positive and zero counts,
 |      and flips the sign on negative counts.
 |  
 |  __or__(self, other)
 |      Union is the maximum of value in either of the input counters.
 |      
 |      >>> Counter('abbb') | Counter('bcc')
 |      Counter({'b': 3, 'c': 2, 'a': 1})
 |  
 |  __pos__(self)
 |      Adds an empty counter, effectively stripping negative and zero counts
 |  
 |  __reduce__(self)
 |      Helper for pickle.
 |  
 |  __repr__(self)
 |      Return repr(self).
 |  
 |  __sub__(self, other)
 |      Subtract count, but keep only results with positive counts.
 |      
 |      >>> Counter('abbbc') - Counter('bccd')
 |      Counter({'b': 2, 'a': 1})
 |  
 |  copy(self)
 |      Return a shallow copy.
 |  
 |  elements(self)
 |      Iterator over elements repeating each as many times as its count.
 |      
 |      >>> c = Counter('ABCABC')
 |      >>> sorted(c.elements())
 |      ['A', 'A', 'B', 'B', 'C', 'C']
 |      
 |      # Knuth's example for prime factors of 1836:  2**2 * 3**3 * 17**1
 |      >>> prime_factors = Counter({2: 2, 3: 3, 17: 1})
 |      >>> product = 1
 |      >>> for factor in prime_factors.elements():     # loop over factors
 |      ...     product *= factor                       # and multiply them
 |      >>> product
 |      1836
 |      
 |      Note, if an element's count has been set to zero or is a negative
 |      number, elements() will ignore it.
 |  
 |  most_common(self, n=None)
 |      List the n most common elements and their counts from the most
 |      common to the least.  If n is None, then list all element counts.
 |      
 |      >>> Counter('abracadabra').most_common(3)
 |      [('a', 5), ('b', 2), ('r', 2)]
 |  
 |  subtract(self, iterable=None, /, **kwds)
 |      Like dict.update() but subtracts counts instead of replacing them.
 |      Counts can be reduced below zero.  Both the inputs and outputs are
 |      allowed to contain zero and negative counts.
 |      
 |      Source can be an iterable, a dictionary, or another Counter instance.
 |      
 |      >>> c = Counter('which')
 |      >>> c.subtract('witch')             # subtract elements from another iterable
 |      >>> c.subtract(Counter('watch'))    # subtract elements from another counter
 |      >>> c['h']                          # 2 in which, minus 1 in witch, minus 1 in watch
 |      0
 |      >>> c['w']                          # 1 in which, minus 1 in witch, minus 1 in watch
 |      -1
 |  
 |  update(self, iterable=None, /, **kwds)
 |      Like dict.update() but add counts instead of replacing them.
 |      
 |      Source can be an iterable, a dictionary, or another Counter instance.
 |      
 |      >>> c = Counter('which')
 |      >>> c.update('witch')           # add elements from another iterable
 |      >>> d = Counter('watch')
 |      >>> c.update(d)                 # add elements from another counter
 |      >>> c['h']                      # four 'h' in which, witch, and watch
 |      4
 |  
 |  ----------------------------------------------------------------------
 |  Class methods defined here:
 |  
 |  fromkeys(iterable, v=None) from builtins.type
 |      Create a new dictionary with keys from iterable and values set to value.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from builtins.dict:
 |  
 |  __contains__(self, key, /)
 |      True if the dictionary has the specified key, else False.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __le__(self, value, /)
 |      Return self<=value.
 |  
 |  __len__(self, /)
 |      Return len(self).
 |  
 |  __lt__(self, value, /)
 |      Return self<value.
 |  
 |  __ne__(self, value, /)
 |      Return self!=value.
 |  
 |  __reversed__(self, /)
 |      Return a reverse iterator over the dict keys.
 |  
 |  __ror__(self, value, /)
 |      Return value|self.
 |  
 |  __setitem__(self, key, value, /)
 |      Set self[key] to value.
 |  
 |  __sizeof__(...)
 |      D.__sizeof__() -> size of D in memory, in bytes
 |  
 |  clear(...)
 |      D.clear() -> None.  Remove all items from D.
 |  
 |  get(self, key, default=None, /)
 |      Return the value for key if key is in the dictionary, else default.
 |  
 |  items(...)
 |      D.items() -> a set-like object providing a view on D's items
 |  
 |  keys(...)
 |      D.keys() -> a set-like object providing a view on D's keys
 |  
 |  pop(...)
 |      D.pop(k[,d]) -> v, remove specified key and return the corresponding value.
 |      
 |      If key is not found, default is returned if given, otherwise KeyError is raised
 |  
 |  popitem(self, /)
 |      Remove and return a (key, value) pair as a 2-tuple.
 |      
 |      Pairs are returned in LIFO (last-in, first-out) order.
 |      Raises KeyError if the dict is empty.
 |  
 |  setdefault(self, key, default=None, /)
 |      Insert key with a value of default if key is not in the dictionary.
 |      
 |      Return the value for key if key is in the dictionary, else default.
 |  
 |  values(...)
 |      D.values() -> an object providing a view on D's values
 |  
 |  ----------------------------------------------------------------------
 |  Class methods inherited from builtins.dict:
 |  
 |  __class_getitem__(...) from builtins.type
 |      See PEP 585
 |  
 |  ----------------------------------------------------------------------
 |  Static methods inherited from builtins.dict:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes inherited from builtins.dict:
 |  
 |  __hash__ = None

In [51]:

# print the 10 most common words and their counts
for word, count in word_counts.most_common(10):
    print(word, count)

I 2
am 1
a 1
boy 1
love 1
you 1

Sets¶

Another data structure is set, which represents a collection of distinct elements:

In [52]:

s = set()
s.add(1) # s is now { 1 }
s.add(2) # s is now { 1, 2 }
s.add(2) # s is still { 1, 2 }
x = len(s) # equals 2
y = 2 in s # equals True
z = 3 in s # equals False

For a membership test, a set is more appropriate than a list, because
in is a very fast operation on sets:

In [53]:

hundreds_of_other_words = []
stopwords_list = ["a","an","at"] + hundreds_of_other_words + ["yet", "you"]

"zip" in stopwords_list  # False, but have to check every element

stopwords_set = set(stopwords_list)
"zip" in stopwords_set  # very fast to check

Out[53]:

False
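
To get a feel for the difference, the following sketch times the same failed lookup on a list and on a set using the standard timeit module. The collection size and repeat count are arbitrary choices, and the measured times will vary from machine to machine:

```python
import timeit

words_list = [str(i) for i in range(10_000)]
words_set = set(words_list)

# "zip" is absent, so the list lookup has to scan every element
list_seconds = timeit.timeit(lambda: "zip" in words_list, number=100)
set_seconds = timeit.timeit(lambda: "zip" in words_set, number=100)

print(f"list: {list_seconds:.6f}s  set: {set_seconds:.6f}s")
```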

To find the distinct items in a collection:

In [54]:

item_list = [1, 2, 3, 1, 2, 3]
num_items = len(item_list) # 6
item_set = set(item_list) # {1, 2, 3}
num_distinct_items = len(item_set) # 3
distinct_item_list = list(item_set) # [1, 2, 3]

Control Flow¶

if statement:

In [55]:

if 1 > 2:
    message = "if only 1 were greater than two..."
elif 1 > 3:
    message = "elif stands for 'else if'"
else:
    message = "when all else fails use else (if you want to)"

a ternary if-then-else on one line

In [56]:

parity = "even" if x % 2 == 0 else "odd"

while statement:

In [57]:

x = 0
while x < 10:
    print(x, "is less than 10")
    x += 1

0 is less than 10
1 is less than 10
2 is less than 10
3 is less than 10
4 is less than 10
5 is less than 10
6 is less than 10
7 is less than 10
8 is less than 10
9 is less than 10

for statement:

In [58]:

for x in range(10):
    print(x, "is less than 10")

0 is less than 10
1 is less than 10
2 is less than 10
3 is less than 10
4 is less than 10
5 is less than 10
6 is less than 10
7 is less than 10
8 is less than 10
9 is less than 10

continue and break statements:

In [59]:

for x in range(10):
    if x == 3:
        continue  # go immediately to the next iteration
    if x == 5:
        break  # quit the loop entirely
    print(x)  # prints 0, 1, 2, and 4 over the course of the loop

Truthiness¶

In [60]:

one_is_less_than_two = 1 < 2  # equals True
true_equals_false = True == False  # equals False

Python uses the value None to indicate a nonexistent value

In [61]:

x = None
print(x == None)  # prints True, but is not Pythonic
print(x is None)  # prints True, and is Pythonic

True
True

The following are all “Falsy”:

False
None
[] (an empty list)
{} (an empty dict)
""
set()
0
0.0

In [62]:

s = 'abc'
if s:
    first_char = s[0]
else:
    first_char = ""

In [63]:

first_char = s and s[0]  # A simpler way of doing the same

In [64]:

safe_x = x or 0  # if x is either a number or possibly None

Python has an all function, which takes a list and returns True precisely when every element is truthy, and
an any function, which returns True when at least one element is truthy:

In [65]:

all([True, 1, { 3 }])  # True
all([True, 1, {}])  # False, {} is falsy
any([True, 1, {}])  # True, True is truthy
all([])  # True, no falsy elements in the list
any([])  # False, no truthy elements in the list

Out[65]:

False
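
all and any accept any iterable, so they combine naturally with generator expressions to test a condition across a collection. A small sketch with a made-up list of scores:

```python
scores = [80, 95, 100]

print(all(score >= 60 for score in scores))   # True, every score passes
print(any(score == 100 for score in scores))  # True, one perfect score
print(any(score < 60 for score in scores))    # False, no failing score
```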

The Not-So-Basics¶

Sorting¶

In [66]:

x = [4,1,2,3]
y = sorted(x)  # is [1,2,3,4], x is unchanged
x.sort()  # now x is [1,2,3,4]

In [67]:

# sort the list by absolute value from largest to smallest
x = sorted([-4,1,-2,3], key=abs, reverse=True) # is [-4,3,-2,1]
# sort the words and counts from highest count to lowest
wc = sorted(word_counts.items(),
            key=lambda x: x[1], reverse=True)
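
The key function may also return a tuple, which sorts by the first component and breaks ties with the second. The sketch below orders words from highest count to lowest and alphabetically within equal counts; the counts dictionary is a made-up stand-in for word_counts:

```python
counts = {"I": 2, "am": 1, "a": 1, "boy": 1}

# negate the count so that larger counts come first
by_count_then_word = sorted(counts.items(),
                            key=lambda pair: (-pair[1], pair[0]))
print(by_count_then_word)  # [('I', 2), ('a', 1), ('am', 1), ('boy', 1)]
```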

List Comprehensions¶

Frequently, you’ll want to transform a list into another list by choosing only certain elements, by transforming elements, or both. The Pythonic way to do this is with list comprehensions.
Prefer a list comprehension whenever it keeps the code readable.

In [68]:

even_numbers = [x for x in range(5) if x % 2 == 0]  # [0, 2, 4]
squares = [x * x for x in range(5)]  # [0, 1, 4, 9, 16]
even_squares = [x * x for x in even_numbers]  # [0, 4, 16]

You can similarly turn lists into dictionaries or sets:

In [69]:

square_dict = { x : x * x for x in range(5) }  # { 0:0, 1:1, 2:4, 3:9, 4:16 }
square_set = { x * x for x in [1, -1] }  # { 1 }

If you don’t need the value from the list, it’s conventional to use an underscore as the variable.
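
A minimal illustration of the underscore convention, assuming the even_numbers list from the earlier cell:

```python
even_numbers = [0, 2, 4]

# the loop variable is never used, so it is named _
zeroes = [0 for _ in even_numbers]
print(zeroes)  # [0, 0, 0]
```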

A list comprehension can include multiple fors:

In [70]:

pairs = [(x, y) for x in range(10) for y in range(10)] # 100 pairs (0,0), (0,1), ... (9,8), (9,9)

Later fors can use the results of earlier ones:

In [71]:

increasing_pairs = [(x, y) for x in range(10) for y in range(x + 1, 10)]

Generators and Iterators¶

A problem with lists is that they can easily grow very big. list(range(1000000)) creates an actual list of 1 million elements (in Python 3, range by itself is already lazy). If you only need to deal with them one at a time, this can be a huge source of inefficiency (or of running out of memory). If you potentially only need the first few values, then calculating them all is a waste.
A generator is something that you can iterate over (for us, usually using for) but whose values are produced only as needed (lazily).
One way to create generators is with functions and the yield keyword:

In [72]:

def lazy_range(n):
    """a lazy version of range"""
    i = 0
    while i < n:
        yield i
        i += 1

In [73]:

# The following loop consumes the yielded values one at a time until none are left:
for i in lazy_range(10):
    print(i)

A second way to create generators is by using for comprehensions wrapped in parentheses:

In [74]:

lazy_evens_below_20 = (i for i in lazy_range(20) if i % 2 == 0)

In [75]:

lazy_evens_below_20

Out[75]:

<generator object <genexpr> at 0x7fb59c32f190>
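
Two consequences of laziness are worth seeing in code: a generator can describe an infinite sequence, and it can be iterated over only once. The sketch below uses itertools.islice from the standard library to take the first few values; the function name natural_numbers is an illustrative choice:

```python
from itertools import islice

def natural_numbers():
    """yields 1, 2, 3, ... without end"""
    n = 1
    while True:
        yield n
        n += 1

# only the five requested values are ever computed
first_five = list(islice(natural_numbers(), 5))
print(first_five)  # [1, 2, 3, 4, 5]

evens = (i for i in range(6) if i % 2 == 0)
first_pass = list(evens)
second_pass = list(evens)
print(first_pass)   # [0, 2, 4]
print(second_pass)  # [], the generator is already exhausted
```

If you need to iterate over the values more than once, either recreate the generator each time or store the values in a list.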

Randomness¶

To generate random numbers, we can use the random module.
random.random() produces numbers uniformly between 0 and 1:

In [76]:

import random
four_uniform_randoms = [random.random() for _ in range(4)]
four_uniform_randoms

Out[76]:

[0.28755605092476433,
 0.7352577031141632,
 0.5982984069418092,
 0.867637181150536]

If you want to get reproducible results, set the seed with random.seed:

In [77]:

random.seed(10)
print(random.random())
random.seed(10)
print(random.random())

0.5714025946899135
0.5714025946899135

random.randrange takes either 1 or 2 arguments and returns an element chosen randomly from the corresponding range()

In [78]:

random.randrange(10)  # choose randomly from range(10) = [0, 1, ..., 9]
random.randrange(3, 6)  # choose randomly from range(3, 6) = [3, 4, 5]

Out[78]:

random.shuffle randomly reorders the elements of a list:

In [79]:

up_to_ten = list(range(10))
random.shuffle(up_to_ten)
print(up_to_ten)

[4, 5, 8, 1, 2, 6, 7, 3, 0, 9]

To randomly pick one element from a list:

In [80]:

my_best_friend = random.choice(["Alice", "Bob", "Charlie"])

To randomly choose a sample of elements without replacement (i.e., with no duplicates)

In [81]:

lottery_numbers = range(60)
winning_numbers = random.sample(lottery_numbers, 6)

To choose a sample of elements with replacement (i.e., allowing duplicates)

In [82]:

four_with_replacement = [random.choice(range(10)) for _ in range(4)]
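
As an alternative to the comprehension, the standard library also offers random.choices (Python 3.6 and later), which samples with replacement directly. The seed value and variable names below are arbitrary choices for the sketch:

```python
import random

random.seed(0)  # only so that the two draws below can be compared
four_picks = random.choices(range(10), k=4)

random.seed(0)
same_four_picks = random.choices(range(10), k=4)

print(four_picks)
print(four_picks == same_four_picks)  # True, same seed gives the same draw
```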

Regular Expressions¶

Regular expressions provide a way of searching text.
They are incredibly useful but also fairly complicated, so much so that there are entire books written about them.

In [83]:

import re
print(all([
    not re.match("a", "cat"),
    re.search("a", "cat"),
    not re.search("c", "dog"),
    3 == len(re.split("[ab]", "carbs")),
    "R-D-" == re.sub("[0-9]", "-", "R2D2")
])) # prints True

True
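
It may help to look at what each of those calls returns on its own. re.match only matches at the beginning of the string, while re.search looks anywhere in it:

```python
import re

print(re.match("a", "cat"))           # None, "cat" does not start with "a"
print(re.search("a", "cat").start())  # 1, "a" is found at index 1
print(re.search("c", "dog"))          # None, "dog" has no "c"
print(re.split("[ab]", "carbs"))      # ['c', 'r', 's']
print(re.sub("[0-9]", "-", "R2D2"))   # R-D-
```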

Object-Oriented Programming¶

In [84]:

# by convention, we give classes PascalCase names
class Set:
    # these are the member functions
    # every one takes a first parameter "self" (another convention)
    # that refers to the particular Set object being used

    def __init__(self, values=None):
        """This is the constructor.
        It gets called when you create a new Set.
        You would use it like
        s1 = Set()  # empty set
        s2 = Set([1,2,2,3])  # initialize with values"""

        self.dict = {} # each instance of Set has its own dict property which is what we'll use to track memberships
        if values is not None:
            for value in values:
                self.add(value)
    
    def __repr__(self):
        """this is the string representation of a Set object
        if you type it at the Python prompt or pass it to str()"""
        
        return "Set: " + str(self.dict.keys())
    
    # we'll represent membership by being a key in self.dict with value True
    def add(self, value):
        self.dict[value] = True
        
    # value is in the Set if it's a key in the dictionary
    def contains(self, value):
        return value in self.dict
    
    def remove(self, value):
        del self.dict[value]

In [85]:

s = Set([1,2,3])
s.add(4)
print(s.contains(4))  # True
s.remove(3)
print(s.contains(3))  # False

True
False
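
One possible extension, not part of the original class, is to define the special methods __contains__ and __len__ so that the built-in in operator and len work on the object. The class name SimpleSet is hypothetical and the class is kept separate so that the sketch is self-contained:

```python
class SimpleSet:
    """A hypothetical variant of the Set class that supports `in` and len()."""

    def __init__(self, values=None):
        self.dict = {}  # membership is tracked by the keys
        if values is not None:
            for value in values:
                self.dict[value] = True

    def __contains__(self, value):
        # called for expressions such as `value in simple_set`
        return value in self.dict

    def __len__(self):
        # called for len(simple_set)
        return len(self.dict)

simple_set = SimpleSet([1, 2, 2, 3])
print(2 in simple_set)   # True
print(5 in simple_set)   # False
print(len(simple_set))   # 3, duplicates are stored once
```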

Functional Tools¶

When passing functions around, sometimes we’ll want to partially apply (or curry) functions to create new functions.

In [86]:

def exp(base, power):
    return base ** power

def two_to_the(power):
    return exp(2, power)

In [87]:

two_to_the(3)

Out[87]:

8

A different approach is to use functools.partial :

In [88]:

from functools import partial
two_to_the = partial(exp, 2)  # is now a function of one variable
print(two_to_the(3))  # 8

In [89]:

square_of = partial(exp, power=2)
print(square_of(3))  # 9

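We will also occasionally use map, reduce, and filter, which provide functional alternatives to list comprehensions. In Python 3, map and filter return lazy iterators, so wrap the result in list() when you need an actual list.

Reach for map, reduce, and filter when they make the code clearer, for example when you already have a named function to apply.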

Map¶

In [90]:

def double(x):
    return 2 * x

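xs = [1, 2, 3, 4]
twice_xs = [double(x) for x in xs]  # [2, 4, 6, 8]
twice_xs = list(map(double, xs))  # same as above; map alone returns a lazy iterator

list_doubler = partial(map, double)  # function that doubles each element lazily
twice_xs = list(list_doubler(xs))  # again [2, 4, 6, 8]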

In [91]:

def multiply(x, y): return x * y

products = list(map(multiply, [1, 2], [4, 5]))  # [1 * 4, 2 * 5] = [4, 10]

Filter¶

In [92]:

def is_even(x):
    """True if x is even, False if x is odd"""
    return x % 2 == 0

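x_evens = [x for x in xs if is_even(x)]  # [2, 4]
x_evens = list(filter(is_even, xs))  # same as above; filter alone returns a lazy iterator

list_evener = partial(filter, is_even)  # function that keeps the even elements lazily
x_evens = list(list_evener(xs))  # again [2, 4]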

Reduce¶

In [93]:

from functools import reduce

x_product = reduce(multiply, xs)  # 1 * 2 * 3 * 4 = 24

list_product = partial(reduce, multiply)
x_product = list_product(xs)
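
reduce combines the first two elements, then combines that result with the third, and so on. The sketch below spells this out as an explicit loop and also shows the optional third argument, an initial value that is returned when the input is empty:

```python
from functools import reduce

def multiply(x, y):
    return x * y

xs = [1, 2, 3, 4]

# the loop that reduce(multiply, xs, 1) performs
loop_product = 1
for x in xs:
    loop_product = multiply(loop_product, x)

reduce_product = reduce(multiply, xs)
empty_product = reduce(multiply, [], 1)  # the initial value covers empty input

print(loop_product)    # 24
print(reduce_product)  # 24
print(empty_product)   # 1
```

Without an initial value, calling reduce on an empty sequence raises a TypeError.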

enumerate¶

To iterate over a list and use both its elements and their indexes:

In [94]:

documents = ["I", "am", "a", "boy"]

# not Pythonic
for i in range(len(documents)):
    document = documents[i]
    print(i, document)
    
# also not Pythonic
i = 0
for document in documents:
    print(i, document)
    i += 1

0 I
1 am
2 a
3 boy
0 I
1 am
2 a
3 boy

The Pythonic solution is enumerate , which produces tuples (index, element) :

In [95]:

for i, document in enumerate(documents):
    print(i, document)

0 I
1 am
2 a
3 boy

In [96]:

for i in range(len(documents)): print(i)  # not Pythonic
for i, _ in enumerate(documents): print(i)  # Pythonic
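
enumerate also takes an optional start argument, which is convenient when the numbering should begin at 1 rather than 0. A short sketch:

```python
documents = ["I", "am", "a", "boy"]

numbered = list(enumerate(documents, start=1))
print(numbered)  # [(1, 'I'), (2, 'am'), (3, 'a'), (4, 'boy')]
```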

zip and Argument Unpacking¶

Often we will need to zip two or more lists together.
zip transforms multiple iterables into a single iterable of tuples of corresponding elements; wrap it in list() to see the tuples:

In [97]:

list1 = ['a', 'b', 'c']
list2 = [1, 2, 3]
list(zip(list1, list2))  # is [('a', 1), ('b', 2), ('c', 3)]

Out[97]:

[('a', 1), ('b', 2), ('c', 3)]

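You can also “unzip” a list using a strange trick. The asterisk performs argument unpacking, which passes the elements of pairs as individual arguments to zip: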

In [98]:

pairs = [('a', 1), ('b', 2), ('c', 3)]
letters, numbers = zip(*pairs)  # letters is ('a', 'b', 'c'), numbers is (1, 2, 3)

In [99]:

list(zip(('a', 1), ('b', 2), ('c', 3)))

Out[99]:

[('a', 'b', 'c'), (1, 2, 3)]
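
Argument unpacking works with any function, not just zip. In the sketch below, add is a made-up two-argument function used only to show the effect of the asterisk:

```python
def add(a, b):
    return a + b

print(add(1, 2))     # 3
print(add(*[1, 2]))  # 3, the list is unpacked into two arguments

try:
    add([1, 2])  # a single list argument, so b is missing
except TypeError:
    print("add expects two inputs")

pairs = [('a', 1), ('b', 2), ('c', 3)]
letters, numbers = zip(*pairs)  # same as zip(('a', 1), ('b', 2), ('c', 3))
print(letters)  # ('a', 'b', 'c')
print(numbers)  # (1, 2, 3)
```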
