Crash Course in Python
A Crash Course in Python¶
The Basics¶
Whitespace Formatting¶
Many languages use curly braces to delimit blocks of code. Python uses indentation:
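for i in [1, 2, 3, 4, 5]:
    print(i)                      # first line in "for i" block
    for j in [1, 2, 3, 4, 5]:
        print(j)                  # first line in "for j" block
        print(i + j)              # last line in "for j" block
    print(i)                      # last line in "for i" block
print("done looping")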
1 1 2 2 3 3 4 4 5 5 6 1 2 1 3 2 4 3 5 4 6 5 7 2 3 1 4 2 5 3 6 4 7 5 8 3 4 1 5 2 6 3 7 4 8 5 9 4 5 1 6 2 7 3 8 4 9 5 10 5 done looping
Whitespace is ignored inside parentheses and brackets, which can be helpful for long-winded computations:
long_winded_computation = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10
                           + 11 + 12 +
                           13 + 14 + 15 + 16 + 17 + 18 + 19 + 20)
and for making code easier to read:
list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
easier_to_read_list_of_lists = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
Use a backslash to indicate that a statement continues onto the next line:
two_plus_three = 2 + \
                 3
Modules¶
Some features of Python are not loaded by default; import the modules that contain them.
- For example, re is the module containing functions and constants for working with regular expressions:
import re
my_regex = re.compile("[0-9]+", re.I)
You may use an alias
import re as regex
my_regex = regex.compile("[0-9]+", regex.I)
import matplotlib.pyplot as plt
You can import them explicitly and use them without qualification
from collections import defaultdict, Counter
lookup = defaultdict(int)
my_counter = Counter()
You could import the entire contents of a module into your namespace, which might inadvertently overwrite variables you’ve already defined:
match = 10
from re import * # uh oh, re has a match function
print(match) # "<function re.match>"
<function match at 0x7fb5d73a8160>
Arithmetic¶
Recall the quotient-remainder theorem: $$ n = d \cdot q + r $$
- $ d $ is the divisor, $ q $ is the quotient, and $ r $ is the remainder,
- in Python, $ 0 \leq r < d $ when $ d $ is positive and $ d < r \leq 0 $ when $ d $ is negative, so the remainder takes the sign of the divisor
print(2 ** 10) # 1024
print(2 ** 0.5) # 1.414...
print(2 ** -0.5) # 0.707...
print(5 / 2) # 2.5
print(5 % 3) # 2
print(5 // 3) # 1
print((-5) % 3) # 1
print((-5) // 3) # -2
print(5 % (-3)) # -1
print((-5) // (-3)) # 1
print((-5) % (-3)) # -2
print(7.2 // 3.5) # 2.0
print(7.2 % 3.5) # approximately 0.2 (floating-point rounding)
1024 1.4142135623730951 0.7071067811865476 2.5 2 1 1 -2 -1 1 -2 2.0 0.20000000000000018
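The identity above can be checked directly. The following is a minimal sketch using the built-in divmod, which returns the pair (n // d, n % d). The helper name check_division is an illustrative assumption, not part of the original notes.

```python
def check_division(n, d):
    """Return (q, r) and verify n == d * q + r, with r taking the sign of d."""
    q, r = divmod(n, d)        # same as (n // d, n % d)
    assert n == d * q + r      # the quotient-remainder identity
    if d > 0:
        assert 0 <= r < d      # non-negative remainder for a positive divisor
    else:
        assert d < r <= 0      # non-positive remainder for a negative divisor
    return q, r

# All four sign combinations used in the examples above
results = {(n, d): check_division(n, d) for n in (5, -5) for d in (3, -3)}
print(results)
```

Floor division rounds toward negative infinity, which is why (-5) // 3 is -2 rather than -1.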
Functions¶
A function is a rule for taking zero or more inputs and returning a corresponding output
def double(x):
    """this is where you put an optional docstring
    that explains what the function does.
    for example, this function multiplies its input by 2"""
    return x * 2
double(2)
4
Python functions are first-class, which means that we can assign them to variables and pass them into functions just like any other arguments:
def apply_to_one(f):
    """calls the function f with 1 as its argument"""
    return f(1)
my_double = double
x = apply_to_one(my_double)
print(x)
2
Lambda function: short anonymous functions
y = apply_to_one(lambda x: x + 4)
print(y)
5
another_double = lambda x: 2 * x
def another_double(x): return 2 * x # more readable
Function parameters can also be given default arguments
def my_print(message="my default message"):
    print(message)
my_print("hello") # prints 'hello'
my_print() # prints 'my default message'
hello my default message
def subtract(a=0, b=0):
    return a - b
subtract(10, 5) # returns 5
subtract(0, 5) # returns -5
subtract(b=5) # same as previous
-5
Strings¶
- Strings can be delimited by single or double quotation marks
single_quoted_string = 'data science'
double_quoted_string = "data science"
tab_string = "\t" # represents the tab character
len(tab_string) # is 1
1
You can create multiline strings using triple double quotes:
multi_line_string = """This is the first line.
and this is the second line
and this is the third line"""
Lists¶
- the most fundamental data structure in Python
integer_list = [1, 2, 3]
heterogeneous_list = ["string", 0.1, True]
list_of_lists = [ integer_list, heterogeneous_list, [] ]
list_length = len(integer_list) # equals 3
list_sum = sum(integer_list) # equals 6
You can get or set the nth element of a list with square brackets
x = list(range(10)) # is the list [0, 1, ..., 9]
zero = x[0] # equals 0, lists are 0-indexed
one = x[1] # equals 1
nine = x[-1] # equals 9, 'Pythonic' for last element
eight = x[-2] # equals 8, 'Pythonic' for next-to-last element
x[0] = -1 # now x is [-1, 1, 2, 3, ..., 9]
You can also use square brackets to “slice” lists:
first_three = x[:3] # [-1, 1, 2]
three_to_end = x[3:] # [3, 4, ..., 9]
one_to_four = x[1:5] # [1, 2, 3, 4]
last_three = x[-3:] # [7, 8, 9]
without_first_and_last = x[1:-1] # [1, 2, ..., 8]
copy_of_x = x[:] # [-1, 1, 2, ..., 9]
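As a small illustration of the slicing rules above, the sketch below shows that x[:] produces an independent (shallow) copy, whereas plain assignment only creates a second name for the same list. The stride slice x[::3] and the names every_third and alias_of_x are additions for illustration only.

```python
x = list(range(10))        # [0, 1, ..., 9]

every_third = x[::3]       # a slice can take a third "stride" argument

copy_of_x = x[:]           # new list with the same elements
copy_of_x[0] = -1          # changes the copy only

alias_of_x = x             # no copy: both names refer to one list
alias_of_x[1] = 100        # visible through x as well

print(x[:3], copy_of_x[:3], every_third)
```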
Use the in operator to check for list membership:
1 in [1, 2, 3] # True
0 in [1, 2, 3] # False
False
To concatenate lists together:
x = [1, 2, 3]
x.extend([4, 5, 6]) # x is now [1,2,3,4,5,6]
x = [1, 2, 3]
y = x + [4, 5, 6] # y is [1, 2, 3, 4, 5, 6]; x is unchanged
To append to lists one item at a time:
x = [1, 2, 3]
x.append(0) # x is now [1, 2, 3, 0]
y = x[-1] # equals 0
z = len(x) # equals 4
It is convenient to unpack lists:
x, y = [1, 2] # now x is 1, y is 2
_, y = [1, 2] # now y == 2, didn't care about the first element
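Unpacking only works when the number of names matches the number of elements. The sketch below shows the error raised otherwise, plus starred unpacking as one way to collect the leftovers. The variable names are illustrative assumptions.

```python
try:
    x, y = [1, 2, 3]             # three values but only two names
except ValueError as error:
    unpack_error = str(error)    # Python reports the mismatch
else:
    unpack_error = None

first, *rest = [1, 2, 3]         # the starred name collects the remainder in a list
print(unpack_error, first, rest)
```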
Tuples¶
- Tuples are lists’ immutable cousins.
my_list = [1, 2]
my_tuple = (1, 2)
other_tuple = 3, 4
my_list[1] = 3 # my_list is now [1, 3]
try:
    my_tuple[1] = 3
except TypeError:
    print("cannot modify a tuple")
cannot modify a tuple
Tuples are a convenient way to return multiple values from functions:
def sum_and_product(x, y):
    return (x + y), (x * y)
sp = sum_and_product(2, 3) # equals (5, 6)
s, p = sum_and_product(5, 10) # s is 15, p is 50
Tuples (and lists) can also be used for multiple assignment:
x, y = 1, 2 # now x is 1, y is 2
x, y = y, x # Pythonic way to swap variables; now x is 2, y is 1
Dictionaries¶
- Another fundamental data structure which associates values with keys
- It allows you to quickly retrieve the value corresponding to a given key:
empty_dict = {} # Pythonic
empty_dict2 = dict() # less Pythonic
grades = { "Joel" : 80, "Tim" : 95 } # dictionary literal
You can look up the value for a key using square brackets:
joels_grade = grades["Joel"] # equals 80
You'll get a KeyError if you ask for a key that's not in the dictionary:
try:
    kates_grade = grades["Kate"]
except KeyError:
    print("no grade for Kate!")
no grade for Kate!
You can check for the existence of a key using in :
joel_has_grade = "Joel" in grades # True
kate_has_grade = "Kate" in grades # False
Dictionaries have a get method that returns a default value (instead of raising an exception) when you look up a key that’s not in the dictionary:
joels_grade = grades.get("Joel", 0) # equals 80
kates_grade = grades.get("Kate", 0) # equals 0
no_ones_grade = grades.get("No One") # default default is None
You assign key-value pairs using the same square brackets:
grades["Tim"] = 99 # replaces the old value
grades["Kate"] = 100 # adds a third entry
num_students = len(grades) # equals 3
tweet = {
"user" : "joelgrus",
"text" : "Data Science is Awesome",
"retweet_count" : 100,
"hashtags" : ["#data", "#science", "#datascience", "#awesome", "#yolo"]
}
Iteration: we can look at all the keys, values, and items:
tweet_keys = tweet.keys() # view of the keys
tweet_values = tweet.values() # view of the values
tweet_items = tweet.items() # view of the (key, value) tuples
"user" in tweet_keys # True; in Python 3 the keys view supports fast membership tests
"user" in tweet # more Pythonic, uses the fast dict in
"joelgrus" in tweet_values # True, but has to scan the values one by one
True
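In Python 3, keys(), values(), and items() return view objects rather than lists. A minimal sketch of what that means in practice, using a shortened version of the tweet dictionary (the shortened contents are an assumption for brevity):

```python
tweet = {"user": "joelgrus", "retweet_count": 100}

tweet_keys = tweet.keys()                    # a live view, not a copy
has_text_before = "text" in tweet_keys       # False: no such key yet

tweet["text"] = "Data Science is Awesome"    # modify the dictionary
has_text_after = "text" in tweet_keys        # True: the view reflects the change

keys_snapshot = list(tweet_keys)             # an actual list, frozen at this moment
print(has_text_before, has_text_after, keys_snapshot)
```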
WordCount Example: Create a dictionary in which the keys are words and the values are counts.
document = ['I', 'am', 'a', 'boy', 'I', 'love', 'you']
First Approach:
word_counts = {}
for word in document:
    if word in word_counts:
        word_counts[word] += 1
    else:
        word_counts[word] = 1
Second Approach:
word_counts = {}
for word in document:
    try:
        word_counts[word] += 1
    except KeyError:
        word_counts[word] = 1
Third Approach:
word_counts = {}
for word in document:
    previous_count = word_counts.get(word, 0)
    word_counts[word] = previous_count + 1
word_counts
{'I': 2, 'am': 1, 'a': 1, 'boy': 1, 'love': 1, 'you': 1}
defaultdict¶
- A defaultdict is like a regular dictionary, except that when you try to look up a key it doesn’t contain, it first adds a value for it using a zero-argument function you provided when you created it.
from collections import defaultdict
word_counts = defaultdict(int) # int() produces 0
# word_counts = defaultdict(lambda: 100) # returns 100
for word in document:
    word_counts[word] += 1
print(word_counts)
defaultdict(<class 'int'>, {'I': 2, 'am': 1, 'a': 1, 'boy': 1, 'love': 1, 'you': 1})
int()
0
dd_list = defaultdict(list) # list() produces an empty list
dd_list[2].append(1) # now dd_list contains {2:[1]}
dd_dict = defaultdict(dict) # dict() produces an empty dict
dd_dict["Joel"]["City"] = "Seattle" # { "Joel" : { "City" : "Seattle" } }
dd_pair = defaultdict(lambda: [0, 0])
dd_pair[2][1] = 1 # now dd_pair contains {2:[0,1]}
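One consequence of the behavior described above is that merely looking up a missing key in a defaultdict stores it. The following sketch contrasts this with dict.get, which never inserts anything. The key names are arbitrary examples.

```python
from collections import defaultdict

plain = {}
plain_value = plain.get("missing", 0)   # returns 0, plain stays empty

dd = defaultdict(int)
dd_value = dd["missing"]                # returns 0 AND stores "missing": 0

dd_other = dd.get("other")              # get() bypasses the default factory
print(len(plain), dict(dd), dd_other)
```

If the extra keys matter (for example, when iterating afterwards), test membership with in or use get instead of indexing.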
Counter¶
- A Counter turns a sequence of values into a defaultdict(int)-like object mapping keys to counts.
- We will primarily use it to create histograms
from collections import Counter
c = Counter([0, 1, 2, 0]) # c is (basically) { 0 : 2, 1 : 1, 2 : 1 }
word_counts = Counter(document)
Use the help function to see the built-in documentation:
help(word_counts)
Help on Counter in module collections object:

class Counter(builtins.dict)
 |  Counter(iterable=None, /, **kwds)
 |
 |  Dict subclass for counting hashable items.  Sometimes called a bag
 |  or multiset.  Elements are stored as dictionary keys and their counts
 |  are stored as dictionary values.
 |
 |  >>> c = Counter('abcdeabcdabcaba')  # count elements from a string
 |
 |  >>> c.most_common(3)                # three most common elements
 |  [('a', 5), ('b', 4), ('c', 3)]
 |  >>> sorted(c)                       # list all unique elements
 |  ['a', 'b', 'c', 'd', 'e']
 |  >>> ''.join(sorted(c.elements()))   # list elements with repetitions
 |  'aaaaabbbbcccdde'
 |  >>> sum(c.values())                 # total of all counts
 |  15
 |
 |  >>> c['a']                          # count of letter 'a'
 |  5
 |  ...
# print the 10 most common words and their counts
for word, count in word_counts.most_common(10):
    print(word, count)
I 2 am 1 a 1 boy 1 love 1 you 1
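Two Counter behaviors are worth seeing side by side: most_common returns (element, count) pairs, and looking up an unseen element yields 0 without raising KeyError or storing the key. A minimal sketch, reusing the document list from the word-count example (the word "python" is an arbitrary unseen key):

```python
from collections import Counter

document = ['I', 'am', 'a', 'boy', 'I', 'love', 'you']
word_counts = Counter(document)

top_word, top_count = word_counts.most_common(1)[0]   # the single most frequent word
unseen = word_counts["python"]                         # 0, and no key is added

print(top_word, top_count, unseen)
```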
Sets¶
- Another data structure is set, which represents a collection of distinct elements:
s = set()
s.add(1) # s is now { 1 }
s.add(2) # s is now { 1, 2 }
s.add(2) # s is still { 1, 2 }
x = len(s) # equals 2
y = 2 in s # equals True
z = 3 in s # equals False
- For a membership test, a set is more appropriate than a list
- in is a very fast operation on sets.
hundreds_of_other_words = []
stopwords_list = ["a","an","at"] + hundreds_of_other_words + ["yet", "you"]
"zip" in stopwords_list # False, but have to check every element
stopwords_set = set(stopwords_list)
"zip" in stopwords_set # very fast to check
False
To find the distinct items in a collection:
item_list = [1, 2, 3, 1, 2, 3]
num_items = len(item_list) # 6
item_set = set(item_list) # {1, 2, 3}
num_distinct_items = len(item_set) # 3
distinct_item_list = list(item_set) # [1, 2, 3]
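Sets do not promise any particular order, so list(item_set) should not be relied on to come back sorted. The sketch below shows two common alternatives, assuming you want either sorted output or first-seen order. dict.fromkeys is used here as a swapped-in technique and is not part of the original notes.

```python
item_list = [3, 1, 2, 3, 1, 2]

# Sorted distinct values: order is made explicit by sorted()
distinct_sorted = sorted(set(item_list))

# Distinct values in first-seen order: dict keys keep insertion order
distinct_in_first_seen_order = list(dict.fromkeys(item_list))

print(distinct_sorted, distinct_in_first_seen_order)
```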
Control Flow¶
if statement:
if 1 > 2:
    message = "if only 1 were greater than two..."
elif 1 > 3:
    message = "elif stands for 'else if'"
else:
    message = "when all else fails use else (if you want to)"
You can also write a ternary if-then-else on one line:
parity = "even" if x % 2 == 0 else "odd"
while statement:
x = 0
while x < 10:
    print(x, "is less than 10")
    x += 1
0 is less than 10 1 is less than 10 2 is less than 10 3 is less than 10 4 is less than 10 5 is less than 10 6 is less than 10 7 is less than 10 8 is less than 10 9 is less than 10
for statement:
for x in range(10):
    print(x, "is less than 10")
0 is less than 10 1 is less than 10 2 is less than 10 3 is less than 10 4 is less than 10 5 is less than 10 6 is less than 10 7 is less than 10 8 is less than 10 9 is less than 10
continue and break statements:
for x in range(10):
    if x == 3:
        continue # go immediately to the next iteration
    if x == 5:
        break # quit the loop entirely
    print(x)
0 1 2 4
Truthiness¶
one_is_less_than_two = 1 < 2 # equals True
true_equals_false = True == False # equals False
Python uses the value None to indicate a nonexistent value
x = None
print(x == None) # prints True, but is not Pythonic
print(x is None) # prints True, and is Pythonic
True True
The following are all “Falsy”:
- False
- None
- [] (an empty list)
- {} (an empty dict)
- ""
- set()
- 0
- 0.0
s = 'abc'
if s:
    first_char = s[0]
else:
    first_char = ""
first_char = s and s[0] # A simpler way of doing the same
safe_x = x or 0 # if x is either a number or possibly None
- Python has an all function, which takes an iterable (such as a list) and returns True precisely when every element is truthy, and
- an any function, which returns True when at least one element is truthy:
all([True, 1, { 3 }]) # True
all([True, 1, {}]) # False, {} is falsy
any([True, 1, {}]) # True, True is truthy
all([]) # True, no falsy elements in the list
any([]) # False, no truthy elements in the list
False
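The and/or shortcuts above work because these operators return one of their operands rather than a strict True or False. A minimal sketch, with hypothetical helper names first_char_of and number_or_zero:

```python
def first_char_of(s):
    """Return the first character of s, or s itself if s is falsy."""
    return s and s[0]     # "and" returns s when s is falsy, else s[0]

def number_or_zero(x):
    """Return x, or 0 if x is falsy (for example None)."""
    return x or 0         # "or" returns x when x is truthy, else 0

# all() and any() also accept generator expressions, not just lists
all_positive = all(n > 0 for n in [1, 2, 3])
any_negative = any(n < 0 for n in [1, 2, 3])

print(first_char_of("abc"), repr(first_char_of("")), number_or_zero(None))
```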
The Not-So-Basics¶
Sorting¶
x = [4,1,2,3]
y = sorted(x) # is [1,2,3,4], x is unchanged
x.sort() # now x is [1,2,3,4]
# sort the list by absolute value from largest to smallest
x = sorted([-4,1,-2,3], key=abs, reverse=True) # is [-4,3,-2,1]
# sort the words and counts from highest count to lowest
wc = sorted(word_counts.items(),
            key=lambda x: x[1], reverse=True)
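When several words share the same count, a tuple key can break ties deterministically. The sketch below sorts by count from highest to lowest and then alphabetically. The small word_counts dictionary is a stand-in for the one built earlier.

```python
word_counts = {"I": 2, "am": 1, "a": 1, "boy": 1}

# Negating the count sorts it descending while the word stays ascending
by_count_then_word = sorted(word_counts.items(),
                            key=lambda pair: (-pair[1], pair[0]))

print(by_count_then_word)
```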
List Comprehensions¶
- Frequently, you'll want to transform a list into another list, by choosing only certain elements, or by transforming elements, or both. The Pythonic way of doing this is list comprehensions:
- Prefer a list comprehension over an explicit loop whenever it keeps the code readable.
even_numbers = [x for x in range(5) if x % 2 == 0] # [0, 2, 4]
squares = [x * x for x in range(5)] # [0, 1, 4, 9, 16]
even_squares = [x * x for x in even_numbers] # [0, 4, 16]
You can similarly turn lists into dictionaries or sets:
square_dict = { x : x * x for x in range(5) } # { 0:0, 1:1, 2:4, 3:9, 4:16 }
square_set = { x * x for x in [1, -1] } # { 1 }
- If you don't need the value from the list, it's conventional to use an underscore as the variable name.
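A short sketch of the underscore convention: the loop variable is never used, so it is named _ to signal that. The name zeroes is illustrative.

```python
even_numbers = [x for x in range(5) if x % 2 == 0]   # [0, 2, 4]

# One zero per element of even_numbers; the element itself is ignored
zeroes = [0 for _ in even_numbers]

print(zeroes)
```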
A list comprehension can include multiple fors:
pairs = [(x, y) for x in range(10) for y in range(10)] # 100 pairs (0,0), (0,1), ... (9,8), (9,9)
later fors can use the results of earlier ones:
increasing_pairs = [(x, y) for x in range(10) for y in range(x + 1, 10)]
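A comprehension with multiple fors reads in the same order as the equivalent nested loops. The following sketch builds the same pairs both ways; loop_pairs is an illustrative name.

```python
increasing_pairs = [(x, y)
                    for x in range(10)
                    for y in range(x + 1, 10)]   # only pairs with x < y

# The same thing written as explicit nested loops
loop_pairs = []
for x in range(10):
    for y in range(x + 1, 10):
        loop_pairs.append((x, y))

print(len(increasing_pairs), increasing_pairs[:3])
```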
Generators and Iterators¶
- A problem with lists is that they can easily grow very big. list(range(1000000)) creates an actual list of 1 million elements (in Python 3, range by itself is already lazy). If you only need to deal with them one at a time, this can be a huge source of inefficiency (or of running out of memory). If you potentially only need the first few values, then calculating them all is a waste.
- A generator is something that you can iterate over (for us, usually using for) but whose values are produced only as needed (lazily).
- One way to create generators is with functions and the yield keyword:
def lazy_range(n):
    """a lazy version of range"""
    i = 0
    while i < n:
        yield i
        i += 1
# The following loop will consume the yielded values one at a time until none are left:
for i in lazy_range(10):
    print(i)
0 1 2 3 4 5 6 7 8 9
A second way to create generators is by using for comprehensions wrapped in parentheses:
lazy_evens_below_20 = (i for i in lazy_range(20) if i % 2 == 0)
lazy_evens_below_20
<generator object <genexpr> at 0x7fb59c32f190>
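One practical consequence of laziness is that a generator can only be consumed once. A minimal sketch, repeating the lazy_range definition so the block is self-contained:

```python
def lazy_range(n):
    """a lazy version of range"""
    i = 0
    while i < n:
        yield i
        i += 1

gen = lazy_range(3)
first = next(gen)           # pulls exactly one value
remaining = list(gen)       # consumes whatever is left
second_pass = list(gen)     # the generator is now exhausted

print(first, remaining, second_pass)
```

To iterate again, call lazy_range(3) again to get a fresh generator.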
Randomness¶
- The random module generates (pseudo)random numbers.
- random.random() produces numbers uniformly in the interval [0, 1)
import random
four_uniform_randoms = [random.random() for _ in range(4)]
four_uniform_randoms
[0.28755605092476433, 0.7352577031141632, 0.5982984069418092, 0.867637181150536]
if you want to get reproducible results:
random.seed(10)
print(random.random())
random.seed(10)
print(random.random())
0.5714025946899135 0.5714025946899135
random.randrange takes either 1 or 2 arguments and returns an element chosen randomly from the corresponding range()
random.randrange(10) # choose randomly from range(10) = [0, 1, ..., 9]
random.randrange(3, 6) # choose randomly from range(3, 6) = [3, 4, 5]
4
random.shuffle randomly reorders the elements of a list:
up_to_ten = list(range(10))
random.shuffle(up_to_ten)
print(up_to_ten)
[4, 5, 8, 1, 2, 6, 7, 3, 0, 9]
To randomly pick one element from a list:
my_best_friend = random.choice(["Alice", "Bob", "Charlie"])
To randomly choose a sample of elements without replacement (i.e., with no duplicates)
lottery_numbers = range(60)
winning_numbers = random.sample(lottery_numbers, 6)
To choose a sample of elements with replacement (i.e., allowing duplicates)
four_with_replacement = [random.choice(range(10)) for _ in range(4)]
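As an alternative to the list comprehension above, the standard library also offers random.choices (Python 3.6+), which samples with replacement in one call. The sketch below is a swapped-in variant, not the approach used in the notes, and it reseeds the generator to show reproducibility.

```python
import random

random.seed(0)
four_with_replacement = random.choices(range(10), k=4)   # duplicates allowed

random.seed(0)
same_four = random.choices(range(10), k=4)               # same seed, same draw

winning_numbers = random.sample(range(60), 6)            # no duplicates

print(four_with_replacement, same_four, winning_numbers)
```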
Regular Expressions¶
- Regular expressions provide a way of searching text.
- They are incredibly useful but also fairly complicated, so much so that there are entire books written about them.
import re
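print(all([
    not re.match("a", "cat"),                # "cat" doesn't start with "a"
    re.search("a", "cat"),                   # "cat" has an "a" in it
    not re.search("c", "dog"),               # "dog" doesn't have a "c" in it
    3 == len(re.split("[ab]", "carbs")),     # split on a or b to ['c', 'r', 's']
    "R-D-" == re.sub("[0-9]", "-", "R2D2")   # replace digits with dashes
])) # prints True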
True
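The individual calls in the check above can also be inspected one at a time. A small sketch, in which the string "R2D2 and C3PO" is an invented example input:

```python
import re

starts_with_a = re.match("a", "cat")      # None: match only looks at the start
found = re.search("a", "cat")             # a match object: search scans the string
position = found.start()                  # index where the match begins

parts = re.split("[ab]", "carbs")         # split on every "a" or "b"
digits = re.findall("[0-9]+", "R2D2 and C3PO")   # all runs of digits

print(starts_with_a, position, parts, digits)
```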
Object-Oriented Programming¶
# by convention, we give classes PascalCase names
class Set:
    # these are the member functions
    # every one takes a first parameter "self" (another convention)
    # that refers to the particular Set object being used
    def __init__(self, values=None):
        """This is the constructor.
        It gets called when you create a new Set.
        You would use it like
        s1 = Set() # empty set
        s2 = Set([1,2,2,3]) # initialize with values"""
        # each instance of Set has its own dict property,
        # which is what we'll use to track memberships
        self.dict = {}
        if values is not None:
            for value in values:
                self.add(value)
    def __repr__(self):
        """this is the string representation of a Set object
        if you type it at the Python prompt or pass it to str()"""
        return "Set: " + str(self.dict.keys())
    # we'll represent membership by being a key in self.dict with value True
    def add(self, value):
        self.dict[value] = True
    # value is in the Set if it's a key in the dictionary
    def contains(self, value):
        return value in self.dict
    def remove(self, value):
        del self.dict[value]
s = Set([1,2,3])
s.add(4)
print(s.contains(4)) # True
s.remove(3)
print(s.contains(3)) # False
True False
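As a possible extension, a class can hook into built-in syntax by defining special methods. The sketch below is a trimmed copy of the Set class with two added methods, __contains__ and __len__, so that in and len() work on it. These two methods are an illustrative addition, not part of the original class.

```python
class Set:
    def __init__(self, values=None):
        self.dict = {}                     # keys of this dict are the members
        if values is not None:
            for value in values:
                self.add(value)

    def add(self, value):
        self.dict[value] = True

    def contains(self, value):
        return value in self.dict

    def __contains__(self, value):         # enables "value in s"
        return self.contains(value)

    def __len__(self):                     # enables "len(s)"
        return len(self.dict)

s = Set([1, 2, 2, 3])
print(2 in s, 5 in s, len(s))
```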
Functional Tools¶
- When passing functions around, sometimes we’ll want to partially apply (or curry) functions to create new functions.
def exp(base, power):
    return base ** power
def two_to_the(power):
    return exp(2, power)
two_to_the(3)
8
A different approach is to use functools.partial :
from functools import partial
two_to_the = partial(exp, 2) # is now a function of one variable
print(two_to_the(3)) # 8
8
square_of = partial(exp, power=2)
print(square_of(3)) # 9
9
We will also occasionally use map, reduce, and filter, which provide functional alternatives to list comprehensions:
- Use whichever form reads more clearly. In Python 3, map and filter return lazy iterators, so wrap them in list() when you need an actual list.
Map¶
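def double(x):
    return 2 * x
xs = [1, 2, 3, 4]
twice_xs = [double(x) for x in xs]               # [2, 4, 6, 8]
twice_xs = list(map(double, xs))                 # same as above; map alone returns a lazy iterator
list_doubler = partial(map, double)              # function that doubles each element of an iterable
twice_xs = list(list_doubler(xs))                # again [2, 4, 6, 8]
def multiply(x, y): return x * y
products = list(map(multiply, [1, 2], [4, 5]))   # [1 * 4, 2 * 5] = [4, 10]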
Filter¶
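def is_even(x):
    """True if x is even, False if x is odd"""
    return x % 2 == 0
x_evens = [x for x in xs if is_even(x)]      # [2, 4]
x_evens = list(filter(is_even, xs))          # same as above; filter alone returns a lazy iterator
list_evener = partial(filter, is_even)       # function that filters an iterable
x_evens = list(list_evener(xs))              # again [2, 4]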
Reduce¶
from functools import reduce
x_product = reduce(multiply, xs)             # ((1 * 2) * 3) * 4 = 24
list_product = partial(reduce, multiply)     # function that reduces a list with multiply
x_product = list_product(xs)                 # again 24
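reduce also accepts an optional third argument, an initial value, which matters for empty inputs. A minimal sketch, repeating multiply and xs so the block runs on its own:

```python
from functools import reduce

def multiply(x, y):
    return x * y

xs = [1, 2, 3, 4]
x_product = reduce(multiply, xs)             # ((1 * 2) * 3) * 4

empty_product = reduce(multiply, [], 1)      # the initial value is returned as-is

try:
    reduce(multiply, [])                     # no elements and no initial value
    empty_without_initial_fails = False
except TypeError:
    empty_without_initial_fails = True

print(x_product, empty_product, empty_without_initial_fails)
```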
enumerate¶
- To iterate over a list and use both its elements and their indexes:
documents = ["I", "am", "a", "boy"]
# not Pythonic
for i in range(len(documents)):
    document = documents[i]
    print(i, document)
# also not Pythonic
i = 0
for document in documents:
    print(i, document)
    i += 1
0 I 1 am 2 a 3 boy 0 I 1 am 2 a 3 boy
The Pythonic solution is enumerate , which produces tuples (index, element) :
for i, document in enumerate(documents):
    print(i, document)
0 I 1 am 2 a 3 boy
for i in range(len(documents)): print(i) # not Pythonic
for i, _ in enumerate(documents): print(i) # Pythonic
0 1 2 3 0 1 2 3
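enumerate also takes an optional start argument, which is handy when the numbering should begin at 1. A short sketch (the name numbered is illustrative):

```python
documents = ["I", "am", "a", "boy"]

# Pair each element with a counter that begins at 1 instead of 0
numbered = list(enumerate(documents, start=1))

for position, document in numbered:
    print(position, document)
```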
zip and Argument Unpacking¶
- Often we need to zip two or more lists together.
- zip transforms multiple lists into tuples of corresponding elements. In Python 3 it returns a lazy iterator, so wrap it in list() to see the result:
list1 = ['a', 'b', 'c']
list2 = [1, 2, 3]
list(zip(list1, list2)) # is [('a', 1), ('b', 2), ('c', 3)]
[('a', 1), ('b', 2), ('c', 3)]
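You can also “unzip” a list using a strange trick. The asterisk performs argument unpacking: it passes the elements of pairs to zip as separate arguments:
pairs = [('a', 1), ('b', 2), ('c', 3)]
letters, numbers = zip(*pairs) # letters is ('a', 'b', 'c'), numbers is (1, 2, 3)
list(zip(('a', 1), ('b', 2), ('c', 3))) # the same call with the arguments written out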
[('a', 'b', 'c'), (1, 2, 3)]
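Argument unpacking works with any function, not just zip. The sketch below uses a hypothetical two-argument function add, and also shows that zip stops at the shortest input.

```python
def add(a, b):
    return a + b

total = add(*[1, 2])              # same as add(1, 2)

try:
    add([1, 2])                   # one list argument, so b is missing
    needs_unpacking = False
except TypeError:
    needs_unpacking = True

shortest = list(zip([1, 2, 3], ['a', 'b']))   # the extra 3 is dropped

print(total, needs_unpacking, shortest)
```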