Lesser known and useful features in python3

This is a summary of changes introduced in python 3, by version.

Much of the content is quoted directly from the python docs. This is a quick summary for convenience, and I figure it'll be useful for others to have an at-a-glance summary of the cool (and useful!) things they could use, new in python 3 (vs python 2).

3.0

Keyword only arguments

I use this one a lot! It's a great way to enforce named arguments for readability.

def f(a, b, *args, option=True):
    ...
  • option can only be specified by name
  • You can write just * instead of *args if you don't need the extra positional args:
def f(a, b, *, option=True):
    ...
  • Allows you to provide a stable API and reorder arguments, etc
  • Add new keyword arguments without breaking the API
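
To make the points above concrete, here is a minimal sketch. The connect function and its parameters are hypothetical names made up for illustration; the point is only that anything after the bare * must be passed by name.

```python
def connect(host, port, *, timeout=10.0, retries=3):
    """Hypothetical function: timeout and retries are keyword-only."""
    return {"host": host, "port": port, "timeout": timeout, "retries": retries}

# Keyword-only options are passed by name.
settings = connect("example.org", 443, timeout=2.5)

# Passing a third positional argument is rejected with a TypeError.
try:
    connect("example.org", 443, 2.5)
except TypeError as exc:
    error_message = str(exc)
```

Because callers can never rely on the position of timeout or retries, new options can be added later without breaking existing calls.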

print() function, customize separator between items:

Quick and easy way to print multiple values:

print("There are <", 2**32, "> possibilities!", sep="")

Produces:

There are <4294967296> possibilities!

Print function docs

nonlocal

PEP 3104: nonlocal statement. Using nonlocal x you can now assign directly to a variable in an outer (but non-global) scope. nonlocal is a new reserved word.
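
A small sketch of what this enables, using a hypothetical make_counter helper. Without the nonlocal line, count += 1 would raise UnboundLocalError because count would be treated as a local of increment.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count  # rebind the variable that lives in make_counter's scope
        count += 1
        return count

    return increment

counter = make_counter()
first = counter()
second = counter()
```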

maxsize (use instead of maxint)

The sys.maxint constant was removed, since there is no longer a limit to the value of integers. However, sys.maxsize can be used as an integer larger than any practical list or string index.

extended iterable unpacking

PEP 3132: Extended Iterable Unpacking. You can now write things like a, b, *rest = some_sequence. And even *rest, a = stuff. The rest object is always a (possibly empty) list; the right-hand side may be any iterable. Example:

(a, *rest, b) = range(5)
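
To see what ends up where, here is a short illustration (the variable names are just for the example):

```python
a, *rest, b = range(5)
# a is the first item, b the last, and rest collects everything in between as a list

head, *tail = "x"
# with a single-item iterable the starred name becomes an empty list
```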

dictionary comprehensions

Dictionary comprehensions: {k: v for k, v in stuff} means the same thing as dict(stuff) but is more flexible.
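
As a rough illustration of the "more flexible" part, the sketch below uses made-up sample data and shows a comprehension that filters and transforms as it builds the dict:

```python
stuff = [("a", 1), ("b", 2), ("c", 3)]

# Equivalent to dict(stuff)
plain = {k: v for k, v in stuff}

# Transform keys and values, and keep only the odd values
filtered = {k.upper(): v * v for k, v in stuff if v % 2 == 1}
```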

exception catch syntax

Change from except exc, var to except exc as var. See PEP 3110.

metaclass keyword, not __metaclass__

This:

class C(metaclass=M):
    ...

replaces the Python 2 style of setting __metaclass__ = M in the class body.

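format with interpolation

PEP 3101 added the str.format() method:

name = 'world'
print('Hello, {}'.format(name))

The shorter f'Hello, {name}' form needs Python 3.6 or later (see the 3.6 section).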

Exception chaining

Keep the original exception (and its traceback) attached to the new one when you raise a different exception from an except block:

raise exception from e
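
A minimal sketch of how this looks in practice. ConfigError and parse_port are hypothetical names invented for the example; the original exception ends up on the __cause__ attribute of the new one and is shown in the traceback.

```python
class ConfigError(Exception):
    """Hypothetical application-level error."""


def parse_port(raw):
    try:
        return int(raw)
    except ValueError as e:
        # Chain the low-level error onto the higher-level one
        raise ConfigError(f"invalid port: {raw!r}") from e


try:
    parse_port("eighty")
except ConfigError as err:
    original = err.__cause__  # the ValueError raised by int()
```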

3.1

format numbers, currency

PEP 378: Format Specifier for Thousands Separator

>>> from decimal import Decimal
>>> format(1234567, ',d')
'1,234,567'
>>> format(1234567.89, ',.2f')
'1,234,567.89'
>>> format(12345.6 + 8901234.12j, ',f')
'12,345.600000+8,901,234.120000j'
>>> format(Decimal('1234567.89'), ',f')
'1,234,567.89'

format automatically numbers arguments

'Sir {} of {}'.format('Gallahad', 'Camelot')

with multiple context managers

The syntax of the with statement now allows multiple context managers in a single statement:

with open('mylog.txt') as infile, open('a.out', 'w') as outfile:
    for line in infile:
        if '<critical>' in line:
            outfile.write(line)

dummy logger

The logging module now implements a simple logging.NullHandler class for applications that are not using logging but are calling library code that does. Setting-up a null handler will suppress spurious warnings such as "No handlers could be found for logger foo":

h = logging.NullHandler()
logging.getLogger("foo").addHandler(h)

3.2

use argparse (not optparse)

A new module for command line parsing, argparse, was introduced to overcome the limitations of optparse which did not provide support for positional arguments (not just options), subcommands, required options and other common patterns of specifying and validating options.

See more here.

pycache folders instead of .pyc files

See more here.

If you used to clean .pyc files, now you have to clean __pycache__ directories too:

find . | grep -E "(/__pycache__$|\.pyc$|\.pyo$)" | xargs rm -rf

Prevent generating these in development with an environment variable (it must be set to a non-empty value to take effect):

export PYTHONDONTWRITEBYTECODE=1

str.format_map()

A new str.format_map() method that extends the capabilities of the existing str.format() method by accepting arbitrary mapping objects. Some cool examples:

import shelve
d = shelve.open('tmp.shl')
'The {project_name} status is {status} as of {date}'.format_map(d)


class LowerCasedDict(dict):
    def __getitem__(self, key):
        return dict.__getitem__(self, key.lower())
lcd = LowerCasedDict(part='widgets', quantity=10)
'There are {QUANTITY} {Part} in stock'.format_map(lcd)


class PlaceholderDict(dict):
    def __missing__(self, key):
        return '<{}>'.format(key)
'Hello {name}, welcome to {location}'.format_map(PlaceholderDict())

functools cache and total_ordering

import functools
@functools.lru_cache(maxsize=300)
def get_phone_number(name):
    c = conn.cursor()
    c.execute('SELECT phonenumber FROM phonelist WHERE name=?', (name,))
    return c.fetchone()[0]
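
If you want to see the cache at work without a database, here is a self-contained sketch with a toy fib function (my own example, not from the docs). The cache_info() method reports hits and misses.

```python
import functools


@functools.lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci, made fast by caching earlier results."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


value = fib(30)
info = fib.cache_info()  # named tuple with hits, misses, maxsize, currsize
```

Each distinct argument is computed only once, so the number of misses equals the number of distinct values of n that were requested.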

For example, supplying __eq__ and __lt__ will enable total_ordering() to fill-in __le__, __gt__ and __ge__:

@functools.total_ordering
class Student:
    def __eq__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) ==
                (other.lastname.lower(), other.firstname.lower()))

    def __lt__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) <
                (other.lastname.lower(), other.firstname.lower()))

See here.

datetime has timezone

>>> from datetime import datetime, timezone

>>> datetime.now(timezone.utc)
datetime.datetime(2010, 12, 8, 21, 4, 2, 923754, tzinfo=datetime.timezone.utc)

>>> datetime.strptime("01/01/2000 12:00 +0000", "%m/%d/%Y %H:%M %z")
datetime.datetime(2000, 1, 1, 12, 0, tzinfo=datetime.timezone.utc)

contextmanager as decorator

from contextlib import contextmanager
import logging

logging.basicConfig(level=logging.INFO)

@contextmanager
def track_entry_and_exit(name):
    logging.info('Entering: %s', name)
    yield
    logging.info('Exiting: %s', name)

Formerly, this would have only been usable as a context manager:

with track_entry_and_exit('widget loader'):
    print('Some time consuming activity goes here')
    load_widget()

Now, it can be used as a decorator as well:

@track_entry_and_exit('widget loader')
def activity():
    print('Some time consuming activity goes here')
    load_widget()

improvements to ast.literal_eval

A safe eval for literals. See more here.
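
A quick sketch of the difference from eval(), using a made-up config string: literals are parsed into Python objects, while anything that is not a literal (such as a function call) is rejected.

```python
import ast

# Plain literals (dicts, lists, numbers, strings, booleans, None) are accepted
config = ast.literal_eval("{'retries': 3, 'hosts': ['a', 'b'], 'debug': False}")

# Expressions that would execute code are refused
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    rejected = True
else:
    rejected = False
```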

require class or static methods on subclasses

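The abc module now supports abstractclassmethod() and abstractstaticmethod(). (Both were deprecated again in 3.3, once it became possible to stack @classmethod or @staticmethod on top of @abc.abstractmethod.)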

These tools make it possible to define an abstract base class that requires a particular classmethod() or staticmethod() to be implemented:

import abc

class Temperature(metaclass=abc.ABCMeta):
    @abc.abstractclassmethod
    def from_fahrenheit(cls, t):
        ...
    @abc.abstractclassmethod
    def from_celsius(cls, t):
        ...

unittest: equal counts in iterable

Another new method, assertCountEqual() is used to compare two iterables to determine if their element counts are equal (whether the same elements are present with the same number of occurrences regardless of order):

def test_anagram(self):
    self.assertCountEqual('algorithm', 'logarithm')

3.3

PEP 380: Syntax for Delegating to a Subgenerator

Delegate to another generator using yield from:

def g(x):
    yield from range(x, 0, -1)
    yield from range(x)

list(g(5))  # [5, 4, 3, 2, 1, 0, 1, 2, 3, 4]

One can also use the return values. See PEP 380.

The main principle driving this change is to allow even generators that are designed to be used with the send and throw methods to be split into multiple subgenerators as easily as a single large function can be split into multiple subfunctions.
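
Here is a small sketch of the return-value part, with hypothetical inner and outer generators. The value of the yield from expression is whatever the subgenerator returns.

```python
def inner():
    yield 1
    yield 2
    return "done"  # becomes the value of the `yield from` expression


def outer():
    result = yield from inner()  # re-yields 1 and 2, then captures "done"
    yield result


items = list(outer())
```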

Casefolding to match cases

For example, 'ß'.casefold() returns 'ss'. More here.
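
A minimal sketch of a caseless comparison, using a hypothetical same_text helper. It shows a case where lower() is not enough but casefold() is.

```python
def same_text(a, b):
    """Compare two strings ignoring case, using casefold()."""
    return a.casefold() == b.casefold()


with_casefold = same_text("Straße", "STRASSE")
with_lower = "Straße".lower() == "STRASSE".lower()
```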

Parse HTML without errors

html.parser.HTMLParser is now able to parse broken markup without raising errors.

3.4

No new syntax features were added. Just some interesting new libraries.

Static typing

Best to just read the docs. There's also mypy for support in python 2 and 3.

Read this quick example of why static typing is a good idea. And another one.

For a lengthier, good read on why typing isn't enough, and it's best to still use naming conventions, see this wonderful article.

Support for enum types

Read more here.

from enum import Enum
class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3

Pathlib - work with paths as objects

I won't go into detail here, but using pathlib can be much more powerful and reduce the chance of bugs. Check out this great write-up with examples.
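
For a quick taste, here is a small sketch that works in a temporary directory. The file and directory names are made up; note that read_text() and write_text() were added slightly later, in 3.5.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build paths with the / operator instead of os.path.join
    config = Path(tmp) / "settings" / "app.cfg"
    config.parent.mkdir(parents=True)
    config.write_text("debug = true\n")

    # Path components are available as attributes
    suffix = config.suffix
    stem = config.stem

    # Recursive globbing and reading are methods on the path object
    names = sorted(p.name for p in Path(tmp).rglob("*.cfg"))
    content = config.read_text()
```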

Trace python memory allocations

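The new tracemalloc module traces memory blocks allocated by Python, so you can see which lines of code allocate the most memory.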
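
A minimal sketch of typical usage. The allocation here is an arbitrary list of byte strings, chosen only so there is something to measure.

```python
import tracemalloc

tracemalloc.start()

data = [bytes(1000) for _ in range(1000)]  # allocate roughly 1 MB

current, peak = tracemalloc.get_traced_memory()  # sizes in bytes
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics("lineno")  # grouped by source line, biggest first

tracemalloc.stop()

for stat in top_stats[:3]:
    print(stat)
```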

Force relative imports to start with .

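In Python 3, imports are absolute by default and a relative import must start with a . (see PEP-0328). The following line is only needed to get the same behaviour on Python 2:

from __future__ import absolute_import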

3.5

Typing and type hints

Added in PEP-484, type hints are incredibly useful. Also, it's funny how we're moving to add typing to what were originally scripting languages (see also TypeScript for JavaScript).

def greeting(name: str) -> str:
    return 'Hello ' + name

Additional Unpacking Generalizations

Added in PEP-448. The * and ** unpacking operators can now be used multiple times in a single function call:

>>> print(*[1], *[2], 3, *[4, 5])
1 2 3 4 5

>>> def fn(a, b, c, d):
...     print(a, b, c, d)
...

>>> fn(**{'a': 1, 'c': 3}, **{'b': 2, 'd': 4})
1 2 3 4

Similarly for tuples, sets and dictionaries.
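
A short sketch of the same unpacking inside list, tuple, set and dict displays (the sample values are made up). For dictionaries, later entries win when keys collide.

```python
first = [1, 2]
second = (3, 4)

merged_list = [*first, *second, 5]
merged_tuple = (*first, *second)
merged_set = {*first, *second, 1}

defaults = {"host": "localhost", "port": 8000}
overrides = {"port": 9000}
merged_dict = {**defaults, **overrides}  # overrides takes precedence
```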

Traceback developer improvements

New functions walk_stack() and walk_tb() in the traceback module, and more.

Logging

Logging methods now accept an exception instance as the exc_info argument (in addition to booleans and exception tuples).
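
A rough sketch of logging a saved exception outside its except block. The logger name and the in-memory stream are assumptions made so the example is self-contained.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("exc_info_demo")  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.INFO)
logger.propagate = False

try:
    1 / 0
except ZeroDivisionError as exc:
    saved = exc  # keep a reference; `exc` is unbound after the except block

# Pass the exception instance directly; its traceback is included in the output
logger.warning("calculation failed", exc_info=saved)
output = stream.getvalue()
```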

Pathlib

pathlib, the lesser known object-oriented way to work with paths, was introduced in python 3.4. New in 3.5, Path.samefile() checks whether two paths point to the same file. More improvements here.

3.6

f-strings

The new big feature here is f-strings, formatted string literals, introduced in PEP-498.

>>> import decimal
>>> name = "Fred"
>>> f"He said his name is {name}."
'He said his name is Fred.'
>>> width = 10
>>> precision = 4
>>> value = decimal.Decimal("12.34567")
>>> f"result: {value:{width}.{precision}}"  # nested fields
'result:      12.35'

Typing: variable annotations

Introduced in PEP-526, now you can annotate variables when you declare them.

from typing import Dict, List

primes: List[int] = []

captain: str  # Note: no initial value!

class Starship:
    stats: Dict[str, int] = {}

Underscores in large numbers for readability

Added in PEP-515.

>>> 1_000_000_000_000_000
1000000000000000
>>> '{:_}'.format(1000000)
'1_000_000'

Async/await improvements

Python 3.5 and 3.6 introduced lots of async/await support. I won't dwell on this one, as async/await code is still not mainstream to this day (in early 2020), though much of the language support was released 3-4 years ago.

Custom class creation

Instead of metaclasses, it's now possible to use __init_subclass__. The new __init_subclass__ classmethod will be called on the base class whenever a new subclass is created:

class PluginBase:
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.subclasses.append(cls)

class Plugin1(PluginBase):
    pass

class Plugin2(PluginBase):
    pass

Indicate unsupported operations

It is now possible to set a special method to None to indicate that the corresponding operation is not available. For example, if a class sets __iter__() to None, the class is not iterable.
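
A small sketch with a hypothetical Record class. It supports indexing, which would normally make it iterable through the old sequence protocol, but setting __iter__ to None switches iteration off.

```python
class Record:
    """Hypothetical container that supports indexing but not iteration."""

    def __init__(self, fields):
        self._fields = list(fields)

    def __getitem__(self, index):
        return self._fields[index]

    __iter__ = None  # explicitly mark the class as not iterable


record = Record(["a", "b"])

try:
    iter(record)
except TypeError:
    iterable = False
else:
    iterable = True
```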

Secure random numbers

Use the new secrets module.
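
A minimal sketch of the most common calls. The token sizes and the 12-character password length are arbitrary choices for the example.

```python
import secrets
import string

# Random tokens suitable for password resets, API keys and similar
token = secrets.token_urlsafe(32)
hex_token = secrets.token_hex(16)  # 16 random bytes as 32 hex digits

# Pick characters with a cryptographically strong source of randomness
alphabet = string.ascii_letters + string.digits
password = "".join(secrets.choice(alphabet) for _ in range(12))

# Constant-time comparison helps avoid timing attacks
matches = secrets.compare_digest(hex_token, hex_token)
```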

3.7

Lazy eval of annotations

from __future__ import annotations  # need this import in 3.7+; making it the default was planned for a later release


class C:
    @classmethod
    def from_string(cls, source: str) -> C:  # Note self reference
        ...

    def validate_b(self, obj: B) -> bool:  # Note reference to class declared later
        ...

class B:
    ...

# To handle self-reference in 3.6 or at construction, use quotes:
# See https://stackoverflow.com/a/33533514

class Tree:
    def __init__(self, left: 'Tree', right: 'Tree'):
        self.left = left
        self.right = right

Built-in breakpoint()

From 3.7 onwards, I stopped using the old familiar import pdb; pdb.set_trace() to set a breakpoint, and instead just:

breakpoint()

This is really handy and much easier to remember. See PEP-553 for more.

You can use an env var to customize what this command does:

# Use PuDB
$ PYTHONBREAKPOINT=pudb.set_trace python3.7 script.py
# Or embed an IPython shell:
$ PYTHONBREAKPOINT=IPython.embed python3.7 script.py

Dataclasses

While not very commonly used, there have been times when I wanted something like this. See also PEP-557.

from dataclasses import dataclass

@dataclass(order=True)
class Point:
    x: float
    y: float
    z: float = 0.0

p = Point(1.5, 2.5)
print(p)   # produces "Point(x=1.5, y=2.5, z=0.0)"

The order argument generates the comparison methods (__lt__, __le__, __gt__ and __ge__), which enables sorting of instances of the class.
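
To illustrate the sorting, here is a short sketch that repeats the Point class so it runs on its own. Instances are compared field by field, as if they were tuples of (x, y, z).

```python
from dataclasses import dataclass


@dataclass(order=True)
class Point:
    x: float
    y: float
    z: float = 0.0


points = [Point(2.0, 1.0), Point(1.5, 2.5), Point(1.5, 0.5, 3.0)]
ordered = sorted(points)  # compares x first, then y, then z
```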

Other small wins

  • Dictionary insertion order is officially guaranteed.
  • Improvements to typing: core support for the typing module and generic types (see PEP-560), performance up to x7 and bug fixes.
  • __getattr__() on modules, as well as __dir__() - see PEP-562.
  • ImportError now displays module name and module __file__ path when from ... import ... fails.
  • Tons of asyncio improvements, async and await are reserved keywords.
  • datetime.fromisoformat() - more
  • The subprocess.run() function accepts the new capture_output keyword argument. When true, stdout and stderr will be captured. This is equivalent to passing subprocess.PIPE as stdout and stderr arguments.
  • Timing precision to nanoseconds - see PEP-564.
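
A quick sketch touching three of the items above: datetime.fromisoformat(), capture_output for subprocess.run() and the nanosecond clock. The child command simply prints a word, and text=True (also new in 3.7) decodes the output to str.

```python
import subprocess
import sys
import time
from datetime import datetime

# Parse the output of datetime.isoformat() back into a datetime
parsed = datetime.fromisoformat("2018-06-27T12:30:00")

# capture_output=True is shorthand for stdout=PIPE, stderr=PIPE
result = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    capture_output=True,
    text=True,
)

# Integer nanoseconds since the epoch, avoiding float rounding
now_ns = time.time_ns()
```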

3.8

Assignment expressions (walrus operator)

Use := to assign values to variables as part of a larger expression. This is great for declaring variables in the block they are used, a nice shorthand!

# In conditionals
if (n := len(a)) > 10:
    print(f"List is too long ({n} elements, expected <= 10)")
    
# In while loops
# Loop over fixed length blocks
while (block := f.read(256)) != '':
    process(block)
    
# In list comprehensions
[clean_name.title() for name in names
 if (clean_name := normalize('NFC', name)) in allowed_names]

Positional-only parameters

There is a new function parameter syntax / to indicate that some function parameters must be specified positionally and cannot be used as keyword arguments.

def f(a, b, /, c, d, *, e, f):
    print(a, b, c, d, e, f)

# Valid call:
f(10, 20, 30, d=40, e=50, f=60)

# Invalid calls:
f(10, b=20, c=30, d=40, e=50, f=60)   # b cannot be a keyword argument
f(10, 20, 30, 40, 50, f=60)           # e must be a keyword argument

Good use cases from the docs:

  • Disallow calling the parameter by name when it's not helpful, e.g. len(obj="hello")... obj impairs readability here
  • Allows the parameter name to be changed in the future without breaking client code.

Since the parameters to the left of / are not exposed as possible keywords, the parameter names remain available for use in **kwargs:

def f(a, b, /, **kwargs):
    print(a, b, kwargs)

f(10, 20, a=1, b=2, c=3)         # a and b are used in two ways

f-strings: the = specifier for debugging

This example says it all:

from datetime import date
from math import cos, radians

user = 'eric_idle'
member_since = date(1975, 7, 31)
f'{user=} {member_since=}'
# "user='eric_idle' member_since=datetime.date(1975, 7, 31)"

# With formatting
delta = date.today() - member_since
f'{user=!s}  {delta.days=:,d}'
# 'user=eric_idle  delta.days=16,075'

# Using functions
theta = 30
print(f'{theta=}  {cos(radians(theta))=:.3f}')
# theta=30  cos(radians(theta))=0.866

Read metadata from packages

The new importlib.metadata module provides (provisional) support for reading metadata from third-party packages. For example, it can extract an installed package's version number, list of entry points, and more:

>>> # Note following example requires that the popular "requests"
>>> # package has been installed.
>>>
>>> from importlib.metadata import version, requires, files
>>> version('requests')
'2.22.0'
>>> list(requires('requests'))
['chardet (<3.1.0,>=3.0.2)']
>>> list(files('requests'))[:5]
[PackagePath('requests-2.22.0.dist-info/INSTALLER'),
 PackagePath('requests-2.22.0.dist-info/LICENSE'),
 PackagePath('requests-2.22.0.dist-info/METADATA'),
 PackagePath('requests-2.22.0.dist-info/RECORD'),
 PackagePath('requests-2.22.0.dist-info/WHEEL')]

Cached property

This is similar to Django's util, except it's standard python!

import functools
import statistics

class Dataset:
   def __init__(self, sequence_of_numbers):
      self.data = sequence_of_numbers

   @functools.cached_property
   def variance(self):
      return statistics.variance(self.data)
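
To see the caching happen, here is a sketch that adds a hypothetical computations counter to the class above. The property body runs on first access only; later accesses read the stored value.

```python
import functools
import statistics


class Dataset:
    def __init__(self, sequence_of_numbers):
        self.data = sequence_of_numbers
        self.computations = 0  # hypothetical counter, only for this demo

    @functools.cached_property
    def variance(self):
        self.computations += 1
        return statistics.variance(self.data)


dataset = Dataset([2, 4, 4, 4, 5, 5, 7, 9])
first = dataset.variance   # computed and stored on the instance
second = dataset.variance  # served from the instance, no recomputation
```

To force a recomputation, delete the attribute with del dataset.variance.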

Typing improvements

  • TypedDict: a dictionary type with a fixed set of string keys, each with its own value type (see PEP-589). By default, every key is required to be present. Specify total=False to allow keys to be optional:

from typing import TypedDict

class Location(TypedDict, total=False):
    lat_long: tuple
    grid_square: str
    xy_coordinate: tuple

More async improvements

Read about these in the summary.

3.9

Check out this great article.

Thanks!

Thanks for reading. Hope you learned about new python 3 goodies that you'll use for years to come.