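[Translation] Google Python Style Guide (translation in progress)

This document is translated from the Google Python Style Guide. Please do not use it for commercial purposes.

Google Python Style Guide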


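1 Background

Python is the main dynamic language used at Google. This style guide lists dos and don'ts to help you write correctly formatted Python code.
We have created a settings file for Vim.
For Emacs, the default settings should be fine.

Many teams also use the yapf auto-formatter to avoid arguing over formatting.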


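2 Python Language Rules


2.1 Lint

Run pylint over your code, using the pylintrc.


2.1.1 Definition

pylint is a tool for finding bugs and style problems in Python source code.
It finds problems that are typically caught by a compiler in less dynamic languages such as C and C++.
Because Python is a dynamic language, some of its warnings may be incorrect.
However, spurious warnings should be fairly infrequent.


2.1.2 Pros

Catches easy-to-miss errors such as typos and using variables before assignment.


2.1.3 Cons

pylint isn't perfect. To take advantage of it, we sometimes need to write around it, suppress its warnings, or fix it.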


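2.1.4 Decision

Make sure you run pylint on your code.

Suppress warnings if they are inappropriate so that other issues are not hidden. To suppress a warning, set a line-level comment:

dict = 'something awful'  # Bad Idea... pylint: disable=redefined-builtin

Each pylint warning is identified by a symbolic name (empty-docstring).
Google-specific warnings start with g-.

If the reason for the suppression is not clear from the symbolic name, add an explanation.

Suppressing in this way has the advantage that we can easily search for suppressions and revisit them.

You can get a list of pylint warnings by running:

pylint --list-msgs

To get more information on a particular message, use:

pylint --help-msg=C6409

Prefer pylint: disable to the deprecated older form pylint: disable-msg.

Unused argument warnings can be suppressed by deleting the variables at the beginning of the function (del releases the unused names).
Always include a comment explaining why you are deleting them.
"Unused." is sufficient.

For example:

from typing import Optional

def viking_cafe_order(spam: str, beans: str, eggs: Optional[str] = None) -> str:
    del beans, eggs  # Unused by vikings.
    return spam + spam + spam

Other common forms of suppressing this warning exist.
They include using '_' as the identifier for the unused argument, prefixing the argument name with 'unused_', or assigning the arguments to '_'.
These forms are allowed but no longer encouraged.
They break callers that pass arguments by name and do not enforce that the arguments are actually unused.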


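2.2 Imports

Use import statements for packages and modules only, not for individual classes or functions.
Imports from the typing module,
the typing_extensions module, and
the six.moves module are exempt from this rule.


2.2.1 Definition

A reusability mechanism for sharing code from one module to another.


2.2.2 Pros

The namespace management convention is simple.
The source of each identifier is indicated in a consistent way.
For example, x.Obj says that object Obj is defined in module x.


2.2.3 Cons

Module names can still collide.
Some module names are inconveniently long.


2.2.4 Decision

  • Use import x for importing packages and modules.
  • Use from x import y where x is the package prefix and y is the module name with no prefix.
  • Use from x import y as z if two modules named y are to be imported or if y is an inconveniently long name.
  • Use import y as z only when z is a standard abbreviation (e.g., np for numpy).

For example, the module sound.effects.echo may be imported as follows:

from sound.effects import echo
...
echo.EchoFilter(input, output, delay=0.7, atten=4)

Do not use relative names in imports.
Even if the module is in the same package,
use the full package name.
This helps prevent unintentionally importing a package twice.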
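The import forms listed above can be tried out with standard-library modules. The following is a minimal sketch, not part of the original guide; the helper config_name and the sample XML string are hypothetical and exist only for illustration.

```python
# "import x": a whole module, referenced by its full name.
import os

# "from x import y": x is the package prefix, y is a module.
from xml.etree import ElementTree as ET  # "as z": ET is a widely used abbreviation.


def config_name(filename: str) -> str:
    """Returns the file name without its directory or extension."""
    # os.path.* makes it obvious which module each function comes from.
    return os.path.splitext(os.path.basename(filename))[0]


# The module prefix shows where fromstring is defined.
root = ET.fromstring('<order><item>spam</item></order>')
```

Every call site names the module it relies on, which is the namespace benefit described in the Pros section.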


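2.3 Packages

Import each module using the full pathname location of the module.


2.3.1 Pros

Avoids conflicts in module names, and incorrect imports caused by a module search path that is not what the author expected.
Makes it easier to find modules.


2.3.2 Cons

Makes it harder to deploy code because you have to replicate the package hierarchy.
This is not really a problem with modern deployment mechanisms (CD).


2.3.3 Decision

All code should import each module by its full package name.

Imports should be as follows:

Yes:

# Reference absl.flags in code with the complete name (verbose).
import absl.flags
from doctor.who import jodie

FLAGS = absl.flags.FLAGS

# Reference flags in code with just the module name (common).
from absl import flags
from doctor.who import jodie

FLAGS = flags.FLAGS

No: (assume this file lives in doctor/who/ where jodie.py also exists)

# Unclear what module the author wanted and what will be imported.
# The actual import behavior depends on external factors controlling sys.path.
# Which possible jodie module did the author intend to import?
import jodie

The directory the main binary is located in should not be assumed to be in sys.path, even though that happens in some environments.
Code should therefore assume that import jodie refers to a third-party or top-level package named jodie, not a local jodie.py.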


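2.4 Exceptions

Exceptions are allowed but must be used carefully.


2.4.1 Definition

Exceptions are a means of breaking out of normal control flow to handle errors or other exceptional conditions.


2.4.2 Pros

The control flow of normal operation code is not cluttered by error-handling code.
It also allows the control flow to skip multiple frames when a certain condition occurs, e.g., returning from N nested functions in one step instead of having to plumb error codes through.


2.4.3 Cons

May cause the control flow to be confusing.
Easy to miss error cases when making library calls.


2.4.4 Decision

Exceptions must follow certain conditions:

  • Make use of built-in exception classes when it makes sense. For example,
    raise a ValueError to indicate a programming mistake like a violated
    precondition (such as being passed a negative number when a positive one
    was required). Do not use assert statements to validate argument values of
    a public API. assert is used to ensure internal correctness, not to enforce
    correct usage nor to indicate that some unexpected event occurred. If an
    exception is desired in the latter cases, use a raise statement.
    For example: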

    Yes:
    def connect_to_next_port(self, minimum: int) -> int:
        """Connects to the next available port.

        Args:
            minimum: A port value greater or equal to 1024.

        Returns:
            The new minimum port.

        Raises:
            ConnectionError: If no available port is found.
        """
        if minimum < 1024:
            # Note that this raising of ValueError is not mentioned in the
            # docstring's "Raises:" section because it is not appropriate to
            # guarantee this specific behavioral reaction to API misuse.
            raise ValueError(f'Min. port must be at least 1024, not {minimum}.')
        port = self._find_next_open_port(minimum)
        if not port:
            raise ConnectionError(
                f'Could not connect to service on port {minimum} or higher.')
        assert port >= minimum, (
            f'Unexpected port {port} when minimum was {minimum}.')
        return port

    No:
    def connect_to_next_port(self, minimum: int) -> int:
        """Connects to the next available port.

        Args:
            minimum: A port value greater or equal to 1024.

        Returns:
            The new minimum port.
        """
        assert minimum >= 1024, 'Minimum port must be at least 1024.'
        port = self._find_next_open_port(minimum)
        assert port is not None
        return port
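  • Libraries or packages may define their own exceptions. When doing so they
    must inherit from an existing exception class. Exception names should end
    in Error and should not introduce stutter (foo.FooError).

  • Never use catch-all except: statements, or catch Exception or
    StandardError, unless you are

    • re-raising the exception, or
    • creating an isolation point in the program where exceptions are not
      propagated but are recorded and suppressed instead, such as protecting a
      thread from crashing by guarding its outermost block.

    Python is very tolerant in this regard and except: will really catch
    everything, including misspelled names, sys.exit() calls, Ctrl+C
    interrupts, unittest failures and all kinds of other exceptions that you
    simply don't want to catch.

  • Minimize the amount of code in a try/except block. The larger the body of
    the try, the more likely that an exception will be raised by a line of code
    that you didn't expect to raise one. In those cases, the try/except block
    hides a real error.

  • Use the finally clause to execute code whether or not an exception is
    raised in the try block. This is often useful for cleanup, e.g., closing a
    file.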
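Two of the rules above, a library-specific exception that inherits from an existing class and a try block kept as small as possible, can be sketched together. This is an illustrative example only; PortError and parse_port are hypothetical names, not part of any library or of the original guide.

```python
class PortError(ValueError):
    """Raised when text cannot be used as a port number."""


def parse_port(text: str) -> int:
    """Parses a non-privileged port number from text.

    Args:
        text: The textual form of the port, e.g. '8080'.

    Returns:
        The port as an integer between 1024 and 65535.

    Raises:
        PortError: If text is not an integer or is out of range.
    """
    try:
        port = int(text)  # Only the call that may raise sits inside the try.
    except ValueError as e:
        # Catch the one specific exception expected here, never a bare except.
        raise PortError(f'Not a port number: {text!r}') from e
    if not 1024 <= port <= 65535:
        raise PortError(f'Port {port} is outside 1024..65535.')
    return port
```

Because PortError inherits from ValueError, callers that already handle ValueError keep working, while callers that care can catch the narrower type.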


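2.5 Global variables

Avoid global variables.


2.5.1 Definition

Variables that are declared at the module level or as class attributes.


2.5.2 Pros

Occasionally useful.


2.5.3 Cons

Has the potential to change module behavior during the import, because
assignments to global variables are done when the module is first imported.


2.5.4 Decision

Avoid global variables.

While they are technically variables, module-level constants are permitted and encouraged.
For example: _MAX_HOLY_HANDGRENADE_COUNT = 3
Constants must be named using all caps with underscores.
See the naming rules.

If needed, globals should be declared at the module level and made internal to the module by prepending an _ to the name.
External access must be done through public module-level functions.
See the naming rules.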
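As a rough illustration of the two permitted patterns, the sketch below pairs a module-level constant with a module-internal global that is only reachable through public functions. The constant name comes from the text above; _grenades_thrown, throw_grenade, and grenades_thrown are hypothetical names invented for this example.

```python
_MAX_HOLY_HANDGRENADE_COUNT = 3  # Module-level constant: all caps, underscores.

_grenades_thrown = 0  # Module-internal global state, hence the leading underscore.


def throw_grenade() -> int:
    """Records one throw and returns the running total."""
    global _grenades_thrown
    if _grenades_thrown >= _MAX_HOLY_HANDGRENADE_COUNT:
        raise RuntimeError('No holy hand grenades left.')
    _grenades_thrown += 1
    return _grenades_thrown


def grenades_thrown() -> int:
    """Returns how many grenades have been thrown so far."""
    return _grenades_thrown
```

Other modules call throw_grenade() and grenades_thrown() and never touch _grenades_thrown directly, so the state can later be moved or restructured without breaking callers.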


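2.6 Nested/Local/Inner Classes and Functions

Nested local functions or classes are fine when used to close over a local variable.
Inner classes are fine.


2.6.1 Definition

A class can be defined inside of a method, function, or class.
A function can be defined inside a method or function.
Nested functions have read-only access to variables defined in enclosing scopes.


2.6.2 Pros

Allows definition of utility classes and functions that are only used inside of a very limited scope.
Commonly used for implementing decorators.
Very ADT-y.


2.6.3 Cons

Nested functions cannot be directly tested.
Nesting can make the outer function longer and less readable.


2.6.4 Decision

They are fine with some caveats.
Avoid nested functions or classes except when closing over a local value other than self or cls.
Do not nest a function just to hide it from users of a module.
Instead, prefix its name with an _ at the module level so that it can still be accessed by tests.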





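2.7 Comprehensions & Generator Expressions

Okay to use for simple cases.


2.7.1 Definition

List, dict, and set comprehensions as well as generator expressions provide a concise and efficient way to create container types and iterators
without resorting to traditional loops.


2.7.2 Pros

Simple comprehensions can be clearer and simpler than other dict, list, or set creation techniques.
Generator expressions can be very efficient, since they avoid the creation of a list entirely.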
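The efficiency claim can be checked with a small experiment. This is only a sketch: sys.getsizeof reports the size of the container object itself, not of its elements, and the exact numbers depend on the interpreter.

```python
import sys

numbers = range(100_000)

# A list comprehension materializes every element up front.
squares_list = [n * n for n in numbers]

# A generator expression produces elements one at a time, on demand.
squares_gen = (n * n for n in numbers)

list_size = sys.getsizeof(squares_list)  # Grows with the number of elements.
gen_size = sys.getsizeof(squares_gen)    # Small and independent of the range.

# Passing a generator expression straight to sum() never builds a list.
total = sum(n * n for n in numbers)
```

When the values are only consumed once, as in the sum() call, the generator form avoids holding all 100,000 results in memory at the same time.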


2.7.3 Cons

Complicated comprehensions or generator expressions can be hard to read.


2.7.4 Decision

Okay to use for simple cases.
Each portion must fit on one line: mapping expression, for clause, filter expression.
Avoid multiple for clauses.
When more than one for clause is needed, use loops instead to keep the code from getting too complicated.

Yes:
result = [mapping_expr for value in iterable if filter_expr]

result = [{'key': value} for value in iterable
          if a_long_filter_expression(value)]

result = [complicated_transform(x)
          for x in iterable if predicate(x)]

descriptive_name = [
    transform({'key': key, 'value': value}, color='black')
    for key, value in generate_iterable(some_input)
    if complicated_condition_is_met(key, value)
]

result = []
for x in range(10):
    for y in range(5):
        if x * y > 10:
            result.append((x, y))

return {x: complicated_transform(x)
        for x in long_generator_function(parameter)
        if x is not None}

squares_generator = (x**2 for x in range(10))

unique_names = {user.name for user in users if user is not None}

eat(jelly_bean for jelly_bean in jelly_beans
    if jelly_bean.color == 'black')

No:
result = [complicated_transform(
              x, some_argument=x+1)
          for x in iterable if predicate(x)]

result = [(x, y) for x in range(10) for y in range(5) if x * y > 10]

return ((x, y, z)
        for x in range(5)
        for y in range(5)
        if x != y
        for z in range(5)
        if y != z)

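2.8 Default Iterators and Operators

Use default iterators and operators for types that support them, like lists, dictionaries, and files.


2.8.1 Definition

Container types, like dictionaries and lists, define default iterators and membership test operators ("in" and "not in").


2.8.2 Pros

The default iterators and operators are simple and efficient.
They express the operation directly, without extra method calls.
A function that uses default operators is generic.
It can be used with any type that supports the operation.


2.8.3 Cons

You can't tell the type of objects by reading the method names (e.g., has_key() means a dictionary).
This is also an advantage.


2.8.4 Decision

Use default iterators and operators for types that support them, like lists, dictionaries, and files.
The built-in types define iterator methods, too.
Prefer these methods to methods that return lists, except that you should not mutate a container while iterating over it.

Yes:
for key in adict: ...
if key not in adict: ...
if obj in alist: ...
for line in afile: ...
for k, v in adict.items(): ...
for k, v in six.iteritems(adict): ...

No:
for key in adict.keys(): ...
if not adict.has_key(key): ...
for line in afile.readlines(): ...
for k, v in dict.iteritems(): ...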


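2.9 Generators

Use generators as needed.


2.9.1 Definition

A generator function returns an iterator that yields a value each time it executes a yield statement.
After it yields a value, the runtime state of the generator function is suspended until the next value is needed.


2.9.2 Pros

Simpler code, because the state of local variables and control flow are preserved for each call.
A generator uses less memory than a function that creates an entire list of values at once.


2.9.3 Cons

None.


2.9.4 Decision

Fine. Use "Yields:" rather than "Returns:" in the docstring for generator functions.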
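A minimal sketch of a generator function documented with a "Yields:" section is shown below. The function countdown is a hypothetical example, not taken from the original guide.

```python
from typing import Iterator


def countdown(start: int) -> Iterator[int]:
    """Counts down from start to 1.

    Args:
        start: The first value to produce.

    Yields:
        Each integer from start down to 1, in decreasing order.
    """
    current = start
    while current > 0:
        # Execution pauses here; 'current' is preserved until the next value
        # is requested.
        yield current
        current -= 1
```

Calling countdown(3) runs none of the body until iteration begins; each next() call resumes right after the yield statement with the local state intact.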


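2.10 Lambda Functions

Okay for one-liners.
Prefer generator expressions over map() or filter() with a lambda.


2.10.1 Definition

Lambdas define anonymous functions in an expression, as opposed to a statement.


2.10.2 Pros

Convenient.


2.10.3 Cons

Harder to read and debug than local functions.
The lack of names means stack traces are more difficult to understand.
Expressiveness is limited because the function may only contain an expression.


2.10.4 Decision

Okay to use them for one-liners.
If the code inside the lambda function is longer than 60-80 chars, it's probably better to define it as a regular nested function.

For common operations like multiplication, use the functions from the operator module instead of lambda functions.
For example, prefer operator.mul to lambda x, y: x * y.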
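The recommendation can be illustrated with the standard operator and functools modules. The helper product and the sample data are hypothetical; operator.mul, operator.itemgetter, and functools.reduce are standard-library functions.

```python
import functools
import operator


def product(values):
    """Returns the product of all values, or 1 for an empty iterable."""
    # operator.mul replaces: lambda x, y: x * y
    return functools.reduce(operator.mul, values, 1)


pairs = [('spam', 3), ('eggs', 1), ('beans', 2)]

# operator.itemgetter(1) replaces: lambda pair: pair[1]
by_count = sorted(pairs, key=operator.itemgetter(1))

# A comprehension replaces: filter(lambda n: n % 2 == 0, range(10))
evens = [n for n in range(10) if n % 2 == 0]
```

The named operator functions also show up by name in stack traces, which addresses the debugging concern listed under Cons.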


2.11 Conditional Expressions

Okay for simple cases.


2.11.1 Definition

Conditional expressions (sometimes called a "ternary operator") provide a shorter syntax for if statements.
For example: x = 1 if cond else 2


2.11.2 Pros

Shorter and more convenient than an if statement.


2.11.3 Cons

May be harder to read than an if statement.
The condition may be difficult to locate if the expression is long.


2.11.4 Decision

Okay to use for simple cases.
Each portion must fit on one line: true-expression, if-expression, else-expression.
Use a complete if statement when things get more complicated.

Yes:
one_line = 'yes' if predicate(value) else 'no'
slightly_split = ('yes' if predicate(value)
                  else 'no, nein, nyet')
the_longest_ternary_style_that_can_be_done = (
    'yes, true, affirmative, confirmed, correct'
    if predicate(value)
    else 'no, false, negative, nay')

No:
bad_line_breaking = ('yes' if predicate(value) else
                     'no')
portion_too_long = ('yes'
                    if some_long_module.some_long_predicate_function(
                        really_long_variable_name)
                    else 'no, false, negative, nay')


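2.12 Default Argument Values

Okay in most cases.


2.12.1 Definition

You can specify values for variables at the end of a function's parameter list, e.g., def foo(a, b=0):
If foo is called with only one argument, b is set to 0.
If it is called with two arguments, b has the value of the second argument.


2.12.2 Pros

Often you have a function that uses lots of default values, but on rare occasions you want to override the defaults.
Default argument values provide an easy way to do this, without having to define lots of functions for the rare exceptions.
As Python does not support overloaded methods, default arguments are an easy way of "faking" the overloading behavior.


2.12.3 Cons

Default arguments are evaluated once, at module load time.
This may cause problems if the argument is a mutable object such as a list or a dictionary.
If the function modifies the object (e.g., by appending an item to a list), the default value is modified.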
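The pitfall described above is easy to reproduce. In this sketch, append_shared and append_fresh are hypothetical functions written only to contrast the two behaviors.

```python
def append_shared(item, bucket=[]):  # pylint: disable=dangerous-default-value
    """Appends item to bucket; the default list is created only once."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Appends item to bucket, creating a new list on each call if needed."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared('spam')
second = append_shared('eggs')  # Reuses the same default list as the first call.

fresh_first = append_fresh('spam')
fresh_second = append_fresh('eggs')  # Gets its own new list.
```

After these calls, first and second refer to one and the same list containing both items, while fresh_first and fresh_second are independent single-item lists.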


2.12.4 Decision

Okay to use with the following caveat:

Do not use mutable objects as default values in the function or method definition.

Yes:
def foo(a, b=None):
    if b is None:
        b = []

def foo(a, b: Optional[Sequence] = None):
    if b is None:
        b = []

def foo(a, b: Sequence = ()):  # Empty tuple OK since tuples are immutable
    ...

No:
def foo(a, b=[]):
    ...

def foo(a, b=time.time()):  # The time the module was loaded???
    ...

def foo(a, b=FLAGS.my_thing):  # sys.argv has not yet been parsed...
    ...

def foo(a, b: Mapping = {}):  # Could still get passed to unchecked code
    ...


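2.13 Properties (getters, setters, etc.)

Use properties for accessing or setting data where you would normally have used simple, lightweight accessor or setter methods.


2.13.1 Definition

A way to wrap method calls for getting and setting an attribute as a standard attribute access when the computation is lightweight.


2.13.2 Pros

Readability is increased by eliminating explicit get and set method calls for simple attribute access.
Allows calculations to be lazy.
Considered the Pythonic way to maintain the interface of a class.
In terms of performance, allowing properties bypasses the need for trivial accessor methods when direct variable access is reasonable.
This also allows accessor methods to be added in the future without breaking the interface.


2.13.3 Cons

Can hide side effects, much like operator overloading.
Can be confusing for subclasses.


2.13.4 Decision

Use properties to access or set data where you would normally have used simple, lightweight accessor or setter methods.
Properties should be created with the @property decorator.

Inheritance with properties can be non-obvious if the property itself is not overridden.
Thus one must make sure that accessor methods are called indirectly, so that methods overridden in subclasses are called by the property (using the template method design pattern).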

Yes:
import math

class Square:
    """A square with two properties: a writable area and a read-only perimeter.

    To use:
    >>> sq = Square(3)
    >>> sq.area
    9
    >>> sq.perimeter
    12
    >>> sq.area = 16
    >>> sq.side
    4.0
    >>> sq.perimeter
    16.0
    """

    def __init__(self, side: float):
        self.side = side

    @property
    def area(self) -> float:
        """Area of the square."""
        return self._get_area()

    @area.setter
    def area(self, area: float):
        self._set_area(area)

    def _get_area(self) -> float:
        """Indirect accessor to calculate the 'area' property."""
        return self.side ** 2

    def _set_area(self, area: float):
        """Indirect setter to set the 'area' property."""
        self.side = math.sqrt(area)

    @property
    def perimeter(self) -> float:
        return self.side * 4


2.14 True/False Evaluations

Use the “implicit” false if at all possible.


2.14.1 Definition

Python evaluates certain values as False when in a boolean context. A quick
“rule of thumb” is that all “empty” values are considered false, so 0, None, [], {}, '' all evaluate as false in a boolean context.


2.14.2 Pros

Conditions using Python booleans are easier to read and less error-prone. In
most cases, they’re also faster.


2.14.3 Cons

May look strange to C/C++ developers.


2.14.4 Decision

Use the “implicit” false if possible, e.g., if foo: rather than if foo != []:. There are a few caveats that you should keep in mind though:

  • Always use if foo is None: (or is not None) to check for a None value.
    E.g., when testing whether a variable or argument that defaults to None
    was set to some other value. The other value might be a value that’s false
    in a boolean context!

  • Never compare a boolean variable to False using ==. Use if not x:
    instead. If you need to distinguish False from None then chain the
    expressions, such as if not x and x is not None:.

  • For sequences (strings, lists, tuples), use the fact that empty sequences
    are false, so if seq: and if not seq: are preferable to if len(seq):
    and if not len(seq): respectively.

  • When handling integers, implicit false may involve more risk than benefit
    (i.e., accidentally handling None as 0). You may compare a value which is
    known to be an integer (and is not the result of len()) against the
    integer 0.

    Yes:
    if not users:
        print('no users')

    if foo == 0:
        self.handle_zero()

    if i % 10 == 0:
        self.handle_multiple_of_ten()

    def f(x=None):
        if x is None:
            x = []

    No:
    if len(users) == 0:
        print('no users')

    if foo is not None and not foo:
        self.handle_zero()

    if not i % 10:
        self.handle_multiple_of_ten()

    def f(x=None):
        x = x or []

  • Note that '0' (i.e., 0 as string) evaluates to true.
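The caveats about None, integers, and the string '0' can be seen together in a short sketch. The function describe_timeout is hypothetical and exists only to show why None and 0 must be told apart explicitly.

```python
def describe_timeout(timeout=None) -> str:
    """Describes a timeout setting given in seconds."""
    if timeout is None:  # Explicit None check: the caller passed nothing.
        return 'default timeout'
    if timeout == 0:     # Explicit comparison for a known integer.
        return 'no waiting'
    return f'{timeout}s timeout'


# 'if not timeout:' would have treated 0 and None the same way.
empty_string_is_false = not ''
zero_string_is_true = bool('0')  # A non-empty string, so it is true.
```

With an implicit check, describe_timeout(0) would silently fall into the "not set" branch, which is exactly the risk the integer caveat warns about.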


2.16 Lexical Scoping

Okay to use.


2.16.1 Definition

A nested Python function can refer to variables defined in enclosing functions,
but cannot assign to them. Variable bindings are resolved using lexical scoping,
that is, based on the static program text. Any assignment to a name in a block
will cause Python to treat all references to that name as a local variable, even
if the use precedes the assignment. If a global declaration occurs, the name is
treated as a global variable.

An example of the use of this feature is:

from typing import Callable

def get_adder(summand1: float) -> Callable[[float], float]:
    """Returns a function that adds numbers to a given number."""
    def adder(summand2: float) -> float:
        return summand1 + summand2

    return adder


2.16.2 Pros

Often results in clearer, more elegant code. Especially comforting to
experienced Lisp and Scheme (and Haskell and ML and …) programmers.


2.16.3 Cons

Can lead to confusing bugs. Such as this example based on
PEP-0227:

i = 4
def foo(x: Iterable[int]):
    def bar():
        print(i, end='')
    # ...
    # A bunch of code here
    # ...
    for i in x:  # Ah, i *is* local to foo, so this is what bar sees
        print(i, end='')
    bar()

So foo([1, 2, 3]) will print 1 2 3 3,
not 1 2 3 4.


2.16.4 Decision

Okay to use.



2.17 Function and Method Decorators

Use decorators judiciously when there is a clear advantage. Avoid staticmethod
and limit use of classmethod.


2.17.1 Definition

Decorators for Functions and Methods
(a.k.a “the @ notation”). One common decorator is @property, used for
converting ordinary methods into dynamically computed attributes. However, the
decorator syntax allows for user-defined decorators as well. Specifically, for
some function my_decorator, this:

class C:
    @my_decorator
    def method(self):
        # method body ...

is equivalent to:

class C:
    def method(self):
        # method body ...
    method = my_decorator(method)


2.17.2 Pros

Elegantly specifies some transformation on a method; the transformation might
eliminate some repetitive code, enforce invariants, etc.


2.17.3 Cons

Decorators can perform arbitrary operations on a function’s arguments or return
values, resulting in surprising implicit behavior. Additionally, decorators
execute at import time. Failures in decorator code are pretty much impossible to
recover from.


2.17.4 Decision

Use decorators judiciously when there is a clear advantage. Decorators should
follow the same import and naming guidelines as functions. Decorator pydoc
should clearly state that the function is a decorator. Write unit tests for
decorators.

Avoid external dependencies in the decorator itself (e.g. don’t rely on files,
sockets, database connections, etc.), since they might not be available when the
decorator runs (at import time, perhaps from pydoc or other tools). A
decorator that is called with valid parameters should (as much as possible) be
guaranteed to succeed in all cases.

Decorators are a special case of “top level code” - see main for
more discussion.

Never use staticmethod unless forced to in order to integrate with an API
defined in an existing library. Write a module level function instead.

Use classmethod only when writing a named constructor or a class-specific
routine that modifies necessary global state such as a process-wide cache.
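A minimal sketch of a decorator that follows this advice (documented as a decorator, free of external dependencies, and testable), together with a classmethod used as a named constructor, might look like the following. count_calls, greet, and Temperature are hypothetical names; functools.wraps is the standard-library helper for preserving the wrapped function's metadata.

```python
import functools


def count_calls(func):
    """Decorator that counts how many times func has been called."""
    @functools.wraps(func)  # Keeps func's __name__ and docstring intact.
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def greet(name: str) -> str:
    """Returns a greeting for name."""
    return f'Hello, {name}!'


class Temperature:
    """A temperature stored in degrees Celsius."""

    def __init__(self, celsius: float):
        self.celsius = celsius

    @classmethod
    def from_fahrenheit(cls, fahrenheit: float) -> 'Temperature':
        """Named constructor: builds a Temperature from degrees Fahrenheit."""
        return cls((fahrenheit - 32) * 5 / 9)
```

The decorator touches no files, sockets, or other external state, so applying it at import time cannot fail for environmental reasons.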


2.18 Threading

Do not rely on the atomicity of built-in types.

While Python’s built-in data types such as dictionaries appear to have atomic
operations, there are corner cases where they aren’t atomic (e.g. if __hash__
or __eq__ are implemented as Python methods) and their atomicity should not be
relied upon. Neither should you rely on atomic variable assignment (since this
in turn depends on dictionaries).

Use the queue module's Queue data type (the Queue module in Python 2) as the preferred way to communicate
data between threads. Otherwise, use the threading module and its locking
primitives. Prefer condition variables and threading.Condition instead of
using lower-level locks.
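As a sketch of the preferred approach in Python 3, the example below passes work to a single worker thread and collects results through queue.Queue objects instead of sharing a list or dictionary. The worker function and the None sentinel convention are assumptions made for this illustration.

```python
import queue
import threading


def worker(tasks: queue.Queue, results: queue.Queue) -> None:
    """Squares each number taken from tasks until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:  # Sentinel: no more work.
            break
        results.put(item * item)


tasks: queue.Queue = queue.Queue()
results: queue.Queue = queue.Queue()

thread = threading.Thread(target=worker, args=(tasks, results))
thread.start()

for number in (1, 2, 3):
    tasks.put(number)
tasks.put(None)  # Tell the worker to stop.
thread.join()

# With a single worker, results arrive in the order the tasks were queued.
squares = [results.get() for _ in range(3)]
```

All synchronization is handled inside Queue, so the code never relies on the atomicity of a built-in type.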


2.19 Power Features

Avoid these features.


2.19.1 Definition

Python is an extremely flexible language and gives you many fancy features such
as custom metaclasses, access to bytecode, on-the-fly compilation, dynamic
inheritance, object reparenting, import hacks, reflection (e.g. some uses of
getattr()), modification of system internals, __del__ methods implementing
customized cleanup, etc.


2.19.2 Pros

These are powerful language features. They can make your code more compact.


2.19.3 Cons

It’s very tempting to use these “cool” features when they’re not absolutely
necessary. It’s harder to read, understand, and debug code that’s using unusual
features underneath. It doesn’t seem that way at first (to the original author),
but when revisiting the code, it tends to be more difficult than code that is
longer but is straightforward.


2.19.4 Decision

Avoid these features in your code.

Standard library modules and classes that internally use these features are okay
to use (for example, abc.ABCMeta, dataclasses, and enum).


2.20 Modern Python: from __future__ imports

New language version semantic changes may be gated behind a special future
import to enable them on a per-file basis within earlier runtimes.


2.20.1 Definition

Being able to turn on some of the more modern features via from __future__ import statements allows early use of features from expected future Python
versions.


2.20.2 Pros

This has proven to make runtime version upgrades smoother as changes can be made
on a per-file basis while declaring compatibility and preventing regressions
within those files. Modern code is more maintainable as it is less likely to
accumulate technical debt that will be problematic during future runtime
upgrades.


2.20.3 Cons

Such code may not work on very old interpreter versions prior to the
introduction of the needed future statement. The need for this is more common in
projects supporting an extremely wide variety of environments.


2.20.4 Decision

from __future__ imports

Use of from __future__ import statements is encouraged. It allows a given
source file to start using more modern Python syntax features today. Once you no
longer need to run on a version where the features are hidden behind a
__future__ import, feel free to remove those lines.

In code that may execute on versions as old as 3.5 rather than >= 3.7, import:

from __future__ import generator_stop

For legacy code with the burden of continuing to support 2.7, import:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

For more information read the
Python future statement definitions
documentation.

Please don’t remove these imports until you are confident the code is only ever
used in a sufficiently modern environment. Even if you do not currently use the
feature a specific future import enables in your code today, keeping it in place
in the file prevents later modifications of the code from inadvertently
depending on the older behavior.

Use other from __future__ import statements as you see fit. We did not include
unicode_literals in our recommendations for 2.7 as it was not a clear win due
to implicit default codec conversion consequences it introduced in many places
within 2.7. Most dual-version 2-and-3 code was better off with explicit use of
b'' and u'' bytes and unicode string literals where necessary.

The six, future, and past libraries

When your project still needs to support use under both Python 2 and 3, use the
six,
future, and
past libraries as you see fit. They exist to
make your code cleaner and life easier.




2.21 Type Annotated Code

You can annotate Python 3 code with type hints according to
PEP-484, and type-check the code at
build time with a type checking tool like pytype.

Type annotations can be in the source or in a
stub pyi file. Whenever
possible, annotations should be in the source. Use pyi files for third-party or
extension modules.


2.21.1 Definition

Type annotations (or “type hints”) are for function or method arguments and
return values:

def func(a: int) -> List[int]:

You can also declare the type of a variable using similar
PEP-526 syntax:

a: SomeType = some_func()

Or by using a type comment in code that must support legacy Python versions.

a = some_func()  # type: SomeType


2.21.2 Pros

Type annotations improve the readability and maintainability of your code. The
type checker will convert many runtime errors to build-time errors, and reduce
your ability to use Power Features.


2.21.3 Cons

You will have to keep the type declarations up to date.
You might see type errors that you think are valid code. Use of a type checker
may reduce your ability to use Power Features.


2.21.4 Decision

You are strongly encouraged to enable Python type analysis when updating code.
When adding or modifying public APIs, include type annotations and enable
checking via pytype in the build system. As static analysis is relatively new to
Python, we acknowledge that undesired side-effects (such as wrongly inferred
types) may prevent adoption by some projects. In those situations,
authors are encouraged to add a comment with a TODO or link to a bug describing
the issue(s) currently preventing type annotation adoption in the BUILD file or
in the code itself as appropriate.


3 Python Style Rules


3.1 Semicolons

Do not terminate your lines with semicolons, and do not use semicolons to put
two statements on the same line.


3.2 Line length

Maximum line length is 80 characters.

Explicit exceptions to the 80 character limit:

  • Long import statements.
  • URLs, pathnames, or long flags in comments.
  • Long string module level constants not containing whitespace that would be
    inconvenient to split across lines such as URLs or pathnames.
  • Pylint disable comments. (e.g.: # pylint: disable=invalid-name)

Do not use backslash line continuation except for with statements requiring
three or more context managers.

Make use of Python’s
implicit line joining inside parentheses, brackets and braces.
If necessary, you can add an extra pair of parentheses around an expression.

Yes:
foo_bar(self, width, height, color='black', design=None, x='foo',
        emphasis=None, highlight=0)

if (width == 0 and height == 0 and
    color == 'red' and emphasis == 'strong'):

When a literal string won’t fit on a single line, use parentheses for implicit
line joining.

x = ('This will build a very long long '
     'long long long long long long string')

Within comments, put long URLs on their own line if necessary.

Yes:
# See details at
# http://www.example.com/us/developer/documentation/api/content/v2.0/csv_file_name_extension_full_specification.html

No:
# See details at
# http://www.example.com/us/developer/documentation/api/content/\
# v2.0/csv_file_name_extension_full_specification.html

It is permissible to use backslash continuation when defining a with statement
whose expressions span three or more lines. For two lines of expressions, use a
nested with statement:

Yes:
with very_long_first_expression_function() as spam, \
     very_long_second_expression_function() as beans, \
     third_thing() as eggs:
    place_order(eggs, beans, spam, beans)

No:
with VeryLongFirstExpressionFunction() as spam, \
     VeryLongSecondExpressionFunction() as beans:
    PlaceOrder(beans, spam)

Yes:
with very_long_first_expression_function() as spam:
    with very_long_second_expression_function() as beans:
        place_order(beans, spam)

Make note of the indentation of the elements in the line continuation examples
above; see the indentation section for explanation.

In all other cases where a line exceeds 80 characters, and the yapf
auto-formatter does not help bring the line below the limit, the line is allowed
to exceed this maximum. Authors are encouraged to manually break the line up per
the notes above when it is sensible.


3.3 Parentheses

Use parentheses sparingly.

It is fine, though not required, to use parentheses around tuples. Do not use
them in return statements or conditional statements unless using parentheses for
implied line continuation or to indicate a tuple.

Yes:
if foo:
    bar()
while x:
    x = bar()
if x and y:
    bar()
if not x:
    bar()
# For a 1 item tuple the ()s are more visually obvious than the comma.
onesie = (foo,)
return foo
return spam, beans
return (spam, beans)
for (x, y) in dict.items(): ...

No:
if (x):
    bar()
if not(x):
    bar()
return (foo)


3.4 Indentation

Indent your code blocks with 4 spaces.

Never use tabs or mix tabs and spaces. In cases of implied line continuation,
you should align wrapped elements either vertically, as per the examples in the
line length section; or using a hanging indent of 4 spaces,
in which case there should be nothing after the open parenthesis or bracket on
the first line.

Yes:
# Aligned with opening delimiter
foo = long_function_name(var_one, var_two,
                         var_three, var_four)
meal = (spam,
        beans)

# Aligned with opening delimiter in a dictionary
foo = {
    'long_dictionary_key': value1 +
                           value2,
    ...
}

# 4-space hanging indent; nothing on first line
foo = long_function_name(
    var_one, var_two, var_three,
    var_four)
meal = (
    spam,
    beans)

# 4-space hanging indent in a dictionary
foo = {
    'long_dictionary_key':
        long_dictionary_value,
    ...
}

No:
# Stuff on first line forbidden
foo = long_function_name(var_one, var_two,
    var_three, var_four)
meal = (spam,
    beans)

# 2-space hanging indent forbidden
foo = long_function_name(
  var_one, var_two, var_three,
  var_four)

# No hanging indent in a dictionary
foo = {
    'long_dictionary_key':
    long_dictionary_value,
    ...
}








3.4.1 Trailing commas in sequences of items?

Trailing commas in sequences of items are recommended only when the closing
container token ], ), or } does not appear on the same line as the final
element. The presence of a trailing comma is also used as a hint to our Python
code auto-formatter YAPF to direct it to auto-format the container
of items to one item per line when the , after the final element is present.

Yes:
golomb3 = [0, 1, 3]

Yes:
golomb4 = [
    0,
    1,
    4,
    6,
]

No:
golomb4 = [
    0,
    1,
    4,
    6
]


3.5 Blank Lines

Two blank lines between top-level definitions, be they function or class
definitions. One blank line between method definitions and between the class
line and the first method. No blank line following a def line. Use single
blank lines as you judge appropriate within functions or methods.


3.6 Whitespace

Follow standard typographic rules for the use of spaces around punctuation.

No whitespace inside parentheses, brackets or braces.

Yes: spam(ham[1], {'eggs': 2}, [])
No:  spam( ham[ 1 ], { 'eggs': 2 }, [ ] )

No whitespace before a comma, semicolon, or colon. Do use whitespace after a
comma, semicolon, or colon, except at the end of the line.

Yes:
if x == 4:
    print(x, y)
x, y = y, x

No:
if x == 4 :
    print(x , y)
x , y = y , x

No whitespace before the open paren/bracket that starts an argument list,
indexing or slicing.

Yes: spam(1)
No:  spam (1)

Yes: dict['key'] = list[index]
No:  dict ['key'] = list [index]

No trailing whitespace.

Surround binary operators with a single space on either side for assignment
(=), comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not), and
Booleans (and, or, not). Use your better judgment for the insertion of spaces
around arithmetic operators (+, -, *, /, //, %, **, @).

Yes: x == 1
No:  x<1

Never use spaces around = when passing keyword arguments or defining a default
parameter value, with one exception:
when a type annotation is present, _do_ use spaces
around the = for the default parameter value.

Yes: def complex(real, imag=0.0): return Magic(r=real, i=imag)
Yes: def complex(real, imag: float = 0.0): return Magic(r=real, i=imag)

No:  def complex(real, imag = 0.0): return Magic(r = real, i = imag)
No:  def complex(real, imag: float=0.0): return Magic(r = real, i = imag)

Don’t use spaces to vertically align tokens on consecutive lines, since it
becomes a maintenance burden (applies to :, #, =, etc.):

Yes:
foo = 1000  # comment
long_name = 2  # comment that should not be aligned

dictionary = {
    'foo': 1,
    'long_name': 2,
}

No:
foo       = 1000  # comment
long_name = 2     # comment that should not be aligned

dictionary = {
    'foo'      : 1,
    'long_name': 2,
}



3.7 Shebang Line

Most .py files do not need to start with a #! line. Start the main file of a
program with
#!/usr/bin/env python3 (to support virtualenvs) or #!/usr/bin/python3 per
PEP-394.

This line is used by the kernel to find the Python interpreter, but is ignored by Python when importing modules. It is only necessary on a file intended to be executed directly.



3.8 Comments and Docstrings

Be sure to use the right style for module, function, method docstrings and
inline comments.



3.8.1 Docstrings

Python uses docstrings to document code. A docstring is a string that is the
first statement in a package, module, class or function. These strings can be
extracted automatically through the __doc__ member of the object and are used
by pydoc.
(Try running pydoc on your module to see how it looks.) Always use the three
double-quote """ format for docstrings (per
PEP 257).
A docstring should be organized as a summary line (one physical line not
exceeding 80 characters) terminated by a period, question mark, or exclamation
point. When writing more (encouraged), this must be followed by a blank line,
followed by the rest of the docstring starting at the same cursor position as
the first quote of the first line. There are more formatting guidelines for
docstrings below.
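The layout described above can be sketched on a small function. fetch_greeting is a hypothetical example; the only library behavior relied on is that the docstring is available through the __doc__ attribute.

```python
def fetch_greeting(name: str) -> str:
    """Returns a greeting for the given name.

    The summary line above fits on one physical line and ends with a period.
    After one blank line, the remaining text starts at the same column as the
    first quote of the docstring.
    """
    return f'Hello, {name}!'


# The docstring is stored on the object and is what pydoc displays.
summary_line = fetch_greeting.__doc__.splitlines()[0]
```

Tools that only show the first line, such as summaries in generated documentation, therefore still get a complete sentence.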



3.8.2 Modules

Every file should contain license boilerplate. Choose the appropriate boilerplate for the license used by the project (for example, Apache 2.0, BSD, LGPL, GPL)

Files should start with a docstring describing the contents and usage of the
module.

"""A one line summary of the module or program, terminated by a period.

Leave one blank line. The rest of this docstring should contain an
overall description of the module or program. Optionally, it may also
contain a brief description of exported classes and functions and/or usage
examples.

Typical usage example:

  foo = ClassFoo()
  bar = foo.FunctionBar()
"""



3.8.3 Functions and Methods

In this section, “function” means a method, function, or generator.

A function must have a docstring, unless it meets all of the following criteria:

  • not externally visible
  • very short
  • obvious

A docstring should give enough information to write a call to the function
without reading the function’s code. The docstring should describe the
function’s calling syntax and its semantics, but generally not its
implementation details, unless those details are relevant to how the function is
to be used. For example, a function that mutates one of its arguments as a side
effect should note that in its docstring. Otherwise, subtle but important
details of a function’s implementation that are not relevant to the caller are
better expressed as comments alongside the code than within the function’s
docstring.

The docstring should be descriptive-style ("""Fetches rows from a Bigtable.""") rather than imperative-style ("""Fetch rows from a Bigtable."""). The docstring for a @property data descriptor should use the
same style as the docstring for an attribute or a
function argument ("""The Bigtable path.""",
rather than """Returns the Bigtable path.""").
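
As a minimal sketch of the two styles side by side (the class BigtableClient and its members are hypothetical names invented for this illustration, not part of any real library), a method gets a descriptive-style docstring while a @property is documented like an attribute:

```python
class BigtableClient:
    """Hypothetical client, used only to illustrate docstring style."""

    def __init__(self, path: str):
        self._path = path

    @property
    def path(self) -> str:
        """The Bigtable path."""
        return self._path

    def fetch_rows(self) -> list:
        """Fetches rows from a Bigtable."""
        # Placeholder body; a real client would read from storage here.
        return []
```

The property docstring names the thing itself, so it reads naturally wherever the attribute is listed.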

A method that overrides a method from a base class may have a simple docstring
sending the reader to its overridden method’s docstring, such as """See base class.""". The rationale is that there is no need to repeat in many places
documentation that is already present in the base method’s docstring. However,
if the overriding method’s behavior is substantially different from the
overridden method, or details need to be provided (e.g., documenting additional
side effects), a docstring with at least those differences is required on the
overriding method.
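
The following sketch shows both cases described above. All class and attribute names here (Table, CachedTable, LoggingTable, requested_keys) are hypothetical and exist only for illustration:

```python
class Table:
    """Hypothetical base class, used only to illustrate docstring style."""

    def fetch_rows(self, keys):
        """Fetches rows for the given keys.

        Args:
            keys: A sequence of row keys.

        Returns:
            A dict mapping each key to its (empty) row tuple.
        """
        return {key: () for key in keys}


class CachedTable(Table):
    """Override with the same caller-visible behavior as the base class."""

    def fetch_rows(self, keys):
        """See base class."""
        return super().fetch_rows(keys)


class LoggingTable(Table):
    """Override that adds a side effect, so the difference is documented."""

    def __init__(self):
        self.requested_keys = []

    def fetch_rows(self, keys):
        """Fetches rows and records which keys were requested.

        In addition to the base class behavior, appends every key to
        self.requested_keys as a side effect.
        """
        self.requested_keys.extend(keys)
        return super().fetch_rows(keys)
```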

Certain aspects of a function should be documented in special sections, listed
below. Each section begins with a heading line, which ends with a colon. All
sections other than the heading should maintain a hanging indent of two or four
spaces (be consistent within a file). These sections can be omitted in cases
where the function’s name and signature are informative enough that it can be
aptly described using a one-line docstring.


Args:
List each parameter by name. A description should follow the name, and be
separated by a colon followed by either a space or newline. If the
description is too long to fit on a single 80-character line, use a hanging
indent of 2 or 4 spaces more than the parameter name (be consistent with the
rest of the docstrings in the file). The description should include required
type(s) if the code does not contain a corresponding type annotation. If a
function accepts *foo (variable length argument lists) and/or **bar
(arbitrary keyword arguments), they should be listed as *foo and **bar.


Returns: (or Yields: for generators)
Describe the type and semantics of the return value. If the function only
returns None, this section is not required. It may also be omitted if the
docstring starts with Returns or Yields (e.g. """Returns row from Bigtable as a tuple of strings.""") and the opening sentence is sufficient to
describe return value.


Raises:
List all exceptions that are relevant to the interface followed by a
description. Use a similar exception name + colon + space or newline and
hanging indent style as described in Args:. You should not document
exceptions that get raised if the API specified in the docstring is violated
(because this would paradoxically make behavior under violation of the API
part of the API).

def fetch_smalltable_rows(table_handle: smalltable.Table,
                          keys: Sequence[Union[bytes, str]],
                          require_all_keys: bool = False,
                         ) -> Mapping[bytes, Tuple[str, ...]]:
    """Fetches rows from a Smalltable.

    Retrieves rows pertaining to the given keys from the Table instance
    represented by table_handle.  String keys will be UTF-8 encoded.

    Args:
        table_handle: An open smalltable.Table instance.
        keys: A sequence of strings representing the key of each table
          row to fetch.  String keys will be UTF-8 encoded.
        require_all_keys: Optional; If require_all_keys is True only
          rows with values set for all keys will be returned.

    Returns:
        A dict mapping keys to the corresponding table row data
        fetched. Each row is represented as a tuple of strings. For
        example:

        {b'Serak': ('Rigel VII', 'Preparer'),
         b'Zim': ('Irk', 'Invader'),
         b'Lrrr': ('Omicron Persei 8', 'Emperor')}

        Returned keys are always bytes.  If a key from the keys argument is
        missing from the dictionary, then that row was not found in the
        table (and require_all_keys must have been False).

    Raises:
        IOError: An error occurred accessing the smalltable.
    """

Similarly, this variation on Args: with a line break is also allowed:

def fetch_smalltable_rows(table_handle: smalltable.Table,
                          keys: Sequence[Union[bytes, str]],
                          require_all_keys: bool = False,
                         ) -> Mapping[bytes, Tuple[str, ...]]:
    """Fetches rows from a Smalltable.

    Retrieves rows pertaining to the given keys from the Table instance
    represented by table_handle.  String keys will be UTF-8 encoded.

    Args:
        table_handle:
          An open smalltable.Table instance.
        keys:
          A sequence of strings representing the key of each table row to
          fetch.  String keys will be UTF-8 encoded.
        require_all_keys:
          Optional; If require_all_keys is True only rows with values set
          for all keys will be returned.

    Returns:
        A dict mapping keys to the corresponding table row data
        fetched. Each row is represented as a tuple of strings. For
        example:

        {b'Serak': ('Rigel VII', 'Preparer'),
         b'Zim': ('Irk', 'Invader'),
         b'Lrrr': ('Omicron Persei 8', 'Emperor')}

        Returned keys are always bytes.  If a key from the keys argument is
        missing from the dictionary, then that row was not found in the
        table (and require_all_keys must have been False).

    Raises:
        IOError: An error occurred accessing the smalltable.
    """



3.8.4 Classes

Classes should have a docstring below the class definition describing the class.
If your class has public attributes, they should be documented here in an
Attributes section and follow the same formatting as a
function’s Args section.

class SampleClass:
    """Summary of class here.

    Longer class information....
    Longer class information....

    Attributes:
        likes_spam: A boolean indicating if we like SPAM or not.
        eggs: An integer count of the eggs we have laid.
    """

    def __init__(self, likes_spam: bool = False):
        """Inits SampleClass with blah."""
        self.likes_spam = likes_spam
        self.eggs = 0

    def public_method(self):
        """Performs operation blah."""




3.8.5 Block and Inline Comments

The final place to have comments is in tricky parts of the code. If you’re going
to have to explain it at the next code review,
you should comment it now. Complicated operations get a few lines of comments
before the operations commence. Non-obvious ones get comments at the end of the
line.

# We use a weighted dictionary search to find out where i is in
# the array.  We extrapolate position based on the largest num
# in the array and the array size and then do binary search to
# get the exact number.

if i & (i-1) == 0:  # True if i is 0 or a power of 2.

To improve legibility, these comments should start at least 2 spaces away from
the code with the comment character #, followed by at least one space before
the text of the comment itself.

On the other hand, never describe the code. Assume the person reading the code
knows Python (though not what you’re trying to do) better than you do.

# BAD COMMENT: Now go through the b array and make sure whenever i occurs
# the next element is i+1





3.8.6 Punctuation, Spelling, and Grammar

Pay attention to punctuation, spelling, and grammar; it is easier to read
well-written comments than badly written ones.

Comments should be as readable as narrative text, with proper capitalization and
punctuation. In many cases, complete sentences are more readable than sentence
fragments. Shorter comments, such as comments at the end of a line of code, can
sometimes be less formal, but you should be consistent with your style.

Although it can be frustrating to have a code reviewer point out that you are
using a comma when you should be using a semicolon, it is very important that
source code maintain a high level of clarity and readability. Proper
punctuation, spelling, and grammar help with that goal.


3.10 Strings

Use an
f-string,
the % operator, or the format method for formatting strings, even when the
parameters are all strings. Use your best judgment to decide between + and %
(or format) though. Do not use % or the format method for pure
concatenation.

Yes: x = a + b
     x = '%s, %s!' % (imperative, expletive)
     x = '{}, {}'.format(first, second)
     x = 'name: %s; score: %d' % (name, n)
     x = 'name: {}; score: {}'.format(name, n)
     x = f'name: {name}; score: {n}'

No: x = '%s%s' % (a, b)  # use + in this case
    x = '{}{}'.format(a, b)  # use + in this case
    x = first + ', ' + second
    x = 'name: ' + name + '; score: ' + str(n)

Avoid using the + and += operators to accumulate a string within a loop. In
some conditions, accumulating a string with addition can lead to quadratic
rather than linear running time. Although common accumulations of this sort may
be optimized on CPython, that is an implementation detail. The conditions under
which an optimization applies are not easy to predict and may change. Instead,
add each substring to a list and ''.join the list after the loop terminates,
or write each substring to an io.StringIO buffer. These techniques
consistently have amortized-linear run time complexity.

Yes: items = ['<table>']
     for last_name, first_name in employee_list:
         items.append('<tr><td>%s, %s</td></tr>' % (last_name, first_name))
     items.append('</table>')
     employee_table = ''.join(items)

No: employee_table = '<table>'
    for last_name, first_name in employee_list:
        employee_table += '<tr><td>%s, %s</td></tr>' % (last_name, first_name)
    employee_table += '</table>'
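
The io.StringIO alternative mentioned above can be sketched the same way. The contents of employee_list are made-up sample data for this illustration:

```python
import io

# Hypothetical sample data standing in for a real employee list.
employee_list = [('Lovelace', 'Ada'), ('Hopper', 'Grace')]

buffer = io.StringIO()
buffer.write('<table>')
for last_name, first_name in employee_list:
    # Each write appends to the in-memory buffer instead of rebuilding a str.
    buffer.write('<tr><td>%s, %s</td></tr>' % (last_name, first_name))
buffer.write('</table>')
employee_table = buffer.getvalue()
```

A buffer can be convenient when the pieces are produced by several functions that each accept a file-like object to write to.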

Be consistent with your choice of string quote character within a file. Pick '
or " and stick with it. It is okay to use the other quote character on a
string to avoid the need to backslash-escape quote characters within it.

Yes:
  Python('Why are you hiding your eyes?')
  Gollum("I'm scared of lint errors.")
  Narrator('"Good!" thought a happy Python reviewer.')

No:
  Python("Why are you hiding your eyes?")
  Gollum('The lint. It burns. It burns us.')
  Gollum("Always the great lint. Watching. Watching.")

Prefer """ for multi-line strings rather than '''. Projects may choose to
use ''' for all non-docstring multi-line strings if and only if they also use
' for regular strings. Docstrings must use """ regardless.

Multi-line strings do not flow with the indentation of the rest of the program.
If you need to avoid embedding extra space in the string, use either
concatenated single-line strings or a multi-line string with
textwrap.dedent()
to remove the initial space on each line:

  No:
  long_string = """This is pretty ugly.
Don't do this.
"""

  Yes:
  long_string = """This is fine if your use case can accept
      extraneous leading spaces."""

  Yes:
  long_string = ("And this is fine if you cannot accept\n" +
                 "extraneous leading spaces.")

  Yes:
  long_string = ("And this too is fine if you cannot accept\n"
                 "extraneous leading spaces.")

  Yes:
  import textwrap

  long_string = textwrap.dedent("""\
      This is also fine, because textwrap.dedent()
      will collapse common leading spaces in each line.""")



3.10.1 Logging

For logging functions that expect a pattern-string (with %-placeholders) as
their first argument: Always call them with a string literal (not an f-string!)
as their first argument with pattern-parameters as subsequent arguments. Some
logging implementations collect the unexpanded pattern-string as a queryable
field. It also prevents spending time rendering a message that no logger is
configured to output.

Yes:
  import tensorflow as tf
  logger = tf.get_logger()
  logger.info('TensorFlow Version is: %s', tf.__version__)

Yes:
  import os
  from absl import logging

  logging.info('Current $PAGER is: %s', os.getenv('PAGER', default=''))

  homedir = os.getenv('HOME')
  if homedir is None or not os.access(homedir, os.W_OK):
      logging.error('Cannot write to home directory, $HOME=%r', homedir)

No:
  import os
  from absl import logging

  logging.info('Current $PAGER is:')
  logging.info(os.getenv('PAGER', default=''))

  homedir = os.getenv('HOME')
  if homedir is None or not os.access(homedir, os.W_OK):
      logging.error(f'Cannot write to home directory, $HOME={homedir!r}')
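
To make the two benefits concrete, here is a small sketch using the standard library logging module rather than absl or TensorFlow (a substitution made so the example has no third-party dependencies). The handler class and the logger name are invented for this illustration:

```python
import logging


class _CapturingHandler(logging.Handler):
    """Hypothetical handler that keeps emitted records for inspection."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger('style_guide_logging_demo')
logger.setLevel(logging.INFO)
logger.propagate = False  # Keep the demo isolated from the root logger.
handler = _CapturingHandler()
logger.addHandler(handler)

logger.info('Current $PAGER is: %s', 'less')
logger.debug('Not emitted: %s', 'DEBUG is below the INFO threshold')

record = handler.records[0]
pattern = record.msg            # The unexpanded pattern-string.
arguments = record.args         # The pattern-parameters, kept separately.
rendered = record.getMessage()  # Rendering happens only on request.
```

Because the pattern and its arguments travel separately, a handler can group messages by pattern, and the debug call above is dropped before any string formatting takes place.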



3.10.2 Error Messages

Error messages (such as: message strings on exceptions like ValueError, or
messages shown to the user) should follow three guidelines:

  1. The message needs to precisely match the actual error condition.

  2. Interpolated pieces need to always be clearly identifiable as such.

  3. They should allow simple automated processing (e.g. grepping).

Yes:
  if not 0 <= p <= 1:
      raise ValueError(f'Not a probability: {p!r}')

  try:
      os.rmdir(workdir)
  except OSError as error:
      logging.warning('Could not remove directory (reason: %r): %r',
                      error, workdir)

No:
  if p < 0 or p > 1:  # PROBLEM: also false for float('nan')!
      raise ValueError(f'Not a probability: {p!r}')

  try:
      os.rmdir(workdir)
  except OSError:
      # PROBLEM: Message makes an assumption that might not be true:
      # Deletion might have failed for some other reason, misleading
      # whoever has to debug this.
      logging.warning('Directory already was deleted: %s', workdir)

  try:
      os.rmdir(workdir)
  except OSError:
      # PROBLEM: The message is harder to grep for than necessary, and
      # not universally non-confusing for all possible values of `workdir`.
      # Imagine someone calling a library function with such code
      # using a name such as workdir = 'deleted'. The warning would read:
      # "The deleted directory could not be deleted."
      logging.warning('The %s directory could not be deleted.', workdir)
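
The float('nan') remark in the first "No" case can be checked directly. In this sketch, check_probability is a hypothetical helper name; the point is only that every ordering comparison with NaN is false, so the two conditions are not equivalent:

```python
def check_probability(p: float) -> float:
    """Returns p unchanged, or raises ValueError if p is not in [0, 1]."""
    if not 0 <= p <= 1:
        raise ValueError(f'Not a probability: {p!r}')
    return p


nan = float('nan')

# The "No" condition: both comparisons are false for NaN, so the check
# would let NaN through without raising.
naive_check_rejects_nan = nan < 0 or nan > 1

# The "Yes" condition: 0 <= nan is false, so "not ..." is true and the
# value is rejected.
precise_check_rejects_nan = not 0 <= nan <= 1
```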




3.11 Files, Sockets, and similar Stateful Resources

Explicitly close files and sockets when done with them. This rule naturally
extends to closeable resources that internally use sockets, such as database
connections, and also other resources that need to be closed down in a similar
fashion. To name only a few examples, this also includes
mmap mappings,
h5py File objects, and
matplotlib.pyplot figure windows.

Leaving files, sockets or other such stateful objects open unnecessarily has
many downsides:

  • They may consume limited system resources, such as file descriptors. Code
    that deals with many such objects may exhaust those resources unnecessarily
    if they’re not returned to the system promptly after use.
  • Holding files open may prevent other actions such as moving or deleting
    them, or unmounting a filesystem.
  • Files and sockets that are shared throughout a program may inadvertently be
    read from or written to after logically being closed. If they are actually
    closed, attempts to read or write from them will raise exceptions, making
    the problem known sooner.

Furthermore, while files and sockets (and some similarly behaving resources) are
automatically closed when the object is destructed, coupling the lifetime of the
object to the state of the resource is poor practice:

  • There are no guarantees as to when the runtime will actually invoke the
    __del__ method. Different Python implementations use different memory
    management techniques, such as delayed garbage collection, which may
    increase the object’s lifetime arbitrarily and indefinitely.
  • Unexpected references to the file, e.g. in globals or exception tracebacks,
    may keep it around longer than intended.

Relying on finalizers to do automatic cleanup that has observable side effects
has been rediscovered over and over again to lead to major problems, across many
decades and multiple languages (see e.g.
this article
for Java).

The preferred way to manage files and similar resources is using the
with statement:

with open("hello.txt") as hello_file:
    for line in hello_file:
        print(line)

For file-like objects that do not support the with statement, use
contextlib.closing():

import contextlib
import urllib.request

with contextlib.closing(urllib.request.urlopen("http://www.python.org/")) as front_page:
    for line in front_page:
        print(line)
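
The same pattern can be tried without network access. LegacyConnection below is a hypothetical stand-in for any object that offers close() but does not implement the with protocol itself:

```python
import contextlib


class LegacyConnection:
    """Hypothetical resource with close() but no __enter__/__exit__."""

    def __init__(self):
        self.closed = False

    def read(self) -> str:
        if self.closed:
            raise ValueError('I/O operation on closed connection.')
        return 'payload'

    def close(self) -> None:
        self.closed = True


with contextlib.closing(LegacyConnection()) as connection:
    data = connection.read()

# close() has been called by now, even if the block had raised.
connection_was_closed = connection.closed
```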

In rare cases where context-based resource management is infeasible, code
documentation must explain clearly how resource lifetime is managed.


3.12 TODO Comments

Use TODO comments for code that is temporary, a short-term solution, or
good-enough but not perfect.

A TODO comment begins with the string TODO in all caps and a parenthesized
name, e-mail address, or other identifier
of the person or issue with the best context about the problem. This is followed
by an explanation of what there is to do.

The purpose is to have a consistent TODO format that can be searched to find
out how to get more details. A TODO is not a commitment that the person
referenced will fix the problem. Thus when you create a
TODO, it is almost always your name
that is given.

# TODO(kl@gmail.com): Use a "*" here for string repetition.
# TODO(Zeke) Change this to use relations.

If your TODO is of the form “At a future date do something” make sure that you
either include a very specific date (“Fix by November 2009”) or a very specific
event (“Remove this code when all clients can handle XML responses.”).


3.13 Imports formatting

Imports should be on separate lines; there are
exceptions for typing imports.

E.g.:

Yes: import os
     import sys
     from typing import Mapping, Sequence

No:  import os, sys

Imports are always put at the top of the file, just after any module comments
and docstrings and before module globals and constants. Imports should be
grouped from most generic to least generic:

  1. Python future import statements. For example:

    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function

    See above for more information about those.

  2. Python standard library imports. For example:

    import sys

  3. third-party module or package imports. For example:

    import tensorflow as tf

  4. Code repository sub-package imports. For example:

    from otherproject.ai import mind

  5. Deprecated: application-specific imports that are part of the same
    top level sub-package as this file. For example:

    from myproject.backend.hgwells import time_machine

    You may find older Google Python Style code doing this, but it is no
    longer required. New code is encouraged not to bother with this. Simply
    treat application-specific sub-package imports the same as other
    sub-package imports.

Within each grouping, imports should be sorted lexicographically, ignoring case,
according to each module’s full package path (the path in from path import ...). Code may optionally place a blank line between import sections.

import collections
import queue
import sys

from absl import app
from absl import flags
import bs4
import cryptography
import tensorflow as tf

from book.genres import scifi
from myproject.backend import huxley
from myproject.backend.hgwells import time_machine
from myproject.backend.state_machine import main_loop
from otherproject.ai import body
from otherproject.ai import mind
from otherproject.ai import soul

# Older style code may have these imports down here instead:
#from myproject.backend.hgwells import time_machine
#from myproject.backend.state_machine import main_loop


3.14 Statements

Generally only one statement per line.

However, you may put the result of a test on the same line as the test only if
the entire statement fits on one line. In particular, you can never do so with
try/except since the try and except can’t both fit on the same line, and
you can only do so with an if if there is no else.

Yes:

  if foo: bar(foo)

No:

  if foo: bar(foo)
  else:   baz(foo)

  try:               bar(foo)
  except ValueError: baz(foo)

  try:
      bar(foo)
  except ValueError: baz(foo)




3.15 Accessors

If an accessor function would be trivial, you should use public variables
instead of accessor functions to avoid the extra cost of function calls in
Python. When more functionality is added you can use property to keep the
syntax consistent.
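
As a rough sketch of that progression (Circle and ValidatedCircle are hypothetical classes written for this illustration), a trivial attribute stays a plain public variable, and validation added later goes behind a property so that callers keep the same attribute syntax:

```python
class Circle:
    """Trivial access: a plain public attribute is enough."""

    def __init__(self, radius: float):
        self.radius = radius


class ValidatedCircle:
    """Same attribute syntax, with validation added later via property."""

    def __init__(self, radius: float):
        self._radius = 0.0
        self.radius = radius  # Goes through the setter below.

    @property
    def radius(self) -> float:
        """The radius of the circle; never negative."""
        return self._radius

    @radius.setter
    def radius(self, value: float) -> None:
        if value < 0:
            raise ValueError(f'Not a valid radius: {value!r}')
        self._radius = value


circle = ValidatedCircle(2.0)
circle.radius = 3.5  # Callers still use plain attribute syntax.
```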

On the other hand, if access is more complex, or the cost of accessing the
variable is significant, you should use function calls (following the
Naming guidelines) such as get_foo() and set_foo(). If the
past behavior allowed access through a property, do not bind the new accessor
functions to the property. Any code still attempting to access the variable by
the old method should break visibly so they are made aware of the change in
complexity.


3.16 Naming

module_name, package_name, ClassName, method_name, ExceptionName,
function_name, GLOBAL_CONSTANT_NAME, global_var_name, instance_var_name,
function_parameter_name, local_var_name.

Function names, variable names, and filenames should be descriptive; eschew
abbreviation. In particular, do not use abbreviations that are ambiguous or
unfamiliar to readers outside your project, and do not abbreviate by deleting
letters within a word.

Always use a .py filename extension. Never use dashes.


3.16.1 Names to Avoid

  • single character names, except for specifically allowed cases:

    • counters or iterators (e.g. i, j, k, v, et al.)
    • e as an exception identifier in try/except statements.
    • f as a file handle in with statements

    Please be mindful not to abuse single-character naming. Generally speaking,
    descriptiveness should be proportional to the name’s scope of visibility.
    For example, i might be a fine name for a 5-line code block but within
    multiple nested scopes, it is likely too vague.

  • dashes (-) in any package/module name

  • __double_leading_and_trailing_underscore__ names (reserved by Python)

  • offensive terms

  • names that needlessly include the type of the variable (for example:
    id_to_name_dict)


3.16.2 Naming Conventions

  • “Internal” means internal to a module, or protected or private within a
    class.

  • Prepending a single underscore (_) has some support for protecting module
    variables and functions (linters will flag protected member access).

  • Prepending a double underscore (__ aka “dunder”) to an instance variable
    or method effectively makes the variable or method private to its class
    (using name mangling); we discourage its use as it impacts readability and
    testability, and isn’t really private. Prefer a single underscore.

  • Place related classes and top-level functions together in a
    module.
    Unlike Java, there is no need to limit yourself to one class per module.

  • Use CapWords for class names, but lower_with_under.py for module names.
    Although there are some old modules named CapWords.py, this is now
    discouraged because it’s confusing when the module happens to be named after
    a class. (“wait – did I write import StringIO or from StringIO import StringIO?”)

  • Underscores may appear in unittest method names starting with test to
    separate logical components of the name, even if those components use
    CapWords. One possible pattern is test<MethodUnderTest>_<state>; for
    example testPop_EmptyStack is okay. There is no One Correct Way to name
    test methods.


3.16.3 File Naming

Python filenames must have a .py extension and must not contain dashes (-).
This allows them to be imported and unittested. If you want an executable to be
accessible without the extension, use a symbolic link or a simple bash wrapper
containing exec "$0.py" "$@".


3.16.4 Guidelines derived from Guido’s Recommendations

Type                        Public              Internal
Packages                    lower_with_under
Modules                     lower_with_under    _lower_with_under
Classes                     CapWords            _CapWords
Exceptions                  CapWords
Functions                   lower_with_under()  _lower_with_under()
Global/Class Constants      CAPS_WITH_UNDER     _CAPS_WITH_UNDER
Global/Class Variables      lower_with_under    _lower_with_under
Instance Variables          lower_with_under    _lower_with_under (protected)
Method Names                lower_with_under()  _lower_with_under() (protected)
Function/Method Parameters  lower_with_under
Local Variables             lower_with_under


3.16.5 Mathematical Notation

For mathematically heavy code, short variable names that would otherwise violate
the style guide are preferred when they match established notation in a
reference paper or algorithm. When doing so, reference the source of all naming
conventions in a comment or docstring or, if the source is not accessible,
clearly document the naming conventions. Prefer PEP8-compliant
descriptive_names for public APIs, which are much more likely to be
encountered out of context.

3.17 Main

In Python, pydoc as well as unit tests require modules to be importable. If a
file is meant to be used as an executable, its main functionality should be in a
main() function, and your code should always check if __name__ == '__main__'
before executing your main program, so that it is not executed when the module
is imported.

When using absl, use app.run:

from absl import app
...

def main(argv: Sequence[str]):
    # process non-flag arguments
    ...

if __name__ == '__main__':
    app.run(main)

Otherwise, use:

def main():
    ...

if __name__ == '__main__':
    main()

All code at the top level will be executed when the module is imported. Be
careful not to call functions, create objects, or perform other operations that
should not be executed when the file is being pydoced.


3.18 Function length

Prefer small and focused functions.

We recognize that long functions are sometimes appropriate, so no hard limit is
placed on function length. If a function exceeds about 40 lines, think about
whether it can be broken up without harming the structure of the program.

Even if your long function works perfectly now, someone modifying it in a few
months may add new behavior. This could result in bugs that are hard to find.
Keeping your functions short and simple makes it easier for other people to read
and modify your code.

You could find long and complicated functions when working with some
code. Do not be intimidated by modifying existing code: if working with such a
function proves to be difficult, you find that errors are hard to debug, or you
want to use a piece of it in several different contexts, consider breaking up
the function into smaller and more manageable pieces.


3.19 Type Annotations



3.19.1 General Rules

  • Familiarize yourself with
    PEP-484.
  • In methods, only annotate self, or cls if it is necessary for proper
    type information. e.g., @classmethod def create(cls: Type[T]) -> T: return cls()
  • If any other variable or a returned type should not be expressed, use Any.
  • You are not required to annotate all the functions in a module.
    • At least annotate your public APIs.
    • Use judgment to get to a good balance between safety and clarity on the
      one hand, and flexibility on the other.
    • Annotate code that is prone to type-related errors (previous bugs or
      complexity).
    • Annotate code that is hard to understand.
    • Annotate code as it becomes stable from a types perspective. In many
      cases, you can annotate all the functions in mature code without losing
      too much flexibility.


3.19.2 Line Breaking

Try to follow the existing indentation rules.

After annotating, many function signatures will become “one parameter per line”.

def my_method(self,
              first_var: int,
              second_var: Foo,
              third_var: Optional[Bar]) -> int:
    ...

Always prefer breaking between variables, and not, for example, between variable
names and type annotations. However, if everything fits on the same line, go for
it.

def my_method(self, first_var: int) -> int:
    ...

If the combination of the function name, the last parameter, and the return type
is too long, indent by 4 in a new line.

def my_method(
        self, first_var: int) -> Tuple[MyLongType1, MyLongType1]:
    ...

When the return type does not fit on the same line as the last parameter, the
preferred way is to indent the parameters by 4 on a new line and align the
closing parenthesis with the def.

Yes:
def my_method(
        self, other_arg: Optional[MyLongType]
) -> Dict[OtherLongType, MyLongType]:
    ...

pylint
allows you to move the closing parenthesis to a new line and align with the
opening one, but this is less readable.

No:
def my_method(self,
              other_arg: Optional[MyLongType]
             ) -> Dict[OtherLongType, MyLongType]:
    ...

As in the examples above, prefer not to break types. However, sometimes they are
too long to be on a single line (try to keep sub-types unbroken).

def my_method(
        self,
        first_var: Tuple[List[MyLongType1],
                         List[MyLongType2]],
        second_var: List[Dict[
            MyLongType3, MyLongType4]]) -> None:
    ...

If a single name and type is too long, consider using an
alias for the type. The last resort is to break after the
colon and indent by 4.

Yes:
def my_function(
        long_variable_name:
            long_module_name.LongTypeName,
) -> None:
    ...

No:
def my_function(
        long_variable_name: long_module_name.
            LongTypeName,
) -> None:
    ...


3.19.3 Forward Declarations

If you need to use a class name from the same module that is not yet defined –
for example, if you need the class inside the class declaration, or if you use a
class that is defined below – use a string for the class name.

class MyClass:

    def __init__(self,
                 stack: List["MyClass"]) -> None:
        ...


3.19.4 Default Values

As per
PEP-008, use
spaces around the = only for arguments that have both a type annotation and
a default value.

Yes:
def func(a: int = 0) -> int:
    ...

No:
def func(a:int=0) -> int:
    ...



3.19.5 NoneType

In the Python type system, NoneType is a “first class” type, and for typing
purposes, None is an alias for NoneType. If an argument can be None, it
has to be declared! You can use Union, but if there is only one other type,
use Optional.

Use explicit Optional instead of implicit Optional. Earlier versions of PEP
484 allowed a: str = None to be interpreted as a: Optional[str] = None, but
that is no longer the preferred behavior.

Yes:
def func(a: Optional[str], b: Optional[str] = None) -> str:
    ...
def multiple_nullable_union(a: Union[None, str, int]) -> str:
    ...

No:
def nullable_union(a: Union[None, str]) -> str:
    ...
def implicit_optional(a: str = None) -> str:
    ...




3.19.6 Type Aliases

You can declare aliases of complex types. The name of an alias should be
CapWorded. If the alias is used only in this module, it should be _Private.

For example, if the name of the module together with the name of the type is too
long:

_ShortName = module_with_long_name.TypeWithLongName
ComplexMap = Mapping[str, List[Tuple[int, int]]]

Other examples are complex nested types and multiple return variables from a
function (as a tuple).
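
For the tuple case, a small sketch might look like this. The names _MinMax, RangeMap, and min_max are hypothetical, chosen only to show a private alias reused inside a public one:

```python
from typing import List, Mapping, Tuple

# Private alias: only used inside this module, hence the leading underscore.
_MinMax = Tuple[int, int]

# Public alias for a nested type that would be noisy to repeat.
RangeMap = Mapping[str, List[_MinMax]]


def min_max(values: List[int]) -> _MinMax:
    """Returns the smallest and largest value as a (min, max) tuple."""
    return min(values), max(values)


ranges: RangeMap = {'sample': [min_max([3, 1, 2])]}
```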



3.19.7 Ignoring Types

You can disable type checking on a line with the special comment # type: ignore.

pytype has a disable option for specific errors (similar to lint):

# pytype: disable=attribute-error



3.19.8 Typing Variables

If an internal variable has a type that is hard or impossible to infer, you can
specify its type in a couple of ways.


Type Comments:
Use a # type: comment on the end of the line.

a = SomeUndecoratedFunction()  # type: Foo


Annotated Assignments:
Use a colon and type between the variable name and value, as with function
arguments.

a: Foo = SomeUndecoratedFunction()



3.19.9 Tuples vs Lists

Typed lists can only contain objects of a single type. Typed tuples can either
have a single repeated type or a set number of elements with different types.
The latter is commonly used as the return type from a function.

a = [1, 2, 3]  # type: List[int]
b = (1, 2, 3)  # type: Tuple[int, ...]
c = (1, "2", 3.5)  # type: Tuple[int, str, float]




3.19.10 TypeVars

The Python type system has
generics. The factory
function TypeVar is a common way to use them.

Example:

from typing import List, TypeVar
T = TypeVar("T")
...
def next(l: List[T]) -> T:
    return l.pop()

A TypeVar can be constrained:

AddableType = TypeVar("AddableType", int, float, str)
def add(a: AddableType, b: AddableType) -> AddableType:
    return a + b
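
To put the two kinds of type variable side by side, here is a small runnable sketch. The names last_item, double, and Number are hypothetical and not taken from the guide: T is unconstrained, while Number is constrained to int or float.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")  # May stand for any type.
Number = TypeVar("Number", int, float)  # Must be exactly int or float.


def last_item(items: Sequence[T]) -> T:
    """Returns the final element; the result type follows the element type."""
    return items[-1]


def double(value: Number) -> Number:
    """Returns twice the value, keeping int as int and float as float."""
    return value * 2
```

A checker would infer int for last_item([1, 2, 3]) and str for last_item("abc"), and would reject double("x") because str is not one of the listed constraints.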

A common predefined type variable in the typing module is AnyStr. Use it for
multiple annotations that can be bytes or unicode and must all be the same
type.

from typing import AnyStr
def check_length(x: AnyStr) -> AnyStr:
    if len(x) <= 42:
        return x
    raise ValueError()



3.19.11 String types

The proper type for annotating strings depends on what versions of Python the
code is intended for.

Prefer to use str, though Text is also acceptable. Be consistent in using
one or the other. For code that deals with binary data, use bytes. For Python
2 compatible code that processes text data (str or unicode in Python 2,
str in Python 3), use Text.

def deals_with_text_data_in_py3(x: str) -> str:
    ...
def deals_with_binary_data(x: bytes) -> bytes:
    ...
def py2_compatible_text_data_processor(x: Text) -> Text:
    ...

In some uncommon Python 2 compatibility cases, str may make sense instead of
Text, typically to aid compatibility when the return types aren’t the same
between Python 2 and Python 3. Never use unicode as it doesn’t exist in
Python 3. The reason this discrepancy exists is that str means something
different in Python 2 than in Python 3.

No:

def py2_code(x: str) -> unicode:
    ...

If the type can be either bytes or text, use Union, with the appropriate text
type.

from typing import Text, Union
...
def py3_only(x: Union[bytes, str]) -> Union[bytes, str]:
    ...
def py2_compatible(x: Union[bytes, Text]) -> Union[bytes, Text]:
    ...

If all the string types of a function are always the same, for example if the
return type is the same as the argument type in the code above, use
AnyStr.
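
A minimal sketch of that AnyStr case, using a hypothetical function join_with_separator: every argument and the return value must be all str or all bytes, never a mix.

```python
from typing import AnyStr


def join_with_separator(left: AnyStr, right: AnyStr, sep: AnyStr) -> AnyStr:
    """Joins two strings of the same kind with a separator of that kind."""
    return left + sep + right
```

With Union[bytes, str] a checker would have to accept a str argument alongside a bytes one; AnyStr rules that combination out.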



3.19.12 Imports For Typing

For classes from the typing module, always import the class itself. You are
explicitly allowed to import multiple specific classes on one line from the
typing module. Ex:

from typing import Any, Dict, Optional

Given that this way of importing from typing adds items to the local
namespace, any names in typing should be treated similarly to keywords, and
not be defined in your Python code, typed or not. If there is a collision
between a type and an existing name in a module, import it using import x as y.

from typing import Any as AnyType


3.19.13 Conditional Imports

Use conditional imports only in exceptional cases where the additional imports
needed for type checking must be avoided at runtime. This pattern is
discouraged; alternatives such as refactoring the code to allow top level
imports should be preferred.

Imports that are needed only for type annotations can be placed within an if TYPE_CHECKING: block.

  • Conditionally imported types need to be referenced as strings, to be forward
    compatible with Python 3.6 where the annotation expressions are actually
    evaluated.
  • Only entities that are used solely for typing should be defined here; this
    includes aliases. Otherwise it will be a runtime error, as the module will
    not be imported at runtime.
  • The block should be right after all the normal imports.
  • There should be no empty lines in the typing imports list.
  • Sort this list as if it were a regular imports list.

import typing
if typing.TYPE_CHECKING:
    import sketch
def f(x: "sketch.Sketch"): ...
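
Since sketch is not an importable module, here is a runnable variant of the same pattern. It assumes the standard-library decimal module as a stand-in for an import that must be avoided at runtime; to_cents is a hypothetical function.

```python
import typing

if typing.TYPE_CHECKING:
    # Seen only by the type checker; this block is skipped at runtime.
    import decimal


def to_cents(amount: "decimal.Decimal") -> int:
    """Converts an amount to whole cents, truncating any remainder."""
    # The annotation is a string, so decimal need not exist at runtime.
    return int(amount * 100)
```

Because typing.TYPE_CHECKING is False at runtime, the import never executes here, and the quoted annotation is stored as plain text rather than evaluated.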



3.19.14 Circular Dependencies

Circular dependencies that are caused by typing are code smells. Such code is a
good candidate for refactoring. Although technically it is possible to keep
circular dependencies, various build systems will not let you do so
because each module has to depend on the other.

Replace modules that create circular dependency imports with Any. Set an
alias with a meaningful name, and use the real type name from
this module (any attribute of Any is Any). Alias definitions should be separated
from the last import by one line.

from typing import Any

some_mod = Any  # some_mod.py imports this module.
...

def my_method(self, var: "some_mod.SomeType") -> None:
    ...



3.19.15 Generics

When annotating, prefer to specify type parameters for generic types; otherwise,
the generics’ parameters will be assumed to be Any.

Yes:
def get_names(employee_ids: List[int]) -> Dict[int, Any]:
    ...

No:
# These are both interpreted as get_names(employee_ids: List[Any]) -> Dict[Any, Any]
def get_names(employee_ids: list) -> Dict:
    ...

def get_names(employee_ids: List) -> Dict:
    ...

If the best type parameter for a generic is Any, make it explicit, but
remember that in many cases TypeVar might be more
appropriate:

def get_names(employee_ids: List[Any]) -> Dict[Any, str]:
    """Returns a mapping from employee ID to employee name for given IDs."""

T = TypeVar('T')
def get_names(employee_ids: List[T]) -> Dict[T, str]:
    """Returns a mapping from employee ID to employee name for given IDs."""

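A runnable sketch of the TypeVar version of get_names follows. The directory parameter is an assumption added here so the function has something to look names up in; it does not appear in the guide.

```python
from typing import Dict, List, TypeVar

T = TypeVar("T")


def get_names(employee_ids: List[T], directory: Dict[T, str]) -> Dict[T, str]:
    """Returns a mapping from employee ID to employee name for given IDs."""
    # directory is a hypothetical lookup table; unknown IDs are skipped.
    return {eid: directory[eid] for eid in employee_ids if eid in directory}
```

The key type of the result follows the element type of employee_ids, which List[Any] and Dict[Any, str] would not express.
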
4 Parting Words

BE CONSISTENT.

If you’re editing code, take a few minutes to look at the code around you and
determine its style. If they use spaces around all their arithmetic operators,
you should too. If their comments have little boxes of hash marks around them,
make your comments have little boxes of hash marks around them too.

The point of having style guidelines is to have a common vocabulary of coding so
people can concentrate on what you’re saying rather than on how you’re saying
it. We present global style rules here so people know the vocabulary, but local
style is also important. If code you add to a file looks drastically different
from the existing code around it, it throws readers out of their rhythm when
they go to read it. Avoid this.