Code Sample of Book "Effective Python: 59 Specific Ways to Write Better Python" by Brett Slatkin.
-
- There are two major versions of Python still in active use: Python 2 and Python 3.
-
- There are multiple popular runtimes for Python: CPython, Jython, IronPython, PyPy, etc.
-
- Be sure that the command-line executable for running Python on your system is the version you expect it to be.
-
- Prefer Python 3 for your next project because that is the primary focus of the Python community.
-
- Always follow the PEP 8 style guide when writing Python code.
-
- Sharing a common style with the larger Python community facilitates collaboration with others.
-
- Using a consistent style makes it easier to modify your own code later.
-
- In Python 3, bytes contains sequences of 8-bit values, str contains sequences of Unicode characters. bytes and str instances can't be used together with operators (like > or +).
-
- In Python 2, str contains sequences of 8-bit values, unicode contains sequences of Unicode characters. str and unicode can be used together with operators if the str only contains 7-bit ASCII characters.
-
- Use helper functions to ensure that the inputs you operate on are the type of character sequence you expect (8-bit values, UTF-8 encoded characters, Unicode characters, etc.)
-
- If you want to read or write binary data to/from a file, always open the file using a binary mode (like 'rb' or 'wb').
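
The helper-function advice above can be sketched as follows. This is a minimal Python 3 sketch; the names to_str and to_bytes are illustrative, and UTF-8 is assumed as the encoding.

```python
def to_str(bytes_or_str):
    """Return a str, decoding UTF-8 bytes if needed (UTF-8 is an assumption)."""
    if isinstance(bytes_or_str, bytes):
        return bytes_or_str.decode('utf-8')
    return bytes_or_str


def to_bytes(bytes_or_str):
    """Return bytes, encoding a str as UTF-8 if needed."""
    if isinstance(bytes_or_str, str):
        return bytes_or_str.encode('utf-8')
    return bytes_or_str
```

Calling these at the boundaries of a program lets the core logic work with a single, known type.
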
-
- Python's syntax makes it all too easy to write single-line expressions that are overly complicated and difficult to read.
-
- Move complex expressions into helper functions, especially if you need to use the same logic repeatedly.
-
- The if/else expression provides a more readable alternative to using Boolean operators like or and and in expressions.
-
- Avoid being verbose: Don't supply 0 for the start index or the length of the sequence for the end index.
-
- Slicing is forgiving of start or end indexes that are out of bounds, making it easy to express slices on the front or back boundaries of a sequence (like a[:20] or a[-20:]).
-
- Assigning to a list slice will replace that range in the original sequence with what's referenced even if their lengths are different.
-
- Specifying start, end, and stride in a slice can be extremely confusing.
-
- Prefer using positive stride values in slices without start or end indexes. Avoid negative stride values if possible.
-
- Avoid using start, end, and stride together in a single slice. If you need all three parameters, consider doing two assignments (one to slice, another to stride) or using islice from the itertools built-in module.
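
The slicing notes above can be illustrated with a short sketch. The list contents and variable names are made up for the example.

```python
from itertools import islice

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# Out-of-range bounds are tolerated when slicing (but not when indexing).
first_twenty = letters[:20]
last_three = letters[-3:]

# Slice assignment may change the length of the list.
shortened = letters[:]          # copy so the original stays intact
shortened[2:7] = [99, 22]

# Stride first, then slice, instead of writing letters[2:-2:2].
evens = letters[::2]
middle = evens[1:-1]

# islice takes start, stop and step, but rejects negative values.
same_middle = list(islice(letters, 2, 6, 2))
```
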
-
- List comprehensions are clearer than the map and filter built-in functions because they don't require extra lambda expressions.
-
- List comprehensions allow you to easily skip items from the input list, a behavior map doesn't support without help from filter.
-
- Dictionaries and sets also support comprehension expressions.
-
- List comprehensions support multiple levels of loops and multiple conditions per loop level.
-
- List comprehensions with more than two expressions are very difficult to read and should be avoided.
-
- List comprehensions can cause problems for large inputs by using too much memory.
-
- Generator expressions avoid memory issues by producing outputs one at a time as an iterator.
-
- Generator expressions can be composed by passing the iterator from one generator expression into the for subexpression of another.
-
- Generator expressions execute very quickly when chained together.
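
As a rough illustration of chaining generator expressions, the sketch below feeds one generator into another. The input list stands in for a large data source and is purely hypothetical.

```python
lines = ['alpha', 'be', 'gamma', '', 'delta']  # stand-in for a large input

# Nothing is computed until a value is requested.
lengths = (len(line) for line in lines)

# The second generator pulls from the first, one item at a time.
squares = ((n, n * n) for n in lengths)

first = next(squares)   # advances both generators by one step
rest = list(squares)    # drains whatever remains
```

Note that the iterators are stateful: once drained, they produce nothing more.
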
-
- enumerate provides concise syntax for looping over an iterator and getting the index of each item from the iterator as you go.
-
- Prefer enumerate instead of looping over a range and indexing into a sequence.
-
- You can supply a second parameter to enumerate to specify the number from which to begin counting (zero is default).
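
A small sketch of enumerate with a starting count, using a made-up list:

```python
flavors = ['vanilla', 'chocolate', 'pecan']

# Count from 1 instead of the default 0.
numbered = [(i, flavor) for i, flavor in enumerate(flavors, 1)]
```
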
-
- The zip built-in function can be used to iterate over multiple iterators in parallel.
-
- In Python 3, zip is a lazy generator that produces tuples. In Python 2, zip returns the full result as a list of tuples.
-
- zip truncates its outputs silently if you supply it with iterators of different lengths.
-
- The zip_longest function from the itertools built-in module lets you iterate over multiple iterators in parallel regardless of their lengths (see Item 46: Use built-in algorithms and data structures).
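
The truncation behavior of zip, and the zip_longest alternative, might look like this in Python 3. The data and the fill value are illustrative assumptions.

```python
from itertools import zip_longest

names = ['Cecilia', 'Lise', 'Marie']
counts = [7, 4]                      # deliberately one item short

# zip stops at the shortest input without any warning.
pairs = list(zip(names, counts))

# zip_longest keeps going and pads missing values (fillvalue chosen here).
padded = list(zip_longest(names, counts, fillvalue=0))
```
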
-
- Python has special syntax that allows else blocks to immediately follow for and while loop interior blocks.
-
- The else block after a loop only runs if the loop body did not encounter a break statement.
-
- Avoid using else blocks after loops because their behavior isn't intuitive and can be confusing.
-
- The try/finally compound statement lets you run cleanup code regardless of whether exceptions were raised in the try block.
-
- The else block helps you minimize the amount of code in try blocks and visually distinguish the success case from the try/except blocks.
-
- An else block can be used to perform additional actions after a successful try block but before common cleanup in a finally block.
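
One way to combine try, except, else, and finally is sketched below. The function name, the events list, and the choice to translate ValueError into KeyError are assumptions made for the example.

```python
import json

events = []  # records that cleanup ran; illustrative only


def load_value(text, key):
    try:
        data = json.loads(text)      # only the risky call lives in try
    except ValueError as exc:        # JSON decoding errors are ValueErrors
        raise KeyError(key) from exc
    else:
        return data[key]             # lookup errors propagate untouched
    finally:
        events.append('done')        # runs on success and on failure
```
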
-
- Functions that return None to indicate special meaning are error prone because None and other values (e.g., zero, the empty string) all evaluate to False in conditional expressions.
-
- Raise exceptions to indicate special situations instead of returning None. Expect the calling code to handle exceptions properly when they are documented.
-
- Closure functions can refer to variables from any of the scopes in which they were defined.
-
- By default, closures can't affect enclosing scopes by assigning variables.
-
- In Python 3, use the nonlocal statement to indicate when a closure can modify a variable in its enclosing scopes.
-
- In Python 2, use a mutable value (like a single-item list) to work around the lack of the nonlocal statement.
-
- Avoid using nonlocal statements for anything beyond simple functions.
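
A minimal Python 3 sketch of a closure that modifies a variable in its enclosing scope with nonlocal. The function and variable names are illustrative.

```python
def sort_priority(numbers, group):
    """Sort numbers in place, group members first; report if any were seen."""
    found = False

    def helper(x):
        nonlocal found           # without this, assignment creates a new local
        if x in group:
            found = True
            return (0, x)
        return (1, x)

    numbers.sort(key=helper)
    return found
```
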
-
- Using generators can be clearer than the alternative of returning lists of accumulated results.
-
- The iterator returned by a generator produces the set of values passed to yield expressions within the generator function's body.
-
- Generators can produce a sequence of outputs for arbitrarily large inputs because their working memory doesn't include all inputs and outputs.
-
- Beware of functions that iterate over input arguments multiple times. If these arguments are iterators, you may see strange behavior and missing values.
-
- Python's iterator protocol defines how containers and iterators interact with the iter and next built-in functions, for loops, and related expressions.
-
- You can easily define your own iterable container type by implementing the __iter__ method as a generator.
-
- You can detect that a value is an iterator (instead of a container) if calling iter on it twice produces the same result, which can then be progressed with the next built-in function.
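
The iterator notes above can be sketched as follows. The Visits class and the normalize function are hypothetical names; the check relies on iter returning the iterator itself when it is given an iterator.

```python
class Visits:
    """Iterable container: each call to __iter__ starts a fresh pass."""

    def __init__(self, values):
        self.values = values

    def __iter__(self):
        for value in self.values:
            yield value


def normalize(numbers):
    # An iterator returns itself from iter(); a container returns a new one.
    if iter(numbers) is iter(numbers):
        raise TypeError('Must supply a container')
    total = sum(numbers)                       # first pass
    return [100 * value / total for value in numbers]   # second pass
```
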
-
- Functions can accept a variable number of positional arguments by using *args in the def statement.
-
- You can use the items from a sequence as the positional arguments for a function with the * operator.
-
- Using the * operator with a generator may cause your program to run out of memory and crash.
-
- Adding new positional parameters to functions that accept *args can introduce hard-to-find bugs.
-
- Function arguments can be specified by position or by keyword.
-
- Keywords make it clear what the purpose of each argument is when it would be confusing with only positional arguments.
-
- Keyword arguments with default values make it easy to add new behaviors to a function, especially when the function has existing callers.
-
- Optional keyword arguments should always be passed by keyword instead of by position.
-
- Keyword arguments make the intention of a function call more clear.
-
- Use keyword-only arguments to force callers to supply keyword arguments for potentially confusing functions, especially those that accept multiple Boolean flags.
-
- Python 3 supports explicit syntax for keyword-only arguments in functions.
-
- Python 2 can emulate keyword-only arguments for functions by using **kwargs and manually raising TypeError exceptions.
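
In Python 3, keyword-only arguments are declared after a bare * in the parameter list. The function below is a sketch with illustrative names.

```python
def safe_division(number, divisor, *,
                  ignore_overflow=False, ignore_zero_division=False):
    """Divide, optionally swallowing specific errors (flags are keyword-only)."""
    try:
        return number / divisor
    except OverflowError:
        if ignore_overflow:
            return 0
        raise
    except ZeroDivisionError:
        if ignore_zero_division:
            return float('inf')
        raise
```

Passing the flags by position, as in safe_division(1, 0, False, True), raises TypeError.
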
-
- Avoid making dictionaries with values that are other dictionaries or long tuples.
-
- Use namedtuple for lightweight, immutable data containers before you need the flexibility of a full class.
-
- Move your bookkeeping code to use multiple helper classes when your internal state dictionaries get complicated.
-
- Instead of defining and instantiating classes, functions are often all you need for simple interfaces between components in Python.
-
- References to functions and methods in Python are first class, meaning they can be used in expressions like any other type.
-
- The __call__ special method enables instances of a class to be called like plain Python functions.
-
- When you need a function to maintain state, consider defining a class that provides the __call__ method instead of defining a stateful closure (see Item 15: "Know how closures interact with variable scope").
-
- Python only supports a single constructor per class, the __init__ method.
-
- Use @classmethod to define alternative constructors for your classes.
-
- Use class method polymorphism to provide generic ways to build and connect concrete subclasses.
-
- Python's standard method resolution order (MRO) solves the problems of superclass initialization order and diamond inheritance.
-
- Always use the super built-in function to initialize parent classes.
-
- Avoid using multiple inheritance if mix-in classes can achieve the same outcome.
-
- Use pluggable behaviors at the instance level to provide per-class customization when mix-in classes may require it.
-
- Compose mix-ins to create complex functionality from simple behaviors.
-
- Private attributes aren't rigorously enforced by the Python compiler.
-
- Plan from the beginning to allow subclasses to do more with your internal APIs and attributes instead of locking them out by default.
-
- Use documentation of protected fields to guide subclasses instead of trying to force access control with private attributes.
-
- Only consider using private attributes to avoid naming conflicts with subclasses that are out of your control.
-
- Inherit directly from Python's container types (like list or dict) for simple use cases.
-
- Beware of the large number of methods required to implement custom container types correctly.
-
- Have your custom container types inherit from the interfaces defined in collections.abc to ensure that your classes match required interfaces and behaviors.
-
- Define new class interfaces using simple public attributes, and avoid set and get methods.
-
- Use @property to define special behavior when attributes are accessed on your objects, if necessary.
-
- Follow the rule of least surprise and avoid weird side effects in your @property methods.
-
- Ensure that @property methods are fast; do slow or complex work using normal methods.
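
A brief sketch of @property used for validation on an otherwise plain attribute. The Resistor class is an illustrative example.

```python
class Resistor:
    def __init__(self, ohms):
        self.ohms = ohms             # goes through the setter below

    @property
    def ohms(self):
        return self._ohms            # fast, no side effects

    @ohms.setter
    def ohms(self, ohms):
        if ohms <= 0:
            raise ValueError('ohms must be > 0')
        self._ohms = ohms
```
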
-
- Use @property to give existing instance attributes new functionality.
-
- Make incremental progress toward better data models by using @property.
-
- Consider refactoring a class and all call sites when you find yourself using @property too heavily.
-
- Reuse the behavior and validation of @property methods by defining your own descriptor classes.
-
- Use WeakKeyDictionary to ensure that your descriptor classes don't cause memory leaks.
-
- Don't get bogged down trying to understand exactly how __getattribute__ uses the descriptor protocol for getting and setting attributes.
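
A descriptor that stores per-instance state in a WeakKeyDictionary could look like the sketch below. The Grade and Exam names, the 0 to 100 range, and the default of 0 are assumptions for illustration.

```python
from weakref import WeakKeyDictionary


class Grade:
    """Reusable validated attribute; one descriptor object per class attribute."""

    def __init__(self):
        # Weak keys let Exam instances be garbage collected normally.
        self._values = WeakKeyDictionary()

    def __get__(self, instance, instance_type):
        if instance is None:
            return self              # accessed on the class itself
        return self._values.get(instance, 0)

    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._values[instance] = value


class Exam:
    math_grade = Grade()
    writing_grade = Grade()
```
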
-
- Use __getattr__ and __setattr__ to lazily load and save attributes for an object.
-
- Understand that __getattr__ only gets called once when accessing a missing attribute, whereas __getattribute__ gets called every time an attribute is accessed.
-
- Avoid infinite recursion in __getattribute__ and __setattr__ by using methods from super() (i.e., the object class) to access instance attributes directly.
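
The difference between the two hooks can be sketched as follows. Both classes and the misses counter are hypothetical and exist only to make the behavior observable.

```python
class LazyRecord:
    def __init__(self):
        self.misses = 0

    def __getattr__(self, name):
        # Runs only when normal attribute lookup fails.
        self.misses += 1
        value = 'Value for %s' % name
        setattr(self, name, value)   # later lookups find it directly
        return value


class DictionaryRecord:
    def __init__(self, data):
        self._data = data

    def __getattribute__(self, name):
        # Runs on every access; super() avoids recursing into this method.
        data = super().__getattribute__('_data')
        return data[name]
```
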
-
- Use metaclasses to ensure that subclasses are well formed at the time they are defined, before objects of their type are constructed.
-
- Metaclasses have slightly different syntax in Python 2 vs. Python 3.
-
- The __new__ method of metaclasses is run after the class statement's entire body has been processed.
-
- Class registration is a helpful pattern for building modular Python programs.
-
- Metaclasses let you run registration code automatically each time your base class is subclassed in a program.
-
- Using metaclasses for class registration avoids errors by ensuring that you never miss a registration call.
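
A minimal Python 3 sketch of automatic class registration with a metaclass. The registry dictionary and all class names are illustrative assumptions.

```python
registry = {}


class RegisteringMeta(type):
    def __new__(meta, name, bases, class_dict):
        cls = super().__new__(meta, name, bases, class_dict)
        if bases:                    # skip the abstract base class itself
            registry[cls.__name__] = cls
        return cls


class Serializable(metaclass=RegisteringMeta):
    pass


class Point2D(Serializable):         # registered just by being defined
    def __init__(self, x, y):
        self.x = x
        self.y = y
```
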
-
- Metaclasses enable you to modify a class's attributes before the class is fully defined.
-
- Descriptors and metaclasses make a powerful combination for declarative behavior and runtime introspection.
-
- You can avoid both memory leaks and the weakref module by using metaclasses along with descriptors.
-
- Use the subprocess module to run child processes and manage their input and output streams.
-
- Child processes run in parallel with the Python interpreter, enabling you to maximize your CPU usage.
-
- Use the timeout parameter with communicate to avoid deadlocks and hanging child processes.
-
- Python threads can't run bytecode in parallel on multiple CPU cores because of the global interpreter lock (GIL).
-
- Python threads are still useful despite the GIL because they provide an easy way to do multiple things at seemingly the same time.
-
- Use Python threads to make multiple system calls in parallel. This allows you to do blocking I/O at the same time as computation.
-
- Even though Python has a global interpreter lock, you're still responsible for protecting against data races between the threads in your programs.
-
- Your programs will corrupt their data structures if you allow multiple threads to modify the same objects without locks.
-
- The Lock class in the threading built-in module is Python's standard mutual exclusion lock implementation.
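
Protecting shared state with a Lock might look like the sketch below. The counter class, the worker function, and the thread and iteration counts are arbitrary choices for the example.

```python
from threading import Lock, Thread


class LockingCounter:
    def __init__(self):
        self.lock = Lock()
        self.count = 0

    def increment(self, offset):
        with self.lock:              # only one thread updates at a time
            self.count += offset


def worker(counter, how_many):
    for _ in range(how_many):
        counter.increment(1)


counter = LockingCounter()
threads = [Thread(target=worker, args=(counter, 10000)) for _ in range(5)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```
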
-
- Pipelines are a great way to organize sequences of work that run concurrently using multiple Python threads.
-
- Be aware of the many problems in building concurrent pipelines: busy waiting, stopping workers, and memory explosion.
-
- The Queue class has all of the facilities you need to build robust pipelines: blocking operations, buffer sizes, and joining.
-
- Coroutines provide an efficient way to run tens of thousands of functions seemingly at the same time.
-
- Within a generator, the value of the yield expression will be whatever value was passed to the generator's send method from the exterior code.
-
- Coroutines give you a powerful tool for separating the core logic of your program from its interaction with the surrounding environment.
-
- Python 2 doesn't support yield from or returning values from generators.
-
- Moving CPU bottlenecks to C-extension modules can be an effective way to improve performance while maximizing your investment in Python code. However, the cost of doing so is high and may introduce bugs.
-
- The multiprocessing module provides powerful tools that can parallelize certain types of Python computation with minimal effort.
-
- The power of multiprocessing is best accessed through the concurrent.futures built-in module and its simple ProcessPoolExecutor class.
-
- The advanced parts of the multiprocessing module should be avoided because they are so complex.
-
- Decorators are Python syntax for allowing one function to modify another function at runtime.
-
- Using decorators can cause strange behaviors in tools that do introspection, such as debuggers.
-
- Use the wraps decorator from the functools built-in module when you define your own decorators to avoid any issues.
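
A short sketch of a decorator that uses functools.wraps so the wrapped function keeps its name and docstring. The trace decorator and the calls list are illustrative.

```python
from functools import wraps

calls = []  # collected call records; illustrative only


def trace(func):
    @wraps(func)                     # copies __name__, __doc__, etc.
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        calls.append((func.__name__, args, kwargs, result))
        return result
    return wrapper


@trace
def fibonacci(n):
    """Return the n-th Fibonacci number."""
    if n in (0, 1):
        return n
    return fibonacci(n - 2) + fibonacci(n - 1)
```
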
-
- The with statement allows you to reuse logic from try/finally blocks and reduce visual noise.
-
- The contextlib built-in module provides a contextmanager decorator that makes it easy to use your own functions in with statements.
-
- The value yielded by context managers is supplied to the as part of the with statement. It's useful for letting your code directly access the cause of the special context.
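
A context manager built with contextlib.contextmanager could be sketched as below. The function name and the choice to yield the logger are assumptions; the try/finally ensures the previous level is restored.

```python
import logging
from contextlib import contextmanager


@contextmanager
def log_level(level, name):
    logger = logging.getLogger(name)
    old_level = logger.level
    logger.setLevel(level)
    try:
        yield logger                 # becomes the target of "as"
    finally:
        logger.setLevel(old_level)   # restored even if the body raises
```
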
-
- The pickle built-in module is only useful for serializing and de-serializing objects between trusted programs.
-
- The pickle module may break down when used for more than trivial use cases.
-
- Use the copyreg built-in module with pickle to add missing attribute values, allow versioning of classes, and provide stable import paths.
-
- Avoid using the time module for translating between different time zones.
-
- Use the datetime built-in module along with the pytz module to reliably convert between times in different time zones.
-
- Always represent time in UTC and do conversions to local time as the final step before presentation.
-
- Use Python's built-in modules for algorithms and data structures.
-
- Don't re-implement this functionality yourself. It's hard to get right.
-
- Python has built-in types and classes in modules that can represent practically every type of numerical value.
-
- The Decimal class is ideal for situations that require high precision and exact rounding behavior, such as computations of monetary values.
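
A small sketch of exact monetary arithmetic with Decimal. The rate, duration, and rounding mode are made-up values for illustration.

```python
from decimal import Decimal, ROUND_UP

rate = Decimal('1.45')               # cost per minute (hypothetical)
seconds = Decimal('222')

# Construct from strings so no binary floating point error creeps in.
cost = rate * seconds / Decimal('60')

# Round up to whole cents.
rounded = cost.quantize(Decimal('0.01'), rounding=ROUND_UP)
```
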
-
- The Python Package Index (PyPI) contains a wealth of common packages that are built and maintained by the Python community.
-
- pip is the command-line tool to use for installing packages from PyPI.
-
- pip is installed by default in Python 3.4 and above; you must install it yourself for older versions.
-
- The majority of PyPI modules are free and open source software.
-
- Write documentation for every module, class and function using docstrings. Keep them up to date as your code changes.
-
- For modules: introduce the contents of the module and any important classes or functions all users should know about.
-
- For classes: document behavior, important attributes, and subclass behavior in the docstring following the class statement.
-
- For functions and methods: document every argument, returned value, raised exception, and other behaviors in the docstring following the def statement.
-
- Packages in Python are modules that contain other modules. Packages allow you to organize your code into separate, non-conflicting namespaces with unique absolute module names.
-
- Simple packages are defined by adding an __init__.py file to a directory that contains other source files. These files become the child modules of the directory's package. Package directories may also contain other packages.
-
- You can provide an explicit API for a module by listing its publicly visible names in its __all__ special attribute.
-
- You can hide a package's internal implementation by only importing public names in the package's __init__.py file or by naming internal-only members with a leading underscore.
-
- When collaborating within a single team or on a single codebase, using __all__ for explicit APIs is probably unnecessary.
-
- Defining root exceptions for your modules allows API consumers to insulate themselves from your API.
-
- Catching root exceptions can help you find bugs in code that consumes an API.
-
- Catching the Python Exception base class can help you find bugs in API implementations.
-
- Intermediate root exceptions let you add more specific types of exceptions in the future without breaking your API consumers.
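
The root-exception pattern can be sketched as follows. The exception names and the determine_weight function are hypothetical stand-ins for a module's API.

```python
class Error(Exception):
    """Base class for all exceptions raised by this module."""


class InvalidDensityError(Error):
    """There was a problem with a provided density value."""


def determine_weight(volume, density):
    if density <= 0:
        raise InvalidDensityError('Density must be positive')
    return volume * density
```

A caller can catch Error to handle everything the module raises deliberately, and treat any other exception as a bug in the module.
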
-
- Circular dependencies happen when two modules must call into each other at import time. They can cause your program to crash at startup.
-
- The best way to break a circular dependency is refactoring mutual dependencies into a separate module at the bottom of the dependency tree.
-
- Dynamic imports are the simplest solution for breaking a circular dependency between modules while minimizing refactoring and complexity.
-
- Virtual environments allow you to use pip to install many different versions of the same package on the same machine without conflicts.
-
- Virtual environments are created with pyvenv, enabled with source bin/activate, and disabled with deactivate.
-
- You can dump all of the requirements of an environment with pip freeze. You can reproduce the environment by supplying the requirements.txt file to pip install -r.
-
- In versions of Python before 3.4, the pyvenv tool must be downloaded and installed separately. The command-line tool is called virtualenv instead of pyvenv.
-
- Programs often need to run in multiple deployment environments that each have unique assumptions and configurations.
-
- You can tailor a module's contents to different deployment environments by using normal Python statements in module scope.
-
- Module contents can be the product of any external condition, including host introspection through the sys and os modules.
-
- Calling print on built-in Python types will produce the human-readable string version of a value, which hides type information.
-
- Calling repr on built-in Python types will produce the printable string version of a value. These repr strings could be passed to the eval built-in function to get back the original value.
-
- %s in format strings will produce human-readable strings like str. %r will produce printable strings like repr.
-
- You can define the __repr__ method to customize the printable representation of a class and provide more detailed debugging information.
-
- You can reach into any object's __dict__ attribute to view its internals.
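
The difference between the human-readable and printable forms, and a custom __repr__, can be sketched like this. The Point class is an illustrative example.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # %r keeps type information for each field.
        return 'Point(%r, %r)' % (self.x, self.y)


point = Point(1, 'two')
readable = '%s' % '5'                # looks the same as the number 5
printable = '%r' % '5'               # quotes reveal that it is a string
```
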
-
- The only way to have confidence in a Python program is to write tests.
-
- The unittest built-in module provides most of the facilities you'll need to write good tests.
-
- You can define tests by subclassing TestCase and defining one method per behavior you'd like to test. Test methods on TestCase classes must start with the word test.
-
- It's important to write both unit tests (for isolated functionality) and integration tests (for modules that interact).
-
- You can initiate the Python interactive debugger at a point of interest directly in your program with the import pdb; pdb.set_trace() statements.
-
- The Python debugger prompt is a full Python shell that lets you inspect and modify the state of a running program.
-
- pdb shell commands let you precisely control program execution, allowing you to alternate between inspecting program state and progressing program execution.
-
- It's important to profile Python programs before optimizing because the source of slowdowns is often obscure.
-
- Use the cProfile module instead of the profile module because it provides more accurate profiling information.
-
- The Profile object's runcall method provides everything you need to profile a tree of function calls in isolation.
-
- The Stats object lets you select and print the subset of profiling information you need to see to understand your program's performance.
-
- It can be difficult to understand how Python programs use and leak memory.
-
- The gc module can help you understand which objects exist, but it has no information about how they were allocated.
-
- The tracemalloc built-in module provides powerful tools for understanding the source of memory usage.
-
- tracemalloc is only available in Python 3.4 and above.