About ten years ago, Guido van Rossum, the Python author and Benevolent Dictator for Life (BDFL), along with the Python community, decided to make several concurrent backward-incompatible changes to Python 2.5 and release a new version, Python 3.0.
The main changes were:
- Using unicode strings as default, with the old string type becoming a full featured binary type
- Changing several builtins, for example:
  - The print statement became a function, allowing more consistent syntax and the use of the word print as a name
  - The confusing `input` was removed, and `raw_input` was renamed to `input`
  - Simpler `exec`
- Division is now split between `/` (float division) and `//` (floor division)
- Improved exception tracing, with chaining
- Improved function call syntax with annotations and keyword-only arguments, replacing little-used tuple parameter unpacking
- More class constructor features, such as nicer metaclass syntax, keyword arguments, and `__prepare__`
- Renamed standard libraries, to be more consistent
- Removal of a lot of deprecated features, including old-style classes
- Removal of a lot of deprecated syntax that had become learner stumbling blocks
- Adding `nonlocal` variables
- Extended tuple unpacking, like `first, *rest = makes_a_tuple()`
- Removing the proliferation of `.pyc` files, instead using `__pycache__` directories
- Automatic selection of C-based standard library modules over pure Python ones if available
- Unified the `int` and `long` types into one unlimited length integer type
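A few of the changes listed above can be seen in a short Python 3 script. This is only an illustrative sketch; the helper names (`make_counter`, `increment`, `emit`) are made up for the example.

```python
# print is an ordinary function, so it can be assigned and passed around
emit = print
emit("hello from Python 3")

# "/" is always float division; "//" is floor division
half = 7 / 2            # 3.5
floored = 7 // 2        # 3
neg_floored = -7 // 2   # -4, rounds toward negative infinity

# Extended tuple unpacking
first, *rest = (1, 2, 3, 4)

# int and long are unified: integers grow as needed
big = 2 ** 100

# Text is unicode by default; bytes is a separate binary type
text = "café"
data = text.encode("utf-8")


def make_counter():
    """Hypothetical helper showing nonlocal."""
    count = 0

    def increment():
        nonlocal count  # rebind the variable in the enclosing function
        count += 1
        return count

    return increment


counter = make_counter()
counts = [counter(), counter(), counter()]
```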
Unfortunately, this list was comprehensive enough to break virtually every Python script ever written. So, to ease the transition, 2.6 and 3.0 were developed in parallel and released within months of each other, with the other, backward-compatible new features of 3.0 also included in 2.6. The same backporting happened again with 3.1 and 2.7. Not wanting to maintain two Pythons, the BDFL declared that 2.7 was the last Python 2 series release.
These changes (mostly the unicode one) also made Python much slower in version 3.0. Since then, however, there have been many speed and memory improvements. Combined with new C extensions for some modules, Python 3 is now usually as fast as or faster than Python 2.
The original, officially sanctioned upgrade path was one of the biggest issues with moving to Python 3. A script, `2to3`, was supposed to convert code to Python 3, and then the old version could eventually be dropped. This script required a lot of manual intervention (things like the unicode strings require knowledge of the programmer’s intent), and required library authors to maintain two separate versions of the code. This hindered initial adoption, with many major libraries unwilling to maintain two versions just to support Python 3. Unofficial authors tried making a new script, `3to2`, which worked significantly better, but was still hindered by the dual copies of code issue.
Another decision may also have slowed adoption. Part way through the development of Python 3.2 up to 3.4, the decision was made to avoid adding any new features, to give authors time to adapt code to a stable Python 3. This statement could be taken in reverse: why update to Python 3 when it does not have any new features to improve your program? The original changes (as listed above) were not enough to cause mass adoption.
This dreary time in Python development is now drawing to a close, thanks to a change in the way authors started approaching Python compatibility. There is such a good overlap between Python 2.6 or Python 2.7 and Python 3.3+ that a single code base can support them both. There are three reasons for this:
- Good Python 2 is almost the same as Python 3. The things that were dropped were mostly things you shouldn’t do in Python 2 anyway.
- Several changes in syntax are available in Python 2 using `__future__`
- The remaining changes can mostly be wrapped in libraries
Unofficial library authors capitalized on these, and now almost every library is available as a single code base for Python 2 and 3. Most of the new standard libraries, and even a few language features, are regularly backported to Python 2, as well.
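As a rough sketch of the `__future__` approach, a file meant to run on both Python 2.6+ and Python 3 might start like this. The function `mean` and the variable `label` are made up for the example; the imports do nothing on Python 3.

```python
# Must be the first statement in the file; a no-op on Python 3
from __future__ import absolute_import, division, print_function, unicode_literals


def mean(values):
    """Average of a sequence, using true division on both versions."""
    return sum(values) / len(values)


label = "text"                  # a unicode string on Python 2 and 3
print("mean:", mean([1, 2]))    # print() called as a function
```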
Libraries to ease the transition
Six
The original compatibility library, six (so named because 2 times 3 is 6), provides tools to make writing 2 and 3 compatible code easy. You just import six, and then access the renamed standard libraries from `six.moves`. There are wrappers for the changed features, such as `six.with_metaclass`.
These features are not hard to wrap yourself, so many libraries implement their own `six` wrapper to reduce dependencies and overhead.
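A minimal sketch of such a home-made wrapper, using only the standard library, might look like the following. The names `PY2` and `string_types` mirror the ones six provides, while `is_text` is a hypothetical helper added for illustration.

```python
import sys

PY2 = sys.version_info[0] == 2

# Renamed standard library modules: try the Python 3 name first
try:
    from urllib.parse import urlparse
except ImportError:  # Python 2
    from urlparse import urlparse

try:
    import configparser
except ImportError:  # Python 2
    import ConfigParser as configparser

# Types that differ between the two versions
if PY2:
    string_types = (basestring,)  # noqa: F821 (only defined on Python 2)
else:
    string_types = (str,)


def is_text(value):
    """Return True for any text string on either Python version."""
    return isinstance(value, string_types)


print(urlparse("https://example.org/path").netloc)
```

The rest of the code base then imports these names from the wrapper module instead of checking the Python version in each file.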
Future
This is a newer library with a unique approach. Instead of forcing the use of a special wrapper, the idea of future is to simply allow code to be written in Python 3, but work in Python 2. For example, `from builtins import input` will do nothing on Python 3 (`builtins` is where `input` lives), but on Python 2 with `future` installed, `builtins` is part of `future` and will import the `future` version. You can even patch in the Python 3 standard library names with a `standard_library.install_aliases()` function.
Future also comes with its own version of the conversion scripts, called `futurize` and `pasteurize`, which use the `future` library to make code that runs on one version run on both versions. An alpha feature, the `autotranslate` function, can turn a library that supports only Python 2 into a Python 3 version on import.
Backports
Several of the new libraries and features have been backported to Python 2. I’m not including ones that were backported in an official Python release, like `argparse`.
- `pathlib2`: A simple, object oriented path library from Python 3.4
- `enum34`: A Python package for enumerations from Python 3.4
- `mock`: A version of `unittest.mock` from Python 3.3
- `futures`: This is the `concurrent.futures` package in Python 3.2
- `statistics`: From Python 3.4
- `selectors34`: The `selectors` package from Python 3.4
- `typing`: Type hints from Python 3.5
- `trollius`: The `asyncio` package, with a new syntax for `yield from`, from Python 3.4
- Smaller changes: `configparser`, `subprocess32`, `functools32`, and the various `backports`-dot-something packages.
New features in modern Python
These are features that have been released in a version of Python after 3.0 that are not in the older Python 2 series:
- Matrix multiplication operator, `@` (3.5)
- Special `async` and `await` syntax for asynchronous operations (3.5, 3.7)
- Unpacking improvements, so that the `*` and `**` operators work in more places like you’d expect (3.5)
- Function signatures are now available as an easy-to-use object (3.3)
- Improvements to Windows support (Windows launcher, recent versions of Visual C++) (3.2, 3.4, 3.5, 3.6, 3.7)
- Delegation to a subgenerator, `yield from`, finally allows safe factorization of generators (3.3)
- Context variables and AsyncIO improvements, including a simple run function (3.7)
- `importlib.resources`, which allows files that don’t end in `.py` to be accessed (FINALLY!) (3.7, backports available)
- `breakpoint()` built-in function for debugging (3.7)
- Modules can have custom `__dir__` and `__getattr__` (3.7)
- Lots of new debugging options in CPython for developers, like timing module import and better stacktraces (3.7)
- Positional-only arguments, to avoid name clashes and for eventual speed optimizations (3.8)
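Several items from this list fit into one small sketch. The class `Vec2` and the functions `inner`, `outer`, and `scale` are hypothetical names chosen for the example, and `@` is defined here as a dot product purely for illustration.

```python
import inspect


class Vec2:
    """Hypothetical 2D vector; @ is defined here as the dot product."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __matmul__(self, other):
        return self.x * other.x + self.y * other.y


def inner():
    yield 1
    yield 2


def outer():
    yield 0
    yield from inner()  # delegate to a subgenerator (3.3)
    yield 3


def scale(value, factor=2, *, offset=0):
    return value * factor + offset


dot = Vec2(1, 2) @ Vec2(3, 4)            # matrix multiplication operator (3.5)
merged = {**{"a": 1}, **{"b": 2}}        # ** unpacking in a dict display (3.5)
combined = [*range(2), *"ab"]            # * unpacking in a list display (3.5)
flattened = list(outer())
params = list(inspect.signature(scale).parameters)  # signature objects (3.3)
```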
Formatted string literals (3.6)
Finally! You can write code such as the following now:
x = 2
print(f"The value of x is {x}")
This is indicated by the `f` prefix, and can take almost any valid Python expression. It does not have the scope issues that the old workaround, `.format(**locals())`, encounters.
In Python 3.8, you can use an equals sign to print a variable or expression along with its name:
x = 2
print(f"{x = }")
This prints:
x = 2
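A slightly larger sketch shows format specifications, arbitrary expressions, and the scope problem with the old workaround. All names here (`describe`, `describe_old`, `unit`, and so on) are made up for the example, and the last line needs Python 3.8 or newer.

```python
import math

name = "pi"
value = math.pi
items = ["a", "b", "c"]
unit = "cm"

plain = f"{name} is about {value:.3f}"             # format spec after the colon
expr = f"{len(items)} items: {', '.join(items)}"   # arbitrary expressions
aligned = f"[{name:>6}]"                           # width and alignment
debug = f"{len(items) = }"                         # 3.8+: expression and value


def describe(length):
    # The f-string sees the local `length` and the global `unit`
    return f"{length} {unit}"


def describe_old(length):
    # locals() only contains `length`, so this raises KeyError: 'unit'
    return "{length} {unit}".format(**locals())
```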
Syntax for variable annotations (3.6)
This will be great for type hints, IDEs, and Cython, but the syntax is a little odd for Python. It’s based on function annotations. A quick example:
from typing import List
an_empty_list_of_ints: List[int] = []
will_be_a_str_later: str
This stores the variable name and the annotation in an `__annotations__` dictionary for the module or the class that they are in.
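To see what ends up in that dictionary, here is a small sketch using a class. The class `Settings` and its fields are hypothetical.

```python
class Settings:
    host: str = "localhost"  # annotated and assigned
    port: int                # annotated only; no attribute is created


print(Settings.__annotations__)
print(hasattr(Settings, "port"))
```

An annotation without a value only records the type; reading `Settings.port` would raise `AttributeError` until something assigns it.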
Semi-ordered dictionaries (3.6 and 3.7)
Python dictionaries are now partially ordered; due to huge speedups in the C definition of ordered dicts, the `dict` class is now guaranteed to iterate in order as long as nothing has been changed since the `dict` was created. This may sound restrictive, but it enables many features; you can now discover the order in which keyword arguments were passed, the order in which class members were added, and the order of `{}` dicts. If you want to continue to keep or control the order, you should move the `dict` to an `OrderedDict`, as before. This makes ordered dictionaries much easier to create, too.
In Python 3.6, only class member order and keyword argument order are ensured by the language; the ordering of `{}` is an implementation detail. This detail works in both CPython 3.6 and all versions of PyPy, however. This became language mandated in Python 3.7.
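The three kinds of ordering mentioned above can be checked with a short sketch, assuming Python 3.7 or newer. The names `record_order` and `Fields` are hypothetical.

```python
def record_order(**kwargs):
    """Hypothetical helper: return keyword names in the order they were passed."""
    return list(kwargs)


class Fields:
    zeta = 1
    alpha = 2
    mid = 3


kw_order = record_order(zeta=1, alpha=2, mid=3)
# Skip the dunder entries Python adds to every class namespace
member_order = [key for key in vars(Fields) if not key.startswith("__")]
literal_order = list({"zeta": 1, "alpha": 2, "mid": 3})
```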
DataClasses (3.7)
Most programmers coming from other languages want some form of class designed to store data. Creation of these data-centric classes is verbose and ugly in Python, since you have to put all the setup in the `__init__` method rather than directly in the class like other languages, and you have to manage initialization, print, comparison, etc. yourself. Now, with DataClasses, you can do it with a nice syntax:
from dataclasses import dataclass

@dataclass
class Vector:
    x: float
    y: float
    z: float
This will create (by default) `__init__`, `__repr__`, and `__eq__`. You can also ask for `order`, `unsafe_hash`, and `frozen`.
This is similar to, and less powerful than, the popular attrs library (available for all versions of Python). This library module, like many others, was also backported to older versions of Python. However, the variable type annotations are not available in older versions.
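As a sketch of the optional `order` and `frozen` flags, the hypothetical `Version` class below gets comparison operators and becomes immutable. The `tags` field is excluded from comparisons with `compare=False`.

```python
from dataclasses import dataclass, field


@dataclass(order=True, frozen=True)
class Version:
    major: int
    minor: int = 0
    tags: tuple = field(default=(), compare=False)  # ignored by ==, <, hash


v37 = Version(3, 7)
v38 = Version(3, 8, tags=("beta",))

newest = max(v37, v38)                        # ordering compares (major, minor)
same = v37 == Version(3, 7, tags=("final",))  # tags do not take part in equality
seen = {v37, v38}                             # frozen + eq makes instances hashable
print(v37)
```

Assigning to a field of a frozen instance, such as `v37.minor = 9`, raises `dataclasses.FrozenInstanceError`.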
Walrus operator (3.8)
You can now use a special assignment operator, `:=` (called the walrus operator due to the eyes + tusks appearance), almost anywhere that a normal `=` was not allowed. So, for example, you can now do this:
if x := long_check():
    print(x)
# x is still bound here; := assigns in the enclosing scope
This might be very handy for setting up machine learning tools, where you set a number of layers then refer to it further down in the same dict or function call.
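A few typical uses, sketched with made-up data (the names `lines`, `pattern`, `stream`, and `model` are hypothetical), assuming Python 3.8 or newer:

```python
import io
import re

lines = ["name: spam", "# comment", "count: 3"]
pattern = re.compile(r"(\w+): (\w+)")

# Bind the match and test it in a single comprehension condition
pairs = [(m.group(1), m.group(2)) for line in lines if (m := pattern.match(line))]

# Read fixed-size chunks until read() returns an empty bytes object
stream = io.BytesIO(b"abcdefgh")
chunks = []
while chunk := stream.read(3):
    chunks.append(chunk)

# Name a value once and reuse it later in the same dict display
model = {"layers": (n_layers := 4), "units": [64] * n_layers}
```

The last line relies on dict displays being evaluated left to right, so `n_layers` is already bound when the second value is computed.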
Other smaller features:
- Underscores in numeric literals. You can add arbitrary spacers to numbers now, such as `1_000_000`. (3.6)
- Windows encoding improvements. (3.6)
- Simpler customization of class creation, using the `__init_subclass__` class method. (3.6)
- Descriptor access to the name of the class and the descriptor, using `__set_name__`. (3.6)
- A file system path protocol, `__fspath__()`, allows any object to indicate that it represents a path. Finally pathlib works without wrapping it in a `str()`! (3.6)
- Better support for `async` list comprehensions, and `async` generators. (3.6)
- A `secrets` module for password related randomization functions. (3.6)
- `__slots__` can contain docstrings as a dict (3.8)
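Several of these smaller features are easy to try out. The sketch below uses hypothetical classes (`Plugin`, `CsvReader`, `Typed`, `Point`, `DataDir`) to exercise `__init_subclass__`, `__set_name__`, and `__fspath__`, plus a numeric literal with underscores and the `secrets` module.

```python
import os
import secrets

million = 1_000_000  # underscores are ignored by the parser


class Plugin:
    """Hypothetical base class that registers its subclasses."""

    registry = {}

    def __init_subclass__(cls, key=None, **kwargs):
        super().__init_subclass__(**kwargs)
        Plugin.registry[key or cls.__name__.lower()] = cls


class CsvReader(Plugin, key="csv"):
    pass


class Typed:
    """Hypothetical descriptor that learns the name it was assigned to."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        return self if obj is None else obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value


class Point:
    x = Typed()


class DataDir:
    """Hypothetical path-like object."""

    def __init__(self, root):
        self.root = root

    def __fspath__(self):
        return self.root


point = Point()
point.x = 5
joined = os.path.join(DataDir("data"), "file.txt")  # no str() needed
token = secrets.token_hex(16)                       # 16 random bytes as hex
```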
Status of Python
The current status of the Python releases is as follows:
- Python 2.5: Dead.
- Python 2.6: Most libraries are dropping support, officially discontinued, but still on some legacy systems, like the default environment in SL6.
- Python 2.7: The officially supported Python 2 release, critical security flaws fixed till January 1, 2020. PyPy supports 2.7.13. Windows version is stuck requiring Visual Studio 2008 for builds (Careful memory design can allow use of new VS). NumPy, Pandas, IPython, and more have dropped support.
- Python 3.0-3.2: Never used significantly, no library support.
- Python 3.3: Better backwards compatibility makes this the first generally used Python 3, with Windows downloads outpacing Python 2.7 for the first time. `u""` was added back in as a no-op.
- Python 3.4: Addition of asyncio features and pathlib provided even more interest. No longer in use.
- Python 3.5: New features, such as matrix multiplication, are accelerating the transition from Python 2. Note that PyPy3 is currently based on Python 3.5.3.
- Python 3.6: The addition of format strings makes simple scripts much easier and cleaner.
- Python 3.7: Big performance improvements make this the fastest CPython ever; dataclasses, typing, and threading improvements.
- Python 3.8 beta: New walrus operator, positional only arguments, and fast call.