I haven’t seen a great deal of practical documentation about using classmethods as factories in Python (which is arguably the most important use of a classmethod, IMO). This post hopes to fill in that gap.
Simple Example of classmethod Factory
It is not too hard to find a good example of this, but here is a simple example
of a class method being used as a factory. Let’s say you have the following
vector class, and you want to be able to make a new vector using Vector(x,y,z)
or Vector.from_cyl(ρ,θ,z) (as usual, I’m exploiting Python 3’s unicode variable
names; change these to plain text if you are using Python 2). Here is how we
would do that, with a __repr__ added to make the display easier to see:
from math import sin, cos, pi as π

class Vector(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    @classmethod
    def from_cyl(cls, ρ, θ, z):
        return cls(ρ * cos(θ), ρ * sin(θ), z)

    def __repr__(self):
        return "<{0.__class__.__name__}({0.x}, {0.y}, {0.z})>".format(self)

Vector(1, 2, 3)
Vector.from_cyl(1, π / 4, 1)
A few takeaway points:
- Class methods take the class as the first argument, traditionally cls.
- Class methods that are factories call cls(), and return the new instance they created.
- Both __new__ and __init__ ran when cls() was called.
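To make the last two points concrete, here is a minimal sketch. The Tracked and Child classes and the calls list are hypothetical names used only for illustration. It records which special methods run when a factory calls cls(), and shows that cls is whichever class the factory was called on, so a subclass gets an instance of its own type:

```python
calls = []  # hypothetical log of which special methods ran


class Tracked(object):
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        return super().__new__(cls)

    def __init__(self, value):
        calls.append("__init__")
        self.value = value

    @classmethod
    def from_text(cls, text):
        # cls is whatever class the factory was called on
        return cls(int(text))


class Child(Tracked):
    pass


obj = Child.from_text("42")
print(type(obj).__name__, obj.value, calls)
# → Child 42 ['__new__', '__init__']
```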
Avoiding __init__
This is fine, but there are two more features that we often need. First, we
might not want to call the __init__ function at all. Calling it works in the
above example, but we will not always be able to transform our inputs into the
parameters that __init__ expects. If we can’t do that, it is possible in Python
to create a new instance of a class without calling __init__, using:
cls.__new__(cls, ...)
where the ellipsis stands for the parameters of the __new__ function (often this
is simply cls.__new__(cls)).
Take the following example, where we’ve reversed the class, so that cylindrical coordinates are now the default:
from math import sin, cos, pi as π

class Vector(object):
    def __init__(self, ρ, θ, z):
        self.x = ρ * cos(θ)
        self.y = ρ * sin(θ)
        self.z = z

    @classmethod
    def from_xyz(cls, x, y, z):
        ob = cls.__new__(cls)
        ob.x = x
        ob.y = y
        ob.z = z
        return ob

    def __repr__(self):
        return "<{0.__class__.__name__}({0.x}, {0.y}, {0.z})>".format(self)

Vector(1, π / 4, 1)
Vector.from_xyz(1, 2, 3)
Though it should be obvious now, I’ll point out that each class method using
this technique is responsible for doing anything that the __init__ function
normally does. In this case, ob.x, ob.y, and ob.z all must be set manually.
If we forgot to set ob.z, for example, bad things might happen when we call
other methods.
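As a hedged illustration of what those "bad things" can look like, the sketch below uses a hypothetical Point class whose factory, from_xy_broken, deliberately forgets to set z. The object is created without complaint, and the error only surfaces later, when a method touches the missing attribute:

```python
class Point(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    @classmethod
    def from_xy_broken(cls, x, y):
        # Bypasses __init__ and (deliberately) forgets to set z
        ob = cls.__new__(cls)
        ob.x = x
        ob.y = y
        return ob

    def norm_squared(self):
        return self.x ** 2 + self.y ** 2 + self.z ** 2


p = Point.from_xy_broken(1, 2)  # no error at creation time
try:
    p.norm_squared()
    failed = False
except AttributeError as err:
    failed = True
    print("Failed late:", err)
```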
Auto generator
__init__ Method
This is something I do a lot, and it is still difficult with the above trick.
Let’s say you want to have an auto-selecting factory function. For example, if
you have .from_csv(filename) and .from_excel(filename) functions, you might want
to make a .from_any function that bases its choice of loading function on the
extension of filename. This is easy to do with normal factory functions unless
you decide that you’d like your __init__ function to act like .from_any.
Then, you’ll need to write two functions for each factory function (this only
uses one from_ method pair for clarity):
class FromFile(object):
    def __init__(self, filename):
        if filename[-3:] == "csv":
            self._from_csv(filename)
        else:
            self.file = "Not a valid format"

    @classmethod
    def from_csv(cls, filename):
        self = cls.__new__(cls)
        self._from_csv(filename)
        return self

    def _from_csv(self, filename):
        self.file = "I got {0} from csv!".format(filename)

    def __repr__(self):
        return "<{0.__class__.__name__}({0.file})>".format(self)

FromFile("this.file")
FromFile("this.csv")
FromFile.from_csv("this.is")
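The .from_any dispatcher mentioned earlier is not part of the example. One possible sketch, assuming a hypothetical Loader class with stub from_csv and from_excel factories (no file is actually read, and the ".xlsx" extension is my assumption), picks the factory from the file extension with os.path.splitext:

```python
import os.path


class Loader(object):
    def __init__(self, data):
        self.data = data

    @classmethod
    def from_csv(cls, filename):
        # Stub: a real version would parse the file
        return cls("csv:" + filename)

    @classmethod
    def from_excel(cls, filename):
        # Stub: a real version would parse the file
        return cls("excel:" + filename)

    @classmethod
    def from_any(cls, filename):
        # Map lowercase extensions to the factory that handles them
        factories = {".csv": cls.from_csv, ".xlsx": cls.from_excel}
        ext = os.path.splitext(filename)[1].lower()
        try:
            factory = factories[ext]
        except KeyError:
            raise ValueError("Unsupported extension: {0!r}".format(ext))
        return factory(filename)


print(Loader.from_any("table.CSV").data)
# → csv:table.CSV
```

Raising an exception for an unknown extension is a design choice of this sketch; the FromFile example stores a "Not a valid format" message instead.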
__new__ Method
One other way we could do this would be to change __new__ itself. This is
technically what we want anyway; __new__ is where new instances of a class get
created.
The downside to this method is that we are limited in our use of __init__ with
arguments (if you have an __init__, it needs to accept everything that __new__
accepts). __init__ will run only when we call the class normally, so it may or
may not run when a factory function is called. This forces you to put any
common init code in a separate function and then manually call it from each
factory function, and from __new__ too, if necessary.
class FromFile(object):
    def __new__(cls, filename):
        if filename is None:
            return super(FromFile, cls).__new__(cls)
        elif filename[-3:] == "csv":
            return cls.from_csv(filename)
        else:
            self = super(FromFile, cls).__new__(cls)
            self.file = "Not a valid format"
            self._common_init_code()
            return self

    def __init__(self, *args, **kargs):
        print("This was called directly with", *args)

    def _common_init_code(self):
        print("All correct creations of this class will print this line!")

    @classmethod
    def from_csv(cls, filename):
        self = super(FromFile, cls).__new__(cls)
        self.file = "I got {0} from csv!".format(filename)
        self._common_init_code()
        return self

    def __repr__(self):
        return "<{0.__class__.__name__}({0.file})>".format(self)

FromFile("this.file")
FromFile("this.csv")
FromFile.from_csv("this.is")
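The rule behind "may or may not run" is that calling the class invokes __new__ and then calls __init__ only if __new__ returned an instance of that class. Calling a factory class method directly never goes through that machinery. Here is a small sketch with a hypothetical Probe class (the init_calls counter and use_factory flag are illustrative names) that counts __init__ calls:

```python
class Probe(object):
    init_calls = 0  # hypothetical counter, for illustration only

    def __new__(cls, use_factory=False):
        if use_factory:
            return cls.make()
        return super().__new__(cls)

    def __init__(self, *args, **kwargs):
        type(self).init_calls += 1

    @classmethod
    def make(cls):
        # Direct use of __new__: __init__ is not triggered here
        return super().__new__(cls)


Probe.make()                 # factory called directly: no __init__
after_factory = Probe.init_calls
Probe()                      # normal call: __init__ runs
Probe(use_factory=True)      # __new__ delegates, __init__ still runs
print(after_factory, Probe.init_calls)
# → 0 2
```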
There are other ways to generate objects of certain classes; subclassing is a
valid method, or using a factory function, or even metaclasses. (For
metaclasses, this article is hard to beat.) Several of these methods cause the
type of the object to not match the name of the callable used to create it (for
example, numpy.array() returns a numpy.ndarray), but they are still commonly
used.
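As an illustration of the factory-function option and of the naming mismatch it produces, here is a minimal sketch. Vector3 and vector are hypothetical names, not part of any library, and the system keyword is an assumption made for this example:

```python
from math import cos, sin


class Vector3(object):
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z


def vector(a, b, c, system="xyz"):
    """Module-level factory: returns a Vector3, not a 'vector'."""
    if system == "xyz":
        return Vector3(a, b, c)
    if system == "cyl":
        # Interpret (a, b, c) as (rho, theta, z) and convert to Cartesian
        return Vector3(a * cos(b), a * sin(b), c)
    raise ValueError("Unknown coordinate system: {0!r}".format(system))


v = vector(2, 0, 5, system="cyl")
print(type(v).__name__, v.x, v.y, v.z)
```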