I haven’t seen a great deal of practical documentation about using classmethods as factories in Python (which is arguably the most important use of a classmethod, IMO). This post hopes to fill in that gap.
Simple Example of classmethod Factory
It is not too hard to find a good example of this, but here is a simple example
of a class method being used as a factory. Let’s say you have the following
vector class, and you want to be able to make a new vector using
Vector(x, y, z) or Vector.from_cyl(ρ, θ, z) (as usual, I’m exploiting
Python 3’s Unicode variable names; change these to plain text if you are using
Python 2). Here is how we would do that, with a __repr__ added to make the
display easier to read:

    from math import sin, cos, pi as π

    class Vector(object):
        def __init__(self, x, y, z):
            self.x = x
            self.y = y
            self.z = z

        @classmethod
        def from_cyl(cls, ρ, θ, z):
            # Convert cylindrical coordinates, then go through __init__
            return cls(ρ * cos(θ), ρ * sin(θ), z)

        def __repr__(self):
            return "<{0.__class__.__name__}({0.x}, {0.y}, {0.z})>".format(self)

    Vector(1, 2, 3)
    Vector.from_cyl(1, π / 4, 1)

A few takeaway points:
- Class methods take the class as the first argument, traditionally cls.
- Class methods that are factories call cls(), and return the new instance they created.
- Both __new__ and __init__ ran when cls() was called.
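One consequence of the first two points is that a factory written with cls keeps working for subclasses. The following is a minimal sketch of that behaviour; the subclass name LabelledVector is hypothetical and only there for illustration, and plain ASCII parameter names are used so it also runs on Python 2.

```python
from math import cos, sin


class Vector(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    @classmethod
    def from_cyl(cls, rho, theta, z):
        # cls is whichever class the method was called on
        return cls(rho * cos(theta), rho * sin(theta), z)


class LabelledVector(Vector):  # hypothetical subclass, for illustration only
    pass


v = Vector.from_cyl(2, 0, 1)
w = LabelledVector.from_cyl(2, 0, 1)

print(type(v).__name__)  # Vector
print(type(w).__name__)  # LabelledVector
```

Had from_cyl hard-coded Vector(...) instead of cls(...), the second call would have returned a plain Vector.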
Avoiding __init__
This is fine, but there are two more features that we often need. First, we
might not want to call the __init__ function at all. Calling it works in the
above example, but we are not always able to transform a factory's inputs into
the original parameters. In that case, Python lets us create a new instance of
a class without calling __init__ using:

    cls.__new__(cls, ...)

where the ellipsis stands for the parameters of the __new__ function (often
this is simply cls.__new__(cls)).
Take the following example, where we’ve reversed the class so that cylindrical coordinates are now the default:

    from math import sin, cos, pi as π

    class Vector(object):
        def __init__(self, ρ, θ, z):
            self.x = ρ * cos(θ)
            self.y = ρ * sin(θ)
            self.z = z

        @classmethod
        def from_xyz(cls, x, y, z):
            # Create the instance without running __init__
            ob = cls.__new__(cls)
            ob.x = x
            ob.y = y
            ob.z = z
            return ob

        def __repr__(self):
            return "<{0.__class__.__name__}({0.x}, {0.y}, {0.z})>".format(self)

    Vector(1, π / 4, 1)
    Vector.from_xyz(1, 2, 3)

Though it should be obvious now, I’ll point out that each class method using
this technique is responsible for doing anything that the __init__ function
normally does. In this case, x, y, and z must all be set manually on the new
object. If we forgot to set ob.z, for example, any later method that reads it
(such as __repr__) would raise an AttributeError.
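To make that failure mode concrete, here is a small sketch of a factory that skips __init__ and forgets one attribute. The Point class and its from_x_only method are hypothetical names invented for this illustration.

```python
class Point(object):  # hypothetical class, for illustration only
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_x_only(cls, x):
        ob = cls.__new__(cls)  # __init__ is skipped here
        ob.x = x               # mistake: ob.y is never set
        return ob

    def norm_squared(self):
        return self.x ** 2 + self.y ** 2


p = Point.from_x_only(3)

try:
    p.norm_squared()
    failed = False
except AttributeError:
    # The missing attribute only shows up when a method reads it
    failed = True

print(failed)  # True
```

The object is created without complaint; the error only appears later, which is why each factory has to be checked against everything __init__ would have set.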
Auto generator
__init__ Method
This is something I do a lot that is still difficult with the above trick. Let’s
say you want to have an auto-selecting factory function. For example, if you
have .from_csv(filename) and .from_excel(filename) functions, you might want
to make a .from_any function that bases its choice of loading function on the
extension of filename. This is easy to do with normal factory functions unless
you decide that you’d like your __init__ function to act like .from_any.
Then, you’ll need to write two functions for each factory function (this
example only uses one from_ method pair for clarity):

    class FromFile(object):
        def __init__(self, filename):
            # Act like from_any: choose a loader from the extension
            if filename[-3:] == "csv":
                self._from_csv(filename)
            else:
                self.file = "Not a valid format"

        @classmethod
        def from_csv(cls, filename):
            # Public factory: skips __init__ and always loads as csv
            self = cls.__new__(cls)
            self._from_csv(filename)
            return self

        def _from_csv(self, filename):
            # Private loader shared by __init__ and from_csv
            self.file = "I got {0} from csv!".format(filename)

        def __repr__(self):
            return "<{0.__class__.__name__}({0.file})>".format(self)

    FromFile("this.file")
    FromFile("this.csv")
    FromFile.from_csv("this.is")

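With more than one format, the if/else chain in __init__ can be replaced by a lookup table. This is only a sketch of one way to arrange it, not the only option: the _loaders dictionary, the ".xlsx" extension, and the use of os.path.splitext are assumptions made for this illustration, and the loaders are still the same kind of placeholders as above.

```python
import os


class FromFile(object):
    def __init__(self, filename):
        # Behave like from_any: look up a loader by file extension
        ext = os.path.splitext(filename)[1].lower()
        loader = self._loaders.get(ext)
        if loader is None:
            self.file = "Not a valid format"
        else:
            loader(self, filename)

    def _from_csv(self, filename):
        self.file = "I got {0} from csv!".format(filename)

    def _from_excel(self, filename):
        self.file = "I got {0} from excel!".format(filename)

    # Hypothetical extension table; the values are plain functions,
    # so __init__ passes self explicitly when calling them.
    _loaders = {".csv": _from_csv, ".xlsx": _from_excel}

    @classmethod
    def from_csv(cls, filename):
        self = cls.__new__(cls)
        self._from_csv(filename)
        return self

    @classmethod
    def from_excel(cls, filename):
        self = cls.__new__(cls)
        self._from_excel(filename)
        return self

    def __repr__(self):
        return "<{0.__class__.__name__}({0.file})>".format(self)


print(FromFile("this.csv"))            # <FromFile(I got this.csv from csv!)>
print(FromFile("this.file"))           # <FromFile(Not a valid format)>
print(FromFile.from_excel("this.is"))  # <FromFile(I got this.is from excel!)>
```

Each format still needs its pair of functions (a private loader and a public classmethod), but adding a format no longer means touching __init__.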
__new__ Method
One other way we could do this would be to change __new__ itself. This is
technically what we want anyway; __new__ is where new instances of a class get
created.
The downside to this method is that we are limited in our use of __init__ with
arguments (if you have an __init__, it needs to accept everything that
__new__ accepts). __init__ will run only when we call the class normally, so
it may or may not run when a factory function is called. This forces you to put
any common init code in a separate function and call it manually from each
factory function, and from __new__ too, if necessary.

    class FromFile(object):
        def __new__(cls, filename):
            if filename is None:
                return super(FromFile, cls).__new__(cls)
            elif filename[-3:] == "csv":
                # Dispatch to the matching factory
                return cls.from_csv(filename)
            else:
                self = super(FromFile, cls).__new__(cls)
                self.file = "Not a valid format"
                self._common_init_code()
                return self

        def __init__(self, *args, **kargs):
            # Runs after __new__ only when the class itself is called
            print("This was called directly with", *args)

        def _common_init_code(self):
            print("All correct creations of this class will print this line!")

        @classmethod
        def from_csv(cls, filename):
            # Bypass FromFile.__new__ so we do not dispatch again
            self = super(FromFile, cls).__new__(cls)
            self.file = "I got {0} from csv!".format(filename)
            self._common_init_code()
            return self

        def __repr__(self):
            return "<{0.__class__.__name__}({0.file})>".format(self)

    FromFile("this.file")
    FromFile("this.csv")
    FromFile.from_csv("this.is")

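The "may or may not run" behaviour of __init__ is easy to lose track of, so here is a stripped-down sketch that records whether it ran. The Tracked class, its from_value factory, and the init_ran flag are hypothetical names used only for this demonstration.

```python
class Tracked(object):  # hypothetical class, for illustration only
    def __new__(cls, value):
        # Calling the class dispatches to the factory
        return cls.from_value(value)

    def __init__(self, value):
        # Python calls this after __new__ returns an instance of cls
        self.init_ran = True

    @classmethod
    def from_value(cls, value):
        self = super(Tracked, cls).__new__(cls)
        self.value = value
        self.init_ran = False
        return self


a = Tracked(1)             # __new__ -> from_value, then __init__
b = Tracked.from_value(1)  # from_value only, __init__ never runs

print(a.init_ran)  # True
print(b.init_ran)  # False
```

Both objects come out of the same factory, but only the one created by calling the class has been through __init__, which is why shared setup belongs in a helper that every path calls.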
There are other ways to generate objects of certain classes; subclassing is a
valid method, or using a factory function, or even metaclasses. (For
metaclasses, this article is hard to beat.) Several of these methods cause the
type of the resulting object not to match the callable used to create it (for
example, numpy.array() returns a numpy.ndarray), but they are still commonly
used.
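For comparison, a plain factory function might look like the sketch below. The lowercase function name vector is hypothetical, chosen here to mirror the numpy.array / numpy.ndarray naming pattern.

```python
class Vector(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z


def vector(values):  # hypothetical module-level factory function
    # Unpack any 3-item sequence into a Vector
    x, y, z = values
    return Vector(x, y, z)


v = vector([1, 2, 3])
print(type(v).__name__)  # Vector
```

Unlike a classmethod factory, this function always builds a Vector, so subclasses do not pick it up automatically.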