3
2

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 5 years have passed since last update.

EP 24 Use `@classmethod` Polymorphism to Construct Objects Generically

Last updated at Posted at 2016-04-13
  • Python only supports a single constructor per class, the __init__ method
  • Use @classmethod to define alternative constructors for your classes.
  • Use class method polymorphism to provide generic ways to build and connect concrete subclasses.

Effective Python

Write MapReduce

Bad example.

Though make base class for InputData and Woker, code are sticked to each other in create_worker and generate_inputs. If you want to use other class for mapreduce, you have to rewrite your code, which is not ideal.

import os
from threading import Thread

class InputData:
    def read(self):
        raise NotImplementedError

class PathInputData(InputData):
    def __init__(self, path):
        super().__init__()
        self.path = path

    def read(self):
        return open(self.path).read()

class Worker:
    def __init__(self, input_data):
        self.input_data = input_data
        self.result = None

    def map(self):
        raise NotImplementedError

    def reduce(self):
        raise NotImplementedError

class LineCountWoker(Worker):
    def map(self):
        data = self.input_data.read()
        self.result = data.count('\n')

    def reduce(self, other):
        self.result += other.result

def generate_inputs(data_dir):
    for name in os.listdir(data_dir):
        yield PathInputData(os.path.join(data_dir, name))

def create_worker(input_list):
    workers = []
    for input_data in input_list:
        workers.append(LineCountWoker(input_data))
    return workers


def execute(workers):
    threads = [Thread(target=w.map) for w in workers]
    for thread in threads: thread.start()
    for thread in threads: thread.join()

    first, rest = workers[0], workers[1]
    for workers in rest:
        first.reduce(worker)
    return first.result

def mapreduce(data_dir):
    inputs = generate_inputs(data_dir)
    workers = create_worker(inputs)
    return execute(workers)


with TemporaryDirectory() as tmpdir:
    write_test_files(tmpdir)
    result = mapreduce(tmpdir)


Good practice

The main difference here is using @classmethod for common constructor interface. It looks like factory pattern. A mapreduce function accepts classes and instantiate them by calling @classmethod, which enable modulable and untidy code.

import os
from threading import Thread

class GenericInputData:
    def read(self):
        raise NotImplementedError

    @classmethod
    def generate_inputs(cls, config):
        raise NotImplementedError

class PathInputData(GenericInputData):
    def __init__(self, path):
        super().__init__()
        self.path = path

    def read(self):
        return open(self.path).read()

    @classmethod
    def generate_inputs(cls, config):
        data_dir = config['data_dir']
        for name in os.listdir(data_dir):
            yield cls(os.path.join(data_dir, name))

class GenericWorker:
    def __init__(self, input_data):
        self.input_data = input_data
        self.result = None

    def map(self):
        raise NotImplementedError

    def reduce(self):
        raise NotImplementedError

    @classmethod
    def create_workers(cls, input_class, config):
        workers = []
        for input_data in input_class.generate_inputs(config):
            workers.append(cls(input_data))
        return workers

class LineCountWoker(GenericWorker):
    def map(self):
        data = self.input_data.read()
        self.result = data.count('\n')

    def reduce(self, other):
        self.result += other.result

def mapreduce(worker_class, input_class, config):
    workers = worker_class.create_workers(input_class, config)
    return execute(workers)

def execute(workers):
    threads = [Thread(target=w.map) for w in workers]
    for thread in threads: thread.start()
    for thread in threads: thread.join()

    first, rest = workers[0], workers[1]
    for workers in rest:
        first.reduce(worker)
    return first.result


with TemporaryDirectory() as tmpdir:
    write_test_files(tmpdir)
    config = {'data_dir': tmpdir}
    result = mapreduce(LineCountWoker, PathInputData, config)


Polymorphism

If you think about the Greek roots of the term, it should become obvious.

Poly = many: polygon = many-sided, polystyrene = many styrenes (a), polyglot = many languages, and so on.
Morph = change or form: morphology = study of biological form, Morpheus = the Greek god of dreams able to take any form.
So polymorphism is the ability (in programming) to present the same interface for differing underlying forms (data types).

For example, integers and floats are implicitly polymorphic since you can add, subtract, multiply and so on, irrespective of the fact that the types are different. They're rarely considered as objects in the usual term.

But, in that same way, a class like BigDecimal or Rational or Imaginary can also provide those operations, even though they operate on different data types.

The classic example is the Shape class and all the classes that can inherit from it (square, circle, dodecahedron, irregular polygon, splat and so on).

With polymorphism, each of these classes will have different underlying data. A point shape needs only two co-ordinates (assuming it's in a two-dimensional space of course). A circle needs a center and radius. A square or rectangle needs two co-ordinates for the top left and bottom right corners (and possibly) a rotation. An irregular polygon needs a series of lines.

And, by making the class responsible for its code as well as its data, you can achieve polymorphism. In this example, every class would have its own Draw() function and the client code could simply do:

shape.Draw()

to get the correct behavior for any shape.

This is in contrast to the old way of doing things in which the code was separate from the data, and you would have had functions such as drawSquare() and drawCircle().

Object orientation, polymorphism and inheritance are all closely-related concepts and they're vital to know. There have been many "silver bullets" during my long career which basically just fizzled out but the OO paradigm has turned out to be a good one. Learn it, understand it, love it - you'll be glad you did :-)

(a) I originally wrote that as a joke but it turned out to be correct and, therefore, not that funny. The momomer styrene happens to be made from carbon and hydrogen, C8H8, and polystyrene is made from groups of that, (C8H8)n.

Perhaps I should have stated that a polyp was many occurrences of the letter p although, now that I've had to explain the joke, even that doesn't seem funny either.

3
2
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
3
2

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?