Shape Creation#
This tutorial walks you through the process of creating a new shape for use as a target in the morphing process.
Create a class for the shape#
All Data Morph shapes are defined as classes inside the shapes
subpackage.
In order to register a new target shape for the CLI, you will need to fork and clone
the Data Morph repository, and then add
a class defining your shape.
Select the appropriate base class#
Data Morph uses a hierarchy of shapes that all descend from an abstract base class (Shape), which defines the basics of how a shape needs to behave (i.e., it must have a distance() method and a plot() method).
Any new shape must inherit from Shape or one of its child classes:
- If your shape is composed of lines, inherit from LineCollection (e.g., Star).
- If your shape is composed of points, inherit from PointCollection (e.g., Heart).
- If your shape isn’t composed of lines or points, you can inherit directly from Shape (e.g., Circle). Note that in this case you must define both the distance() and plot() methods (this is done for you if you inherit from LineCollection or PointCollection).
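To make the contract concrete, here is a minimal, self-contained sketch of an abstract base class that requires distance() and plot(), with a circle-like child that defines both. SketchShape and SketchCircle are hypothetical stand-ins written for illustration; the method signatures are assumptions and this is not Data Morph's actual Shape or Circle code.

```python
import math
from abc import ABC, abstractmethod


class SketchShape(ABC):
    """Hypothetical stand-in for an abstract shape base class."""

    @abstractmethod
    def distance(self, x: float, y: float) -> float:
        """Return the distance from the point (x, y) to the shape."""

    @abstractmethod
    def plot(self) -> str:
        """Describe the shape (a real shape would draw itself instead)."""


class SketchCircle(SketchShape):
    """Neither lines nor points, so it must define both methods itself."""

    def __init__(self, cx: float, cy: float, r: float) -> None:
        self.cx, self.cy, self.r = cx, cy, r

    def distance(self, x: float, y: float) -> float:
        # Distance to the circle's edge, not to its center.
        return abs(math.hypot(x - self.cx, y - self.cy) - self.r)

    def plot(self) -> str:
        return f'circle centered at ({self.cx}, {self.cy}) with radius {self.r}'
```

Because both methods are abstract, Python refuses to instantiate SketchShape directly, or any child class that leaves one of them undefined.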
Define the scale and placement of the shape based on the dataset#
Each shape will be initialized with a Dataset instance. Use the dataset to determine where in the xy-plane the shape should be placed and also to scale it to the data. If you take a look at the existing shapes, you will see that they use various bits of information from the dataset, such as the automatically calculated bounds (e.g., Dataset.data_bounds, which forms the bounding box of the starting data, and Dataset.morph_bounds, which defines the limits of where the algorithm can move the points) or percentiles computed from the data itself (see Dataset.df). For example, the XLines shape inherits from LineCollection and uses the morph bounds (Dataset.morph_bounds) to calculate its position and scale:
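class XLines(LineCollection):
    # Override the default name, which would otherwise be 'xlines'.
    name = 'x'

    def __init__(self, dataset: Dataset) -> None:
        xmin, xmax = dataset.morph_bounds.x_bounds
        ymin, ymax = dataset.morph_bounds.y_bounds

        # Two diagonal lines that cross to form an X spanning the morph bounds.
        super().__init__([[xmin, ymin], [xmax, ymax]], [[xmin, ymax], [xmax, ymin]])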
Since we inherit from LineCollection here, we don’t need to define the distance() and plot() methods (unless we want to override them). We do set the name attribute here since the default will result in a value of xlines, and x makes more sense for use in the documentation (see ShapeFactory).
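As a rough illustration of what a line-based shape gets from its bounds, the sketch below builds the two diagonals of a bounding box and measures how far a point is from the nearest one. The helper names (x_segments, distance_to_segment, distance_to_lines) are hypothetical, and the point-to-segment calculation is a standard geometric approach assumed for illustration, not necessarily how LineCollection implements distance().

```python
import math

Point = tuple[float, float]
Segment = tuple[Point, Point]


def x_segments(x_bounds: Point, y_bounds: Point) -> list[Segment]:
    """Build the two diagonals of the bounding box, forming an X."""
    xmin, xmax = x_bounds
    ymin, ymax = y_bounds
    return [((xmin, ymin), (xmax, ymax)), ((xmin, ymax), (xmax, ymin))]


def distance_to_segment(point: Point, segment: Segment) -> float:
    """Return the shortest distance from a point to a line segment."""
    px, py = point
    (ax, ay), (bx, by) = segment
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project the point onto the line, then clamp to the segment's extent.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_lines(point: Point, segments: list[Segment]) -> float:
    """Return the distance from a point to the nearest segment."""
    return min(distance_to_segment(point, segment) for segment in segments)


segments = x_segments((0, 10), (0, 10))
print(distance_to_lines((5, 5), segments))  # 0.0, since the diagonals cross here
```

Because the endpoints come straight from the bounds, the same code yields a larger or smaller X for a different dataset without any other changes.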
Register the shape#
For the data-morph CLI to find your shape, you need to register it with the ShapeFactory:
1. Add your shape class to the appropriate module inside the src/data_morph/shapes/ directory. Note that these correspond to the type of shape (e.g., use src/data_morph/shapes/points/<your_shape>.py for a new shape inheriting from PointCollection).
2. Add your shape to __all__ in that module’s __init__.py (e.g., use src/data_morph/shapes/points/__init__.py for a new shape inheriting from PointCollection).
3. Add an entry to the ShapeFactory._SHAPE_CLASSES tuple in src/data_morph/shapes/factory.py, preserving alphabetical order.
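The sketch below shows why registration matters: a factory can only look up shapes that appear in its tuple of classes, keyed by each shape's name. SketchShape, SketchFactory, get_name(), and create() are hypothetical names chosen for illustration, and the Heart and XLines classes here are empty stand-ins; the real ShapeFactory may be structured differently.

```python
class SketchShape:
    """Hypothetical stand-in for a shape base class (not Data Morph's Shape)."""

    name = None  # subclasses may override this, as XLines does with 'x'

    @classmethod
    def get_name(cls) -> str:
        # Fall back to the lowercased class name when no name is set.
        return cls.name or cls.__name__.lower()


class Heart(SketchShape):
    """Uses the default name, 'heart'."""


class XLines(SketchShape):
    """Overrides the default name, which would otherwise be 'xlines'."""

    name = 'x'


class SketchFactory:
    """Hypothetical registry that looks shapes up by name."""

    # Only classes listed here can be found by name.
    _SHAPE_CLASSES = (Heart, XLines)

    def __init__(self) -> None:
        self._by_name = {cls.get_name(): cls for cls in self._SHAPE_CLASSES}

    def create(self, name: str) -> SketchShape:
        try:
            return self._by_name[name]()
        except KeyError:
            raise ValueError(f'No registered shape named {name!r}') from None


factory = SketchFactory()
print(sorted(factory._by_name))  # ['heart', 'x']
```

A class that exists in the package but is missing from the tuple would raise the ValueError here, which is the kind of failure that skipping the registration step leads to.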
Test out the shape#
Defining how your shape should be generated from the input dataset will require a few iterations. Be sure to test out your shape on different datasets:
$ data-morph --start-shape panda music soccer --target-shape <your shape>
Some shapes will work better on certain datasets, and that’s fine. However, if your shape only works well on one of the built-in datasets (see the DataLoader), then you need to keep tweaking your implementation.
(Optional) Contribute the shape#
If you think that your shape would be a good addition to Data Morph, create an issue in the Data Morph repository proposing its inclusion. Be sure to consult the contributing guidelines before doing so.
If and only if you are given the go-ahead:
1. Prepare a docstring for your shape following what the other shapes have. Be sure to change the plotting code in the docstring to use your shape.
2. Add test cases for your shape to the tests/shapes/ directory.
3. Submit your pull request.
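As a rough idea of what shape tests might check, the sketch below defines a small point-based stand-in and three pytest-style test functions: distance is zero on the shape, distance is correct off the shape, and the shape scales with the bounds it is given. SketchPoints and the test names are hypothetical; the existing tests in tests/shapes/ are the authoritative guide to the conventions the project actually uses.

```python
import math


class SketchPoints:
    """Hypothetical stand-in for a point-based shape."""

    def __init__(self, x_bounds, y_bounds) -> None:
        (xmin, xmax), (ymin, ymax) = x_bounds, y_bounds
        cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
        # One point at the center and one at each corner of the bounds.
        self.points = [
            (cx, cy),
            (xmin, ymin),
            (xmin, ymax),
            (xmax, ymin),
            (xmax, ymax),
        ]

    def distance(self, x: float, y: float) -> float:
        # Distance to the nearest point in the shape.
        return min(math.hypot(x - px, y - py) for px, py in self.points)


def test_distance_is_zero_on_shape():
    shape = SketchPoints((0, 10), (0, 20))
    assert all(shape.distance(px, py) == 0 for px, py in shape.points)


def test_distance_off_shape():
    shape = SketchPoints((0, 10), (0, 20))
    # (5, 13) is 3 units above the center point at (5, 10).
    assert math.isclose(shape.distance(5, 13), 3)


def test_shape_scales_with_bounds():
    small = SketchPoints((0, 1), (0, 1))
    large = SketchPoints((0, 100), (0, 100))
    assert max(x for x, _ in large.points) == 100 * max(x for x, _ in small.points)
```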