MikuMikuDance (MMD) is a free 3D animation software package that is very popular in Japan. Since the software was first released 11 years ago, it has gained a large and devoted user base. Its simple user interface has allowed amateurs to quickly pick up 3D animation. It has inspired modelers to create thousands of models of anime-style characters, animators to generate hundreds of elaborate dance motions, and video creators to upload hundreds of thousands of 3D animations made with the software to the Internet.
While MMD has had a great impact on the Japanese 3D animation community, its closed-source nature limits the usefulness of the software itself and its associated data. This is not to say that there is a lack of tools to manipulate the data. There are MMD-compatible closed-source animation authoring tools such as MikuMikuMoving and nanoem. There are also open-source viewers and libraries such as MikuMikuFlex, Mikudayo, saba, and mmd-viewer-js. Data exporters to other platforms and formats such as Unity, FBX, and Blender exist, and a variety of other tools are listed at the VPVP Wiki. What I feel is lacking, however, is a conceptual understanding of the software. As a programmer, it is hard to develop novel tools that work with MMD data without understanding how the software works. This article aims to address this problem.
I have been struggling for a long time to develop tools that use MMD data. Recently, I felt that I had succeeded in producing animated meshes that are close to the original software's output. As a result, I would like to share the knowledge I have gained so that other people can do the same. In this series of articles, I describe how to compute the shape of an animated MMD model. This is the process that couples character models, stored in PMD/PMX files, with motions, stored in VMD files, to produce a dancing character. Clearly, this process is the core of MMD's functionality. Unfortunately, I have never seen it clearly documented anywhere.
This article is the first in a series of 3 articles. Here, we focus on morphing and skinning, which is the process that transforms the model mesh according to a given pose. The next article will be about forward and inverse kinematics, and the last article will be about physical simulation.
Disclaimer. The algorithms and formulas described in this article are by no means the same as or compatible with the ones in the original software. They are also not my original work as I based my implementation on MikuMikuFlex, an MMD library for the .Net Framework, and nagadomi's code.
Terminology
Before we can start talking about how MikuMikuDance animation works, we need to establish some common terms that will be used throughout the article.
A model is a collection of data that describes a shape that can be animated. In MMD, a model is stored in a file with .pmd or .pmx extension. (We refer to such a file as the PMD file or the PMX file, based on the extension.) A model is typically a humanoid character, but it can also be a scene prop or the stage itself. This article, however, focuses on animating humanoid characters.
A pose is a configuration of how different parts of a model are positioned relative to one another. A pose of an MMD model is specified by a collection of numbers, which we will describe in detail in later articles. A pose can be saved in a file with extension .vpd (VPD file).
The pose of a model when you first load it from a file is called the binding pose. This is the pose that the model is in when an MMD user does not specify anything. For a humanoid character, this is the pose of the character standing straight with the arms extended to the side but angled downward.
A motion is a continuous sequence of poses through time. Typically, a motion is stored in a file with .vmd extension (VMD file). However, the data stored this way is not complete. A typical VMD motion only specifies where the ankles are but not how the leg and knee joints bend. Moreover, movement of parts such as hair and cloth is controlled by physical simulation, and MMD does not save such movement in VMD files. One of the aims of this article is to describe how to complete the missing data in VMD files so that we can watch a fully animated model.
Posing a Model
Let us now dig deeper into how an MMD model and its pose are specified. As is typical of models for real-time computer graphics, an MMD model is a triangle mesh. It consists of a set of 3D points, called vertices, and triangles formed by connecting combinations of 3 of these points. A vertex has many attributes, including its 3D position and normal vector (aka "normal"). The data in a PMD/PMX file specify the positions and normals for when the model is in the binding pose. When a model is animated, positions and normals change while other attributes remain the same. The main aim of this article is to describe how to compute the changing positions and normals.
To ease later algorithmic discussion, we now define some mathematical notation. Suppose that we are dealing with an arbitrary vertex in the mesh. Let:
- $\mathrm{p}^{bd}$ denote the 3D position of the vertex in the binding pose, and
- $\mathrm{n}^{bd}$ denote the normal vector of the vertex in the binding pose.
For brevity, we will refer to the vertex position and normal in the binding pose as "binding position" and "binding normal," respectively.
In MMD, vertices can change through two mechanisms: morphing and skinning.
Morphing
An MMD model comes with a set of morphs. A morph is a predefined change to some parts of a model. Each morph has an associated number that varies from $0$ to $1$ called its weight. When the weight is $0$, the morph is completely inactive, and its change is not imposed on the model. When the weight is $1$, the morph is fully active, and its change is in full effect. By default, the weight is $0$.
The PMD file format supports one morph type, but the PMX file format supports five. This article only discusses the morph type supported by both: the vertex morph. As the name indicates, a vertex morph modifies vertex positions. Vertex morphs are mainly used to implement facial expressions. A typical MMD character model contains vertex morphs that make the character wink and open its mouth as if to pronounce the five Japanese vowels.
Mathematically, let us say that a model has $n_{vm}$ vertex morphs. The $j$th morph specifies, for each vertex, the displacement vector $\mathrm{d}_j$ that is added to the binding position if the morph is in full effect. A pose must specify a weight $\omega_j$ for each morph. With these data in place, the position of the vertex after morphing, denoted by $\mathrm{p}^{vm}$, is given by: $$ \mathrm{p}^{vm} = \mathrm{p}^{bd} + \sum_{0 \leq j < n_{vm}} \omega_j \mathrm{d}_j.$$ Note that normal vectors are not changed by morphing.
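As a concrete illustration, here is a minimal Python sketch of this formula for a single vertex. The data layout and function name are my own, not MMD's: in the real file format, displacements are stored per morph, not per vertex.

```python
def apply_vertex_morphs(binding_position, displacements, weights):
    """Compute p^vm = p^bd + sum_j w_j * d_j for one vertex.

    binding_position: (x, y, z) of the vertex in the binding pose.
    displacements: list of per-morph displacement vectors d_j for this vertex.
    weights: list of morph weights w_j, each in [0, 1].
    """
    x, y, z = binding_position
    for (dx, dy, dz), w in zip(displacements, weights):
        x += w * dx
        y += w * dy
        z += w * dz
    return (x, y, z)
```

For example, a vertex at the origin with two morphs, one half-active and one fully active, ends up at the weighted sum of their displacements.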
In practice, the displacement vectors are zero for most vertices. So, MMD only stores non-zero vectors in the PMD/PMX files. Moreover, if a morph's weight is not specified in a VPD or VMD file, its weight is zero by default.
Skinning
MMD is a skeletal animation system. In addition to the aforementioned triangle mesh, an MMD model has a skeleton: a stick figure that abstracts the positions of mesh parts relative to one another. A skeleton contains a number of points, called bones, each of which can influence a particular subset of vertices. Skinning is the process of incorporating the influence of bones into vertex positions and normals. It is carried out after morphing.
To skin a mesh, MMD performs a sequence of complex calculations to compute a linear transformation (i.e., a $4 \times 4$ matrix) associated with each bone. (Don't worry. We will discuss the complex calculations in detail later.) Let us call these transformations the skinning transformations and denote the skinning transformation of the $k$th bone with $M_k$.
In MMD, a skinning transformation is a rotation followed by a translation: $$ M_k = T_k R_k.$$ As a result, the transformation can be represented by a vector $\mathrm{t}_k$ for the translation part and a quaternion $\mathrm{q}_k$ for the rotation part. The relationship between the matrices and their succinct representations is as follows. If $\mathrm{t}_k = (t_x, t_y, t_z)$, then
\begin{align*}
T_{k}
= \begin{bmatrix}
1 & & & t_x \\
& 1 & & t_y \\
& & 1 & t_z \\
& & & 1
\end{bmatrix},
\end{align*}
and, if $\mathrm{q}_k = q_w + q_x \mathrm{i} + q_y \mathrm{j} + q_z \mathrm{k}$, then
\begin{align}
R_k = \begin{bmatrix}
1 - 2q_y^2 - 2q_z^2 &
2q_x q_y - 2q_z q_w &
2 q_x q_z + 2 q_y q_w &
0 \\
2 q_x q_y + 2 q_z q_w &
1 - 2q_x^2 - 2q_z^2 &
2 q_y q_z - 2 q_x q_w &
0 \\
2 q_x q_z - 2 q_y q_w &
2 q_y q_z + 2 q_x q_w &
1 - 2 q_x^2 - 2 q_y^2 &
0 \\
0 & 0 & 0 & 1
\end{bmatrix}.
\qquad \qquad (1)
\end{align}
The second equation is just the standard quaternion-to-matrix conversion. You can find a code snippet that does this here.
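For reference, here is a small Python version of the conversion in Equation (1) (the function name is illustrative; the matrix is row-major):

```python
def quat_to_matrix(qw, qx, qy, qz):
    """Convert a unit quaternion q = qw + qx*i + qy*j + qz*k into the
    4x4 rotation matrix of Equation (1), as a row-major nested list."""
    return [
        [1 - 2*qy*qy - 2*qz*qz, 2*qx*qy - 2*qz*qw,     2*qx*qz + 2*qy*qw,     0.0],
        [2*qx*qy + 2*qz*qw,     1 - 2*qx*qx - 2*qz*qz, 2*qy*qz - 2*qx*qw,     0.0],
        [2*qx*qz - 2*qy*qw,     2*qy*qz + 2*qx*qw,     1 - 2*qx*qx - 2*qy*qy, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
```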
Each vertex in the model is under the influence of at most 4 bones. Say a vertex is influenced by Bones $k_0$, $k_1$, $k_2$, and $k_3$. The degree to which each bone influences the vertex is given by the skinning weights $w_0$, $w_1$, $w_2$, and $w_3$. Here, $w_0$ tells how much Bone $k_0$ influences the vertex, and the same goes for $w_1$, $w_2$, and $w_3$. The bone indices $k_0$, $\dotsc$, $k_3$ and the weights $w_0$, $\dotsc$, $w_3$ are specified in the model file for each vertex.
Vertices in an MMD model can be skinned by three different algorithms.
- Linear blend skinning: the standard skinning algorithm in computer graphics.
- SDEF: a proprietary skinning algorithm.
- Dual quaternion skinning: a more advanced skinning algorithm invented by Kavan et al. in 2007 [*].
The PMD file format only supports linear blend skinning. SDEF is available in PMX file format version 2.0 (and possibly earlier versions), and dual quaternion skinning was introduced in PMX version 2.1. The PMX file specifies which skinning algorithm is used for each vertex. (There is no such specification in the PMD file because every vertex uses linear blend skinning.) MMD itself only supports the first two algorithms, while other programs such as MikuMikuMoving and PmxEditor support all three.
Note also that the MMD community refers to the above algorithms with its own acronyms. Linear blend skinning's acronym is "BDEF," and dual quaternion skinning's acronym is "QDEF."
We will now discuss the algorithms, one by one.
Linear Blend Skinning (BDEF)
Linear blend skinning is generally the first skinning algorithm taught in any computer graphics course. Because of this, we will only discuss it briefly here. These slides from Stanford University are a good refresher.
The algorithm combines the skinning transformations of the four influencing bones into one final skinning matrix as follows:
\begin{align*}
M
= \sum_{j=0}^3 w_j M_{k_j}
= w_0 M_{k_0}
+ w_1 M_{k_1}
+ w_2 M_{k_2}
+ w_3 M_{k_3}.
\end{align*}
The final position and normal vector are then given by:
\begin{align*}
\mathrm{p}^{fin} &= M \mathrm{p}^{vm} \\
\mathrm{n}^{fin} &=
\mathrm{normalize}( M^{-T} \mathrm{n}^{bd} ).
\end{align*}
Here, $M^{-T}$ denotes the inverse transpose of $M$, and the $\mathrm{normalize}(\cdot)$ function does what its name states to a vector. Note that we need to renormalize because $M$ may not preserve vector lengths.
In the PMX file format specification, linear blend skinning is referred to as "BDEF." MMD supports three variants: BDEF1, BDEF2, and BDEF4. The number in the name of each variant is simply the number of bones that influence the vertex. There is no change to the underlying calculation. In other words, we can think of BDEF1 as the linear blend skinning algorithm applied to the case where $w_1 = w_2 = w_3 = 0$. Similarly, $w_2 = w_3 = 0$ for BDEF2, and all weights may be non-zero for BDEF4.
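The blending step can be sketched in a few lines of Python. This is an illustrative implementation of the matrix sum above, with 4x4 row-major matrices; a real implementation would also need the inverse transpose of $M$ for normals:

```python
def blend_matrices(matrices, weights):
    """M = sum_j w_j * M_{k_j}: element-wise weighted sum of 4x4 matrices."""
    M = [[0.0] * 4 for _ in range(4)]
    for mat, w in zip(matrices, weights):
        for r in range(4):
            for c in range(4):
                M[r][c] += w * mat[r][c]
    return M

def transform_point(M, p):
    """Apply M to the point p = (x, y, z), treated as (x, y, z, 1)."""
    x, y, z = p
    return tuple(M[r][0]*x + M[r][1]*y + M[r][2]*z + M[r][3] for r in range(3))
```

For instance, blending an identity bone with a bone translated by $(2, 0, 0)$ at equal weights moves every vertex halfway.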
SDEF
SDEF is a proprietary skinning algorithm attributed to Sasaki Yuhri (佐々木優理) [*]. The name is short for "spherical deform" (スフィリカルデフォーム) [*]. It was designed to solve a common problem with linear blend skinning: the skinned mesh loses volume around folding joints such as knees and elbows.
As far as I know, the complete specification of the algorithm is not publicly available. Sasaki implemented it in a 3D animation program called Mikoto [*]. Currently, mqdl seems to be the most knowledgeable person on the subject, having extended the algorithm and implemented it in their 3D modeling program, XISMO. Other programmers have tried to reproduce the algorithm and made their code public. MikuMikuFlex's implementation is based on sam42's code. My description and implementation are based on nagadomi's code.
Spherical Blend Skinning
SDEF seems to be based on the spherical blend skinning algorithm by Kavan and Zara [*]. The main differences between the former and the latter are:
- SDEF requires the vertex to be influenced by two bones while spherical blend skinning can handle an arbitrary number of influencing bones in principle. (However, more than two bones is generally impractical.)
- SDEF requires more extra vertex attributes than spherical blend skinning does.
We can think of SDEF (nagadomi's implementation of SDEF, to be exact) as an extension of spherical blend skinning with two influencing bones, so it is expedient to describe spherical blend skinning first. The main idea of the algorithm is to exploit the fact that each skinning transformation can be decomposed into a translation and a rotation. The rotational parts of the two transformations are blended together, and the result is used to transform a part of the vertex.
More precisely, we regard the vertex position (after morphing) $\mathrm{p}^{vm}$ as rotating around a center of rotation $\mathrm{C}$, specific to that vertex. For example, a vertex around an elbow would rotate around the elbow bone's position. We then decompose the vertex position into the center and the displacement from the center:
\begin{align*}
\mathrm{p}^{vm} = \mathrm{C} + (\mathrm{p}^{vm} - \mathrm{C}).
\end{align*}
The spherical blending algorithm transforms the two parts differently. For the center $\mathrm{C}$, it computes
\begin{align*}
M \mathrm{C}
\end{align*}
where $M$ is the matrix for linear blend skinning, restricted to two influencing bones:
\begin{align*}
M
= w_0 M_{k_0}
+ w_1 M_{k_1}.
\end{align*}
For the displacement $\mathrm{p}^{vm} - \mathrm{C}$, it blends together the rotational components of the two skinning matrices and uses the resulting rotation to transform the displacement vector. More precisely, recall that each skinning transformation can be represented by a translation vector and a quaternion. So, say:
- $M_{k_0}$ is represented by $\mathrm{t}_{k_0}$ and $\mathrm{q}_{k_0}$.
- $M_{k_1}$ is represented by $\mathrm{t}_{k_1}$ and $\mathrm{q}_{k_1}$.
We use the bone weights to blend the quaternions together. This can be done either by spherical linear interpolation (slerp) or by computing the linear combination of the quaternions and then normalizing. nagadomi's code chooses the latter and computes the blended quaternion $\mathrm{q}$ with the following pseudocode snippet:
if $\mathrm{q}_{k_0} \cdot \mathrm{q}_{k_1} < 0$ then
$\qquad \mathrm{q}_{k_1} := -\mathrm{q}_{k_1}$
endif
$\mathrm{q} := (w_0 \mathrm{q}_{k_0} + w_1 \mathrm{q}_{k_1}) / \| w_0 \mathrm{q}_{k_0} + w_1 \mathrm{q}_{k_1} \|$
Then, we convert the blended quaternion $\mathrm{q}$ into a rotation matrix $R$ with Equation (1). This allows us to rotate the displacement vector as follows:
\begin{align*}
R (\mathrm{p}^{vm} - \mathrm{C}).
\end{align*}
The result is then added to the transformed center to get the final vertex position:
\begin{align}
\mathrm{p}^{fin} &= M \mathrm{C} + R (\mathrm{p}^{vm} - \mathrm{C}). \qquad \qquad (2)
\end{align}
This completes the description of the spherical blend skinning algorithm.
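The quaternion blending step above (normalized linear interpolation with a hemisphere sign flip) can be sketched in Python as follows. Quaternions are stored as $(w, x, y, z)$ tuples; the function name is illustrative:

```python
import math

def blend_quats(q0, q1, w0, w1):
    """Blend two quaternions with weights w0, w1, flipping the sign of q1
    when the quaternions lie in opposite hemispheres, then normalizing."""
    if sum(a * b for a, b in zip(q0, q1)) < 0:
        q1 = tuple(-c for c in q1)
    q = tuple(w0 * a + w1 * b for a, b in zip(q0, q1))
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)
```

Note that without the sign flip, blending $\mathrm{q}$ with $-\mathrm{q}$ (which represent the same rotation) would cancel to zero.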
Let us remind ourselves that the algorithm requires an extra vertex attribute $\mathrm{C}$ to be specified for each vertex in the model. In the Kavan-Zara paper, $\mathrm{C}$ is the solution to a least-squares optimization problem, which becomes very complex when there are three or more influencing bones. However, it is very simple when there are only two bones that have a parent-child relationship. In this case, $\mathrm{C}$ is the position of the child bone because it is the point that does not move when the two bones rotate relative to each other.
I'd also like to point out that both $\mathrm{p}^{vm}$ and $\mathrm{C}$ are points in 3D. As a result, they should be converted into homogeneous coordinates with the 4th components being $1$ when doing the calculation of Equation (2). In other words, we may rewrite the equation as:
\begin{align*}
\mathrm{p}^{fin} &=
M \begin{bmatrix} C_x \\ C_y \\ C_z \\ 1 \end{bmatrix}
+ R \left(
\begin{bmatrix} p^{vm}_x \\ p^{vm}_y \\ p^{vm}_z \\ 1 \end{bmatrix}
- \begin{bmatrix} C_x \\ C_y \\ C_z \\ 1 \end{bmatrix}
\right)
= M \begin{bmatrix} C_x \\ C_y \\ C_z \\ 1 \end{bmatrix}
+ R
\begin{bmatrix} p^{vm}_x - C_x \\ p^{vm}_y - C_y \\ p^{vm}_z - C_z \\ 0 \end{bmatrix}.
\end{align*}
From the above equation, we can readily verify that $\mathrm{p}^{vm} - \mathrm{C}$ is a vector because its 4th component is $0$. It is also apparent that the resulting value, $\mathrm{p}^{fin}$, is a point.
The SDEF Algorithm
Disclaimer: The material in this section is my interpretation of the SDEF algorithm based on nagadomi's code. As I have said before, the algorithm is proprietary, and its official description has not been made public.
The SDEF algorithm introduces two more vertex parameters: $\mathrm{R}_0$ and $\mathrm{R}_1$, both of which are 3D points. According to mqdl's tweet, the two parameters are used to "dynamically adjust $\mathrm{C}$." However, I have yet to find a more precise definition, and, frankly, I don't understand what the parameters mean.
In nagadomi's code, the following sequence of calculations is performed:
- $\mathrm{R}_0$ and $\mathrm{R}_1$ are blended together using the bone weights.
\begin{align*}
\tilde{\mathrm{R}} = w_0 \mathrm{R}_0 + w_1 \mathrm{R}_1.
\end{align*}
- Two new centers $\mathrm{C}_0$ and $\mathrm{C}_1$ are computed:
\begin{align*}
\mathrm{C}_0 &= \mathrm{C} + \frac{\mathrm{R}_0 - \tilde{\mathrm{R}}}{2} \\
\mathrm{C}_1 &= \mathrm{C} + \frac{\mathrm{R}_1 - \tilde{\mathrm{R}}}{2} \\
\end{align*}
- Lastly, the final vertex position is computed as follows:
\begin{align}
\mathrm{p}^{fin}
= w_0 M_{k_0} \mathrm{C}_0 + w_1 M_{k_1} \mathrm{C}_1 + R (\mathrm{p}^{vm} - \mathrm{C}). \qquad \qquad (3)
\end{align}
The final normal vector is given by $\mathrm{n}^{fin} = R \mathrm{n}^{bd}$.
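The three steps above can be sketched in Python as follows. This is my own illustrative rendering of the calculation (matrices are 4x4 row-major; points are $(x, y, z)$ tuples; all names are mine), not nagadomi's actual code:

```python
def mat_apply(M, v, w_component):
    """Apply M to (x, y, z, w); w_component is 1.0 for points, 0.0 for vectors."""
    x, y, z = v
    return tuple(M[r][0]*x + M[r][1]*y + M[r][2]*z + M[r][3]*w_component
                 for r in range(3))

def sdef_position(M0, M1, R, w0, w1, C, R0, R1, p_vm):
    """Final vertex position per Equation (3)."""
    # R~ = w0*R0 + w1*R1, then C0 = C + (R0 - R~)/2 and C1 = C + (R1 - R~)/2.
    Rt = tuple(w0*a + w1*b for a, b in zip(R0, R1))
    C0 = tuple(c + (a - t)/2 for c, a, t in zip(C, R0, Rt))
    C1 = tuple(c + (b - t)/2 for c, b, t in zip(C, R1, Rt))
    disp = tuple(p - c for p, c in zip(p_vm, C))          # p^vm - C, a vector
    term0 = mat_apply(M0, C0, 1.0)
    term1 = mat_apply(M1, C1, 1.0)
    rot = mat_apply(R, disp, 0.0)
    return tuple(w0*a + w1*b + r for a, b, r in zip(term0, term1, rot))
```

With identity skinning transformations, the result reduces to $\mathrm{p}^{vm}$, as argued below.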
On the surface, Equation (3) looks quite different from Equation (2). However, we can show that they are very similar through further manipulation.
\begin{align*}
\mathrm{p}^{fin}
&= w_0 M_{k_0} \mathrm{C}_0 + w_1 M_{k_1} \mathrm{C}_1 + R (\mathrm{p}^{vm} - \mathrm{C}) \\
&= w_0 M_{k_0} \bigg( \mathrm{C} + \frac{\mathrm{R}_0 - \tilde{\mathrm{R}}}{2} \bigg)
+ w_1 M_{k_1} \bigg( \mathrm{C} + \frac{\mathrm{R}_1 - \tilde{\mathrm{R}}}{2} \bigg)
+ R (\mathrm{p}^{vm} - \mathrm{C}) \\
&= w_0 M_{k_0} \mathrm{C} + w_1 M_{k_1} \mathrm{C}
+ R (\mathrm{p}^{vm} - \mathrm{C})
+ w_0 M_{k_0} \frac{\mathrm{R}_0 - \tilde{\mathrm{R}}}{2}
+ w_1 M_{k_1} \frac{\mathrm{R}_1 - \tilde{\mathrm{R}}}{2} \\
&= (w_0 M_{k_0} + w_1 M_{k_1} ) \mathrm{C}
+ R (\mathrm{p}^{vm} - \mathrm{C})
+ w_0 M_{k_0} \frac{\mathrm{R}_0 - \tilde{\mathrm{R}}}{2}
+ w_1 M_{k_1} \frac{\mathrm{R}_1 - \tilde{\mathrm{R}}}{2} \\
&= M\mathrm{C}
+ R (\mathrm{p}^{vm} - \mathrm{C})
+ \bigg( w_0 M_{k_0} \frac{\mathrm{R}_0 - \tilde{\mathrm{R}}}{2}
+ w_1 M_{k_1} \frac{\mathrm{R}_1 - \tilde{\mathrm{R}}}{2} \bigg).
\end{align*}
That is, it is just Equation (2) with an extra term.
Actually, the extra term can be simplified further. Note that, when there are only two influencing bones, their weights typically add up to $1$: $w_0 + w_1 = 1$. As a result, we have
\begin{align*}
& w_0 M_{k_0} \frac{\mathrm{R}_0 - \tilde{\mathrm{R}}}{2}
+ w_1 M_{k_1} \frac{\mathrm{R}_1 - \tilde{\mathrm{R}}}{2} \\
&= w_0 M_{k_0} \frac{\mathrm{R}_0 - w_0 \mathrm{R}_0 - w_1 \mathrm{R}_1}{2}
+ w_1 M_{k_1} \frac{\mathrm{R}_1 - w_0 \mathrm{R}_0 - w_1 \mathrm{R}_1}{2} \\
&= w_0 M_{k_0} \frac{(1-w_0)\mathrm{R}_0 - w_1 \mathrm{R}_1}{2}
+ w_1 M_{k_1} \frac{(1 - w_1) \mathrm{R}_1 - w_0 \mathrm{R}_0}{2} \\
&= w_0 M_{k_0} \frac{w_1\mathrm{R}_0 - w_1 \mathrm{R}_1}{2}
+ w_1 M_{k_1} \frac{w_0 \mathrm{R}_1 - w_0 \mathrm{R}_0}{2} \\
&= \frac{w_0 w_1}{2} M_{k_0} (\mathrm{R}_0 - \mathrm{R}_1)
+ \frac{w_0 w_1}{2} M_{k_1} (\mathrm{R}_1 - \mathrm{R}_0) \\
&= \frac{w_0 w_1}{2} (M_{k_0} - M_{k_1}) (\mathrm{R}_0 - \mathrm{R}_1).
\end{align*}
So, Equation (3) becomes:
\begin{align}
\mathrm{p}^{fin}
= M\mathrm{C} + R (\mathrm{p}^{vm} - \mathrm{C})
+ \frac{w_0 w_1}{2} (M_{k_0} - M_{k_1}) (\mathrm{R}_0 - \mathrm{R}_1).
\qquad \qquad (4)
\end{align}
A property that all skinning algorithms should have is that, when the model is in the binding pose, the algorithm should not change the vertex position. More precisely, if $M_{k_0} = M_{k_1} = I$, then $\mathrm{p}^{fin} = \mathrm{p}^{vm}$. Equation (4) makes it apparent that this property holds for nagadomi's version of the SDEF algorithm: when $M_{k_0} = M_{k_1} = I$, the extra term vanishes, and $M = R = I$. The RHS thus reduces to $\mathrm{C} + (\mathrm{p}^{vm} - \mathrm{C}) = \mathrm{p}^{vm}$. I note that the property does not hold for sam42's version, so I am less confident that it is a correct interpretation.
Equation (4) also makes me more puzzled about the meaning of $\mathrm{R}_0$ and $\mathrm{R}_1$ because, after all, it seems that we don't need them both. The whole algorithm can be specified with just $\mathrm{R}_0 - \mathrm{R}_1$.
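We can also check the algebraic simplification numerically. The snippet below (my own check, not from any MMD implementation) verifies that the extra term of Equation (3) equals the simplified form in Equation (4) when $w_0 + w_1 = 1$. Because the term only involves differences of points, the translation parts of the matrices cancel, so 3x3 linear parts suffice:

```python
def mat_vec(M, v):
    """Apply a 3x3 row-major matrix to a 3-vector."""
    return tuple(sum(M[r][c] * v[c] for c in range(3)) for r in range(3))

M0 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]   # 90 deg about z
M1 = [[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]   # 90 deg about x
w0, w1 = 0.3, 0.7
R0, R1 = (1.0, 2.0, 3.0), (-1.0, 0.5, 2.0)

# Left side: w0*M0*(R0 - R~)/2 + w1*M1*(R1 - R~)/2.
Rt = tuple(w0*a + w1*b for a, b in zip(R0, R1))
lhs = tuple(
    w0*x + w1*y
    for x, y in zip(mat_vec(M0, tuple((a - t)/2 for a, t in zip(R0, Rt))),
                    mat_vec(M1, tuple((b - t)/2 for b, t in zip(R1, Rt)))))

# Right side: (w0*w1/2) * (M0 - M1) * (R0 - R1).
diff = tuple(a - b for a, b in zip(R0, R1))
rhs = tuple(w0*w1/2 * (x - y)
            for x, y in zip(mat_vec(M0, diff), mat_vec(M1, diff)))
```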
Lastly, I'd like to mention that the SDEF algorithm does not compute $\mathrm{C}$ in the same way that spherical blend skinning does. If it did, then $\mathrm{C}$ would be the same for all vertices influenced by the same pair of bones. However, I inspected some knee vertices of Tda-style Append Miku and found that the $\mathrm{C}$ value changes ever so slightly from vertex to vertex.
Dual Quaternion Skinning (QDEF)
Dual quaternion skinning was introduced by Kavan et al. in 2007. Being quite recent, it is not typically taught in computer graphics courses. Good material for learning about dual quaternion skinning is the 2007 paper itself.
The main idea of the paper is to use an algebraic structure called the dual quaternions to represent skinning transformations. The dual quaternions combine the quaternions with the dual numbers. While the former are well known to CG practitioners, the latter typically are not, so we introduce them here.
Dual Numbers
A dual number $\hat{a}$ is a number of the form $\hat{a} = a_0 + \varepsilon a_{\varepsilon}$, where $a_0$ and $a_\varepsilon$ are real numbers. The symbol $\varepsilon$ denotes a special number (in the same vein as $i$ of the complex numbers), called the dual unit, satisfying the identity $\varepsilon^2 = 0$. $a_0$ is called the real part of $\hat{a}$, and $a_\varepsilon$ is called the dual part. Addition and multiplication of dual numbers are defined in the usual way. The multiplicative inverse exists for any dual number whose real part is not zero:
\begin{align*}
\frac{1}{a_0 + \varepsilon a_\varepsilon} = \frac{1}{a_0} - \varepsilon \frac{a_\varepsilon}{a_0^2}.
\end{align*}
Moreover, the square root of a "positive" dual number is given by:
\begin{align*}
\sqrt{a_0 + \varepsilon a_\varepsilon}
= \sqrt{a_0} + \varepsilon \frac{a_\varepsilon}{2 \sqrt{a_0}}.
\end{align*}
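These rules are easy to encode. Below is a minimal dual-number sketch in Python (an illustrative class of my own, not part of any MMD library):

```python
import math

class Dual:
    """Dual number a0 + eps * ae with eps^2 = 0."""
    def __init__(self, a0, ae):
        self.a0, self.ae = a0, ae
    def __mul__(self, other):
        # (a0 + eps*ae)(b0 + eps*be) = a0*b0 + eps*(a0*be + ae*b0)
        return Dual(self.a0 * other.a0, self.a0 * other.ae + self.ae * other.a0)
    def inverse(self):
        # Defined whenever the real part a0 is non-zero.
        return Dual(1.0 / self.a0, -self.ae / (self.a0 * self.a0))
    def sqrt(self):
        # Defined for "positive" dual numbers (a0 > 0).
        r = math.sqrt(self.a0)
        return Dual(r, self.ae / (2.0 * r))
```

For example, the square root of $4 + 2\varepsilon$ is $2 + 0.5\varepsilon$, and multiplying a dual number by its inverse yields $1$.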
Dual Quaternions
A dual quaternion is just a quaternion whose four components are dual numbers instead of real numbers:
\begin{align*}
\hat{\mathrm{q}}
&= \hat{q}_w + \hat{q}_x \mathrm{i} + \hat{q}_y \mathrm{j} + \hat{q}_z \mathrm{k} \\
&= (q_{0,w} + \varepsilon q_{\varepsilon,w}) + (q_{0,x} + \varepsilon q_{\varepsilon,x}) \mathrm{i} + (q_{0,y} + \varepsilon q_{\varepsilon,y}) \mathrm{j} + (q_{0,z} + \varepsilon q_{\varepsilon,z}) \mathrm{k}.
\end{align*}
In the system of the dual quaternions, $\varepsilon$ commutes with $\mathrm{i}$, $\mathrm{j}$, and $\mathrm{k}$. As a result, we may also write:
\begin{align*}
\hat{\mathrm{q}}
&= (q_{0,w} + \varepsilon q_{\varepsilon,w}) + (q_{0,x} + \varepsilon q_{\varepsilon,x}) \mathrm{i} + (q_{0,y} + \varepsilon q_{\varepsilon,y}) \mathrm{j} + (q_{0,z} + \varepsilon q_{\varepsilon,z}) \mathrm{k} \\
&= (q_{0,w} + q_{0,x} \mathrm{i} + q_{0,y} \mathrm{j} + q_{0,z} \mathrm{k}) + \varepsilon (q_{\varepsilon,w} + q_{\varepsilon,x} \mathrm{i} + q_{\varepsilon,y} \mathrm{j} + q_{\varepsilon,z} \mathrm{k}) \\
&= \mathrm{q}_0 + \varepsilon \mathrm{q}_\varepsilon.
\end{align*}
That is, we can also view a dual quaternion as a dual number whose two components are quaternions. Again, addition and multiplication of dual quaternions are defined in the usual way. The quaternion conjugate of a dual quaternion is the same dual quaternion but with its components conjugated:
\begin{align*}
\hat{\mathrm{q}}^* = \mathrm{q}_0^* + \varepsilon \mathrm{q}_\varepsilon^*.
\end{align*}
When we multiply a dual quaternion by its quaternion conjugate, we get a dual number. The norm of a dual quaternion is given by the square root of this product:
\begin{align*}
\| \hat{\mathrm{q}} \|
= \sqrt{ \hat{\mathrm{q}} \hat{\mathrm{q}}^* }
= \sqrt{ \hat{\mathrm{q}}^* \hat{\mathrm{q}} }
= \| \mathrm{q}_0 \| + \varepsilon \frac{\mathrm{q}_0 \cdot \mathrm{q}_\varepsilon}{\| \mathrm{q}_0 \|}.
\end{align*}
A unit dual quaternion is a dual quaternion with norm $1$. From the above equation, a dual quaternion $\hat{\mathrm{q}}$ is unit if and only if (1) its real part $\mathrm{q}_0$ is a unit quaternion, and (2) $\mathrm{q}_0 \cdot \mathrm{q}_\varepsilon = 0$.
Dual Quaternions and Skinning Transformations
It turns out that unit dual quaternions can represent skinning transformations.
First, if $\mathrm{q}_0$ is a unit quaternion, then $\hat{\mathrm{q}} = \mathrm{q}_0 + \varepsilon \cdot 0$ is also a unit dual quaternion. Hence, unit dual quaternions can represent all rotations because unit quaternions can.
Second, unit dual quaternions can also represent translations. A translation by a vector $\mathrm{t} = (t_x, t_y, t_z)^T$ is equivalent to the unit dual quaternion
\begin{align*}
1 + \frac{\varepsilon}{2}(t_x \mathrm{i} + t_y \mathrm{j} + t_z \mathrm{k}).
\end{align*}
As such, any skinning transformation $M = TR$, represented by a translation vector $\mathrm{t}$ and a unit quaternion $\mathrm{q}$, is equivalent to the following unit dual quaternion
\begin{align}
\bigg [ 1 + \frac{\varepsilon}{2}(t_x \mathrm{i} + t_y \mathrm{j} + t_z \mathrm{k}) \bigg] \mathrm{q}
= \mathrm{q} + \frac{\varepsilon}{2}(t_x \mathrm{i} + t_y \mathrm{j} + t_z \mathrm{k}) \mathrm{q}.
\end{align} \qquad \qquad (5)
(While it is not obvious, one may show that the above is a unit dual quaternion.)
It also turns out that every unit dual quaternion can be written in the form of Equation (5). As a result, given a unit dual quaternion $\hat{\mathrm{q}} = \mathrm{q}_0 + \varepsilon \mathrm{q}_\varepsilon$, one can recover the translation vector and the quaternion that represent the associated skinning transformation as follows.
- The quaternion that represents the rotation is just $\mathrm{q}_0$.
- The translation vector $\mathrm{t}$ is given by the vector part of the quaternion $2\mathrm{q}_\varepsilon \mathrm{q}_0^*$.
The above procedure gives us the ability to convert a unit dual quaternion into a transformation matrix.
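The round trip between a (translation, rotation) pair and a unit dual quaternion can be sketched as follows. This is an illustrative Python rendering of Equation (5) and the recovery procedure, with quaternions stored as $(w, x, y, z)$ tuples and all helper names my own:

```python
def quat_mul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def quat_conj(q):
    return (q[0], -q[1], -q[2], -q[3])

def dual_quat_from_xform(t, q):
    """Equation (5): q0 = q, qe = (1/2) * (0, t) * q."""
    qe = tuple(0.5 * c for c in quat_mul((0.0, t[0], t[1], t[2]), q))
    return q, qe

def xform_from_dual_quat(q0, qe):
    """Recover translation t = vec(2 * qe * q0^*) and rotation q0."""
    prod = quat_mul(tuple(2.0 * c for c in qe), quat_conj(q0))
    return prod[1:], q0
```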
The Skinning Algorithm
When skinning a vertex, we are given skinning transformations $M_{k_0}$, $\dotsc$, $M_{k_3}$ and their associated weights $w_0$, $\dotsc$, $w_3$. The goal is to combine the multiple skinning transformations into a single transformation $M$, so that we can apply $M$ to $\mathrm{p}^{vm}$ and its inverse transpose $M^{-T}$ to $\mathrm{n}^{bd}$.
Dual quaternion skinning first converts the skinning transformations into unit dual quaternions $\hat{\mathrm{q}}_{k_0}$, $\dotsc$, $\hat{\mathrm{q}}_{k_3}$ using Equation (5). It then blends the unit dual quaternions into a single one by computing the weighted sum and normalizing:
\begin{align*}
\hat{\mathrm{q}} = \frac{w_0 \hat{\mathrm{q}}_{k_0} + \dotsb + w_3 \hat{\mathrm{q}}_{k_3} }{\| w_0 \hat{\mathrm{q}}_{k_0} + \dotsb + w_3 \hat{\mathrm{q}}_{k_3} \|}.
\end{align*}
Note that the denominator in the above equation is a dual number, so the division is equivalent to multiplying the weighted sum by the inverse of its norm. After the blended dual quaternion $\hat{\mathrm{q}}$ is computed, we can convert it into a matrix $M$ by the two-step process of the last section.
One subtlety when implementing dual quaternion skinning is the fact that $\hat{\mathrm{q}}$ and $-\hat{\mathrm{q}}$ represent exactly the same transformation. Hence, when computing the sum $w_0 \hat{\mathrm{q}}_{k_0} + \dotsb + w_3 \hat{\mathrm{q}}_{k_3}$, one needs to choose the signs of the dual quaternions to prevent catastrophic cancellation. I use the following pseudocode to blend the quaternions together:
$\hat{\mathrm{q}} := w_0 \hat{\mathrm{q}}_{k_0}$
for $i := 1$ to $3$
$\qquad$if $\hat{\mathrm{q}} \cdot \hat{\mathrm{q}}_{k_i} < 0$ then
$\qquad\qquad \hat{\mathrm{q}}_{k_i} := -\hat{\mathrm{q}}_{k_i}$
$\qquad$endif
$\qquad \hat{\mathrm{q}} := \hat{\mathrm{q}} + w_i \hat{\mathrm{q}}_{k_i}$
endfor
$\hat{\mathrm{q}} := \hat{\mathrm{q}} / \| \hat{\mathrm{q}} \|$
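A Python sketch of this sign-corrected blend follows. A dual quaternion is stored as a pair $(\mathrm{q}_0, \mathrm{q}_\varepsilon)$ of 4-tuples in $(w, x, y, z)$ order; all names are illustrative. I use the dot product of the real parts for the hemisphere test, since the real part determines the sign of the whole dual quaternion:

```python
import math

def blend_dual_quats(dqs, weights):
    """Weighted sum of dual quaternions with sign correction, then
    normalization by the dual-number norm."""
    q0 = [weights[0] * c for c in dqs[0][0]]
    qe = [weights[0] * c for c in dqs[0][1]]
    for (b0, be), w in zip(dqs[1:], weights[1:]):
        if sum(a * b for a, b in zip(q0, b0)) < 0:  # flip to avoid cancellation
            b0 = tuple(-c for c in b0)
            be = tuple(-c for c in be)
        q0 = [a + w * b for a, b in zip(q0, b0)]
        qe = [a + w * b for a, b in zip(qe, be)]
    # Divide (q0 + eps*qe) by its norm n0 + eps*ne, where n0 = |q0|
    # and ne = (q0 . qe) / |q0|, using the dual-number inverse.
    n0 = math.sqrt(sum(c * c for c in q0))
    ne = sum(a * b for a, b in zip(q0, qe)) / n0
    out_q0 = tuple(c / n0 for c in q0)
    out_qe = tuple(e / n0 - c * ne / (n0 * n0) for c, e in zip(q0, qe))
    return out_q0, out_qe
```

For instance, blending the identity with a translation by $(2, 0, 0)$ at equal weights yields the dual quaternion of a translation by $(1, 0, 0)$.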
GLSL Vertex Shader Code
Lastly, I provide a vertex shader, written in the OpenGL Shading Language (GLSL), that performs all the morphing and skinning described in this article.
#version 150
#define MORPH_DISPLACEMENT_TEXTURE_WIDTH 4096
#define BONE_TRANSFORM_TEXTURE_WIDTH 256
#define MORPH_WEIGHT_TEXTURE_WIDTH 256
#define SDEF_PARAMS_TEXTURE_WIDTH 256
#define LINEAR_BLEND 0
#define SDEF 1
#define DUAL_QUATERNION 2
#define quatNorm length
in vec3 vert_position;
in vec3 vert_normal;
in vec2 vert_texCoord;
in vec4 vert_tangent;
in vec2 vert_morphStartAndCount;
in vec4 vert_boneIndex;
in vec4 vert_boneWeight;
in vec2 vert_skinningInfo;
uniform mat4 sys_modelMatrix;
uniform mat4 sys_normalMatrix;
uniform mat4 sys_viewMatrix;
uniform mat4 sys_projectionMatrix;
uniform sampler2DRect mesh_boneTransform;
uniform sampler2DRect mesh_morphDisplacement;
uniform sampler2DRect mesh_morphWeight;
uniform sampler2DRect mesh_sdefParams;
out vec3 geom_normal;
out vec2 geom_texCoord;
out vec4 geom_tangent;
struct Morph {
vec3 displacement;
int index;
};
vec4 quatMul(vec4 a, vec4 b) {
return vec4(cross(a.xyz, b.xyz) + a.w * b.xyz + b.w * a.xyz, a.w * b.w - dot(a.xyz, b.xyz));
}
vec4 quatConj(vec4 q) {
return vec4(-q.xyz, q.w);
}
vec2 dualNumMul(vec2 a, vec2 b) {
return vec2(a.x * b.x, a.x * b.y + a.y * b.x);
}
vec2 dualNumConj(vec2 a) {
return vec2(a.x, -a.y);
}
vec2 dualNumSqrt(vec2 a) {
return vec2(sqrt(a.x), a.y / (2*sqrt(a.x)));
}
vec2 dualNumInv(vec2 a) {
return vec2(1.0 / a.x, -a.y / (a.x * a.x));
}
struct DualQuat {
vec4 q0;
vec4 qe;
};
DualQuat dualQuatAdd(DualQuat a, DualQuat b) {
return DualQuat(a.q0 + b.q0, a.qe + b.qe);
}
DualQuat dualQuatScale(DualQuat a, float c) {
return DualQuat(a.q0 * c, a.qe * c);
}
DualQuat dualQuatMul(DualQuat a, DualQuat b) {
return DualQuat(quatMul(a.q0, b.q0), quatMul(a.q0, b.qe) + quatMul(a.qe, b.q0));
}
vec2 dualQuatNorm(DualQuat a) {
float q0Norm = quatNorm(a.q0);
return vec2(q0Norm, dot(a.q0, a.qe) / q0Norm);
}
DualQuat dualQuatFromDualNum(vec2 a) {
return DualQuat(vec4(0,0,0,a.x), vec4(0,0,0,a.y));
}
DualQuat dualQuatNormalize(DualQuat a) {
vec2 norm = dualQuatNorm(a);
vec2 normInv = dualNumInv(norm);
return dualQuatMul(a, dualQuatFromDualNum(normInv));
}
Morph getMorph(int morphStart, int morphOrder) {
int index = morphStart + morphOrder;
float u = index - MORPH_DISPLACEMENT_TEXTURE_WIDTH * (index / MORPH_DISPLACEMENT_TEXTURE_WIDTH) + 0.5;
float v = index / MORPH_DISPLACEMENT_TEXTURE_WIDTH + 0.5;
vec4 disp = texture(mesh_morphDisplacement, vec2(u, v));
Morph morph;
morph.displacement = disp.xyz;
morph.index = int(round(disp.w));
return morph;
}
float getMorphWeight(int morphIndex) {
float u = float(morphIndex % MORPH_WEIGHT_TEXTURE_WIDTH) + 0.5;
float v = float(morphIndex / MORPH_WEIGHT_TEXTURE_WIDTH) + 0.5;
float weight = texture(mesh_morphWeight, vec2(u, v)).x;
return weight;
}
vec4 readBoneTexture(int pixelIndex) {
float u = float(pixelIndex % BONE_TRANSFORM_TEXTURE_WIDTH) + 0.5;
float v = float(pixelIndex / BONE_TRANSFORM_TEXTURE_WIDTH) + 0.5;
return texture(mesh_boneTransform, vec2(u, v));
}
struct RigidBodyXform {
vec3 disp;
vec4 rot;
};
DualQuat dualQuatFromTranslation(vec3 t) {
return DualQuat(vec4(0,0,0,1), vec4(t*0.5, 0));
}
DualQuat dualQuatFromQuat(vec4 q) {
return DualQuat(q, vec4(0,0,0,0));
}
DualQuat dualQuatFromRigidBodyXform(RigidBodyXform xform) {
return dualQuatMul(
dualQuatFromTranslation(xform.disp),
dualQuatFromQuat(xform.rot));
}
// Each bone occupies two consecutive pixels in the bone texture:
// pixel 2i holds the rotation quaternion, pixel 2i+1 the displacement.
RigidBodyXform getBoneRigidBodyXform(int boneIndex) {
RigidBodyXform xform;
xform.disp = readBoneTexture(2*boneIndex+1).xyz;
xform.rot = readBoneTexture(2*boneIndex);
return xform;
}
RigidBodyXform rigidBodyXformFromDualQuat(DualQuat a) {
return RigidBodyXform(
2*quatMul(a.qe, quatConj(a.q0)).xyz,
a.q0);
}
// This code is taken from
// http://www.euclideanspace.com/maths/geometry/rotations/conversions/quaternionToMatrix/index.htm
mat4 getRotationMatrix(vec4 q) {
mat4 m;
m[3] = vec4(0, 0, 0, 1);
float sqw = q.w*q.w;
float sqx = q.x*q.x;
float sqy = q.y*q.y;
float sqz = q.z*q.z;
// invs (inverse square length) is only required if the quaternion is
// not already normalised
float invs = 1.0 / (sqx + sqy + sqz + sqw);
m[0][0] = (sqx - sqy - sqz + sqw)*invs; // since sqx + sqy + sqz + sqw = 1/invs
m[1][1] = (-sqx + sqy - sqz + sqw)*invs;
m[2][2] = (-sqx - sqy + sqz + sqw)*invs;
float tmp1 = q.x*q.y;
float tmp2 = q.z*q.w;
m[0][1] = 2.0 * (tmp1 + tmp2)*invs;
m[1][0] = 2.0 * (tmp1 - tmp2)*invs;
tmp1 = q.x*q.z;
tmp2 = q.y*q.w;
m[0][2] = 2.0 * (tmp1 - tmp2)*invs;
m[2][0] = 2.0 * (tmp1 + tmp2)*invs;
tmp1 = q.y*q.z;
tmp2 = q.x*q.w;
m[1][2] = 2.0 * (tmp1 + tmp2)*invs;
m[2][1] = 2.0 * (tmp1 - tmp2)*invs;
m[0][3] = 0;
m[1][3] = 0;
m[2][3] = 0;
m[3][3] = 1;
return m;
}
mat4 getMatrix(RigidBodyXform xform) {
mat4 m = getRotationMatrix(xform.rot);
m[3] = vec4(xform.disp, 1);
return m;
}
// Inverse of a rigid transform: transpose the rotation part, then
// rotate the negated displacement.
mat4 getInverseMatrix(RigidBodyXform xform) {
mat4 m = getRotationMatrix(xform.rot);
m = transpose(m);
vec4 t = vec4(-xform.disp, 1);
t = m * t;
m[3] = t;
return m;
}
mat4 getBoneMatrix(int boneIndex) {
return getMatrix(getBoneRigidBodyXform(boneIndex));
}
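// VertexConfig groups the three per-vertex quantities that every
// skinning path below must transform together: position p, tangent t,
// and normal n.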
struct VertexConfig {
vec3 p;
vec3 t;
vec3 n;
};
vec3 getBitangent(VertexConfig vc) {
return normalize(cross(vc.n,vc.t));
}
// Transforms position, tangent, and normal by M. The normal uses the
// inverse transpose so it stays perpendicular when M is not rigid.
VertexConfig transformVertexConfig(mat4 M, VertexConfig vc) {
vec3 p = (M * vec4(vc.p, 1)).xyz;
vec3 t = normalize((M * vec4(vc.t, 0)).xyz);
vec3 n = normalize((transpose(inverse(M)) * vec4(vc.n, 0)).xyz);
return VertexConfig(p, t, n);
}
// Linear blend skinning: accumulate the weighted bone matrices, then
// transform the vertex once with the blended matrix.
VertexConfig skinWithLinearBlending(VertexConfig vc) {
mat4 M = mat4(0);
for (int i=0;i<4;i++) {
int boneIndex = int(vert_boneIndex[i]);
if (boneIndex < 0) {
continue;
}
mat4 m = getBoneMatrix(boneIndex);
M += vert_boneWeight[i] * m;
}
// Renormalize in case the weights do not sum exactly to 1.
M /= M[3][3];
return transformVertexConfig(M, vc);
}
// Dual quaternion skinning: blend the bone transforms as dual
// quaternions, normalize the sum, and apply the result once.
VertexConfig skinWithDualQuat(VertexConfig vc) {
DualQuat sum = DualQuat(vec4(0,0,0,0), vec4(0,0,0,0));
for (int i=0;i<4;i++) {
int boneIndex = int(vert_boneIndex[i]);
if (boneIndex < 0) {
continue;
}
RigidBodyXform xform = getBoneRigidBodyXform(boneIndex);
DualQuat Q = dualQuatFromRigidBodyXform(xform);
// Flip into the same hemisphere as the running sum so the blend
// follows the shorter rotation arc.
if (dot(Q.q0, sum.q0) < 0.0) {
Q = dualQuatScale(Q, -1.0);
}
sum = dualQuatAdd(sum, dualQuatScale(Q, vert_boneWeight[i]));
}
DualQuat Q = dualQuatNormalize(sum);
mat4 M = getMatrix(rigidBodyXformFromDualQuat(Q));
return transformVertexConfig(M, vc);
}
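// SDEF ("spherical deform") vertices carry three extra vectors from the
// PMX file: a blend center C and two reference points R0 and R1, one per
// influencing bone. The rotation is blended as a quaternion around C
// while positions are blended linearly, which reduces the candy-wrapper
// collapse that plain linear blending shows around twisting joints.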
struct SdefParams {
vec3 C;
vec3 R0;
vec3 R1;
};
vec3 readSdefParamsTexture(int pixelIndex) {
float u = float(pixelIndex % SDEF_PARAMS_TEXTURE_WIDTH) + 0.5;
float v = float(pixelIndex / SDEF_PARAMS_TEXTURE_WIDTH) + 0.5;
return texture(mesh_sdefParams, vec2(u, v)).xyz;
}
// Each SDEF vertex stores three consecutive pixels: C, R0, then R1.
SdefParams getSdefParams(int sdefVertexIndex) {
SdefParams params;
params.C = readSdefParamsTexture(3 * sdefVertexIndex);
params.R0 = readSdefParamsTexture(3 * sdefVertexIndex + 1);
params.R1 = readSdefParamsTexture(3 * sdefVertexIndex + 2);
return params;
}
// Weighted blend of two quaternions; q2 is flipped into the same
// hemisphere as q1 first so the blend follows the shorter arc.
vec4 quatBlend(vec4 q1, float w1, vec4 q2, float w2) {
float dotProd = dot(q1, q2);
if (dotProd < 0.0) {
q2 = -q2;
}
vec4 q = w1*q1 + w2*q2;
return normalize(q);
}
// SDEF code from https://gist.github.com/nagadomi/aa39745ae6716b50c2a60288b093d14b
VertexConfig skinWithSdef(VertexConfig vc) {
int boneIndex0 = int(round(vert_boneIndex[0]));
int boneIndex1 = int(round(vert_boneIndex[1]));
RigidBodyXform rbXform0 = getBoneRigidBodyXform(boneIndex0);
RigidBodyXform rbXform1 = getBoneRigidBodyXform(boneIndex1);
float bw0 = vert_boneWeight.x;
float bw1 = vert_boneWeight.y;
mat4 M0 = getMatrix(rbXform0);
mat4 M1 = getMatrix(rbXform1);
mat4 blendedM = bw0*M0 + bw1*M1;
vec4 qBlended = quatBlend(rbXform0.rot, bw0, rbXform1.rot, bw1);
mat4 qBlendedMat = getRotationMatrix(qBlended);
SdefParams params = getSdefParams(int(round(vert_skinningInfo.y)));
vec3 R0 = params.R0;
vec3 R1 = params.R1;
vec3 C = params.C;
vec3 Cpos = vc.p - C;
// Blend the center linearly, rotate the offset from the center with the
// blended rotation, and add a correction term along R0 - R1.
vec3 p =
(
blendedM * vec4(C,1)
+ qBlendedMat * vec4(Cpos,0)
+ bw0 * bw1 * (M0 - M1) * vec4(R0 - R1, 0) / 2
).xyz;
vec3 t = (qBlendedMat * vec4(vc.t, 0)).xyz;
vec3 n = (qBlendedMat * vec4(vc.n, 0)).xyz;
return VertexConfig(p, t, n);
}
void main() {
// Apply vertex morphs first: each morph affecting this vertex adds its
// displacement scaled by the morph's current weight.
vec3 morphedPosition = vert_position;
int morphStart = int(vert_morphStartAndCount.x);
int morphCount = int(vert_morphStartAndCount.y);
for (int morphOrder = 0; morphOrder < morphCount; morphOrder++) {
Morph morph = getMorph(morphStart, morphOrder);
float weight = getMorphWeight(morph.index);
morphedPosition += weight * morph.displacement;
}
VertexConfig inVc = VertexConfig(morphedPosition, vert_tangent.xyz, vert_normal);
VertexConfig posedVc;
// vert_skinningInfo.x selects this vertex's skinning method; the exact
// float comparison is safe because the values are small integers.
if (vert_skinningInfo.x == LINEAR_BLEND) {
posedVc = skinWithLinearBlending(inVc);
} else if (vert_skinningInfo.x == SDEF) {
posedVc = skinWithSdef(inVc);
} else {
posedVc = skinWithDualQuat(inVc);
}
vec4 worldPosition = sys_modelMatrix * vec4(posedVc.p, 1);
vec4 cameraPosition = sys_viewMatrix * worldPosition;
vec4 clipPosition = sys_projectionMatrix * cameraPosition;
vec3 worldNormal = normalize((sys_normalMatrix * vec4(posedVc.n, 0.0)).xyz);
vec3 worldTangent = normalize((sys_modelMatrix * vec4(posedVc.t, 0.0)).xyz);
// Store the handedness in tangent.w so later stages can reconstruct the
// bitangent with the correct sign for mirrored models.
float handedness = determinant(sys_modelMatrix) < 0.0 ? -1.0 : 1.0;
geom_normal = worldNormal;
geom_texCoord = vert_texCoord;
geom_tangent = vec4(worldTangent, handedness);
gl_Position = clipPosition;
}