[Robotics] Orientation Parameterizations
In the previous post we built rigid-body kinematics on the foundation of the rotation matrix $R \in SO(3)$: a $3 \times 3$ orthonormal array whose columns are the unit axes of one frame written in another. The rotation matrix is unrivalled as an operator—to rotate a point, multiply by $R$—but as a parameterization it is unwieldy. It stores nine numbers subject to six orthonormality constraints, leaving only three independent degrees of freedom. A human operator could hardly type all nine into a teach pendant; an optimizer cannot follow gradients on the constraint surface without projection; a neural network cannot output unconstrained tensors and call them rotations.
This post surveys the classical and modern parameterizations of $SO(3)$—compact coordinate systems for orientation. We work through four representations in detail: Cayley’s three-parameter formula (mostly of historical and algebraic interest), Euler angles and their fixed-axis cousins (the work-horse of teach pendants and human-readable orientation specifications), the axis-angle representation guaranteed by Euler’s rotation theorem, and the unit quaternion (the de facto internal representation in modern robotics, simulation, and graphics codebases). We close with a recent development from the deep-learning era—the 6D continuous representation of Zhou et al. (2019)—which resolves a subtle topological obstruction every classical parameterization suffers from. The next post will revisit all of these from the unifying perspective of Lie groups and Lie algebras, where the matrix exponential plays the central role.
Cayley’s Three-Parameter Form
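A purely algebraic result, Cayley’s formula for orthonormal matrices, states that every proper orthonormal $R$ that does not have $-1$ as an eigenvalue (in three dimensions, every rotation other than a half-turn) can be written as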
\[R = (I_3 - S)^{-1}(I_3 + S),\]for some skew-symmetric $S = -S^T$. A $3 \times 3$ skew-symmetric matrix is specified by three scalars,
\[S = \begin{bmatrix} 0 & -s_z & s_y \\\\ s_z & 0 & -s_x \\\\ -s_y & s_x & 0 \end{bmatrix},\]confirming that three parameters always suffice in principle. In practice, more geometrically meaningful parameterizations are preferred.
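To make the formula concrete, here is a minimal sketch that evaluates Cayley’s map in plain Python. It assumes the closed form $R = \big((1 - \|\mathbf{s}\|^2) I_3 + 2\,\mathbf{s}\mathbf{s}^T + 2S\big) / (1 + \|\mathbf{s}\|^2)$, which follows from $S^2 = \mathbf{s}\mathbf{s}^T - \|\mathbf{s}\|^2 I_3$ and is not stated above; the helper names `skew` and `cayley` are illustrative only.

```python
def skew(s):
    """Skew-symmetric matrix S built from s = (s_x, s_y, s_z)."""
    sx, sy, sz = s
    return [[0.0, -sz, sy],
            [sz, 0.0, -sx],
            [-sy, sx, 0.0]]


def cayley(s):
    """Evaluate R = (I - S)^{-1} (I + S) without an explicit matrix inverse.

    Assumed closed form (derived from S^2 = s s^T - |s|^2 I):
        R = ((1 - |s|^2) I + 2 s s^T + 2 S) / (1 + |s|^2)
    """
    n2 = sum(c * c for c in s)
    S = skew(s)
    return [[((1.0 - n2) * (1.0 if i == j else 0.0)
              + 2.0 * s[i] * s[j]
              + 2.0 * S[i][j]) / (1.0 + n2)
             for j in range(3)]
            for i in range(3)]


# s = (0, 0, 1) gives a quarter turn about the z axis.
R = cayley((0.0, 0.0, 1.0))
```

With $\mathbf{s} = \tan(\theta/2)\,\hat{K}$ the result is the rotation by $\theta$ about the unit axis $\hat{K}$, so a half-turn would require an infinitely long $\mathbf{s}$.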
Euler Angles, Fixed Angles, and Roll-Pitch-Yaw
A common family of parameterizations performs three successive elementary rotations about specified axes. Two flavors arise:
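- Fixed-angle conventions (e.g., X-Y-Z). Start with $\{B\}$ coincident with $\{A\}$; rotate about the fixed $\hat{X}_A$ by $\gamma$, then about the fixed $\hat{Y}_A$ by $\beta$, then about the fixed $\hat{Z}_A$ by $\alpha$. Because every rotation is about an axis of the fixed frame, each new factor pre-multiplies, and the composed rotation is
\[R_{XYZ}(\gamma, \beta, \alpha) = R_Z(\alpha) \, R_Y(\beta) \, R_X(\gamma).\]This convention is often called roll-pitch-yaw, though usage varies.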
- Euler-angle conventions (e.g., Z-Y-X). Start coincident; rotate about $\hat{Z}_B$ by $\alpha$, then about the new $\hat{Y}_B$ by $\beta$, then about the still-newer $\hat{X}_B$ by $\gamma$. Each rotation is about an axis of the moving (current) frame, so each new factor post-multiplies. The resulting matrix is
\[R_{Z'Y'X'}(\alpha, \beta, \gamma) = R_Z(\alpha) \, R_Y(\beta) \, R_X(\gamma),\]which is, perhaps surprisingly, identical to the X-Y-Z fixed-angle matrix, even though the three elementary rotations are applied in the opposite order. The general fact is that three rotations about fixed axes give the same final orientation as the same three rotations about moving axes performed in opposite order.
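The duality can be checked numerically. The sketch below builds both sequences step by step in plain Python: fixed-axis rotations pre-multiply the running matrix, moving-axis rotations post-multiply it. The function names are illustrative assumptions, not part of any library.

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def fixed_xyz(gamma, beta, alpha):
    """X, then Y, then Z about the fixed axes: each step pre-multiplies."""
    R = IDENTITY
    for step in (rot_x(gamma), rot_y(beta), rot_z(alpha)):
        R = matmul(step, R)
    return R


def euler_zyx(alpha, beta, gamma):
    """Z, then Y, then X about the moving axes: each step post-multiplies."""
    R = IDENTITY
    for step in (rot_z(alpha), rot_y(beta), rot_x(gamma)):
        R = matmul(R, step)
    return R


R_fixed = fixed_xyz(0.3, -0.7, 1.2)
R_moving = euler_zyx(1.2, -0.7, 0.3)
gap = max(abs(R_fixed[i][j] - R_moving[i][j])
          for i in range(3) for j in range(3))
```

Both functions end up forming the same product $R_Z(\alpha) R_Y(\beta) R_X(\gamma)$, so `gap` is zero up to floating-point round-off.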
There are 24 such conventions in total (12 fixed-angle, 12 Euler-angle), giving rise to a notational thicket; Craig devotes Appendix B to listing them all. The choice is largely a matter of taste, but inconsistencies between conventions are a notorious source of bugs.
All three-parameter representations of $SO(3)$ are known to suffer from singularities (often called gimbal lock), at which one degree of rotational freedom is instantaneously lost in the parameterization. For instance, in the X-Y-Z fixed angles the recovery formulas degenerate when $\beta = \pm 90^\circ$.
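A small numerical experiment makes the lost degree of freedom visible. Multiplying out $R_Z(\alpha) R_Y(\beta) R_X(\gamma)$ at $\beta = 90^\circ$ shows that the matrix depends only on the difference $\alpha - \gamma$; the sketch below, in plain Python with illustrative helper names, compares two angle triples that share that difference.

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rpy_matrix(alpha, beta, gamma):
    """R_Z(alpha) R_Y(beta) R_X(gamma), the X-Y-Z fixed-angle matrix."""
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_x(gamma)))


def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))


half_pi = math.pi / 2

# Two different triples with the same alpha - gamma = 0.2, at beta = 90 degrees.
locked = max_abs_diff(rpy_matrix(0.3, half_pi, 0.1),
                      rpy_matrix(0.5, half_pi, 0.3))

# The same two triples away from the singularity, at beta = 0.4 rad.
regular = max_abs_diff(rpy_matrix(0.3, 0.4, 0.1),
                       rpy_matrix(0.5, 0.4, 0.3))
```

At the singular pitch the two triples produce the same matrix (`locked` is zero up to round-off), whereas at a regular pitch they give clearly different orientations.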
A Worked Convention: ZYZ Euler Angles
The XYZ fixed-angle convention developed above is just one of the twelve possible Euler-style triples. To make the singularity phenomenon and the inverse-problem algebra fully concrete, we follow Siciliano et al. and work through the ZYZ Euler angles explicitly. This is the convention most commonly used in classical mechanics and in the analysis of spherical wrists.
Starting from a frame initially coincident with the reference frame and writing $\boldsymbol{\phi} = [\varphi \; \vartheta \; \psi]^T$, the three successive rotations are:
- rotate about $\hat{Z}$ by $\varphi$, giving $R_z(\varphi)$;
- rotate about the new (current) $\hat{Y}'$ axis by $\vartheta$, giving $R_{y'}(\vartheta)$;
- rotate about the still newer (current) $\hat{Z}''$ axis by $\psi$, giving $R_{z''}(\psi)$.
Because each rotation is made about an axis of the moving frame, composition is by post-multiplication, and the resulting matrix is
\[R_{ZYZ}(\varphi, \vartheta, \psi) = R_z(\varphi) \, R_y(\vartheta) \, R_z(\psi).\]Expanding the product and adopting Siciliano’s shorthand $c_\bullet = \cos(\bullet)$, $s_\bullet = \sin(\bullet)$,
\[R_{ZYZ}(\varphi, \vartheta, \psi) = \begin{bmatrix} c_\varphi c_\vartheta c_\psi - s_\varphi s_\psi & -c_\varphi c_\vartheta s_\psi - s_\varphi c_\psi & c_\varphi s_\vartheta \\\\ s_\varphi c_\vartheta c_\psi + c_\varphi s_\psi & -s_\varphi c_\vartheta s_\psi + c_\varphi c_\psi & s_\varphi s_\vartheta \\\\ -s_\vartheta c_\psi & s_\vartheta s_\psi & c_\vartheta \end{bmatrix}.\]The inverse problem is to recover $(\varphi, \vartheta, \psi)$ from a numerically given matrix $R = [r_{ij}]$. Comparing the entries of $R$ with the symbolic matrix above and assuming for now that $s_\vartheta \neq 0$, the third column ($r_{13}$, $r_{23}$, and $r_{33}$) gives the first two angles,
\[\varphi = \mathrm{Atan2}(r_{23}, r_{13}), \qquad \vartheta = \mathrm{Atan2}\!\left( \sqrt{r_{13}^2 + r_{23}^2}, \; r_{33} \right),\]while $r_{31}$ and $r_{32}$ yield
\[\psi = \mathrm{Atan2}(r_{32}, -r_{31}).\]The choice of the positive root for $\sqrt{r_{13}^2 + r_{23}^2}$ restricts $\vartheta$ to the open interval $(0, \pi)$; the opposite sign choice gives a second, equally valid solution with $\vartheta \in (-\pi, 0)$ and $\varphi, \psi$ shifted by $\pi$. The use of $\mathrm{Atan2}$ rather than $\arctan$ is essential to disambiguate the quadrant.
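The forward and inverse ZYZ maps translate directly into code. The following sketch in plain Python uses the positive-root branch $\vartheta \in (0, \pi)$; the function names and the tolerance `eps` are illustrative assumptions, and the singular case is simply rejected rather than resolved.

```python
import math


def zyz_to_matrix(phi, theta, psi):
    """R_z(phi) R_y(theta) R_z(psi), written out entry by entry."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [[cf * ct * cp - sf * sp, -cf * ct * sp - sf * cp, cf * st],
            [sf * ct * cp + cf * sp, -sf * ct * sp + cf * cp, sf * st],
            [-st * cp, st * sp, ct]]


def matrix_to_zyz(R, eps=1e-9):
    """Recover (phi, theta, psi) with theta in (0, pi).

    R is indexed from zero, so r_13 is R[0][2], r_31 is R[2][0], and so on.
    """
    s_theta = math.hypot(R[0][2], R[1][2])  # sqrt(r13^2 + r23^2) >= 0
    if s_theta < eps:
        raise ValueError("sin(theta) is ~0: phi and psi are not separately defined")
    phi = math.atan2(R[1][2], R[0][2])
    theta = math.atan2(s_theta, R[2][2])
    psi = math.atan2(R[2][1], -R[2][0])
    return phi, theta, psi


angles = (0.4, 1.1, -2.0)
recovered = matrix_to_zyz(zyz_to_matrix(*angles))
```

For angles inside the chosen branch the round trip returns the original triple up to round-off; outside it, the same rotation comes back expressed with $\vartheta \in (0, \pi)$.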
The ZYZ Euler angle map $(\varphi, \vartheta, \psi) \mapsto R_{ZYZ}(\varphi, \vartheta, \psi)$ fails to be a local diffeomorphism whenever $s_\vartheta = 0$, i.e. at $\vartheta = 0$ or $\vartheta = \pi$. At these configurations the first and third rotation axes coincide (up to sign), and only the sum $\varphi + \psi$ (at $\vartheta = 0$) or the difference $\varphi - \psi$ (at $\vartheta = \pi$) is determined by $R$; the individual values $\varphi$ and $\psi$ are not.
This loss of one parameter at the singularity is the abstract source of “gimbal lock.” It is a topological obstruction—no smooth chart of three real numbers can cover all of $SO(3)$, since $SO(3)$ is not diffeomorphic to any open subset of $\mathbb{R}^3$. Every minimal parameterization, ZYZ or otherwise, carries such a defect somewhere.
It is worth comparing the ZYZ convention to the roll-pitch-yaw (RPY) convention introduced above. The RPY (XYZ-fixed-axis) sequence corresponds to the ZYX moving-axis (Tait-Bryan) sequence, and Siciliano’s $R_{RPY}(\varphi, \vartheta, \psi) = R_z(\varphi) R_y(\vartheta) R_x(\psi)$ has its own singularity at $\vartheta = \pm \pi/2$. The duality between moving-axis Euler angles and fixed-axis “RPY” angles—already announced in the previous subsection—is therefore a general statement: any moving-axis sequence is equivalent to the fixed-axis sequence taken in reverse order.
Equivalent Axis-Angle and Euler’s Theorem
A different parameterization records a unit axis $\hat{K} = (k_x, k_y, k_z)^T$ and a single angle $\theta$ of rotation about that axis. Euler’s theorem on rotation guarantees that every $R \in SO(3)$ admits such an axis-angle representation. The forward formula (Rodrigues’) is
\[R_K(\theta) = \begin{bmatrix} k_x k_x v\theta + c\theta & k_x k_y v\theta - k_z s\theta & k_x k_z v\theta + k_y s\theta \\\\ k_x k_y v\theta + k_z s\theta & k_y k_y v\theta + c\theta & k_y k_z v\theta - k_x s\theta \\\\ k_x k_z v\theta - k_y s\theta & k_y k_z v\theta + k_x s\theta & k_z k_z v\theta + c\theta \end{bmatrix},\]where $c\theta = \cos\theta$, $s\theta = \sin\theta$, and $v\theta = 1 - \cos\theta$. The inverse problem—recovering $(\hat{K}, \theta)$ from $R$—is solved by
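\[\theta = \cos^{-1}\!\left( \frac{r_{11} + r_{22} + r_{33} - 1}{2} \right), \qquad \hat{K} = \frac{1}{2 \sin\theta} \begin{bmatrix} r_{32} - r_{23} \\\\ r_{13} - r_{31} \\\\ r_{21} - r_{12} \end{bmatrix},\]which breaks down whenever $\sin\theta = 0$. At $\theta = 0$ the axis is genuinely undefined. At $\theta = \pi$ the antisymmetric part of $R$ vanishes, so the formula reads $0/0$; the axis still exists but is determined only up to sign and has to be recovered from the symmetric part of $R$ instead.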
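As a sketch of how the two formulas fit together, the plain-Python functions below implement the forward matrix and the inverse for $\theta \in (0, \pi)$. The names, the tolerance `eps`, and the clamping of the cosine against round-off are assumptions of this example.

```python
import math


def axis_angle_to_matrix(k, theta):
    """Rotation by theta about the unit axis k = (kx, ky, kz)."""
    kx, ky, kz = k
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [[kx * kx * v + c, kx * ky * v - kz * s, kx * kz * v + ky * s],
            [kx * ky * v + kz * s, ky * ky * v + c, ky * kz * v - kx * s],
            [kx * kz * v - ky * s, ky * kz * v + kx * s, kz * kz * v + c]]


def matrix_to_axis_angle(R, eps=1e-9):
    """Recover (k, theta) with theta in (0, pi); rejects sin(theta) ~ 0."""
    cos_theta = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    # Clamp so that round-off cannot push the argument outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    s = math.sin(theta)
    if abs(s) < eps:
        raise ValueError("sin(theta) is ~0: the axis cannot be read off this way")
    k = ((R[2][1] - R[1][2]) / (2.0 * s),
         (R[0][2] - R[2][0]) / (2.0 * s),
         (R[1][0] - R[0][1]) / (2.0 * s))
    return k, theta


axis = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)  # unit length: 1/9 + 4/9 + 4/9 = 1
k_rec, theta_rec = matrix_to_axis_angle(axis_angle_to_matrix(axis, 1.2))
```

A production routine would add a separate branch for $\theta$ near $\pi$ that reads the axis from the diagonal of $R$; that branch is deliberately left out here.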
Unit Quaternion Representation
The axis-angle representation overcomes the singularities of the three-parameter Euler conventions but introduces a small annoyance of its own: the rotation $(-\hat{K}, -\theta)$ produces exactly the same $R$ as $(\hat{K}, \theta)$, and the axis is undefined whenever $\sin\theta = 0$. Siciliano et al. observe that both inconveniences disappear if one passes from axis-angle to the closely related unit quaternion, also known as the Euler parameters of the rotation.
A unit quaternion is an ordered pair $Q = \{\eta, \boldsymbol{\epsilon}\}$ with $\eta \in \mathbb{R}$ and $\boldsymbol{\epsilon} = [\epsilon_x \; \epsilon_y \; \epsilon_z]^T \in \mathbb{R}^3$ satisfying
\[\eta^2 + \boldsymbol{\epsilon}^T \boldsymbol{\epsilon} = \eta^2 + \epsilon_x^2 + \epsilon_y^2 + \epsilon_z^2 = 1.\]The scalar $\eta$ is called the scalar part and $\boldsymbol{\epsilon}$ the vector part of the quaternion. The set of unit quaternions is the three-sphere $S^3 \subset \mathbb{R}^4$.
For a rotation by angle $\vartheta$ about a unit axis $\hat{K}$, the associated unit quaternion is
\[\eta = \cos\frac{\vartheta}{2}, \qquad \boldsymbol{\epsilon} = \sin\frac{\vartheta}{2}\, \hat{K}.\]Two key observations follow at once. First, replacing $(\hat{K}, \vartheta)$ by $(-\hat{K}, -\vartheta)$ leaves both $\eta$ and $\boldsymbol{\epsilon}$ unchanged, so the axis-angle sign ambiguity is automatically resolved. Second, the quaternions $Q$ and $-Q$ correspond to the same rotation (they describe the same axis with angles differing by $2\pi$). This two-to-one correspondence
\[S^3 \twoheadrightarrow SO(3), \qquad \{Q, -Q\} \mapsto R,\]is the celebrated double cover of $SO(3)$ by the unit quaternions, a fact ultimately responsible for the existence of spinors in physics.
The forward map $Q \to R$ is given (Siciliano §2.6) by the closed form
\[R(\eta, \boldsymbol{\epsilon}) = \begin{bmatrix} 2(\eta^2 + \epsilon_x^2) - 1 & 2(\epsilon_x \epsilon_y - \eta \epsilon_z) & 2(\epsilon_x \epsilon_z + \eta \epsilon_y) \\\\ 2(\epsilon_x \epsilon_y + \eta \epsilon_z) & 2(\eta^2 + \epsilon_y^2) - 1 & 2(\epsilon_y \epsilon_z - \eta \epsilon_x) \\\\ 2(\epsilon_x \epsilon_z - \eta \epsilon_y) & 2(\epsilon_y \epsilon_z + \eta \epsilon_x) & 2(\eta^2 + \epsilon_z^2) - 1 \end{bmatrix}.\]The inverse map $R \to Q$, choosing the branch $\eta \ge 0$, is
\[\eta = \tfrac{1}{2}\sqrt{r_{11} + r_{22} + r_{33} + 1},\] \[\boldsymbol{\epsilon} = \tfrac{1}{2} \begin{bmatrix} \mathrm{sgn}(r_{32} - r_{23}) \sqrt{r_{11} - r_{22} - r_{33} + 1} \\\\ \mathrm{sgn}(r_{13} - r_{31}) \sqrt{r_{22} - r_{33} - r_{11} + 1} \\\\ \mathrm{sgn}(r_{21} - r_{12}) \sqrt{r_{33} - r_{11} - r_{22} + 1} \end{bmatrix}.\]Crucially, no singularity occurs in the inverse formula: every rotation in the angular range $\vartheta \in [-\pi, \pi]$ can be reconstructed, in contrast to the axis-angle inverse which fails at $\sin\vartheta = 0$.
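The three maps of this section (axis-angle to quaternion, quaternion to matrix, matrix to quaternion) can be sketched in a few lines of plain Python. The function names are illustrative, and the convention $\mathrm{sgn}(x) = +1$ for $x \ge 0$ is an assumption of this example. The sketch is only exercised away from exact half-turns, where all three off-diagonal differences vanish and the relative signs of the vector part need extra care.

```python
import math


def axis_angle_to_quat(k, theta):
    """eta = cos(theta/2), eps = sin(theta/2) * k for a unit axis k."""
    half = 0.5 * theta
    s = math.sin(half)
    return math.cos(half), (s * k[0], s * k[1], s * k[2])


def quat_to_matrix(eta, eps):
    """Closed-form rotation matrix of the unit quaternion {eta, eps}."""
    ex, ey, ez = eps
    return [[2 * (eta * eta + ex * ex) - 1, 2 * (ex * ey - eta * ez), 2 * (ex * ez + eta * ey)],
            [2 * (ex * ey + eta * ez), 2 * (eta * eta + ey * ey) - 1, 2 * (ey * ez - eta * ex)],
            [2 * (ex * ez - eta * ey), 2 * (ey * ez + eta * ex), 2 * (eta * eta + ez * ez) - 1]]


def sgn(x):
    """Sign with the assumed convention sgn(0) = +1."""
    return 1.0 if x >= 0.0 else -1.0


def matrix_to_quat(R):
    """Inverse map on the branch eta >= 0."""
    r11, r22, r33 = R[0][0], R[1][1], R[2][2]

    def root(x):
        # max(0, .) guards against tiny negative values caused by round-off
        return math.sqrt(max(0.0, x))

    eta = 0.5 * root(r11 + r22 + r33 + 1.0)
    ex = 0.5 * sgn(R[2][1] - R[1][2]) * root(r11 - r22 - r33 + 1.0)
    ey = 0.5 * sgn(R[0][2] - R[2][0]) * root(r22 - r33 - r11 + 1.0)
    ez = 0.5 * sgn(R[1][0] - R[0][1]) * root(r33 - r11 - r22 + 1.0)
    return eta, (ex, ey, ez)


eta, eps = axis_angle_to_quat((2.0 / 3.0, -1.0 / 3.0, 2.0 / 3.0), 2.0)
R = quat_to_matrix(eta, eps)
eta_rec, eps_rec = matrix_to_quat(R)

# Q and -Q give the same rotation matrix (the double cover).
R_neg = quat_to_matrix(-eta, (-eps[0], -eps[1], -eps[2]))
```

Because every entry of the forward map is quadratic in the quaternion components, `R_neg` equals `R`, which is the double cover seen numerically.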
Composition. The product of rotations corresponds to a special non-commutative product of quaternions. For $Q_1 = \{\eta_1, \boldsymbol{\epsilon}_1\}$ and $Q_2 = \{\eta_2, \boldsymbol{\epsilon}_2\}$ the quaternion product is
\[Q_1 \star Q_2 = \{\, \eta_1 \eta_2 - \boldsymbol{\epsilon}_1^T \boldsymbol{\epsilon}_2, \; \eta_1 \boldsymbol{\epsilon}_2 + \eta_2 \boldsymbol{\epsilon}_1 + \boldsymbol{\epsilon}_1 \times \boldsymbol{\epsilon}_2 \,\}.\]If $Q_1, Q_2$ are the unit quaternions of rotations $R_1, R_2$, then $Q_1 \star Q_2$ is the unit quaternion of $R_1 R_2$. The identity element of the quaternion group is $\{1, \mathbf{0}\}$, and the inverse $Q^{-1}$ corresponds to $R^T$ and is simply
\[Q^{-1} = \{\eta, -\boldsymbol{\epsilon}\}.\]Compared to all other classical parameterizations, unit quaternions offer:
- No singularities. The map $Q \to R$ is globally smooth; conversion $R \to Q$ has no division by $\sin\vartheta$.
- Minimal redundancy. Only 4 numbers with 1 constraint, versus 9 numbers and 6 constraints for a rotation matrix.
- Cheap composition and inversion. Quaternion multiplication uses ~16 multiplications; matrix multiplication uses 27. Inversion is trivial (negate the vector part).
- Stable interpolation. Spherical linear interpolation (SLERP) along the unit three-sphere produces constant-angular-velocity paths between two orientations—the standard tool for keyframe animation and trajectory generation.
- Easy normalisation. Drift away from $\|Q\| = 1$ is corrected by dividing $Q$ by its norm; drift away from $R^T R = I$ requires a more expensive re-orthogonalisation (for example Gram–Schmidt or an SVD).
These advantages make the unit quaternion the de facto internal representation for orientation in most modern robotics, simulation, and graphics codebases (ROS tf2, Eigen Quaternion, MuJoCo, Drake).
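The composition, inversion, and interpolation operations listed above fit in a short plain-Python sketch. Quaternions are stored as tuples `(eta, ex, ey, ez)`. The SLERP formula used here is the standard great-circle one, which the text does not spell out; the function names and the small-angle threshold are assumptions of this example.

```python
import math


def quat_mul(q1, q2):
    """Quaternion product Q1 * Q2 (scalar part first)."""
    n1, x1, y1, z1 = q1
    n2, x2, y2, z2 = q2
    return (n1 * n2 - (x1 * x2 + y1 * y2 + z1 * z2),
            n1 * x2 + n2 * x1 + (y1 * z2 - z1 * y2),
            n1 * y2 + n2 * y1 + (z1 * x2 - x1 * z2),
            n1 * z2 + n2 * z1 + (x1 * y2 - y1 * x2))


def quat_inv(q):
    """Inverse of a unit quaternion: negate the vector part."""
    return (q[0], -q[1], -q[2], -q[3])


def quat_normalize(q):
    """Project back onto the unit sphere by dividing by the norm."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)


def slerp(q0, q1, t):
    """Spherical linear interpolation from q0 (t = 0) to q1 (t = 1)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # q1 and -q1 are the same rotation; flip to follow the shorter arc.
        q1 = tuple(-c for c in q1)
        dot = -dot
    omega = math.acos(min(1.0, dot))  # angle between q0 and q1 on S^3
    if omega < 1e-8:
        # Nearly identical inputs: a normalised linear blend is accurate enough.
        return quat_normalize(tuple((1.0 - t) * a + t * b for a, b in zip(q0, q1)))
    s = math.sin(omega)
    w0 = math.sin((1.0 - t) * omega) / s
    w1 = math.sin(t * omega) / s
    return tuple(w0 * a + w1 * b for a, b in zip(q0, q1))


identity = (1.0, 0.0, 0.0, 0.0)
quarter_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))  # 90 deg about z

half_z = quat_mul(quarter_z, quarter_z)          # 180 deg about z
back = quat_mul(quarter_z, quat_inv(quarter_z))  # identity
eighth_z = slerp(identity, quarter_z, 0.5)       # 45 deg about z
```

Two quarter turns about $z$ compose to a half-turn, a quaternion times its inverse gives $\{1, \mathbf{0}\}$, and the SLERP midpoint between the identity and the quarter turn is the 45-degree rotation about $z$.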
Comparison of Orientation Representations
Each of the parameterizations discussed above trades off a different combination of compactness, smoothness, interpretability, and computational cost. The following table—synthesising the discussions in Craig, Spong, Siciliano, and Lynch & Park—summarises their relative strengths.
| Representation | Numbers | Constraints | Singularities | Composition | Best for |
|---|---|---|---|---|---|
| Rotation matrix $R \in SO(3)$ | 9 | 6 (orthonormality) | none | matrix multiply $R_1 R_2$ | acting on points; analytic derivations |
| Euler / RPY angles $(\alpha, \beta, \gamma)$ | 3 | 0 | gimbal lock at $\beta = 0, \pi$ or $\beta = \pm\pi/2$, depending on the convention | no direct rule; build $R$ from three elementary rotations and multiply | human input; teach pendants |
| Axis-angle $(\hat{K}, \theta)$ | 4 | 1 ($\lVert \hat{K} \rVert = 1$) | axis undefined at $\theta = 0$; sign ambiguity at $\theta = \pi$ | not closed under simple algebra | geometric reasoning; one-shot rotations |
| Unit quaternion $Q = \{\eta, \boldsymbol{\epsilon}\}$ | 4 | 1 ($\lVert Q \rVert = 1$) | none (double cover $\{Q, -Q\}$) | quaternion product $Q_1 \star Q_2$ | storage, composition, interpolation (SLERP) |
In practice one rarely sticks to a single representation. A common pattern in robotics software is to store orientations as unit quaternions (ROS’s tf2 transform messages, for example, carry orientation in this form), compose and integrate angular velocity in quaternion form, present orientations to the human user as roll-pitch-yaw, and convert to $R \in SO(3)$ on the fly whenever a matrix-vector product on a point is needed. Each conversion is a closed-form, constant-cost operation; no representation is fundamentally privileged.
These three “minimal” parameterizations—Euler/fixed angles, axis-angle, and (its close cousin) the unit quaternion—all sit on a deeper Lie-group / Lie-algebra structure. The matrix logarithm sends $SO(3)$ to its tangent space at the identity, the Lie algebra $\mathfrak{so}(3)$ of $3 \times 3$ skew-symmetric matrices; the exponential map sends it back. We will develop this geometric viewpoint systematically in the next post on the Lie groups $SO(3)$ and $SE(3)$ and their associated screw / twist coordinates.
Continuity for Learning: The 6D Representation
A more recent perspective on parameterizations of $SO(3)$—motivated by deep learning but with consequences throughout modern robot perception—concerns the topological continuity of the map from parameter space to the rotation group. The point, made precisely by Zhou et al. (CVPR 2019), is that every classical three- and four-parameter representation surveyed above fails to be a continuous embedding of $SO(3)$ into a Euclidean parameter space. For analytic work this is harmless, but for a regression network whose final layer outputs a rotation—a learned pose estimator, an end-to-end visuomotor policy, a diffusion model that denoises into the manipulator’s task space—a discontinuous parameterization forces the network to approximate a discontinuous function and degrades both convergence and accuracy.
The Topological Obstruction
As a manifold, $SO(3)$ is homeomorphic to the real projective space $\mathbb{RP}^3$: a closed, orientable $3$-manifold with fundamental group $\mathbb{Z}/2\mathbb{Z}$. Topology alone forbids any continuous embedding of $\mathbb{RP}^3$ into $\mathbb{R}^d$ for $d \le 4$. Any attempt to build a map $f : SO(3) \to \mathbb{R}^d$ with $d \le 4$ that is continuous and has a continuous left-inverse must therefore fail somewhere, either by identifying distinct rotations (many-to-one collapse) or by mapping nearby rotations to far-apart parameter points (a “jump”). The classical parameterizations each exhibit one symptom or the other:
- Euler / RPY angles ($\mathbb{R}^3$). Gimbal-lock singularities collapse a degree of freedom at $\beta = 0, \pi$ or at $\beta = \pm \pi/2$, depending on the convention; in addition, angle values are defined only modulo $2\pi$, so the map $(\alpha, \beta, \gamma) \mapsto R$ is many-to-one with discontinuous inverse.
- Axis-angle $\theta \hat{K} \in \mathbb{R}^3$. The forward map is continuous, but at $\theta = \pi$ the rotation about $\hat{K}$ coincides with the rotation about $-\hat{K}$, so the boundary sphere $|\theta \hat{K}| = \pi$ is antipodally identified and the inverse is discontinuous across antipodes.
- Unit quaternion $Q \in S^3 \subset \mathbb{R}^4$. The double cover $Q \sim -Q$ gives $SO(3) \cong S^3 / \{\pm 1\}$. A continuous section $SO(3) \to S^3$ (i.e. a consistent choice of sign) does not exist globally; selecting one hemisphere produces a discontinuity at the equator.
There is no representation $f : SO(3) \to \mathbb{R}^d$ for $d \le 4$ that is continuous and admits a continuous left-inverse $g : f(SO(3)) \to SO(3)$. Equivalently, $SO(3)$ does not embed continuously into $\mathbb{R}^d$ for $d \le 4$. The minimum embedding dimension is $d = 5$, and analogous obstructions hold for $SO(n)$ with appropriate dimensional thresholds.
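The quaternion “jump” is easy to reproduce. The sketch below, in plain Python with illustrative names, takes two rotations about $z$ on either side of a half-turn, picks the $\eta \ge 0$ representative of each, and compares how far apart the matrices and the quaternions are.

```python
import math


def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def canonical_quat_about_z(angle):
    """Unit quaternion (eta, ex, ey, ez) of a z rotation, sign fixed so eta >= 0."""
    eta, ez = math.cos(angle / 2.0), math.sin(angle / 2.0)
    if eta < 0.0:
        eta, ez = -eta, -ez  # Q and -Q are the same rotation
    return (eta, 0.0, 0.0, ez)


delta = 1e-3
angle_a = math.pi - delta
angle_b = math.pi + delta

R_a, R_b = rot_z(angle_a), rot_z(angle_b)
rotation_gap = max(abs(R_a[i][j] - R_b[i][j])
                   for i in range(3) for j in range(3))

q_a = canonical_quat_about_z(angle_a)
q_b = canonical_quat_about_z(angle_b)
quaternion_gap = max(abs(x - y) for x, y in zip(q_a, q_b))
```

The two matrices differ by about $2\delta$, while the two hemisphere-restricted quaternions differ by almost $2$ in their $\epsilon_z$ component. A regression network trained on such targets would be asked to reproduce that jump.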
A Continuous 6D Representation via Gram–Schmidt
Zhou et al. resolve the obstruction by enlarging the parameter space to $\mathbb{R}^6$ and constructing a continuous decoding map. The recipe is striking in its simplicity. Identify $\mathbb{R}^6$ with the space of ordered pairs $(\mathbf{a}_1, \mathbf{a}_2)$ of $\mathbb{R}^3$ vectors, restricted to the open dense subset on which $\mathbf{a}_1 \ne \mathbf{0}$ and $\mathbf{a}_2$ is not parallel to $\mathbf{a}_1$. Apply Gram–Schmidt orthonormalisation to extract two orthonormal vectors,
\[\mathbf{b}_1 = \frac{\mathbf{a}_1}{\|\mathbf{a}_1\|}, \qquad \mathbf{b}_2 = \frac{\mathbf{a}_2 - (\mathbf{b}_1 \cdot \mathbf{a}_2)\, \mathbf{b}_1}{\|\mathbf{a}_2 - (\mathbf{b}_1 \cdot \mathbf{a}_2)\, \mathbf{b}_1\|},\]and complete the right-handed orthonormal frame by a cross product, $\mathbf{b}_3 = \mathbf{b}_1 \times \mathbf{b}_2$. Stacking yields a rotation matrix
\[R = \big[\, \mathbf{b}_1 \;\; \mathbf{b}_2 \;\; \mathbf{b}_3 \,\big] \in SO(3).\]This map $g_{6D} : \mathbb{R}^6 \to SO(3)$ is continuous, surjective, and admits a continuous right-inverse—namely, return the first two columns of $R$ as a $\mathbb{R}^6$ vector. Conceptually the construction is “forget the third column”—since the third column is determined by the first two via the right-hand rule, $6$ scalars suffice to encode a rotation, and the encoding never crosses an antipodal identification. A leaner 5D variant trims the redundant scale of $\mathbf{a}_1$ to achieve the theoretical minimum embedding dimension, but the 6D form is the one almost universally adopted because Gram–Schmidt is trivially differentiable and well-behaved on GPUs.
A rotation $R \in SO(3)$ is encoded by its first two columns $(\mathbf{b}_1, \mathbf{b}_2) \in \mathbb{R}^6$. The decoding map $\mathbb{R}^6 \to SO(3)$, defined by Gram–Schmidt orthonormalisation of the pair followed by a cross product to complete the right-handed frame, is continuous, surjective, and admits a continuous one-sided inverse. It is free of the topological obstructions inherent in $\le 4$-dimensional parameterisations and is the *de facto* output parameterisation for rotation in modern neural-network architectures.
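The Gram–Schmidt decoding and the “forget the third column” encoding can be sketched in plain Python as follows. The function names are illustrative and do not mirror any particular library’s API; in a learning pipeline the same arithmetic would be written with a differentiable tensor library.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [c / n for c in v]


def rotation_from_6d(a1, a2):
    """Decode (a1, a2) into a rotation matrix with columns b1, b2, b3.

    Assumes a1 is non-zero and a2 is not parallel to a1.
    """
    b1 = normalize(a1)
    proj = dot(b1, a2)
    b2 = normalize([a2[i] - proj * b1[i] for i in range(3)])
    b3 = cross(b1, b2)  # completes the right-handed frame
    return [[b1[i], b2[i], b3[i]] for i in range(3)]


def rotation_to_6d(R):
    """Encode a rotation matrix by its first two columns."""
    return [R[i][0] for i in range(3)], [R[i][1] for i in range(3)]


R = rotation_from_6d([1.0, 2.0, 3.0], [-1.0, 0.5, 2.0])
R_again = rotation_from_6d(*rotation_to_6d(R))
```

Decoding the encoded columns returns the same matrix, which is the continuous right-inverse property in action: an already orthonormal pair passes through Gram–Schmidt unchanged.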
Implications for Robotics
Modern robotic systems increasingly contain a learned component that produces an orientation: 6-DoF pose regression from images for grasping, learned inverse-kinematics modules, diffusion or transformer-based visuomotor policies that emit end-effector targets in $SE(3)$. Whenever the network’s output layer parameterises a rotation, a discontinuous parameterisation compels the network to approximate a discontinuous regression target, which a smooth model class can only do with large errors near the discontinuity. Empirically, replacing quaternion or axis-angle output heads with the 6D representation typically reduces mean angular error, as Zhou et al. report for their pose-estimation and inverse-kinematics experiments, and is often found to accelerate convergence and improve generalisation, with no change to the rest of the architecture. The representation is available in libraries such as pytorch3d and is used in policy-learning frameworks such as Diffusion Policy and in the output heads of dense 6-DoF grasp networks. For the analytic robot kinematics treated in the remainder of this series the choice is moot: the classical parameterisations are perfectly fine for forward kinematics, Newton-iteration IK, and feedback control. But for any pipeline whose model class is “a smooth function from $\mathbb{R}^k$ to rotations,” the 6D representation is the topologically correct first choice.
Summary
We have surveyed the parameterizations of $SO(3)$ used throughout robotics and learning:
- The rotation matrix $R \in SO(3)$ is excellent as an operator but wasteful as a parameterization (9 numbers, 6 constraints).
- Cayley’s formula $R = (I - S)^{-1}(I + S)$ recovers any $R$ other than a half-turn from a skew-symmetric $S$ with three independent entries; it is mostly of historical and algebraic interest.
- Euler and fixed-axis angles (Z-Y-X moving, X-Y-Z fixed, ZYZ, RPY, … twenty-four conventions in all) use the minimum of three numbers but pay with gimbal-lock singularities—ZYZ failing at $\vartheta = 0, \pi$, RPY at $\vartheta = \pm \pi/2$. The fixed-axis/moving-axis duality holds in general: any moving-axis Euler sequence equals the corresponding fixed-axis sequence in reverse.
- The axis-angle / Rodrigues representation, justified by Euler’s rotation theorem, captures any rotation with $(\hat{K}, \theta)$—four numbers, one constraint—but suffers a sign ambiguity at $\theta = \pi$ and an undefined axis at $\theta = 0$.
- Unit quaternions $Q = \{\eta, \boldsymbol{\epsilon}\}$ live on $S^3 \subset \mathbb{R}^4$ and double-cover $SO(3)$. They have no singularities, compose via the quaternion product $Q_1 \star Q_2$, invert by negation of the vector part, interpolate stably via SLERP, and are the de facto internal representation for orientation in modern robotics middleware.
- The choice among classical representations is a tradeoff among compactness, singularity-freeness, ease of composition, and human readability; in practice, codebases routinely interconvert as task convenience dictates.
- For learning-based pipelines, every $\le 4$-parameter representation is topologically discontinuous: $SO(3) \cong \mathbb{RP}^3$ does not embed continuously into $\mathbb{R}^d$ for $d \le 4$. The 6D representation of Zhou et al. (2019), obtained by Gram–Schmidt orthonormalisation of two $\mathbb{R}^3$ vectors followed by a cross product, removes the obstruction and has become the default output parameterisation for neural rotation regressors.
The next post views all of these representations through the unified lens of Lie groups and Lie algebras: $SO(3)$ as a manifold, $\mathfrak{so}(3)$ as its tangent space at the identity, and the exponential map $\exp : \mathfrak{so}(3) \to SO(3)$ that turns axis-angle into a special case of a far more general construction.
References
[1] John J. Craig, Introduction to Robotics: Mechanics and Control, 3rd Edition, Pearson Prentice Hall, 2005. Chapter 2 and Appendix B (rotation parameterizations and the twenty-four Euler/fixed-angle conventions).
[2] Mark W. Spong, Seth Hutchinson, and M. Vidyasagar, Robot Dynamics and Control, 2nd Edition, January 2004. Chapter 2: “Rigid Motions and Homogeneous Transformations.”
[3] Bruno Siciliano, Lorenzo Sciavicco, Luigi Villani, and Giuseppe Oriolo, Robotics: Modelling, Planning and Control, Springer, 2009. Chapter 2, sections 2.4–2.6 (Euler ZYZ and RPY angles, angle/axis, unit quaternion).
[4] Kevin M. Lynch and Frank C. Park, Modern Robotics: Mechanics, Planning, and Control, Cambridge University Press, 2017. Chapter 3 (rotations and rigid-body motions).
[5] Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, and Hao Li, “On the Continuity of Rotation Representations in Neural Networks,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019. arXiv:1812.07035.