Quaternions and Orthogonal 4x4 Real Matrices Henry G. Baker June and October, 1996 Copyright (c) 1996 by Henry G. Baker (hbaker@netcom.com). All rights reserved. This file can be found as URL: ftp://ftp.netcom.com/pub/hb/hbaker/quaternion/orthogonal-4x4.txt This ASCII note is best read with a _fixed-width font_. Abstract -------- We show constructively how to represent the action vO of an _arbitrary_ 4x4 real special (det(O)=1) orthogonal matrix O on a 4-vector v as a triple quaternion product AvB. This representation is unique only up to the sign of A and B, since AvB = (-A)v(-B). We thus expose the intimate relation between 4x4 real orthogonal matrices and quaternions. Notation -------- Since this is an ascii text file, we are forced to severely curtail the use of mathematical notations. About the only symbols that will be needed are the usual (TeX) symbols "^" for superscript and "_" for subscript. However, we will _not_ be making use of the TeX grouping brackets "{}", but will instead use parentheses. We use "/=" for _not equal_. Unless otherwise noted, all matrices in this memo will be _square_. If M is a matrix, then M_ij is the ij'th element of M. The notation (M_ij) means the matrix constructed from the elements M_ij, where i,j range over the number of rows and columns of M, respectively. Thus, M=(M_ij). The _transpose_ of a matrix M is the matrix M' = (M_ji). [W X] [W Y] The transpose of [Y Z] is [X Z]. If z is a complex number, then z* is the complex _conjugate_ of z -- i.e., if z=x+iy, then z* = x-iy. We denote the _determinant_ of the matrix M by det(M), or, when writing out the matrix explicitly, by using "|" brackets instead of "[]" brackets. The determinant of [W X] [W X] |W X| [Y Z] is det([Y Z]), or |Y Z|, whose value is WZ-XY. The _inverse_ of a matrix M (assuming that it has one, i.e., det(M)/=0) is M^-1 -- i.e, M M^-1 = M^-1 M = I, where I is the _identity_ matrix of the same size as M. The inverse of [W X] [ Z -X] M = [Y Z] is (1/det(M)) [-Y W]. A _diagonal_ matrix is a matrix whose only non-zero entries are on the main diagonal -- i.e., if M is diagonal, then M_ij=0 for i/=j. We will use the notation diag(x,y) for a 2x2 diagonal matrix whose non-zero elements along the main diagonal are x and y. The 2x2 identity matrix is [a 0] diag(a,b) = [0 b]. Orthogonal Matrices ------------------- An _orthogonal_ matrix O is a nonsingular (det(O)/=0) _real_ matrix such that O^-1 = O' -- i.e., its inverse is equal to its transpose. Trivial properties of orthogonal matrices: * the transpose and inverse of an orthogonal matrix are themselves orthogonal. * the product of two or more orthogonal matrices (of the same size) is orthogonal. * the determinant of an orthogonal matrix is +/- 1. The proof of this last property is as follows. If O is orthogonal, then OO'=I, hence det(OO')=det(O)det(O')=det(O)det(O)=det(O)^2=det(I)=1. Since O is real, so is det(O), so det(O) = +/- 1. Quaternions ----------- For our purposes, we will define _quaternions_ as 4x4 real matrices of the form [ a b c d] [-b a -d c] [-c d a -b] Q(a,b,c,d) = [-d -c b a]. Thus, a quaternion Q(a,b,c,d) is determined by 4 real parameters -- a,b,c,d. The set of quaternions is _closed_ under matrix multiplication. The set of quaternions is also closed under _arbitrary_ special orthogonal transformations: O Q(a,b,c,d) O' is also a quaternion, for _any_ special orthogonal 4x4 matrix O (det(O)=1). The _conjugate_ Q'(a,b,c,d) of a quaternion Q(a,b,c,d) is simply the _transpose_ of the corresponding matrix -- i.e., Q'(a,b,c,d) = Q(a,-b,-c,-d) [a -b -c -d] [b a d -c] [c -d a b] = [d c -b a]. The _norm_ of a quaternion Q(a,b,c,d) is the square root of the _determinant_ of the corresponding matrix: det(Q(a,b,c,d)) = (a^2 + b^2 + c^2 + d^2)^2 norm(Q(a,b,c,d)) = a^2 + b^2 + c^2 + d^2. Note that norm(Q1 Q2) = norm(Q1) norm(Q2), and that norm(Q') = norm(Q). Note also that if norm(Q(a,b,c,d)) = 1, then the 4x4 matrix corresponding to Q(a,b,c,d) is orthogonal. Note that if c=cos(alpha) and s=sin(alpha), then [ c s 0 0] [-s c 0 0] [ 0 0 c -s] Q(c,s,0,0) = [ 0 0 s c] Define the matrix function flip(M) = C M' C', where C=diag(1,-1,-1,-1), i.e., [1 0 0 0] [0 -1 0 0] [0 0 -1 0] C = [0 0 0 -1] If Q(a,b,c,d) is a quaternion, then Qf(a,b,c,d)=flip(Q(a,b,c,d)) is a 4x4 'flipped' matrix Qf(a,b,c,d) = C Q(a,b,c,d)' C', where C = diag(1,-1,-1,-1) [ a b c d] [-b a d -c] [-c -d a b] = [-d c -b a]. (If you compare the matrices of Qf(a,b,c,d) and Q(a,b,c,d), you will see that Qf = Q', except that the first row and first column have be negated, leaving the first row and column unchanged.) The reason for defining the 'flipped' matrix Qf(a,b,c,d) is as follows. Consider the operation of the matrix Q(a,b,c,d) on the 4-element row vector v = [w x y z]: [ w x y z] [ a b c d] [-b a -d c] [-c d a -b] [-d -c b a] We note that we can derive the same information by considering the first row of Q(w,x,y,z) Q(a,b,c,d): [ w x y z] [ a b c d] [-x w -z y] [-b a -d c] [-y z w -x] [-c d a -b] [-z -y x w] [-d -c b a]. Now consider the same quaternion Q(a,b,c,d), but now operating _from the left_ of Q(w,x,y,z): [ a b c d] [ w x y z] [-b a -d c] [-x w -z y] [-c d a -b] [-y z w -x] [-d -c b a] [-z -y x w] In order to compute the effect of this operation on the row vector [w x y z], we must multiply on the _right_ by Qf(a,b,c,d): [ w x y z] [ a b c d] [-b a d -c] [-c -d a b] [-d c -b a]. In other words, to get the effect of Q(a,b,c,d)Q(w,x,y,z), we must compute [w x y z] Qf(a,b,c,d). Using the concept of 'flipped' quaternion matrices, we can now represent the effect of the quaternion triple product Q(a,b,c,d)Q(w,x,y,z)Q(e,f,g,h) as [w x y z] Q(e,f,g,h) Qf(a,b,c,d) = [w x y z] Qf(a,b,c,d) Q(e,f,g,h) We note that the associativity of quaternion multiplication guarantees that Q(e,f,g,h) and Qf(a,b,c,d) _commute_, i.e., they have the same eigenvectors. In short, the concept of 'flipped' matrices Qf allows us to convert bilateral quaternion triple products into unilateral matrix products. Representing 4x4 Special Orthogonal Matrices by Quaternions ----------------------------------------------------------- If O is a 4x4 _special_ orthogonal matrix, then OO'=I and det(O)=1. O has _four_ (not necessarily distinct) eigenvalues: l1, l2, l3, l4, and we can arrange them in 2 pairs of complex conjugates: l1=l2* and l3=l4*. So, by Schur's Decomposition Theorem, we can factor the orthogonal matrix O as O = U P U', where U is a 4x4 special orthogonal matrix, and where P can be made to have the form: [ c1 s1 0 0] [-s1 c1 0 0] [ 0 0 c2 s2] P = [ 0 0 -s2 c2], where c1=realpart(l1)=realpart(l2), s1=imagpart(l1)=-imagpart(l2), c2=realpart(l3)=realpart(l4), s1=imagpart(l3)=-imagpart(l4). Clearly, c1=cos(alpha), s1=sin(alpha), c2=cos(beta), s2=sin(beta), for some real angles alpha, beta. Unfortunately, unless c1=c2 and s1=-s2, P is _not_ a quaternion itself, nor is it a product of quaternions, since quaternions are closed under multiplication. However, P _can_ be factored as the product of a quaternion and a 'flipped' quaternion P = Qf Q = P1f P2. Let c3=cos((alpha+beta)/2), s3=sin((alpha+beta)/2), c4=cos((alpha-beta)/2), s4=sin((alpha-beta)/2). Then P = Qf(c3,s3,0,0) Q(c4,s4,0,0) = Qf(cos((alpha+beta)/2),sin((alpha+beta)/2),0,0) Q(cos((alpha-beta)/2),sin((alpha-beta/2)),0,0) = P1f P2. In short, the action vP of the matrix P on the vector v=[w,x,y,z] can be simulated by the quaternion triple product Q(c3,s3,0,0) Q(w,x,y,z) Q(c4,s4,0,0). We must now take U into account. For Schur, O = U P U' = U P1f P2 U' = U P1f U' U P2 U' = (U P1f U') (U P2 U') = Af B, where A = flip(Af) = flip(U P1f U') and B = U P2 U'. Now, B is a quaternion, since the special orthogonal transformation U converts the quaternion P2=Q(c4,s4,0,0) into another quaternion. Furthermore, A is also a quaternion, as A = flip(U P1f U') = C (U P1f U')' C' = C U'' P1f' U' C' = C U (C P1' C')' U' C' = C U C'' P1'' C' U' C' = C U C' P1 C U' C' (remember, C=C', since C is diagonal) = flip(U)' P1 flip(U) = flip(U)' Q(c3,s3,0,0) flip(U) Therefore, since flip(U) preserves orthogonality of U, this orthogonal transform preserves the quaternionness of P1=Q(c3,s3,0,0), and hence A is also a quaternion. Rotations in 4 Dimensions and in 3 Dimensions --------------------------------------------- The Schur factorization given above of the special orthogonal matrix O shows a major difference between rotations in 3 dimensions and rotations in 4 dimensions: A general rotation in 4 dimensions consists of two, independent, commutative rotations about two orthogonal fixed _planes_. I.e., instead of a basic 4D rotation having a fixed _axis_, as in 3D, it has a fixed _plane_. Consider the operation of the matrix U above as a _change of basis_ to a new basis in which the first rotation rotates 1,i about the j,k 'axis' (i.e., holding the j,k plane constant), while the second rotation rotates j,k about the 1,i 'axis' (i.e., holding the 1,i plane constant). Furthermore, every vector in the subspace 1,i is perpendicular (orthogonal) to every vector in the subspace j,k, so the two subspaces are orthogonal. In other words, the matrix P above factors as follows: [ c1 s1 0 0] [ c1 s1 0 0] [1 0 0 0] [-s1 c1 0 0] [-s1 c1 0 0] [0 1 0 0] [ 0 0 c2 s2] [ 0 0 1 0] [0 0 c2 s2] P = [ 0 0 -s2 c2] = [ 0 0 0 1] [0 0 -s2 c2] [1 0 0 0] [ c1 s1 0 0] [0 1 0 0] [-s1 c1 0 0] [0 0 c2 s2] [ 0 0 1 0] = [0 0 -s2 c2] [ 0 0 0 1] = P2 P1 = P1 P2 We can now better understand the usage of quaternions for rotations in 3 dimensions. When manipulating 3D vectors, we typically identify 3D space with the subspace Q(0,x,y,z), where x,y,z are the coordinates of a 3D space point (or vector). In order to perform a rotation on this 3D subspace, we do a 4D rotation that preserves the 'plane' 1,n, where n is a vector indicating the 3D axis of the 3D rotation. Since our 3D rotation will preserve a plane, the angle 'alpha' in the decomposition above will be _zero_, and hence the quaternion decomposition will look like Q(c3,s3,0,0) Q(0,x,y,z) Q(c4,s4,0,0) = Q(cos(beta/2),sin(beta/2),0,0) Q(0,x,y,z) Q(cos(-beta/2),sin(-beta/2),0,0) = Q(cos(beta/2),sin(beta/2),0,0) Q(0,x,y,z) Q(cos(beta/2),-sin(beta/2),0,0) = Q(cos(beta/2),sin(beta/2),0,0) Q(0,x,y,z) Q(cos(beta/2),sin(beta/2),0,0)' which becomes a classical quaternion expression R Q R' for 3D rotation after changing the basis back to the original basis by means of the special orthogonal matrix U. Decomposition of 4x4 Special Orthogonal O into Quaternions ---------------------------------------------------------- Although the Schur decomposition given above works, and Schur decompositions are readily available in matrix libraries -- e.g., MATLAB, there is an easier way (suggested by Shoemake) to decompose a given special orthogonal matrix O into quaternions. Let [o11 o12 o13 o14] [o21 o22 o23 o24] [o31 o32 o33 o34] O = [o41 o42 o43 o44] Then let Q1 = Q(o11,o12,o13,o14) (i.e., construct a quaternion Q1 from the first row of the given matrix O), and let P = O (Q1)^-1 = O/Q1. Then [1 0 0 0] [0 p22 p23 p24] [0 p32 p33 p34] P = O (Q1)^-1 = O/Q1 = [0 p42 p43 p44], i.e., P holds the first coordinate fixed and has a special orthogonal 3x3 submatrix -- i.e., a 3x3 rotation matrix -- and P Q1 = O. We can now use a standard technique (e.g., [Salamin79]) to extract the angle alpha and axis n from this 3x3 rotation matrix, and use them to construct a quaternion Q2 which implements this same rotation. But then the mapping of 4D vector v by the matrix O is computed by vO = v(P Q1) = (vP) Q1 = (Q2 v Q2') Q1 = Q2 v (Q2' Q1) = A v B, where A = Q2 and B = (Q2' Q1) are the quaternions we have been seeking. Computing the Planar 'Axes' of the 4D Rotation AvB -------------------------------------------------- Given two arbitrary unit quaternions A,B, we would like to determine the two orthogonal rotations that implement the general 4D rotation AvB about the two orthogonal 'axis' planes. Any unit quaternion A can be written in the form exp(alpha n), where alpha is a real number and n is a 'pure' unit quaternion n=Q(0,x,y,z) where x^2+y^2+z^2=1. Furthermore, we define exp(alpha n) = cos(alpha) + n sin(alpha) = Q(cos(alpha),x sin(alpha),y sin(alpha),z sin(alpha)) Let A = exp(gamma a) and B = exp(delta b). Then A = exp(gamma a) = exp((gamma+delta)/2 a) exp((gamma-delta)/2 a) = exp(alpha a) exp(beta a) and B = exp(-(gamma-delta)/2 b) exp((gamma+delta)/2 b) = exp(-beta b) exp(alpha b) by setting alpha = gamma+delta/2 and beta = gamma-delta/2. Then AvB = exp(gamma a) v exp(delta b) = exp(alpha a) exp(beta a) v exp(-beta b) exp(alpha b) = A1 A2 v B2 B1 = A2 A1 v B1 B2 (A1 commutes with A2 because both are in the complex subfield Q(a), and B1 commutes with B2 because both are in the complex subfield Q(b).) Now consider the map v |-> A1 v B1 = exp(alpha a) v exp(alpha b). The point 0 maps to A1 0 B1 = 0, so this map fixes the origin. The point a-b maps to A1 (a-b) B1 = a-b, so this map fixes a-b. The point 1+ab maps to A1 (1+ab) B1 = 1+ab, so this map fixes 1+ab. Thus, the map v |-> (A1 v B1) fixes the plane defined by the 3 points: 0, a-b, 1+ab. Also consider the map v |-> A2 v B2 = exp(beta a) v exp(-beta b). The point 0 maps to A2 0 B2 = 0, so this map also fixes the origin. The point a+b maps to A2 (a+b) B2 = a+b, so this map fixes a+b. The point 1-ab maps to A2 (1-ab) B2 = 1-ab, so this map fixes 1-ab. Thus, the map v |-> (A2 v B2) fixes the plane defined by the 3 points: 0, a+b, 1-ab. Now the 4D vector a-b is orthogonal to a+b, becaue a-b and a+b are the diagonals of a rhombus; similar reasoning proves that 1+ab is orthogonal to 1-ab. If P,Q are two quaternions, then their inner product is the scalar part of P Q', which is zero if the quaternions are orthogonal, i.e., P Q' + Q P' = 0 if P,Q are orthogonal. We now prove that 1-ab is orthogonal to a+b: (a+b)(1-ab)'+(1-ab)(a+b)' = (a+b)(1-b'a')+(1-ab)(-a-b) = (a+b)(1-ba)+(ab-1)(a+b) = a+b-aba-bba+aba+abb-a-b = -bba+abb = a-a = 0 Similarly, 1+ab is orthogonal to a+b, 1+ab is orthogonal to a-b, and 1-ab is orthogonal to a-b. (a+b)(1+ab)'+(1+ab)(a+b)' = (a+b)(1+b'a')+(1+ab)(-a-b) = (a+b)(1+ba)+(1+ab)(-a-b) = a+b+aba+bba-a-b-aba-abb = bba-abb = -a+a = 0 (a-b)(1-ab)'+(1-ab)(a-b)' = (a-b)(1-ba)+(1-ab)(-a+b) = a-b-aba+bba-a+b+aba-abb = bba-abb = -a+a = 0 (a-b)(1+ab)'+(1+ab)(a-b)' = (a-b)(1+ba)+(1+ab)(-a+b) = a-b+aba-bba-a+b-aba+abb = -bba+abb = a-a = 0 Therefore, we have shown that the map v |-> AvB can be factored into the composition of the map v |-> A1 v B1 and the map v |-> A2 v B2. Furthermore, the first map rotates about the 'axis' plane defined by the points 0, a-b, 1+ab, while the second map rotates about the 'axis' plane defined by the points 0, a+b, 1-ab, and the two 'axis' planes are orthogonal to one another. The pure vectors a,b may be computed from A,B as follows: A = cos(gamma) + a sin(gamma), so Re(A) = (A+A')/2, and a = (A-Re(A))/sqrt(1-Re(A)^2); similarly, Re(B) = (B+B')/2, and b = (B-Re(B))/sqrt(1-Re(B)^2). Conclusions ----------- Given an arbitrary 4x4 special orthogonal matrix O (det(O)=1), we have shown two different ways to construct unit quaternions A,B from O, such that the action vO of the matrix O on the 4-vector v=[w,x,y,z] is the same as the quaternion triple product A Q(w,x,y,z) B. Furthermore, we have shown how to factor this triple product into the product A1 A2 Q(w,x,y,z) B2 B1, so that the inner mapping fixes one 'axis plane' and the outer mapping fixes an orthogonal 'axis plane'. Finally, we have shown how to compute 3 points on each of these axis planes. References ---------- Coxeter, H.S.M. Regular Complex Polytopes, 2nd Ed. Cambridge University Press, Cambridge, 1991. ISBN 0-521-39490-2. (Especially Chapter 6: "The geometry of quaternions".) Eves, Howard. Elementary Matrix Theory. Dover Publications, Inc., New York, 1966. ISBN 0-486-63946-0. Neumann, Peter M., Stoy, Gabrielle A., and Thompson, Edward C. Groups and Geometry. Oxford University Press, 1994. ISBN 0-19-853451-5. Especially Chapter 16: "Complex numbers and quaternions". Salamin, Eugene. "Application of Quaternions to Computation with Rotations". Stanford AI Lab. Internal Working Paper, 1979. On the World-Wide Web at ftp://ftp.netcom.com/pub/hb/hbaker/quaternion/stanfordaiwp79-salamin.ps.gz (also .dvi.gz).