denote the actual nearest point to p on surface j. Define the weights
η
j
(p) =
1
1 + p − n
j
(p)
2
k
(3)
where k = 10
4
is a smoothness parameter. A virtual nearest point
is then obtained by normalizing the weights η
j
to sum to 1, and
computing the weighted average of the n
j
’s. Intuitively, when p
i
is far from any surface (and c
i
is large), the cost L
CI
will push it
towards some average of the surface ”mass”. When p
i
gets close
to a surface, it will be pushed towards the nearest point on that sur-
face. This construction also makes L
CI
smooth, which facilitates
numerical optimization.
4.2
Inverse dynamics and physics-violation cost
Next we describe the physics-violation cost L
Physics
. This cost has
two components. One is general and depends on our novel formu-
lation of contact dynamics which is essential to the CIO method.
The other is specific to the simplified model of multi-body dynam-
ics described below – which can be replaced with a full physics
model without modifying the rest of the method. In this subsection
we focus on a single time step and omit the time index t.
Let f ∈ R
6N
denote the vector of contact forces acting on all N
end-effectors. We have a 6D vector per contact because we allow
torsion around the surface normal, and furthermore the origin of
the contact force is allowed to move inside the end-effector surface
patch. Thus, unlike previous work which models contact as a sum
a multiple contact points, we model an entire contact surface patch,
allowing us to reason about contact at a coarser scale and speed up
the optimization. Note that all potential contacts are considered at
all times, and so the dimensionality of the contact force vector re-
mains constant, resulting in smoothness of the cost. Contact forces
here are associated with individual end-effectors and not with pairs
of contacting bodies. We only model contacts at pre-defined end-
effectors, although our definition is sufficiently general to include
surface patches on any body part.
Let J (q) ∈ R
6N ×D
denote the Jacobian matrix mapping general-
ized velocities ˙
q to contact-space velocities for the contact model
described above; D = dim ( ˙
q) is the number of degrees of freedom
in the character. Let τ (q, ˙
q, ¨
q) denote the inverse of the smooth
dynamics, which equals the sum of contact and applied forces:
τ (q, ˙
q, ¨
q) = J (q)
T
f + Bu
(4)
Here u is the vector of applied forces/controls in the actuated space.
They are mapped to the full space by the matrix B, which has zeros
in the rows corresponding to ”root” forces and torques acting on the
torso or on passive objects. The smooth inverse dynamics are
τ (q, ˙
q, ¨
q) = M (q) ¨
q + C (q, ˙
q) ˙
q + g (q)
(5)
where M is the inertia matrix, C the matrix of Coriolis and cen-
trifugal terms, and g is gravity.
Our goal now is to complete the inverse dynamics computation, and
in particular recover f , u given τ, J, B. We do this by minimizing
the squared residual in (4) subject to friction-cone constraints on f
and quadratic regularization for f and u. Thus we have the follow-
ing quadratic programming (QP) problem:
f , u =
arg min
f ,u
J
T
f + Bu − τ
2
+ f
T
W f + u
T
Ru
subject to
Af ≤ b
(6)
The pose-dependent matrix A (q) and vector b (q) encode the stan-
dard pyramid approximation to the friction cone which makes the
inequality constraints linear. They also include linear inequality
constraints that restrict the origin point of each contact force to
the surface patch of its corresponding end-effector. Currently these
patches are rectangles. For end-effectors that are allowed to grab
(i.e. hands) we omit the friction-cone constraints, allowing both
positive and negative normal forces.
The control regularization matrix R is constant and symmetric posi-
tive definite. The contact force regularization matrix W depends on
the auxiliary variables: W is diagonal and its 6 diagonal elements
corresponding to the i-th contact force vector are
W
j,j
=
k
0
c
2
i
+ k
1
(7)
for all j between 6i − 5 and 6i. We used k
0
= 10
−2
, k
1
= 10
−3
.
The components of W corresponding to hand end-effectors were
four times larger, because hands are generally weaker than feet.
The quadratic program (6) is convex and therefore has a unique so-
lution – which can be found using a number of algorithms. Here
we used a specialized sequential solver due to [Lee and Goswami
2010] because it is easy to implement efficiently, although the re-
sults obtained with an off-the-shelf QP solver were very similar.
In summary, inverse dynamics are computed by first computing the
quantities τ (s) , J (s) , A (s) , b (s) , W (s) as described above,
and then solving (6) which yields f (s) and u (s). Re-introducing
the time index t, the physics-violation cost is
L
Physics
(s) =
t
J
t
(s)
T
f
t
(s) + Bu
t
(s) − τ
t
(s)
2
(8)
Note that even if this cost can be reduced to zero by a certain choice
of f and u, the actual f and u computed in (6) will generally be
different because of the extra regularization terms.
4.2.1
Trade-off between
L
CI
and
L
Physics
Let us now examine how the two cost terms L
CI
and L
Physics
de-
fined in (2) and (8) are affected by the auxiliary variables c. The
term L
CI
achieves its global minimum of zero when all c’s are set
to zero. However this makes the W matrix in (7) large, contact
forces become expensive, and the QP solver for (6) prefers to vi-
olate physics rather than use large contact forces – which in turn
leads to a substantial L
Physics
term. Conversely if all c ’s are large,
the QP solver is able to reconcile any trajectory with the inverse dy-
namics model by generating contact forces at all end-effectors, in-
cluding those that are not actually in contact. This makes L
Physics
small, but now L
CI
is large because end-effectors that are not in
contact have large c’s. Thus the optimal trade-off is achieved when
c
i
is large only for end-effectors that are in contact, and furthermore
the trajectory can be generated using contact forces only at those
end-effectors. In other words, at the optimal solution the auxiliary
variables c correspond to the contacts that end up being active.
Since the controls u are constrained to act in the actuated space,
root helper forces (often used in prior work for continuation) are
not allowed here. If we were to allow such forces the above trade-
off in terms of c would no longer hold. On the other hand, the ability
of our method to generate contact forces from a distance provides
an even more effective continuation method: non-physical behavior
in the early phases of optimization is still possible, but it is always
associated with potential contacts, making it easier to transition to
physically-realistic contact dynamics later in optimization.