4.2.2 Simplified physics model
While the CIO method as described above can be used with stan-
dard physics models as implemented in existing physics engines,
our implementation relies on a simplified model yielding a favor-
able trade-off between physical realism and optimization efficiency.
Instead of representing the pose $q$ directly and then computing the
end-effector positions $p_i(q)$ using forward kinematics, we represent
the end-effector positions as well as their orientations directly
(i.e. as functions of spline parameters contained in $s$) and then
define the pose $q$ using inverse kinematics; see Appendix. All mass
is assumed to be concentrated at the root bodies: the torso of each
character, as well as any passive objects. Non-smooth movements
of the (now massless) limbs are avoided by including an acceleration
cost described later. The inverse dynamics are still in the form
(4) and the quadratic program defining the contact force and control
is still in the form (6), but all computations are now simplified.
Representing the pose in terms of end-effector positions and orien-
tations makes it difficult to enforce kinematic constraints exactly.
However we turn this to our advantage, by introducing an addi-
tional continuation method that allows limbs to stretch and joint
limits to be violated early in optimization. This is done by adding
quadratic costs to $L_{\mathrm{Physics}}$ that penalize any deviations of the limb
lengths from their reference values as well as any joint limit
violations. We also penalize penetration of the character's body parts
(approximated for collision with capsules shown in figure 2) against
the environment, or other body parts.
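The soft kinematic penalties described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tuple layout of `limbs` and `joints`, and the single `weight` argument (standing in for whatever weighting the continuation schedule applies) are assumptions made for illustration, and the capsule penetration term is omitted.

```python
import math

def limb_length_cost(root, effector, ref_length):
    """Quadratic penalty on the deviation of a limb's length from its reference."""
    length = math.dist(root, effector)
    return (length - ref_length) ** 2

def joint_limit_cost(angle, lower, upper):
    """Quadratic penalty on how far a joint angle lies outside [lower, upper]."""
    violation = max(lower - angle, 0.0, angle - upper)
    return violation ** 2

def soft_kinematic_cost(limbs, joints, weight=1.0):
    """Sum both penalties.

    limbs:  iterable of (root_position, effector_position, reference_length)
    joints: iterable of (angle, lower_limit, upper_limit)
    weight: hypothetical scalar weight (an assumption, not from the paper)
    """
    cost = sum(limb_length_cost(r, e, l0) for r, e, l0 in limbs)
    cost += sum(joint_limit_cost(a, lo, hi) for a, lo, hi in joints)
    return weight * cost
```

Because both penalties are zero when the constraints hold and grow smoothly otherwise, a limb may stretch early in optimization and is pulled back to its reference length as the penalty weight increases.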
4.3 High-level goals and task cost
The cost $L_{\mathrm{Task}}(s)$ encodes the high-level goals of the movement.
It includes task-specific terms specifying the desired outcome, and
generic terms (integrated over time) specifying that the movement
should be energy-efficient and smooth:
$$L_{\mathrm{Task}}(s) = \sum_b \ell_b(q_T(s)) + \sum_t \left( \| f_t(s) \|^2 + \| u_t(s) \|^2 + \| \ddot{q}_t(s) \|^2 \right) \qquad (9)$$
Here $\ell_b$ are task-specific terms which depend only on the final pose
$q_T$, and $b$ is an index over different tasks. Several tasks can be
composed together, such as combining a standing task with the
moving-to-target task. We use the above general form of $L_{\mathrm{Task}}$ for all
tasks except for kicking/punching. In that case we specify an $\ell_b$ at
regular intervals when each target should be hit, and also include a
dependence on $\dot{q}$ because we want the targets to be hit with a certain
end-effector velocity.
The general procedure for constructing the task-specific costs $\ell_b$ is
to identify a vector of positional (and optionally velocity) features
$h_b(q)$ that are key to task $b$, define the desired feature values $h_b^*$ at
the end of the movement (or at other important points in time such
as target hits), and then construct $\ell_b$ as

$$\ell_b(q_T(s)) = \| h_b(q_T(s)) - h_b^* \|^2 \qquad (10)$$
In this way, a final position task $\ell_{\mathrm{pos}}$ can be specified by using $h_{\mathrm{pos}}$
that selects the torso position, and setting $h_{\mathrm{pos}}^*$ to the desired position.
A final orientation task $\ell_{\mathrm{dir}}$ can be defined similarly for the torso facing
direction. A standing task $\ell_{\mathrm{stand}}$ can be expressed by using a
combination of $h_{\mathrm{stand}}$ and $h_{\mathrm{stand}}^*$ which specifies that the center of the torso
should be between the two feet, the feet be fully extended, and the torso
direction be aligned with the vertical direction vector.
The relative importance of the different features can be adjusted by
scaling the corresponding elements of h.
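As a rough illustration of how (9) and (10) fit together, the following Python sketch evaluates a feature-based terminal cost and adds the running effort and smoothness terms. The function names, the per-element `scale` vector (one way to realize the feature scaling mentioned above), and the list-of-vectors layout are assumptions made for this example, not part of the original method description.

```python
def feature_cost(h, h_star, scale=None):
    """Squared distance between features h and targets h_star, as in (10).

    scale: optional per-element factors (hypothetical) that adjust the
           relative importance of individual features.
    """
    if scale is None:
        scale = [1.0] * len(h)
    return sum((s * (a - b)) ** 2 for s, a, b in zip(scale, h, h_star))

def sq_norm(v):
    """Squared Euclidean norm of a vector given as a sequence of floats."""
    return sum(x * x for x in v)

def task_cost(final_features, forces, controls, accels):
    """Terminal task terms plus running terms, in the spirit of (9).

    final_features: list of (h, h_star, scale) tuples, one per task b,
                    evaluated at the final pose
    forces, controls, accels: lists over time samples of f_t, u_t, qddot_t
    """
    terminal = sum(feature_cost(h, hs, sc) for h, hs, sc in final_features)
    running = sum(sq_norm(f) + sq_norm(u) + sq_norm(a)
                  for f, u, a in zip(forces, controls, accels))
    return terminal + running
```

Composing tasks then amounts to appending more `(h, h_star, scale)` entries, for example one for a target torso position and one for the standing features.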
4.4 Heuristic sub-goals and hint cost
In the absence of good initialization – which in the present context
would correspond to motion capture data or other detailed user in-
puts we aim to avoid – numerical optimization can be sped up by
providing heuristic sub-goals early on, and then disabling them near
convergence. Such heuristics (also known as shaping) are not meant
to be part of the true cost, but rather guide the solution to a region
from where the true cost can be optimized efficiently. We found that
even though most of the behaviors we studied could be synthesized
without such heuristics, in some cases (particularly those involving
two characters) a certain type of heuristic helps. This heuristic is
based on the ZMP stability criterion used in locomotion, where the
objective is to keep the "zero moment point" $z(q, \ddot{q})$ in the convex
hull of the support region [Vukobratovic and Borovac 2004]. Let
$n(z)$ denote the point in the convex hull nearest to $z$ (in a soft-min
sense). We compute $n$ by expressing it as a convex combination of the
end-effector positions, $n = \sum_i \lambda_i p_i$ where $\lambda_i \ge 0$
and $\sum_i \lambda_i = 1$, and solving for the coefficients $\lambda$ using quadratic
programming regularized by the same weights $W$ as in (7). Then
the hint cost is

$$L_{\mathrm{Hint}}(s) = \sum_t \max\left( \| z_t(s) - n(z_t(s)) \| - \epsilon,\ 0 \right)^2 \qquad (11)$$

This is a half-quadratic that starts at a distance $\epsilon$ away from the
convex hull. The parameter $\epsilon$ is used to adjust how strictly we want
to enforce the ZMP stability criterion.
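The hint cost can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' solver: the nearest hull point is found by plain projected gradient descent over the simplex of coefficients, which replaces the regularized quadratic program with weights W described above and is not a soft-min. The names `project_to_simplex`, `nearest_hull_point`, `hint_cost`, and the tolerance argument `eps` are hypothetical.

```python
import math

def project_to_simplex(v):
    """Euclidean projection of v onto {lam : lam_i >= 0, sum(lam) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold of the largest valid k
    return [max(x - theta, 0.0) for x in v]

def nearest_hull_point(z, points, iters=200):
    """Approximate the point of conv(points) nearest to z.

    Minimizes 0.5 * ||sum_i lam_i p_i - z||^2 over the simplex by
    projected gradient descent (a stand-in for the paper's QP).
    """
    dim = len(z)
    lam = [1.0 / len(points)] * len(points)
    # Step 1/L, with L bounded by the sum of squared point norms.
    step = 1.0 / max(sum(x * x for p in points for x in p), 1e-12)
    for _ in range(iters):
        n = [sum(l * p[d] for l, p in zip(lam, points)) for d in range(dim)]
        residual = [n[d] - z[d] for d in range(dim)]
        grad = [sum(p[d] * residual[d] for d in range(dim)) for p in points]
        lam = project_to_simplex([l - step * g for l, g in zip(lam, grad)])
    return [sum(l * p[d] for l, p in zip(lam, points)) for d in range(dim)]

def hint_cost(zmps, supports, eps):
    """Half-quadratic penalty, zero within distance eps of the support hull."""
    total = 0.0
    for z, points in zip(zmps, supports):
        n = nearest_hull_point(z, points)
        total += max(math.dist(z, n) - eps, 0.0) ** 2
    return total
```

For a unit-square support region, a zero moment point at (2, 0.5) lies at distance 1 from the hull and is penalized, while a point inside the square contributes nothing.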
4.5 Numerical optimization and continuation
We optimize the composite cost $L(s)$ defined in (1) using an
off-the-shelf implementation of the L-BFGS algorithm. The dimensionality
of the vector $s$ is $(12(N+1)+N)K$, where again $N$ is the
number of end-effectors and $K$ is the number of movement phases.
The specific representation s used here is defined in (12) and (13) of
Appendix A. We use K between 10 and 20 depending on the com-
plexity of the task. Each phase lasts 0.5 sec. The inverse dynamics
and cost are evaluated at 0.1 sec intervals (note that the analytical
spline representation allows us to evaluate the dynamics and cost at
any point in time). The gradient $\nabla L(s)$, which is needed for
numerical optimization, is approximated using finite differences (with
a step size of $10^{-3}$). Our implementation of finite differences takes
advantage of the fact that many of the cost terms depend only on the pose
at a single point in time, and do not need to be recomputed when
the rest of the trajectory is perturbed.
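A minimal Python sketch of this idea is given below, assuming the total cost is a sum of per-sample terms. The callback `term_cost(s, t)` and the helper `affected(j)`, which lists the time samples whose cost can change when parameter `j` is perturbed, are hypothetical names introduced for illustration. `num_params` simply evaluates the dimensionality formula stated above.

```python
def num_params(n_effectors, n_phases):
    """Dimensionality of s: (12(N + 1) + N) * K."""
    return (12 * (n_effectors + 1) + n_effectors) * n_phases

def fd_gradient(term_cost, s, affected, eps=1e-3):
    """Forward-difference gradient of sum_t term_cost(s, t).

    term_cost(s, t): cost contribution at time sample t (hypothetical callback)
    affected(j):     sample indices whose cost depends on s[j]; all other
                     terms are unchanged by the perturbation and are skipped
    """
    base = {}  # cache of unperturbed per-sample costs
    grad = []
    for j in range(len(s)):
        perturbed = list(s)
        perturbed[j] += eps
        g = 0.0
        for t in affected(j):
            if t not in base:
                base[t] = term_cost(s, t)
            g += (term_cost(perturbed, t) - base[t]) / eps
        grad.append(g)
    return grad
```

For example, with N = 4 end-effectors and K = 10 phases the parameter vector has 640 entries, so skipping the samples that a perturbation cannot influence saves a large share of the cost evaluations.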
Continuation is implemented by weighting the four terms in (1) dif-
ferently in different phases of the optimization process (not to be
confused with movement phases). The optimization process has
three phases as follows. In Phase 1 only $L_{\mathrm{Task}}$ is enabled. This
causes the optimizer to rapidly discover a movement that achieves
the task goals without being physically realistic. In Phase 2 we
enable all four terms, except that $L_{\mathrm{Physics}}$ is down-weighted by 0.1 so that
physical consistency is enforced gradually. In Phase 3 we fully
enable all terms except for $L_{\mathrm{Hint}}$ – which is no longer needed and is
undesirable at this point, because we do not want it to affect the
final solution. Qualitatively, Phase 1 corresponds to rapid discovery
combined with wishful thinking; Phase 2 corresponds to cautious
enforcement of physical realism while being guided by optional
hints; Phase 3 corresponds to refinement of the final solution. The
solution obtained at the end of each phase is perturbed with small
zero-mean Gaussian noise (to break any symmetries) and used to
initialize the next phase. The initialization for Phase 1 is completely
uninformative – a static initial pose. We found that using such con-
tinuation is often important. Exactly the same continuation scheme
was successful in all of the diverse behaviors we studied, and so our
method does not need behavior-specific adjustments.
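The three-phase schedule can be summarized in a short Python sketch. Several details are assumptions made for illustration: the fourth term of (1) is not named in this section and is labeled `ci` here, "down-weighted by 0.1" is read as multiplying the term by 0.1, the noise standard deviation is arbitrary, and `gradient_descent` is a simple stand-in optimizer rather than the L-BFGS implementation used in the paper.

```python
import random

# Weights of the four cost terms in each optimization phase.
# "ci" is a placeholder name for the term of (1) not named in this section.
PHASE_WEIGHTS = [
    {"task": 1.0, "physics": 0.0, "ci": 0.0, "hint": 0.0},  # Phase 1
    {"task": 1.0, "physics": 0.1, "ci": 1.0, "hint": 1.0},  # Phase 2
    {"task": 1.0, "physics": 1.0, "ci": 1.0, "hint": 0.0},  # Phase 3
]

def composite_cost(terms, weights, s):
    """Weighted sum of cost terms; disabled terms are not evaluated."""
    return sum(w * terms[name](s) for name, w in weights.items() if w != 0.0)

def gradient_descent(cost, s, steps=300, lr=0.1, eps=1e-3):
    """Stand-in optimizer (NOT L-BFGS): forward-difference gradient descent."""
    s = list(s)
    for _ in range(steps):
        base = cost(s)
        grad = []
        for j in range(len(s)):
            perturbed = list(s)
            perturbed[j] += eps
            grad.append((cost(perturbed) - base) / eps)
        s = [x - lr * g for x, g in zip(s, grad)]
    return s

def run_continuation(terms, s0, optimize, noise_std=1e-2, seed=0):
    """Run the phases in order, perturbing the solution between phases."""
    rng = random.Random(seed)
    s = list(s0)
    for i, weights in enumerate(PHASE_WEIGHTS):
        if i > 0:
            # Small zero-mean Gaussian noise to break symmetries.
            s = [x + rng.gauss(0.0, noise_std) for x in s]
        s = optimize(lambda v, w=weights: composite_cost(terms, w, v), s)
    return s
```

Here `terms` is a dictionary mapping the four names to cost functions of s, and `optimize` is any routine that takes a cost function and an initial guess and returns an improved parameter vector.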