Discovery of Complex Behaviors through Contact-Invariant Optimization


3 Rationale and Overview

Our work was motivated by the observation that contact interactions are essential for most animal and human movements. Previous work on motion synthesis through numerical optimization has either pre-defined the contact interactions, or expressed them as functions of the movement trajectory and thereby optimized them indirectly, almost as a side effect of trajectory optimization. Our reasoning was that, if contacts are essential, they should play a more central role. This suggested optimizing over auxiliary decision variables which directly specify when and where contacts are made.

Our first attempts to develop such a method failed in interesting ways. We defined discrete variables specifying contacts between pairs of objects, and a continuous feedback control method that pushed the character along trajectories consistent with this high-level contact specification. We found that, even though setting such discrete variables was relatively easy for humans (resulting in a novel way of motion scripting), optimizing them automatically was basically intractable. This was partly because discrete optimization is generally hard, and partly because it is difficult to tell what constitutes a good set of contacts without simultaneously considering the detailed trajectory that instantiates them. These difficulties suggested that decisions regarding contact interactions should be encoded as continuous rather than discrete variables, and should be optimized simultaneously with the movement trajectory.

Continuous specification of desired contacts naturally leads to the idea of weights in cost functions. The contact-related auxiliary variables we use here (denoted $c_i \ge 0$ for contact $i$) have the following semantics: if $c_i$ is large, contact $i$ must be active (i.e. the corresponding bodies must be touching), while if $c_i$ is small we do not care what happens at contact $i$. Another important observation is that complex behaviors are naturally decomposed into phases, and the set of contacts remains invariant in each phase. The contact forces are not invariant (on the contrary, they change a lot) but the presence or absence of a contact is. This suggests making $c_{i,t}$ piecewise constant over time, i.e. defining $c_{i,\phi(t)}$ where $\phi(t)$ is the movement phase at time $t$. The most direct approach at this point would be to define auxiliary cost terms weighted by $c_{i,\phi}$. If we did that, however, the optimizer would immediately set all $c$'s to zero and effectively eliminate our auxiliary costs. One way to prevent this would be to constrain the sum of the $c$'s, but this amounts to telling the optimizer how much overall contact it should use in a given behavior, and we do not know the answer in advance.
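The collapse of the weights to zero can be seen in a toy setting. The following is a minimal sketch, not the paper's implementation: it assumes the contact violations are held fixed, that the only term involving the weights is a hypothetical cost of the form sum_i c_i * e_i^2, and that the weights are updated by projected gradient descent. The function names and numbers are illustrative only.

```python
def weighted_violation_cost(c, e):
    """Toy auxiliary cost sum_i c_i * e_i**2; nothing else depends on c."""
    return sum(ci * ei * ei for ci, ei in zip(c, e))


def projected_gradient_step(c, e, step=0.1):
    """One gradient step on the weights alone, projected back onto c >= 0."""
    # The derivative of c_i * e_i**2 with respect to c_i is e_i**2, which is
    # never negative, so every step can only shrink the weights.
    return [max(0.0, ci - step * ei * ei) for ci, ei in zip(c, e)]


e = [0.5, 1.0, 2.0]  # hypothetical fixed contact violations
c = [1.0, 1.0, 1.0]  # initial contact weights
for _ in range(100):
    c = projected_gradient_step(c, e)

final_cost = weighted_violation_cost(c, e)
```

In this sketch every weight ends at zero and the auxiliary cost vanishes with it, regardless of how large the violations are. This is the degenerate solution the text describes.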

Another way to prevent the optimizer from eliminating the auxiliary costs, which is what we use here, is to make the auxiliary variables $c$ also affect the dynamics, in such a way that setting them to zero would be suboptimal. Since they are associated with contacts, the natural way to enter the dynamics is to allow contact forces to be generated at contact $i$ only when the corresponding $c_i$ is large. This has another unexpected benefit: instead of finding the active contacts and performing various calculations that depend on the output of the collision detector (and are therefore non-smooth and difficult to optimize over), we can assume that the active contacts are those whose $c$'s are large, resulting in simpler computations with smooth output. The approach outlined above does require all potential contacts to be enumerated in advance, and an auxiliary variable $c_i$ to be defined for each of them.

Finally, we introduce a simplified physics model consistent with our contact-centric approach. Instead of parametrizing the joint-space configuration of the character and using forward kinematics to compute end-effector positions and orientations, we parameterize the end-effectors and use inverse kinematics to define the joint-space configuration. In this way the optimizer can work directly with the end-effectors to which the auxiliary variables are associated. Kinematic constraints (i.e. fixed limb sizes and joint limits) are enforced as costs, and the dynamics are simplified by assuming that all mass is concentrated at the torso. We do not yet know if this physics simplification was necessary, and will find out in future work.

4 Contact-Invariant Optimization

We now describe the CIO method in detail. We begin with a general formulation that can be adapted to different types of physics models and tasks. We then describe our specific simplification of physics, followed by details of the behavioral tasks and the numerical optimization procedure.

4.1 General formulation and contact-invariant cost

Let $s$ denote the real-valued solution vector that encodes the movement trajectory and auxiliary variables. The trajectory can be represented directly by listing the sequence of poses, or by function approximators such as splines (which is what we use here). All we require is that the character pose $q_t(s)$ is a well-defined function of $s$ at each (discrete) point in time $1 \le t \le T$. The auxiliary variables $c_{i,\phi(t)}(s) \ge 0$ are also included in $s$. The overall movement time $T$ is partitioned into $K$ intervals or phases, and $1 \le \phi(t) \le K$ is the index of the phase to which time step $t$ belongs. In our current implementation the number of phases is predefined and their durations are equal, although in principle these parameters can also be optimized in an outer loop. The index $1 \le i \le N$ ranges over "end-effectors". Here end-effector does not refer to an entire rigid body (e.g. a hand or a foot), but to a specific surface patch on one of the rigid bodies. These patches are the only places where contact forces can be exerted, as explained below. The function $p_i(q) \in \mathbb{R}^3$ returns the center of patch $i$.
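The phase bookkeeping described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the helper names `phase_index` and `contact_weight` are hypothetical, phases are taken to be of equal duration (exactly equal only when K divides T), and the end-effector index is zero-based in the code while time steps and phases stay one-based as in the text.

```python
def phase_index(t, T, K):
    """Return the 1-based phase phi(t) of time step t, for 1 <= t <= T and K phases."""
    if not 1 <= t <= T:
        raise ValueError("t must satisfy 1 <= t <= T")
    # Integer arithmetic splits the T steps into K consecutive blocks.
    return (t - 1) * K // T + 1


def contact_weight(c, i, t, T):
    """Look up c_{i, phi(t)}; c[i] holds one weight per phase for end-effector i."""
    K = len(c[i])
    return c[i][phase_index(t, T, K) - 1]


# Hypothetical example: 2 end-effectors, K = 3 phases, T = 12 time steps.
c = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
phases = [phase_index(t, 12, 3) for t in range(1, 13)]
weights_0 = [contact_weight(c, 0, t, 12) for t in range(1, 13)]
```

Storing one weight per end-effector and phase, rather than one per time step, is what makes the contact weights piecewise constant over time.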

The CIO method computes the optimal solution $s^*$ by minimizing a composite objective function $L(s)$ in the form

$$L(s) = L_{\mathrm{CI}}(s) + L_{\mathrm{Physics}}(s) + L_{\mathrm{Task}}(s) + L_{\mathrm{Hint}}(s) \tag{1}$$

$L_{\mathrm{CI}}$ is a novel contact-invariant cost introduced here. $L_{\mathrm{Physics}}$ penalizes physics violations; we enforce physical consistency using a soft cost rather than a hard constraint because this enables powerful continuation methods. $L_{\mathrm{Task}}$ specifies the task objectives, and is the only term that needs to be modified in order to synthesize a novel behavior. $L_{\mathrm{Hint}}$ is optional and can be used to provide hints (e.g. ZMP-like costs are used here) in the early phases of optimization. Continuation methods are implemented by weighting these costs differently in different phases of optimization; see below.
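The idea of weighting the four cost terms differently across optimization stages can be sketched as follows. This is only a sketch: the four term functions are toy stand-ins, and the weight values in `schedule` are placeholders chosen for illustration, not the weights used in the paper. All names here are hypothetical.

```python
def composite_cost(s, terms, weights):
    """Weighted sum of the cost terms evaluated at the solution vector s."""
    return sum(weights[name] * terms[name](s) for name in terms)


# Toy stand-ins for the four terms; the real ones depend on the physics
# model and the task.
terms = {
    "CI": lambda s: s[0] ** 2,
    "Physics": lambda s: (s[0] - s[1]) ** 2,
    "Task": lambda s: (s[1] - 1.0) ** 2,
    "Hint": lambda s: abs(s[0]),
}

# Placeholder continuation schedule: an early stage that leans on the hint
# term and barely enforces physics, then a later stage that drops the hint.
schedule = [
    {"CI": 0.0, "Physics": 0.1, "Task": 1.0, "Hint": 1.0},
    {"CI": 1.0, "Physics": 1.0, "Task": 1.0, "Hint": 0.0},
]

s = [0.5, 1.0]
costs = [composite_cost(s, terms, w) for w in schedule]
```

In practice each stage would be minimized in turn, with the solution of one stage used to initialize the next; that outer loop is omitted here.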

The contact-invariant cost $L_{\mathrm{CI}}$ is defined as

$$L_{\mathrm{CI}}(s) = \sum_{i=1}^{N} \sum_{t=1}^{T} c_{i,\phi(t)}(s) \left( \left\| e_{i,t}(s) \right\|^2 + \left\| \dot{e}_{i,t}(s) \right\|^2 \right) \tag{2}$$

$e_{i,t}$ is a 4D contact-violation vector for end-effector $i$ at time $t$. Recall that a large value of $c_{i,\phi(t)}$ means that end-effector $i$ should be in contact with the environment during the entire movement phase $\phi(t)$ to which time $t$ belongs. Thus when $c$ is large we want the corresponding $e$ to be small. This vector encodes misalignment in both position and orientation. The first 3 components of $e$ are the difference vector between the end-effector position $p_i(q_t)$ and the "nearest point" on any surface in the environment (including other body segments). The last component of $e$ is the angle between the surface normal at the nearest point and the surface normal at the end-effector. The cost $L_{\mathrm{CI}}$ penalizes both $e$ and its velocity $\dot{e}$, which corresponds to slip.
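A cost of this kind, in which each contact violation and its velocity are penalized in proportion to the contact weight of the current phase, can be sketched as below. Several choices here are assumptions rather than details taken from the paper: both penalties use squared Euclidean norms, the velocity is a backward finite difference with a hypothetical time step `dt` (taken as zero at the first step), the cost is summed over all end-effectors and time steps, phases have equal duration, and all indices are zero-based.

```python
def squared_norm(v):
    """Squared Euclidean norm of a vector given as a list of floats."""
    return sum(x * x for x in v)


def contact_invariant_cost(c, e, dt=1.0):
    """Sum over end-effectors i and steps t of c[i][phase] * (|e|^2 + |e_dot|^2).

    c[i][k] is the weight of end-effector i in phase k, and e[i][t] is the
    4D contact-violation vector of end-effector i at time step t.
    """
    total = 0.0
    for i, e_i in enumerate(e):
        T, K = len(e_i), len(c[i])
        for t in range(T):
            k = t * K // T  # zero-based phase of step t, equal-length phases
            if t == 0:
                e_dot = [0.0] * len(e_i[0])  # assumption: no velocity at the first step
            else:
                e_dot = [(a - b) / dt for a, b in zip(e_i[t], e_i[t - 1])]
            total += c[i][k] * (squared_norm(e_i[t]) + squared_norm(e_dot))
    return total


# Hypothetical data: one end-effector, T = 4 steps, K = 2 phases. The
# end-effector stays on the surface in phase 0 and drifts away in phase 1.
e = [[[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0],
      [1.0, 0.0, 0.0, 0.0], [2.0, 0.0, 0.0, 0.0]]]
cost_contact_in_phase_0 = contact_invariant_cost([[1.0, 0.0]], e)
cost_contact_in_phase_1 = contact_invariant_cost([[0.0, 1.0]], e)
```

Requesting contact only in the phase where the end-effector is already on the surface costs nothing, while requesting it in the phase where the end-effector drifts away is penalized for both the offset and the motion.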

Our definition of "nearest point" is unusual in an important way: we effectively use a soft-min instead of a min operator. Let n (p)
