Discovery of Complex Behaviors through Contact-Invariant Optimization

Yüklə 0,85 Mb.

Pdf görüntüsü

səhifə	3/6
tarix	17.11.2018
ölçüsü	0,85 Mb.
	#80046

1 2 3 4 5 6

4.2.1 Trade-off between

denote the actual nearest point to p on surface j. Deﬁne the weights

(p) =

1 + p − n

(p)

(3)

where k = 10

is a smoothness parameter. A virtual nearest point

is then obtained by normalizing the weights η

to sum to 1, and

computing the weighted average of the n

’s. Intuitively, when p

is far from any surface (and c

is large), the cost L

will push it

towards some average of the surface ”mass”. When p

gets close

to a surface, it will be pushed towards the nearest point on that sur-

face. This construction also makes L

smooth, which facilitates

numerical optimization.

4.2

Inverse dynamics and physics-violation cost

Next we describe the physics-violation cost L

Physics

. This cost has

two components. One is general and depends on our novel formu-

lation of contact dynamics which is essential to the CIO method.

The other is speciﬁc to the simpliﬁed model of multi-body dynam-

ics described below – which can be replaced with a full physics

model without modifying the rest of the method. In this subsection

we focus on a single time step and omit the time index t.

Let f ∈ R

denote the vector of contact forces acting on all N

end-effectors. We have a 6D vector per contact because we allow

torsion around the surface normal, and furthermore the origin of

the contact force is allowed to move inside the end-effector surface

patch. Thus, unlike previous work which models contact as a sum

a multiple contact points, we model an entire contact surface patch,

allowing us to reason about contact at a coarser scale and speed up

the optimization. Note that all potential contacts are considered at

all times, and so the dimensionality of the contact force vector re-

mains constant, resulting in smoothness of the cost. Contact forces

here are associated with individual end-effectors and not with pairs

of contacting bodies. We only model contacts at pre-deﬁned end-

effectors, although our deﬁnition is sufﬁciently general to include

surface patches on any body part.

Let J (q) ∈ R

6N ×D

denote the Jacobian matrix mapping general-

ized velocities ˙

q to contact-space velocities for the contact model

described above; D = dim ( ˙

q) is the number of degrees of freedom

in the character. Let τ (q, ˙

q, ¨

q) denote the inverse of the smooth

dynamics, which equals the sum of contact and applied forces:

τ (q, ˙

q, ¨

q) = J (q)

f + Bu

(4)

Here u is the vector of applied forces/controls in the actuated space.

They are mapped to the full space by the matrix B, which has zeros

in the rows corresponding to ”root” forces and torques acting on the

torso or on passive objects. The smooth inverse dynamics are

τ (q, ˙

q, ¨

q) = M (q) ¨

q + C (q, ˙

q) ˙

q + g (q)

(5)

where M is the inertia matrix, C the matrix of Coriolis and cen-

trifugal terms, and g is gravity.

Our goal now is to complete the inverse dynamics computation, and

in particular recover f , u given τ, J, B. We do this by minimizing

the squared residual in (4) subject to friction-cone constraints on f

and quadratic regularization for f and u. Thus we have the follow-

ing quadratic programming (QP) problem:

f , u =

arg min

f ,u

f + Bu − τ

+ f

W f + u

subject to

Af ≤ b

(6)

The pose-dependent matrix A (q) and vector b (q) encode the stan-

dard pyramid approximation to the friction cone which makes the

inequality constraints linear. They also include linear inequality

constraints that restrict the origin point of each contact force to

the surface patch of its corresponding end-effector. Currently these

patches are rectangles. For end-effectors that are allowed to grab

(i.e. hands) we omit the friction-cone constraints, allowing both

positive and negative normal forces.

The control regularization matrix R is constant and symmetric posi-

tive deﬁnite. The contact force regularization matrix W depends on

the auxiliary variables: W is diagonal and its 6 diagonal elements

corresponding to the i-th contact force vector are

j,j

+ k

(7)

for all j between 6i − 5 and 6i. We used k

= 10

−2

, k

= 10

−3

The components of W corresponding to hand end-effectors were

four times larger, because hands are generally weaker than feet.

The quadratic program (6) is convex and therefore has a unique so-

lution – which can be found using a number of algorithms. Here

we used a specialized sequential solver due to [Lee and Goswami

2010] because it is easy to implement efﬁciently, although the re-

sults obtained with an off-the-shelf QP solver were very similar.

In summary, inverse dynamics are computed by ﬁrst computing the

quantities τ (s) , J (s) , A (s) , b (s) , W (s) as described above,

and then solving (6) which yields f (s) and u (s). Re-introducing

the time index t, the physics-violation cost is

Physics

(s) =

(s)

(s) + Bu

(s) − τ

(s)

(8)

Note that even if this cost can be reduced to zero by a certain choice

of f and u, the actual f and u computed in (6) will generally be

different because of the extra regularization terms.

4.2.1

Trade-off between

and

Physics

Let us now examine how the two cost terms L

and L

Physics

de-

ﬁned in (2) and (8) are affected by the auxiliary variables c. The

term L

achieves its global minimum of zero when all c’s are set

to zero. However this makes the W matrix in (7) large, contact

forces become expensive, and the QP solver for (6) prefers to vi-

olate physics rather than use large contact forces – which in turn

leads to a substantial L

Physics

term. Conversely if all c ’s are large,

the QP solver is able to reconcile any trajectory with the inverse dy-

namics model by generating contact forces at all end-effectors, in-

cluding those that are not actually in contact. This makes L

Physics

small, but now L

is large because end-effectors that are not in

contact have large c’s. Thus the optimal trade-off is achieved when

is large only for end-effectors that are in contact, and furthermore

the trajectory can be generated using contact forces only at those

end-effectors. In other words, at the optimal solution the auxiliary

variables c correspond to the contacts that end up being active.

Since the controls u are constrained to act in the actuated space,

root helper forces (often used in prior work for continuation) are

not allowed here. If we were to allow such forces the above trade-

off in terms of c would no longer hold. On the other hand, the ability

of our method to generate contact forces from a distance provides

an even more effective continuation method: non-physical behavior

in the early phases of optimization is still possible, but it is always

associated with potential contacts, making it easier to transition to

physically-realistic contact dynamics later in optimization.

Yüklə 0,85 Mb.

Dostları ilə paylaş:

1 2 3 4 5 6