Metal Shading Language Specification


5.12 Atomic Functions
The Metal programming language implements a subset of the C++14 atomics and
synchronization operations. Metal atomic functions must operate on Metal atomic data, as
described in section 2.5.
Atomic operations play a special role in making assignments in one thread visible to another
thread. A synchronization operation on one or more memory locations is either an acquire
operation, a release operation, or both an acquire and release operation. A synchronization
operation without an associated memory location is a fence and can be either an acquire fence,
a release fence, or both an acquire and release fence. In addition, there are relaxed atomic
operations that are not synchronization operations.
There are only a few kinds of operations on atomic types, although there are many instances of
those kinds. This section specifies each general kind.
Atomic functions are defined in the header <metal_atomic>.
Built-in pack functions | Description

uint pack_float_to_unorm4x8(float4 x)
uint pack_float_to_snorm4x8(float4 x)
uint pack_half_to_unorm4x8(half4 x)
uint pack_half_to_snorm4x8(half4 x)
    Convert a 4-component vector of normalized single- or half-precision floating-point
    values to four 8-bit integer values and pack these 8-bit integer values into a 32-bit
    unsigned integer.

uint pack_float_to_srgb_unorm4x8(float4 x)
uint pack_half_to_srgb_unorm4x8(half4 x)
    Convert a 4-component vector of normalized single- or half-precision floating-point
    values to four 8-bit integer values and pack these 8-bit integer values into a 32-bit
    unsigned integer. The color values are converted from linear RGB to sRGB.

uint pack_float_to_unorm2x16(float2 x)
uint pack_float_to_snorm2x16(float2 x)
uint pack_half_to_unorm2x16(half2 x)
uint pack_half_to_snorm2x16(half2 x)
    Convert a 2-component vector of normalized single- or half-precision floating-point
    values to two 16-bit integer values and pack these 16-bit integer values into a 32-bit
    unsigned integer.

uint pack_float_to_unorm10a2(float4)
ushort pack_float_to_unorm565(float3)
uint pack_half_to_unorm10a2(half4)
ushort pack_half_to_unorm565(half3)
    Convert a 4- or 3-component vector of normalized single- or half-precision
    floating-point values to a packed 1010102 or 565 color integer value.
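
To make the unorm4x8 packing concrete, here is a minimal host-side C++ sketch, not Metal code. It rests on assumptions the table does not state: each component is clamped to [0, 1], scaled by 255, and rounded to nearest, and the x component lands in the least significant byte. The names Float4, to_unorm8, and pack_unorm4x8_sketch are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for Metal's float4 (illustration only).
struct Float4 { float x, y, z, w; };

// Assumed conversion rule: clamp to [0, 1], scale by 255, round to nearest.
// NaN inputs are not handled in this sketch.
inline std::uint32_t to_unorm8(float v) {
    float clamped = std::min(std::max(v, 0.0f), 1.0f);
    return static_cast<std::uint32_t>(std::lround(clamped * 255.0f));
}

// Assumed layout: x in bits 0-7, y in bits 8-15, z in bits 16-23, w in bits 24-31.
inline std::uint32_t pack_unorm4x8_sketch(Float4 v) {
    return  to_unorm8(v.x)
         | (to_unorm8(v.y) << 8)
         | (to_unorm8(v.z) << 16)
         | (to_unorm8(v.w) << 24);
}
```

Under these assumptions, out-of-range inputs saturate rather than wrap, so a component of 2.0 packs the same byte as 1.0.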

2017-9-12   |  Copyright © 2017 Apple Inc. All Rights Reserved.

5.12.1 Memory Order
The enumerated type memory_order specifies the detailed regular (non-atomic) memory
synchronization operations as defined in section 29.3 of the C++14 specification and may
provide for operation ordering. For details on different memory orders, see sections 5.12.1.1,
5.12.1.2, and 5.12.1.3.
For ios-metal2.0, all the enumerated values listed in Table 32 are supported with atomic
operations.
enum memory_order {memory_order_relaxed, memory_order_acquire,
memory_order_release, memory_order_acq_rel, memory_order_seq_cst};
For pre-2.0 versions of Metal on iOS and all versions of Metal on macOS, memory_order_relaxed
is the only memory_order supported with atomic operations.
enum memory_order {memory_order_relaxed };
Table 32 Memory Ordering Enum Values

Memory Order | Description

memory_order_relaxed
    There are no synchronization or ordering constraints; only atomicity is required of
    this operation.

memory_order_acquire
    A load operation with this memory order performs the acquire operation on the affected
    memory location: prior writes made to other memory locations by the thread that did the
    release become visible in this thread.

memory_order_release
    A store operation with this memory order performs the release operation: prior writes
    to other memory locations become visible to the threads that do an acquire on the same
    location.

memory_order_acq_rel
    A load operation with this memory order performs the acquire operation on the affected
    memory location and a store operation with this memory order performs the release
    operation.

memory_order_seq_cst
    Same as memory_order_acq_rel, plus a single total order exists in which all threads
    observe all modifications in the same order.

5.12.1.1 Relaxed Ordering
Atomic operations tagged memory_order_relaxed are not synchronization operations. These
operations do not order memory, but they guarantee atomicity and modification order
consistency.
A typical use of relaxed memory ordering is updating counters, such as reference counters,
because this requires only atomicity, not ordering or synchronization.
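
The counter use case for relaxed ordering can be sketched in host-side C++, whose std::atomic operations are the C++14 facilities that Metal's atomics are a subset of. This is an analogy, not Metal code: Metal's own calls (for example atomic_fetch_add_explicit) operate on address-space-qualified atomic pointers and are not shown here. The function name count_with_relaxed is hypothetical.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Each of `threads` workers increments a shared counter `per_thread` times
// with relaxed ordering, then the final count is returned.
inline int count_with_relaxed(int threads, int per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // Atomicity is all that is needed: no other data is published.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();  // join() orders the final read after all increments
    return counter.load(std::memory_order_relaxed);
}
```

No increment is lost even though the increments themselves impose no ordering on surrounding memory accesses; the final value is reliable here only because the threads are joined before it is read.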


5.12.1.2 Release-Acquire Ordering
If an atomic store in thread A is tagged memory_order_release and an atomic load in thread B
from the same variable is tagged memory_order_acquire, all memory writes (non-atomic and
relaxed atomic) that happened-before the atomic store from the point of view of thread A
become visible side effects in thread B. That is, once the atomic load is completed, thread B is
guaranteed to see everything thread A wrote to memory.
The synchronization is established only between the threads releasing and acquiring the same
atomic variable. Other threads can see a different order of memory accesses than either or both
of the synchronized threads.
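
The release-acquire pairing described above can be sketched with C++ std::atomic on the host. This is an analogy under the assumption that Metal follows the C++14 semantics it subsets, not Metal code; the names release_acquire_demo, payload, and ready are hypothetical.

```cpp
#include <atomic>
#include <thread>

// Thread A writes ordinary data and then publishes a flag with release;
// thread B waits for the flag with acquire and then reads the data.
// Returns the value thread B observed.
inline int release_acquire_demo() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};   // the atomic variable both threads use
    int seen = -1;

    std::thread a([&] {
        payload = 42;                                  // plain write
        ready.store(true, std::memory_order_release);  // release: publish prior writes
    });
    std::thread b([&] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();                 // spin until the store is seen
        }
        seen = payload;                                // happens-after A's write
    });
    a.join();
    b.join();
    return seen;
}
```

If both operations were relaxed instead, the read of payload would be a data race, because nothing would order it after thread A's write.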
5.12.1.3 Sequentially Consistent Ordering
Atomic operations tagged memory_order_seq_cst order memory the same way as release/acquire
ordering (everything that happened-before a store operation in one thread becomes a
visible side effect in the thread that performed the load) and also establish a single total
modification order of all atomic operations that are so tagged. Sequential ordering may be
necessary for multiple producer-multiple consumer situations, where all consumers must
observe the actions of all producers occurring in the same order.
Note: as soon as an atomic operation that does not use a memory order of memory_order_seq_cst
is encountered, the sequential consistency is lost.
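
One way to see what the single total order buys is the store-buffering litmus test, sketched below in host-side C++. It is a different scenario from the multiple-producer case mentioned above, chosen because it is short: with memory_order_seq_cst on every operation, the two threads cannot both read 0. The function names are hypothetical, and the code is an analogy to Metal's semantics, not Metal code.

```cpp
#include <atomic>
#include <thread>

// Runs the store-buffering test once. Each thread stores to its own variable
// and then loads the other one. Returns true if both loads returned 0, an
// outcome that sequential consistency forbids.
inline bool both_read_zero_once() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;
    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    t1.join();
    t2.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the test and counts how often the forbidden outcome appears.
inline int count_forbidden_outcomes(int runs) {
    int forbidden = 0;
    for (int i = 0; i < runs; ++i) {
        if (both_read_zero_once()) ++forbidden;
    }
    return forbidden;
}
```

With release stores and acquire loads in place of seq_cst, both threads reading 0 becomes a permitted outcome, which illustrates the note that mixing in weaker orders gives up sequential consistency.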
5.12.2 Memory Scope
The enumerated type memory_scope specifies whether the memory ordering constraints given
by memory_order apply to threads within a SIMD-group, threads within a threadgroup, or threads
across threadgroups of one or more kernels executing on the device. Its enumerated values are
as follows:
enum memory_scope {memory_scope_simdgroup, memory_scope_threadgroup,
memory_scope_device};
The memory scope can be specified when performing atomic operations to device memory.
Atomic operations to threadgroup memory only guarantee memory ordering in the
threadgroup, not across threadgroups.
5.12.3 Fence Functions
For iOS, the following fence functions are supported.
void atomic_thread_fence(mem_flags flags, memory_order order)
void atomic_thread_fence(mem_flags flags, memory_order order, memory_scope scope)
atomic_thread_fence establishes memory synchronization ordering of non-atomic and
relaxed atomic accesses, as instructed by order, without an associated atomic function. For
