Metal Shading Language Specification

// an array of Foo elements
device Foo *my_info;

Since texture objects are always allocated from the device address space, the device address attribute is not needed for texture types. The elements of a texture object cannot be directly accessed. Functions to read from and write to a texture object are provided.
4.2.2 threadgroup Address Space
Threads are organized into threadgroups. Threads in a threadgroup cooperate by sharing data through threadgroup memory and by synchronizing their execution to coordinate memory accesses to both device and threadgroup memory. The threads in a given threadgroup execute concurrently on a single compute unit on the GPU. A GPU may have multiple compute units. Multiple threadgroups can execute concurrently across multiple compute units.
The threadgroup address space name is used to allocate variables used by a kernel function. Variables declared in the threadgroup address space cannot be used in graphics functions.
Variables allocated in the threadgroup address space in a kernel function are allocated for each threadgroup executing the kernel, are shared by all threads in a threadgroup, and exist only for the lifetime of the threadgroup that is executing the kernel.
Variables allocated in the threadgroup address space for a mid-render kernel function are allocated for each threadgroup executing the kernel and are persistent across mid-render and fragment kernel functions over a tile.
The example below shows how variables allocated in the threadgroup address space can either be passed as arguments or be declared inside a kernel function. (The [[threadgroup(0)]] attribute in the code below is explained in section 4.3.1.)
kernel void
my_kernel(threadgroup float *a [[threadgroup(0)]],
          …)
{
    // A float allocated in the threadgroup address space
    threadgroup float x;

    // An array of 10 floats allocated in the
    // threadgroup address space
    threadgroup float b[10];
    …
}
4.2.2.1 SIMD-groups and Quad-groups

2017-9-12   |  Copyright © 2017 Apple Inc. All Rights Reserved.
Within a threadgroup, threads can be divided into SIMD-groups in an implementation-defined
fashion. Each SIMD-group is a collection of threads that executes concurrently. The mapping to
SIMD-groups is invariant for the duration of a kernel’s execution, across dispatches of a given
kernel with the same launch parameters, and from one threadgroup to another within the
dispatch (excluding the trailing edge threadgroups in the presence of non-uniform threadgroup
sizes). In addition, all SIMD-groups within a threadgroup must be the same size, apart from the
SIMD-group with the maximum index, which may be smaller, if the size of the threadgroup is not
evenly divisible by the size of the SIMD-groups.
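The sizing rule above amounts to simple integer arithmetic. The following is a minimal host-side sketch in plain C++ (the language MSL is based on), not part of the Metal API. The names SimdGroupLayout and simd_group_layout are hypothetical, and the SIMD-group size passed in is an assumption supplied by the caller, since the actual size is implementation-defined.

```cpp
#include <cstdint>

// Hypothetical helper type: how a threadgroup splits into SIMD-groups.
struct SimdGroupLayout {
    std::uint32_t group_count;      // number of SIMD-groups in the threadgroup
    std::uint32_t last_group_size;  // threads in the SIMD-group with the maximum index
};

// Applies the rule stated above: every SIMD-group holds simd_size threads,
// except the one with the maximum index, which holds the remainder when the
// threadgroup size is not evenly divisible by simd_size.
// Both arguments must be non-zero.
constexpr SimdGroupLayout simd_group_layout(std::uint32_t threads_per_threadgroup,
                                            std::uint32_t simd_size) {
    return SimdGroupLayout{
        (threads_per_threadgroup + simd_size - 1) / simd_size,  // round up
        (threads_per_threadgroup % simd_size == 0)
            ? simd_size
            : threads_per_threadgroup % simd_size
    };
}
```

For instance, with an assumed SIMD-group size of 32, a threadgroup of 100 threads splits into four SIMD-groups, and the one with the maximum index holds 4 threads.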
A quad-group is a SIMD-group with a thread execution width of 4.
SIMD-groups are only supported for macos-metal2.0. Quad-groups are only supported on ios-metal2.0.
For the kernel function attributes for SIMD-groups and quad-groups, see section 4.3.4.6. SIMD-group functions are described in section 5.13. Quad-group functions are described in section 5.14.
4.2.3 threadgroup_imageblock Address Space
The threadgroup_imageblock address space refers to objects allocated in threadgroup memory that are only accessible using an imageblock object (see section 2.10). A pointer to a user-defined type allocated in the threadgroup_imageblock address space can be an argument to a tile shading function (see section 4.1.2). There is exactly one threadgroup per tile, and each threadgroup can access the threadgroup memory and the imageblock associated with its tile.
Variables allocated in the threadgroup_imageblock address space in a kernel function are allocated for each threadgroup executing the kernel, are shared by all threads in a threadgroup, and exist only for the lifetime of the threadgroup that is executing the kernel. Each thread in the threadgroup uses explicit 2D coordinates to access imageblocks. Do not assume any particular spatial relationship between the threads and the imageblock. The threadgroup dimensions may be smaller than the tile size.
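Because the threadgroup may be smaller than the tile, a shader that needs to touch every imageblock location has to choose its own mapping from threads to 2D coordinates. The following host-side C++ sketch models one possible mapping, a strided walk over the tile. It is an illustration only and does not use the Metal imageblock API; Coord2D and coords_for_thread are hypothetical names, and the strided scheme is an assumption, not something the specification prescribes.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical 2D coordinate or extent type.
struct Coord2D {
    std::uint32_t x;
    std::uint32_t y;
};

// Lists the tile coordinates one thread would visit so that the threadgroup
// as a whole covers a tile that may be larger than the threadgroup itself.
// Each thread starts at its own position within the threadgroup and then
// strides by the threadgroup dimensions in x and y.
// threadgroup_dim.x and threadgroup_dim.y must be non-zero.
std::vector<Coord2D> coords_for_thread(Coord2D thread_pos,
                                       Coord2D threadgroup_dim,
                                       Coord2D tile_dim) {
    std::vector<Coord2D> out;
    for (std::uint32_t y = thread_pos.y; y < tile_dim.y; y += threadgroup_dim.y) {
        for (std::uint32_t x = thread_pos.x; x < tile_dim.x; x += threadgroup_dim.x) {
            out.push_back(Coord2D{x, y});
        }
    }
    return out;
}
```

With a 2×2 threadgroup and a 4×4 tile, each thread visits four coordinates under this scheme, and the four threads together visit all sixteen locations exactly once.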
4.2.4 constant Address Space
The constant address space name refers to buffer memory objects that are allocated from the device memory pool but are read-only. Variables in program scope must be declared in the constant address space and initialized during the declaration statement. Each initializer expression must be a core constant expression. (Refer to section 5.19 of the C++14 specification.) Variables in program scope have the same lifetime as the program, and their values persist between calls to any of the compute or graphics functions in the program.
constant float samples[] = { 1.0f, 2.0f, 3.0f, 4.0f };
Pointers or references to the constant address space are allowed as arguments to functions.
Writing to variables declared in the constant address space is a compile-time error. Declaring such a variable without initialization is also a compile-time error.
