3. If a struct member or array element E is itself a struct or array, its struct members or array elements are assigned indices according to rules 1 and 2 recursively, starting from the ID assigned to E. In the example below, index 4 is explicitly provided for the nested struct called normal, so its elements (previously defined as tex and uvScaleOffset) are assigned IDs 4 and 5, respectively. The elements of the nested struct called specular are assigned IDs 6 and 7 by adding one to the maximum ID (5) used by the previous member.
    struct Material {
        MaterialTexture diffuse;            // Assigned indices 0, 1
        MaterialTexture normal [[id(4)]];   // Assigned indices 4, 5
        MaterialTexture specular;           // Assigned indices 6, 7
    };
4. Top-level argument buffer arguments are assigned IDs starting from 0, according to rules 1-3.
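The numbering in rule 3 can be checked with a small host-side model. The sketch below is plain C++, not Metal, and is not how any compiler implements the rules. It assumes that rules 1 and 2 (not shown in this section) assign consecutive IDs in declaration order unless an explicit [[id(n)]] is given, and that explicit IDs never go backwards. The names Member, assign_ids, and material_ids are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one argument-buffer member (not a Metal type).
struct Member {
    std::string name;
    int explicit_id;               // models [[id(n)]]; -1 means "not specified"
    std::vector<Member> children;  // non-empty for a nested struct or array
};

typedef std::vector<std::pair<std::string, int> > IdList;

// Walks the members in declaration order, starting at first_id.
// Leaf members receive one ID each; nested members recurse, starting
// from the ID assigned to the nested member itself (rule 3).
// Returns the next free ID, i.e. one past the highest ID used.
int assign_ids(const std::vector<Member>& members, int first_id,
               const std::string& prefix, IdList& out)
{
    int next_id = first_id;
    for (const Member& m : members) {
        int start = (m.explicit_id >= 0) ? m.explicit_id : next_id;
        assert(start >= next_id);  // assumption: explicit IDs do not go backwards
        std::string path = prefix.empty() ? m.name : prefix + "." + m.name;
        if (m.children.empty()) {
            out.push_back(std::make_pair(path, start));
            next_id = start + 1;
        } else {
            next_id = assign_ids(m.children, start, path, out);
        }
    }
    return next_id;
}

// Rebuilds the Material example from the text and returns its leaf IDs.
IdList material_ids()
{
    std::vector<Member> tex;
    tex.push_back(Member{"tex", -1, {}});
    tex.push_back(Member{"uvScaleOffset", -1, {}});

    std::vector<Member> material;
    material.push_back(Member{"diffuse", -1, tex});
    material.push_back(Member{"normal", 4, tex});    // [[id(4)]]
    material.push_back(Member{"specular", -1, tex});

    IdList out;
    assign_ids(material, 0, "", out);  // rule 4: top level starts from 0
    return out;
}
```

Calling material_ids() yields the six leaf members in declaration order, with the IDs listed in the comments of the Material struct above.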
2.12.1 Tier 2 Hardware Support for Argument Buffers
With Tier 2 hardware, argument buffers have the following additional capabilities that are not available with Tier 1 hardware.
Argument buffers can be accessed through pointer indexing. The syntax shown below refers to an array of consecutive, independently encoded argument buffers:
    kernel void
    kern(constant Resources *resArray [[buffer(0)]])
    {
        constant Resources & resources = resArray[3];
    }

    kernel void
    kern(constant texture2d<float> *textures [[buffer(0)]]);
To support GPU-driven pipelines and indirect draw calls and dispatches, resources can be copied between structs and arrays within a function, as shown below:
    kernel void
    copy(constant Foo & src [[buffer(0)]],
         device Foo & dst [[buffer(1)]])
    {
        dst.a = src.d;
        …
    }

2017-9-12 | Copyright © 2017 Apple Inc. All Rights Reserved.

Samplers cannot be copied from the thread address space to the device address space. As a
result, samplers can only be copied into an argument buffer directly from another argument
buffer. The example below shows both legal and illegal copying:
    struct Resources {
        sampler sam;
    };

    kernel void
    copy(device Resources *src,
         device Resources *dst,
         sampler sam1)
    {
        constexpr sampler sam2;
        dst->sam = src->sam;   // Legal: device -> device
        dst->sam = sam1;       // Illegal: thread -> device
        dst->sam = sam2;       // Illegal: thread -> device
    }
Argument buffers can contain pointers to other argument buffers:
    struct Textures {
        texture2d<float> diffuse;
        texture2d<float> specular;
    };

    struct Material {
        device Textures *textures;
    };

    fragment float4
    fragFunc(device Material & material);
2.13 Uniform Type
2.13.1 The Need for a Uniform Type
In the following function example, the variable i is used to index into an array of textures given by texInput. The variable i is non-uniform; i.e., it can have a different value for each thread executing the graphics or kernel function for a draw or dispatch call. Therefore, the texture sampling hardware has to handle a sample request that can refer to different textures for different threads of the same draw or dispatch call.
    kernel void
    my_kernel(array<texture2d<float>, 10> texInput,
              array<texture2d<float, access::write>, 10> texOutput,
              sampler s,
              …,
              uint2 gid [[thread_position_in_grid]])
    {
        int i = …;
        float4 color = texInput[i].sample(s, float2(coord.x, coord.y));
        …;
        texOutput[i].write(color, coord);
    }
If the variable i has the same value for all threads (i.e., is uniform) executing the graphics or kernel function of a draw or dispatch call, and if this information is communicated to the hardware, then the texture sampling hardware can apply appropriate optimizations. A similar argument can be made for texture writes, where a variable computed at runtime is used as an index into an array of textures or to index into one or more buffers.
To indicate that this variable is uniform for all threads executing the graphics or kernel function of a draw or dispatch call, the Metal shading language adds a new template class called uniform (available in the header metal_uniform) that can be used to declare variables inside a graphics or kernel function. This template class can only be instantiated with arithmetic types (i.e., boolean, integer, and floating point types) and vector types.
The code below is a modified version of the previous example, where the variable i is declared as a uniform type:
    kernel void
    my_kernel(array<texture2d<float>, 10> texInput,
              array<texture2d<float, access::write>, 10> texOutput,
              sampler s,
              …,
              uint2 gid [[thread_position_in_grid]])
    {
        uniform<int> i = …;
        float4 color = texInput[i].sample(s, float2(coord.x, coord.y));
        …;
        texOutput[i].write(color, coord);
    }
2.13.2 Behavior of the Uniform Type
If a variable is of the uniform type, and the variable does not have the same value for all threads executing the kernel or graphics function, then the behavior is undefined.
Uniform variables implicitly type convert to non-uniform types. Assigning the result of an expression computed using uniform variables to a uniform variable is legal, but assigning a non-uniform variable to a uniform variable results in a compile-time error. In the following example, the multiplication legally converts the uniform variable x to a non-uniform value, producing the non-uniform product z. However, assigning the non-uniform variable z to the uniform variable b results in a compile-time error.
    uniform<int> x = …;
    int y = …;
    int z = x * y;         // x is converted to a non-uniform for a multiply
    uniform<int> b = z;    // illegal; compile-time error
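The conversion rules above can be mimicked in ordinary C++ to see why the first assignment compiles and the second does not. This is only an illustrative model under stated assumptions, not the implementation of Metal's uniform template: the class uniform_model, its factory assume_uniform, and the helper demo_product are hypothetical names introduced here. The model allows an implicit conversion to the underlying type, keeps the result of uniform-with-uniform arithmetic uniform, and offers no implicit conversion in the other direction.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the conversion rules only (not Metal's uniform<T>).
template <typename T>
class uniform_model {
public:
    // Hypothetical factory: the caller promises the value is the same
    // for every thread. Nothing in this model verifies that promise.
    static uniform_model assume_uniform(T v) { return uniform_model(v); }

    // Implicit conversion from uniform to non-uniform.
    operator T() const { return value_; }

    // An expression computed only from uniform operands stays uniform.
    friend uniform_model operator*(uniform_model a, uniform_model b) {
        return uniform_model(a.value_ * b.value_);
    }

private:
    // Private and explicit: a plain T never converts to uniform_model<T>.
    explicit uniform_model(T v) : value_(v) {}
    T value_;
};

// uniform -> non-uniform is implicit.
static_assert(std::is_convertible<uniform_model<int>, int>::value,
              "uniform converts to non-uniform");
// non-uniform -> uniform is rejected at compile time.
static_assert(!std::is_convertible<int, uniform_model<int> >::value,
              "non-uniform does not convert to uniform");
// uniform * non-uniform produces a non-uniform result.
static_assert(std::is_same<decltype(std::declval<uniform_model<int> >() *
                                    std::declval<int>()), int>::value,
              "mixed product is non-uniform");

// Mirrors "int z = x * y;" from the example above.
int demo_product(int x_value, int y)
{
    uniform_model<int> x = uniform_model<int>::assume_uniform(x_value);
    int z = x * y;  // x is converted to int for the multiply
    assert(z == x_value * y);
    return z;
}
```

In this model, a line such as `uniform_model<int> b = z;` with `int z` fails to compile, which mirrors the compile-time error described in the text.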
To declare an array of uniform elements:
    uniform<int> bar[10];    // elements stored in bar array are uniform
The uniform type is legal for both parameters and the return type of a function. For example:
    uniform<int> foo(…);    // foo returns a uniform integer value
    int bar(uniform<int> a, …);
It is legal to declare a pointer to a uniform type, but not legal to declare a uniform pointer. For
example:
    device uniform<int> *ptr;     // values pointed to by ptr are uniform
    uniform<device int *> ptr;    // illegal; compile-time error
The results of expressions that combine uniform with non-uniform variables are non-uniform. If the non-uniform result is assigned to a uniform variable, as in the example below, the behavior is undefined. (The front-end might generate a compile-time error, but it is not guaranteed to do so.)
    uniform<int> i = …;
    int j = …;
    if (i < j) {    // non-uniform result for expression (i < j)