diff --git a/api/cl_khr_command_buffer.asciidoc b/api/cl_khr_command_buffer.asciidoc index 7ba41289e..0c80a4d4f 100644 --- a/api/cl_khr_command_buffer.asciidoc +++ b/api/cl_khr_command_buffer.asciidoc @@ -3,8 +3,6 @@ include::{generated}/meta/{refprefix}cl_khr_command_buffer.txt[] -// *Revision*:: -// 0.9.6 // *Extension and Version Dependencies*:: // This extension requires OpenCL 1.2 or later. // Buffering of SVM commands requires OpenCL 2.0 or later. @@ -12,7 +10,7 @@ include::{generated}/meta/{refprefix}cl_khr_command_buffer.txt[] === Other Extension Metadata *Last Modified Date*:: - 2024-12-13 + 2025-07-10 *IP Status*:: No known IP claims. *Contributors*:: @@ -124,14 +122,6 @@ There are no gurantees made around the values of sync-points returned from adding commands to a command-buffer. Any semantics that a could be inferred from the sync-point values returned is implementation defined. -==== Simultaneous Use - -The optional simultaneous use capability was added to the extension so that -vendors can support pipelined workflows, where command-buffers are repeatedly -enqueued without blocking in user code. However, simultaneous use may result in -command-buffers being more expensive to enqueue than in a sequential model, so -the capability is optional to enable optimizations on command-buffer recording. 
- === Interactions With Other Extensions The introduction of the command-buffer abstraction enables functionality @@ -242,11 +232,8 @@ features: * {cl_device_command_buffer_capabilities_khr_TYPE} ** {CL_COMMAND_BUFFER_CAPABILITY_KERNEL_PRINTF_KHR} ** {CL_COMMAND_BUFFER_CAPABILITY_DEVICE_SIDE_ENQUEUE_KHR} - ** {CL_COMMAND_BUFFER_CAPABILITY_SIMULTANEOUS_USE_KHR} * {cl_command_buffer_properties_khr_TYPE} ** {CL_COMMAND_BUFFER_FLAGS_KHR} - * {cl_command_buffer_flags_khr_TYPE} - ** {CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR} * {cl_command_buffer_info_khr_TYPE} ** {CL_COMMAND_BUFFER_QUEUES_KHR} ** {CL_COMMAND_BUFFER_NUM_QUEUES_KHR} @@ -257,7 +244,6 @@ features: * {cl_command_buffer_state_khr_TYPE} ** {CL_COMMAND_BUFFER_STATE_RECORDING_KHR} ** {CL_COMMAND_BUFFER_STATE_EXECUTABLE_KHR} - ** {CL_COMMAND_BUFFER_STATE_PENDING_KHR} * {cl_command_type_TYPE} ** {CL_COMMAND_COMMAND_BUFFER_KHR} * New Error Codes @@ -470,3 +456,6 @@ features: * 0.9.7, 2024-12-13 ** Refactor queue compatability between command-buffer creation and enqueue (experimental). + * 0.9.8, 2025-07-10 + ** Rework simultaneous use definition and remove pending state + (experimental). diff --git a/api/cl_khr_command_buffer_mutable_dispatch.asciidoc b/api/cl_khr_command_buffer_mutable_dispatch.asciidoc index 5287ddcf7..1ef08b76e 100644 --- a/api/cl_khr_command_buffer_mutable_dispatch.asciidoc +++ b/api/cl_khr_command_buffer_mutable_dispatch.asciidoc @@ -6,7 +6,7 @@ include::{generated}/meta/{refprefix}cl_khr_command_buffer_mutable_dispatch.txt[ === Other Extension Metadata *Last Modified Date*:: - 2024-09-05 + 2025-08-08 *IP Status*:: No known IP claims. *Contributors*:: @@ -41,6 +41,15 @@ This allows inputs and outputs to the kernel, as well as work-item sizes and offsets, to change without having to re-record the entire command sequence in a new command-buffer. 
+==== Simultaneous Use + +The optional <> capability was added to the +extension so that vendors could support concurrent execution of the same +command-buffer object which has been updated between submissions. However, +simultaneous use may result in command-buffers having a larger overhead to +implement, so the capability is optional to enable optimizations when this +usage isn't required by a user. + === Interactions With Other Extensions The {clUpdateMutableCommandsKHR} entry-point has been designed for the purpose @@ -72,6 +81,8 @@ void pointer using {cl_command_buffer_update_type_khr_TYPE}. * {cl_device_info_TYPE} ** {CL_DEVICE_MUTABLE_DISPATCH_CAPABILITIES_KHR} + * {cl_device_command_buffer_capabilities_khr_TYPE} + ** {CL_COMMAND_BUFFER_CAPABILITY_SIMULTANEOUS_USE_KHR} * {cl_command_properties_khr_TYPE} ** {CL_MUTABLE_DISPATCH_ASSERTS_KHR} ** {CL_MUTABLE_DISPATCH_UPDATABLE_FIELDS_KHR} @@ -95,6 +106,7 @@ void pointer using {cl_command_buffer_update_type_khr_TYPE}. ** {CL_MUTABLE_COMMAND_COMMAND_TYPE_KHR} * {cl_command_buffer_flags_khr_TYPE} ** {CL_COMMAND_BUFFER_MUTABLE_KHR} + ** {CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR} * {cl_command_buffer_properties_khr_TYPE} ** {CL_COMMAND_BUFFER_MUTABLE_DISPATCH_ASSERTS_KHR} * {cl_command_buffer_update_type_khr_TYPE} @@ -363,3 +375,7 @@ may be a introduced as a stand alone extension. * Revision 0.9.3, 2024-09-05 ** Rename `CL_MUTABLE_DISPATCH_PROPERTIES_ARRAY_KHR` to `CL_MUTABLE_COMMAND_PROPERTIES_ARRAY_KHR` (experimental). + * Revision 0.9.4, 2025-08-08 + ** Move `CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR` and + `CL_COMMAND_BUFFER_CAPABILITY_SIMULTANEOUS_USE_KHR` into this + extension from the base extension (experimental). 
diff --git a/api/opencl_platform_layer.asciidoc b/api/opencl_platform_layer.asciidoc index 0108c87b9..4072045eb 100644 --- a/api/opencl_platform_layer.asciidoc +++ b/api/opencl_platform_layer.asciidoc @@ -1790,11 +1790,13 @@ include::{generated}/api/version-notes/CL_COMMAND_BUFFER_CAPABILITY_KERNEL_PRINT include::{generated}/api/version-notes/CL_COMMAND_BUFFER_CAPABILITY_DEVICE_SIDE_ENQUEUE_KHR.asciidoc[] +ifdef::cl_khr_command_buffer_mutable_dispatch[] {CL_COMMAND_BUFFER_CAPABILITY_SIMULTANEOUS_USE_KHR_anchor} Device - supports the command-buffers having a <> that exceeds 1. + supports enqueueing command-buffers with a <> usage pattern. include::{generated}/api/version-notes/CL_COMMAND_BUFFER_CAPABILITY_SIMULTANEOUS_USE_KHR.asciidoc[] +endif::cl_khr_command_buffer_mutable_dispatch[] ifdef::cl_khr_command_buffer_multi_device[] {CL_COMMAND_BUFFER_CAPABILITY_MULTIPLE_QUEUE_KHR_anchor} Device diff --git a/api/opencl_runtime_layer.asciidoc b/api/opencl_runtime_layer.asciidoc index e2b0406d3..a28c4b7f8 100644 --- a/api/opencl_runtime_layer.asciidoc +++ b/api/opencl_runtime_layer.asciidoc @@ -14618,13 +14618,14 @@ on one or more command-queues without any application code interaction. Grouping the operations together allows efficient enqueuing of repetitive operations, as well as enabling driver optimizations. -Command-buffers are _sequential use_ by default, but may also be set to -_simultaneous use_ on creation if the device optionally supports this -capability. -A sequential use command-buffer must have a <> -of 0 or 1. -The simultaneous use capability removes this restriction and allows -command-buffers to have a <> greater than 1. +Upon creation a command-buffer is in the <> state. In +order for the command-buffer to be enqueued it must first be finalized using +{clFinalizeCommandBufferKHR}, after which no further commands can be recorded. +A command-buffer is enqueued for execution on command-queues with a call to +{clEnqueueCommandBufferKHR}. 
It is always valid to call +{clEnqueueCommandBufferKHR} with a command-buffer that has previously been +enqueued, provided the call doesn't violate the definition of +<>. Command-buffers are created using an ordered list of command-queues that commands are recorded to and execute on by default. All these queue objects @@ -14690,6 +14691,32 @@ If using layered extension {cl_khr_command_buffer_mutable_dispatch_EXT}, usage>>. ==== +Simultaneous use is defined using the _prerequisite_ terminology from the +<<_execution_model, execution model>>. +ifndef::cl_khr_command_buffer_multi_device[] +A command-buffer exhibits undefined behavior if a simultaneous use +pattern occurs. +endif::cl_khr_command_buffer_multi_device[] +ifdef::cl_khr_command_buffer_multi_device[] +Simultaneous use is an optional feature for devices to support concurrent +executions of a command-buffer which have been updated between submissions. +A command-buffer must be created with {CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR} +to avoid undefined behavior if a simultaneous use pattern occurs. +endif::cl_khr_command_buffer_multi_device[] + +[[simultaneous-use]] +Simultaneous Use:: When a command-buffer is submitted for +execution without a prerequisite on all the previous submissions of the same +command-buffer which are not in the {CL_COMPLETE} state. + +[NOTE] +==== +An example of simultaneous use would be two submissions of the same +command-buffer to a single out-of-order queue, without any events or barriers +used to express a dependency between the two enqueue calls. Using a single +in-order queue, events, or barriers to express dependencies between submissions +of the same command-buffer would each be ways to avoid simultaneous use. 
+==== ifdef::cl_khr_command_buffer_multi_device[] === Command-Buffers and Multiple Devices @@ -14733,7 +14760,9 @@ endif::cl_khr_command_buffer_multi_device[] === Command-Buffer Lifecycle -A command-buffer is always in one of the following states: +A command-buffer is created in the recording state and transitions to the +executable state when finalized, at which point it cannot move back to +the recording state. [[recording]] Recording:: Initial state of a command-buffer on creation, where commands can be @@ -14743,11 +14772,6 @@ recorded to the command-buffer. Executable:: State after command recording has finished with {clFinalizeCommandBufferKHR} and the command-buffer may be enqueued. -[[pending]] -Pending:: Once a command-buffer has been enqueued to a command-queue it enters -the Pending state until completion, at which point it moves back to the -<> state. - // Image generated from the following mermaid diagram description using https://mermaid.live // Ideally we'd use the asciidoctor-diagram extension to generate the rendered diagram, but // there are issues installing the gem with ruby 2.3.3 @@ -14757,21 +14781,10 @@ the Pending state until completion, at which point it moves back to the // stateDiagram-v2 // [*] --> Recording: Create // Recording -->Executable: Finalize -// Executable --> Pending: Enqueue -// Pending --> Executable: Completion // .... image::images/commandbuffer_lifecycle.png[align="center", title="Lifecycle of a command-buffer."] -[[pending_count]] -The Pending Count is the number of copies of the command -buffer in the <> state. -By default a command-buffer's Pending Count must be 0 or 1. -If the command-buffer was created with -{CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR} then the command-buffer may have a -Pending Count greater than 1. 
- - === Creating Command-Buffer Objects [open,refpage='clCreateCommandBufferKHR',desc='Create a command-buffer',type='protos'] @@ -14812,13 +14825,15 @@ include::{generated}/api/version-notes/CL_COMMAND_BUFFER_FLAGS_KHR.asciidoc[] | {cl_command_buffer_flags_khr_TYPE} | This is a bitfield and can be set to a combination of the following values: +ifdef::cl_khr_command_buffer_mutable_dispatch[] {CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR_anchor} - Allow multiple - instances of the command-buffer to be submitted to the device for - execution. - If set, devices must support + instances of the command-buffer to be scheduled for execution on the + device in a usage pattern that exhibits <>. If set, devices must support {CL_COMMAND_BUFFER_CAPABILITY_SIMULTANEOUS_USE_KHR}. include::{generated}/api/version-notes/CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR.asciidoc[] +endif::cl_khr_command_buffer_mutable_dispatch[] ifdef::cl_khr_command_buffer_multi_device[] {CL_COMMAND_BUFFER_DEVICE_SIDE_SYNC_KHR_anchor} - All commands in the @@ -14898,16 +14913,6 @@ ifdef::cl_khr_command_buffer_multi_device[] |==== endif::cl_khr_command_buffer_multi_device[] -[NOTE] -==== -Upon creation the command-buffer is defined as being in the -<> state, in order for the command-buffer to be enqueued -it must first be finalized using {clFinalizeCommandBufferKHR} after which no -further commands can be recorded. -A command-buffer is submitted for execution on command-queues with a call to -{clEnqueueCommandBufferKHR}. -==== - // refError {clCreateCommandBufferKHR} returns a valid non-zero command-buffer and @@ -15089,9 +15094,6 @@ execution was successfully queued, or one of the errors below: * {CL_INVALID_COMMAND_BUFFER_KHR} if _command_buffer_ is not a valid command-buffer. * {CL_INVALID_OPERATION} if _command_buffer_ has not been finalized. - * {CL_INVALID_OPERATION} if _command_buffer_ was not created with the - {CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR} flag and is in the <> state. 
 * {CL_INVALID_VALUE} if _queues_ is `NULL` and _num_queues_ is > 0, or _queues_ is not `NULL` and _num_queues_ is 0. * {CL_INVALID_VALUE} if _num_queues_ is > 0 and not the same value as @@ -15125,6 +15127,17 @@ execution was successfully queued, or one of the errors below: required by the OpenCL implementation on the host. -- +ifndef::cl_khr_command_buffer_mutable_dispatch[] +Calling {clEnqueueCommandBufferKHR} in a usage pattern that exhibits +<> results in undefined behavior. +endif::cl_khr_command_buffer_mutable_dispatch[] + +ifdef::cl_khr_command_buffer_mutable_dispatch[] +Calling {clEnqueueCommandBufferKHR} in a usage pattern that exhibits +<> when _command_buffer_ was not created +with the {CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR} flag results in undefined +behavior. +endif::cl_khr_command_buffer_mutable_dispatch[] === Recording Commands to a Command-Buffer @@ -16553,9 +16566,7 @@ include::{generated}/api/version-notes/clRemapCommandBufferKHR.asciidoc[] * _errcode_ret_ returns an appropriate error code. If _errcode_ret_ is `NULL`, no error code is returned. -The returned command-buffer has the same state as the input command-buffer, -unless the input command-buffer is in the <> state, in -which case the returned command-buffer has state <>. +The returned command-buffer has the same state as the input command-buffer. // refError @@ -16682,10 +16693,6 @@ one of the errors below is returned: * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources required by the OpenCL implementation on the host. -Using this function when _command_buffer_ is in the <> -state and not created with the {CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR} flag -causes undefined behavior. 
- [NOTE] ==== Performant usage is to call {clUpdateMutableCommandsKHR} only when the @@ -16903,18 +16910,10 @@ include::{generated}/api/version-notes/CL_COMMAND_BUFFER_STATE_KHR.asciidoc[] include::{generated}/api/version-notes/CL_COMMAND_BUFFER_STATE_RECORDING_KHR.asciidoc[] {CL_COMMAND_BUFFER_STATE_EXECUTABLE_KHR_anchor} is returned when - _command_buffer_ has been finalized and there is not a <> instance of _command_buffer_ awaiting completion on a - command_queue. + _command_buffer_ has been finalized. include::{generated}/api/version-notes/CL_COMMAND_BUFFER_STATE_EXECUTABLE_KHR.asciidoc[] - {CL_COMMAND_BUFFER_STATE_PENDING_KHR_anchor} is returned when an - instance of _command_buffer_ has been enqueued for execution but not - yet completed. - -include::{generated}/api/version-notes/CL_COMMAND_BUFFER_STATE_PENDING_KHR.asciidoc[] - | {CL_COMMAND_BUFFER_PROPERTIES_ARRAY_KHR_anchor} include::{generated}/api/version-notes/CL_COMMAND_BUFFER_PROPERTIES_ARRAY_KHR.asciidoc[] diff --git a/images/commandbuffer_lifecycle.png b/images/commandbuffer_lifecycle.png index 12867c804..8d82abdf1 100644 Binary files a/images/commandbuffer_lifecycle.png and b/images/commandbuffer_lifecycle.png differ diff --git a/xml/cl.xml b/xml/cl.xml index a68baafc7..d8c286a14 100644 --- a/xml/cl.xml +++ b/xml/cl.xml @@ -1371,7 +1371,6 @@ server's OpenCL/api-docs repository. - @@ -7324,7 +7323,7 @@ server's OpenCL/api-docs repository. - + @@ -7347,14 +7346,10 @@ server's OpenCL/api-docs repository. - - - - @@ -7371,7 +7366,6 @@ server's OpenCL/api-docs repository. - @@ -7471,7 +7465,7 @@ server's OpenCL/api-docs repository. - + @@ -7485,8 +7479,14 @@ server's OpenCL/api-docs repository. + + + + + +