diff --git a/api/cl_ext_buffer_device_address.asciidoc b/api/cl_ext_buffer_device_address.asciidoc
new file mode 100644
index 000000000..b76998925
--- /dev/null
+++ b/api/cl_ext_buffer_device_address.asciidoc
@@ -0,0 +1,80 @@
+// Copyright 2024 The Khronos Group Inc.
+// SPDX-License-Identifier: CC-BY-4.0
+
+include::{generated}/meta/{refprefix}cl_ext_buffer_device_address.txt[]
+
+=== Other Extension Metadata
+
+*Last Modified Date*::
+ 2025-02-04
+*IP Status*::
+ No known IP claims.
+*Contributors*::
+ - Pekka Jääskeläinen, Intel +
+ - Karol Herbst, Red Hat +
+ - Ben Ashbaugh, Intel +
+ - Kevin Petit, Arm +
+ - Henry Linjamäki, Intel +
+
+=== Description
+
+This extension provides access to raw device pointers for cl_mem buffers
+without requiring a shared virtual address space between the host and
+the device.
+
+==== Background
+
+Shared Virtual Memory (SVM) introduced in OpenCL 2.0 is the first feature
+that enables raw pointers in the OpenCL standard. Its coarse-grain
+variant is relatively simple to implement on various platforms in terms of
+coherency requirements, but it requires mapping the buffer's address range
+to the host virtual address space.
+However, various higher-level heterogeneous APIs present a memory allocation
+routine which can allocate device-only memory and provide raw addresses to
+it without guarentees of system-wide uniqueness. For example, minimal
+implementations of OpenMP's omp_target_alloc() and CUDA/HIP's
+cudaMalloc()/hipMalloc() do not require a shared address space between the host and the device.
+
+Host-device unified addressing might not be a major implementation issue in
+systems which can provide virtual memory across the platform, but might
+bring challenges in cases where the device presents a global memory with
+a disjoint address space (that can also be a physical memory address space) or,
+for example, when a barebone embedded system lacks virtual memory support altogether.
+This extension is targeted to complement the OpenCL SVM extension by providing
+an additional lower-end step in the spectrum of type of pointers/buffers OpenCL
+can allocate.
+
+=== New Command
+
+ * {clSetKernelArgDevicePointerEXT}
+
+=== New Types
+
+ * {cl_mem_device_address_EXT}
+
+=== New Enums
+
+ * {cl_mem_properties_TYPE}
+ ** {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT}
+ * {cl_mem_info_TYPE}
+ ** {CL_MEM_DEVICE_ADDRESS_EXT}
+ * {cl_kernel_exec_info_TYPE}
+ ** {CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT}
+
+=== Version History
+
+ * Revision 1.0.0, 2025-01-15
+ ** Initial version for detailed review.
+ * Revision 1.0.1, 2025-01-28
+ ** Made it explicit that passing illegal pointers is legal as long as they are
+ not referenced. Removed CL_INVALID_ARG_VALUE as a possible error in
+ clSetKernelArgDevicePointerEXT() as there are no illegal pointer
+ cases when calling this function. Return CL_INVALID_OPERATION for
+ clGetMemObjectInfo() if the pointer is not a buffer device pointer.
+ clSetKernelExecInfo() and clSetKernelArgDevicePointerEXT() now only
+ error out if no devices in the context associated with kernel support
+ device pointers.
+ * Revision 1.0.2, 2025-02-04
+ ** Converted the clSetKernelArgDevicePointerEXT() address parameter to
+ a value instead of a pointer to the value.
+
diff --git a/api/opencl_runtime_layer.asciidoc b/api/opencl_runtime_layer.asciidoc
index c1331ffc1..f9ccf894e 100644
--- a/api/opencl_runtime_layer.asciidoc
+++ b/api/opencl_runtime_layer.asciidoc
@@ -595,6 +595,35 @@ include::{generated}/api/version-notes/CL_MEM_DEVICE_HANDLE_LIST_KHR.asciidoc[]
{CL_MEM_DEVICE_HANDLE_LIST_END_KHR_anchor}) to associate with the
external memory handle.
endif::cl_khr_external_memory[]
+
+ifdef::cl_ext_buffer_device_address[]
+
+| {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT_anchor}
+
+include::{generated}/api/version-notes/CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT.asciidoc[]
+ | {cl_bool_TYPE}
+ | When set to {CL_TRUE}, specifies that the buffer must have a single fixed
+ device-side address for its lifetime, and the address can be queried via {clGetMemObjectInfo}.
+
+ Each device in the context can have their own (fixed) device-side address and
+ a copy of the created buffer which are synchronized
+ implicitly by the runtime.
+
+ The flag might imply that the buffer will be "pinned" permanently to
+ a device's memory, but might not be necessarily so, as long as the address
+ range of the buffer remains constant.
+
+ The device addresses of sub-buffers derived from {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT}
+ allocated buffers can be computed by adding the sub-buffer origin to the
+ device-specific start address.
+
+ If the device supports SVM and {clCreateBufferWithProperties} is called with a pointer
+ returned by {clSVMAlloc} as its _host_ptr_ argument, and {CL_MEM_USE_HOST_PTR} is
+ set in its _flags_ argument, the device-side address is guaranteed to match
+ the _host_ptr_.
+
+endif::cl_ext_buffer_device_address[]
+
|====
ifdef::cl_khr_external_memory[]
@@ -662,6 +691,12 @@ ifdef::cl_khr_external_memory[]
{CL_MEM_DEVICE_HANDLE_LIST_KHR} is specified as part of _properties_.
** if _properties_ includes more than one external memory handle.
endif::cl_khr_external_memory[]
+ifdef::cl_ext_buffer_device_address[]
+ * {CL_INVALID_OPERATION}
+ ** If _properties_ includes {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} and there
+ are no devices in the context that support the {cl_ext_buffer_device_address_EXT}
+ extension.
+endif::cl_ext_buffer_device_address[]
[[memory-flags-table]]
.List of supported memory flag values
@@ -6463,6 +6498,20 @@ include::{generated}/api/version-notes/CL_MEM_D3D11_RESOURCE_KHR.asciidoc[]
returns the _resource_ argument specified when _memobj_ was created.
endif::cl_khr_d3d11_sharing[]
+ifdef::cl_ext_buffer_device_address[]
+| {CL_MEM_DEVICE_ADDRESS_EXT_anchor}
+
+include::{generated}/api/version-notes/CL_MEM_DEVICE_ADDRESS_EXT.asciidoc[]
+ | {cl_mem_device_address_EXT_TYPE}[]
+ | If _memobj_ was created using {clCreateBufferWithProperties} with
+ the {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} property set to {CL_TRUE},
+ returns a list of device addresses for the buffer, one for each
+ device in the context in the same order as the list of devices
+ passed to {clCreateContext}.
+
+endif::cl_ext_buffer_device_address[]
+
+
|====
// refError
@@ -6472,6 +6521,11 @@ successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_MEM_OBJECT} if _memobj_ is a not a valid memory object.
+ifdef::cl_ext_buffer_device_address[]
+ * {CL_INVALID_OPERATION} is returned for the {CL_MEM_DEVICE_ADDRESS_EXT} query if
+ the {cl_ext_buffer_device_address_EXT} is not supported or if the
+ buffer was not allocated with {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT}.
+endif::cl_ext_buffer_device_address[]
* {CL_INVALID_VALUE} if _param_name_ is not one of the supported values, or
if the size in bytes specified by _param_value_size_ is less than size of
the return type specified in the
@@ -10778,6 +10832,48 @@ Otherwise, it returns one of the following errors:
required by the OpenCL implementation on the host.
--
+ifdef::cl_ext_buffer_device_address[]
+[open,refpage='clSetKernelArgDevicePointerEXT',desc='Set a device pointer as the argument value for a specific argument of a kernel.',type='protos']
+--
+To set a device pointer as the argument value for a specific argument of a
+kernel, call the function
+
+include::{generated}/api/protos/clSetKernelArgDevicePointerEXT.txt[]
+include::{generated}/api/version-notes/clSetKernelArgDevicePointerEXT.asciidoc[]
+
+ * _kernel_ is a valid kernel object.
+ * _arg_index_ is the argument index.
+ Arguments to the kernel are referred by indices that go from 0 for the
+ leftmost argument to _n_ - 1, where _n_ is the total number of arguments
+ declared by a kernel.
+ * _arg_value_ is the device pointer that should be used as the argument value for
+ argument specified by _arg_index_.
+ The device pointer specified is the value used by all API calls that enqueue
+ _kernel_ ({clEnqueueNDRangeKernel} and {clEnqueueTask}) until the argument
+ value is changed by a call to {clSetKernelArgDevicePointerEXT} for _kernel_.
+ The device pointer can only be used for arguments that are declared to be a
+ pointer to `global` memory allocated with {clCreateBufferWithProperties} with
+ the {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} property. The pointer value specified as
+ the argument value can be the pointer to the beginning of the buffer or any offset into
+ the buffer region. The device pointer value must be naturally aligned according to
+ the argument's type. It should be noted that it's legal to pass invalid
+ pointers as the value (similarly to C/C++ function calls with pointer arguments) as
+ long as the kernel doesn't dereference the pointer.
+
+{clSetKernelArgDevicePointerEXT} returns {CL_SUCCESS} if the argument was set
+successfully. Otherwise, it returns one of the following errors:
+
+ * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object.
+ * {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support
+ the {cl_ext_buffer_device_address_EXT} extension.
+ * {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index.
+ * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required
+ by the OpenCL implementation on the device.
+ * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources
+ required by the OpenCL implementation on the host.
+--
+endif::cl_ext_buffer_device_address[]
+
[open,refpage='clSetKernelExecInfo',desc='Set additional execution information for a kernel.',type='protos']
--
To set additional execution information for a kernel, call the function
@@ -10844,6 +10940,19 @@ include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM
If {clSetKernelExecInfo} has not been called with a value for
{CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM}, the default value is
{CL_TRUE}.
+
+ifdef::cl_ext_buffer_device_address[]
+| {CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT_anchor}
+
+include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT.asciidoc[]
+ | {cl_mem_device_address_EXT_TYPE}[]
+ | Device pointers must reference locations contained entirely within
+ buffers that are passed to kernel as arguments, or that are passed
+ through the execution information. Non-argument device pointers accessed
+ by the kernel must be specified by passing pointers to those buffers
+ via this {clSetKernelExecInfo} option.
+endif::cl_ext_buffer_device_address[]
+
|====
// refError
@@ -10853,7 +10962,16 @@ successfully.
Otherwise, it returns one of the following errors:
* {CL_INVALID_KERNEL} if _kernel_ is a not a valid kernel object.
- * {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support SVM.
+ * {CL_INVALID_OPERATION} for {CL_KERNEL_EXEC_INFO_SVM_PTRS} and
+ {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} if no devices in
+ the context associated with _kernel_ support SVM.
+ifdef::cl_ext_buffer_device_address[]
+ * {CL_INVALID_OPERATION} for {CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT} if no
+ device in the context associated with _kernel_ support the {cl_ext_buffer_device_address_EXT}
+ extension.
+endif::cl_ext_buffer_device_address[]
+ * {CL_INVALID_VALUE} if _param_name_ is not valid, if _param_value_ is
+ `NULL` or if the size specified by _param_value_size_ is not valid.
* {CL_INVALID_OPERATION} if _param_name_ is
{CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} and _param_value_ is {CL_TRUE}
and no devices in the context associated with _kernel_ support fine-grain
diff --git a/xml/cl.xml b/xml/cl.xml
index 39c3a4c4c..a2b0556b6 100644
--- a/xml/cl.xml
+++ b/xml/cl.xml
@@ -255,6 +255,7 @@ server's OpenCL/api-docs repository.
typedef cl_bitfield cl_platform_command_buffer_capabilities_khr;
typedef cl_bitfield cl_mutable_dispatch_asserts_khr
typedef cl_bitfield cl_device_kernel_clock_capabilities_khr;
+ typedef cl_ulong cl_mem_device_address_ext;
Structure types
@@ -2315,6 +2316,12 @@ server's OpenCL/api-docs repository.
+
+
+
+
+
+