Skip to content

Commit

Permalink
[SYCL][Docs] Add and implement in-order queue event extension (#12331)
Browse files Browse the repository at this point in the history
This commit adds the extension for and implementation of the ability to
get the last event of an in-order queue as well as setting an external
event to be used as a dependence in an in-order queue.

---------

Signed-off-by: Larsen, Steffen <[email protected]>
  • Loading branch information
steffenlarsen authored Jan 15, 2024
1 parent 1e36e61 commit 1907275
Show file tree
Hide file tree
Showing 10 changed files with 373 additions and 11 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,151 @@
= sycl_ext_oneapi_in_order_queue_events

:source-highlighter: coderay
:coderay-linenums-mode: table

// This section needs to be after the document title.
:doctype: book
:toc2:
:toc: left
:encoding: utf-8
:lang: en
:dpcpp: pass:[DPC++]

// Set the default source code type in this document to C++,
// for syntax highlighting purposes. This is needed because
// docbook uses c++ and html5 uses cpp.
:language: {basebackend@docbook:c++:cpp}


== Notice

[%hardbreaks]
Copyright (C) 2024-2024 Intel Corporation. All rights reserved.

Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
permission by Khronos.


== Contact

To report problems with this extension, please open a new issue at:

https://github.com/intel/llvm/issues


== Dependencies

This extension is written against the SYCL 2020 revision 8 specification. All
references below to the "core SYCL specification" or to section numbers in the
SYCL specification refer to that revision.


== Status

This is an experimental extension specification, intended to provide early
access to features and gather community feedback. Interfaces defined in this
specification are implemented in {dpcpp}, but they are not finalized and may
change incompatibly in future versions of {dpcpp} without prior notice.
*Shipping software products should not rely on APIs defined in this
specification.*


== Overview

SYCL 2020 in-order queues allow for simple control of submission ordering, i.e.
commands are executed in the order they are submitted. This extension adds two
additional APIs for controlling in-order queues: Getting the event from the last
command submission into the queue and setting an external event as an implicit
dependence on the next command submitted to the queue.

This extension exists to solve a specific problem, and a general solution is
still being evaluated. It is not recommended for general usage.


== Specification

=== Feature test macro

This extension provides a feature-test macro as described in the core SYCL
specification. An implementation supporting this extension must predefine the
macro `SYCL_EXT_ONEAPI_IN_ORDER_QUEUE_EVENTS` to one of the values defined in
the table below. Applications can test for the existence of this macro to
determine if the implementation supports this feature, or applications can test
the macro's value to determine which of the extension's features the
implementation supports.

[%header,cols="1,5"]
|===
|Value
|Description

|1
|The APIs of this experimental extension are not versioned, so the
feature-test macro always has this value.
|===

=== New SYCL queue APIs

This extension adds the following new APIs to the existing `sycl::queue` class:

[source, c++]
----
namespace sycl {
class queue {
...
event ext_oneapi_get_last_event() const { /*...*/ }
void ext_oneapi_set_external_event(const event &external_event) { /*...*/ }
}
} // namespace sycl
----

These new APIs have the following behaviour:

--
[options="header"]
|====
| Function Definition | Description
a|
[source, c++]
----
event ext_oneapi_get_last_event() const;
----
| Returns an event representing the execution of the last command submitted to
the queue.

Calls to this member function throw a `sycl::exception` with `errc::invalid` if
the queue does not have the `property::queue::in_order` property.

Calls to this member function throw a `sycl::exception` with `errc::invalid` if
the queue has the `ext::oneapi::property::queue::discard_events` property from
the
link:../supported/sycl_ext_oneapi_discard_queue_events.asciidoc[sycl_ext_oneapi_discard_queue_events extension].

a|
[source, c++]
----
void ext_oneapi_set_external_event(const event &externalEvent);
----
| Sets an event to be used as an additional dependency of the next command
submission to the queue. Subsequent calls to this function will overwrite the
event of the previous call, resulting in only the `externalEvent` from the last
call to this function being a dependency of the next command submission.

This is equivalent to calling `handler::depends_on()` in a command submission
with the `externalEvent` from the most recent call to this member function since
the previous command submission to the same queue.

Calls to this member function throw a `sycl::exception` with `errc::invalid` if
the queue does not have the `property::queue::in_order` property.

Calls to this member function throw a `sycl::exception` with `errc::invalid` if
the queue has the `ext::oneapi::property::queue::discard_events` property from
the
link:../supported/sycl_ext_oneapi_discard_queue_events.asciidoc[sycl_ext_oneapi_discard_queue_events extension].
|====
--
4 changes: 4 additions & 0 deletions sycl/include/sycl/queue.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -2783,6 +2783,10 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {

pi_native_handle getNative(int32_t &NativeHandleDesc) const;

event ext_oneapi_get_last_event() const;

void ext_oneapi_set_external_event(const event &external_event);

private:
std::shared_ptr<detail::queue_impl> impl;
queue(std::shared_ptr<detail::queue_impl> impl) : impl(impl) {}
Expand Down
52 changes: 43 additions & 9 deletions sycl/source/detail/queue_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,20 @@ static event createDiscardedEvent() {
return createSyclObjFromImpl<event>(EventImpl);
}

const std::vector<event> &
queue_impl::getExtendDependencyList(const std::vector<event> &DepEvents,
std::vector<event> &MutableVec) {
if (isInOrder()) {
std::optional<event> ExternalEvent = popExternalEvent();
if (ExternalEvent) {
MutableVec = DepEvents;
MutableVec.push_back(*ExternalEvent);
return MutableVec;
}
}
return DepEvents;
}

event queue_impl::memset(const std::shared_ptr<detail::queue_impl> &Self,
void *Ptr, int Value, size_t Count,
const std::vector<event> &DepEvents) {
Expand Down Expand Up @@ -108,9 +122,13 @@ event queue_impl::memset(const std::shared_ptr<detail::queue_impl> &Self,
if (isInOrder() && MLastCGType == CG::CGTYPE::CodeplayHostTask)
MLastEvent.wait();

std::vector<event> MutableDepEvents;
const std::vector<event> &ExpandedDepEvents =
getExtendDependencyList(DepEvents, MutableDepEvents);

auto EventImpl = detail::getSyclObjImpl(ResEvent);
MemoryManager::fill_usm(Ptr, Self, Count, Value,
getOrWaitEvents(DepEvents, MContext),
getOrWaitEvents(ExpandedDepEvents, MContext),
&EventImpl->getHandleRef(), EventImpl);

if (MContext->is_host())
Expand Down Expand Up @@ -201,9 +219,13 @@ event queue_impl::memcpy(const std::shared_ptr<detail::queue_impl> &Self,
if (isInOrder() && MLastCGType == CG::CGTYPE::CodeplayHostTask)
MLastEvent.wait();

std::vector<event> MutableDepEvents;
const std::vector<event> &ExpandedDepEvents =
getExtendDependencyList(DepEvents, MutableDepEvents);

auto EventImpl = detail::getSyclObjImpl(ResEvent);
MemoryManager::copy_usm(Src, Self, Count, Dest,
getOrWaitEvents(DepEvents, MContext),
getOrWaitEvents(ExpandedDepEvents, MContext),
&EventImpl->getHandleRef(), EventImpl);

if (MContext->is_host())
Expand Down Expand Up @@ -244,9 +266,13 @@ event queue_impl::mem_advise(const std::shared_ptr<detail::queue_impl> &Self,
if (isInOrder() && MLastCGType == CG::CGTYPE::CodeplayHostTask)
MLastEvent.wait();

std::vector<event> MutableDepEvents;
const std::vector<event> &ExpandedDepEvents =
getExtendDependencyList(DepEvents, MutableDepEvents);

auto EventImpl = detail::getSyclObjImpl(ResEvent);
MemoryManager::advise_usm(Ptr, Self, Length, Advice,
getOrWaitEvents(DepEvents, MContext),
getOrWaitEvents(ExpandedDepEvents, MContext),
&EventImpl->getHandleRef(), EventImpl);

if (MContext->is_host())
Expand Down Expand Up @@ -288,11 +314,15 @@ event queue_impl::memcpyToDeviceGlobal(
if (isInOrder() && MLastCGType == CG::CGTYPE::CodeplayHostTask)
MLastEvent.wait();

std::vector<event> MutableDepEvents;
const std::vector<event> &ExpandedDepEvents =
getExtendDependencyList(DepEvents, MutableDepEvents);

auto EventImpl = detail::getSyclObjImpl(ResEvent);
MemoryManager::copy_to_device_global(DeviceGlobalPtr, IsDeviceImageScope,
Self, NumBytes, Offset, Src,
getOrWaitEvents(DepEvents, MContext),
&EventImpl->getHandleRef(), EventImpl);
MemoryManager::copy_to_device_global(
DeviceGlobalPtr, IsDeviceImageScope, Self, NumBytes, Offset, Src,
getOrWaitEvents(ExpandedDepEvents, MContext),
&EventImpl->getHandleRef(), EventImpl);

if (MContext->is_host())
return MDiscardEvents ? createDiscardedEvent() : event();
Expand Down Expand Up @@ -333,11 +363,15 @@ event queue_impl::memcpyFromDeviceGlobal(
if (isInOrder() && MLastCGType == CG::CGTYPE::CodeplayHostTask)
MLastEvent.wait();

std::vector<event> MutableDepEvents;
const std::vector<event> &ExpandedDepEvents =
getExtendDependencyList(DepEvents, MutableDepEvents);

auto EventImpl = detail::getSyclObjImpl(ResEvent);
MemoryManager::copy_from_device_global(
DeviceGlobalPtr, IsDeviceImageScope, Self, NumBytes, Offset, Dest,
getOrWaitEvents(DepEvents, MContext), &EventImpl->getHandleRef(),
EventImpl);
getOrWaitEvents(ExpandedDepEvents, MContext),
&EventImpl->getHandleRef(), EventImpl);

if (MContext->is_host())
return MDiscardEvents ? createDiscardedEvent() : event();
Expand Down
34 changes: 32 additions & 2 deletions sycl/source/detail/queue_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -716,6 +716,22 @@ class queue_impl {

unsigned long long getQueueID() { return MQueueID; }

void setExternalEvent(const event &Event) {
std::lock_guard<std::mutex> Lock(MInOrderExternalEventMtx);
MInOrderExternalEvent = Event;
}

std::optional<event> popExternalEvent() {
std::lock_guard<std::mutex> Lock(MInOrderExternalEventMtx);
std::optional<event> Result = std::nullopt;
std::swap(Result, MInOrderExternalEvent);
return Result;
}

const std::vector<event> &
getExtendDependencyList(const std::vector<event> &DepEvents,
std::vector<event> &MutableVec);

protected:
// Hook to the scheduler to clean up any fusion command held on destruction.
void cleanup_fusion_cmd();
Expand All @@ -731,8 +747,8 @@ class queue_impl {
};

// Accessing and changing of an event isn't atomic operation.
// Hence, here is the lock for thread-safety.
std::lock_guard<std::mutex> Lock{MLastEventMtx};
// Hence, here is are locks for thread-safety.
std::lock_guard<std::mutex> LastEventLock{MLastEventMtx};

if (MLastCGType == CG::CGTYPE::None)
MLastCGType = Type;
Expand All @@ -744,6 +760,13 @@ class queue_impl {
if (NeedSeparateDependencyMgmt)
Handler.depends_on(MLastEvent);

// If there is an external event set, add it as a dependency and clear it.
// We do not need to hold the lock as MLastEventMtx will ensure the last
// event reflects the corresponding external event dependence as well.
std::optional<event> ExternalEvent = popExternalEvent();
if (ExternalEvent)
Handler.depends_on(*ExternalEvent);

EventRet = Handler.finalize();

MLastEvent = EventRet;
Expand Down Expand Up @@ -894,6 +917,13 @@ class queue_impl {
// the fallback implementation of profiling info
bool MFallbackProfiling = false;

// This event can be optionally provided by users for in-order queues to add
// an additional dependency for the subsequent submission in to the queue.
// Access to the event should be guarded with MInOrderExternalEventMtx.
// NOTE: std::optional must not be exposed in the ABI.
std::optional<event> MInOrderExternalEvent;
mutable std::mutex MInOrderExternalEventMtx;

public:
// Queue constructed with the discard_events property
const bool MDiscardEvents;
Expand Down
1 change: 1 addition & 0 deletions sycl/source/feature_test.hpp.in
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,7 @@ inline namespace _V1 {
#define SYCL_EXT_INTEL_CACHE_CONTROLS 1
#define SYCL_EXT_INTEL_FP_CONTROL 1
#define SYCL_EXT_ONEAPI_NON_UNIFORM_GROUPS 1
#define SYCL_EXT_ONEAPI_IN_ORDER_QUEUE_EVENTS 1

#ifndef __has_include
#define __has_include(x) 0
Expand Down
26 changes: 26 additions & 0 deletions sycl/source/queue.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -300,5 +300,31 @@ bool queue::ext_codeplay_supports_fusion() const {
ext::codeplay::experimental::property::queue::enable_fusion>();
}

event queue::ext_oneapi_get_last_event() const {
if (!is_in_order())
throw sycl::exception(
make_error_code(errc::invalid),
"ext_oneapi_get_last_event() can only be called on in-order queues.");
if (impl->MDiscardEvents)
throw sycl::exception(
make_error_code(errc::invalid),
"ext_oneapi_get_last_event() cannot be called on queues with the "
"ext::oneapi::property::queue::discard_events property.");
return impl->getLastEvent();
}

void queue::ext_oneapi_set_external_event(const event &external_event) {
if (!is_in_order())
throw sycl::exception(make_error_code(errc::invalid),
"ext_oneapi_set_external_event() can only be called "
"on in-order queues.");
if (impl->MDiscardEvents)
throw sycl::exception(
make_error_code(errc::invalid),
"ext_oneapi_set_external_event() cannot be called on queues with the "
"ext::oneapi::property::queue::discard_events property.");
return impl->setExternalEvent(external_event);
}

} // namespace _V1
} // namespace sycl
Loading

0 comments on commit 1907275

Please sign in to comment.