Minor improvements to the documentation #292
base: master
TTG C++ implementation is currently supported by 2 backends providing task scheduling, data transfer, and resource management.
While it is possible to use specific TTG backend explicitly, by using the appropriate namespaces, it is recommended to write backend-neutral programs that can be specialized to a particular backend as follows.
The TTG C++ implementation is currently supported by 2 backends providing task scheduling, data transfer, and resource management.
While it is possible to use a specific TTG backend explicitly, by using the appropriate namespaces, it is recommended to write backend-neutral programs that can be specialized to a particular backend in of two ways.
... in *one of* the two ways
README.md
Outdated
@@ -154,8 +160,7 @@ To execute a TTG we must make it executable (this will declare the TTG program
tt->invoke();
```
`ttg::execute()` must occur before, not after, sending any messages. Note also that we must ensure that only one such message must be generated. Since TTG execution uses the Single Program Multiple Data (SPMD) model,
when launching the TTG program as multiple processes only the first process (rank) gets to send the message.
`ttg::execute()` must occur before, not after, sending any messages. Note also that we must ensure that only one such message is generated. Since TTG execution uses the Single Program Multiple Data (SPMD) model, when launching the TTG program as multiple processes only the first process (rank) gets to send the message. Otherwise,
Otherwise ?
README.md
Outdated
@@ -422,22 +447,24 @@ Although the structure of the device-capable program is nearly identical to the
##### `TTValue`
For optimal performance low-level runtime that manages the data motion across the memory hierarchy (host-to-host (i.e., between MPI ranks), host-to-device, and device-to-device) must be able to _track_ each datum as it orchestrates the computation. For example, when a TTG task `send`'s a datum to an output terminal connected to multiple consumers the runtime may avoid unnecessary copies, e.g. by recognizing that all consumers will only need read-only access to the data, hence reference to the same datum can be passed to all consumers. This requires being able to map pointer to a C++ object to the control block that describes that object to the runtime. Deriving C++ type `T` from `TTValue<T>` makes it possible to track objects `T` by embedding the control block into each object. This is particularly important for the data that has to travel to the device.
For optimal performance, the low-level runtime that manages the data motion across the memory hierarchy (host-to-host (i.e., between MPI ranks), host-to-device, and device-to-device) and so it must be able to _track_ each datum as it orchestrates the computation. For example, when a TTG task `send`'s a datum to an output terminal connected to multiple consumers the runtime may avoid unnecessary copies, e.g., by recognizing that all consumers will only need read-only access to the data, hence reference to the same datum can be passed to all consumers. This requires the mapping of a pointer to a C++ object to the control block that describes that object to the runtime. Deriving C++ type `T` from `TTValue<T>` includes the control block in `T` and avoids creating a separate control block. This is particularly important for the data that has to travel to the device.
... a TTG task sends a datum ... (i.e., `send`'s -> sends)
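To make the `TTValue` paragraph in this hunk concrete, here is a minimal sketch of the difference between looking up a control block in a side table and embedding it in the object. Everything in it (`ControlBlock`, `SideTable`, `Tracked`, `Matrix`, `block_of`, `share_readonly`) is a hypothetical illustration and not TTG's actual implementation; it only shows why deriving from a CRTP base can turn the pointer-to-control-block mapping into a plain member access.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for the runtime's per-datum bookkeeping.
struct ControlBlock {
  int readers = 0;  // consumers currently sharing the datum read-only
};

// Without an embedded block, the runtime needs a side table keyed by address.
class SideTable {
 public:
  // Hash lookup; inserts a fresh block the first time an address is seen.
  ControlBlock& lookup(const void* p) { return table_[p]; }
  std::size_t size() const { return table_.size(); }

 private:
  std::unordered_map<const void*, ControlBlock> table_;
};

// With an embedded block (CRTP, in the spirit of TTValue<T>),
// every object carries its own control block.
template <typename T>
struct Tracked {
  ControlBlock block;
};

// A user type opts in by deriving from Tracked<itself>.
struct Matrix : Tracked<Matrix> {
  double data[4] = {0, 0, 0, 0};
};

// Pointer-to-block mapping is a member access: no lookup, no allocation.
template <typename T>
ControlBlock& block_of(Tracked<T>& obj) {
  return obj.block;
}

// Hand the same datum to n read-only consumers instead of copying it n times.
template <typename T>
T* share_readonly(T& obj, int n) {
  assert(n >= 0);
  block_of(obj).readers += n;
  return &obj;
}
```

Calling `share_readonly(m, 3)` on a `Matrix m` returns `&m` and records three readers in `m.block`, whereas the side-table variant has to hash the address and may allocate a new entry on first use.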
README.md
Outdated
ttg::execute();
// add a single task into the taskpool
Since we never introduce implementation details like taskpools, I prefer something like `// create task to kickstart computation`.
README.md
Outdated
@@ -28,15 +28,21 @@ The development of TTG was motivated by _irregular_ scientific applications like
#include <ttg.h>
int main(int argc, char *argv[]) {
// initialization
can just skip comments that look like code
README.md
Outdated
ttg::fence();
// finalization
can just skip comments that look like code
README.md
Outdated
@@ -195,28 +200,40 @@ $F_N = F_{N-1} + F_{N-2}, F_0=0, F_1=1$.
int main(int argc, char *argv[]) {
ttg::initialize(argc, argv);
const int64_t N = 20;
const int64_t N = 20; // want to compute Fib(20)
should match the TT name, so `Fib` -> `fib`
README.md
Outdated
ttg::Edge<int64_t, Fn> f2f;
ttg::Edge<void, Fn> f2p;
auto make_ttg_fib_lt(const int64_t F_n_max) {
ttg::Edge<int64_t, Fn> f2f; // Fib to Fib
fib -> fib
README.md
Outdated
ttg::Edge<void, Fn> f2p;
auto make_ttg_fib_lt(const int64_t F_n_max) {
ttg::Edge<int64_t, Fn> f2f; // Fib to Fib
ttg::Edge<void, Fn> f2p; // Fib to print
fib -> print
README.md
Outdated
`Buffer<T>` is a view of a contiguous sequence of objects of type `T` in the host memory that can be automatically moved by the runtime to/from the device memory. Here `Fn::b` is a view of the 2-element sequence pointed to by `Fn::F`; once it's constructed the content of `Fn::F` will be moved to/from the device by the runtime. The subsequent actions of `Fn::b` cause the automatic transfers of data to (`device::select(f_n.b)`) and from (`ttg::device::wait(f_n.b)`) the device.
`Buffer<T>` is a view of a contiguous sequence of objects of type `T` in the host memory that can be automatically moved by the runtime to/from the device memory. Here `Fn::b` is a view of the 2-element sequence pointed to by `Fn::F`; once it's constructed the content of `Fn::F` will be moved to/from the device by the runtime. The subsequent use of `Fn::b` cause the automatic transfers of data to (`device::select(f_n.b)`) and from (`ttg::device::wait(f_n.b)`) the device.
A `Buffer<T>` can be either owning or non-owning. In the example above, the memory is owned by the `unique_ptr`.
If no pointer is passed to the constructor of `Buffer<T>` the buffer becomes owning, i.e., it allocates the necessary host-side memory.
do we need to get in the weeds of why `F` must be on heap to give relocatability?
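For reference, the owning versus non-owning distinction and the relocatability point can be sketched without TTG at all. The `HostView` class below is a hypothetical stand-in, not the real `Buffer<T>` or its constructors; the `Fn` here only mirrors the shape described in the hunk (a heap array `F` plus a view `b` over it). The sketch assumes the view stores a raw pointer, which is why the viewed storage has to live on the heap if `Fn` is to stay movable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical host-side view; NOT the real ttg::Buffer<T>.
template <typename T>
class HostView {
 public:
  // Non-owning: wrap memory that someone else keeps alive.
  HostView(T* ptr, std::size_t n) : ptr_(ptr), size_(n) {
    assert(ptr != nullptr || n == 0);
  }

  // Owning: no pointer given, so allocate value-initialized host storage.
  explicit HostView(std::size_t n)
      : storage_(std::make_unique<T[]>(n)), ptr_(storage_.get()), size_(n) {}

  bool owning() const { return storage_ != nullptr; }
  T* data() const { return ptr_; }
  std::size_t size() const { return size_; }

 private:
  std::unique_ptr<T[]> storage_;  // empty in the non-owning case
  T* ptr_;
  std::size_t size_;
};

// Mirrors the shape discussed in the hunk: F owns the memory, b views it.
struct Fn {
  std::unique_ptr<std::int64_t[]> F;  // heap storage, address survives moves of Fn
  HostView<std::int64_t> b;           // non-owning view of the two elements in F

  Fn() : F(std::make_unique<std::int64_t[]>(2)), b(F.get(), 2) {}
};
```

Because `F` is a heap allocation, moving an `Fn` transfers ownership of the array without changing its address, so the raw pointer inside `b` stays valid in the moved-to object. Had the two elements been stored inline in `Fn`, a move would have left `b` pointing at the old object.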
README.md
Outdated
@@ -493,8 +518,7 @@ To simplify debugging of multirank TTG programs it is possible to automate the p
# TTG Performance
Competitive performance of TTG for several paradigmatic scientific applications on shared- and distributed-memory machines (CPU only)
will be discussed in [manuscript ``Generalized Flow-Graph Programming Using Template Task-Graphs: Initial Implementation and Assessment''](https://www.ipdps.org/ipdps2022/2022-accepted-papers.html) to be presented at [IPDPS'22](https://www.ipdps.org/ipdps2022/).
Stay tuned!
will be discussed in [manuscript ``Generalized Flow-Graph Programming Using Template Task-Graphs: Initial Implementation and Assessment''](https://www.ipdps.org/ipdps2022/2022-accepted-papers.html) and has been presented at [IPDPS'22](https://www.ipdps.org/ipdps2022/).
"will be" -> "is"
@devreal ping
Signed-off-by: Joseph Schuchart <joseph.schuchart@stonybrook.edu>
No description provided.