Efficient realloc: only copy data in touched spaces #24
base: develop
Conversation
I'm not sure that this is correct.
In this flavor of CHAI, what state do arrays end up in if they are read-only (valid on both GPU and host)? In my flavor, neither space was marked as touched.
Also, what happens right after allocation?
The trickiest use cases I can remember dealing with:
1: realloc immediately after alloc
2: realloc something after it's used as a const in another part of a program.
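To make the two cases above concrete, here is a minimal sketch, not CHAI's actual code. `ToyRecord`, `Space`, and `reallocate_touched_only` are hypothetical names invented for illustration, and the sketch assumes a flavor where a freshly allocated or read-only array has no space marked as touched. Under that assumption a touched-only copy moves no data at all, so the old contents would be dropped unless a fallback copy is made.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not CHAI types.
enum Space { CPU = 0, GPU = 1, NUM_SPACES = 2 };

struct ToyRecord {
  std::array<std::vector<char>, NUM_SPACES> data;  // one buffer per space
  std::array<bool, NUM_SPACES> touched{};          // all false by default
};

// Resize the buffer in every space, copying the old contents only where the
// space is marked as touched. Returns how many spaces were actually copied.
int reallocate_touched_only(ToyRecord& rec, std::size_t new_size) {
  int copies = 0;
  for (int s = 0; s < NUM_SPACES; ++s) {
    std::vector<char> fresh(new_size, 0);
    if (rec.touched[s]) {
      const std::size_t n = std::min(rec.data[s].size(), new_size);
      std::copy_n(rec.data[s].begin(), n, fresh.begin());
      ++copies;
    }
    rec.data[s] = std::move(fresh);
  }
  return copies;
}
```

In this toy model, a return value of 0 means no space was touched and nothing was preserved, which is the situation both use cases would hit if the real record behaves the same way.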
Thanks @robinson96, I'll write up some more test cases and check that this all works.
@@ -137,7 +137,9 @@ void* ArrayManager::reallocate(void* pointer, size_t elems)
   pointer_record->m_user_callback(ACTION_ALLOC, ExecutionSpace(space), sizeof(T) * elems);
   void* new_ptr = m_allocators[space]->allocate(sizeof(T)*elems);

-  rm.copy(old_ptr, new_ptr);
+  if (pointer_record->m_touched[space]) {
+    rm.copy(old_ptr, new_ptr);
I assume the rm knows the lengths and this does a reasonable thing here?
Yes sir.
One other concern: by choosing not to do a realloc on the CPU side of things, you're forcing a copy to always happen, while most realloc implementations will avoid a copy if the realloc ends up with the same address (which happens pretty often, in my experience). Does Umpire have a realloc API? Should it?
It does, but using it means we can't avoid the unnecessary copies on the CPU side: you call realloc, and the semantics of realloc are what you get. If you only want to avoid a device-side copy, then that would be fine.
No description provided.