-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Specify interoperability with fork and subprocess creation #474
Open
nspark
wants to merge
4
commits into
openshmem-org:main
Choose a base branch
from
nspark:interop/fork
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
4 commits
Select commit
Hold shift + click to select a range
265a972
Initial interop text re: fork and subprocesses
nspark 29e7417
Update 'fork' wording for consistency
nspark 507d765
Clarifying note on forking, exit handlers, and implicit finalization
nspark 8cd6cad
Changelog for fork/subprocess interoperability
nspark File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two cases above are still problematic. The issue is that fork marks pages shared by the parent and child processes as copy on write. When either process touches the page, it causes a copy of that page to be mapped into that process' address space and this breaks memory registration because it can change the parent's mapping for memory registered with the NIC. The second bullet above is solved by
ibv_fork_init
, which marks any memory registered to the NIC withMADV_DONTFORK
. This prevents the page being remapped to a copy in the parent's address space, but it also changes the behavior of fork since some pages don't propagate to the child process. In new kernels, Linux copies the pages immediately, which solves the memory registration problem without breaking fork, but at the expense of copy space/time overhead (especially if the application does fork+exec). I'm not exactly sure what will happen in the case wherefork
is called beforeibv_fork_init
on older Linux kernels. Hopefully the NIC driver will see that the page is marked copy on write during registration and duplicate it in the parent's address space.To summarize, any support for fork is problematic. I know the bullet below says that we aren't required to support it, but the two bullets above describe two usage models as things that ought to work. Fork bugs are very hard to diagnose; here's an interesting read of one user's journey discovering that fork was broken: https://blog.nelhage.com/post/a-cursed-bug/. I would suggest that if we want to keep the above bullets, we also need to give users some way to ask the SHMEM library whether or not the above usage models are supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That post is a great write-up, and it touches on a behavioral aspect of CoW that I didn't know:
I had always thought the child always gets the copy. It sounds like the behavior in Linux 5.12 is (or can be) much better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm inclined to, for now, limit subprocess creation to
posix_spawn
. We could then revisit documentingfork
compatibility with a supplemental API (e.g.,shmem_is_forksafe
) later.I think the documentation rationale for
posix_spawn
is sound. I didn't knowposix_spawn
existed until I found that I couldn'tfork-exec
in various SHMEM implementations. I imagine many Unix-y programmers are more used tofork-exec
thanposix_spawn
.Still, what's the concern with use of
fork
beforeshmem_init[_thread]
? I'd generally think that, in general, we're still in "vanilla Linux memory land" during that time.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The concern I had about fork before init is what NIC drivers will do when you try to register a page that's marked CoW. I suppose this is a common case, e.g. if a new allocation is registered before it's initialized (all pages mapped to zero page), so perhaps nothing to worry about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been digging around this trying to better understand the problem and a few things have changed since this thread was initiated. It is not clear to me that
posix_spawn
should better thanfork()
orvfork()
. I believe both of those functions can be called depending on the setting of a few POSIX system environment variables. I would like to see this interface defined, but I know this is not a trivial task. Have any of your (@jdinan) opinions on the matter changed with updates to the kernel and networking middle layers options.The main ambition is having a defined interface within a SHMEM program that is able to spawn off a subprocess safely. The process should have limitations on what it can do (e.g. no shmem routines).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only change that comes to mind is Linux 5.12 copying the parent process pages into the child process at the time of the fork rather than marking them as CoW. Is having a copy of the parent process' data a usage model that's important for your apps? If not, we leave more freedom to implementations by not needing to support such a model.