Skip to content
This repository has been archived by the owner on Mar 24, 2022. It is now read-only.

Faulted guests can run! #478

Merged
merged 3 commits into from
Apr 1, 2020
Merged

Faulted guests can run! #478

merged 3 commits into from
Apr 1, 2020

Conversation

iximeow
Copy link
Contributor

@iximeow iximeow commented Apr 1, 2020

Over in #429 we adjusted KillSwitch to be useful in that a timeout before a guest begins is upheld, rather than ignored. As it turns out, our tests didn't cover one of the expectations we have around instance lifetimes: after a fault, instances can be run without reset. Consequently, users of lucet-runtime that ran instances post-fault without reset (such as lucet-spectest) caught a bug: rerunning after a fault could cause us to panic in a way that aborts the process.

In normal guest exits we disable termination on the instance's KillState, then reset it to a new one so new KillSwitch can cancel execution for some eventual function call. To avoid the bad KillState state, we just needed to do the same when exiting through a signal handler, too.

The first commit here introduces a test that fails like

error: test failed, to rerun pass '-p lucet-runtime --test guest_fault'
                                                                     
Caused by:                                                                                
  process didn't exit successfully: `/home/iximeow/lucet/target/debug/deps/guest_fault-88464978bc082788` (signal: 6, SIGABRT: process abort signal)

(SIGABRT because we panic where there isn't full unwind information, and libunwind falls over)

where the second commit fixes this issue, and gets us back to passing all but five spectests: data, elem, globals, imports, linking!

@@ -215,7 +216,7 @@ macro_rules! timeout_tests {
}

// An instance that has faulted is not terminable.
assert_eq!(kill_switch.terminate(), Err(KillError::NotTerminable));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here says not terminable, and in a way the Err does mean not terminable, but Invalid means the underlying KillState is gone and the KillSwitch is no longer valid, which is slightly different from the NotTerminable result. I think this is a case of the test being subtly incorrect, but I know we've talked a bit about this granularity of KillError maybe being a misdesigned interface anyway.

Copy link
Member

@cratelyn cratelyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the changes here look fantastic, this is a great find @iximeow! 💪 🤠

One typo, and two small documentation fixes requested:

  1. The module-level documentation in lucet-runtime/lucet-runtime-internals/src/instance/execution.rs should probably be updated to mention that faulted guests are terminable.

  2. Same for the doc comments for KillError.

@iximeow iximeow merged commit 7c816c6 into master Apr 1, 2020
@iximeow iximeow deleted the faulted-guests-can-run branch April 1, 2020 23:09
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants