Merge remote-tracking branch 'supervisor/thebutlah/prepare-foss'
Showing 19 changed files with 1,536 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,65 @@ | ||
# Changelog | ||
|
||
## 0.4.1 | ||
|
||
### Added | ||
|
||
+ Private proxy for getting notified of service registration `org.worldcoin.OrbSupervisor1` | ||
+ Version pinning for GitHub actions | ||
|
||
### Changed | ||
|
||
+ Upon booting, `orb-supervisor` permits `update-agent` to begin downloading | ||
immediately without throttling, until a signup starts | ||
|
||
## 0.4.0 | ||
|
||
### Added | ||
|
||
+ Proxy for logind method `org.freedesktop.login1.Manager.ScheduleShutdown` | ||
+ enables `orb-core` and `update-agent` to shut down or restart the device without | ||
needing to grant elevated privileges/suid | ||
|
||
## 0.3.0 | ||
|
||
`orb-supervisor` no longer shuts down `orb-core` immediately when an update happens | ||
but waits until no new signups have been started for a while. | ||
|
||
### Added | ||
|
||
+ Upon receiving a `RequestUpdatePermission` request, `orb-supervisor` only shuts | ||
down `orb-core` after 20 minutes of inactivity (meaning that no signups have been | ||
performed for 20 minutes). This timer is reset every time a new signup starts. | ||
Once the timer is up, `orb-supervisor` schedules `update-agent` to immediately run again. | ||
|
||
### Changed | ||
|
||
+ `orb-supervisor` now returns custom `MethodError`s to report why an update was denied, | ||
bringing it more in line with DBus conventions. | ||
|
||
## 0.2.0 (October 20, 2022) | ||
|
||
`orb-supervisor`'s integration with systemd and journald is improved by using | ||
journald conventions and writing directly to the journald socket. | ||
|
||
### Added | ||
|
||
+ `orb-supervisor` detects if it's attached to an interactive TTY using `STDIN`: | ||
+ if not attached to a TTY, it will write to the journald socket | ||
+ if attached to a TTY, it will write to stdout/stderr | ||
+ `orb-supervisor` identifies itself as `worldcoin-supervisor` using SYSLOG IDENT; | ||
+ use `journalctl -t worldcoin-supervisor` to filter journald entries | ||
(`-u worldcoin-supervisor` however is still the preferred way); | ||
|
||
## 0.1.0 (August 31, 2022) | ||
|
||
This is the first release of `orb-supervisor`. | ||
|
||
### Added | ||
|
||
+ Expose dbus property `org.worldcoin.OrbSupervisor1.Manager.BackgroundDownloadsAllowed`; | ||
+ Tracks how much time has passed since the last | ||
`org.worldcoin.OrbCore1.Signup.SignupStarted` event; | ||
+ Expose dbus method `org.worldcoin.OrbSupervisor1.Manager.RequestUpdatePermission`; | ||
+ attempts to shut down `worldcoin-core.service` through | ||
`org.freedesktop.systemd1.Manager.StopUnit`; |
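
The TTY rule from the 0.2.0 entry can be sketched as a small decision function. This is only an illustration and not the code from this commit: `LogTarget`, `log_target`, and `detect_log_target` are hypothetical names, and the sketch uses `std::io::IsTerminal` (Rust 1.70+), whereas the crate itself may take a different route (for example via `libc`).

```rust
use std::io::IsTerminal;

/// Where log output should go (hypothetical type, for illustration only).
#[derive(Debug, PartialEq, Eq)]
pub enum LogTarget {
    /// Not attached to a TTY: write to the journald socket.
    Journald,
    /// Attached to a TTY: write to stdout/stderr.
    Stdio,
}

/// Pure decision rule, kept separate so it can be tested without a terminal.
pub fn log_target(stdin_is_tty: bool) -> LogTarget {
    if stdin_is_tty {
        LogTarget::Stdio
    } else {
        LogTarget::Journald
    }
}

/// Applies the rule to the STDIN of the current process.
pub fn detect_log_target() -> LogTarget {
    log_target(std::io::stdin().is_terminal())
}
```

Separating the pure rule from the STDIN probe keeps the choice testable in CI, where no interactive terminal is attached.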
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
[package] | ||
name = "orb-supervisor" | ||
version = "0.4.1" | ||
edition = "2021" | ||
|
||
[dependencies] | ||
color-eyre = "0.6.3" | ||
libc = "0.2.135" | ||
listenfd = "1.0.0" | ||
tokio = { version = "1.21.2", features = ["macros", "net", "rt-multi-thread"] } | ||
tokio-stream = "0.1.11" | ||
tracing = { version = "0.1.37", features = ["attributes"] } | ||
tracing-subscriber = { version = "0.3.16", features = ["env-filter"] } | ||
zbus = { version = "3.9.0", default-features = false, features = ["tokio"] } | ||
zbus_systemd = { version = "0.0.8", features = [ "systemd1", "login1" ] } | ||
thiserror = "1.0.37" | ||
futures = "0.3.24" | ||
once_cell = "1.15.0" | ||
tap = "1.0.1" | ||
tracing-journald = "0.3.0" | ||
|
||
[dev-dependencies] | ||
dbus-launch = "0.2.0" | ||
tokio = { version = "1.25.0", features = ["sync", "test-util"] } |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
# Guideline for Supervised Process Development | ||
|
||
Examples of SuPrs (**Su**pervised **Pr**ocesses): | ||
- update-agent | ||
- orb-core | ||
- fan-controller | ||
- ... | ||
|
||
## Expectations | ||
|
||
Through signal_hook or otherwise, we expect components to adhere to UNIX signal best practices, specifically around shutdown signals. | ||
|
||
### Shutdown Flow | ||
|
||
The supervisor _decides_ it must shut down. The supervisor iterates over the list of supervised processes, reads each process's PID file, and issues a [SIGTERM](https://man7.org/linux/man-pages/man7/signal.7.html) to give the application **SOME DEFINED SECONDS** to shut down. After that time has elapsed, the supervisor re-reads the SuPr PID files and sends a [SIGKILL](https://man7.org/linux/man-pages/man7/signal.7.html) to any process that is still running. |
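
The SIGTERM-then-SIGKILL escalation described in the shutdown flow could look roughly like the sketch below. It makes several assumptions that are not in the guideline: it holds a `std::process::Child` handle instead of reading PID files, it sends SIGTERM through the shell's `kill` builtin because the standard library has no SIGTERM API, and `stop_gracefully` and its `grace` parameter are hypothetical names. The grace period is left to the caller, since the guideline does not fix a value.

```rust
use std::process::{Child, Command, ExitStatus};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Sends SIGTERM, waits up to `grace` for the process to exit,
/// then falls back to SIGKILL. Unix only.
pub fn stop_gracefully(child: &mut Child, grace: Duration) -> std::io::Result<ExitStatus> {
    // Step 1: ask politely. Assumption: `sh` with a `kill` builtin is available.
    let _ = Command::new("sh")
        .arg("-c")
        .arg(format!("kill -TERM {}", child.id()))
        .status()?;

    // Step 2: poll until the process exits or the grace period elapses.
    let deadline = Instant::now() + grace;
    while Instant::now() < deadline {
        if let Some(status) = child.try_wait()? {
            return Ok(status);
        }
        sleep(Duration::from_millis(50));
    }

    // Step 3: the process ignored SIGTERM; `Child::kill` sends SIGKILL.
    child.kill()?;
    child.wait()
}
```

A supervised process that follows the guideline handles SIGTERM and exits during step 2, so step 3 is only reached by misbehaving processes.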
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
# Orb Supervisor | ||
|
||
Orb supervisor is a central IPC server that coordinates device state and external UX across independent agents (binaries/processes). | ||
|
||
## Table of Contents | ||
|
||
- Minimal viable product (MVP) | ||
- Why (this is necessary) | ||
- Managing device health | ||
- Consistent UX | ||
- Separation of concerns | ||
- Relevant components | ||
|
||
## MVP | ||
|
||
### Initial release | ||
|
||
- supervisor running [tonic gRPC](https://github.com/hyperium/tonic) over UDS (Unix Domain Sockets) | ||
- supervisor can broadcast shutdown message | ||
- component apps (orb-core, update-agent) listen for the broadcast and shut down | ||
- supervisor can update SMD **through sub-process** | ||
- supervisor can display front LED patterns | ||
- IPC (inter-process communication) client library supporting defaults for process shutdown handlers | ||
- Set up the bidirectional communication + the listener for broadcast messages | ||
|
||
### Immediate follow-up release | ||
- supervisor can play sounds | ||
- supervisor can engage in bi-directional communication for signup permission with orb-core; orb-core must not run a signup if... | ||
- an update is scheduled; | ||
- the device is shutting down; | ||
- the SSD is full (coordinate with @AI on signup extensions); | ||
- fan-controller PoC | ||
- spin fans up/down depending on temperature/temperature-analogs | ||
- watch iops/sec on NVMe as an indicator of SSD temperature (can be replaced by reading out SMART data after kernel 5.10 is deployed) | ||
- supervisor can update SMD **through nvidia-smd crate** | ||
- Implement an Nvidia SMD parser as a crate (other people may want this) | ||
|
||
## Why this is necessary | ||
|
||
There are three reasons that make the orb supervisor necessary: | ||
|
||
1. Managing device health (heat, updates) | ||
1. Consistent UX (updates w/ voice, LEDs) | ||
1. Separation of concerns | ||
|
||
### Managing device health | ||
|
||
Device health must be ensured at all times, whether the device is updating or in the middle of a signup. Furthermore, you want this to be maximally isolated to avoid a scenario where, through a vulnerability in a monolithic application, an attacker acquires fan control and overheats the device. | ||
|
||
> **Scenario**: _A non-security critical update is running in the background and writing large blobs of data to the NVMe SSD_ while _orb-core is running and signups are being performed. An attacker uses a vulnerability in the QR code processing to deadlock a thread. They then proceed to garble the incoming network traffic causing the download to be repeatedly retried and data to be constantly written to the SSD while thermal management is stuck in the blocked runtime. This can feasibly fry an Orb._ | ||
### Consistent UX | ||
|
||
By necessity, the update agent service must have heightened privileges. Under no circumstances can we extend these to the entire orb-core process. At the same time, the operator must receive feedback on the status of an update. For certain updates, orb-core will not run during the update. In this scenario there is currently no mechanism to give feedback to the operator. | ||
|
||
Thus, an independent service that owns UX is a necessary condition for operator feedback. | ||
|
||
### Separation of concerns | ||
|
||
Breaking components down allows us to: | ||
|
||
+ Reduce attack surfaces by restricting the responsibilities of privileged services; | ||
+ Employ best patterns for the job (a fan monitoring service looks different from an update agent, which in turn looks different from orb-core); | ||
+ Reduce engineering load (understanding a 500 LoC binary and finding bugs in it _is_ easier than in a 10k LoC monolith); | ||
+ Simplify integration testing, which is significantly easier outside of complex runtimes. | ||
|
||
It is best industry practice to write dedicated services *where possible*, where coupling is low and where solutions already exist. This applies especially on a full Linux host and will reduce engineering load. | ||
|
||
## Relevant components | ||
|
||
+ update agent | ||
+ fan monitor & control | ||
+ wifi management | ||
+ UX controller, split into: | ||
+ Sound | ||
+ LED | ||
+ library for basic and repeatable "component" |
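
The signup-permission rule listed under the follow-up release (no signup while an update is scheduled, the device is shutting down, or the SSD is full) can be expressed as a small check. The sketch below is an assumption-laden illustration: `DeviceState`, `SignupDenied`, and `signup_permission` are hypothetical names, the order in which conditions are checked is arbitrary, and nothing here reflects the actual IPC interface.

```rust
/// Snapshot of the conditions the supervisor would track (hypothetical type).
#[derive(Debug, Default, Clone, Copy)]
pub struct DeviceState {
    pub update_scheduled: bool,
    pub shutting_down: bool,
    pub ssd_full: bool,
}

/// Reason returned to orb-core when a signup must not run (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum SignupDenied {
    UpdateScheduled,
    ShuttingDown,
    SsdFull,
}

/// Grants permission only if none of the blocking conditions holds.
pub fn signup_permission(state: &DeviceState) -> Result<(), SignupDenied> {
    if state.update_scheduled {
        return Err(SignupDenied::UpdateScheduled);
    }
    if state.shutting_down {
        return Err(SignupDenied::ShuttingDown);
    }
    if state.ssd_full {
        return Err(SignupDenied::SsdFull);
    }
    Ok(())
}
```

Returning a typed reason instead of a bare boolean lets the caller tell the operator why a signup was refused.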
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
use tokio::time::Duration; | ||
|
||
pub const WORLDCOIN_CORE_UNIT_NAME: &str = "worldcoin-core.service"; | ||
pub const DURATION_TO_STOP_CORE_AFTER_LAST_SIGNUP: Duration = | ||
Duration::from_secs(20 * 60); |
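
For context, the constant above matches the 20-minute inactivity rule from the 0.3.0 changelog entry: `orb-core` is only stopped once no signup has started for that long, and every new signup resets the timer. A minimal sketch of that bookkeeping follows. It is not the supervisor's actual implementation: `SignupTracker` and its methods are hypothetical, and it uses `std::time` (which `tokio::time::Duration` re-exports) so it runs without tokio.

```rust
use std::time::{Duration, Instant};

pub const DURATION_TO_STOP_CORE_AFTER_LAST_SIGNUP: Duration =
    Duration::from_secs(20 * 60);

/// Remembers when the last signup started (hypothetical helper type).
pub struct SignupTracker {
    last_signup: Instant,
}

impl SignupTracker {
    /// Starts tracking from `now`.
    pub fn new(now: Instant) -> Self {
        Self { last_signup: now }
    }

    /// Resets the inactivity timer; call whenever a new signup starts.
    pub fn signup_started(&mut self, now: Instant) {
        self.last_signup = now;
    }

    /// True once the full inactivity window has passed since the last signup.
    pub fn may_stop_core(&self, now: Instant) -> bool {
        now.saturating_duration_since(self.last_signup)
            >= DURATION_TO_STOP_CORE_AFTER_LAST_SIGNUP
    }
}
```

Passing `now` in explicitly, instead of calling `Instant::now()` inside the methods, keeps the timer logic testable without waiting 20 minutes.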