From 6a203371c127ba5bc6f934edf3b97f13c8a96afa Mon Sep 17 00:00:00 2001
From: Simon Noetzlin
Date: Mon, 9 Dec 2024 10:51:30 +0100
Subject: [PATCH] doc: add fault resolution ADR (#2051)

* save draft
* save last version
* last changes
* add ADR in intro
* updat title
* nits
* Update docs/docs/adrs/adr-018-fault-resolutions.md
Co-authored-by: Marius Poke
* Update docs/docs/adrs/adr-018-fault-resolutions.md
Co-authored-by: Marius Poke
* Update docs/docs/adrs/adr-018-fault-resolutions.md
Co-authored-by: Marius Poke
* add ref
* Update docs/docs/adrs/adr-018-fault-resolutions.md
Co-authored-by: Marius Poke
* address comments
* add detail about fraud proofs
* refine metions about fraud proofs and add refs
* add new evidence fields + details
* fix typo
* change title to incorrect executions fix typos
* revert renaming
* specify scope of first iteration
* add fault type field and update some parts
* save
* save
* add states
* add details to states
* nits
* add details for pruning
* improve message definition
* docs: update ADRs intro and fault resolution adr position and title
* docs: update ADRs intro and fault resolution adr position and title
* docs: mv adrs to accepted
* docs: mv adrs to accepted

---------

Co-authored-by: Marius Poke
Co-authored-by: MSalopek
---
 docs/docs/adrs/adr-022-fault-resolutions.md | 131 ++++++++++++++++++++
 docs/docs/adrs/intro.md                     |   1 +
 2 files changed, 132 insertions(+)
 create mode 100644 docs/docs/adrs/adr-022-fault-resolutions.md

diff --git a/docs/docs/adrs/adr-022-fault-resolutions.md b/docs/docs/adrs/adr-022-fault-resolutions.md
new file mode 100644
index 0000000000..3501ecbd07
--- /dev/null
+++ b/docs/docs/adrs/adr-022-fault-resolutions.md
@@ -0,0 +1,131 @@
+---
+sidebar_position: 23
+title: Fault Resolutions
+---
+# ADR 022: Fault Resolutions
+
+## Changelog
+* 17th July 2024: Initial draft
+
+## Status
+
+Proposed
+
+## Context
+
+Partial Set Security ([PSS](./adr-015-partial-set-security.md)) allows a subset of a provider chain's validator set to secure a consumer chain.
+ While this shared security scheme has many advantages, it comes with a risk known as the
+ [subset problem](https://informal.systems/blog/replicated-vs-mesh-security#risks-of-opt-in-security-also-known-as-ics-v-2).
+ This problem arises when a malicious majority of validators from the provider chain collude and misbehave on a consumer chain.
+ This threat is particularly relevant for Opt-in chains, since they might be secured by a relatively small subset of the provider's validator set.
+
+In cases of collusion, various types of misbehaviour can be performed by the validators, such as:
+
+* Incorrect executions to break protocol rules in order to steal funds.
+* Liveness attacks to halt the chain or censor transactions.
+* Oracle attacks to falsify information used by the chain logic.
+
+Currently, these types of attacks aren't handled in PSS, leaving the malicious validators unpunished.
+
+A potential solution for the handling of incorrect executions is to use fraud proofs.
+ This technology allows proving incorrect state transitions of a chain without a full node.
+ However, this is a complex technology and there is no framework that works for Cosmos chains to this day.
+
+
+To address this risk in PSS, a governance-gated slashing solution can be used to handle all types of misbehavior resulting from validator collusion. As fraud proof technology matures, part of the solution could potentially be automated.
+
+
+This ADR proposes a fault resolution mechanism, which is a type of governance proposal that can be used to vote on the slashing of validators that misbehave on Opt-in consumer chains (see [fault resolutions](https://forum.cosmos.network/t/preventing-intersubjective-faults-in-ics/14103#fault-resolutions-3) in "Preventing Intersubjective faults in ICS").
+
+In what follows, we describe the implementation of a fault resolution mechanism for any intersubjective fault.
+ Note that in the first iteration, only incorrect executions are defined as a fault and are therefore handled by the mechanism (see [Incorrect Executions](https://forum.cosmos.network/t/preventing-intersubjective-faults-in-ics/14103#incorrect-execution-fault-definition-5) in "Preventing Intersubjective faults in ICS").
+
+
+## Decision
+
+The proposed solution introduces a new `consumer-fault-resolution` governance proposal type to the `provider` module, which allows validators to be penalised for committing faults on an Opt-in consumer chain.
+
+If such a proposal passes, the proposal handler tombstones all the validators listed in the proposal and slashes them by an amount predefined per consumer chain
+ or, if none is defined, by the default value used for double-sign infractions.
+
+The proposal has the following fields:
+
+- **Consumer Chain**: The ID of the consumer chain on which the fault occurred.
+- **Validators**: The list of all the validators to be slashed.
+- **Evidence**: A free-text field.
+- **Fault Type**: The fault definition type.
+- **Description**: This field is automatically generated by aggregating the fault definition corresponding to the *Fault Type* and the *Evidence* fields.
+
+ Each fault type is mapped to a fault definition that precisely describes an intersubjective fault, such as an incorrect execution, and explains why it qualifies as a slashable fault. Refer to the [fault definitions section](https://forum.cosmos.network/t/preventing-intersubjective-faults-in-ics/14103#fault-definitions-4) in "Preventing Intersubjective faults in ICS" for more details. Note that the text of each fault definition is stored as a string constant in the provider code.
+
+
+In addition, to prevent spamming, users must pay a default fee of `100ATOM` to submit a fault resolution to the provider.
+ This amount is stored in a new `consumer-fault-resolution-fee` parameter of the `provider` module.
+
+### Validations
+
+The submission of a fault resolution succeeds only if all of the following conditions are met:
+
+- the consumer chain is an Opt-in chain
+- all listed validators were opted in to the consumer chain within the past unbonding period
+- the `100ATOM` fee is provided
+
+### States
+
+The following state is added to the `provider` module:
+
+* Timestamps that record when a validator opts in to or opts out of an Opt-in consumer chain.
+ Note that these timestamps can be pruned once an unbonding period has elapsed after the validator opts out.
+
+```golang
+  ConsumerValidatorSubscriptionTimestampPrefix | len(consumerID) | consumerID | valAddr | ProtocolBuffer(ConsumerValSubscriptionTimestamp)
+```
+
+```protobuf
+  message ConsumerValSubscriptionTimestamp {
+    // timestamp recording the last time a validator opted in to the consumer chain
+    google.protobuf.Timestamp join_time = 1;
+    // timestamp recording the last time a validator opted out of the consumer chain
+    google.protobuf.Timestamp leave_time = 2;
+  }
+```
+
+* Pre-defined slashing factor per consumer chain for each defined fault (optional).
+
+```golang
+  ConsumerFaultSlashFactorPrefix | len(consumerID) | consumerID | faultType -> SlashFactor
+```
+
+### Additional considerations
+
+Fault resolution proposals should be `expedited` to minimize the time the listed validators have
+ to unbond and thereby avoid punishment (see [Expedited Proposals](https://docs.cosmos.network/v0.50/build/modules/gov#expedited-proposals)).
+
+
+## Consequences
+
+### Positive
+
+- Provides the ability to slash and tombstone validators for committing incorrect executions on Opt-in consumer chains.
+
+### Negative
+
+- Assuming that malicious validators unbond immediately after misbehaving, a fault resolution has to be submitted within a maximum
+ of two weeks in order to slash the validators.
+
+### Neutral
+
+- Fault definitions need to have a clear framework in order to avoid debates about whether an attack has actually taken place.
+
+## References
+
+* [Preventing intersubjective faults in ICS](https://forum.cosmos.network/t/preventing-intersubjective-faults-in-ics/14103)
+
+* [Enabling Opt-in and Mesh Security with Fraud Votes](https://forum.cosmos.network/t/enabling-opt-in-and-mesh-security-with-fraud-votes/10901)
+
+* [CHIPs discussion phase: Partial Set Security](https://forum.cosmos.network/t/chips-discussion-phase-partial-set-security-updated/11775)
+
+* [Replicated vs. Mesh Security](https://informal.systems/blog/replicated-vs-mesh-security#risks-of-opt-in-security-also-known-as-ics-v-2)
+
+
+
diff --git a/docs/docs/adrs/intro.md b/docs/docs/adrs/intro.md
index d0fc7f3240..c5a75ca99a 100644
--- a/docs/docs/adrs/intro.md
+++ b/docs/docs/adrs/intro.md
@@ -50,6 +50,7 @@ To suggest an ADR, please make use of the [ADR template](https://github.com/cosm
 - [ADR 011: Improving testing and increasing confidence](./adr-011-improving-test-confidence.md)
 - [ADR 016: Security aggregation](./adr-016-securityaggregation.md)
 - [ADR 021: Consumer Chain Clients](./adr-021-consumer-chain-clients.md)
+- [ADR 022: Fault Resolutions](./adr-022-fault-resolutions.md)
 
 ### Rejected