Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(spec): Requirements on valid(v) #748

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
142 changes: 129 additions & 13 deletions specs/consensus/overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -413,7 +413,7 @@ correct behaviour provided that `p` is able to prove the existence of a

The [pseudo-code][pseudo-code] of the consensus algorithm includes calls to
functions that are not defined in the pseudo-code itself, but are assumed to be
implemented by the processes running the consensus protocol.
implemented by the context/application that uses the consensus protocol.

### Proposer Selection

Expand All @@ -437,18 +437,134 @@ height of consensus `h_p`.

### Validation

The external function `valid(v)` is invoked by a process when it receives a
`⟨PROPOSAL, h, r, v, *⟩` from the proposer of round `r` of height `h`.
It should return whether the `v` is a valid value according to the semantics of
the "client" of the consensus protocol, i.e., the application that uses
consensus to agree on proposed values.

> TODO: relevant observation:
> - Validation typically depends on `h` as well, in particular on the
> application state at blockchain height `h`. It should not ordinarily
> depend on `r`, since the application state should not change over rounds.
> - Determinism: is `valid(v)` a function?
> - Needed because of validity property of consensus, defined in the paper
The external function `valid` is used to guard any value-specific action in
the algorithm. The purpose is to restrict the domain of values that consensus
may decide. There is the assumption that if `v` is a value returned
by the `getValue()` function of a correct process in the current height, then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
by the `getValue()` function of a correct process in the current height, then
by the [`getValue()` function](#proposal-value) of a correct process in the current height, then

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a section on getValue().

`valid(v)` evaluates to `true`. Under this assumption, as long as the proposer
is correct, `valid(v)` does not serve any purpose. It only becomes relevant if
there is a faulty proposer; it limits "how bad" proposed values can be.

In the arXiv paper, the pseudo code uses `valid(v)`, that is, a function that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In the arXiv paper, the pseudo code uses `valid(v)`, that is, a function that
The [pseudo-code][pseudo-code] uses `valid(v)`, that is, a function that

maps a value to a boolean. Observe that this should be understood in terms
of a mathematical (pure) function:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
of a mathematical (pure) function:
of a mathematical (pure) function, with the following properties:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we can point to the "properties above" instead.

1. the function must only depend on `v` but not on other data
(e.g., an application state at a given point in time, the current local time
or temperature).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example is fun, eheh.

2. If invoked several times for the same input, the function always returns
the same boolean value (determinism).
3. If invoked on different processes for the same input, the function always
returns the same boolean value.

The correctness of the consensus algorithm depends heavily on these points.
Most obviously for termination, we require that all correct processes find
the value by a correct proposer valid under all circumstances, because we
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
the value by a correct proposer valid under all circumstances, because we
the value proposed by a correct process valid under all circumstances, because we

need them to all prevote for the value. A deviation in `valid` would
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
need them to all prevote for the value. A deviation in `valid` would
need them to accept the value. A deviation in `valid` would

result in some processes
prevoting nil, and consequently, there might never be enough votes for a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
prevoting nil, and consequently, there might never be enough votes for a
rejecting values proposed by correct processes, and consequently, there might never be enough votes to decide a value.

value to decide.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
value to decide.


#### `valid(v)` in implementations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#### `valid(v)` in implementations
#### Implementation


In implementations, the validity check typically is not pure:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In implementations, the validity check typically is not pure:
Implementations of the validity check are typically not pure:

as already mentioned in the arXiv paper "In the context of blockchain systems,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
as already mentioned in the arXiv paper "In the context of blockchain systems,
as already mentioned in the [Tendermint paper][tendermint-arxiv], "In the context of blockchain systems,

for example, a value is not valid if it does not
contain an appropriate hash of the last value (block) added to the blockchain." That is,
rather than `valid(v)` the implementation looks something like `valid(v, chain, height)`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding valid(v, chain, height), I agree with the discussion but I would use the nomenclature of the pseudo-code.

Instead of chain, we have the decision_p array for process p, and instead of height we use h_p.

The height here can actually be omitted, as by construction if h_p = H, then decision_p is filled from 0 to H-1. In other words, len(decision_p) = h_p is an invariant.

that is,
it depends on the state of the blockchain at a given height and a value. This implies that
a value that might be valid at blockchain height 5, might be invalid at height 6. Observe
that this, strictly speaking, violates Point 1. from above.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
that this, strictly speaking, violates Point 1. from above.
that this, strictly speaking, the above defined property 1.

However, as long as all processes agree on
the state of the chain up to height 5 when they
start consensus for height 6, this still satisfies Points 2 and 3 and it will not harm
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
start consensus for height 6, this still satisfies Points 2 and 3 and it will not harm
start consensus for height 6, this still satisfies properties 2 and 3 and it will not harm

liveness. (There have been cases of
non-determinism in the application level that led to processes disagreeing on the
application state and thus consensus being blocked at a given height)

**Remark.** We have seen slight (ab)uses of valid, that use external data. Consider he toy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will use a quote block (> ) for this paragraph.

example of the proposer proposing the current room temperature, and the processes
checking in `valid` whether the temperature is within one degree of their room
temperature. It is easy to see that this a priori violates the points 1-3 above.
One the one hand, this requires additional assumption on the environment to
ensure termination, on the other hand, it is impossible to check validity after
the fact (e.g., a late joiner cannot check validity of a value that processes have
decided some time ago). So this use is not encouraged, and we ignore it in the
remainder. However, for some applications this use case might still be beneficial: in
this case it is important to understand that one needs to make an argument (which
also needs to involve
the environment) why it
can be ensured that if `v` is a value returned
by the `getValue()` function of a correct process in the current height, then
`valid(v)` evaluates to `true`.

**Remark.** Point 1 above also forbids that the current consensus round influences
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will use a quote block (> ) for this paragraph.

validity. For instance, one may say to get out of the liveness issue from the previous
remark, to just find all blocks valid that have been proposed after, say, round 10.
Similarly, one might consider more stricter validity requirements for the proposal in
round 0 (the lucky path), while in unlucky situations one might want to simplify
reaching a decision by weakening validity. In general this leads to complexity down
the road when someone reads the decisions much later, and needs to understand the
different semantics of different blocks based on the decision round. We strongly
suggest to not use round numbers in validation of values for this reason.

#### Backward compatibility of `valid(v)`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#### Backward compatibility of `valid(v)`
#### Backwards compatibility


**Requirement 1 (Fixing bugs).**
There might be a bug in `valid`. Then it might be possible that due to a bug,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
There might be a bug in `valid`. Then it might be possible that due to a bug,
There might be a bug in the implementation of `valid(v)`. Then it might be possible that due to a bug,

`valid` returns `false`
for a value `v` proposed by a correct process, and we are stuck at a given height.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for a value `v` proposed by a correct process, and we are stuck at a given height.
for values proposed by correct processes, and we are stuck at a given height.

A way to get out of the
situation is to produce a new implementation `valid'` that returns `true` for `v`. So
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
situation is to produce a new implementation `valid'` that returns `true` for `v`. So
situation is to produce a new implementation of `valid(v)` that returns `true` for the values proposed by correct processes.

to be prepared for such a scenario we need to allow a change in the function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
to be prepared for such a scenario we need to allow a change in the function.
To be prepared for such a scenario we need to allow a change in the function.


**Requirement 2 (Future use).**
So if we allow changes to `valid`, we need to understand all uses of this function: Some
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
So if we allow changes to `valid`, we need to understand all uses of this function: Some
If we allow changes to `valid`, we need to understand all uses of this function. Some

synchronization protocols may use `valid` for consistency checks, for instance, if a node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
synchronization protocols may use `valid` for consistency checks, for instance, if a node
synchronization protocols may use `valid(v)` for consistency checks, for instance, if a node

fell behind, it might need to learn several past decisions. In doing so, it typically also
uses (the current version) of `valid` to check the decided values before accepting them.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
uses (the current version) of `valid` to check the decided values before accepting them.
uses (the current version) of `valid(v)` to check the decided values before accepting them.

In this scenario, a value decided in the past (potentially using a now old version of
`valid`) should be deemed valid with the current version of the function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`valid`) should be deemed valid with the current version of the function.
`valid(v)`) should be deemed valid with the current version of the function.


These two requirements lead us to the following requirement on the implementations:
we consider a sequence of `valid_i(v)` implementations, with increasing indexes `i`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
we consider a sequence of `valid_i(v)` implementations, with increasing indexes `i`,
we consider a sequence of `valid_i(v)` implementations, with increasing versions `i`,

so that to represent multiple _backwards compatible_ implementations of the validity checks.
Formally we require that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

`valid_i(v) == true` implies `valid_j(v) == true `, for `j > i` and every value `v`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`valid_i(v) == true` implies `valid_j(v) == true `, for `j > i` and every value `v`
`valid_i(v) == true` implies `valid_j(v) == true `, for `j > i` and every value `v`.


The logical implications allows to be more permissive in later versions (as required to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The logical implications allows to be more permissive in later versions (as required to
The logical implication allows newer versions of `valid(v)` to be more permissive (as required to

fix the bug), while maintaining the need that newer versions allow to check validity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fix the bug), while maintaining the need that newer versions allow to check validity
fix bugs), while ensuring that newer versions allow to check validity

of previously decided values.

#### Backward compatibility of `valid(v, chain, height)`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this section as an example of the requirement of the previous section.

Although I agree that it is only a legit example if v somehow included a height or h_p is also known by the valid(v) implementation. Not sure if we need this. Maybe a pointer to issue 510 is enough here.


As mentioned above, a typical use case is that the validity depends on the
state of the blockchain at a given height. In this scenario, the two requirements
from above can be achieved without the implication:
The idea is that if `valid` should be changed in that from height
`h`on, a function `new_valid` should be used to check validity, while up to height
`h-1` the function `old_valid` has been used, then we can get something
along these lines:
```
valid(v, chain, height) =
if (height >= h)
new_valid(v, chain, height)
else
old_valid(v, chain, height)
```

In this way, we effectively split evaluation of `valid` into different height
intervals. Whenever
a new version is needed, and new interval is added on top. Moreover
if in
the future validity of a value is checked with a new version of `valid`, effectively
the logic that was in place for that height is used by using the proper "if" branch.
This addresses Requirement 1 and 2 from above.

TODO: I think in this [remark](https://github.com/informalsystems/malachite/issues/510#issuecomment-2505632410), different use cases (soft upgrades) are mixed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, the example in the code above is almost the definition of software upgrades.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linked comment is a short-back of the code snippet and the software upgrades approach. This is the case where a bug identified during a given height forces us to the most general approach described in the "Backwards Compatibility" section.

I am not sure what should
be captured in this document.

## Primitives

Expand Down
Loading