
Return groups #372

Open · gabriel-barrett wants to merge 17 commits into main from return-chips

Conversation

gabriel-barrett (Contributor) commented Nov 6, 2024

This PR allows you to specify chips for groups of returns in Lair code. For instance, in the Lurk evaluator, we could create a group for all returns that are errors, like

    let (env_tag, new_env) = call(eval, env_expr_tag, env_expr, env);
    match env_tag {
        Tag::Err => {
            #[group=err]
            return (env_tag, new_env)
        }
        Tag::Env => {
            let (res_tag, res) = call(eval, res_tag, res, new_env);
            return (res_tag, res)
        }
    };
    #[group=err]
    let err = EvalErr::NotEnv;
    return (err_tag, err)

so that it splits the Lurk eval chip into 2 different chips using the same lookup index, minimizing the cost of successful computations.
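
The core idea can be sketched in plain Rust. This is an illustrative sketch only, not the crate's actual `split` implementation: each return site is tagged with a `ReturnGroup` id, and splitting partitions the function's execution paths by that id so each group can become its own chip. The helper `split_by_group` and the string "paths" are made up for illustration.

```rust
use std::collections::BTreeMap;

// Illustrative only: each return site carries a ReturnGroup id; splitting
// buckets the execution paths by that id so each bucket becomes one chip.
type ReturnGroup = u8;

fn split_by_group<'a>(
    paths: Vec<(ReturnGroup, &'a str)>,
) -> BTreeMap<ReturnGroup, Vec<&'a str>> {
    let mut groups: BTreeMap<ReturnGroup, Vec<&'a str>> = BTreeMap::new();
    for (group, path) in paths {
        // All paths that end in the same return group land in the same chip.
        groups.entry(group).or_default().push(path);
    }
    groups
}
```

In the evaluator example above, the two `#[group=err]` returns would land in one bucket and the successful `Tag::Env` path in another.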

I've used this to optimize the compiled version of Lurk. Here are the benchmarks:

|-----------------------------+-------------+----------+-----------+----------+------------------------|
|                             | Interpreted | -------> | Compiled  | -------> | Compiled and optimized |
|-----------------------------+-------------+----------+-----------+----------+------------------------|
| fib-evaluation-100000       | 3.7002 s    | -31.072% | 2.5505 s  | -1.6857% | 2.5075 s               |
| sum-evaluation-100000       | 2.9119 s    | -22.548% | 2.2553 s  | -1.9032% | 2.2124 s               |
| lcs-evaluation              | 893.18 ms   | -28.116% | 642.97 ms | -7.1679% | 596.89 ms              |
|-----------------------------+-------------+----------+-----------+----------+------------------------|
| fib-trace-generation-100000 | 168.65 ms   | -11.642% | 148.09 ms | -17.142% | 122.05 ms              |
| sum-trace-generation-100000 | 166.67 ms   | -4.5229% | 157.96 ms | -4.0621% | 148.92 ms              |
| lcs-trace-generation        | 76.110 ms   | -5.5693% | 71.657 ms | -52.393% | 33.907 ms              |
|-----------------------------+-------------+----------+-----------+----------+------------------------|
| fib-e2e-100000              | 24.857 s    | -30.077% | 17.381 s  | -18.639% | 14.141 s               |
| sum-e2e-100000              | 23.481 s    | -18.080% | 19.235 s  | -25.113% | 14.405 s               |
| lcs-e2e                     | 10.003 s    | -28.218% | 7.1690 s  | -37.722% | 4.4647 s               |
|-----------------------------+-------------+----------+-----------+----------+------------------------|

@gabriel-barrett gabriel-barrett force-pushed the return-chips branch 7 times, most recently from ce11eaa to 24e79a7 Compare November 8, 2024 01:22
@gabriel-barrett gabriel-barrett marked this pull request as ready for review November 8, 2024 01:44
@gabriel-barrett gabriel-barrett force-pushed the return-chips branch 13 times, most recently from d5f2ea0 to 1a80a15 Compare November 12, 2024 13:49

arthurpaulino (Contributor) left a comment:

For this first review round I'm mostly concerned with documentation. The `provable` attribute from `QueryRecord` really needs some explanation.

@@ -136,14 +136,17 @@ impl<AB: AirBuilder + MessageBuilder<AirInteraction<AB::Expr>>> LookupBuilder fo
pub struct Record {
pub nonce: u32,
pub count: u32,
// Original query that did the lookup. `None` is for the root lookup

Contributor:

Nit: please turn this into an actual docstring

Comment on lines -523 to -525
if let Some(block) = cases.default.as_ref() {
process(block)
};

Contributor:

Why isn't the block for the default case dealt with anymore?

wwared (Contributor) commented Nov 13, 2024:

I believe (but would also like confirmation, since I'm not sure and had the same question) that the default case is now handled by the `split` function when it transforms the match blocks into `Choose`/`ChooseMany` blocks.

@@ -143,6 +143,8 @@ pub struct Func<F> {
pub(crate) input_size: usize,
pub(crate) output_size: usize,
pub(crate) body: Block<F>,
// This last field is purely to catch potential bugs

Contributor:

Nit: please turn this into an actual docstring. Also, it's worth mentioning what kinds of bugs it aims to prevent and how/why.

@@ -72,6 +82,7 @@ pub struct QueryRecord<F: PrimeField32> {
pub(crate) bytes: BytesRecord,
pub(crate) emitted: Vec<List<F>>,
pub(crate) debug_data: DebugData,
pub(crate) provable: bool,

Contributor:

How come? Definitely worth a docstring.

Contributor:

Yeah, a docstring would be useful. As far as I can tell, the `provable` flag indicates whether the nonces should be "fixed" at the end, which is a requirement for proving but not for execution alone. Not sure if there is some other difference.

@@ -278,6 +288,7 @@ impl<F: PrimeField32> QueryRecord<F> {
bytes: BytesRecord::default(),
emitted: vec![],
debug_data: DebugData::default(),
provable: true,

Contributor:

Why?

@@ -370,6 +381,71 @@ impl<F: PrimeField32> QueryRecord<F> {
pub fn expect_public_values(&self) -> &[F] {
self.public_values.as_ref().expect("Public values not set")
}

fn nonce_map<C1: Chipset<F>, C2: Chipset<F>>(

Contributor:

`nonce_map` and `fix_nonces` would really benefit from docstrings.

@@ -92,6 +120,7 @@ impl<F> Func<F> {
&self,
toplevel: &Toplevel<F, C1, C2>,
) -> LayoutSizes {
assert!(self.split);

Contributor:

What does this mean?

Comment on lines -156 to -158
if let Some(block) = cases.default.as_ref() {
process(block)
};

Contributor:

What about the default case?

Comment on lines +10 to +11
/// This struct holds the full Lair function and the split functions, which contains
/// only the paths belonging to the same group

Contributor:

🙏🏼

@@ -879,3 +932,118 @@ impl<F: Field + Ord> OpE<F> {
}
}
}

impl<F: Field + Ord> Func<F> {
fn split(&self, groups: &FxIndexSet<ReturnGroup>) -> FxIndexMap<ReturnGroup, Func<F>> {

Contributor:

`split` would really benefit from a docstring with a high-level explanation of how "splitting" works.

Comment on lines +144 to +152
const ERR_GROUP: ReturnGroup = 1;
const IMMEDIATE_GROUP: ReturnGroup = 2;
const UNOP_GROUP: ReturnGroup = 3;
const BINOP_MISC_GROUP: ReturnGroup = 4;
const BINOP_U64_GROUP: ReturnGroup = 5;
const BINOP_NUM_GROUP: ReturnGroup = 6;
const BINOP_BIG_GROUP: ReturnGroup = 7;
const MISC_GROUP: ReturnGroup = 8;
const U64_DIV_GROUP: ReturnGroup = 9;

wwared (Contributor) commented Nov 13, 2024:

I understand why they're constants like this, but I feel like it would be better if we could instead have an enum with `#[repr(u8)]` (or `#[repr(ReturnGroup)]`, if that's valid) or something similar and use the variants, something like `Group::Err` instead of `ERR_GROUP`. Having an enum ensures entries are sequential and that there is no overlap. I'm not sure how an enum would interact with the `ReturnGroup` newtype though.

More of a minor suggestion, this refactor can be done later on.
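
A minimal sketch of that suggestion, assuming `ReturnGroup` stays a `u8`-sized id; the `Group` enum and its `id` helper are hypothetical names, not part of the PR:

```rust
// Hypothetical refactor: a #[repr(u8)] field-less enum whose discriminants
// replace the bare ERR_GROUP-style constants. The compiler rejects duplicate
// discriminants, so non-overlap holds by construction.
#[repr(u8)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Group {
    Err = 1,
    Immediate = 2,
    Unop = 3,
    BinopMisc = 4,
    BinopU64 = 5,
    BinopNum = 6,
    BinopBig = 7,
    Misc = 8,
    U64Div = 9,
}

impl Group {
    // Convert a variant into the raw group id.
    fn id(self) -> u8 {
        self as u8
    }
}
```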

Comment on lines +970 to +978
expect_eq(eval_misc.width(), expect!["79"]);
expect_eq(eval_unop.width(), expect!["115"]);
expect_eq(eval_binop_misc.width(), expect!["114"]);
expect_eq(eval_binop_u64.width(), expect!["163"]);
expect_eq(eval_binop_num.width(), expect!["81"]);
expect_eq(eval_binop_big.width(), expect!["154"]);
expect_eq(eval_u64_div.width(), expect!["242"]);
expect_eq(eval_err.width(), expect!["127"]);
expect_eq(eval_immediate.width(), expect!["34"]);

Contributor:

When you do `eval_unop.width()`, does this give the width of the main group (id 0)? Or does it give the width before any splitting happens?

I also think it would be valuable to track the width of each chip after splitting, so we can have an idea of the costs of each return group chip variant. Maybe add another method like `all_group_widths()` that returns an array of the widths and track that in the test as well?

full_func,
split_funcs,
} in toplevel.func_map.values()
{

Contributor:

As a suggestion for a slight optimization, we could check whether `full_func`/`split_funcs` actually has more than one split (since funcs that don't use return groups never get split) and only add it to the `NonceMap` if there is more than one split func.

Then, in fix_nonces below, change update_nonce to be something like

        let update_nonce = |lookup: &mut Record| {
            if let Some(index) = lookup.query_index {
                if let Some(&nonce) = map.get(&(index, lookup.nonce)) {
                    lookup.nonce = nonce;
                }
            }
        };

instead.

This should make the nonce map smaller by skipping the trivial entries for funcs that have only a single return group.

(Though I recognize a benefit of keeping the nonce map "complete" and using the `.unwrap()` in `update_nonce`: if a nonce/lookup is unaccounted for in the map, the unwrap panics, which is easier to debug than silently ignoring it. So I'm also ok with leaving this as-is for now and doing other minor optimizations later, when they would be more impactful.)

Contributor:

Actually, now that I think about sharding, it's better if the `nonce_map` is actually complete and left as-is; otherwise, when we shard chips down the road, this optimization would be a lot of extra confusion for very minor benefit. You can ignore this suggestion.

     let val = map[*var].0;
-    let branch = cases.match_case(&val).expect("No match");
+    let i = cases.match_case(&val).expect("No match");
+    let branch = &branches[*i];

Contributor:

Can this `&branches[i]` end up as an out-of-bounds access if `match_case` returns the index for the default case?

gabriel-barrett (Contributor, Author):

No, since the default case is also added to the list of unique branches
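
That invariant can be sketched as follows. The `Cases` shape and `match_case` signature here are hypothetical, meant only to show why the returned index can never be out of bounds once the default branch is stored in the same branch list:

```rust
// Hypothetical sketch: the branch list holds every explicit case *and* the
// default case, and match_case returns an index into that single list, so
// indexing branches with the result is always in bounds.
struct Cases {
    // (matched value, index into the shared branch list)
    cases: Vec<(u32, usize)>,
    // index of the default branch, also inside the same branch list
    default_index: Option<usize>,
}

impl Cases {
    fn match_case(&self, val: &u32) -> Option<usize> {
        self.cases
            .iter()
            .find(|(v, _)| v == val)
            .map(|&(_, i)| i)
            .or(self.default_index)
    }
}
```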

Comment on lines +66 to +70
pub fn push_provide(&mut self, index: &mut ColumnIndex, lookup: Record) {
let provide = lookup.into_provide();
self.push_aux(index, provide.last_nonce);
self.push_aux(index, provide.last_count);
}

Contributor:

This will be very useful for funcs that provide multiple values! 🙏

}
let ctrl = Ctrl::Return(*return_ident, out.clone(), group);
let return_idents = vec![*return_ident];
*return_ident += 1;

wwared (Contributor) commented Nov 13, 2024:

This +1 here means that the return idents all need to be sequential, correct? So a Lair func like

    match env_tag {
        Tag::Err => {
            #[group=2]
            return (env_tag, new_env)
        }
        Tag::Env => {
            #[group=4]
            let (res_tag, res) = call(eval, res_tag, res, new_env);
            return (res_tag, res)
        }
    };

Would end up as a miscompilation or panic since there is a gap between return groups 0 and 1 (and 2 and 4), and the split function starts from return group 0 and increments the index by 1 each time, right?

This is perfectly fine as-is (this is all internal to lair after all), but this should be explicitly documented somewhere, maybe in the docstring for the ReturnGroup type, saying it must start at 1 and there must be no gaps in the indices.

EDIT: I'm not sure what I said above is actually true, but if it is, it definitely should be mentioned in a docstring somewhere
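
If the invariant does hold, a small validation pass could enforce it when a func is built. A sketch, where the function name is an assumption and the question of whether real Lair groups must start at 0 or 1 is deliberately left open, as it is in the comment above:

```rust
// Hypothetical check: after sorting and deduplicating the group ids used by
// a function, they must form one contiguous run with no gaps, whatever the
// starting id is.
fn groups_are_gapless(groups: &[u8]) -> bool {
    let mut ids: Vec<u8> = groups.to_vec();
    ids.sort_unstable();
    ids.dedup();
    // Every adjacent pair of distinct ids must differ by exactly 1.
    ids.windows(2).all(|w| w[1] == w[0] + 1)
}
```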
