
Return groups #372

Open · gabriel-barrett wants to merge 17 commits into main from return-chips

Conversation

gabriel-barrett (Contributor) commented Nov 6, 2024

This PR allows you to specify chips for groups of returns in Lair code. For instance, in the Lurk evaluator, we could create a group for all returns that are errors, like

    let (env_tag, new_env) = call(eval, env_expr_tag, env_expr, env);
    match env_tag {
        Tag::Err => {
            #[group=err]
            return (env_tag, new_env)
        }
        Tag::Env => {
            let (res_tag, res) = call(eval, res_tag, res, new_env);
            return (res_tag, res)
        }
    };
    #[group=err]
    let err = EvalErr::NotEnv;
    return (err_tag, err)

so that it splits the Lurk eval chip into 2 different chips using the same lookup index, minimizing the cost of successful computations.
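
The core idea can be sketched in plain Rust. This is an illustrative sketch only, not the crate's actual `split` implementation: each return site is tagged with a `ReturnGroup` id, and splitting partitions the function's execution paths by that id so each group can become its own chip. The helper `split_by_group` and the string "paths" are made up for illustration.

```rust
use std::collections::BTreeMap;

// Illustrative only: each return site carries a ReturnGroup id; splitting
// buckets the execution paths by that id so each bucket becomes one chip.
type ReturnGroup = u8;

fn split_by_group<'a>(
    paths: Vec<(ReturnGroup, &'a str)>,
) -> BTreeMap<ReturnGroup, Vec<&'a str>> {
    let mut groups: BTreeMap<ReturnGroup, Vec<&'a str>> = BTreeMap::new();
    for (group, path) in paths {
        // All paths that end in the same return group land in the same chip.
        groups.entry(group).or_default().push(path);
    }
    groups
}
```

In the evaluator example above, the two `#[group=err]` returns would land in one bucket and the successful `Tag::Env` path in another.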

I've used this to optimize the compiled version of Lurk. Here are the benchmarks:

|-----------------------------+-------------+----------+-----------+----------+------------------------|
|                             | Interpreted | -------> | Compiled  | -------> | Compiled and optimized |
|-----------------------------+-------------+----------+-----------+----------+------------------------|
| fib-evaluation-100000       | 3.7002 s    | -31.072% | 2.5505 s  | -1.6857% | 2.5075 s               |
| sum-evaluation-100000       | 2.9119 s    | -22.548% | 2.2553 s  | -1.9032% | 2.2124 s               |
| lcs-evaluation              | 893.18 ms   | -28.116% | 642.97 ms | -7.1679% | 596.89 ms              |
|-----------------------------+-------------+----------+-----------+----------+------------------------|
| fib-trace-generation-100000 | 168.65 ms   | -11.642% | 148.09 ms | -17.142% | 122.05 ms              |
| sum-trace-generation-100000 | 166.67 ms   | -4.5229% | 157.96 ms | -4.0621% | 148.92 ms              |
| lcs-trace-generation        | 76.110 ms   | -5.5693% | 71.657 ms | -52.393% | 33.907 ms              |
|-----------------------------+-------------+----------+-----------+----------+------------------------|
| fib-e2e-100000              | 24.857 s    | -30.077% | 17.381 s  | -18.639% | 14.141 s               |
| sum-e2e-100000              | 23.481 s    | -18.080% | 19.235 s  | -25.113% | 14.405 s               |
| lcs-e2e                     | 10.003 s    | -28.218% | 7.1690 s  | -37.722% | 4.4647 s               |
|-----------------------------+-------------+----------+-----------+----------+------------------------|

@gabriel-barrett gabriel-barrett force-pushed the return-chips branch 7 times, most recently from ce11eaa to 24e79a7 Compare November 8, 2024 01:22
@gabriel-barrett gabriel-barrett marked this pull request as ready for review November 8, 2024 01:44
@gabriel-barrett gabriel-barrett force-pushed the return-chips branch 13 times, most recently from d5f2ea0 to 1a80a15 Compare November 12, 2024 13:49

arthurpaulino (Contributor) left a comment:

For this first review round I'm mostly concerned with documentation. The `provable` attribute from `QueryRecord` really needs some explanation.

@@ -136,14 +136,17 @@ impl<AB: AirBuilder + MessageBuilder<AirInteraction<AB::Expr>>> LookupBuilder fo
pub struct Record {
pub nonce: u32,
pub count: u32,
// Original query that did the lookup. `None` is for the root lookup

Contributor:

Nit: please turn this into an actual docstring

Comment on lines -523 to -525
if let Some(block) = cases.default.as_ref() {
process(block)
};

Contributor:

Why isn't the block for the default case dealt with anymore?

wwared (Contributor) commented Nov 13, 2024:

I believe (but would also like confirmation, since I'm not sure and had the same question) that the default case is now handled by the `split` function when it transforms the match blocks into `Choose`/`ChooseMany` blocks.

@@ -143,6 +143,8 @@ pub struct Func<F> {
pub(crate) input_size: usize,
pub(crate) output_size: usize,
pub(crate) body: Block<F>,
// This last field is purely to catch potential bugs

Contributor:

Nit: please turn this into an actual docstring. Also, it's worth mentioning what kinds of bugs it aims to prevent and how/why.

@@ -72,6 +82,7 @@ pub struct QueryRecord<F: PrimeField32> {
pub(crate) bytes: BytesRecord,
pub(crate) emitted: Vec<List<F>>,
pub(crate) debug_data: DebugData,
pub(crate) provable: bool,

Contributor:

How come? Definitely worth a docstring.

Contributor:

Yeah, a docstring would be useful. As far as I can tell, the `provable` flag indicates whether the nonces should be "fixed" at the end, which is a requirement for proving but not for execution alone. Not sure if there is some other difference.

@@ -278,6 +288,7 @@ impl<F: PrimeField32> QueryRecord<F> {
bytes: BytesRecord::default(),
emitted: vec![],
debug_data: DebugData::default(),
provable: true,

Contributor:

Why?

@@ -370,6 +381,71 @@ impl<F: PrimeField32> QueryRecord<F> {
pub fn expect_public_values(&self) -> &[F] {
self.public_values.as_ref().expect("Public values not set")
}

fn nonce_map<C1: Chipset<F>, C2: Chipset<F>>(

Contributor:

`nonce_map` and `fix_nonces` would really benefit from docstrings.

@@ -92,6 +120,7 @@ impl<F> Func<F> {
&self,
toplevel: &Toplevel<F, C1, C2>,
) -> LayoutSizes {
assert!(self.split);

Contributor:

What does this mean?

Comment on lines -156 to -158
if let Some(block) = cases.default.as_ref() {
process(block)
};

Contributor:

What about the default case?

Comment on lines +10 to +11
/// This struct holds the full Lair function and the split functions, which contains
/// only the paths belonging to the same group

Contributor:

🙏🏼

@@ -879,3 +932,118 @@ impl<F: Field + Ord> OpE<F> {
}
}
}

impl<F: Field + Ord> Func<F> {
fn split(&self, groups: &FxIndexSet<ReturnGroup>) -> FxIndexMap<ReturnGroup, Func<F>> {

Contributor:

`split` would really benefit from a docstring with a high-level explanation of how "splitting" works.

Comment on lines +144 to +152
const ERR_GROUP: ReturnGroup = 1;
const IMMEDIATE_GROUP: ReturnGroup = 2;
const UNOP_GROUP: ReturnGroup = 3;
const BINOP_MISC_GROUP: ReturnGroup = 4;
const BINOP_U64_GROUP: ReturnGroup = 5;
const BINOP_NUM_GROUP: ReturnGroup = 6;
const BINOP_BIG_GROUP: ReturnGroup = 7;
const MISC_GROUP: ReturnGroup = 8;
const U64_DIV_GROUP: ReturnGroup = 9;

wwared (Contributor) commented Nov 13, 2024:

I understand why they're constants like this, but I feel like it would be better if we could instead have an enum with `#[repr(u8)]` (or `#[repr(ReturnGroup)]`, if that's valid) or something similar and use the variants, something like `Group::Err` instead of `ERR_GROUP`. Having an enum ensures entries are sequential and that there is no overlap. I'm not sure how an enum would interact with the `ReturnGroup` newtype though.

More of a minor suggestion, this refactor can be done later on.
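
A minimal sketch of that suggestion, assuming `ReturnGroup` stays a `u8`-sized id; the `Group` enum and its `id` helper are hypothetical names, not part of the PR:

```rust
// Hypothetical refactor: a #[repr(u8)] field-less enum whose discriminants
// replace the bare ERR_GROUP-style constants. The compiler rejects duplicate
// discriminants, so non-overlap holds by construction.
#[repr(u8)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Group {
    Err = 1,
    Immediate = 2,
    Unop = 3,
    BinopMisc = 4,
    BinopU64 = 5,
    BinopNum = 6,
    BinopBig = 7,
    Misc = 8,
    U64Div = 9,
}

impl Group {
    // Convert a variant into the raw group id.
    fn id(self) -> u8 {
        self as u8
    }
}
```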

Comment on lines +970 to +978
expect_eq(eval_misc.width(), expect!["79"]);
expect_eq(eval_unop.width(), expect!["115"]);
expect_eq(eval_binop_misc.width(), expect!["114"]);
expect_eq(eval_binop_u64.width(), expect!["163"]);
expect_eq(eval_binop_num.width(), expect!["81"]);
expect_eq(eval_binop_big.width(), expect!["154"]);
expect_eq(eval_u64_div.width(), expect!["242"]);
expect_eq(eval_err.width(), expect!["127"]);
expect_eq(eval_immediate.width(), expect!["34"]);

Contributor:

When you do `eval_unop.width()`, does this give the width of the main group (id 0)? Or does it give the width before any splitting happens?

I also think it would be valuable to track the width of each chip after splitting, so we can have an idea of the costs of each return group chip variant. Maybe add another method like `all_group_widths()` that returns an array of the widths and track that in the test as well?

full_func,
split_funcs,
} in toplevel.func_map.values()
{

Contributor:

As a suggestion for a slight optimization, we could check whether `full_func`/`split_funcs` actually has more than one split (since funcs that don't use return groups never get split) and only add it to the `NonceMap` if there is more than one split func.

Then, in fix_nonces below, change update_nonce to be something like

        let update_nonce = |lookup: &mut Record| {
            if let Some(index) = lookup.query_index {
                if let Some(&nonce) = map.get(&(index, lookup.nonce)) {
                    lookup.nonce = nonce;
                }
            }
        };

instead.

This should make the nonce map smaller by skipping the trivial entries for funcs that have only a single return group.

(Though I recognize a benefit of keeping the nonce map "complete" and using the `.unwrap()` in `update_nonce`: if a nonce/lookup is unaccounted for in the map, the unwrap panics, which is easier to debug than silently ignoring it. So I'm also ok with leaving this as-is for now and doing other minor optimizations later, when they would be more impactful.)

Contributor:

Actually, now that I think about sharding, it's better if the `nonce_map` is actually complete and left as-is; otherwise, when we shard chips down the road, this optimization would be a lot of extra confusion for very minor benefit. You can ignore this suggestion.

     let val = map[*var].0;
-    let branch = cases.match_case(&val).expect("No match");
+    let i = cases.match_case(&val).expect("No match");
+    let branch = &branches[*i];

Contributor:

Can this `&branches[i]` end up as an out-of-bounds access if `match_case` returns the index for the default case?

gabriel-barrett (Contributor, Author):

No, since the default case is also added to the list of unique branches
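
That invariant can be sketched as follows. The `Cases` shape and `match_case` signature here are hypothetical, meant only to show why the returned index can never be out of bounds once the default branch is stored in the same branch list:

```rust
// Hypothetical sketch: the branch list holds every explicit case *and* the
// default case, and match_case returns an index into that single list, so
// indexing branches with the result is always in bounds.
struct Cases {
    // (matched value, index into the shared branch list)
    cases: Vec<(u32, usize)>,
    // index of the default branch, also inside the same branch list
    default_index: Option<usize>,
}

impl Cases {
    fn match_case(&self, val: &u32) -> Option<usize> {
        self.cases
            .iter()
            .find(|(v, _)| v == val)
            .map(|&(_, i)| i)
            .or(self.default_index)
    }
}
```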

Comment on lines +66 to +70
pub fn push_provide(&mut self, index: &mut ColumnIndex, lookup: Record) {
let provide = lookup.into_provide();
self.push_aux(index, provide.last_nonce);
self.push_aux(index, provide.last_count);
}

Contributor:

This will be very useful for funcs that provide multiple values! 🙏

}
let ctrl = Ctrl::Return(*return_ident, out.clone(), group);
let return_idents = vec![*return_ident];
*return_ident += 1;

wwared (Contributor) commented Nov 13, 2024:

This +1 here means that the return idents all need to be sequential, correct? So a Lair func like

    match env_tag {
        Tag::Err => {
            #[group=2]
            return (env_tag, new_env)
        }
        Tag::Env => {
            #[group=4]
            let (res_tag, res) = call(eval, res_tag, res, new_env);
            return (res_tag, res)
        }
    };

Would end up as a miscompilation or panic since there is a gap between return groups 0 and 1 (and 2 and 4), and the split function starts from return group 0 and increments the index by 1 each time, right?

This is perfectly fine as-is (this is all internal to lair after all), but this should be explicitly documented somewhere, maybe in the docstring for the ReturnGroup type, saying it must start at 1 and there must be no gaps in the indices.

EDIT: I'm not sure what I said above is actually true, but if it is, it definitely should be mentioned in a docstring somewhere
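
If the invariant does hold, a small validation pass could enforce it when a func is built. A sketch, where the function name is an assumption and the question of whether real Lair groups must start at 0 or 1 is deliberately left open, as it is in the comment above:

```rust
// Hypothetical check: after sorting and deduplicating the group ids used by
// a function, they must form one contiguous run with no gaps, whatever the
// starting id is.
fn groups_are_gapless(groups: &[u8]) -> bool {
    let mut ids: Vec<u8> = groups.to_vec();
    ids.sort_unstable();
    ids.dedup();
    // Every adjacent pair of distinct ids must differ by exactly 1.
    ids.windows(2).all(|w| w[1] == w[0] + 1)
}
```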
