Return groups #372
base: main
Conversation
For this first review round I'm mostly concerned with documentation. The `provable` attribute from `QueryRecord` really needs some explanation.
@@ -136,14 +136,17 @@ impl<AB: AirBuilder + MessageBuilder<AirInteraction<AB::Expr>>> LookupBuilder fo
pub struct Record {
    pub nonce: u32,
    pub count: u32,
    // Original query that did the lookup. `None` is for the root lookup
Nit: please turn this into an actual docstring
if let Some(block) = cases.default.as_ref() {
    process(block)
};
Why isn't the block for the default case dealt with anymore?
I believe (but would also like confirmation since I'm not sure and I had the same question) that the default case now gets handled by the `split` function when transforming the `match` blocks into `Choose`/`ChooseMany` blocks.
@@ -143,6 +143,8 @@ pub struct Func<F> {
    pub(crate) input_size: usize,
    pub(crate) output_size: usize,
    pub(crate) body: Block<F>,
    // This last field is purely to catch potential bugs
Nit: please turn this into an actual docstring. Also, it's worth mentioning what kinds of bugs it aims to prevent and how/why.
@@ -72,6 +82,7 @@ pub struct QueryRecord<F: PrimeField32> {
    pub(crate) bytes: BytesRecord,
    pub(crate) emitted: Vec<List<F>>,
    pub(crate) debug_data: DebugData,
    pub(crate) provable: bool,
How come? Definitely worth a docstring.
Yeah, a docstring would be useful. As far as I can tell, the `provable` flag indicates whether the nonces should be "fixed" at the end or not, which is a requirement for proving but not needed for execution only. Not sure if there is some other difference.
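A minimal sketch of what such a docstring could say, assuming the behavior described above (the wrapper struct is just a stand-in so the snippet compiles on its own):

```rust
/// Stand-in for `QueryRecord`; only the documented field matters here.
pub struct QueryRecordSketch {
    /// Whether this record is meant to be used for proving. When `true`, the
    /// nonces are "fixed" (rewritten to be consistent across the split chips)
    /// after execution; execution-only runs can skip that step.
    pub provable: bool,
}
```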
@@ -278,6 +288,7 @@ impl<F: PrimeField32> QueryRecord<F> {
    bytes: BytesRecord::default(),
    emitted: vec![],
    debug_data: DebugData::default(),
    provable: true,
Why?
@@ -370,6 +381,71 @@ impl<F: PrimeField32> QueryRecord<F> {
    pub fn expect_public_values(&self) -> &[F] {
        self.public_values.as_ref().expect("Public values not set")
    }

    fn nonce_map<C1: Chipset<F>, C2: Chipset<F>>(
`nonce_map` and `fix_nonces` would really benefit from docstrings.
@@ -92,6 +120,7 @@ impl<F> Func<F> {
        &self,
        toplevel: &Toplevel<F, C1, C2>,
    ) -> LayoutSizes {
        assert!(self.split);
What does this mean?
if let Some(block) = cases.default.as_ref() {
    process(block)
};
What about the default case?
/// This struct holds the full Lair function and the split functions, which contain
/// only the paths belonging to the same group
🙏🏼
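For reference, a minimal sketch of the shape that docstring describes (the type names here are simplified stand-ins and the map type is an assumption):

```rust
use std::collections::BTreeMap;

/// Simplified stand-ins for the real Lair types.
pub struct Func;
pub type ReturnGroup = u8;

/// A function entry: the full Lair function plus one split function per
/// return group, each containing only the paths belonging to that group.
pub struct FuncEntry {
    pub full_func: Func,
    pub split_funcs: BTreeMap<ReturnGroup, Func>,
}
```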
@@ -879,3 +932,118 @@ impl<F: Field + Ord> OpE<F> {
        }
    }
}

impl<F: Field + Ord> Func<F> {
    fn split(&self, groups: &FxIndexSet<ReturnGroup>) -> FxIndexMap<ReturnGroup, Func<F>> {
`split` would really benefit from a docstring with a high-level explanation of how "splitting" works.
const ERR_GROUP: ReturnGroup = 1;
const IMMEDIATE_GROUP: ReturnGroup = 2;
const UNOP_GROUP: ReturnGroup = 3;
const BINOP_MISC_GROUP: ReturnGroup = 4;
const BINOP_U64_GROUP: ReturnGroup = 5;
const BINOP_NUM_GROUP: ReturnGroup = 6;
const BINOP_BIG_GROUP: ReturnGroup = 7;
const MISC_GROUP: ReturnGroup = 8;
const U64_DIV_GROUP: ReturnGroup = 9;
I understand why they're constants like this, but I feel like it would be better if we could instead have an enum with `#[repr(u8)]` (or `#[repr(ReturnGroup)]` if that's valid) or something similar and use the variants, something like `Group::Err` instead of `ERR_GROUP`. Having an enum ensures entries are sequential and that there is no overlap. I'm not sure how an enum would interact with the `ReturnGroup` newtype, though.

More of a minor suggestion; this refactor can be done later on.
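A rough sketch of that suggestion (the variant names, the `u8` representation, and the conversion helper are all assumptions):

```rust
/// Hypothetical replacement for the `*_GROUP` constants. The discriminants
/// mirror the current constant values, so gaps and overlaps cannot be
/// introduced by accident.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Group {
    Err = 1,
    Immediate = 2,
    Unop = 3,
    BinopMisc = 4,
    BinopU64 = 5,
    BinopNum = 6,
    BinopBig = 7,
    Misc = 8,
    U64Div = 9,
}

impl Group {
    /// Convert to the raw group id (whatever integer `ReturnGroup` wraps).
    pub fn id(self) -> u8 {
        self as u8
    }
}
```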
expect_eq(eval_misc.width(), expect!["79"]);
expect_eq(eval_unop.width(), expect!["115"]);
expect_eq(eval_binop_misc.width(), expect!["114"]);
expect_eq(eval_binop_u64.width(), expect!["163"]);
expect_eq(eval_binop_num.width(), expect!["81"]);
expect_eq(eval_binop_big.width(), expect!["154"]);
expect_eq(eval_u64_div.width(), expect!["242"]);
expect_eq(eval_err.width(), expect!["127"]);
expect_eq(eval_immediate.width(), expect!["34"]);
When you do `eval_unop.width()`, does this give the width of the main group (id 0)? Or does it give the width before any splitting happens?

I also think it would be valuable to track the width of each chip after splitting, so we can have an idea of the costs of each return group chip variant. Maybe add another method like `all_group_widths()` that returns an array of the widths and track that in the test as well?
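A self-contained sketch of the idea (the `GroupChip` type, the map shape, and the helper name are assumptions; in the real code this would live next to the existing `width()` method):

```rust
use std::collections::BTreeMap;

/// Stand-in for a per-group chip; only its width matters for this sketch.
struct GroupChip {
    width: usize,
}

/// Hypothetical helper: collect the width of every return-group chip so the
/// expect-tests can track each variant's cost individually.
fn all_group_widths(chips: &BTreeMap<u8, GroupChip>) -> Vec<(u8, usize)> {
    chips
        .iter()
        .map(|(group, chip)| (*group, chip.width))
        .collect()
}

fn main() {
    // Example widths taken from the test above (err = 127, immediate = 34).
    let mut chips = BTreeMap::new();
    chips.insert(1, GroupChip { width: 127 });
    chips.insert(2, GroupChip { width: 34 });
    assert_eq!(all_group_widths(&chips), vec![(1, 127), (2, 34)]);
}
```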
    full_func,
    split_funcs,
} in toplevel.func_map.values()
{
As a suggestion for a slight optimization, we could check whether the `full_func`/`split_funcs` actually has at least one split (since funcs that don't use return groups never get split) and only add it to the `NonceMap` if there is more than one split func.

Then, in `fix_nonces` below, change `update_nonce` to be something like
let update_nonce = |lookup: &mut Record| {
    if let Some(index) = lookup.query_index {
        if let Some(&nonce) = map.get(&(index, lookup.nonce)) {
            lookup.nonce = nonce;
        }
    }
};
instead.

This should make the nonce map smaller by skipping the storage of a trivial nonce map when there is only a single return group for a func.

(Though I recognize a benefit of making the nonce map "complete" and using the `.unwrap()` in `update_nonce`: if a nonce/lookup is unaccounted for in the map, the unwrap causes a panic, which is easier to debug than silently ignoring it. So I'm also ok with leaving this as-is for now and doing other minor optimizations later, when they would be more impactful.)
Actually, now that I think about sharding, it's better if the `nonce_map` is actually complete and left as-is; otherwise, when we shard chips down the road, this optimization would add a lot of extra confusion for very minor benefit. You can ignore this suggestion.
let val = map[*var].0;
let branch = cases.match_case(&val).expect("No match");
let i = cases.match_case(&val).expect("No match");
let branch = &branches[*i];
Can this `&branches[i]` end up as an out-of-bounds access if `match_case` returns the index for the default case?
No, since the default case is also added to the list of unique branches
pub fn push_provide(&mut self, index: &mut ColumnIndex, lookup: Record) {
    let provide = lookup.into_provide();
    self.push_aux(index, provide.last_nonce);
    self.push_aux(index, provide.last_count);
}
This will be very useful for funcs that provide multiple values! 🙏
}
let ctrl = Ctrl::Return(*return_ident, out.clone(), group);
let return_idents = vec![*return_ident];
*return_ident += 1;
This `+1` here means that the return idents all need to be sequential, correct? So a Lair func like
match env_tag {
    Tag::Err => {
        #[group=2]
        return (env_tag, new_env)
    }
    Tag::Env => {
        #[group=4]
        let (res_tag, res) = call(eval, res_tag, res, new_env);
        return (res_tag, res)
    }
};
would end up as a miscompilation or panic, since there is a gap between return groups 0 and 1 (and 2 and 4), and the `split` function starts from return group 0 and increments the index by 1 each time, right?

This is perfectly fine as-is (this is all internal to Lair after all), but it should be explicitly documented somewhere, maybe in the docstring for the `ReturnGroup` type, saying it must start at 1 and there must be no gaps in the indices.

EDIT: I'm not sure what I said above is actually true, but if it is, it definitely should be mentioned in a docstring somewhere.
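If the constraint does hold, a possible wording for that docstring (the underlying integer type is an assumption):

```rust
/// Identifier for a return group.
///
/// Group 0 is the default/main group. Explicitly assigned groups are expected
/// to start at 1 and be contiguous (no gaps), since `split` walks the groups
/// sequentially when building the per-group functions.
pub type ReturnGroup = u8;
```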
This PR allows you to specify chips for groups of returns in Lair code. For instance, in the Lurk evaluator, we could create a group for all returns that are errors.
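A sketch of what that can look like, reusing the `#[group=N]` return attribute from the review discussion above (the group numbers and surrounding match are illustrative, not the exact evaluator code):

```
match env_tag {
    // All error returns share one group, so they end up in their own chip.
    Tag::Err => {
        #[group=1]
        return (env_tag, new_env)
    }
    // The successful path stays in the default group.
    Tag::Env => {
        let (res_tag, res) = call(eval, res_tag, res, new_env);
        return (res_tag, res)
    }
};
```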
This splits the Lurk eval chip into 2 different chips using the same lookup index, minimizing the cost of successful computations.
I've used this to optimize the compiled version of Lurk. Here are the benchmarks: