-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement get_many_mut #238
base: master
Are you sure you want to change the base?
Conversation
a9f088c
to
0ef0e90
Compare
0ef0e90
to
d3f1968
Compare
I think ideally we would want two things for this, especially the first:
I suspect part of this is because you should be using a raw pointer to offset/index each of the elements, otherwise you have the main |
Yeah, this makes a lot of sense. Is it okay to just leave the PR open until then?
Haven't seen this before but it would indeed fit pretty well. It seems to have been open for quite a while without resolution though. Also, if they opt for a result-based return value, that'd not match the hashmap api (as it is now) so we'd have to do conversions in the implementation.
Yeah, exactly, the lambda captures the whole array lifetime mutably and can't be invoked multiple times. I did not consider pointer arithmetic though, I'll experiment a bit and see how it looks. Probably similar to the |
src/map.rs
Outdated
// SAFETY: OK to discard mutable borrow lifetime as each index is unique | ||
#[allow(unsafe_code)] | ||
unsafe { | ||
&mut *(&mut entries[i].value as *mut V) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still creates aliasing &mut
between entries
and the previous iterations, which is not allowed. I suggest using self.as_entries_mut().as_mut_ptr()
to start with a pointer, then &mut (*entries_ptr.add(i)).value
for each index.
We probably need a manual bounds-check too, but that can be part of the loop scanning for duplicates. In theory, get_index_of
should always be in-bounds, but better safe than sorry...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how it can create aliasing &mut
for entries
. I only get a &mut
to a specific element of entries
(only the value member of that element) and while that references has a lifetime of entries
, it is not mutably borrowing all of entries
, no? Maybe I'm misunderstand reference/lifetime semantics and what is allowed. Using the pointer doesn't affect the guarantee of the function: each &mut
reference that comes out of this function is unique so what difference will it make?
We probably need a manual bounds-check too...
Yeah that makes sense if pointers are the way to go but it should still panic rather than return None, I assume?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, I did run miri on the tests like:
# rustup +nightly component add miri
cargo clean
cargo +nightly miri test
but I know miri doesn't catch all cases of undefined behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, maybe I understand. The variable entries
is &mut [Bucket<K, V>]
so it is a mutable reference the whole slice but the array out
also holds mutable references to individual elements, which alias with entries
. Is this what you are referring to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's it. I'm a little surprised that miri doesn't see any problem, but I think it's because slice indexing is an intrinsic operation in MIR, so miri doesn't really see any use of the entire entries
. If you force it to go through the IndexMut
trait, entries.index_mut(i)
, then it does complain:
test map::tests::many_mut_multi_success ... error: Undefined Behavior: trying to reborrow <390898> for SharedReadOnly permission at alloc159164[0xc], but that tag does not exist in the borrow stack for this location
--> /home/jistone/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/cmp.rs:1531:27
|
1531 | PartialEq::eq(*self, *other)
| ^^^^^
| |
| trying to reborrow <390898> for SharedReadOnly permission at alloc159164[0xc], but that tag does not exist in the borrow stack for this location
| this error occurs as part of a reborrow at alloc159164[0xc..0x10]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <390898> was created by a retag at offsets [0xc..0x10]
--> src/map.rs:502:19
|
502 | let out = indices.map(|i| {
| ___________________^
503 | | // SAFETY: OK to discard mutable borrow lifetime as each index is unique
504 | | #[allow(unsafe_code)]
505 | | unsafe {
506 | | &mut *(&mut entries.index_mut(i).value as *mut V)
507 | | }
508 | | });
| |__________^
help: <390898> was later invalidated at offsets [0x0..0x40]
--> src/map.rs:506:29
|
506 | &mut *(&mut entries.index_mut(i).value as *mut V)
| ^^^^^^^^^^^^^^^^^^^^
= note: backtrace: [...]
--> src/map/tests.rs:449:5
|
449 | assert_eq!(map.get_many_mut([&1, &1123]), Some([&mut 10, &mut 100]));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We probably need a manual bounds-check too...
Yeah that makes sense if pointers are the way to go but it should still panic rather than return None, I assume?
Yeah, if get_index_of
returns a bad index, then we have internal errors, so a panic is fine.
Yeah, that's fine. I should probably make a label for this... |
Avoid mutable aliasing by using a pointer to entries
src/map.rs
Outdated
for idx in indices { | ||
let idx = idx?; | ||
if idx >= len { | ||
panic!("Index is out of range! Got '{idx}' but length is '{len}'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One drawback of sharing the code with get_many_index_mut
is that we have to duplicate the in-bounds check here to provide the panic behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, since this is internal "shouldn't happen" territory, I would also be fine with letting such bugs fall through to get_many_index_mut -> None
. Maybe just add a debug_assert!(idx < len)
for good measure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, sure, that makes sense!
Can IndexMap add it under a nightly or unstable opt-in cargo feature? Then people will have to opt in to breaking changes, and you can document that changes to this API won't result in a new major version. (Or if you don't feel strongly, you can just make a new major release if the HashMap API changes for some reason.) |
I don't like using nightly/unstable features, in part because it's an extra maintenance burden, but also because the feature may be opted-in on your behalf, arbitrarily deep in your dependency tree. I'd rather not set us up for needing a semver bump either. If this is something that folks really need, let's apply that pressure on |
std::collections::HashMap
is addingget_many_mut
in rust-lang/rust#97601 and I'm working on replacing some uses of that hash map with this one where some of the code usesget_many_mut
(currently only a single location in rustc) so I would appreciate it ifindexmap::IndexMap
could provide the same interface.This code uses a decent amount of unsafe compared to the rest of the file and I saw the comment in lib.rs on having almost all unsafe code in raw.rs but I couldn't figure out how to move parts of my implementation there as it is mostly dealing with initializing the array. I tried to use
array::map
to avoid all unsafe but couldn't quite get it to work with the lifetimes and the lambda.If you want this change, I'll also write some docs.