-
Notifications
You must be signed in to change notification settings - Fork 60
Add pointer specs needed to verify slices #1534
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
creusot-contracts/src/std/ptr.rs
Outdated
#[trusted] | ||
#[requires(self.offset_logic(offset@) == own_offset.ptr())] | ||
#[ensures(self.offset_logic(offset@) == result)] | ||
unsafe fn add_own(self, offset: usize, own_offset: Ghost<&PtrOwn<T>>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea here, which may be wrong, is that the only way to prove self.offset_logic(offset) == own_offset.ptr()
is if self
actually comes from an allocation of an array of T
, so all the pointers between self
and self.add(offset)
are in fact valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we can imagine that we could compute offsets between fields of a struct, so arrays are not the only option here.
But I basically agree with you that this requirement should be enough. Basically, this correspond to assuming that the pointer type ghostly contains the identifier of the allocation: offset_logic
would keep this provenance information, so that self
and own_offsef
would have the same provenance. So own_offset
would guarantee that both pointers have the same block, and that this block has not been deallocated.
This matches the provenance semantics that Rust tries to adopt now.
But, in any case, this needs to be documented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing: we want to allow one-past-the-end pointers. <*const T>::add
says this:
vec.as_ptr().add(vec.len())
(forvec: Vec<T>
) is always safe.
I just ran into this while looking at split_at_unchecked
.
We need a weaker form of evidence than PtrOwn
to model this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#[requires(self.offset_logic(offset@) == own_offset.ptr() || self.offset_logic(offset@-1) == own_offset.ptr())]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is still too restrictive because you might not have access to the PtrOwn
of the last element. In split_at_unchecked
the add
happens after the first slice is created, consuming all of its PtrOwn
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. And more specifically, if we have a pointer to a zero-sized slice (this is a corner case, but this needs to be handled), then doing a 0-offset should be allowed, and my proposal forbids it.
So... it seems like the only remaining possibility is to add some informatin in the pointers.
For example, fn min_offset (p: *const T) -> Int
and fn max_offset (p: *const T) -> Int
, indicating the minimum and maximum value allowed for an offset. This is sound, because we can consider that these values are contained in the "origin" field of pointers. Some additional postconditions need to be added to offset_logic
to tell how min_offset
and max_offset
are affected.
Alternatively, we could add logical functions fn chunck_begin_addr(p: *const T) -> Int
and fn chunck_end_addr(p: *const T) -> Int
, and then compare addresses (it's important to compare addresses and not pointers, because then we can leverage the support for arithmetics in SMT solvers).
The idea is the same, it's not clear to me what's the best variant. The advantage of chunck_begin_addr
is that these stay constant when offseting, while the advantage of offset ranges is that we do not use the notion of addresses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm experimenting with a SliceOwn
token, a variant of PtrOwn
, that contains information about allowed offsets. In particular we can create zero-sized SliceOwn
that point one-past-the-end of an allocated slice.
I think that adding such information in the pointers themselves is not a solution because the range of possible offsets depends on if the allocated object has been freed:
it’s always UB to offset a pointer derived from something that is now deallocated, except if the offset is 0. --- https://doc.rust-lang.org/std/ptr/index.html#provenance
Does it makes sense to be able to convert &SliceOwn
into &PtrOwn
of each element, and conversely, to convert &PtrOwn
into a singleton &SliceOwn
? (And of course, without the &
this won't be allowed.) It's certainly convenient to reuse PtrOwn
's method to dereference pointers, but I wonder if there may be some extra information in a &PtrOwn
that can't come from &SliceOwn
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How having a SliceOwn
for a type T
would be different to having a PtrOwn
for [T]
?
It really seems bad to me to have two different token types for owning a single slot and owning a slice. For example, how would we offset into an which is stored in a struct, of which we only have a PtrOwn
(I assume that in your proposal, it is impossible to split these ownership tokens, because that would essentially revert to the original proposal)?
To synthesize my current view of things:
On the one hand, we need to be able to split ownership, both because it is much easier to specify reading/writing from a raw pointer, and because we can very much imagine that a large block of memory is split into smaller thunks, themselves used in very different part of a program (different threads...). Being able to split tokens is crucial for proving things like Bumpalo
(I'm not saying that you should prove it, be not having the tools to do it seems like a red flag to me).
On the other hand, in order to guarantee that some offsets are allowed, and to guarantee that we never partially deallocate a block by giving back only some of its tokens, we need another token that says "this block is allocated". This other token could not be split and would not provide read/write access to anything, but a shared borrow of it would be needed to offset pointers and full ownership of it would be needed to deallocate memory.
swap
on slices
creusot-contracts/src/std/ptr.rs
Outdated
#[trusted] | ||
#[logic] | ||
#[open(self)] | ||
fn offset_logic(self, offset: Int) -> RawPtr<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we ensure something about the address of the pointer (i.e., the value when casted to usize
).
creusot-contracts/src/std/ptr.rs
Outdated
#[trusted] | ||
#[requires(self.offset_logic(offset@) == own_offset.ptr())] | ||
#[ensures(self.offset_logic(offset@) == result)] | ||
unsafe fn add_own(self, offset: usize, own_offset: Ghost<&PtrOwn<T>>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we can imagine that we could compute offsets between fields of a struct, so arrays are not the only option here.
But I basically agree with you that this requirement should be enough. Basically, this correspond to assuming that the pointer type ghostly contains the identifier of the allocation: offset_logic
would keep this provenance information, so that self
and own_offsef
would have the same provenance. So own_offset
would guarantee that both pointers have the same block, and that this block has not been deallocated.
This matches the provenance semantics that Rust tries to adopt now.
But, in any case, this needs to be documented.
creusot-contracts/src/std/ptr.rs
Outdated
// TODO: The offset in bytes, `count * size_of::<T>()`, must fit in an `isize`. | ||
#[trusted] | ||
#[requires(self.offset_logic(offset@) == own_offset.ptr())] | ||
// #[ensures(result as RawPtr<T> == self.offset_logic(offset))] // TODO: cast *mut to RawPtr ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this forbidden in Pearlite? We should allow these casts, and translate them by identity (using the Coerce construct).
TODO:
|
creusot-contracts/src/std/ptr.rs
Outdated
#[trusted] | ||
#[logic] | ||
#[open(self)] | ||
#[ensures(self.addr_logic()@ + offset < usize::MAX@ ==> result.addr_logic()@ == self.addr_logic()@ + offset)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This complicated post-condition let me think that, finally, addr_logic
should return an Int
, with a type invariant on pointer types stating that the logical address of pointers appearing in programs are between 0 and usize::MAX
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, we should have an associativity lemma stating p.offset_logic(x).offset_logic(y) = p.offset_logic(x+y)
.
swap
on slices
Stuff needed to verify
swap
on slices.