Tutorial on graphs in Rust

I've done quite a few graph experiments (not much to show the world though - at least not yet), so it's interesting to compare our experiences.

First; and this is not specific to graph nodes but to every larger object, is that I often see Rc<RefCell<Node>>. However, in larger objects, often some struct fields never change during the lifetime of that object, while others do. E g, the node might have an "id" field that remains constant, but the "edges" field might change. Hence, that could instead be modelled as Rc<Node> where Node is

struct Node {
    id: i32,
    edges: RefCell<Vec<Rc<Node>>>,
}

This way, we get a compiler guarantee that the id field never changes during the lifetime of the Node (almost; in theory you might be able to do get_mut() on the Rc). Also, if every mutable struct field has its own RefCell, that would also somewhat reduce the risk of RefCell panics as the locks are smaller.

I've never tried the TypedArena approach. The reason is mainly that if the graph lives for the duration of the program, and a lot of nodes are added and removed during that duration, you end up with ever growing memory usage as the dropped nodes are never recycled.

As for UnsafeCell, ever since dbaupp in quite strong terms told me not to use it, I haven't dared. But it's good to see that his opinion is not undebatable. :-) I'd like to point out that returning a node is still possible in the RefCell case, except you don't return a &Node but instead a Ref<Node>. Due to the Deref it would be just as ergonomic to the caller.

I agree that the <'a>s are annoying boilerplate. And you never get rid of them either:

struct Mold<'a> {
    id: &'a str,
}

struct Yeast<'a> {
    mold: Mold<'a>,
}

struct Funghi<'a> {
    yeast: Yeast<'a>,
}

...and so on. There is certainly room for improvement w r t lifetime elision here. For that reason I tend to prefer the Rc approach to memory management, at least for now.

I also have no good solution to the safe initialisation problem. E g, consider Mold owning some Yeasts, but there needs to be back references to the owner:

struct Mold {
    yeasts: Vec<Rc<Yeast>>,
}

struct Yeast {
    mold: Weak<Mold>,
}

There's, AFAIK, no way to properly initialise the mold and the Yeast to hold references to each other (without using extra containers such as RefCell / Option etc), even though the relationship remains constant during the entire lifetime of both the mold and the yeast. Maybe unsafe is the least bad option here?

/r/rust Thread Link - github.com