I am porting some of my more complex C++ code over to Rust as a way to learn the language. One thing I have is a map of values keyed by a std::string held inside the value type, to avoid copying the string into the maps key I use a string_view. Because of how it's used, the key will always point to the valid string held in its value. The Rust below represents the structures I am porting.
use std::collections::HashMap;
use std::sync::Arc;
struct BaseThing {
name: String,
data1: String,
data2: f32,
// and many more bits of data
}
struct Component {
pub base_thing: Arc<BaseThing>,
pub power: i32,
}
type ComponentsMap = HashMap<String, Component>;
struct Value {
pub components: ComponentsMap,
}
fn add_component(value: &mut Value, base_thing: Arc<BaseThing>, power: i32) {
value.components.insert(base_thing.name.clone(), Component { base_thing, power });
}
Ideally I want the map to be a keyed by a reference to the contained string. Something like this:
type ComponentsMap = HashMap<&str, Component>;
But I can't figure the borrow checker voodoo to make that work, if it can work at all.
This is fundamentally impossible. See Why can't I store a value and a reference to that value in the same struct?
However, there is a solution. You can do even better than the C++ code.
The key is to use hashbrown::HashTable. hashbrown is the crate that powers the hashmap in std, and its HashTable is a "raw" hash map, with more basic API. Instead of keys and values, you store just values. And the map doesn't know to hash keys or to compare them: you need to provide callback to do that on every operation.
You can build a wrapper around that, where "the key" is just the data you use for the hashing and comparison. You need to be careful, though, to not mutate the key - in other words, the hash and equality of an entry must not change once inserted, otherwise anything can occur (except undefined behavior - e.g. panics, infinite loops, etc.).
As noted in other answers, this is difficult in general as it would involve self-references.
But as Vi. suggests in their answer, you can use a custom key type to avoid the extra allocation. And in your specific use case, the BaseThing is already wrapped in an Arc, so you can implement the custom key type yourself as a wrapper around the Arc, which is cheap to clone, like so:
use std::borrow::Borrow;
use std::hash::{Hash, Hasher};
use std::sync::Arc;
struct BaseThingName(Arc<BaseThing>);
// The `Borrow` trait implies a contract that `BaseThingName` "behaves
// identically" to the returned value (`self.0.name`), i.e., the `Eq` and `Hash`
// implementations yield the same result as `name` would, so that `HashMap::get`
// can look up the thing by the `name`, relying on the contract.
impl Borrow<str> for BaseThingName {
fn borrow(&self) -> &str {
&self.0.name
}
}
impl Eq for BaseThingName {}
impl PartialEq for BaseThingName {
fn eq(&self, other: &Self) -> bool {
self.0.name == other.0.name
}
}
impl Hash for BaseThingName {
fn hash<H: Hasher>(&self, hasher: &mut H) {
self.0.name.hash(hasher);
}
}
(Playground)
(See the official documentation for details on the trait requirements for the key type of a HashMap.)
To use the custom key, replace the key type of ComponentsMap with the wrapper type and update the insertion code to use it instead of a String:
type ComponentsMap = HashMap<BaseThingName, Component>;
pub fn add_component(value: &mut Value, base_thing: Arc<BaseThing>, power: i32) {
value.components.insert(
BaseThingName(Arc::clone(&base_thing)),
Component { base_thing, power },
);
}
You can look up components by the name even though the key is now a wrapper for the BaseThing as a whole, thanks to the Borrow trait implementation for BaseThingName:
#[test]
fn it_works() {
let mut value = Value {
components: ComponentsMap::new(),
};
let base_thing = Arc::new(BaseThing {
name: "hello".to_owned(),
// …
});
add_component(&mut value, base_thing, 42);
assert_eq!(value.components.get("hello").unwrap().power, 42);
}
Even further, you could implement the traits for Component (or a wrapper around it) instead, and insert it into a HashSet instead of a HashMap:
pub struct SetComponent(Component);
type ComponentsSet = HashSet<SetComponent>;
pub struct Value {
pub components: ComponentsSet,
// …
}
impl Borrow<str> for SetComponent {
fn borrow(&self) -> &str {
&self.0.base_thing.name
}
}
impl Eq for SetComponent {}
impl PartialEq for SetComponent {
fn eq(&self, other: &Self) -> bool {
self.0.base_thing.name == other.0.base_thing.name
}
}
impl Hash for SetComponent {
fn hash<H: Hasher>(&self, hasher: &mut H) {
self.0.base_thing.name.hash(hasher);
}
}
pub fn add_component(value: &mut Value, base_thing: Arc<BaseThing>, power: i32) {
value
.components
.insert(SetComponent(Component { base_thing, power }));
}
(Playground)
Now it should be even cheaper than the original C++ code (in theory), because the set doesn't allocate for (the smart pointers to) the keys by itself.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With