
Unicode codepoint to Rust string

I'm just learning Rust, so I apologize if there is an easy way to do this that I've missed. I have a program that receives Unicode code points as hexadecimal strings at runtime, and I'd like to convert them to Rust strings containing the characters they represent. Basically, I'm trying to figure out how to define parse_unicode in the code below.

fn parse_unicode(input: &str) -> String {
    input.to_string() // placeholder: returns the input unchanged, so the test below fails
}

#[test]
fn test_parse_unicode() {
    let parsed_content = parse_unicode("1f44d");
    assert_eq!(parsed_content, String::from("\u{1f44d}"));
}

I see there's a function to go from byte arrays to strings, so if I wrote code myself to turn these code points into byte arrays I could then convert them to strings, but I'm hoping there's a more idiomatic (or at least easier) approach.

Mindful asked Oct 21 '25 19:10


1 Answer

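Stargateur basically solved my problem with code linked in the comments. It parses the input as a hexadecimal u32 with u32::from_str_radix, then converts it with char::from_u32, which returns None for values that are not valid Unicode scalar values (surrogates, or anything above 0x10FFFF). It looks like this: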

use std::num::ParseIntError;

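#[derive(Debug, PartialEq)]
enum Error {
    // The input was not a valid hexadecimal number.
    Int(ParseIntError),
    // The number parsed, but is not a valid Unicode scalar value.
    Unicode(u32),
}

fn parse_unicode(input: &str) -> Result<char, Error> {
    // Interpret the input as a base-16 integer.
    let unicode = u32::from_str_radix(input, 16).map_err(Error::Int)?;
    // char::from_u32 returns None for surrogates and out-of-range values.
    char::from_u32(unicode).ok_or_else(|| Error::Unicode(unicode))
}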

#[test]
fn test_parse_unicode() {
    assert_eq!(parse_unicode("1f44d"), Ok('👍'));
}
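
The question asked for a String rather than a char, so here is a minimal sketch of one way to bridge the two. The helper name parse_unicode_string is hypothetical (it is not part of Stargateur's code); it simply reuses parse_unicode from above and converts the resulting char with to_string. The surrogate test case is also an illustration added here, not taken from the original code.

```rust
use std::num::ParseIntError;

#[derive(Debug, PartialEq)]
enum Error {
    Int(ParseIntError),
    Unicode(u32),
}

// Same parsing logic as in the answer above.
fn parse_unicode(input: &str) -> Result<char, Error> {
    let unicode = u32::from_str_radix(input, 16).map_err(Error::Int)?;
    char::from_u32(unicode).ok_or_else(|| Error::Unicode(unicode))
}

// Hypothetical helper: wraps the parsed char in an owned String,
// matching the return type the question originally asked for.
fn parse_unicode_string(input: &str) -> Result<String, Error> {
    parse_unicode(input).map(|c| c.to_string())
}

#[test]
fn test_parse_unicode_string() {
    // A valid code point becomes a one-character String.
    assert_eq!(
        parse_unicode_string("1f44d"),
        Ok(String::from("\u{1f44d}"))
    );
    // 0xD800 is a surrogate, so it is not a valid char.
    assert_eq!(parse_unicode_string("d800"), Err(Error::Unicode(0xd800)));
}
```

Keeping the Result in the return type, instead of unwrapping inside the helper, leaves it to the caller to decide what to do with malformed input received at runtime.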
Mindful answered Oct 23 '25 10:10