
Remove all problematic characters in an intelligent way in C#

Tags: string, c#, .net

Is there a .NET library that removes all problematic characters from a string in an intelligent way, leaving only alphanumerics, hyphens and underscores (or a similar subset)? The result is meant for use in URLs, file names, etc.

I'm looking for something similar to the Ruby library stringex, which can do the following:

A simple prelude

"simple English".to_url => "simple-english"

"it's nothing at all".to_url => "its-nothing-at-all"

"rock & roll".to_url => "rock-and-roll"

Let's show off

"$12 worth of Ruby power".to_url => "12-dollars-worth-of-ruby-power"

"10% off if you act now".to_url => "10-percent-off-if-you-act-now"

You don't even wanna trust Iconv for this next part

"kick it en Français".to_url => "kick-it-en-francais"

"rock it Español style".to_url => "rock-it-espanol-style"

"tell your readers 你好".to_url => "tell-your-readers-ni-hao"

pupeno asked Dec 02 '25 03:12



2 Answers

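You can try this (it needs using System.Text.RegularExpressions; at the top of the file):

string str = phrase.ToLowerInvariant();  // required: the pattern below only keeps a-z, so uppercase letters would be removed
str = str.Trim();
str = Regex.Replace(str, @"[^a-z0-9\s_]", ""); // remove everything except a-z, 0-9, whitespace and underscore
str = Regex.Replace(str, @"\s+", " ").Trim(); // convert multiple spaces into one space
str = str.Substring(0, str.Length <= 400 ? str.Length : 400).Trim(); // cut to at most 400 chars and trim
str = Regex.Replace(str, @"\s", "-"); // replace each remaining space with a hyphen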
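The regex steps above simply drop accented letters and symbols, so "Français" loses its "ç" and "&" disappears. Here is a minimal sketch of one way to get closer to the examples in the question: decompose the string with Unicode normalization, strip the combining marks, and spell out a few symbols before filtering. The class name SlugSketch, the method ToSlug and the small symbol table are hypothetical names for illustration, not part of any library, and transliteration of non-Latin scripts (such as 你好 to "ni-hao") is not attempted.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public static class SlugSketch
{
    // Hypothetical helper for illustration; not part of .NET or any library.
    public static string ToSlug(string phrase)
    {
        // Spell out a few symbols first. This table is an assumption and far from complete.
        string s = phrase.Replace("&", " and ")
                         .Replace("%", " percent ");

        // FormD splits an accented letter into base letter + combining mark,
        // so dropping the non-spacing marks leaves the plain base letter.
        string decomposed = s.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        }
        s = sb.ToString().Normalize(NormalizationForm.FormC).ToLowerInvariant();

        // Drop everything that is still not a-z, 0-9, whitespace, underscore or hyphen.
        s = Regex.Replace(s, @"[^a-z0-9\s_-]", "");
        // Collapse runs of whitespace and hyphens into a single hyphen.
        s = Regex.Replace(s, @"[\s-]+", "-");
        return s.Trim('-');
    }
}
```

With this sketch, SlugSketch.ToSlug("kick it en Français") gives "kick-it-en-francais" and SlugSketch.ToSlug("rock & roll") gives "rock-and-roll". Characters it has no rule for are dropped, so "$12 worth of Ruby power" becomes "12-worth-of-ruby-power" rather than the "12-dollars-..." form that stringex produces.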
Luke101 answered Dec 04 '25 18:12



Perhaps this question can help you on your way. It contains the code Stack Overflow uses to generate its URLs (more specifically, how question titles are turned into nice URLs).

Link to Question here, where Jeff Atwood shows their code
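For readers who cannot follow the link, here is a rough sketch of what a single-pass, regex-free version of this kind of conversion can look like. It is not the code from the linked question; the names UrlTitleSketch and ToUrlTitle, the separator set and the default maxLength of 80 are all assumptions made for illustration.

```csharp
using System.Text;

public static class UrlTitleSketch
{
    // Hypothetical helper; NOT the code shown in the linked question.
    public static string ToUrlTitle(string title, int maxLength = 80)
    {
        if (string.IsNullOrEmpty(title)) return "";

        var sb = new StringBuilder(title.Length);
        bool pendingDash = false; // a separator was seen since the last kept character

        foreach (char raw in title)
        {
            if (sb.Length >= maxLength) break;

            char c = char.ToLowerInvariant(raw);
            if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'))
            {
                // Emit at most one dash between words, and never at the start.
                if (pendingDash && sb.Length > 0) sb.Append('-');
                pendingDash = false;
                sb.Append(c);
            }
            else if (c == ' ' || c == '-' || c == '_' || c == ',' || c == '.' || c == '/')
            {
                pendingDash = true;
            }
            // Anything else (apostrophes, symbols, non-ASCII) is skipped.
        }

        // A dash plus a letter may overshoot the limit by one character, so cut and tidy up.
        string result = sb.Length > maxLength ? sb.ToString(0, maxLength) : sb.ToString();
        return result.TrimEnd('-');
    }
}
```

For example, UrlTitleSketch.ToUrlTitle("it's nothing at all") returns "its-nothing-at-all". Because this sketch has no symbol table, "rock & roll" comes out as "rock-roll".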

Terje answered Dec 04 '25 19:12



