Some weird characters are getting stored in one of my tables. They seem to be coming from .csv feeds, so I don't have much control over that.
Hello Kitty Essential Accessory Kit
How can I clean the data and remove these characters? I am OK with doing it at the DB level or in C#.
EDIT
As per the suggestions received in the comments, I am also looking into what I can do to correct it at the feed level. Here's more info on it.
You can use the .NET regular expression functions. For example, using Regex.Replace to remove every character outside the ASCII range:
Regex.Replace(s, @"[^\u0000-\u007F]", string.Empty);
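To try the one-liner above in isolation, here is a minimal sketch of a console program. The class name AsciiCleaner, the method name StripNonAscii, and the two stray characters in the sample string (U+00C2 and U+00A0) are assumptions made for illustration; the real feed may contain different characters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AsciiCleaner
{
    // Hypothetical helper: removes every character outside the 7-bit ASCII range.
    public static string StripNonAscii(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return string.Empty;
        }

        return Regex.Replace(input, @"[^\u0000-\u007F]", string.Empty);
    }

    public static void Main()
    {
        // Assumed sample: U+00C2 and U+00A0 stand in for the stray feed characters.
        string dirty = "Hello Kitty\u00C2\u00A0 Essential Accessory Kit";

        // Prints: Hello Kitty Essential Accessory Kit
        Console.WriteLine(StripNonAscii(dirty));
    }
}
```

Note that this also strips legitimate accented letters (for example, é), so it only fits data that is expected to be plain ASCII.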
To do the same at the database level, note that SQL Server has no built-in support for regular expressions, so you need to create a SQL CLR function. More information about .NET integration in SQL Server can be found here:
In your case:
Open Visual Studio and create a Class Library project:

Then rename the class to StackOverflow and paste the following code in its file:
using Microsoft.SqlServer.Server;
using System;
using System.Collections.Generic;
using System.Data.SqlTypes;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public class StackOverflow
{
    // Exposed to T-SQL as a scalar function; it reads no data and is deterministic.
    [SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true, Name = "RegexReplace")]
    public static SqlString Replace(SqlString sqlInput, SqlString sqlPattern, SqlString sqlReplacement)
    {
        // Treat NULL arguments as empty strings so Regex.Replace never receives null.
        string input = (sqlInput.IsNull) ? string.Empty : sqlInput.Value;
        string pattern = (sqlPattern.IsNull) ? string.Empty : sqlPattern.Value;
        string replacement = (sqlReplacement.IsNull) ? string.Empty : sqlReplacement.Value;

        return new SqlString(Regex.Replace(input, pattern, replacement));
    }
}
Now build the project and open SQL Server Management Studio. CLR integration is disabled by default, so it must be enabled on the instance first. Then select your database and change the path in the FROM clause below to point to your StackOverflow.dll:
CREATE ASSEMBLY [StackOverflow] FROM 'C:\Users\gotqn\Desktop\StackOverflow\StackOverflow\bin\Debug\StackOverflow.dll';
Finally, create the SQL CLR function:
CREATE FUNCTION [dbo].[StackOverflowRegexReplace] (@input NVARCHAR(MAX),@pattern NVARCHAR(MAX), @replacement NVARCHAR(MAX))
RETURNS NVARCHAR(4000)
AS EXTERNAL NAME [StackOverflow].[StackOverflow].[Replace]
GO
You are now ready to use the .NET Regex.Replace function directly in your T-SQL statements:
SELECT [dbo].[StackOverflowRegexReplace] ('Hello Kitty Essential Accessory Kit', '[^\u0000-\u007F]', '')
-- Hello Kitty Essential Accessory Kit
If you want to keep only letters and digits in a string, then this can help you out.
Here, Regex is used to remove every character other than letters and digits.
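A minimal sketch of that approach, assuming "letters and digits" means the ASCII ranges a-z, A-Z and 0-9. The class name AlphanumericCleaner, the method name KeepLettersAndDigits, and the sample string are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlphanumericCleaner
{
    // Hypothetical helper: removes everything except ASCII letters and digits.
    public static string KeepLettersAndDigits(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return string.Empty;
        }

        return Regex.Replace(input, "[^a-zA-Z0-9]", string.Empty);
    }

    public static void Main()
    {
        // Assumed sample containing a no-break space (U+00A0), a space and a '#'.
        string dirty = "Hello Kitty\u00A0Kit #42";

        // Prints: HelloKittyKit42
        Console.WriteLine(KeepLettersAndDigits(dirty));
    }
}
```

Because ordinary spaces are removed as well, a product title loses its word breaks. If spaces should survive, adding a space to the character class ("[^a-zA-Z0-9 ]") is one option.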