SQL Server uses high CPU when searching inside nvarchar strings

Question

Check out the following example. It shows that searching within a unicode string (nvarchar) is almost eight times as bad as searching within a varchar string. And on par with implicit conversions. Looking for an explanation for this. Or a way to search within nvarchar strings more efficiently.

use tempdb
create table test
(
    testid int identity primary key,
    v varchar(36),
    nv nvarchar(36),
    filler char(500)
)
go

set nocount on
set statistics time off
insert test (v, nv)
select CAST (newid() as varchar(36)),
    CAST (newid() as nvarchar(36))
go 1000000

set statistics time on
-- search utf8 string
select COUNT(1) from test where v like '%abcd%' option (maxdop 1)
-- CPU time = 906 ms,  elapsed time = 911 ms.

-- search utf8 string using unicode (uses convert_implicit)
select COUNT(1) from test where v like N'%abcd%' option (maxdop 1)
-- CPU time = 6969 ms,  elapsed time = 6970 ms.

-- search unicode string
select COUNT(1) from test where nv like N'%abcd%' option (maxdop 1)
-- CPU time = 6844 ms,  elapsed time = 6911 ms.

TomTom · Accepted Answer

Looking for an explanation for this.

NVarchar is 16 bit and Unicode comparison rules are a lot more complicated than ASCII - special chars for the various languages that are supported at the same time require quote some more processing.

Larry Coleman · Answer

My guess is that LIKE is implemented using an O(n^2) algorithm as opposed to an O(n) algorithm; it would probably have to be for the leading % to work. Since the Unicode string is twice as long, that seems consistent with your numbers.

SQL Server uses high CPU when searching inside nvarchar strings

Tags:

sql-server

t-sql

Michael J Swart

2 Answers

TomTom

Larry Coleman

Recent Activity

Donate For Us

SQL Server uses high CPU when searching inside nvarchar strings

Tags:

sql-server

t-sql

Michael J Swart

2 Answers

TomTom

Larry Coleman

Related questions

Recent Activity

Donate For Us