I am developing an ASP.NET website in Visual Studio 2010 (with Service Pack 1, thank you very much). I want to utilize .NET's built-in Membership and Role providers for SQL Server 2008.
Now, I've been developing Microsoft technologies for a very long time, and have rubbed elbows with some of the finest SQL Server DBA's in the business. Every one of them has told me to stay away from GUIDS as primary keys when constructing a database table that will:
Reason: Because the primary key is a clustered index!
This basically means that each record inserted into the table must obey the constraints of the index. So if the index is sorted ASC, the record, with the newly generated GUID, must be physically wedged, in proper sequence, into the data table in question.
This would be just fine for a table with only a few thousand records or so. SQL Server would only have to re-position a handful. However if the data table has several millions of records, and finds it must insert the new record at row 216. it could take a substantial amount of time (by web standards) to accomplish. It has to physically move all those rows down, to insert the new one.
So my question is simply this. Since Microsoft, and all the DBS's we know and love, have said NO to GUIDs as primary keys... why does the ASPNET_REGSQL tool create tables using a GUID as a primary key?
Or am I missing something? Is there a new feature to the SQL Profiler engine in 2008 that no longer sees GUIDS as a task?
Guids do have some strengths; For example, if you are generating guids in the application code, they can be generated in a web farm without worrying about ending up with the same ID. Another benefit is that pages in the database can be locked without likely causing any issues since it is unlikely that two randomly chosen rows will exist in the same page of data.
As far as what you said about several million rows of data - Guids are going to be ok as long as you are always asking the SQL server to return a single row of data. The biggest problem is going to be when you ask for some large subset of the data or when you are batch inserting a large number of rows; Then you are likely to be doing a lot of random I/O to get all of the rows that match your criteria or insert all of the rows into the random locations that the guids end up pointing to. Also, SQL doesn't have to "physically move all those rows down to insert the new one"; Data is stored on pages, and SQL usually will only have to change the data on one page in the data file to insert the row, possibly with a couple other pages updated, but it is not like inserting a row into a massive text file.
All of that said - yeah, I also generally prefer an integer for a primary key, but just wanted to point out that there are definitely situations in which guids make some sense.
There is nothing wrong with using GUIDs as a primary key. They can have some detriments if not used properly, sure, but consider a scenario where you have various databases at stores or other points of sale and, each night, you need to take all of the data from each location and combine it into a single master database at corporate. GUIDs are a great option here because you don't need to worry about identity conflicts.
Every one of them has told me to stay away from GUIDS as primary keys when constructing a database ... because the primary key is a clustered index!
A primary key doesn't have to use a clustered index, that's just the default index type used when creating a primary key.
In fact, if you look at the database schema used by the SqlMembershipProvider you'll see that there is a nonclustered index on the primary key column.
The following is the SQL script from the InstallCommon.sql script in %WINDIR%\Microsoft.NET\Framework\v4.0.30319:
  CREATE TABLE [dbo].aspnet_Users (
    ApplicationId    uniqueidentifier    NOT NULL FOREIGN KEY REFERENCES [dbo].aspnet_Applications(ApplicationId),
    UserId           uniqueidentifier    NOT NULL PRIMARY KEY NONCLUSTERED DEFAULT NEWID(),
    UserName         nvarchar(256)       NOT NULL,
    LoweredUserName  nvarchar(256)       NOT NULL,
    MobileAlias      nvarchar(16)        DEFAULT NULL,
    IsAnonymous      bit                 NOT NULL DEFAULT 0,
    LastActivityDate DATETIME            NOT NULL)
   CREATE UNIQUE CLUSTERED INDEX aspnet_Users_Index ON [dbo].aspnet_Users(ApplicationId, LoweredUserName)
   CREATE NONCLUSTERED INDEX aspnet_Users_Index2 ON [dbo].aspnet_Users(ApplicationId, LastActivityDate)
Note that the primary key column (UserId) is created using the statement PRIMARY KEY NONCLUSTERED and that the table's CLUSTERED index is created as a composite index on ApplicationId and LoweredUserName.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With