I am doing some performance tests using .Net 3.5 against SQL Server. I am doing an insert of 1 million records. When I wrap this inside a transaction (either serializable, RepeatabelRead or ReadUncommited) it runs in under 80 seconds on my system. When I remove the transaction it runs in roughly 300 seconds. I would expect that using no transaction would be the fastest way to insert lines into a database, because the DBMS does not need to take into account a potential rollback. What happens here? Is this typical for SQL Server, the SQL Server ADO.Net Provider, ADO.Net in general, DBMSes in general?
I have a background in iSeries/DB2 databases. In DB2 you have to enable journalling before you can get commitment control and transactions, and journalling is relatively expensive.
What I actually wanted to do was a compare of SqlCommand inserts vs Entity Framework inserts, but I was so surprised at these results that I wanted to find out what is going here first.
Below the code that I use to run the a test. When I run the below code, it takes about 74 seconds (measured between the AtStart log and the AtEnd log lines)
using (SqlConnection sqlConnection = new SqlConnection(connectionString))
{
sqlConnection.Open();
SqlCommand deleteCommand = new SqlCommand("DELETE FROM LockTest");
deleteCommand.Connection = sqlConnection;
deleteCommand.ExecuteNonQuery();
using (SqlTransaction transaction = sqlConnection.BeginTransaction(System.Data.IsolationLevel.Serializable))
{
try
{
if (DEBUG) LOG.Debug("AtStart");
SqlCommand insertCommand = new SqlCommand();
insertCommand.Connection = sqlConnection;
insertCommand.Transaction = transaction;
insertCommand.CommandText = "INSERT INTO LockTest (Id, Name, Description, Type) " +
"VALUES (@id, @name, @description, @type)";
SqlParameter idParameter = new SqlParameter("@id", System.Data.SqlDbType.UniqueIdentifier);
insertCommand.Parameters.Add(idParameter);
SqlParameter nameParameter = new SqlParameter("@name", System.Data.SqlDbType.NVarChar, 50);
insertCommand.Parameters.Add(nameParameter);
SqlParameter descriptionParameter = new SqlParameter("@description", System.Data.SqlDbType.NVarChar, Int32.MaxValue);
insertCommand.Parameters.Add(descriptionParameter);
SqlParameter typeParameter = new SqlParameter("@type", System.Data.SqlDbType.NChar, 20);
insertCommand.Parameters.Add(typeParameter);
insertCommand.Prepare();
for (int i= 0; i < 1000000; i++)
{
Guid g = Guid.NewGuid();
string s = g.ToString();
insertCommand.Parameters["@id"].Value = g;
insertCommand.Parameters["@name"].Value = s;
insertCommand.Parameters["@description"].Value = DateTime.UtcNow.Ticks.ToString();
insertCommand.Parameters["@type"].Value = "test";
insertCommand.ExecuteNonQuery();
}
transaction.Commit();
}
catch
{
transaction.Rollback();
throw;
}
}
sqlConnection.Close();
}
if (DEBUG) LOG.Debug("AtEnd");
Log flush.
With no explicit transactions the implicit transactions started by each statement (ie. INSERT) must commit. Commit cannot return until the data in the log is written to the disk, which means each INSERT statement must wait for the log disk write operation.
Explicit transactions must only wait when the COMMIT statement is issued, and by that time every full log page was already submitted, and the last log page contains probably several INSERTs so the cost of the write is amortized.
Update:
You can verify the Log Flush times in the performance counters: http://msdn.microsoft.com/en-us/library/ms189883.aspx:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With