I have an Azure Sql database about 9GB in size. It serves a web app that handles about 135K requests per hour. Most of the data is transient, it lives in the database from a few minutes to five days and is deleted. About 10GB moves through the database per day.
I tried to run a delete query on a table to delete about 250,000 records out of 350,000 records total. About 10 percent of the records have one or two nvarchar(max) values large enough to be stored in LOB storage.
Over the weekend, I tried to delete them all at once. It ran for four hours before I canceled the query, then it was rolling back for another 8 hours - bad move. I really wasn't expecting it to be that bad.
Then I tried another approach. This batch ran at night when the web app was handling about 100K requests per hour. tblJobs Id field is a uniqueidentifier that is the primary key.
insert @tableIds select Id from dbo.tblJobs with(nolock) 
where (datediff(day, SchedDate, getDate()) > 60)  
   or (datediff(day, ModifiedDate, getDate()) > 3 and ToBeRemoved = 1)
set @maintLogStr = 'uspMaintenance [tblJobs] Obsolete J records count @tableIds: ' + convert(nvarchar(12), (select count(1) from @tableIds))
insert dbo.admin_MaintenanceLog(LogEntry) values(@maintLogStr)
set @maintLogId = newid()
set @maintLogStr = 'uspMaintenance [tblJobs] Obsolete J records beginning loop...'
insert dbo.admin_MaintenanceLog(Id, LogEntry) values(@maintLogId, @maintLogStr)
while exists(select * from @tableIds)
begin
    delete @tableIdsTmp
    begin transaction
        insert @tableIdsTmp select top 1000 id from @tableIds
        delete p from @tableIdsTmp i join dbo.tblJobs p on i.id = p.Id
        delete x from @tableIdsTmp t join @tableIds x on t.id = x.id
        set @maintLogStr = 'uspMaintenance [tblJobs] Obsolete J records remaining count @tableIds: ' + convert(nvarchar(12), (select count(1) from @tableIds))
        update dbo.admin_MaintenanceLog set LogEntry = @maintLogStr, RecordCreated = getdate() where Id = @maintLogId
    commit transaction
    if @dowaits = 1 WAITFOR DELAY '00:00:01.000'
end
SchedDate, ModifiedDate and ToBeRemoved are not indexed so gathering the Ids in @tableIds took about 3 minutes - not bad.
Then from the log entries, it took 1 hour 55 minutes to delete 11,000 records from tblJobs at which time the job called from a remote machine timed out.
Why is it taking so long? What can I do to speed it up?
If you are deleting 95% of a table and keeping 5%, it can actually be quicker to move the rows you want to keep into a new table, drop the old table, and rename the new one. Or copy the keeper rows out, truncate the table, and then copy them back in.
Queries can become slow for various reasons ranging from improper index usage to bugs in the storage engine itself. However, in most cases, queries become slow because developers or MySQL database administrators neglect to monitor them and keep an eye on their performance.
Reduce nested views to reduce lags This nesting causes too many data returns for every single query, which either makes the database crawl, or completely give up and give no returns. Minimizing nesting is a simple way to make your SQl query efficient and significantly improve speeds.
A lot of your performance will be tied to the reservation size you are using (as mentioned in the prior answers). However, you don't need to do the table variables in your code at all to achieve what you want. In fact, you should pretty much never use them when there is a join involved at all because they have no statistics on them (and thus will likely have poor plan choices when there are complex choices for the optimizer to make). You can read the official guidance on that here:table variables documentation.
So, if you step back and look at the core of what you are trying to do, you can do this:
delete top(1000) dbo.TblJobs 
where (datediff(day, SchedDate, getDate()) > 60)
   or (datediff(day, ModifiedDate, getDate()) > 3 and ToBeRemoved = 1)
You will likely get a table scan from this query because:
When you do the scan, then you are potentially going to get locking issues since you have a workload that is running concurrently on the table. So, if some other request is doing a select statement you might block the update query as it scans through the table. (Posting query plans is really helpful to discussion scaling/concurrency issues, btw).
Furthermore, given that you have a loop where you are taking 1000 rows out of the table, copying them into a table variable, then eventually copying them into yet another and joining with the original table in the delete, you are turning a problem that was O(N) into O(N^2). Algorithmically, your query will likely get slower and slower the more rows you add into your table with this approach.
There are some things you can do to improve this query (potentially):
Most performance tuning and analysis is aided by an understanding of the cost of each operation. Using "set statistics time on" and "set statistics io on" gives you good metrics to track the physical costs of the query. "set statistics profile on" is better for looking at the algorithmic cost to each query operator (for that N^2 problem).
Better late than never, but I hope this helps you (and others) to understand how to improve your SQL Azure performance if you hit a similar situation in the future.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With