Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dictionary/List speed for database update

Tags:

c#

linq-to-sql

I've got my model updating the database according to some information that comes in in the form of a Dictionary. The way I currently do it is below:

SortedItems = db.SortedItems.ToList();
foreach (SortedItem si in SortedItems)
{
    string key = si.PK1 + si.PK2 + si.PK3 + si.PK4;
    if (updates.ContainsKey(key) && updatas[key] != si.SortRank)
    {
        si.SortRank = updates[key];
        db.SortedItems.ApplyCurrentValues(si);
    }
}
db.SaveChanges();

Would it be faster to iterate through the dictionary, and do a db lookup for each item? The dictionary only contains the item that have changed, and can be anywhere from 2 items to the entire set. My idea for the alternate method would be:

foreach(KeyValuePair<string, int?> kvp in updates)
{
    SortedItem si = db.SortedItems.Single(s => (s.PK1 + s.PK2 + s.PK3 + s.PK4).Equals(kvp.Key));
    si.SortRank = kvp.Value;
    db.SortedItems.ApplyCurrentValues(si);
}
db.SaveChanges();

EDIT: Assume the number of updates is usually about 5-20% of the db entires

like image 318
yoozer8 Avatar asked Mar 15 '26 17:03

yoozer8


1 Answers

Let's look:

Method 1:

  • You'd iterate through all 1000 items in the database
  • You'd still visit every item in the Dictionary and have 950 misses against the dictionary
  • You'd still have 50 update calls to the database.

Method 2:

  • You'd iterate every item in the dictionary with no misses in the dictionary
  • You'd have 50 individual lookup calls to the database.
  • You'd have 50 update calls to the database.

This really depends on how big the dataset is and what % on average get modified.

You could also do something like this:

Method 3:

  • Build a set of all the keys from the dictionary
  • Query the database once for all items matching those keys
  • Iterate over the results and update each item

Personally, I would try to determine your typical case scenario and profile each solution to see which is best. I really think the 2nd solution, though, will result in a ton of database and network hits if you have a large set and a large number of updates, since for each update it would have to hit the database twice (once to get the item, once to update the item).

So yes, this is a very long winded way of saying, "it depends..."

When in doubt, I'd code both and time them based on reproductions of production scenarios.

like image 169
James Michael Hare Avatar answered Mar 17 '26 13:03

James Michael Hare



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!