I would like to implement the following functionality:
A C# client connects to an HTTP server and downloads an image to disk.
The next time the client starts, it checks whether the image on the server is newer than the image on disk and, if so, overwrites the image on disk.
For me it's easy to download the image, but I'm not sure how to check if the image on the server is newer. How could I implement it? I guess that I could check the timestamp, or the image size (or both) but I don't know how to do it.
Try the If-Modified-Since request header: http://en.wikipedia.org/wiki/List_of_HTTP_header_fields
I am not sure that it is supported by every server. If it is supported, you get a 304 when nothing has changed; if it is not, you will still get the whole file. In that case you can calculate checksums of the old and new files and consider the file modified if they differ. Or just overwrite the file every time, and you will always have the newest version.
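The checksum fallback could look something like the following. This is only a minimal sketch: the `ImageChecksum` class and its method names are made up for illustration, and it assumes the freshly downloaded image is held in memory as a byte array.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical helper, not part of any library.
public static class ImageChecksum
{
    // Returns true when there is no file at existingPath yet, or when its
    // SHA-256 hash differs from the hash of the freshly downloaded bytes.
    public static bool HasChanged(string existingPath, byte[] downloaded)
    {
        if (!File.Exists(existingPath))
            return true;

        using (var sha = SHA256.Create())
        {
            byte[] oldHash;
            using (var fs = File.OpenRead(existingPath))
                oldHash = sha.ComputeHash(fs);

            byte[] newHash = sha.ComputeHash(downloaded);
            return !oldHash.SequenceEqual(newHash);
        }
    }

    // Overwrites the file on disk only when the content actually differs.
    // Returns true if the file was written.
    public static bool SaveIfChanged(string existingPath, byte[] downloaded)
    {
        if (!HasChanged(existingPath, downloaded))
            return false;
        File.WriteAllBytes(existingPath, downloaded);
        return true;
    }
}
```

Note that this still downloads the whole image every time; it only saves the disk write, so it is a fallback for servers that ignore If-Modified-Since rather than a replacement for it.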
HttpWebRequest can just use the IE cache, so if all the images will be in that cache anyway, and the cost of re-writing the file (but not having to download it) is acceptable, you can just make use of that.
If you need to handle it yourself though, then:
Given:
string uri; //URI of the image.
DateTime? lastMod; // lastModification date of image previously recorded. Null if not known yet.
string eTag; //eTag of image previously recorded. Null if not known yet.
You'll have to store these at the end of this, and load them again at the beginning (unless it's a new image). How you do that is up to you; given that, the rest works:
var req = (HttpWebRequest)WebRequest.Create(uri);
if(lastMod.HasValue)
  req.IfModifiedSince = lastMod.Value;//note: must be UTC, use lastMod.Value.ToUniversalTime() if you store it somewhere that converts to localtime, like SQLServer does.
if(eTag != null)
  req.Headers.Add("If-None-Match", eTag);//HttpWebRequest has no AddHeader method; go through the Headers collection.
try
{
  using(var rsp = (HttpWebResponse)req.GetResponse())
  {
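    //LastModified is a non-nullable DateTime, so test the header itself to see whether it was sent.
    //If it wasn't, we're just going to have to download the whole thing next time to be sure.
    lastMod = rsp.Headers["Last-Modified"] != null ? rsp.LastModified : (DateTime?)null;
    eTag = rsp.Headers["ETag"];//will be null if absent (GetResponseHeader would give an empty string instead).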
    using(var stm = rsp.GetResponseStream())
    {
      //your code to save the stream here.
    }
  }
}
catch(WebException we)
{
  var hrsp = we.Response as HttpWebResponse;
  if(hrsp != null && hrsp.StatusCode == HttpStatusCode.NotModified)
  {
    //unfortunately, 304 when dealt with directly (rather than letting
    //the IE cache be used automatically), is treated as an error. Which is a bit of
    //a nuisance, but manageable. Note that if we weren't doing this manually,
    //304s would be disguised to look like 200s to our code.
    //update these, because possibly only one of them was the same.
    //as above, test the header itself to see whether it was sent.
    lastMod = hrsp.Headers["Last-Modified"] != null ? hrsp.LastModified : (DateTime?)null;
    eTag = hrsp.Headers["ETag"];//will be null if absent.
  }
  else //some other exception happened!
    throw; //or other handling of your choosing
}
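One way to store `lastMod` and `eTag` between runs is a small text file next to the image. The sketch below is only an illustration of that idea: the `ValidatorStore` class, its method names and the two-line file layout are assumptions, and a database row or settings entry would do just as well.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper: keeps the two validators in a two-line text file,
// e.g. "photo.jpg.meta" next to "photo.jpg".
public static class ValidatorStore
{
    public static void Save(string metaPath, DateTime? lastMod, string eTag)
    {
        // "o" is the round-trip format, so the UTC value survives unchanged.
        string dateLine = lastMod.HasValue
            ? lastMod.Value.ToUniversalTime().ToString("o", CultureInfo.InvariantCulture)
            : "";
        File.WriteAllLines(metaPath, new[] { dateLine, eTag ?? "" });
    }

    public static void Load(string metaPath, out DateTime? lastMod, out string eTag)
    {
        lastMod = null;
        eTag = null;
        if (!File.Exists(metaPath))
            return; // new image: nothing recorded yet

        string[] lines = File.ReadAllLines(metaPath);
        DateTime parsed;
        if (lines.Length > 0 && DateTime.TryParse(lines[0], CultureInfo.InvariantCulture,
                DateTimeStyles.RoundtripKind, out parsed))
            lastMod = parsed;
        if (lines.Length > 1 && lines[1].Length > 0)
            eTag = lines[1];
    }
}
```

Call `ValidatorStore.Load` before building the request and `ValidatorStore.Save` once the response (200 or 304) has been handled, so the values sent next time always match the copy on disk.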
E-tags are more dependable than last-modified when implemented correctly (noting sub-second resolutions on changes, and reflecting different responses due to different Accept-* headers). Some implementations are buggy though (IIS6 on a web-farm without a particular tweak, Apache with mod-gzip) so it can be worth taking out the code relating to e-tags and just going by the date.
Edit: If you wanted to go even further in implementing HTTP caching, you could also store the Expires and max-age values (use the latter if both are present and they disagree) and skip the request entirely while the stored copy is still fresh according to those values. I've done this and it works well (I had an in-memory cache of objects created from the XML returned by various URIs, and if the XML was fresh or hadn't changed, I re-used the object), but it may be irrelevant for your needs (if you care to be fresher than the server suggests, or if you're always going to be outside that window).
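The freshness check could be sketched roughly as follows. This is a simplified illustration, not a full HTTP cache: the `Freshness` class and its members are hypothetical names, it only looks at max-age and Expires, and it ignores directives such as no-cache and no-store as well as the Age header. It assumes you recorded the UTC time of the last successful fetch along with the `Cache-Control` and `Expires` header values.

```csharp
using System;

// Hypothetical helper, not part of any library.
public static class Freshness
{
    // Extracts the max-age value (in seconds) from a Cache-Control header
    // value such as "public, max-age=3600"; null if there is none.
    public static int? ParseMaxAge(string cacheControl)
    {
        if (string.IsNullOrEmpty(cacheControl))
            return null;
        foreach (string part in cacheControl.Split(','))
        {
            string directive = part.Trim();
            int seconds;
            if (directive.StartsWith("max-age=", StringComparison.OrdinalIgnoreCase)
                && int.TryParse(directive.Substring("max-age=".Length), out seconds))
                return seconds;
        }
        return null;
    }

    // Returns the UTC time until which the stored copy may be reused without
    // contacting the server, or null if the server gave no freshness info.
    public static DateTime? FreshUntil(DateTime fetchedUtc, int? maxAgeSeconds, DateTime? expires)
    {
        if (maxAgeSeconds.HasValue) // max-age wins over Expires
            return fetchedUtc.AddSeconds(maxAgeSeconds.Value);
        if (expires.HasValue)
            return expires.Value.ToUniversalTime();
        return null;
    }

    public static bool CanSkipDownload(DateTime? freshUntil, DateTime nowUtc)
    {
        return freshUntil.HasValue && nowUtc < freshUntil.Value;
    }
}
```

If `CanSkipDownload` returns true for `DateTime.UtcNow`, the request can be skipped altogether; otherwise fall through to the conditional request with If-Modified-Since and If-None-Match.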