Using a CDN with Umbraco

I recently started using a CDN to get my website loading faster. What surprised me was how easy it was to do, so I thought I'd share my experience.

A fast-loading page is great for your client, whether they know it or not, because it will have a better PageRank (Google's version of a h5yr), a lower bounce rate and a happier user. From a developer's perspective there are benefits, but mostly I just get a kick from knowing that my page will fly through the Internet faster than Santa on Christmas Eve (look, I made it all festive!). Perhaps because specific page load times are rarely specified in a project's requirements, a lot of page speed improvements remain on that ill-fated list of 'nice-to-have' tickets that never quite make it to Live.

Plain ol' good development

A CDN will help your website run faster, but there is no replacement for good development and sparkly servers. This post is intended to encourage you to take the next step after you've covered the basics of minification, bundling, sprites etc. If you want to know more about these there is plenty of information around: Scott Hanselman has a pretty comprehensive post, and you should also check out Optimus by Tim Geyssens and Web Essentials by Mads Kristensen. I'm not going to cover these topics as you're hopefully already using them, but if you're not then do pay them the attention they deserve.

Using the techniques that Scott Hanselman suggests, my website gets a grade B (88%) from YSlow, which is actually already pretty good. However, I've now managed to get a grade A (98%) using the techniques I'm about to outline. And yes, having an outstanding 2% does get right up my nose, but it relates to Google Analytics and Google Fonts, over which I have no control.

YSlow results

What is a CDN?

A Content Delivery Network (CDN) is a network of proxy web servers that replicate your content to locations around the world. They deliver your content instead of your web server, because they are faster, more reliable, don't use cookies or superfluous headers, and are designed for high loads.

Each CDN needs an origin for the content that it is delivering. The first time one of your files is requested from the CDN, the nearest proxy server to the requester will grab the file from the origin, cache it and send it back to the requester. The next time the file is requested the file is sent straight back to the requester, using the cache. The cache builds up over time and the origin gets fewer and fewer requests for files.
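To make the flow above concrete, here is a minimal sketch of a single proxy server acting as a pull-through cache. The EdgeCache class and its members are hypothetical names of my own, and a real CDN also deals with expiry, eviction and replication, none of which is modelled here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of one CDN proxy server; not any real CDN's implementation.
public class EdgeCache
{
    private readonly Dictionary<string, string> _cache = new Dictionary<string, string>();
    private readonly Func<string, string> _fetchFromOrigin;

    // Counts how many requests had to go back to the origin.
    public int OriginRequests { get; private set; }

    public EdgeCache(Func<string, string> fetchFromOrigin)
    {
        _fetchFromOrigin = fetchFromOrigin;
    }

    public string Get(string path)
    {
        string content;
        if (_cache.TryGetValue(path, out content))
            return content; // Cache hit: the origin is not contacted.

        // Cache miss: fetch from the origin once, then keep a copy for next time.
        OriginRequests++;
        content = _fetchFromOrigin(path);
        _cache[path] = content;
        return content;
    }
}
```

Requesting the same path twice from one EdgeCache instance only increments OriginRequests once, which is the "fewer and fewer requests" effect on the origin.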

Setting up a CDN on my website makes it load in real terms at least 25% faster, and that is when compared to an already fast web server. In addition to the increased speed and decreased server load, CDNs also often offer the following benefits:

  • Spam and DoS/DDoS protection
  • GZipping of assets
  • Your files are always online
  • Cheaper bandwidth costs
  • Usage statistics

There are loads of CDN suppliers. I've not used most of them, so feel free to make recommendations. I'm going to focus on how you can use Windows Azure's CDN, because that is where I host the majority of my sites.

CDN your static files

Static files are the most sensible files to serve from a CDN, because they change rarely, make up the majority of the server requests and changes to them can easily be identified.

Set up the CDN

When setting up a CDN you need to identify the origin, which is the source of the content to be delivered. Typically you have the option of supplying a URL to your website, or uploading content to some file storage manually.

Windows Azure is no different and through the Azure portal you can pick from a list of your Web Apps and Cloud Services hosted on Windows Azure, or select a storage account (cloud based file storage similar to AWS S3). If you're not using Azure, or are using Azure Virtual Machines, then you'll need to use a storage account or an alternative CDN provider.

Selecting a website as a source for the CDN is the simplest solution, but you then lose control over what content is available through the CDN as all content that is accessible through IIS will be available through the CDN. An Azure storage account also provides you with the ability to set the cache control on a file-by-file basis.

If you want to use a storage account then uploading the files is easy through Azure Storage Explorer or Cloudberry Explorer but, as you'll see later, just uploading the files isn't enough and we need to set a cache control for each file. To help with this I have created a console app which will synchronise a local directory with a container on your storage account, and set the cache control. This code could be run on each deployment and is easily refactored to change the length of caching or to be built into an automated deployment.

  <appSettings>
    <!-- Specify settings here or as console arguments-->
    <add key="LocalPath" value="{{local_path_here}}"/>
    <add key="ContainerName" value="{{container_name}}"/>
    <add key="AccountName" value="{{account_name_here}}"/>
    <add key="KeyValue" value="{{key_value_here}}"/>
    <add key="ForceBlobProperties" value="False" />
  </appSettings>

AzureStorageSync settings

You'll need to grab the storage account name and access key from the Azure portal.
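As a rough illustration of the two decisions a synchronisation like this has to make for each file, the sketch below works out a blob name and a Cache-Control value. It is not the AzureStorageSync code itself; BlobSyncPlan and its methods are hypothetical names, and the upload step (done with whichever storage client you use) is deliberately left out.

```csharp
using System;

// Hypothetical helper: decides what to call each blob and which Cache-Control value to store on it.
public static class BlobSyncPlan
{
    // Builds a Cache-Control value such as "public, max-age=604800".
    public static string CacheControlFor(TimeSpan maxAge)
    {
        return "public, max-age=" + (long)maxAge.TotalSeconds;
    }

    // Converts a local file path into a blob name relative to the synchronised root,
    // using forward slashes so the blob path mirrors the website's URL structure.
    public static string BlobNameFor(string localRoot, string filePath)
    {
        var root = localRoot.TrimEnd('\\', '/');
        var relative = filePath.Substring(root.Length).TrimStart('\\', '/');
        return relative.Replace('\\', '/');
    }
}
```

Keeping the blob names identical to the site's relative paths is what lets the CDN URLs match the web server's URLs later on.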

Control the cache

Managing this cache is important because you want to set a long enough cache period to make the setup worthwhile, which won't happen by default.

Please keep in mind that the user's browser, each CDN proxy server and potentially other miscellaneous proxy servers will all maintain a cache of your files, but we can set up rules about what to cache and for how long.

If your CDN is pointed at a website as its origin then it will look to the Cache-Control response header to tell it whether it should cache the file and, if so, for how long. If we're using a file storage origin then, for Azure, these headers will need to be set up on the file properties of each file. The CDN will supply the user with the same Cache-Control as the origin provided, except where none was specified. In these circumstances some CDNs will allow a default to be set up. Azure does not.

When the cache period has expired, the browser or CDN proxy server will do what's called conditional caching (a conditional request). It asks whether the file has changed by sending the date of its cached copy (If-Modified-Since) and/or an ETag (If-None-Match; an ETag is a tiny bit like a hash of the file). The response will either be a 304 status code, which means "No, it hasn't changed" and has headers but no body, or the new file will be returned (a 200 status code). If the source doesn't support conditional caching then the file will always be returned whether it has changed or not.
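The decision the origin makes for a conditional request can be sketched as a small function. This is a simplified illustration rather than how IIS or any CDN implements it; ConditionalRequest and StatusFor are hypothetical names, and only a single exact ETag match is handled.

```csharp
using System;

// Hypothetical helper showing the 304-versus-200 decision for a conditional request.
public static class ConditionalRequest
{
    // Returns 304 when the client's cached copy is still current, otherwise 200.
    public static int StatusFor(string currentETag, DateTime lastModifiedUtc,
                                string ifNoneMatch, DateTime? ifModifiedSinceUtc)
    {
        // An ETag comparison takes precedence when the client sent one.
        if (ifNoneMatch != null)
            return ifNoneMatch == currentETag ? 304 : 200;

        // HTTP dates only have one-second precision, so compare whole seconds.
        if (ifModifiedSinceUtc.HasValue)
        {
            var truncated = lastModifiedUtc.AddTicks(-(lastModifiedUtc.Ticks % TimeSpan.TicksPerSecond));
            return truncated <= ifModifiedSinceUtc.Value ? 304 : 200;
        }

        // No validators were sent, so the full file is returned.
        return 200;
    }
}
```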

For static files, IIS can manage conditional caching just fine, but we need to tell it to allow client caching and how long to cache the file for before checking back. This can be done through IIS or in the web.config, and it is best to avoid applying it to the Umbraco folder, e.g.:

  <location path="scripts">
    <system.webServer>
      <staticContent>
        <clientCache cacheControlMode="UseMaxAge" cacheControlCustom="public" cacheControlMaxAge="7.00:00:00" />
      </staticContent>
    </system.webServer>
  </location>
  <location path="css">
    <system.webServer>
      <staticContent>
        <clientCache cacheControlMode="UseMaxAge" cacheControlCustom="public" cacheControlMaxAge="7.00:00:00" />
      </staticContent>
    </system.webServer>
  </location>
  <location path="media">
    <system.webServer>
      <staticContent>
        <clientCache cacheControlMode="UseMaxAge" cacheControlCustom="public" cacheControlMaxAge="7.00:00:00" />
      </staticContent>
    </system.webServer>
  </location>

7 days of caching for files in the scripts, css and media folders
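The cacheControlMaxAge value is written as days.hours:minutes:seconds, whereas the Cache-Control header carries max-age in seconds. The conversion is simple arithmetic, sketched below with a hypothetical MaxAgeExample helper; with the settings above I'd expect IIS to send something like "Cache-Control: public,max-age=604800".

```csharp
using System;
using System.Globalization;

// Hypothetical helper: converts a d.hh:mm:ss value, as used by cacheControlMaxAge, into max-age seconds.
public static class MaxAgeExample
{
    public static long ToSeconds(string cacheControlMaxAge)
    {
        return (long)TimeSpan.Parse(cacheControlMaxAge, CultureInfo.InvariantCulture).TotalSeconds;
    }
}
```

So "7.00:00:00" is 7 x 86,400 = 604,800 seconds, and a year-long cache of "365.00:00:00" would be 31,536,000 seconds.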

CDN your media

I'm not sure if you'd classify Umbraco media as static or not, but it certainly makes sense to serve this from your CDN if possible. The complication with media comes if you're using ImageProcessor (GetCropUrl()) to create cropped versions of images. If the CDN's origin is a file source then the cropped versions won't exist and the uncropped version will be returned. If your origin is your website then the CDN will request each cropped version of each image from the website, which will work fine. It is for this reason that I prefer to have my website as the origin of my CDN.

For those of you using the UmbracoFileSystemProvider (see here and here), a CDN is an even more sensible choice as it will get around the issue that the Virtual Path Provider "may affect performance/caching".

Link to your CDN

Assuming your CDN content has a file path matching that on your web server, which I would highly advise as it makes life a lot easier, we just need to replace the hostname of our web server with that of our CDN. We'll need to do this for every reference to a file we want to serve from the CDN. To simplify this I've written a UrlHelper extension method that I like to use. It works very similarly to the native Content method.

    using System;
    using System.Configuration;
    using System.Web.Mvc;

    public static class UrlHelperExtensions
    {
        public static string CdnContent(this UrlHelper url, string contentPath)
        {
            var localPath = url.Content(contentPath);
            var requestUrl = url.RequestContext.HttpContext.Request.Url;
            if (string.IsNullOrWhiteSpace(CdnHost) || requestUrl == null)
                return localPath;

            //Split off any existing querystring (e.g. on a crop URL) so that the '?' isn't escaped as part of the path
            var queryIndex = localPath.IndexOf('?');
            var path = queryIndex < 0 ? localPath : localPath.Substring(0, queryIndex);
            var query = queryIndex < 0 ? string.Empty : localPath.Substring(queryIndex + 1);

            //Apply a version number to help override caching
            if (VersionNumber > 0)
                query = (query.Length > 0 ? query + "&" : string.Empty) + "v=" + VersionNumber;

            //Use the same scheme as the current page to avoid browser warnings in HTTPS
            var uriBuilder = new UriBuilder
            {
                Scheme = requestUrl.Scheme,
                Port = -1,
                Host = CdnHost,
                Path = path,
                Query = query
            };

            return uriBuilder.ToString();
        }

        //Use a property to avoid repeated calls to ConfigurationManager
        private static string _cdnHost;
        //An empty string is used to identify an unspecified value
        private static string CdnHost
        {
            get { return _cdnHost ?? (_cdnHost = ConfigurationManager.AppSettings["CdnHost"] ?? string.Empty); }
        }

        //Use a property to avoid repeated calls to ConfigurationManager
        private static int? _versionNumber;
        private static int VersionNumber
        {
            get
            {
                if (!_versionNumber.HasValue)
                {
                    //A missing or invalid setting parses to 0, which means no version is appended
                    int foundNo;
                    int.TryParse(ConfigurationManager.AppSettings["CdnVersion"], out foundNo);
                    _versionNumber = foundNo;
                }

                return _versionNumber.Value;
            }
        }
    }

UrlHelper extension: CdnContent

As you can see from this code, it looks to the web.config for the hostname of the CDN and only changes the hostname if a CDN hostname is supplied. This means that in development the website can link to all files on the local server, and then in production it can easily be toggled to link to files on the CDN. It also matches the scheme of the current page, so HTTPS pages get HTTPS asset URLs, and it adds a version number to the querystring of the URL when one is specified as an AppSetting in the web.config. The querystring is important because your CDN and browser will treat a URL with a different querystring as a different file, so changing the version number lets us bypass any cache when we do a deployment (do check that query string caching is enabled on your CDN endpoint, as some CDNs ignore the querystring by default). As such we can set long cache lengths of 365 days if we like, because as soon as the version number changes the CDN and browser will re-download our files.
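For reference, the two AppSettings that the extension method reads might look like the fragment below. The key names CdnHost and CdnVersion come from the code above; the hostname and version number are placeholder values of my own.

```xml
<appSettings>
  <!-- Placeholder hostname: use your own CDN endpoint, or leave the value empty in development to link to local files -->
  <add key="CdnHost" value="cdn.example.com" />
  <!-- Placeholder version: increase this on each deployment to bypass cached copies -->
  <add key="CdnVersion" value="12" />
</appSettings>
```

With those values, a call such as @Url.CdnContent("~/css/site.css") on an HTTPS page should produce https://cdn.example.com/css/site.css?v=12.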

To use the extension method just call it with your asset's app-relative paths, media paths or GetCropUrl() calls, much like you might for @Url.Content. Of course, also make sure your code has referenced the namespace of the extension method. I put a reference in my /views/web.config to a common namespace used by my extension methods.
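As a sketch of that namespace reference, the relevant part of /views/web.config might look like this. MySite.Web.Extensions is a hypothetical namespace; substitute whichever namespace holds your UrlHelperExtensions class, and keep the other attributes and entries your file already has.

```xml
<system.web.webPages.razor>
  <pages>
    <namespaces>
      <!-- Hypothetical namespace containing the CdnContent extension method -->
      <add namespace="MySite.Web.Extensions" />
    </namespaces>
  </pages>
</system.web.webPages.razor>
```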

<link rel="icon" type="image/x-icon" href='@Url.CdnContent("~/assets/images/favicon.ico")'>

CdnContent usage

What about your dynamic content?

You can put your entire website behind a CDN, and set up your DNS so that your website's hostname maps straight to your CDN. The CDN would then be called for your web pages as well as any other dynamic content. I don't love this idea, even if the website is suitable, because you lose the ability to invalidate the cache with a version number change. There might be justification to do this if you were expecting some very heavy traffic for a short period of time.

My preferred approach is to continue serving all page requests from my web server, but setting up a short cache period for pages that don't change regularly and are the same for all users.

To add a cache control to a page you'll need to have your page rendered using a custom controller. The simplest solution is then to use the OutputCacheAttribute to set a cache period for the client. This gets you part of the way, but conditional caching of dynamic content is not supported by IIS, so after the cache has expired the full page will always be returned (200 status), even if it is unchanged and a 304 status would have done.

Fortunately you can manage the conditional caching yourself. The following example will do just that by using the page content's UpdateDate property as an indicator of freshness.

    public class MyUmbracoEventHandler : ApplicationEventHandler
    {
        protected override void ApplicationStarting(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
        {
            DefaultRenderMvcControllerResolver.Current.SetDefaultControllerType(typeof(DefaultController));
        }
    }

MyUmbracoEventHandler.cs


    /*Max age of one hour*/
    [OutputCache(Duration = 3600, VaryByParam = "*", Location = OutputCacheLocation.Client)]
    public class DefaultController : RenderMvcController
    {
        public override ActionResult Index(RenderModel model)
        {
            //Set when the current page was last modified
            Response.Cache.SetLastModified(model.Content.UpdateDate);

            //Check if the content has changed since the cached version
            if (!IsModifiedSince(model.Content.UpdateDate))
                return new HttpStatusCodeResult(304, "Not Modified");

            return base.Index(model);
        }

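        protected bool IsModifiedSince(DateTime contentModified)
        {
            var header = Request.Headers["If-Modified-Since"];

            if (header != null)
            {
                DateTime headerValue;
                if (DateTime.TryParse(header, out headerValue))
                {
                    //HTTP dates only have one-second precision, so drop any fraction of a second before comparing
                    var truncated = contentModified.AddTicks(-(contentModified.Ticks % TimeSpan.TicksPerSecond));
                    return headerValue < truncated;
                }
            }

            //No usable header, so treat the content as modified
            return true;
        }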
    } 

DefaultController.cs

Do note that the above example implements the default controller, so the code will be applied to all pages that don't have their own controller, which you may well not want. Additionally, the returned HTML may change without the relevant page's content being republished if the page displays external content. Instead of the UpdateDate property, you may prefer to use the last write date of the umbraco.config as a freshness indicator, or to return a 200 status if the 'Last-Modified' date is more than 1 day old (for example).
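The last of those alternatives could be sketched as follows. FreshnessPolicy and ShouldSendFullResponse are hypothetical names, and passing in the current time as a parameter is my own choice to keep the logic easy to test; it is one possible reading of the idea rather than a drop-in replacement for IsModifiedSince.

```csharp
using System;

// Hypothetical policy: send the full page if the client's copy is out of date
// OR if the date it holds is older than the allowed age (e.g. 1 day).
public static class FreshnessPolicy
{
    public static bool ShouldSendFullResponse(DateTime? ifModifiedSince, DateTime contentModified,
                                              DateTime now, TimeSpan maxValidatorAge)
    {
        // No date from the client means there is nothing to validate against.
        if (!ifModifiedSince.HasValue)
            return true;

        // The client's date is too old to trust, so re-send the page regardless.
        if (now - ifModifiedSince.Value > maxValidatorAge)
            return true;

        // HTTP dates only have one-second precision, so compare whole seconds.
        var truncated = contentModified.AddTicks(-(contentModified.Ticks % TimeSpan.TicksPerSecond));
        return ifModifiedSince.Value < truncated;
    }
}
```

A false result is the case where a 304 can be returned; anything else falls through to rendering the page as normal.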

Conclusion

Here are a few screenshots to show how performance of my site changed as a result of the CDN work, along with the caching of dynamic content.

Before

First page load - 516KB / 541ms
Second page load - 7.5KB / 436ms

After

First page load - 518KB / 394ms
Second page load - 73KB / 255ms

Using Windows Azure, and a minimal amount of code, you can have a much faster site, a more robust hosting environment, a happier user and a happier client. There is a small added complexity to the solution, but I believe CDNs are a 'no brainer' and should be part of the standard solution for any public facing website.

David Peck
