Dynamic Robots.txt in Sitecore

For some reason, one of the most common search terms that brings people to this blog is robots.txt, and since I have not written a post about it yet, I think it is about time. So, by popular demand, here is a simple take on a dynamic robots.txt handler in Sitecore.

The handler can be used both in a single-site scenario, where you just want to be able to edit the contents of the robots.txt file in Sitecore, and in a multi-site solution, where you need a different robots.txt for each site.

To handle incoming requests for the robots.txt file, we first need a suitable place to intercept the call. The easiest solution is to add a processor to the httpRequestBegin pipeline in Sitecore, so let us create one.

using System;
using System.IO;
using System.Runtime.Caching;
using System.Web;
using Sitecore;
using Sitecore.Data.Items;
using Sitecore.IO;
using Sitecore.Pipelines.HttpRequest;
using Sitecore.Web;

namespace TGC.Feature.SEO.Pipelines
{
    public class RobotsTextHandler : HttpRequestProcessor
    {
        private const string RobotsTextFileName = "/robots.txt";
        private const string DefaultRobotsTextCacheKey = "DefaultRobotsText";

        // Lifetime, in minutes, of the cached default robots.txt text.
        // Set from the <CacheExpiration> element in the processor configuration.
        public int CacheExpiration { get; set; }

        public override void Process(HttpRequestArgs args)
        {
            // Match on the path only, so a query string does not defeat the check.
            Uri url = HttpContext.Current.Request.Url;
            if (!url.AbsolutePath.EndsWith(RobotsTextFileName, StringComparison.OrdinalIgnoreCase))
                return;

            // Leave the request alone if no site or database has been resolved.
            SiteInfo siteInfo = Context.Site?.SiteInfo;
            if (siteInfo == null || Context.Database == null)
                return;

            Item startItem = Context.Database.GetItem(Context.Site.StartPath);

            // Prefer the field on the start item; fall back to the physical file.
            string fieldValue = startItem?["RobotsText"];
            string value = !string.IsNullOrEmpty(fieldValue) ? fieldValue : GetDefaultRobotsText();

            args.AbortPipeline();
            args.Context.Response.ClearContent();
            args.Context.Response.ContentType = "text/plain";
            args.Context.Response.Write(value);
            args.Context.Response.End();
        }

        // Returns the contents of the physical robots.txt file, or an empty
        // string if there is no such file. The file contents are cached.
        private string GetDefaultRobotsText()
        {
            string value = MemoryCache.Default.Get(DefaultRobotsTextCacheKey) as string;
            if (value != null)
                return value;

            if (FileUtil.Exists(RobotsTextFileName))
            {
                using (StreamReader streamReader = new StreamReader(FileUtil.OpenRead(RobotsTextFileName)))
                {
                    value = streamReader.ReadToEnd();
                }
                MemoryCache.Default.Set(DefaultRobotsTextCacheKey, value, DateTimeOffset.UtcNow.AddMinutes(CacheExpiration));
            }
            else
            {
                value = string.Empty;
            }
            return value;
        }
    }
}

The handler tries to read the text from the RobotsText field on the site's start item, that is, the item at Context.Site.StartPath (the rootPath combined with the startItem in the site definition). For this to work you need to add a suitable field, such as a Multi-Line Text field, to this item's template in Sitecore. If the field does not exist or is empty, the handler instead looks for a physical robots.txt file in the website root and uses that if it exists. If neither is available, it returns an empty response.
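
To make the multi-site case concrete, here is a rough sketch of two site definitions. The site names, host names and content paths are made up for illustration, so treat this as an assumption about how your solution might be set up rather than something to copy as is. Each site resolves to its own start item, so each one can carry its own RobotsText value.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <sites>
      <!-- Hypothetical site A: StartPath resolves to /sitecore/content/SiteA/Home -->
      <site name="site-a" patch:before="site[@name='website']"
            hostName="site-a.example.com"
            virtualFolder="/" physicalFolder="/"
            rootPath="/sitecore/content/SiteA" startItem="/Home"
            database="web" domain="extranet" />
      <!-- Hypothetical site B: StartPath resolves to /sitecore/content/SiteB/Home -->
      <site name="site-b" patch:before="site[@name='website']"
            hostName="site-b.example.com"
            virtualFolder="/" physicalFolder="/"
            rootPath="/sitecore/content/SiteB" startItem="/Home"
            database="web" domain="extranet" />
    </sites>
  </sitecore>
</configuration>
```

With definitions along these lines, a request for /robots.txt on site-a.example.com would be answered from the RobotsText field on /sitecore/content/SiteA/Home, and a request on site-b.example.com from the field on /sitecore/content/SiteB/Home. A site whose field is empty would get the physical file instead.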

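Now we just need to register the handler in the Sitecore configuration and we are good to go. The CacheExpiration value is in minutes and controls how long the contents of the physical robots.txt file are kept in the cache.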

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <httpRequestBegin>
        <processor patch:after="*[@type='Sitecore.Pipelines.HttpRequest.DatabaseResolver, Sitecore.Kernel']" type="TGC.Feature.SEO.Pipelines.RobotsTextHandler, TGC.Feature.SEO">
          <CacheExpiration>720</CacheExpiration>
        </processor>
      </httpRequestBegin>
    </pipelines>
  </sitecore>
</configuration>
