
Lift sitemap.xml generation


Delivering sitemap.xml in Lift

Delivering a sitemap to Google or other search bots is usually not a big deal, and that holds for Lift as well. However, there are few resources online that explain how to do it. This post therefore covers two methods for delivering a sitemap.xml. The first takes an existing sitemap.xml file and serves it to the client as is. The second handles dynamic sitemaps, which are mostly useful for blogs, product catalogs and article collections.

Static Sitemap files

First of all, I would like to introduce a simple file delivery mechanism for Lift. In this case, we have a static file committed to our project sources. The easiest option is to place the file in the static files folder, since Lift has a default rule that allows direct links into this directory. If that works for you, fine. In most cases, however, search engines look for a sitemap at 'mywebsite.com/sitemap.xml', whereas a file in the static folder ends up at 'mywebsite.com/static/sitemap.xml'. Search engines that only check the default location will therefore not discover the sitemap automatically. If you only care about Google, you can simply tell it the path to your sitemap and leave it at that.

What we really want here is a solution to fit our default case for all search engines. Just placing the file into the root folder is not enough. Lift will not allow you to just access the file by typing in mywebsite.com/sitemap.xml. We need to allow this access explicitly:

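// In Boot.boot: returning false tells Lift not to handle this request,
// so the container serves /sitemap.xml itself.
LiftRules.liftRequest.append {
  case Req(List("sitemap"), "xml", _) => false
}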

Dynamic files

In this scenario we have to handle big sitemap.xml files, the kind you do not want to embed in your application (because of the file's size and the dynamic nature of the sitemap.xml). A common use case is a generated sitemap.xml that is created by iterating through articles, products, blog posts (...) and writing the corresponding links into the file. The file itself is placed somewhere on the server's disk where your Lift application can read it.
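The generation step itself is not part of this post. As a rough sketch of how such a file could be produced, here is a version that uses only the Scala and Java standard libraries. SitemapEntry and SitemapWriter are hypothetical names made up for this illustration (they are not part of Lift), and the entries would come from wherever your articles or products are stored.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, StandardCopyOption}

// Hypothetical entry type: one URL plus an optional last-modified date string.
final case class SitemapEntry(loc: String, lastmod: Option[String] = None)

// Hypothetical helper, not part of Lift.
object SitemapWriter {

  // Entity-escape the characters that must not appear raw in XML text.
  def escape(s: String): String = s.flatMap {
    case '&'  => "&amp;"
    case '<'  => "&lt;"
    case '>'  => "&gt;"
    case '"'  => "&quot;"
    case '\'' => "&apos;"
    case c    => c.toString
  }

  // Build the whole document as a string: header, one <url> per entry, footer.
  def render(entries: Seq[SitemapEntry]): String = {
    val header = Seq(
      """<?xml version="1.0" encoding="UTF-8"?>""",
      """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">"""
    )
    val urls = entries.map { e =>
      val lastmod = e.lastmod.map(d => s"<lastmod>${escape(d)}</lastmod>").getOrElse("")
      s"  <url><loc>${escape(e.loc)}</loc>$lastmod</url>"
    }
    (header ++ urls ++ Seq("</urlset>")).mkString("\n")
  }

  // Write to a temporary file in the target directory, then move it into place,
  // so a request arriving during generation is less likely to see a partial file.
  def write(target: Path, entries: Seq[SitemapEntry]): Unit = {
    val dir = target.toAbsolutePath.getParent
    val tmp = Files.createTempFile(dir, "sitemap", ".tmp")
    Files.write(tmp, render(entries).getBytes(StandardCharsets.UTF_8))
    Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING)
  }
}
```

A scheduled job could call SitemapWriter.write with the path your Lift application reads from. For very large sitemaps you would probably write entry by entry to a buffered writer instead of building one big string.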

As you may already know, streaming the file to the client is the best approach here. How do we handle this in Lift? The answer is "as a service". We create a new REST service (extending RestHelper) that pattern matches on the request path "/sitemap", reads the file from disk and streams it to the client:

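  import net.liftweb.common.Box
  import net.liftweb.http.{LiftResponse, StreamingResponse}
  import net.liftweb.http.rest.RestHelper
  import net.liftweb.util.Helpers.tryo

  object SitemapService extends RestHelper {
    serve {
      // Matches GET requests whose path starts with "sitemap"
      case "sitemap" :: _ Get _ =>
        fileResponse(SitemapProperties.path)
    }

    // Opens the file and wraps it in a StreamingResponse. If the file cannot
    // be opened, tryo turns the exception into a Failure instead of throwing.
    private def fileResponse(path: String): Box[LiftResponse] = for {
      file <- Box !! new java.io.File(path)
      input <- tryo(new java.io.FileInputStream(file))
    } yield StreamingResponse(input,
      () => input.close(),
      file.length,
      ("Content-Type" -> "application/xml") ::
      ("Content-Disposition" -> "attachment; filename=sitemap.xml; type=application/xml") :: Nil,
      Nil, 200)
  }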

Here, we use the standard java.io.File to locate the file on disk and open it as a FileInputStream (of course you can change or improve this to suit your needs). SitemapProperties.path is a helper object I created for this example; you can use it to define a static or dynamic path to the files you would like to serve. The main part is the StreamingResponse built right into Lift. By providing the response headers, the file length and the stream itself, we can serve a sitemap file to the client. Finally, we add the service to our Boot file to enable serving of the file:

LiftRules.dispatch.append(SitemapService)
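The SitemapProperties helper used in the service is not shown in this post. A minimal sketch of what it might look like, assuming the path comes from a JVM system property called sitemap.path with a fallback location (both the property name and the default path are made up for this example):

```scala
// Hypothetical stand-in for the SitemapProperties helper used in the service.
object SitemapProperties {
  // Assumed default location; adjust it to wherever your generator writes the file.
  private val DefaultPath = "/var/data/sitemap.xml"

  // Read the path on every call so it can change without a restart,
  // e.g. when started with -Dsitemap.path=/some/other/sitemap.xml
  def path: String = sys.props.getOrElse("sitemap.path", DefaultPath)
}
```

Because path is a def rather than a val, it is evaluated on each request, which is one simple way to get the "dynamic path" behaviour.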

Our work is not done yet. What we wanted was a proper mywebsite.com/sitemap.xml request, but so far we have only implemented a REST service for mywebsite.com/sitemap; the ".xml" suffix is missing. To get this part working as well, let's add a rewrite to our Boot:

// Rewrite /sitemap.xml to /sitemap
LiftRules.statelessRewrite.prepend(NamedPF("SitemapRewrite") {
  case RewriteRequest(ParsePath("sitemap.xml" :: Nil, _, _, _), _, _) =>
    // Use /sitemap as request
    RewriteResponse("sitemap" :: Nil, Map("sitemap" -> "sitemap"))
})

This additional rule rewrites all requests for /sitemap.xml to /sitemap, which our REST service already handles. This way you can easily serve big(ger) files.

If you have any additional information, comments or problems with the implementation of the above code, just leave a note or ping us on Twitter.