Lift sitemap.xml generation

Delivering sitemap.xml in Lift

Delivering a sitemap to Google or other search bots is usually not a big deal, and this is true for Lift as well. However, there are few resources online that show how to do it. This post therefore covers two essential methods to create and deliver a sitemap.xml. The first method takes a given sitemap.xml file and serves it to the client as it is. The second method handles dynamic sitemaps, which are mostly useful for blogs, product catalogues, and articles.

Static Sitemap files

First of all, I would like to introduce a simple file delivery mechanism for Lift. In this case, we have a static file committed to our project sources. The easiest way is to place the file in the static files folder, since Lift has a default rule that allows direct links into this directory. If this fits you well, that is fine. However, in most cases search engines look for a sitemap under 'mywebsite.com/sitemap.xml'. Because we placed the file in our static folder, the URI would be 'mywebsite.com/static/sitemap.xml'. As you can see, automatic sitemap discovery is limited for any search engine that checks the default location. If you only care about Google, you can simply tell it the path to your sitemap and go with it.

What we really want here is a solution that fits the default case for all search engines. Just placing the file in the root folder is not enough: Lift will not allow you to access the file by typing in mywebsite.com/sitemap.xml. We need to add the permission explicitly by telling Lift not to handle this request itself, so that the container can serve the file directly:

LiftRules.liftRequest.append {
  case Req(List("sitemap"), "xml", _) => false
}

Dynamic files

In this scenario, we have to handle big sitemap.xml files: those you don't want to embed into the application itself, both because of the file's size and because of the dynamic nature of the sitemap.xml. A common use case is a generated sitemap.xml that is created by iterating through articles, products, or blog posts and writing the correct links into the file. The file itself is placed somewhere on the server disk where your Lift application can read it.
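The generation step described above could look roughly like this. It is only a minimal sketch using nothing but the JDK: SitemapGenerator, SitemapEntry and the target path are hypothetical names that do not come from this post, and in a real application the entries would be built from your articles, products, or blog posts.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object SitemapGenerator {
  // Hypothetical entry type: an absolute URL plus an optional
  // last-modified date in W3C date format (e.g. "2024-01-31").
  final case class SitemapEntry(loc: String, lastmod: Option[String] = None)

  // Entity-escape the characters the sitemap protocol requires to be escaped.
  // The ampersand must be replaced first so later entities are not re-escaped.
  private def escape(s: String): String =
    s.replace("&", "&amp;")
      .replace("<", "&lt;")
      .replace(">", "&gt;")
      .replace("\"", "&quot;")
      .replace("'", "&apos;")

  // Render all entries into one sitemap document.
  def render(entries: Seq[SitemapEntry]): String = {
    val header = Seq(
      """<?xml version="1.0" encoding="UTF-8"?>""",
      """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">"""
    )
    val urls = entries.map { e =>
      val lastmod = e.lastmod.map(d => s"<lastmod>${escape(d)}</lastmod>").getOrElse("")
      s"  <url><loc>${escape(e.loc)}</loc>$lastmod</url>"
    }
    val footer = Seq("</urlset>")
    (header ++ urls ++ footer).mkString("\n")
  }

  // Write the rendered sitemap to disk as UTF-8, replacing any existing file.
  def write(entries: Seq[SitemapEntry], target: Path): Path =
    Files.write(target, render(entries).getBytes(StandardCharsets.UTF_8))
}
```

Calling SitemapGenerator.write from a scheduled job would refresh the file on disk, and the application then only has to deliver whatever file is currently there.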

As you may already know, streaming a file to the client is the best thing you can do here. But how do we handle this case in Lift? The answer is "as a service". We create a new REST service (it extends RestHelper) that pattern matches on the request path "/sitemap", reads the file from disk, and streams it to the client:

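  import net.liftweb.common.Box
  import net.liftweb.http.{LiftResponse, StreamingResponse}
  import net.liftweb.http.rest.RestHelper
  import net.liftweb.util.Helpers.tryo

  object SitemapService extends RestHelper {
    // Answer GET requests whose path starts with "sitemap"
    serve {
      case "sitemap" :: _ Get req =>
        fileResponse(SitemapProperties.path)
    }

    // Open the file and wrap it in a streaming response;
    // a file that cannot be opened results in a Failure instead of a response
    private def fileResponse(path: String): Box[LiftResponse] = for {
      file <- Box !! new java.io.File(path)
      input <- tryo(new java.io.FileInputStream(file))
    } yield StreamingResponse(input,
      () => input.close(), // close the stream once the response has been written
      file.length,
      ("Content-Disposition" -> "attachment; filename=sitemap.xml; type=application/xml") :: Nil,
      Nil, 200)
  }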

Here, we use the default java.io.File to locate our file on disk and open it as a FileInputStream (of course, you can change or improve this to fit your needs). SitemapProperties.path is a helper object I created for this example. You can define a static or dynamic path to the files you would like to serve. The main part is the StreamingResponse built right into Lift. By providing the response header, the file length, and the stream itself, we can serve a sitemap file to the client. Finally, we add the service to our Boot file to enable the delivery of the file:

LiftRules.dispatch.append(SitemapService)
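The SitemapProperties helper itself is not shown in this post. One possible shape, assuming the location is supplied through a JVM system property, is sketched below; the property name sitemap.path and the fallback location are made up for illustration.

```scala
object SitemapProperties {
  // Hypothetical property name and fallback location;
  // neither comes from the original example.
  private val PropertyName = "sitemap.path"
  private val DefaultPath = "/var/data/sitemap.xml"

  // Evaluated on every call, so a later change of the
  // system property is picked up by the next request.
  def path: String = sys.props.getOrElse(PropertyName, DefaultPath)
}
```

Starting the application with -Dsitemap.path=/some/other/sitemap.xml would then point the service at a different file without touching the code.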

Our work is not done yet. What we wanted was a proper mywebsite.com/sitemap.xml request. So far, we have only implemented a REST service for mywebsite.com/sitemap, so the ".xml" suffix is still missing. To get this part working as well, let's add a rewrite to our Boot:

// Rewrite /sitemap.xml to /sitemap
LiftRules.statelessRewrite.prepend(NamedPF("SitemapRewrite") {
  case RewriteRequest(
    ParsePath("sitemap.xml" :: Nil, _, _, _), _, _) =>
      RewriteResponse(
        "sitemap" :: Nil, Map("sitemap" -> "sitemap")  // Use /sitemap as request
      )
})

This additional rule rewrites all requests for /sitemap.xml to /sitemap, which is exactly what our REST service handles. This way, you can easily serve big(ger) files.

If you have any additional information, comments or any problems during the implementation of the above code, just leave a note or ping us on Twitter.