A sitemap brings its own set of advantages to a website. Besides smoothing website navigation and improving visibility to search engines, it lets the website promptly inform search engines about any changes on the site. Changed pages therefore tend to be indexed faster than they would be without a sitemap. A sitemap also reduces your dependence on external links for bringing search engines to your website. While it is not advisable to have errors such as broken links or orphaned pages on your site, a sitemap can help in those cases too, when you have mistakenly failed to fix them.
So, if your site happens to have a couple of broken internal links or orphaned pages that cannot be reached in any other way, a sitemap can help your visitors reach them as well. Still, it is always better to keep such errors off your website in the first place.
So, in this article, I plan to discuss how to generate a sitemap and add it to your Rails application.
Generate Sitemap:
Required Gem:
Sitemap generator: https://github.com/kjvarga/sitemap_generator
SitemapGenerator is the easiest way to generate Sitemaps in Ruby. Rails integration provides access to the Rails route helpers within our sitemap config file and automatically makes the rake tasks available to us. Or if we prefer to use another framework, we can! We can use the rake tasks provided or run our sitemap configs as plain Ruby scripts.
Sitemaps XML format:
The Sitemap protocol format consists of XML tags. All data values in a Sitemap must be entity-escaped. The file itself must be UTF-8 encoded.
The Sitemap must:
Begin with an opening <urlset> tag and end with a closing </urlset> tag.
Specify the namespace (protocol standard) within the <urlset> tag.
Include a <url> entry for each URL, as a parent XML tag.
Include a <loc> child entry for each <url> parent tag.
All other tags are optional. Also, all URLs in a Sitemap must be from a single host, such as www.xyz.com or estore.xyz.com.
For more details: https://www.sitemaps.org/protocol.html
How to add a sitemap to a Rails app:
1) View for your sitemap:
# app/views/mysitemap/index.xml.erb
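A minimal sketch of what such a view might contain, written as a standalone Ruby script so it can be run outside Rails. The `Article` struct, the `/articles/:id` URL shape, and the `base_url` parameter are assumptions made for illustration. In a real Rails view the route helpers (for example `article_url`) would build the URL instead, and the template string would live in the `.xml.erb` file.

```ruby
require 'erb'
require 'time'

# Hypothetical stand-in for the Article model used by the controller.
Article = Struct.new(:id, :updated_at)

# The template mirrors the protocol rules: a <urlset> root with the
# namespace, one <url> per page, and a <loc> child inside each <url>.
SITEMAP_TEMPLATE = <<~'XML'
  <?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <% articles.each do |article| %>
    <url>
      <loc><%= ERB::Util.html_escape("#{base_url}/articles/#{article.id}") %></loc>
      <lastmod><%= article.updated_at.utc.iso8601 %></lastmod>
    </url>
  <% end %>
  </urlset>
XML

# Renders the sitemap XML for the given articles.
# html_escape entity-escapes &, <, >, and quotes in the URL.
def render_sitemap(articles, base_url)
  ERB.new(SITEMAP_TEMPLATE).result_with_hash(articles: articles, base_url: base_url)
end

if __FILE__ == $PROGRAM_NAME
  articles = [Article.new(1, Time.now), Article.new(2, Time.now)]
  puts render_sitemap(articles, 'http://localhost:3000')
end
```

Only `<loc>` is required by the protocol; `<lastmod>` is included here because the sample data later in the article carries it.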
2) In your controller:
Suppose the view uses an @articles variable. It gets that variable from the Mysitemap controller:
# app/controllers/mysitemap_controller.rb
class MysitemapController < ApplicationController
  # Render the XML without the application layout.
  layout false

  def index
    headers['Content-Type'] = 'application/xml'
    respond_to do |format|
      format.xml { @articles = Article.all }
    end
  end
end
3) Add a route:
# config/routes.rb
get 'sitemap.xml', :to => 'mysitemap#index', :defaults => {:format => 'xml'}
How to convert XML file to HTML:
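A sample XML file:
# test1.xml
<urlset>
  <url>
    <loc>http://localhost:3000/magazines</loc>
    <lastmod>2016-10-03T12:40:39+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://localhost:3000/magazines/1</loc>
    <lastmod>2015-05-07T04:00:00+00:00</lastmod>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://localhost:3000/magazines/2</loc>
    <lastmod>2015-05-07T04:00:00+00:00</lastmod>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://localhost:3000/magazines/4</loc>
    <lastmod>2015-05-07T04:00:00+00:00</lastmod>
    <priority>1.0</priority>
  </url>
</urlset>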
1) Using a Ruby snippet with the Nokogiri gem:
Installing Nokogiri:
https://nokogiri.org/tutorials/installing_nokogiri.html
Code Snippet:
require 'nokogiri'
doc = Nokogiri::XML(File.read('test1.xml'))
doc.remove_namespaces! # lets //url/loc match even when <urlset> declares xmlns
doc.xpath('//url/loc').each do |node|
  puts node.inner_text
end
2) Using JavaScript:
Add a table inside the <body> tag;
Include this script;
3) Using XSL file:
Create an XSL file:
# test_style_sheet.xsl
My Sitemap links Collection
Sitemap | Last Modified |
---|---|
Your View file:
# mysitemap.rb
# Reads the sitemap at the URL given on the command line and submits every
# page listed in each linked sitemap to the Wayback Machine.
require 'wayback_archiver'
require 'sitemap-parser'
require 'open-uri'
require 'nokogiri'

sitemap_url = ARGV[0]
unless sitemap_url.nil?
  doc = Nokogiri::XML(URI.open(sitemap_url))
  doc.remove_namespaces!
  doc.xpath('//url/loc').each do |node|
    # Each <loc> value is handed to SitemapParser as a sitemap address.
    SitemapParser.new(node.content).to_a.each do |link|
      WaybackArchiver.archive(link, :url)
    end
  end
end
Run the script from the command line:
ruby mysitemap.rb URL
Replace URL with the address of the sitemap.
The XPath in the snippet may need changes depending on the node tag names in your sitemap.
Validate the sitemap & submit it to Google:
Register your site on Google Webmaster Tools (now called Google Search Console).
From there, we can validate and submit our sitemap for crawling.
Finally, we should be able to see the number of URLs in our sitemap.