In this post, you are going to learn the following things.
a. What is a sitemap
b. What is the sitemap protocol
c. Who uses sitemaps
d. How to generate a sitemap using the sitemapgen4j library
What is a sitemap?
A sitemap represents all the pages of a website. Usually it is an XML file that lists all the URLs of your website. For example, my blog has a sitemap that lists all of its web pages.
What is the sitemap protocol?
Google developed the sitemap protocol. Using it, web developers can tell search engines about the web pages of their sites.
A sample sitemap looks like this.
<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://self-learning-java-tutorial.blogspot.com/2018/03/base64-encoding-and-decoding-in-java.html</loc>
    <lastmod>2018-03-06T14:04:57Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
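To get a feel for how a program might read such a file, here is a minimal sketch that uses only the DOM parser shipped with the JDK to collect every <loc> value from a sitemap. The class name SitemapReader and the method extractLocations are illustrative names of my own, not part of any library, and the example.com URLs are placeholders.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class SitemapReader {

    /* Namespace used by the sitemap protocol (taken from the sample above) */
    private static final String SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9";

    /* Returns the text of every <loc> element found in the given sitemap XML */
    public static List<String> extractLocations(String sitemapXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(sitemapXml.getBytes(StandardCharsets.UTF_8)));

            NodeList locNodes = document.getElementsByTagNameNS(SITEMAP_NS, "loc");
            List<String> locations = new ArrayList<>();
            for (int i = 0; i < locNodes.getLength(); i++) {
                locations.add(locNodes.item(i).getTextContent().trim());
            }
            return locations;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable sitemap", e);
        }
    }

    public static void main(String[] args) {
        /* Placeholder sitemap with two hypothetical pages */
        String xml = "<?xml version='1.0' encoding='UTF-8'?>"
                + "<urlset xmlns=\"" + SITEMAP_NS + "\">"
                + "<url><loc>https://example.com/a.html</loc></url>"
                + "<url><loc>https://example.com/b.html</loc></url>"
                + "</urlset>";

        for (String location : extractLocations(xml)) {
            System.out.println(location);
        }
    }
}
```

Because the lookup is done by namespace and local name, the same method would also pick up the <loc> entries of a sitemap index file.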
A sitemap file can refer to other sitemap files.
<?xml version='1.0' encoding='UTF-8'?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://self-learning-java-tutorial.blogspot.com/sitemap.xml?page=1</loc>
  </sitemap>
  <sitemap>
    <loc>https://self-learning-java-tutorial.blogspot.com/sitemap.xml?page=2</loc>
  </sitemap>
</sitemapindex>
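As a rough illustration of the index structure, the sketch below builds a <sitemapindex> document from a list of sitemap locations with plain string handling. SitemapIndexBuilder, buildIndex and escapeXml are hypothetical names I made up for this example, and the URLs are placeholders. Note that characters such as & must be escaped when they appear inside <loc>.

```java
import java.util.Arrays;
import java.util.List;

public class SitemapIndexBuilder {

    private static final String SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9";

    /* Escapes the five characters that are special in XML text */
    static String escapeXml(String value) {
        return value.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    /* Builds a sitemap index that points to each of the given sitemap files */
    public static String buildIndex(List<String> sitemapLocations) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version='1.0' encoding='UTF-8'?>\n");
        xml.append("<sitemapindex xmlns=\"").append(SITEMAP_NS).append("\">\n");
        for (String location : sitemapLocations) {
            xml.append("  <sitemap>\n");
            xml.append("    <loc>").append(escapeXml(location)).append("</loc>\n");
            xml.append("  </sitemap>\n");
        }
        xml.append("</sitemapindex>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        /* Placeholder locations of two hypothetical sitemap files */
        List<String> locations = Arrays.asList(
                "https://example.com/sitemap.xml?page=1",
                "https://example.com/sitemap.xml?page=2");
        System.out.print(buildIndex(locations));
    }
}
```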
The table below summarizes the important tags of a sitemap file.
Tag | Description
<urlset> | All the URLs of your website are specified inside this tag.
<url> | Information about one URL, such as its location and last modification date, is specified inside this tag.
<sitemapindex> | The locations of other sitemaps are specified inside this tag.
<sitemap> | Specifies the details of one sitemap.
<loc> | Specifies the location of the URL (or) sitemap.
<lastmod> | Specifies the last modification date of the URL. The date should be in ISO 8601 format.
<changefreq> | Tells how frequently the page is likely to change. It can be one of these values: always, hourly, daily, weekly, monthly, yearly, never.
<priority> | Priority of the URL relative to other URLs in the same website.
The tags <lastmod>, <changefreq> and <priority> are optional.
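The sketch below shows one way these tags could fit together when a <url> entry is written by hand: the optional tags are emitted only when a value is given, <lastmod> is formatted as an ISO 8601 timestamp, and <changefreq> is checked against the values listed in the table. SitemapUrlEntry and buildUrlEntry are hypothetical names of my own. The 0.0 to 1.0 range check on <priority> comes from the sitemaps.org protocol description, not from the table above.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class SitemapUrlEntry {

    /* Allowed <changefreq> values, as listed in the table above */
    private static final List<String> CHANGE_FREQUENCIES = Arrays.asList(
            "always", "hourly", "daily", "weekly", "monthly", "yearly", "never");

    /*
     * Builds one <url> entry. Pass null for lastMod, changeFreq or priority
     * to leave out the corresponding optional tag. The loc value is assumed
     * to be XML-escaped already.
     */
    public static String buildUrlEntry(String loc, Instant lastMod, String changeFreq, Double priority) {
        if (changeFreq != null && !CHANGE_FREQUENCIES.contains(changeFreq)) {
            throw new IllegalArgumentException("Unknown changefreq: " + changeFreq);
        }
        /* Range taken from the sitemaps.org protocol description */
        if (priority != null && (priority < 0.0 || priority > 1.0)) {
            throw new IllegalArgumentException("priority must be between 0.0 and 1.0");
        }

        StringBuilder xml = new StringBuilder();
        xml.append("<url>\n");
        xml.append("  <loc>").append(loc).append("</loc>\n");
        if (lastMod != null) {
            /* ISO 8601 in UTC, for example 2018-03-06T14:04:57Z */
            String iso = DateTimeFormatter.ISO_INSTANT.format(lastMod.truncatedTo(ChronoUnit.SECONDS));
            xml.append("  <lastmod>").append(iso).append("</lastmod>\n");
        }
        if (changeFreq != null) {
            xml.append("  <changefreq>").append(changeFreq).append("</changefreq>\n");
        }
        if (priority != null) {
            xml.append("  <priority>").append(String.format(Locale.ROOT, "%.1f", priority)).append("</priority>\n");
        }
        xml.append("</url>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        /* Placeholder URL with all optional tags set */
        System.out.print(buildUrlEntry("https://example.com/page.html",
                Instant.parse("2018-03-06T14:04:57Z"), "daily", 0.8));

        /* Only the mandatory <loc> tag */
        System.out.print(buildUrlEntry("https://example.com/", null, null, null));
    }
}
```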
Who uses sitemaps?
Sitemaps are submitted to search engines. Search engines use the sitemaps while indexing the content of your website.
How to generate a sitemap using the sitemapgen4j library?
Sitemapgen4j is a Java library used to generate XML sitemaps.
I am going to use the Maven dependency below.
<!-- https://mvnrepository.com/artifact/com.github.dfabulich/sitemapgen4j -->
<dependency>
    <groupId>com.github.dfabulich</groupId>
    <artifactId>sitemapgen4j</artifactId>
    <version>1.0.6</version>
</dependency>
The step-by-step procedure below explains how to generate a simple sitemap.
Get an instance of 'WebSitemapGenerator'.
WebSitemapGenerator webSitemapGenerator = WebSitemapGenerator.builder(WEB_PAGE_URL, new File(LOCAL_FOLDER_PATH)).build();
Create a sitemap url using the 'WebSitemapUrl' class.
WebSitemapUrl sitemapUrl1 = new WebSitemapUrl.Options(url).lastMod(modifiedDate).priority(priority).changeFreq(changeFrequency).build();
Add the sitemap url.
webSitemapGenerator.addUrl(sitemapUrl1);
Find the working application below.
SitemapUtil.java
package com.sample.util;

import java.io.File;
import java.util.Date;

import com.redfin.sitemapgenerator.ChangeFreq;
import com.redfin.sitemapgenerator.WebSitemapGenerator;
import com.redfin.sitemapgenerator.WebSitemapUrl;

public class SitemapUtil {

    private static final String WEB_PAGE_URL = "https://self-learning-java-tutorial.blogspot.com";
    private static final String LOCAL_FOLDER_PATH = "C:\\Users\\krishna\\Miscellaneous";

    public static void main(String[] args) throws Exception {
        /* Get the instance of WebSitemapGenerator */
        WebSitemapGenerator webSitemapGenerator = WebSitemapGenerator
                .builder(WEB_PAGE_URL, new File(LOCAL_FOLDER_PATH))
                .build();

        String url = "https://self-learning-java-tutorial.blogspot.com/2018/03/base64-encoding-and-decoding-in-java.html";
        double priority = 0.8;
        ChangeFreq changeFrequency = ChangeFreq.YEARLY;
        Date modifiedDate = new Date();

        /* Create an instance of WebSitemapUrl */
        WebSitemapUrl sitemapUrl1 = new WebSitemapUrl.Options(url)
                .lastMod(modifiedDate)
                .priority(priority)
                .changeFreq(changeFrequency)
                .build();

        /* Add the urls to webSitemapGenerator */
        webSitemapGenerator.addUrl(sitemapUrl1);
        webSitemapGenerator.addUrl("https://self-learning-java-tutorial.blogspot.com/2018/03/how-to-write-multiple-input-streams-to.html");

        /* Write the sitemap file to LOCAL_FOLDER_PATH */
        webSitemapGenerator.write();
    }
}
When you run the above application, it writes a 'sitemap.xml' file to the folder given by LOCAL_FOLDER_PATH.