Deepali Ingle

How to Use Selenium WebDriver to Test and Identify Broken Links on a Web Page?





Checking the links on a web page for their proper functionality is a crucial part of the testing process. In this blog, we will discuss the importance of testing all the links on a web page and how to identify broken links using Selenium WebDriver. Selenium WebDriver is a powerful tool for automating web application testing. Leveraging its capabilities to perform link testing can significantly improve the efficiency and reliability of the testing process. Throughout this blog, we will explore the practical implementation of Selenium WebDriver to systematically test each link on a web page, capture their responses, and effectively identify any broken links.

 

Why is it necessary to test the links on a web page?

Checking links on a web page during testing is important for several reasons. First, it ensures that all the links on the page are working properly and directing users to the correct destination web pages. This is essential for providing a smooth and hassle-free experience for visitors, allowing them to navigate the website with ease. Moreover, checking the links helps in identifying and addressing any broken links. Broken links not only disrupt the user's journey through the website but also have a negative impact on the website's credibility. From a testing perspective, verifying links also contributes to the overall quality and reliability of the website.

Before delving into the concept of broken links, let's first understand how a request and its response work between a client and a server.


Request and Response Model

The request and response model shown above is a fundamental concept in web development. When a client, such as a web browser, sends a request to a server, the server processes the request and returns a response. The request typically includes the request method (GET, POST, etc.), the URL, and any additional data, while the response contains the requested data along with metadata such as the status code and headers. All the response codes and their corresponding response messages are shown in the above image. If the response code is 400 or greater (4xx for client errors, 5xx for server errors), the request has failed, which usually means the URL you requested is invalid or unavailable. This is where broken links come into the picture.
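To make the status code classes concrete, here is a minimal sketch that maps a response code to its class and applies the "400 or greater means broken" rule described above. The class StatusCodeClassifier and its methods are hypothetical helpers written for illustration; they are not part of Selenium or the JDK.

```java
// Hypothetical helper for illustration: groups HTTP status codes by class.
class StatusCodeClassifier {

    // Returns the class of an HTTP status code, based on its hundreds digit.
    static String classify(int statusCode) {
        if (statusCode >= 100 && statusCode < 200) return "Informational";
        if (statusCode >= 200 && statusCode < 300) return "Success";
        if (statusCode >= 300 && statusCode < 400) return "Redirection";
        if (statusCode >= 400 && statusCode < 500) return "Client error";
        if (statusCode >= 500 && statusCode < 600) return "Server error";
        return "Unknown";
    }

    // A link is treated as broken when the response code is 400 or greater.
    static boolean isBroken(int statusCode) {
        return statusCode >= 400;
    }

    public static void main(String[] args) {
        int[] sampleCodes = {200, 301, 404, 503};
        for (int code : sampleCodes) {
            System.out.println(code + " -> " + classify(code) + ", broken: " + isBroken(code));
        }
    }
}
```

Note that redirects (3xx) are not counted as broken here, since the browser can still reach a destination page.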

 

What does "Broken Link" mean?

A broken link is a hyperlink on a website that points to a page or resource that is no longer available or does not exist. When users click on a broken link, they are directed to an error page (with a response code of 400 or above) instead of the expected destination web page.

Here are some of the common reasons for broken links on a web page:

  • The linked page or resource has been removed, deleted, or relocated without proper redirection.

  • The website hosting the linked page or resource is experiencing downtime or is temporarily unavailable.

  • The URL of the linked page or resource has been changed, or the URL in the hyperlink markup contains a typing error.

  • Changes in website structure or navigation that render existing links invalid.

  • External websites being linked to may have removed or modified the content, leading to broken links.

  • The linked page or resource is restricted or requires authentication, resulting in a broken link for unauthorized users.

 

How can broken links on a webpage be identified using Selenium WebDriver?

Automating the process of testing for broken links has several benefits. It saves time and effort, as manually checking numerous links can be time-consuming. Automation tools like Selenium WebDriver can collect every link on a page, so each one can be checked quickly. Automation also gives consistent and repeatable results, which significantly reduces the possibility of human error. Please refer to the following code for automation.

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;


public class CheckBrokenLink {

    public static void main(String[] args) {

        WebDriver driver = new ChromeDriver();

        // Navigate to the OrangeHRM website
        driver.get("https://www.orangehrm.com/");

        // Find all the links (anchor tags) on the web page and store them in total_links
        List<WebElement> total_links = driver.findElements(By.tagName("a"));

        // Print the total number of links present on the web page
        System.out.println("Total number of links present on web page: " + total_links.size());

        // Iterate over each link and check its response code by
        // passing its URL to the validateLink() method
        for (WebElement link : total_links) {
            String individual_url = link.getAttribute("href");
            if (individual_url == null || individual_url.isEmpty()) {
                System.out.println("URL is empty or null.");
                continue;
            }
            validateLink(individual_url);
        }

        driver.quit();
    }

    public static void validateLink(String url) {
        try {
            // The URL class represents a Uniform Resource Locator, a pointer to a
            // "resource" on the World Wide Web
            @SuppressWarnings("deprecation")
            URL link = new URL(url);

            // HttpURLConnection is an abstract class, so we cannot instantiate it
            // directly; openConnection() returns a concrete instance that we cast
            HttpURLConnection httpURLConnection = (HttpURLConnection) link.openConnection();
            httpURLConnection.setConnectTimeout(4000);
            httpURLConnection.connect();

            // Validate the response code: 4xx (client error) and
            // 5xx (server error) mean the link is broken
            if (httpURLConnection.getResponseCode() >= 400) {
                System.out.println(url + " ---- " + httpURLConnection.getResponseMessage() + " - " + "Link is a broken link");
            } else {
                System.out.println(url + " ---- " + httpURLConnection.getResponseMessage() + " - " + "Link is valid");
            }
        } catch (Exception e) {
            // Malformed URLs, timeouts, and failed connections end up here
            System.out.println(url + " ---- " + "Link is a broken link");
        }
    }
}

  • In the code above, we first find all the links by using the anchor tag 'a', because every hyperlink is declared with it, and store them in a list called total_links.

    • Below is an example of a link declared with the anchor tag: <a href = "/en/solution/performance-management/">Performance Management</a>

  • Then, to read the actual URL value of each link, the link.getAttribute("href") method is used.

  • Since we are sending HTTP requests programmatically, the HttpURLConnection class is used, which is an abstract class that extends the URLConnection class.

  • The link.openConnection() method returns a connection object for the specified URL, and the connect() method then opens the actual connection to the server.

  • After that, we validate the response code: if the code is 400 or greater, then it's a broken link.

  • The code is surrounded by a try-catch block so that malformed URLs, timeouts, and any other exceptions are caught and the link is reported as broken. Null or empty URLs are skipped earlier, inside the loop.
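One practical refinement to the steps above: a page usually contains hrefs that are not HTTP links at all (for example mailto: or javascript: entries), and the same URL often appears several times. Such entries would typically fall into the catch block and be reported as broken, or be checked more than once. The sketch below filters the collected href values before validation. LinkFilter and checkableUrls are hypothetical names introduced for illustration, and the sample hrefs in main are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration: prepares href values for link checking.
class LinkFilter {

    // Keeps only http/https URLs, drops null entries and duplicates,
    // and preserves the order in which the links were found.
    static List<String> checkableUrls(List<String> hrefs) {
        Set<String> unique = new LinkedHashSet<>();
        for (String href : hrefs) {
            if (href == null) {
                continue;
            }
            String trimmed = href.trim();
            String lower = trimmed.toLowerCase(Locale.ROOT);
            if (lower.startsWith("http://") || lower.startsWith("https://")) {
                unique.add(trimmed);
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        // Sample values standing in for link.getAttribute("href") results
        List<String> hrefs = Arrays.asList(
                "https://www.orangehrm.com/",
                null,
                "mailto:info@example.com",
                "javascript:void(0)",
                "https://www.orangehrm.com/");
        for (String url : checkableUrls(hrefs)) {
            System.out.println("Will check: " + url);
        }
    }
}
```

In the main program, you could collect the href of every WebElement into a list, pass it through checkableUrls(), and call validateLink() only on the result.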

To speed up broken link testing, you can run the checks in parallel with TestNG or run the browser in headless mode.
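As a rough illustration of the parallel idea, the sketch below checks several URLs at the same time. Note that it uses a plain Java thread pool (ExecutorService with CompletableFuture) rather than TestNG, so it is a stand-in for the approach mentioned above, not a TestNG example. ParallelLinkChecker, fetchStatus, and checkAll are hypothetical names, and the thread count of 4 is an arbitrary assumption.

```java
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical helper for illustration: checks many links concurrently.
class ParallelLinkChecker {

    // Returns the HTTP status code for one URL, or -1 if the request fails.
    static int fetchStatus(String url) {
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setConnectTimeout(4000);
            connection.setReadTimeout(4000);
            int code = connection.getResponseCode();
            connection.disconnect();
            return code;
        } catch (Exception e) {
            return -1;
        }
    }

    // Runs statusLookup for every URL on a fixed-size thread pool and
    // returns the results in the same order as the input list.
    static Map<String, Integer> checkAll(List<String> urls,
                                         Function<String, Integer> statusLookup,
                                         int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Integer>> futures = new ArrayList<>();
            for (String url : urls) {
                futures.add(CompletableFuture.supplyAsync(() -> statusLookup.apply(url), pool));
            }
            Map<String, Integer> results = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                // join() waits for the corresponding check to finish
                results.put(urls.get(i), futures.get(i).join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> urls = List.of("https://www.orangehrm.com/");
        Map<String, Integer> results = checkAll(urls, ParallelLinkChecker::fetchStatus, 4);
        results.forEach((url, code) -> {
            boolean broken = code == -1 || code >= 400;
            System.out.println(url + " ---- " + code + " - " + (broken ? "Link is a broken link" : "Link is valid"));
        });
    }
}
```

Passing the lookup as a Function keeps the concurrency logic separate from the network call, so the pool handling can be tried out with a stub before pointing it at real URLs.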


I hope this blog will help you understand how to use Selenium WebDriver to make sure that all links on a web page work properly and how to find broken links. This will help improve user satisfaction by making navigation smoother.

Keep Learning!!!

 


 
