What is Selenium?
Selenium is an open-source automated testing tool used to test web applications in multiple browsers. Selenium automates the web browser by making the browser execute commands according to our needs.
The Selenium test suite comprises four tools:
Selenium Integrated Development Environment (IDE): IDE is a browser extension (originally a Firefox plug-in, now also available for Chrome) in which we can record and replay frequent test cases quickly. User interactions are recorded, and test cases are created from these recordings.
Selenium Remote Control (RC): Selenium RC was developed to overcome the same-origin policy. Under this policy, JavaScript code cannot access elements that belong to a domain other than the one it was loaded from.
Selenium WebDriver: It was created as a replacement for Selenium RC. Selenium WebDriver communicates directly with the browser, which reduces execution time.
Selenium Grid: This is used together with Selenium WebDriver (and, historically, Selenium RC) to run test cases remotely. Selenium Grid supports parallel execution of multiple test cases on multiple machines.
Now let us discuss Selenium WebDriver.
What is Selenium WebDriver?
Simon Stewart created WebDriver around 2006, and it was later merged into the Selenium project as Selenium WebDriver. It is an open-source collection of APIs used to automate the testing of web applications across many browsers. It can test dynamic web applications, where the content of the page changes dynamically. WebDriver supports browsers such as Chrome, Firefox, Safari, Edge and Internet Explorer, and it runs on Windows, macOS and Linux. Official language bindings exist for Java, C#, Python, Ruby and JavaScript, and community-maintained bindings cover other languages such as PHP.
How is Selenium WebDriver different from other components?
Selenium RC requires a separate server that acts as a middleman between the test script and the browser. Selenium WebDriver instead talks to the browser through the browser's own driver, without that extra server, so execution is faster.
Selenium WebDriver supports a wide range of browsers, unlike Selenium IDE, which runs only as an extension in browsers such as Chrome and Firefox.
It supports dynamic web pages, where the content of the page changes with user actions.
Architecture of Selenium WebDriver:
The image below shows the architecture of Selenium WebDriver.
Selenium Client Libraries: Testers write their test cases in any programming language for which Selenium provides a client library (language binding), such as Java or Python.
JSON Wire Protocol: The JSON Wire Protocol carries data between client and server. Each command is sent as JSON over HTTP from the client library to the driver's HTTP server. Selenium 4 replaces it with the standardized W3C WebDriver protocol, which works on the same principle.
Browser Drivers: WebDriver supports multiple browsers, and each browser has its own driver, such as ChromeDriver for Chrome and GeckoDriver for Firefox. The browser driver receives the HTTP requests and passes them to the real browser, where the web page is actually exercised.
Real Browser: The real browser receives the requests from its driver and performs the requested actions on the application. It sends the results to the browser driver, which returns them to the client.
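To make the architecture a little more concrete, the sketch below builds the kind of message a client library hands to a browser driver for two commands: navigating to a URL and finding an element by XPath. It assumes the endpoint layout and payload shape of the W3C WebDriver specification. The class name WireRequestSketch, its helper methods and the session id are hypothetical, and real client libraries build these requests internally, so this is only an illustration of the idea.

```java
// Sketch only: mimics the messages a client library sends to a driver's HTTP server.
// Paths and payloads follow the W3C WebDriver specification; all names are illustrative.
class WireRequestSketch {

    // Minimal escaping (backslash and double quote) so the value fits inside a JSON string.
    static String jsonString(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // driver.get(url) corresponds to: POST /session/{session id}/url
    static String navigatePath(String sessionId) {
        return "/session/" + sessionId + "/url";
    }

    static String navigateBody(String url) {
        return "{\"url\":" + jsonString(url) + "}";
    }

    // driver.findElement(By.xpath(...)) corresponds to: POST /session/{session id}/element
    static String findElementPath(String sessionId) {
        return "/session/" + sessionId + "/element";
    }

    static String findByXPathBody(String expression) {
        return "{\"using\":\"xpath\",\"value\":" + jsonString(expression) + "}";
    }

    public static void main(String[] args) {
        String session = "demo-session"; // hypothetical; a real driver issues the session id
        System.out.println("POST " + navigatePath(session) + " " + navigateBody("http://saucedemo.com/"));
        System.out.println("POST " + findElementPath(session) + " "
                + findByXPathBody("//div/input[@id='user-name']"));
    }
}
```

The driver answers each request with a JSON document of its own, which the client library turns back into a return value or an exception.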
WebDriver commands:
Selenium WebDriver provides several sets of commands for different operations. Below are a few of the most commonly used WebDriver commands:
Browser commands: get, getTitle, getCurrentUrl, close, quit
Browser navigation commands: back, forward, refresh, to (all called through driver.navigate())
WebElement commands: clear, click, getAttribute, isEnabled, getText, sendKeys, etc.
Locator commands: findElement (By.id, By.linkText, By.className, etc.)
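Each of the commands listed above ends up as one HTTP call to the browser driver. As a rough orientation aid, the sketch below pairs some of the Java method names with the endpoints defined in the W3C WebDriver specification. The class name CommandEndpoints is hypothetical, and how a particular Selenium release maps its methods onto these endpoints is an implementation detail, so this should not be read as a description of the library's internals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: pairs Selenium Java method names with W3C WebDriver endpoints.
class CommandEndpoints {

    static final Map<String, String> ENDPOINTS = new LinkedHashMap<>();

    static {
        // Browser commands
        ENDPOINTS.put("get", "POST /session/{session id}/url");
        ENDPOINTS.put("getTitle", "GET /session/{session id}/title");
        ENDPOINTS.put("getCurrentUrl", "GET /session/{session id}/url");
        ENDPOINTS.put("close", "DELETE /session/{session id}/window");
        ENDPOINTS.put("quit", "DELETE /session/{session id}");
        // Browser navigation commands
        ENDPOINTS.put("back", "POST /session/{session id}/back");
        ENDPOINTS.put("forward", "POST /session/{session id}/forward");
        ENDPOINTS.put("refresh", "POST /session/{session id}/refresh");
        // WebElement commands
        ENDPOINTS.put("click", "POST /session/{session id}/element/{element id}/click");
        ENDPOINTS.put("clear", "POST /session/{session id}/element/{element id}/clear");
        ENDPOINTS.put("sendKeys", "POST /session/{session id}/element/{element id}/value");
        ENDPOINTS.put("getText", "GET /session/{session id}/element/{element id}/text");
        ENDPOINTS.put("isEnabled", "GET /session/{session id}/element/{element id}/enabled");
        // Locator commands
        ENDPOINTS.put("findElement", "POST /session/{session id}/element");
    }

    // Looks up the endpoint for a command name; unknown names are rejected.
    static String endpointFor(String command) {
        String endpoint = ENDPOINTS.get(command);
        if (endpoint == null) {
            throw new IllegalArgumentException("No endpoint listed for: " + command);
        }
        return endpoint;
    }

    public static void main(String[] args) {
        // Prints one "command -> endpoint" line per entry, in insertion order.
        ENDPOINTS.forEach((command, endpoint) -> System.out.println(command + " -> " + endpoint));
    }
}
```

Commands that only read state use GET, commands that change something use POST, and closing a window or ending the session uses DELETE.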
How does Selenium WebDriver work?
Now let us see how WebDriver works and how the communication takes place. Once a tester writes code to test the application and starts executing it, the following steps take place:
The client library converts each command into an HTTP request with a JSON payload and sends it to the HTTP server of the browser driver.
The browser driver then passes the received request on to the browser.
The browser performs the requested actions on the elements of the web page and sends the response/test results back to the browser driver.
The browser driver then sends the results back to the client as an HTTP response.
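The steps above can be tried out without a browser. The sketch below uses the JDK's built-in HTTP server as a stand-in for a browser driver and the JDK's HTTP client in the role of the client library. It assumes the W3C-style navigation endpoint and its empty success reply. The class name RoundTripSketch, the session id "demo-session" and the stub behaviour are all hypothetical; a real driver would forward the command to the browser instead of answering immediately.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Sketch only: a stand-in "driver" that answers one navigation request, plus a client that calls it.
class RoundTripSketch {

    // Starts a stub HTTP server on a free local port. A real driver would drive the browser here.
    static HttpServer startStubDriver() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/session/demo-session/url", exchange -> {
            exchange.getRequestBody().readAllBytes(); // consume the command payload
            byte[] reply = "{\"value\":null}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json; charset=utf-8");
            exchange.sendResponseHeaders(200, reply.length);
            exchange.getResponseBody().write(reply);
            exchange.close();
        });
        server.start();
        return server;
    }

    // Plays the client library's role: POST the command and return the driver's reply.
    // The URL is inserted without JSON escaping, which is acceptable only for this demo.
    static String sendNavigate(int port, String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(
                        URI.create("http://127.0.0.1:" + port + "/session/demo-session/url"))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString("{\"url\":\"" + url + "\"}"))
                .build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() + " " + response.body();
    }

    // Runs one full request/response cycle against the stub and then shuts it down.
    static String roundTrip(String url) {
        try {
            HttpServer server = startStubDriver();
            try {
                return sendNavigate(server.getAddress().getPort(), url);
            } finally {
                server.stop(0);
            }
        } catch (Exception e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("http://saucedemo.com/"));
    }
}
```

The status code and JSON body returned by the stub are what a client library would translate into a normal return from driver.get, or into an exception if the driver reported an error.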
Using Selenium WebDriver for Web automation:
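Let us test a web application in Firefox using GeckoDriver.
The first step is to instantiate a WebDriver. We can use the code below to do that. The snippets in this section assume the imports org.openqa.selenium.By, org.openqa.selenium.WebDriver and org.openqa.selenium.firefox.FirefoxDriver.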
// "driver path" is a placeholder for the location of the geckodriver executable.
System.setProperty("webdriver.gecko.driver", "driver path");
WebDriver driver = new FirefoxDriver();
Navigate to the web page that needs to be automated.
driver.get("http://saucedemo.com/");
Locate the elements in the web page. Here I am logging into the web application, so I enter my username and password using the sendKeys method and then click the login button.
driver.findElement(By.xpath("//div/input[@id='user-name']")).sendKeys("standard_user");
driver.findElement(By.xpath("//div/input[@id='password']")).sendKeys("secret_sauce");
driver.findElement(By.xpath("//div/form/input[@id='login-button']")).click();
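To see how these XPath locators pick out elements, the sketch below evaluates them with the JDK's own XPath engine against a small hand-written form. The markup is a simplified assumption and not the real page source, and the class name XPathLocatorSketch and its helper are hypothetical. A browser evaluates XPath against its live HTML DOM rather than a parsed XML string, so this only illustrates how the path steps and the @id predicate work.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Sketch only: evaluates the login-page XPath locators against a simplified, hypothetical form.
class XPathLocatorSketch {

    // Assumed markup; the real page has more structure than this.
    static final String LOGIN_FORM =
            "<html><body><div><form>"
            + "<div><input id='user-name' type='text'/></div>"
            + "<div><input id='password' type='password'/></div>"
            + "<input id='login-button' type='submit'/>"
            + "</form></div></body></html>";

    // Returns the "type" attribute of the first element matching the expression, or null if none match.
    static String typeOfMatch(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Element match = (Element) xpath.evaluate(
                    expression, new InputSource(new StringReader(LOGIN_FORM)), XPathConstants.NODE);
            return match == null ? null : match.getAttribute("type");
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(typeOfMatch("//div/input[@id='user-name']"));
        System.out.println(typeOfMatch("//div/input[@id='password']"));
        System.out.println(typeOfMatch("//div/form/input[@id='login-button']"));
        // No match in this markup: the button is a child of the form, not of a div.
        System.out.println(typeOfMatch("//div/input[@id='login-button']"));
    }
}
```

Because each step in the path must match the actual parent/child structure, long XPath expressions break easily when the page layout changes. When an element has a unique id, By.id is usually the more robust locator.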
Now run the tests. WebDriver sends all the requests to the browser, the browser performs the necessary actions, and the test results are sent back to the client.
The test result looks like this:
Advantages of Selenium WebDriver:
It is open source and available free of charge.
It works on multiple operating systems.
It supports cross-browser testing.
It can be integrated with Jenkins.
It supports multiple programming languages.
Limitations of WebDriver:
WebDriver relies on a browser-specific driver, and each browser exposes its automation hooks in a different way. So, when a new browser comes out, matching driver support has to be implemented and released before that browser can be automated.
It supports web applications only; it cannot be used to test desktop applications.
Support for image-based testing is limited.
Conclusion: Selenium WebDriver is one of the most popular and best-loved choices for automating web pages. It supports multiple browsers, languages and operating systems. It interacts with the browser directly and uses straightforward commands, and its simple architecture helps tests execute faster.