What is Selenium WebDriver | Architecture, Advantage

In this tutorial, we will learn about Selenium WebDriver, which is the successor to Selenium RC.

In earlier versions of Selenium, Selenium RC needed a server that had to be started before executing test scripts. WebDriver does not require a separate Selenium server to execute tests locally on a particular browser.

There is a separate web driver for each browser that accepts the Selenium commands and drives the browser under test.

This makes Selenium WebDriver a good tool both for browser automation and for building a test framework.

Let’s start with the basic concepts of WebDriver.

What is WebDriver?

WebDriver is a set of APIs (Application Programming Interfaces). It is a purely object-oriented tool that adds many capabilities to the Selenium suite and provides a communication channel between programming languages and browsers.

WebDriver is a web automation framework that was designed by Simon Stewart while he was working at ThoughtWorks.

It is used to execute tests in popular browsers such as Firefox, Chrome, and IE. Its WebDriver interface declares abstract methods such as get() and findElement(), and elements are located with the help of the By class.

What is Selenium WebDriver?

Selenium WebDriver is a popular free and open-source library for automating web applications. It is the result of merging two automation frameworks: Selenium and WebDriver.

In 2009, Selenium RC was merged with another testing framework called WebDriver to create a new Selenium tool known as Selenium 2.0 or Selenium WebDriver.

Selenium 2.0 was released in July 2011. It is a common first choice for testers who want to automate web applications.


1. Selenium 1 consisted of Selenium IDE, Selenium Remote Control (RC), and Selenium Grid.

2. Selenium 2 = Selenium IDE + Selenium Remote Control + Selenium WebDriver + Selenium Grid

3. Selenium RC was completely removed in Selenium 3, so that version consisted of IDE, WebDriver, and Grid.

4. Selenium 4 followed Selenium 3 as the next major version. It was released in October 2021.

Selenium WebDriver Architecture

Selenium WebDriver Architecture consists of four basic components. They are as follows:

1. Selenium Language Bindings
2. JSON Wire Protocol
3. Browser Drivers
4. Real Browsers

The core architecture components of Selenium WebDriver are shown in the figure below.

Selenium WebDriver Architecture

Details of each component:

Selenium Language Binding/Selenium Client/Core Libraries

Automation scripts that drive the browser through Selenium WebDriver can be written in multiple programming languages such as Java, Ruby, Python, etc.

For this purpose, the Selenium developers created language bindings, also called Selenium client libraries, which allow Selenium to support multiple languages such as Java, Ruby, Python, C#, and JavaScript.

For Java, the Selenium client libraries are distributed as JAR files. These client libraries contain the classes and methods of Selenium WebDriver that are needed to create automation test scripts.

Selenium core libraries can be installed using the package installers available for the respective languages. For instance, if you want to drive a browser from Java, you will use the Java client libraries (the Java JAR files).

All the supported Selenium client libraries can be downloaded from the official Selenium website (https://www.seleniumhq.org/download/#client-drivers).

A Selenium client library is not a testing framework. It provides an application programming interface (API), i.e. a set of functions that issue Selenium commands from your program.

JSON wire protocol over HTTP

JSON stands for JavaScript Object Notation. It is a very popular data interchange format, specified by Douglas Crockford, that is based on a subset of the JavaScript programming language.

It is used to exchange data between a client and a server on the web. JSON can be read and written from all popular languages like Java, C#, Python, Ruby, etc.

JSON Wire Protocol is a transport mechanism created by the WebDriver developers. It transfers data between a client and a server on the web, and Selenium uses it to send JSON between the client library and the browser driver.

JSON Wire Protocol exposes a REST (Representational State Transfer) API over HTTP: the client library sends HTTP requests, and the HTTP server of the browser driver answers them. Each browser driver, such as FirefoxDriver, ChromeDriver, IE Driver, etc., has its own HTTP server.
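
To make this concrete, here is a minimal sketch of what a client library might put on the wire for a "navigate to URL" command. It assumes the command is sent as a POST request to /session/{session id}/url with a small JSON body. The class name NavigateCommandSketch, its helper methods, and the session id demo-session are hypothetical and used only for illustration; the real client libraries build these requests for you.

```java
class NavigateCommandSketch {

    // Path of the "navigate to URL" command for a given session id.
    static String path(String sessionId) {
        return "/session/" + sessionId + "/url";
    }

    // JSON body carrying the target URL; backslashes and quotes are escaped.
    static String body(String pageUrl) {
        String escaped = pageUrl.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"url\":\"" + escaped + "\"}";
    }

    public static void main(String[] args) {
        // "demo-session" is a made-up session id used only for this sketch.
        System.out.println("POST " + path("demo-session"));
        System.out.println(body("https://www.scientecheasy.com"));
    }
}
```

The point of the sketch is that a WebDriver command is just an HTTP method, a path, and a JSON payload, which is why any language that can speak HTTP and JSON can have a client library.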

Browser Drivers

Selenium uses a specific driver for each browser to establish secure communication with that browser without revealing the internal logic of the browser’s functionality.

The browser driver receives the requests from the language binding and invokes the relevant operations on the browser. Each type of browser has its own driver that implements WebDriver’s wire protocol for that specific browser.

Selenium supports all modern browsers for automation. The class hierarchy of the Selenium browser drivers is shown in the figure below.

Selenium browser driver class hierarchy diagram

As you can see in the above diagram, all browser drivers are classes that extend the RemoteWebDriver class. RemoteWebDriver implements the WebDriver interface, which in turn extends a super interface named SearchContext.
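
The relationship described above can be sketched in plain Java. The types below are simplified stand-ins with a Sketch prefix, not the real Selenium classes, and their method bodies are placeholders; they only mirror the shape of the hierarchy (super interface, interface, base class, browser-specific subclasses).

```java
class HierarchyDemo {
    public static void main(String[] args) {
        // Code is written against the interface, not a specific browser class.
        SketchWebDriver driver = new SketchFirefoxDriver();
        driver.get("https://www.scientecheasy.com");
        System.out.println(driver.getCurrentUrl());
        System.out.println(driver instanceof SketchSearchContext);
    }
}

// Stand-in for the super interface: only knows how to find elements.
interface SketchSearchContext {
    String findElement(String locator);
}

// Stand-in for the WebDriver interface: adds browser-level commands.
interface SketchWebDriver extends SketchSearchContext {
    void get(String url);
    String getCurrentUrl();
}

// Stand-in for the shared base class that all browser drivers extend.
class SketchRemoteWebDriver implements SketchWebDriver {
    private String currentUrl = "about:blank";

    @Override
    public void get(String url) {
        currentUrl = url; // a real driver would send a command to the browser
    }

    @Override
    public String getCurrentUrl() {
        return currentUrl;
    }

    @Override
    public String findElement(String locator) {
        return "element located by " + locator;
    }
}

// Browser-specific drivers inherit the shared behavior.
class SketchFirefoxDriver extends SketchRemoteWebDriver { }

class SketchChromeDriver extends SketchRemoteWebDriver { }
```

Because every browser driver shares the same interface, a test written against the interface type can switch browsers by changing only the line that creates the driver.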

Real and Headless Browser

A browser is a software application used for viewing and searching content on the World Wide Web. Selenium WebDriver supports both real and headless browsers.

For example, if you want to automate tests with Selenium WebDriver and execute the script in a real browser like Chrome, you have to download its specific driver application.

This must be done for every browser that you want to use. GeckoDriver (Firefox), ChromeDriver, and IE Driver are the most frequently used drivers, and they follow the guidelines of the Selenium framework.

An example of a headless browser is the HTMLUnit browser (HTMLUnitDriver). Now let’s see how Selenium WebDriver works internally.

How does Selenium WebDriver work internally?

In a real-time scenario, when the script is written in Eclipse IDE using one of the supported Selenium client libraries (say Java), the program is executed by clicking the Run option.

After clicking the Run option, the Firefox browser launches and navigates to the URL of the website. Now let’s understand what happens internally between clicking Run and the launch of the Firefox browser.

When we execute any test script using WebDriver, the following steps are performed internally.

1. As we click on Run, the Selenium client library takes the Selenium commands from our program and converts them into JSON format.

For each command, the client library sends an HTTP request to the browser driver (say FirefoxDriver) using JSON Wire Protocol over HTTP. For example, a request to https://localhost:7705/ may carry the serialized JSON body {"url": "https://www.scientecheasy.com"}. Each browser driver uses its HTTP server to receive these HTTP requests.

2. JSON Wire Protocol communicates between the client and the server by transferring the data. The browser driver receives the HTTP request through its HTTP server.

The browser driver then performs the requested actions on the real browser. For a navigation command, the browser sends a request to load the URL.

3. After the instructions have been performed, the execution status is reported back to the browser driver. The driver’s HTTP server then returns it to the client library as the HTTP response, via JSON Wire Protocol.

The client library passes the result back to your program, which reports the step as a success or a failure.
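
The steps above can be imitated with nothing but the JDK (Java 11 or later). In this sketch, a small local HTTP server stands in for the browser driver and an HTTP client stands in for the client library. Everything here is an assumption made for illustration: the class name WireRoundTripSketch, the session id demo, the port chosen by the operating system, and the placeholder reply {"status":0,"value":null}. No real browser is launched.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class WireRoundTripSketch {

    // Sends one "navigate" command to a stand-in driver and returns
    // the HTTP status code followed by the response body.
    static String sendNavigate(String pageUrl) {
        HttpServer fakeDriver = null;
        try {
            // Step 2: the "browser driver" listens on its own HTTP server.
            fakeDriver = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            fakeDriver.createContext("/session/demo/url", exchange -> {
                // A real driver would parse this body and instruct the browser.
                exchange.getRequestBody().readAllBytes();
                // Step 3: report the execution status (placeholder reply).
                byte[] reply = "{\"status\":0,\"value\":null}".getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, reply.length);
                exchange.getResponseBody().write(reply);
                exchange.close();
            });
            fakeDriver.start();

            // Step 1: the "client library" serializes the command as JSON
            // and sends it over HTTP.
            int port = fakeDriver.getAddress().getPort();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://127.0.0.1:" + port + "/session/demo/url"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString("{\"url\":\"" + pageUrl + "\"}"))
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() + " " + response.body();
        } catch (IOException e) {
            throw new IllegalStateException("round trip failed", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("round trip interrupted", e);
        } finally {
            if (fakeDriver != null) {
                fakeDriver.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        // The program decides success or failure from what comes back.
        System.out.println(sendNavigate("https://www.scientecheasy.com"));
    }
}
```

A real browser driver does far more work between receiving the request and sending the reply, but the request and response pattern seen by the client library has this shape.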

Features of Selenium WebDriver

Selenium WebDriver has many powerful features. Some of them are as follows:
1. Multiple Browser Support: Selenium WebDriver supports a wide range of web browsers and their versions, such as Firefox, Chrome, Internet Explorer, Safari, Opera, etc.

It also supports a headless browser called the HTMLUnit browser, which runs without a graphical user interface.

2. Multiple Languages Support: WebDriver also supports most of the commonly used programming languages such as Java, C#, Python, Ruby, Perl, PHP, and JavaScript.

You don’t have to know all of them. You can choose any one of these programming languages based on your competency and start creating test scripts.

3. Speed: WebDriver is faster than the other tools of the Selenium suite. Unlike Selenium RC, it does not require any intermediate server for communicating with the browser. It provides direct communication between the WebDriver client libraries and the web browser.

4. Simple and Easy Commands: Selenium WebDriver provides simple commands that are easy to use in scripts. For example, if you want to launch a browser using WebDriver, use one of the following commands:

WebDriver driver = new FirefoxDriver(); // Firefox browser
WebDriver driver = new ChromeDriver(); // Chrome browser
WebDriver driver = new InternetExplorerDriver(); // Internet Explorer browser

5. Drivers, Methods, and Classes: Selenium WebDriver offers multiple solutions to handle potential challenges in automation testing. With the help of dynamic finders, it also helps testers handle complex web elements such as check boxes, dropdowns, and alerts.

6. Record and Playback: Unlike Selenium IDE, WebDriver does not support record and playback.

7. Dynamic Finder: It supports dynamic finders for locating web elements on web pages.

8. Simple API commands: Because the WebDriver API is compact and object-oriented, abstraction and encapsulation can be used to hide irrelevant detail, which keeps test code simple.

9. Easy to install and configure.

10. WebDriver can also test asynchronous web applications (e.g. Gmail, Facebook, Amazon) built using AJAX or JavaScript.

Benefits/Advantages of Selenium WebDriver

Some of the most important advantages of Selenium WebDriver are as follows:

1. Selenium WebDriver is a powerful, open-source, free, and portable tool.

2. It supports different operating systems such as Windows, Mac, and Linux. It can also be combined with third-party tools such as AutoIt and Apache POI.

3. A major advantage of Selenium WebDriver is that it supports parallel test execution, which reduces the total time taken to run the test cases.

4. Selenium WebDriver also supports the iPhone and Android operating systems.

5. It supports the implementation of Dynamic finder and Listener.

6. With WebDriver, there is no need to start a server before executing test scripts.

7. We can integrate it with third-party tools like TestNG and JUnit to group test cases and generate test reports.

8. By using Selenium WebDriver, we can achieve Continuous Testing by integrating with Maven, Jenkins, and Docker.

Limitations/Drawbacks of Selenium WebDriver

Selenium WebDriver has some limitations. They are as follows:

1. Selenium WebDriver can be used only to test web-based applications. We cannot test Windows-based or other desktop applications with it. We have to use a third-party tool like AutoIt for testing Windows-based applications.

2. WebDriver cannot perform image-based testing on its own. We can perform image-based testing by integrating Selenium with Sikuli.

3. WebDriver does not automatically generate a test result file. We need to integrate it with third-party frameworks like TestNG or JUnit to generate test reports.

4. WebDriver cannot drive a new browser until a driver for that browser is available. Selenium also has no built-in support for add-ins.

5. CAPTCHA, reCAPTCHA, and bar-code readers can’t be automated using Selenium WebDriver.

Difference between Selenium RC and WebDriver

There were a lot of limitations in Selenium RC which eventually led to the development of Selenium WebDriver. Let’s see some key points on how Selenium WebDriver differs from the Selenium RC.

Selenium RC vs Selenium WebDriver

| Selenium WebDriver | Selenium RC |
| --- | --- |
| 1. The architecture of Selenium WebDriver is simpler as compared to Selenium RC because it controls the browser directly from the OS (Operating System) level. | 1. Selenium RC’s architecture is more complicated because it uses an intermediate Selenium Remote Control Server to communicate with the browser. |
| 2. Selenium WebDriver performs faster as compared to Selenium RC because it interacts directly with the browser without using any external proxy server. | 2. Selenium RC is slower because it uses a JavaScript program called Selenium Core whose commands give the instructions to the browser. |
| 3. Selenium WebDriver is a pure object-oriented API. | 3. Selenium RC is a semi object-oriented API. |
| 4. It supports both real browsers and headless browsers like the HTMLUnit browser. | 4. Selenium RC does not support headless browsers. It supports only real browsers. |
| 5. WebDriver supports operating systems for mobile applications such as iOS, Windows Mobile, and Android. | 5. Selenium RC does not support the testing of mobile applications. |
| 6. WebDriver supports Dynamic finder and Listener. | 6. Selenium RC does not support Dynamic finder and Listener. |

We hope that this tutorial has covered all the important basic points related to Selenium WebDriver and its architecture, features, benefits, and disadvantages, and that you have understood and enjoyed it.

If you find anything incorrect in this Selenium tutorial, please let our team know by email. In the next tutorial, we will discuss how to download Selenium WebDriver.
Thanks for reading!!!
