Getting Started with Selenium WebDriver for Automation Testing
By Jash Unadkat, Technical Content Writer and Pradeep Krishnakumar, Manager - July 8, 2019
Selenium WebDriver is one of the most important parts of the Selenium test-suite family. Before diving into WebDriver, though, it helps to have a basic understanding of Selenium itself.
What is Selenium?
Selenium is a suite of tools widely used in the testing community for cross-browser testing. Selenium cannot automate desktop applications; it can only be used in browsers. It is one of the most preferred tool suites for automating web applications because it supports all the popular web browsers.
Browsers supported by Selenium:
- Google Chrome 12+
- Internet Explorer 7,8,9,10
- Safari 5.1+
- Opera 11.5
- Firefox 3+
Selenium also works across multiple operating systems.
Operating systems supported by Selenium:
- Windows
- Mac
- Linux
- Unix
Another reason for Selenium’s popularity is that it is compatible with several programming languages, which gives testers flexibility in how they design test cases.
Programming languages supported by Selenium:
- Java
- Python
- C#
- Ruby
- Perl
The Selenium test suite comprises four main components:
- Selenium IDE
- Selenium RC
- Selenium WebDriver
- Selenium Grid
Selenium IDE (Integrated Development Environment) is primarily a record-and-playback tool. It is an add-on (extension) available for both Firefox and Chrome that generates tests quickly by recording your actions and playing them back. You don’t need to learn a test scripting language to author functional tests.
Working with Selenium RC (Remote Control) requires good knowledge of at least one programming language. This tool allows you to develop responsive design tests in any scripting language of your choice. Selenium RC has two main components: a server and client libraries. Its architecture is complex, and it has its limitations.
Selenium WebDriver is an enhanced version of Selenium RC, introduced to overcome the limitations of Selenium RC. Though it is an advanced version of RC, its architecture is completely different. Like Selenium RC, Selenium WebDriver supports multiple programming languages for wider flexibility, and it requires knowledge of at least one of them.
Selenium Grid is a tool for running test cases concurrently across different browsers, machines, and operating systems. This makes cross-browser compatibility testing much easier. There are two versions of Selenium Grid: the older version is known as Grid 1 and the more recent version is known as Grid 2.
Now that we have covered some of the basics of Selenium, let’s look at one of its most important components: Selenium WebDriver.
What is Selenium WebDriver? Why is it used?
Selenium WebDriver is a web automation framework that lets you execute cross-browser tests. It is used to automate the testing of web-based applications and verify that they perform as expected.
Selenium WebDriver lets you write test scripts in the programming language of your choice. As discussed earlier, it improves on Selenium RC and overcomes several of its limitations. Selenium WebDriver cannot handle native window components (for example, operating-system dialogs), but this drawback can be worked around with tools like Sikuli and AutoIt.
Now let’s try to understand the WebDriver Architecture.
Selenium WebDriver Framework Architecture
WebDriver Architecture is made up of four major components:
- Selenium Client library
- JSON wire protocol over HTTP
- Browser Drivers
- Browsers
A detailed description of each component
Selenium Client Libraries/Language Bindings:
Selenium supports multiple programming languages such as Ruby, Python, and Java, because Selenium developers have built language bindings for each of them. For instance, if you want to use the browser driver from Python, use the Python bindings. You can download the language bindings of your choice from the official Selenium site.
JSON Wire Protocol
JSON (JavaScript Object Notation) is a lightweight data format. The JSON Wire Protocol is a REST (Representational State Transfer) API that uses JSON to exchange information between the client libraries and the browser driver's HTTP server.
Browser Drivers
Selenium provides a driver specific to each browser. The browser driver interacts with its browser by establishing a connection with it, without revealing the internal logic of the browser's functionality. Each language binding exposes its own driver classes (for example, ChromeDriver in Java), so the test code stays in the language used for automation, such as C#, Python, or Java.
You can download the browser driver of your choice to match your language requirements. For example, you can configure Selenium WebDriver for Python on BrowserStack.
When a test script is executed with the help of WebDriver, the following tasks are performed in the background:
- An HTTP request is generated for every Selenium command and delivered to the browser driver
- The driver receives the HTTP request through its HTTP server
- The HTTP server decides which steps/instructions are to be executed on the browser
- The HTTP server then receives the execution status and in turn sends it back to the automation script
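The four steps above can be sketched with a small in-process stand-in for a browser driver. This is only an illustration under stated assumptions: `StubDriverServer`, `CommandCycleDemo`, and the session id `abc123` are hypothetical names, no real browser or network is involved, and only two JSON Wire Protocol endpoints (`/url` and `/refresh`) are recognised.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-process stand-in for a browser driver's HTTP server. */
final class StubDriverServer {
    /** Steps the stub decided to run on the (imaginary) browser. */
    final List<String> browserSteps = new ArrayList<>();

    /** Receives one request and returns a JSON response; status 0 means success. */
    String handle(String method, String path, String jsonBody) {
        if (method.equals("POST") && path.endsWith("/url")) {
            browserSteps.add("navigate " + jsonBody);
            return "{\"status\":0,\"value\":null}";
        }
        if (method.equals("POST") && path.endsWith("/refresh")) {
            browserSteps.add("refresh");
            return "{\"status\":0,\"value\":null}";
        }
        // Status 9 is the JSON Wire Protocol code for an unknown command.
        return "{\"status\":9,\"value\":{\"message\":\"unknown command\"}}";
    }
}

final class CommandCycleDemo {
    public static void main(String[] args) {
        StubDriverServer driver = new StubDriverServer();
        // Steps 1-2: a Selenium command becomes an HTTP request that the driver receives.
        String response = driver.handle("POST", "/session/abc123/url",
                "{\"url\":\"https://www.browserstack.com\"}");
        // Step 3: the driver decided which step to run on the browser.
        System.out.println(driver.browserSteps);
        // Step 4: the execution status travels back to the automation script.
        System.out.println(response);
    }
}
```

A real browser driver listens on a network port and drives an actual browser; the stub only records what it would have done, which keeps the request and status flow easy to follow.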
Browsers
As discussed earlier, Selenium supports multiple browsers such as Chrome, Firefox, Safari, and Internet Explorer.
Benefits of Selenium WebDriver
- It is one of the most popular Open-Source tools and is easy to get started with for testing web-based applications. It also allows you to perform cross browser compatibility testing.
- Supports multiple operating systems like Windows, Mac, Linux, Unix, etc.
- It provides compatibility with a range of languages including Python, Java, Perl, Ruby, etc.
- Provides support for modern browsers like Chrome, Firefox, Opera, Safari and Internet Explorer.
- Selenium WebDriver completes the execution of test scripts faster than tools such as Selenium RC
- A more concise API (Application Programming Interface) than Selenium RC’s
- It also provides compatibility with iPhoneDriver, HtmlUnitDriver and AndroidDriver
Limitations of WebDriver
- Support for new browsers is not readily available when compared to Selenium RC
- It has no built-in command for automatically generating test results
Understanding cross-browser testing automation using Selenium WebDriver with a complete scenario
Suppose you start writing code in your IDE (consider Eclipse) with the help of one of the client libraries supported by Selenium (say Java, which the snippet here uses). If you prefer to test on Chrome, you also need the Selenium WebDriver browser driver for Chrome (ChromeDriver).
WebDriver driver = new ChromeDriver();
driver.get("https://www.browserstack.com");
As soon as you complete writing your code, execute the program by clicking on Run. The above code will result in the launching of the Chrome browser which will navigate to the BrowserStack website.
Now let us understand what goes on behind the scenes from the moment you click Run until the Chrome browser launches.
Once you click Run, each Selenium command in the script is converted into an HTTP request to a URL, with the command's parameters encoded as JSON. The JSON Wire Protocol over HTTP makes this possible. The client library (Java in our example) performs this translation and sends the request to the browser driver (ChromeDriver in our example).
(Figure: the URL after JSON conversion.)
To receive HTTP requests, every browser driver runs an HTTP server. Once the browser driver receives a request, it processes it by passing the command to the real browser, and the commands in your Selenium script are executed one by one.
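As a rough sketch of what the client library sends for `driver.get(...)`, the JSON Wire Protocol maps the command to a POST on `/session/{sessionId}/url` with the target address in a JSON body. The class name `NavigateRequest`, the session id `abc123`, and the base address `http://localhost:9515` (ChromeDriver's default port) are assumptions made for illustration; a real session id is issued by the driver when the session is created.

```java
/** Hypothetical helper showing the request a client library might send for driver.get(url). */
final class NavigateRequest {
    final String method = "POST";
    final String url;   // endpoint on the browser driver's HTTP server
    final String body;  // JSON payload carrying the command's parameter

    NavigateRequest(String driverBaseUrl, String sessionId, String targetUrl) {
        this.url = driverBaseUrl + "/session/" + sessionId + "/url";
        this.body = "{\"url\":\"" + escapeJson(targetUrl) + "\"}";
    }

    /** Escapes backslashes and double quotes so the value is a valid JSON string. */
    private static String escapeJson(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        NavigateRequest request = new NavigateRequest(
                "http://localhost:9515", "abc123", "https://www.browserstack.com");
        System.out.println(request.method + " " + request.url);
        System.out.println(request.body);
    }
}
```

The sketch only builds the request; actually sending it would need an HTTP client and a running browser driver.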
Types of Requests
There are two types of requests you might be familiar with – GET and POST.
If it’s a GET request, the browser generates a response, which is sent over HTTP to the browser driver; the browser driver then uses the JSON Wire Protocol to send it back to the UI (Eclipse IDE). If it’s a POST request, the browser performs the requested action (such as navigating to a page), and the driver reports the execution status back.
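To make the difference concrete, here is a minimal sketch of reading a driver's reply. JSON Wire Protocol responses carry `sessionId`, `status`, and `value` fields; a GET such as "get title" returns data in `value`, while a POST such as "navigate" typically returns a null `value` alongside the status. `WireResponse` is a hypothetical helper, the sample responses are made up, and the regex-based reading is a simplification rather than a full JSON parser (real client libraries use a proper one).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical reader for a browser driver's JSON reply; not a full JSON parser. */
final class WireResponse {
    private static final Pattern STATUS =
            Pattern.compile("\"status\"\\s*:\\s*(-?\\d+)");
    private static final Pattern STRING_VALUE =
            Pattern.compile("\"value\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Returns the numeric status; 0 means the command succeeded. */
    static int status(String json) {
        Matcher m = STATUS.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("no status field: " + json);
        }
        return Integer.parseInt(m.group(1));
    }

    /** Returns the raw string in "value" (escapes left as-is), or null if it is not a string. */
    static String stringValue(String json) {
        Matcher m = STRING_VALUE.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up reply to a GET /session/abc123/title request.
        String getReply = "{\"sessionId\":\"abc123\",\"status\":0,\"value\":\"Example page title\"}";
        // Made-up reply to a POST /session/abc123/url request.
        String postReply = "{\"sessionId\":\"abc123\",\"status\":0,\"value\":null}";
        System.out.println(status(getReply) + " " + stringValue(getReply));
        System.out.println(status(postReply) + " " + stringValue(postReply));
    }
}
```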
Application of Selenium WebDriver in Testing on the Cloud
You can use Selenium WebDriver to run browser tests on virtual machines, emulators, simulators, physical real devices and real devices on the cloud.
BrowserStack provides access to thousands of real mobile devices. Our Automate product supports automated Selenium testing to help you speed up your product releases.
Features of BrowserStack Automate:
- Run hundreds of concurrent tests
- Integrate with popular languages like Python, Java and top CI/CD tools like Jenkins, CircleCI
- Instant access to 2000+ Real Devices and Browsers
- Comprehensive Debugging using video recordings, automated screenshots of errors
- Enterprise-Grade Security & GDPR Compliance
To summarize:
- Selenium WebDriver is considered one of the most preferred tools for testing web-based applications
- Because Selenium WebDriver is easy to use and lets you choose among scripting languages, browsers, and operating systems, it enables you to create powerful tests that scale
- Because it is an open-source tool, developers and testers can easily integrate it with the real device cloud that BrowserStack Automate provides