Selenium 4 is set to be released by the Chinese New Year, but which year?
A lot has happened since Selenium 4 was announced during the State of the Union keynote by Simon Stewart and Manoj Kumar. A significant amount of work has been done, and we’ve released at least six alpha versions of Selenium 4 for users to try out and report any bugs so that we can fix them.
These are exciting times for the Selenium community: we have a lot of new features and enhancements that make Selenium WebDriver even more usable and scalable for practical use cases.
Selenium is a suite of tools designed to support different user groups:
- Selenium IDE supports rapid test development, and doesn’t require extensive programming knowledge
- WebDriver provides a friendly and flexible API for browser automation in most major programming languages
- Grid makes it possible to distribute and run your tests across more than just one machine.
Let us dive in and look at some of the significant features released in each of these tools, along with some of the upcoming features that are in progress and will be available in Selenium 4.
One of the main reasons to release WebDriver as a major version (Selenium 4) is the complete adoption of the W3C protocol. The W3C protocol dialect has been available since version 3.8 of Selenium WebDriver, alongside the JSON wire protocol. This change in protocol isn’t going to impact users in any way, as all major browser drivers (such as geckodriver and chromedriver), and many third-party projects, have already fully adopted the W3C protocol.
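To make the difference between the two dialects concrete, here is a minimal sketch of the JSON body a client sends when it asks for a new session. The `capabilities` / `alwaysMatch` / `firstMatch` shape follows the W3C WebDriver specification, and `desiredCapabilities` is the key used by the older JSON wire protocol. The helper function names and the browser value are assumptions made for illustration and are not part of any Selenium binding.

```python
import json


def w3c_new_session_payload(browser_name):
    """Build a W3C-style New Session request body."""
    return {
        "capabilities": {
            "alwaysMatch": {"browserName": browser_name},
            "firstMatch": [{}],
        }
    }


def legacy_new_session_payload(browser_name):
    """Build the older JSON wire protocol request body, for comparison."""
    return {"desiredCapabilities": {"browserName": browser_name}}


if __name__ == "__main__":
    print(json.dumps(w3c_new_session_payload("firefox"), indent=2))
    print(json.dumps(legacy_new_session_payload("firefox"), indent=2))
```

In practice the language bindings build this body for you, which is why the switch is expected to be invisible to most test code.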
However, there are some notable new APIs, as well as the removal of deprecated APIs in the WebDriver API, such as:
- The FindsBy* interfaces (e.g. FindsById, FindsByCssSelector …) have been deleted. The recommended alternative is to pass a `By` instance to `findElements` instead.
- “Relative locators”: a friendly way of locating elements using terms that users use, like “near”, “left of”, “right of”, “above” and “below”. This was inspired by an automation tool called Sahi by Narayan Raman, and the approach has also been adopted by tools like Taiko by ThoughtWorks.
- A richer set of exceptions, providing better information about why a test might have failed. These include exceptions like ElementClickInterceptedError, NoSuchCookieError and more.
- Chrome DevTools Protocol (CDP):
  - Although Selenium works on every browser, Selenium 4 offers CDP integration for those browsers that support it, which lets us take advantage of the enhanced visibility into the browser that a debugging protocol gives.
  - Because CDP is designed for debuggers, it’s not the most user-friendly of APIs. Fortunately, the Selenium team is working to provide comfortable cross-language APIs to cover common requirements, such as network stubbing, capturing logs, mocking geolocation, and more.
- Browser Specifics:
  - A new ChromiumDriver that the Chrome and Edge drivers both build on.
  - New methods to install and uninstall Firefox add-ons at runtime.
- Window Handling:
  - Users can switch to full-screen mode during script execution.
  - Better control of whether new windows open as tabs, or in their own window.
- An option to grab a screenshot of a single UI element, rather than the usual viewport-level screenshot.
- Full-page screenshot support for Firefox.
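As a rough illustration of the idea behind the relative locators mentioned in the list above, the sketch below decides whether one element is “above”, “below”, “left of”, “right of” or “near” another by comparing bounding rectangles. It is a simplified model and not Selenium's implementation: the `Rect` type, the helper names, and the 50-pixel `near` threshold are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Bounding box of an element in page coordinates (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float

    @property
    def right(self):
        return self.x + self.width

    @property
    def bottom(self):
        return self.y + self.height


def is_above(candidate, anchor):
    # The candidate ends at or before the top edge of the anchor.
    return candidate.bottom <= anchor.y


def is_below(candidate, anchor):
    # The candidate starts at or after the bottom edge of the anchor.
    return candidate.y >= anchor.bottom


def is_left_of(candidate, anchor):
    return candidate.right <= anchor.x


def is_right_of(candidate, anchor):
    return candidate.x >= anchor.right


def is_near(candidate, anchor, max_distance=50):
    # Gap between the boxes along each axis (0 if they overlap on that axis).
    dx = max(anchor.x - candidate.right, candidate.x - anchor.right, 0)
    dy = max(anchor.y - candidate.bottom, candidate.y - anchor.bottom, 0)
    return (dx ** 2 + dy ** 2) ** 0.5 <= max_distance


if __name__ == "__main__":
    password = Rect(x=100, y=200, width=200, height=30)
    email = Rect(x=100, y=150, width=200, height=30)
    print(is_above(email, password))
    print(is_near(email, password))
```

The appeal of this style of locator is that it mirrors how a person would describe the page (“the field above the password box”) rather than how the DOM happens to be structured.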
What’s next in WebDriver beyond Selenium 4?
It would be nice to let users extend the locator strategies with something like FindByImage or FindByAI (as in Appium). Right now we have a hardcoded list of element location strategies. Providing a lightweight way of extending this set, particularly when using Selenium Grid, is on the roadmap.
The original Selenium IDE reached its end of life in August 2017, when Mozilla released Firefox 55, which switched its add-ons from the Mozilla-specific “XPI” format to the standardised “Web Extension” mechanism. This meant that the original Selenium IDE would no longer work in later versions of Firefox.
Thanks to Applitools, Selenium IDE has been revived! It is one of the significant improvements in Selenium 4 and includes notable changes like:
- A shiny new UI, for a better user experience.
- A Web Extension based plugin, which makes it available in Chrome and Firefox as well as in any other browser that supports Web Extension based plugins. It will soon be available in the MS Edge store.
- A new plugin system that lets users add new commands and code exports for new languages and frameworks. The plugins can be shipped as extensions. An example of a plugin is Applitools for Selenium IDE, which enables codeless visual testing.
- A new CLI runner called `selenium-side-runner`, which runs on Node.js. It allows users to execute recorded tests in parallel, across multiple browsers.
- A control flow mechanism which helps users write better tests using “while” and “if” conditions.
- A backup element selector that can fall back and select elements using a different locator strategy, such as ID, CSS or XPath, based on the recorded information. This helps make tests more stable and reliable.
- Selenium IDE is accessible! We’ve gone above and beyond to make sure that it conforms to some of the latest accessibility guidelines and supports necessary controls like focus order, roles, tooltips, announcing the start of recording, color and design.
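The backup element selector described in the list above can be sketched as a loop over the locators captured at recording time. This is a minimal model under stated assumptions: `find_with_fallback`, the `(strategy, value)` pairs, and the `lookup` callable are hypothetical names chosen for illustration, and they do not reflect how Selenium IDE actually stores or resolves its recorded targets.

```python
def find_with_fallback(lookup, locators):
    """Return the first element found, trying each recorded locator in order.

    `lookup(strategy, value)` is a hypothetical callable that returns an
    element, or None when nothing matches.
    """
    for strategy, value in locators:
        element = lookup(strategy, value)
        if element is not None:
            # Also report which locator succeeded, which is useful for logging.
            return element, (strategy, value)
    raise LookupError(f"no locator matched: {locators}")


if __name__ == "__main__":
    # Fake page: the id changed after recording, but the CSS class survived.
    page = {
        ("css", ".submit"): "<button>",
        ("xpath", "//form/button"): "<button>",
    }
    recorded = [
        ("id", "submit-btn"),
        ("css", ".submit"),
        ("xpath", "//form/button"),
    ]
    element, used = find_with_fallback(lambda s, v: page.get((s, v)), recorded)
    print(element, used)
```

Returning the locator that matched alongside the element makes it easy to flag tests that only pass thanks to a fallback, so the primary locator can be repaired later.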
What’s next in Selenium IDE?
A remarkable milestone for Selenium IDE is that it’s going to be available as a standalone app, rewritten as an Electron app. Binding tightly to the browser will allow us to listen for events from the browser, making test recording more powerful and feature-rich.
One of the essential improvements in Selenium 4 is the ability to use Docker to spin up containers instead of users setting up heavy virtual machines. Selenium Grid has been redesigned so that users can deploy it on Kubernetes for excellent scaling and self-healing capabilities.
Let’s look at some of the significant improvements:
- We’ve enhanced Selenium Grid deployment for more scalable and traceable infrastructure.
- Users can deploy Grid in Standalone mode, in Hub-and-Node mode, or in a distributed mode with separate processes, as in the picture below.
- Observability is a way of measuring a system’s internal state; it is a much-needed capability for tracing what happens when an API is invoked or a new session is requested. This can help admins and developers when debugging, by providing insight into the root cause when strange problems arise.
- Selenium Grid, by default, communicates via HTTP. This is fine for most use cases within the firewall but problematic when your server is exposed to the internet. Now users can have their Grid communicate via the HTTPS protocol with support for TLS connections.
- Unlike the old versions, which accepted only IPv4 addresses, Grid now supports IPv6 addresses as well.
- Grid has always allowed you to use configuration files when spinning up Grid instances. In Grid 4, those files can be written using TOML, which makes them easier for humans to understand.
What’s next in Selenium Grid?
As you can see, there have been exciting changes and performance improvements. A few more are expected to be added, such as:
- A revamped UI for the Grid console
- GraphQL for querying Grid
- More work on Grid stability and resilience
We’ve also refreshed our branding, documentation, and the website, so check out Selenium.dev!
Selenium is an open-source project run by volunteers, so we can’t promise definite timelines, but we can say: before Chinese New Year.
Please come and give us a hand if you have the energy and time! Happy hacking!
Thanks to Simon Stewart for helping review this post!
Manoj Kumar is a Principal Consultant at ThoughtWorks. He is an avid open-source enthusiast, a committer to the Selenium and Appium projects, and a member of the Selenium project leadership committee. Manoj has also contributed to various libraries and frameworks in the automated testing ecosystem, such as ngWebDriver, Protractor and Serenity. He is an avid accessibility practitioner who loves to share knowledge, and a voluntary member of the W3C ACT-R group. In his free time, he contributes to open-source projects, researches accessibility, and enjoys spending time with his family. He blogs at AssertSelenium.