Selenium 4 with Simon Stewart | Webinar

We hosted Simon Stewart, creator of WebDriver and core-contributor to the Selenium project for a webinar on the 18th of June. Simon spoke about Selenium 4, the upcoming Selenium upgrade.

Simon covered various topics like the philosophy behind the upgrade and what it means for the QA community. Here's the webinar, along with the transcript.

Video transcription

Praveen Umanath:
All right, let’s get started. Good morning, good afternoon, good evening, from whichever part of the world you're dialing in. Thank you for joining us. We've got a great response for this webinar.

My name is Praveen Umanath. I work at BrowserStack as a Product Manager and also lead the marketing team. It's my honor and privilege today to introduce Simon Stewart, who's going to be presenting the majority of this webinar with us. Simon is someone who needs no introduction. He is the project lead for Selenium, creator of WebDriver, and co-editor of the W3C WebDriver specification, which we will be talking about in detail today. I sort of introduced myself earlier. I lead the product team for our Selenium automated testing product, BrowserStack Automate, and also lead the marketing team at BrowserStack. So without further ado, I think I will hand it off to Simon to take over from here. Simon, all yours.

Simon Stewart:
Hi everyone. Thank you all very much for coming. I am just reading the messages as they scroll by and everyone introduces themselves from all over the world. People in Nigeria, people in India, the U.K., America, all over the place, it's amazing. Thank you all very much for being here. Today's talk, we're going to go through Selenium 4 and all the new features, and what makes it different, and what makes it the same, and hopefully at the end of it you'll be in a good position to start making use of our new technology.

Simon Stewart:
The too long, didn't read, didn't listen. We're going to cover why are we doing the major version bump, we're going to discuss the new Selenium IDE, we're going to talk about the fact that the user facing API, the WebDriver APIs that you've been using all this time are basically going to remain the same so it should be a drop in upgrade. We're also going to talk about some of the new features that we're providing, including integration with Chrome debugging protocol, new mechanisms for finding elements in a friendly way, and then finally we're going to be discussing the modernized Selenium grid. That's the rough agenda of what we're going to cover.

The very first thing to bear in mind is, when we talk about Selenium, we're talking about a family of products. The main ones are Selenium IDE, Selenium WebDriver, and Selenium grid. You might record your Selenium test in Selenium IDE, then run it using Selenium WebDriver, and then scale it up using Selenium grid. On the project we describe this as being a bit like John Malkovich in the film ‘Being John Malkovich’, where everything is called the same thing. When we talk about Selenium, it can be a little bit imprecise, so in this talk I'm probably going to be talking about IDE, WebDriver, and grid, and hopefully from context it'll be obvious what Selenium is.

Kicking it all off, there is Selenium IDE. We have a new and refreshed Selenium IDE coming, and that's largely thanks to the hard work of the folks at Applitools, who based the work that they've been doing on Selenium IDE on something that was created by SideeX. Going back into the dim and distant past, Selenium IDE gave us a nice record playback experience, but it was written as a Firefox extension using the old XPI mechanisms, and over time what happened is Mozilla changed to making everything be a web extension. Suddenly Selenium IDE no longer worked. It didn't have access to the privileged APIs that it had before, and we were kind of stuck because there weren't many people working on IDE. In fact, in the end it was just one person, and then they ran out of time, and it was very hard for us to keep that up. The folks at SideeX took the original Selenium IDE, they ported it to be a web extension and they got it working, and they added their own enhancements.

When we were looking around trying to figure out what to do, Applitools came to us and suggested that they would like to be involved in the project and what could they do to help. We went like Selenium IDE would be really good because we're stuck. They picked up and they did it. Selenium IDE gives us a really nice record playback experience. We think of it as the on-ramp to the Selenium project itself. It allows someone who isn't a developer to write tests quickly. It allows somebody who is a developer to quickly put together the basic frameworks of their test which they use WebDriver for. It gives a really nice tool for people to file bugs with in a reproducible way. They can just record a test using Selenium IDE, add that to the bug report and voila, whoever it is that's looking at a bug report can see everything that's going on.

Things that make Selenium IDE new and interesting: originally it was an XPI, then it became a web extension, but one of the things we noticed is that people really like some of the newer tools that are coming out, in particular things like Cypress. One of the ways that they get their capabilities is to bind super tightly to the browser, in the same way that WebDriver bound to the browser in the first place, which enabled us to do various bits and pieces as part of the browser infrastructure. The new IDE replacements and tooling have that same capability. The new Selenium IDE is going to be written as an Electron app. If you download it right now, it is a web extension. The Electron app is coming, and as part of Selenium 4 that will be one of the hallmark features: the stand-alone app that you can use.

The Electron app allows us to do a whole bunch of interesting things because it enables us to use the Chrome debugging protocol to listen out for events from the browser. That's a super powerful capability because it means that we can do more stable locators and we can track what's going on within the browser a bit more easily, which is a really nice feature. You could do these things from outside the browser, and it's perfectly possible to write incredibly stable Selenium tests without needing to be buried neck deep in the browser. Just by making use of the Chrome debugging protocol, we get a more direct line into the state of the browser, rather than having to treat the browser as a black box and intuit what's going on.

There is new and improved niftiness that we have. Now, nifty is a fairly unusual British English word that may not travel around the world. Imagine it as being cool, or neat, or fantastic. The main thing that we have is control flow. Previously, this was an extension that was added to Selenium IDE, and being able to do if, while loops, fors, and constructs like that was always hard. They're now baked into the basic grammar of Selenium IDE.

The other thing that Selenium IDE offers is backup element selectors. I think we've all been in the situation where we've written a test and it's worked flawlessly, and then one of our developers or somebody has changed something, and suddenly we've not been able to find elements, and our tests have started failing. What Selenium IDE does is, while you're recording, it records not just one element locator, not just an XPath or an ID or a CSS locator; it records four or five of them. If, when you're running a test, it can't find the element using one locator, it will fall back to trying one of the other locators, until it finds the element you're looking for.
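The fallback behaviour described here can be sketched in a few lines of Python. This is only an illustration of the idea, not Selenium IDE's actual implementation: `find_with_fallback`, the `find` callable, and the dictionary standing in for a page are all hypothetical names made up for the example.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")
Locator = Tuple[str, str]  # (strategy, value), e.g. ("id", "submit-btn")


def find_with_fallback(
    find: Callable[[str, str], Optional[T]],
    locators: Iterable[Locator],
) -> T:
    """Try each recorded locator in order and return the first match.

    `find` is a hypothetical lookup callable that returns None on a miss.
    """
    tried = []
    for strategy, value in locators:
        element = find(strategy, value)
        if element is not None:
            return element
        tried.append((strategy, value))
    raise LookupError(f"no locator matched; tried {tried}")


# A stand-in "page": only the CSS locator still matches after a redesign.
page = {("css selector", "form button.submit"): "<button>"}

# Several locators recorded for the same element, in order of preference.
recorded = [
    ("id", "submit-btn"),                    # renamed by a developer
    ("css selector", "form button.submit"),  # still valid
    ("xpath", "//form//button[1]"),
]

element = find_with_fallback(lambda s, v: page.get((s, v)), recorded)
```

The first locator misses, the second one matches, and the third is never tried. If every locator misses, the error lists everything that was attempted, which makes the failure easier to diagnose.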

This is part of what we offer in order to help your tests be rock solid and super stable. We are able to interrogate the interior event stream of the browser, which we can use to do our normal waits and our normal things. We could always do that before, but now we can do it in a slightly more integrated way with the CDP.

We also have these capabilities that allow people to write their tests once and have them be more stable and more coherent. The other thing that Selenium IDE offers, and it already offers this, is the ability for you to add your own plugins and extensions. We have a tool, and it's great and it's lovely, but no tool does everything that is entirely correct for your testing purposes. There's always going to be something that is specific about your setup. Examples of extensions may be things to make finding elements in React apps simpler, or other capabilities.

The other major feature is code export. We know that when you record your tests in Selenium IDE, that is quite often the first part of a path toward writing a full test suite. Selenium IDE is fantastic, but quite often your developers are going to be writing in Java, C#, Python, Ruby, JavaScript, or some other programming language. What we'd like to be able to do is take the test that's been recorded and export it. The code export features allow this already.

You can find the code export for Java and Python is there right now as well. We are going through the code exports features in the order of their priority to people. We've listened to our users and they've said we need this first and we need that.

Peter Walters is saying, “Your lack of a seatbelt is making me nervous.” I am in my car, that is correct. I've just been to my son's parents' evening. We are parked up and completely stationary. I'm not going anywhere. It is perfectly safe. The car is not moving.

Moving on to WebDriver. You've recorded your tests, you've exported them, you probably want to be using the WebDriver APIs. There are some new features coming: friendly locators, and hooks into the Chrome debugging protocol. The major one is the W3C protocol. The reason why we're doing the Selenium 4 upgrade, and we're calling it Selenium 4 and not Selenium 3 point whatever the next digit of pi is, is because we're wholeheartedly adopting the W3C WebDriver protocol. This is the protocol that we spent the past six years working on. It's been a standard for about a year now.

It is fractionally different from the original protocol which we refer to as the JSON wire protocol within the project. The major differences are around how you create a new session and how you can do element interactions using the actions API. The major difference with the actions API is that you can now do multiple actions simultaneously. You can do things like multi touch, you can do pressing two keys simultaneously, you can do chording, and at the same time click with the mouse. Things like that. It's a far richer API than we have offered. The reason why we're doing it is the W3C protocol is what we're using.
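To make "multiple actions simultaneously" concrete, here is a rough sketch of the JSON body a client could send to the W3C actions endpoint (`POST /session/{session id}/actions`) to hold Shift while clicking. In the W3C model each input source carries its own list of actions, and the Nth action of every source is dispatched in the same "tick". The helper name `shift_click`, the source IDs, and the coordinates are assumptions chosen for illustration; language bindings normally build this payload for you.

```python
import json

SHIFT = "\uE008"  # WebDriver's code point for the Shift key


def shift_click(x: int, y: int) -> dict:
    """Build an actions payload that holds Shift while clicking at (x, y)."""
    keyboard = {
        "type": "key",
        "id": "keyboard",  # arbitrary source id chosen for this sketch
        "actions": [
            {"type": "keyDown", "value": SHIFT},  # tick 1: press Shift
            {"type": "pause"},                    # tick 2: keep holding
            {"type": "pause"},                    # tick 3: keep holding
            {"type": "keyUp", "value": SHIFT},    # tick 4: release Shift
        ],
    }
    mouse = {
        "type": "pointer",
        "id": "mouse",  # arbitrary source id chosen for this sketch
        "parameters": {"pointerType": "mouse"},
        "actions": [
            {"type": "pointerMove", "duration": 0, "x": x, "y": y},  # tick 1
            {"type": "pointerDown", "button": 0},                    # tick 2
            {"type": "pointerUp", "button": 0},                      # tick 3
            {"type": "pause"},                                       # tick 4
        ],
    }
    return {"actions": [keyboard, mouse]}


payload = shift_click(120, 48)
body = json.dumps(payload)  # what would go over the wire
```

The `pause` entries pad each source so both lists have the same number of ticks, which is how the key stays down for the duration of the click.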

What does that mean for users? I'll be honest: hopefully, probably nothing. If you're just using the WebDriver APIs, the APIs are remaining stable, they're remaining consistent. All that's happening is, the wire protocol is changing. Versions of Selenium since 3.8 have spoken both the JSON wire protocol and the W3C protocol. Even as things change, from your point of view, as a person writing the tests, nothing has changed. Although we're doing this big change and we're bumping the version number, the differences are at the wire protocol level and not at the API level.

Who does care about protocol dialects? Well, Selenium-as-a-service providers such as BrowserStack. One of the nice things is that we have a good community around the project, and I've made a special point of going out and talking to people such as BrowserStack, who are kindly hosting this, to make sure that when Selenium 4 comes out, your tests are going to continue running on their infrastructure just as stably and just as well as they have always done before.

Like I said, there are stable user-facing APIs. If you've been writing Selenium tests using the WebDriver APIs, you should just find this to be a drop-in upgrade. Having said that, there are some caveats. The major one is that if an API was marked as deprecated, it may well be gone. Normally this shouldn't be affecting you, because who writes tests using deprecated APIs? You can catch those points by recompiling your code right now with deprecation being not just a warning but an error.
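The remark about recompiling applies to compiled bindings such as Java, where `javac -Xlint:deprecation -Werror` turns deprecation warnings into build failures. A rough Python analogue, using only the standard `warnings` module, is sketched below; `old_find` is a hypothetical deprecated helper standing in for any deprecated API, not a real Selenium function.

```python
import warnings


def old_find(name: str) -> str:
    """Hypothetical deprecated helper, standing in for a deprecated API."""
    warnings.warn(
        "old_find is deprecated; use the supported API instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return name


def uses_deprecated_api() -> bool:
    """Return True if calling old_find trips a deprecation error."""
    with warnings.catch_warnings():
        # Escalate DeprecationWarning from a warning to an exception.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            old_find("q")
        except DeprecationWarning:
            return True
    return False


flagged = uses_deprecated_api()
```

The same effect is available for a whole test run without code changes by starting the interpreter with `python -W error::DeprecationWarning`.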

Quite often the mechanisms for upgrading are simple and easy. In many cases what we've done is, we've moved classes from an internal package to a better top level package and made them publicly available. That's really nice.

In other cases, we are marking things deprecated because we will be changing how things work. The major one for that is that people quite often take a WebDriver and cast it to something like FindsByCssSelector, FindsById, or FindsByName. The appropriate thing to do is to always use the findElement API and pass in a locator such as By.id, By.cssSelector, and so on. They're functionally equivalent, but one of them is supported and the other uses APIs that we consider to be private to Selenium itself.

Just be aware of the fact that where you find yourself casting a WebDriver or a WebElement to one of those sub-interfaces, FindsByCssSelector, FindsById, and so on, you should prefer to use findElement with By.id, By.cssSelector, By.name, and so forth.

The language bindings for every language other than Java are going to be speaking straight W3C protocol. They will be pulling out support for the JSON wire protocol. That allows them to simplify their code, makes it more maintainable, and makes it easier to continue iterating on the frameworks that you are familiar and comfortable with. However, we are fully aware on the project that there are some people who, for whatever reason, will want to use the old JSON wire protocol.

Maybe they're using a third party library that hasn't been upgraded yet. Maybe they're stuck with an old version of Firefox; loads of people are using the legacy Firefox driver for some reason, and we want to continue to support them. In order to support them, the Java bindings and the Selenium server will provide mechanisms for people to use the old JSON wire protocol. What you may find is that if you are using the Ruby, JavaScript, or .NET bindings and you want to take advantage of the legacy Firefox driver, then you may need to stand up the Selenium server in the middle of your tests. You may be doing that now. If you're using Selenium grid, that's also possible.

The preservation of backward compatibility is something the project is super dedicated to. In fact, right now what you can do, and somebody on the mailing list actually tried this recently, you can take a Selenium RC 1.0 test, so Selenium 1, and you can run it against the current Selenium grid. We have protocol converters that convert from the original Selenium RC language bindings all the way through to the latest greatest W3C wire dialect.

We will continue having that backward compatibility because we know you have spent a lot of time and a lot of effort and a lot of dedication building up your test suites. Making sure they're rock solid, making sure they're capable, and the last thing we want to do is pull the rug out from underneath you. Your tests are an investment that you have made, so we are trying to make it so your upgrades are as smooth and as simple as possible.

Like I said, .NET, JavaScript, Ruby, and Python, which are all the language bindings that we support in the project other than Java, will be W3C only. Like I said before, the user-facing APIs are stable. They are not changing radically. We are just taking the opportunity to remove deprecated methods and deprecated classes. Hopefully for you, it will just be a case of going to your Gemfile for Bundler, your Python config, even your Maven configs, and just upgrading from 3.141 to 4.0, and magically everything will continue working.

The only people who need to care are people who need that backward compatibility for some reason. The major reason will be because either infrastructure hasn't been upgraded, and it will need to be pretty old infrastructure because Selenium grid supports both protocols natively at the moment. Either things need to be upgraded or you're using the legacy Firefox driver. In either case, you can use the Java bindings either directly or through the Selenium server and they will enable you to use the JSON wire protocol. Even though we're making this very radical change, we are providing mechanisms for the legacy protocols to continue working.

Like I said, does this change to the wire protocol matter, do the deprecations matter? Probably not, I'll be honest with you. The reason, like I've said, is that in the user-facing APIs we're either adding new APIs or removing APIs that have been clearly marked as deprecated, and the policy on the project has always been that we will mark an API as deprecated and then we can remove it in any later version.

One of the nice things is that right now all of the third party drivers, geckodriver, chromedriver, both kinds of the Microsoft WebDriver for Edge, the OperaDriver, and so on and so forth, use the W3C protocol already, so all of the browsers that you're trying to use already speak the W3C protocol. All you need to do is keep on upgrading those and everything will carry on working. That is a huge accomplishment. I am incredibly pleased. The browser teams have been working super hard on this.

The Chromedriver folks have only recently managed to land all the changes but they have done it and it's fantastic and they've done it with really active involvement with the community. Jim Evans helped them a lot and they listened to him filing bug reports and acted upon it super quickly. Couldn't be happier. Such an amazing achievement.

I said we're going to be adding various bits and pieces. One of those bits and pieces is new APIs. We've always known that finding elements is hard, right? The locators provided by implementations are fairly limited, and they're really technical. ID, CSS, name: these are things that are deep down in the structure of the DOM. What we're thinking of doing is adding friendly locators.

Friendly locators are an idea that I first saw implemented in a project called Sahi, which came out of ThoughtWorks India; Narayan wrote it. It was a lovely piece of kit. One of the nice things it has is its ability to find things that are near, above, below, left of, or right of something else. It offered a positional way of finding elements. This has come to light more recently in projects such as Taiko from ThoughtWorks, which calls them relative locators. The idea started in Sahi, but it's already available in several different frameworks, and it would be nice if we could offer it in Selenium as well.

If you were to download the Selenium 4 alpha, you will not find friendly locators yet. I need to write the code. It's not quite ready to share with the world, but it should probably be in alpha two, maybe alpha three, depending on how solid we feel that it is. This will allow you to write tests where you find elements by position: find the check box to the left of this label, find the radio button below this. You can write in a nice simple way that reads really well, and it is a bit more comprehensible and hopefully a bit more robust.
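Since the Selenium API for this had not shipped at the time of the talk, the following is only a sketch of the underlying idea: positional lookups reduce to comparing bounding rectangles. The `Box` class and the `left_of` and `below` helpers are hypothetical names for this illustration and are not Selenium's API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for an element and its on-screen rectangle."""
    name: str
    x: float
    y: float
    width: float
    height: float

    @property
    def right(self) -> float:
        return self.x + self.width

    @property
    def bottom(self) -> float:
        return self.y + self.height


def left_of(anchor: Box, candidates: List[Box]) -> Optional[Box]:
    """Nearest candidate that ends before the anchor starts, on the same row."""
    hits = [
        c for c in candidates
        if c.right <= anchor.x and c.y < anchor.bottom and anchor.y < c.bottom
    ]
    return max(hits, key=lambda c: c.right, default=None)


def below(anchor: Box, candidates: List[Box]) -> Optional[Box]:
    """Nearest candidate that starts under the anchor, in the same column."""
    hits = [
        c for c in candidates
        if c.y >= anchor.bottom and c.x < anchor.right and anchor.x < c.right
    ]
    return min(hits, key=lambda c: c.y, default=None)


label = Box("label: Remember me", x=40, y=100, width=120, height=20)
checkbox = Box("checkbox", x=10, y=100, width=20, height=20)
radio = Box("radio", x=40, y=140, width=20, height=20)
elements = [checkbox, radio]

found_left = left_of(label, elements)
found_below = below(label, elements)
```

Here "the check box to the left of this label" resolves to `checkbox` and "the radio button below this" resolves to `radio`. A real implementation would read the rectangles from the browser rather than from hand-written values.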

There's a question coming in saying "is it to support Angular applications?" Not specifically. It's there to support people writing tests. We know that element locators can be super hard to write, super brittle, super hard to maintain. We want to make that as easy as possible for people.

Selenium grid is being modernized. We noticed the fact that the original Selenium grid has aged a bit, and we want to take advantage of some of the things that have come out since we originally released it. In particular, Docker, the ability to scale out using cloud infrastructure, and the rise of observability.

The original Selenium grid, or the grid that we ship now, came as part of Selenium 2, and that was released in 2011. It came out of work that François Reynaud had done at eBay, which they donated to the project. 2011 was a really long time ago. It was eight years ago, and the world has moved on. Back in 2011, we'd have been lucky if we had more than three or four machines to run it on. They tended to be servers running something like VMware, starting a new virtual machine was a heavyweight process, it was difficult, and our farms tended to be small.

Right now we can use Docker to spin up new virtual images incredibly quickly. It takes milliseconds instead of tens of seconds. Even more pleasantly, we can use things like Kubernetes to allow us to deploy into the cloud and scale horizontally, basically in an almost unlimited way. The world has moved on since 2011.

Like I say, it's Docker now which has allowed us to take simple browser images, run them, and throw them away as necessary. It's made virtual machines basically free to use, and effectively disposable. Tooling like Kubernetes allows people to set up a grid of machines a lot more simply and in a much more lightweight way. Of course, running Kubernetes on private hardware is a bit of a pain in the backside, and things like AWS, GCP, and Azure are out there. We want your grid to take advantage of that.

What have we done to change Selenium in order to support your tests? The first step was acknowledging the fact that there are tools out there that already offer this. Zalenium is excellent. It's a fantastic piece of kit that already gives you Docker support, Kubernetes, and a really nice UI. There are other tools such as Selenoid that show us that Docker is a great way of allowing you to scale a grid and keep machine management and browser management simple.

One of the things that we could do is take the ideas of Zalenium and merge them into Selenium grid. The problem is the Selenium 2 grid code base isn't really set up to make that particularly easy. Like I said, it's eight years old, and if you take a look at the way we merged the original Selenium server and the Selenium grid project, it wasn't particularly elegant. At the moment, what happens is, when you start up the Selenium stand-alone server, it goes down one of two code paths, and we haven't really merged the two separate code bases very well at all.

We could have added docker support but then we wouldn't have addressed the fundamental problem of the bifurcated code base. We need to do better. There are bits in the Selenium grid code base that are better than the original Selenium server code base. There are abstractions in the Selenium server code bases that would make writing grid easier. What are we going to do? We're going to go from Selenium grid 2 to Selenium grid 4, largely because we can't count, and we're going to re architect bits and pieces about this.

Question coming in about "do you need to worry about changes in Selenium grid?" I hope that this section will answer it for you. The new Selenium grid architecture looks like this. There are effectively four separate processes. There is a router. The router is where your messages come in and hit the Selenium grid. Let’s imagine a new session request comes in. The new session request needs to go somewhere, so it goes to the distributor. The distributor has a list of all the nodes that are currently running in the system, and what it does is select the perfect node to run your test on and then start a session on that node. The node then replies to the distributor.

The distributor makes a note of the URL of the node in the session map, and then control is returned to the user by responding to the new session request with the session capabilities. This sounds really complicated, but it's basically what we're doing now in Selenium grid. In the standalone server, the router, the session map, the distributor, and the node are a single process. In Selenium grid, as it stands right now, you will find that the router, the session map, and the distributor are the hub process and the node is the node. Which is a different way of thinking about it. What we've done is we've broken these pieces out.
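The new-session flow just described can be modelled in-process as a toy. This is a simplified sketch, not the Grid 4 code: the class names mirror the roles named in the talk, but the node-selection policy (most free slots), the use of a plain dictionary as the session map, and the example URLs are assumptions made for illustration.

```python
import uuid
from typing import Dict, List


class Node:
    """Runs browser sessions; advertises which browsers it can start."""

    def __init__(self, url: str, browsers: List[str], slots: int = 1):
        self.url = url
        self.browsers = browsers
        self.free_slots = slots

    def can_run(self, caps: dict) -> bool:
        return self.free_slots > 0 and caps.get("browserName") in self.browsers

    def new_session(self, caps: dict) -> dict:
        self.free_slots -= 1
        return {"sessionId": uuid.uuid4().hex, "capabilities": caps}


class Distributor:
    """Picks a node for each new session and records it in the session map."""

    def __init__(self, nodes: List[Node], session_map: Dict[str, str]):
        self.nodes = nodes
        self.session_map = session_map

    def new_session(self, caps: dict) -> dict:
        candidates = [n for n in self.nodes if n.can_run(caps)]
        if not candidates:
            raise RuntimeError(f"no node can run {caps}")
        # Assumed policy for this sketch: prefer the least-loaded node.
        node = max(candidates, key=lambda n: n.free_slots)
        session = node.new_session(caps)
        self.session_map[session["sessionId"]] = node.url
        return session


class Router:
    """Entry point: new-session requests are handed to the distributor."""

    def __init__(self, distributor: Distributor):
        self.distributor = distributor

    def handle_new_session(self, caps: dict) -> dict:
        return self.distributor.new_session(caps)


session_map: Dict[str, str] = {}
nodes = [
    Node("http://node-a:5555", ["firefox"]),
    Node("http://node-b:5555", ["chrome"], slots=2),
]
router = Router(Distributor(nodes, session_map))
reply = router.handle_new_session({"browserName": "chrome"})
```

After the call, `session_map` holds one entry mapping the new session ID to the URL of the node that accepted it, and `reply` carries the capabilities back to the caller. Running all four roles in one process corresponds to the standalone server; splitting them out corresponds to the distributed layout.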

One of the other things that is really interesting is that when the next command comes in after your new session, we can do something a bit different. What we do is go off to the session map, identify the node that the test is running on, and then talk directly to the node. This means that we can scale horizontally a lot more easily. For example, if we just keep a cache in the router of known session IDs to URLs, we don't even need to hit the session map each time, so the router can be, effectively, a Lambda doing the routing. The session map can be running on something such as Redis, and the distributor could be a simple leader-follower group in a Kubernetes pod. Nodes can then be spun up using Docker, run on permanent infrastructure, or run any way that you think is appropriate.
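The caching idea for follow-up commands can be sketched as below. The `CachingRouter` class, its method names, and the session ID and URL are hypothetical; the lookup callable stands in for whatever backs the session map.

```python
from typing import Callable, Dict


class CachingRouter:
    """Resolve a session ID to its node URL, asking the session map only on a miss."""

    def __init__(self, lookup: Callable[[str], str]):
        self._lookup = lookup            # e.g. a query against the session map
        self._cache: Dict[str, str] = {}
        self.lookups = 0                 # counts trips to the session map

    def node_url(self, session_id: str) -> str:
        if session_id not in self._cache:
            self.lookups += 1
            self._cache[session_id] = self._lookup(session_id)
        return self._cache[session_id]

    def forward(self, session_id: str, path: str) -> str:
        """Return the node URL that a command would be proxied to."""
        return f"{self.node_url(session_id)}/session/{session_id}{path}"

    def forget(self, session_id: str) -> None:
        """Drop a cached entry, for example once the session has been deleted."""
        self._cache.pop(session_id, None)


# A plain dict stands in for the session map in this sketch.
session_map = {"abc123": "http://node-b:5555"}
router = CachingRouter(session_map.__getitem__)

first = router.forward("abc123", "/url")
second = router.forward("abc123", "/title")
```

Both commands are routed to the same node, but the session map is consulted only once. Because each router instance holds nothing but a disposable cache, any number of them can run side by side.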

The final thing that we are landing inside Selenium grid 4 is the ability to be observable. I'm sure all of you have heard about DevOps before. Running infrastructure such as a grid is a huge commitment. It's incredibly difficult, and that's why there are companies such as BrowserStack offering the ability to use Selenium as a service. Ideally you would never manage any of this stuff yourself, right? But if you do, you want to make the management as simple and as lightweight as possible. One of the problems with distributed architectures and microservices and things like that is they're incredibly hard to debug. It's hard to figure out what's going on. It's hard to figure out why things fail. It's really hard to track stuff down. Fortunately things have been improving. Tooling such as OpenTracing has made distributed tracing available to the rest of us.

Praveen Umanath:
There is a Q&A window for questions by the way folks, so if you do have a question I will answer it, hopefully later, relatively soon in fact, and before the end of this talk.

Simon Stewart:
Observability is a mechanism to allow us to track what's going on. Effectively what happens is, as the requests come in, each of the arrows from node to node gets decorated with a trace ID, and the traces get collected in tooling such as Jaeger, from Uber, which is a really popular tool for doing this. Zipkin is another one. You can visualize what's going on, you can prepare a heatmap of what's been happening, and you can track what's going on. So debugging a modern grid, a Selenium 4 grid, will be a lot simpler, and the DevOps people, well, they're not going to high-five you, but they're going to be very happy.
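The way a trace ID decorates each hop can be illustrated with a small stdlib-only sketch. It does not use the OpenTracing API or talk to a collector such as Jaeger or Zipkin; the `span` helper, the span names, and the in-memory `spans` list are all made up for this example.

```python
import time
import uuid
from contextlib import contextmanager
from typing import Dict, List, Optional

spans: List[Dict] = []  # a real setup would export these to a collector


@contextmanager
def span(name: str, trace_id: Optional[str] = None, parent: Optional[str] = None):
    """Record one unit of work; reuse trace_id so all hops share one trace."""
    record = {
        "trace_id": trace_id or uuid.uuid4().hex,
        "span_id": uuid.uuid4().hex[:16],
        "parent": parent,
        "name": name,
    }
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["seconds"] = time.perf_counter() - start
        spans.append(record)


def node_handle(trace_id: str, parent: str) -> None:
    """The node's share of the work, tagged with the caller's trace ID."""
    with span("node.execute", trace_id, parent):
        pass  # the browser command would run here


def router_handle() -> str:
    """Handle one request, passing the trace ID along to every hop."""
    with span("router.handle") as root:
        with span("session_map.lookup", root["trace_id"], root["span_id"]):
            pass  # the session map query would run here
        node_handle(root["trace_id"], root["span_id"])
    return root["trace_id"]


trace = router_handle()
```

All three recorded spans carry the same trace ID, and the two inner spans name the router span as their parent. That shared ID is what lets a tracing UI stitch the hops of one request back together.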

So, to recap, why are we doing the major version number bump? There are many reasons, the new IDE is worthy of a version number bump in and of itself. It's a huge piece of work and we have to say thank you to all the people who have poured effort into bringing it from part of the project that was moribund and it was very difficult for us to improve to something that is actually really impressive nowadays.

When we ship the Electron version it'll be even better. We're doing the version bump as well because of the changes to WebDriver. It's still got the same familiar user facing API you're used to but with a few additions. It's got integrations with the Chrome debugging protocol, it's got the new friendly locators, and under the hood in a way that most of you will hopefully never see, it will be speaking only the W3C WebDriver wire protocol. That's super important.

Finally, we're doing the version bump because we're introducing the new Selenium grid. We're rewriting it and we're modernizing it. Even right now, when you download the Selenium server, you effectively have a grid of one. There's a bifurcated code base, and we're going to be improving that by merging the code bases, re-architecting things, cleaning it up, and coming up with simpler abstractions.

Hopefully the Selenium server binary will become smaller as well. We'll clean the code base, and we'll make it easier for people to contribute. And by the way, Selenium is an open source project. If there's something you don't like about it, if there's something you're excited about, if there's something you'd like to work on, please come along and give us a hand. There is an active Slack channel, which is linked to from the Selenium website, and if you go there you will find the core maintainers, who are available and can help. We would love your help. We would love your PRs. We're getting better at doing it. Just to give you an idea, there are two of us working on the Java bindings, there's one person working on .NET, two people working on Python, only about three or four people working on Ruby, and one person working on JavaScript. Selenium could really do with your help.

That is skimming over the changes that are coming in Selenium 4 and I will hand over to let you hear more about Selenium 4 and BrowserStack itself.

Praveen Umanath:
Great. Thanks a lot, Simon. So folks, I'm just going to spend a quick five to ten minutes talking a little bit about BrowserStack, and then also show you the changes you need to make in order to pass some of the capabilities, and show you a capability configurator, or builder, that we have and that you can use off our website to make your capabilities W3C compliant.

Just a little history about BrowserStack for many of you who might not have heard of us. We've been around for 8 years now. We had our official launch in 2011. Our first product was IE. We were allowing users to test IE from their Macs. That was a problem that our founder was trying to solve. We hit our first thousand customers in 2012, a little less than a year later. Today we have 2 million users across 135 countries. We have four products.

The product that's being discussed today is Automate. That's our Selenium testing product on desktop and mobile browsers or mobile devices. We've got Live, our interactive web-based cross-browser testing product, and last year we launched a couple of app testing products. One is similar to Automate and lets you test your mobile apps using frameworks like Appium, EarlGrey, Espresso, and XCUITest, and, like Live, we have interactive mobile app testing so you can test your mobile apps on iOS and Android devices. We've crossed 25 thousand customers. These are some of the brands that use our platform. We've also got a lot of open source programs like jQuery and Discord using us. If you're working on an open source project and you need help testing it out, please reach out to us. We'd be happy to help you out.

Let me quickly move to the impact of Selenium 4, or W3C, on BrowserStack, and on your tests on BrowserStack. To answer the question "will my tests break?", the short answer is no, your tests will continue to work. Like Simon said, backward compatibility is one of the key things. We don't want your tests to suddenly stop working. I think this is a great opportunity for all of us to update our test suites and start using W3C-compliant capabilities. I'm just going to walk through a couple of examples with you on what this looks like, and then dive into our website and show you how you can do that.

Some of the key changes between JSON wire and W3C are just basic changes in the way you name your capabilities. These are some of the BrowserStack-specific capabilities. All of this is in the documentation; we'll share links with you after this webinar, along with the video recording. Here's an example. This is in Java using JSON wire, so you can see the capabilities on top and the BrowserStack-specific capabilities, such as BrowserStack local and debug, in lines 40, 41, and 42. You will see in the command line that it says OSS. All I need to do to switch this to W3C is to make these changes. The key thing to understand is that now you're passing all your BrowserStack capabilities through bstack:options: you put the capabilities in there with the new naming conventions and pass them through bstack:options. It will tell you you're using W3C as the protocol. It's very simple going from JSON wire to W3C, and I'll show you how to do that using our capability builder tool.
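The slides themselves are not part of the transcript, so the following is a hedged reconstruction of the kind of change being described, written in Python rather than the Java shown on screen. A JSON wire new-session request sends a flat `desiredCapabilities` object, while a W3C request nests standard names under `capabilities.alwaysMatch` and puts vendor settings under a namespaced key (`bstack:options` for BrowserStack). The specific key names in the two mapping tables are assumptions modelled on BrowserStack's naming and should be checked against its documentation.

```python
import json

# Assumed mapping from legacy key names to W3C-style equivalents.
STANDARD = {"browser": "browserName", "browser_version": "browserVersion"}
VENDOR = {
    "os": "os",
    "os_version": "osVersion",
    "browserstack.local": "local",
    "browserstack.debug": "debug",
}


def to_json_wire(legacy: dict) -> dict:
    """Legacy request body: one flat desiredCapabilities object."""
    return {"desiredCapabilities": dict(legacy)}


def to_w3c(legacy: dict) -> dict:
    """W3C request body: standard keys plus one namespaced vendor block."""
    always_match: dict = {}
    options: dict = {}
    for key, value in legacy.items():
        if key in STANDARD:
            always_match[STANDARD[key]] = value
        elif key in VENDOR:
            options[VENDOR[key]] = value
        else:
            raise KeyError(f"no mapping known for {key!r}")
    if options:
        always_match["bstack:options"] = options
    return {"capabilities": {"alwaysMatch": always_match}}


legacy = {
    "browser": "Chrome",
    "os": "Windows",
    "os_version": "10",
    "browserstack.local": "false",
}
w3c = to_w3c(legacy)
w3c_body = json.dumps(w3c, indent=2)
```

The W3C specification only accepts non-standard capabilities whose names contain a colon, which is why the vendor settings move under a prefixed key instead of sitting next to `browserName`.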

Here's another sample. I'm just running a test now on a real device. I'm running this test on the iPhone XS. Same sort of concept, using bstack:options. I'm going to exit out of the presentation and show you how you can do this from our website. There is a page at browserstack.com/automate/capabilities; again, we'll share the link with you. You've got two tabs here. This capability generator is available to all of you to use. This is the legacy, or JSON Wire protocol, tab. I can go in here and pick what browser and operating system I want to test on, and you will see the capabilities getting updated on the right-hand side. If I want to switch to W3C, I just click on this tab and it starts putting in our capabilities here in the W3C-compliant format. I can switch between languages here. It will show how to pass these capabilities in my test suites.

Very simple. As I make changes here, you will see it getting updated on the right. That's it. You need to enter your capabilities here using our selector. They will be generated in the language of your choice here, and then you can copy and paste this into your test suites. I'm going to switch back to the slide deck now.

I talked about this. I will now stop and we can handle some questions.

Simon Stewart:
I've just been having a quick look at some of the questions. There are some themes. The first one is, people are worried about the fact that I'm in a car. The reason is I've just been to my son's school's parents' evening, and I either had to drive at unsafe speeds back to London or sit in the car park of his school and do it there. Rather than do something dangerous I thought I would just sit somewhere safe and do it, and there was no way I was going to miss my son's parents' evening, right? Education is super important and I believe that.

The second theme of the questions is people asking for dates of when we're going to do things. We always joke about when we will do a release. With Selenium 3 it was by Christmas. With Selenium 4 it's probably Chinese New Year, but we never really mention which year it's going to be. That's because we're an open source project.

We're going as fast as we can, but there is literally no one in the team working on the language bindings who is paid to work on Selenium. We all do it in our spare time when we have the capacity to do it. If you go back when we started this in 2007, if you go back 12 years, then we were all younger, we had more time, we had more energy, and we could do this a lot faster. Right now, we do it when we can.

ETA, hopefully Chinese New Year. I want it to come out as soon as possible. The language bindings other than Java are in pretty good shape. We need to land the building blocks for the friendly locators so they can be used by everyone, but it won't take long for everyone to do that. The Chrome debugging protocol integration is also coming. The ETA is as soon as possible but no sooner.

There are also questions about whether the grid will support video playback. This is one of the features that is really nice in Zalenium, and also in Selenoid and other products, and I think it's something that people have come to expect as sort of table stakes for us to integrate. If you would like to send a PR that would help us integrate that, I will merge it pronto. Otherwise, yes, it is a planned feature; no, I don't know when I will get to it, but I will get to it as soon as I can.

I keep saying send us a PR, feel free to contribute. Contributing seems daunting but it's not actually that bad. The first thing to do is to go to GitHub, fork the project, and download your clone. The second thing to do is to open up your IDE of choice and start hacking on the code.

Everything is set up to run, if you're using Java, within IntelliJ, and there is a free community version of that so you don't need to spend any money to do that, and we check in the configs for it. There are also Visual Studio declarations for .NET, and you can use your favorite editor for the Ruby and the Python bindings. Just download the code, fork it, start hacking on it. If you can write a test for a new feature that would be amazing. Feel free to copy and paste one of the existing tests in your language of choice and then away you go. That's great. If you get stuck, if you need help, if you're not familiar with the architecture, if you really don't know how anything works, just come along to the Slack channel. The Slack channel is where the core developers hang out. You can ask us and one of us will come and help you. Also, my DMs are open on Twitter. You can always ping me and I will try to answer helpfully and rapidly.

PHP WebDriver which is maintained by Facebook won't get any of this right. I need to contact the PHP WebDriver developers. They know that the W3C changes are coming down the track. If they don't do anything, if they don't have an opportunity to update things, then what they can do is they can do nothing and whoever runs PHP WebDriver can use the Selenium server in order to get support for the legacy protocol.

Going through friendly locators, what about things in the shadow DOM? "I don't know" is the answer to that.

There was a very specific question from Sanny which was, "When a Selenium test fails due to an element not being there, or an element not having the correct attributes, the real work starts: finding out what went wrong. Currently the only real data we can use is screenshots and videos. Selenium alternatives like Cypress will allow you to go back in time and inspect the DOM tree as it was during testing. These kinds of features are huge time savers when debugging test failures. Will Selenium offer these types of features in the future or do you know of any third parties that do?"

Selenium already offers this as a capability. It's one of the reasons that we wrote the tooling that we did. The way that you do it isn't particularly clear. When I wrote Selenium in the first place, when I wrote the WebDriver APIs, what I expected was that it was going to be the machine code of browser automation. It was going to be this library and you could build tools on top of it, and people would build the abstractions that they found most appropriate for their testing.

When I realized that people were using the raw API, I helped popularize the idea of the Page Object pattern, and Antony Marcano came out with the Screenplay pattern as well. There are these various approaches. My thinking was that people were going to do more object-oriented development and they would very rarely use the WebDriver APIs directly. That isn't correct. That's one of the reasons we're building things like friendly locators into Selenium 4. However, buried in the depths of most of the driver implementations and language bindings there is a class called EventFiringWebDriver or something like that, which will take an existing WebDriver and will fire events as exciting things happen.

If you wanted to, for example, grab the DOM for every action, then you would use the EventFiringWebDriver and you would just grab the DOM after every interaction and store them in a file. The raw capabilities are there. It's pretty clear to me that what we need to do is provide a better support library to help people take advantage of that.
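To make the idea of grabbing the DOM after every interaction concrete, here is a minimal Python sketch of a listener that writes the page source to a numbered file after each event. The hook names `after_navigate_to`, `after_click`, and `after_change_value_of` follow the Python bindings' `AbstractEventListener`; in a real suite the class would subclass that and be passed to `EventFiringWebDriver`. Here it is a plain class, and `FakeDriver` is a hypothetical stand-in, so that the sketch runs without Selenium or a browser.

```python
import os
import tempfile


class DomRecordingListener:
    """Saves the current DOM to a numbered HTML file after each event."""

    def __init__(self, output_dir):
        self.output_dir = output_dir
        self.counter = 0

    def _save(self, label, driver):
        self.counter += 1
        name = f"{self.counter:04d}_{label}.html"
        path = os.path.join(self.output_dir, name)
        with open(path, "w", encoding="utf-8") as handle:
            # page_source is the serialized DOM as the browser sees it now.
            handle.write(driver.page_source)
        return path

    def after_navigate_to(self, url, driver):
        self._save("navigate", driver)

    def after_click(self, element, driver):
        self._save("click", driver)

    def after_change_value_of(self, element, driver):
        self._save("change_value", driver)


class FakeDriver:
    """Hypothetical stand-in exposing only page_source."""

    page_source = "<html><body>hello</body></html>"


output_dir = tempfile.mkdtemp()
listener = DomRecordingListener(output_dir)
listener.after_navigate_to("https://example.com", FakeDriver())
listener.after_click(None, FakeDriver())
saved_files = sorted(os.listdir(output_dir))
```

When a test fails, the numbered files give a step-by-step record of the DOM that can be opened and inspected after the run, which is roughly the "go back in time" ability the question asked about.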

If somebody wants to send some patches and send some suggestions and nice tooling to do that, we can reify that in a slightly clearer way, but I'll be clear, those capabilities have been present in Selenium for quite some time if you were willing to do the work yourself. It's clear to me that the thing we didn't do successfully as a project was make it clear how simple some of the stuff could be. I apologize for that because clearly some of you have been suffering, and that's not a good thing.

Praveen, you want to take any questions?

Praveen Umanath:

Yes, I'm sorry, I'm just going to quit the presentation in order to see the questions. I think there were some questions around, one was around open source with BrowserStack, so I think if you are working on an open source project, any open source project, and you need BrowserStack to be able to test your project on different websites and mobile apps, please reach out to us. You can go to browserstack.com/open-source or just send an e-mail to support@browserstack.com. All we need is an active project on GitHub or any other management tool that we can look at and know that it's open source. That's the only thing we ask, and then we will give you a fully featured version of BrowserStack, whether it's live manual testing or automated testing.

There were a couple of folks who asked about security and I think somebody asked about HIPAA compliance. That's not something that's relevant to us; however, we are SOC 2, GDPR, and Privacy Shield certified. To whatever extent it's possible to get the certifications, we have done that. Feel free to reach out to us on our website if you have more questions about security. I will also send out an e-mail after this with a link to our security page where you can look at all the things we are certified with.

A couple of folks asked about real devices. Is it a real device? We don't use emulators, so we've got data centers around the world and they are actual real iOS and Android devices that we have in our own data centers. You are being connected to those devices.

Simon Stewart:
There was a question about whether we are working with the people who are implementing the W3C protocol, and how we ensure that everyone implements it equally. The browser vendors are all members of the W3C and they are implementing the W3C specification, which is at w3.org/TR/webdriver. There is a suite of tests called the web platform tests, and there are a significant number of Selenium tests in there, WebDriver tests, to verify the browsers implement the protocol correctly.

Also I am talking to people such as BrowserStack to ensure when Selenium 4 comes out everyone is ready, and I've been really happy just working with BrowserStack to help make that happen.

Praveen, I'll let you carry on talking.

Praveen Umanath:
Sure, thanks, thanks, Simon. I think there were some questions around how long will we support the legacy protocol. We have no plans to stop support for them. Like Simon mentioned in his presentation, we want to make sure that your tests don't break and we're going to give you as much time as possible. I don't think that's something you need to worry about right now.

Simon, there were some questions around features like downloads, file uploads. Any updates you can give around that?

Simon Stewart:
So, downloads is an interesting question. One of the things that the project has, for a long time, recommended is that people don't test file downloads. If a web browser can't download a file, then the web browser itself is useless. We know that web browsers can download files. What we do want to be able to do is verify that when you go to this URL, then the correct file is generated.

That isn't an end-to-end test. That is more of an integration test that you can write. Perhaps you use curl if you were super keen to hook everything together. Having said that, there are things you can do to force browsers to set a default download directory where things are downloaded automatically. Sadly that is browser dependent, so you need to figure out the correct flags to pass, but you can make that work if you really need to. Generally, though, testing that browsers can download files from the internet is somewhat redundant.
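As one way to picture the integration-test approach, here is a minimal Python sketch that fetches a URL directly, with no browser involved, and reports the content type and a SHA-256 digest of the body so a test can compare them with what it expects. It uses the standard library in place of curl; the helper names `fetch_download` and `check_download` are hypothetical, and the `data:` URL at the end is only there so the sketch runs offline.

```python
import hashlib
from urllib.request import urlopen


def fetch_download(url, timeout=30):
    """Fetch url and return (content_type, sha256_hex) for the response body."""
    with urlopen(url, timeout=timeout) as response:
        content_type = response.headers.get_content_type()
        body = response.read()
    return content_type, hashlib.sha256(body).hexdigest()


def check_download(url, expected_type, expected_sha256):
    """Return True when the served file has the expected type and contents."""
    content_type, digest = fetch_download(url)
    return content_type == expected_type and digest == expected_sha256


# Offline demonstration: a data: URL stands in for the real download link.
content_type, digest = fetch_download("data:text/plain,hello")
```

In a real suite the URL would be the download link taken from the page under test, and any session cookies the application needs would have to be sent along with the request.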

I know I'm being flippant and I know that some of you have actual hard requirements which is why there are the abilities to set the download directories directly.

If you are using a browser based on Chromium and you're using Selenium 4, the CDP integration, the Chrome Debugging Protocol integration, will enable you to take advantage of the underlying protocol commands in the Chrome Debugging Protocol to set the download directory. I should point out that that sort of capability is really useful when you're running on localhost, but once you're running on grid, once your test is no longer running on the same machine as the browser, then things get a little bit tricky, and I'm not sure how companies such as BrowserStack would handle that.
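A minimal Python sketch of what setting the download directory over the debugging protocol could look like follows. It assumes a driver object with an `execute_cdp_cmd` method, as the Python bindings provide for Chromium-based browsers, and uses the `Page.setDownloadBehavior` protocol command, which is marked experimental and may change between Chrome versions. The helper `allow_downloads` and the `RecordingDriver` stand-in are hypothetical; the stand-in only records the call so the sketch runs without a browser.

```python
def allow_downloads(driver, download_dir):
    """Ask a Chromium-based browser to save downloads into download_dir."""
    params = {"behavior": "allow", "downloadPath": download_dir}
    # Sends a raw debugging-protocol command through the driver.
    return driver.execute_cdp_cmd("Page.setDownloadBehavior", params)


class RecordingDriver:
    """Hypothetical stand-in that records protocol commands instead of sending them."""

    def __init__(self):
        self.calls = []

    def execute_cdp_cmd(self, cmd, cmd_args):
        self.calls.append((cmd, cmd_args))
        return {}


driver = RecordingDriver()
allow_downloads(driver, "/tmp/downloads")
```

As noted above, the path refers to the machine the browser runs on, so on a remote grid the downloaded file ends up on the node rather than next to the test code.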

DesiredCapabilities is deprecated. I see a lot of questions about this. In the Java world, and in the .NET world, and in fact in all of the bindings, we are encouraging people, if they're instantiating a particular browser, to use the options classes for that. There's FirefoxOptions, InternetExplorerOptions, EdgeOptions, and so on and so forth, in most of the language bindings. They give you a strongly typed API that you can use to set capabilities as you need to.

If, however, you're using grid or using BrowserStack, then you can use MutableCapabilities to set up what you need to do. There are also ImmutableCapabilities, which are capabilities you cannot change, which take a map and which you can use to construct things. Either set the options that are specific to the browser using the specific subclasses for each browser, or use something like MutableCapabilities. I can't remember for the life of me what the .NET equivalent of MutableCapabilities is, but there is something.

Praveen Umanath:
I saw some questions around that, so we will be updating the builder. We're in the process of adding that and changing it to MutableCapabilities, so we will make sure that gets done soon.

I think, Simon, there was a question on how to join the Slack channel or what the link was to that for the Selenium project.

Simon Stewart:
Okay, if you go to the SeleniumHQ website then you can go to the support area, and there is a link and you can get an invitation to join the Slack channel directly from there. I don't use Slack very often, I use IRC, so I'd have to figure these things out. I would tell you it's something like seleniumhq.slack.com, I believe, but the actual link itself is on the support section of the website at seleniumhq.org/support.

Somebody in the Zoom webinar chat has very kindly posted a link to the Slack channel invitation. If you want to join the Slack right now, you can jump right there and join in the fun.

Praveen Umanath:
Awesome, let me just go through ... I think there were some questions around general support. Really good questions; I encourage you to just reach out to our support team. They're very responsive. They will be able to help you troubleshoot issues. I think somebody asked about Firefox and local testing, so reach out to our support team and they'll be happy to help you.

Let me see. Simon, I think a couple of people were asking about, I know this is not exactly your area, but people were asking about Appium, the W3C support in Appium.

Simon Stewart:
I met up with Jonathan Lipps, who is the lead of the Appium project, recently, and we were having a chat. He tells me that Appium already supports the W3C protocol. That's great news. I think there are sub-projects that Appium uses to control specific things, like Appium's WebDriverAgent and other bits and pieces. I think they're being updated to also support the W3C protocol, which is great news. The official line is it supports the W3C protocol.

Praveen Umanath:
Great. All right. I think what we will do after the webinar is we will go through the entire list of questions, make sure we reach out to you guys and make sure that we share answers to whatever we couldn't answer during this session. I think we'll start wrapping up now. Simon, thanks a lot again for taking the time and working with us to host this webinar. Thank you everyone from around the world for joining.

I think we hit 400 plus participants at one point in the webinar so I really appreciate everyone taking the time to join us. Like I said, we will be e-mailing everyone who registered for the webinar with a link to the recording and we'll address all the questions that were asked during this webinar and we will also share the slides that we used during this webinar. I think that is pretty much it. Thanks a lot, Simon, again. Have a great day and thank you everyone else for joining. I really appreciate your time.

Simon Stewart:
Thank you everyone, and thank you Praveen for inviting me, it's very kind of you.

Praveen Umanath:
Thank you. Thanks folks, have a great day, bye.

Simon Stewart:
Bye.
