Let the Engineers Speak: Selectors in WebdriverIO

Getting Started — Published March 30, 2023

Earlier this month, Applitools hosted a webinar, Let the Engineers Speak: Selectors, where testing experts discussed one of the most common pain points that pretty much anyone who’s ever done web UI testing has felt. In this two-part blog series, I will recap what our experts had to say about locating their web elements using their respective testing frameworks: WebDriverIO and Cypress.

Introducing our experts

I’m Pandy Knight, the Automation Panda. I moderated our two testing experts in our discussion of selectors:

In each article, we’re going to compare and contrast the approaches between these two frameworks. This comparison is meant to be collaborative, not competitive. This series is meant to showcase and highlight the awesomeness that we have with modern frameworks and tools available.

This first article in our two-part series will define our terms and challenges, as well as recap Christian’s WebDriverIO selectors tutorial and WebdriverIO Q&A. The upcoming second article will recap Filip’s Cypress selectors tutorial and Cypress Q&A.

Defining our terms

Let’s define the words that will be used in this series to frame the discussion and so that we have the same understanding as we discuss selectors, locators, and elements:

  • Selectors: A selector is a text-based query used for locating elements. Selectors could be things like CSS selectors, XPaths, or even things like an ID or a class name. In modern frameworks, you can even use things like text selectors to query by the text inside of a button.
  • Locators: A locator is an object in the tool or framework you use that uses the selector for finding the element on the active page.
  • Elements: The element itself is the entity on the page that exists. An element could be a button, it could be a label, an input field, or a text area. Pretty much anything in HTML that you put on a page is an element.

In one sentence, locators use selectors to find elements. We may sometimes use the terms “selector” and “locator” interchangeably, especially in more modern frameworks.

Challenges of locating web elements

We’re going to talk about selectors, locators, and the pain of trying to get elements the right way. In the simple cases, there may be a button on a page with an ID, and you can just use the ID to get it. But what happens if that element is stuck in a shadow DOM? What if it’s in an iframe? We have to do special things to be able to find these elements.

Locating web elements with WebdriverIO

Christian Bromann demonstrated effective web element locator strategies with WebDriverIO. In this series, we used Trello as the example web app to compare the frameworks. For the purposes of the demo, the app is wrapped over a component test, allowing Christian to work with web on the console to easier show you how to find certain elements on the page.

Note: All the following text in this section is based on Christian’s part of the webinar. Christian’s repository is available on GitHub.

Creating a board

First, we want to start creating a new board. We want to write a test to inspect the input element that the end user sees. From the dev tools, the input element can be found.

There are various ways to query for that input element:

  • We could fetch the element with WebdriverIO using $ or $$ commands, depending if you want to fetch one element, the first element on the page, or all elements. Using an input as a selector is not advised, as there could be more inputs and then the wrong input could be fetched.
  • We could also use a CSS class called px-2. Using CSS classes is also not advised, as oftentimes these classes are meant to style an element, not necessarily locate it.
  • We could use a property, such as the name of the input using the data attribute. This is the suggested way to locate a web element in WebdriverIO – using the accessibility properties of an element.

WebdriverIO has a more simplified way to use accessibility properties. This approach is a much better selector, because it’s actually what the user is seeing and interacting with. To make things easier, Chrome helps with finding the accessibility selector for every element under the accessibility tab.

Note: WebdriverIO doesn’t use the accessibility API of the browser and instead uses a complex XPath to compute the accessibility name.

After you’ve located the input element, press enter using the keys API.

  it('can create an initial board', async () => {
    await $('aria/Name of your first board').setValue('Let the Engineers Speak')
    await browser.keys(Key.Enter)
    await expect(browser).toHaveUrlContaining('/board/1')
    
    await browser.eyesCheck('Empty board')
  })

Creating a list

Now that our board is created, we want to start a list. First, we inspect the “Add list” web element for the accessibility name, and then use that to click the element.

Set the title of the list, and then press enter using the keys API.

  it('can add a list on the board', async () => {
    await $('aria/Enter list title...').setValue('Talking Points')
    await $('aria/Add list').click()

    /**
     * Select element by JS function as it is much more perfomant this way
     * (1 roundtrip vs nth roundtrips)
     */
    await expect($(() => (
      [...document.querySelectorAll('input')]
        .find((input) => input.value === 'Talking Points')
    ))).toBePresent()
  })

Adding list items

To add to the list we created, the button is a different element that’s not accessible, as the accessibility name is empty. Another way to approach this is to use a WebdriverIO function that allows me to add a locator search applicator strategy.

The example applicator strategy basically queries all this on the page to find all divs that have no children and that text of a selector that you provide, and now you should be able to query that custom element. After getting the custom elements located, you can assert that three cards have been created as expected. You can inject your JavaScript script to be run in the browser if you don’t have accessibility selectors to work with.

  it('can add a card to the list', async () => {
    await $('aria/Add another card').click()
    await $('aria/Enter a title for this card...').addValue('Selectors')
    await browser.keys(Key.Enter)
    await expect($$('div[data-cy="card"]')).toBeElementsArrayOfSize(1)
    await $('aria/Enter a title for this card...').addValue('Shadow DOM')
    await browser.keys(Key.Enter)
    await $('aria/Enter a title for this card...').addValue('Visual Testing')
    await browser.keys(Key.Enter)
    await expect($$('div[data-cy="card"]')).toBeElementsArrayOfSize(3)

    await browser.eyesCheck('board with items')
  })

Starring a board

Next, we’ll “star” our board by traversing the DOM using WebdriverIO commands. We need to first locate the element that has the name of the board title using the attribute selector and going to the parent element.

From that parent element, to get to the star button, we chain the next command and call the next element. Now we can click the star and see it change to an enabled state. So with WebdriverIO, you can chain all these element queries and then add your action at the end. WebdriverIO uses a proxy in the background to transform everything and execute the promises after each other so that they’re available. One last thing you can also do is query elements by finding links with certain text.

Summarizing Christian’s suggestions for using selectors in WebdriverIO, always try to use the accessibility name of an element. You have the dev tools that give you the name of it. If you don’t have the accessibility name available, improve the accessibility of your application if possible. And if that’s not an option, there are other tricks like finding an element using JavaScript through propertis that the DOM has.

  it('can star the board', async () => {
    const startBtn = $('aria/Let the Engineers Speak').parentElement().nextElement()
    await startBtn.click()
    await expect(startBtn).toHaveStyle({ color: 'rgba(253,224,71,1)' })
  })

Accessing a shadow root

Using a basic web HTML timer component as an example, Christian discussed shadow roots. The example has multiple elements and a nested shadow root. Trying to access a button within the shadow root results in an error that you don’t have access from the main page to the shadow root. WebdriverIO has two things to deal with this challenge. The first method is the deep shadow root selector. This allows you to access all of the shadow root elements or filter by defined attributes of the elements in the shadow root.

A different way to access elements in the shadow is using the browser shadow command, which basically allows you to switch to the search within the shadow root.

    describe('using deep shadow selector (>>>)', () => {
        beforeEach(async () => {
            await browser.url('https://lit.dev/playground/#sample=docs%2Fwhat-is-lit&view-mode=preview')

            const iframe = await $('>>> iframe[title="Project preview"]')
            await browser.waitUntil(async () => (
                (await iframe.getAttribute('src')) !== ''))
            
            await browser.switchToFrame(iframe)
            await browser.waitUntil(async () => (await $('my-timer').getText()) !== '')
        })

        it('should check the timer components to work', async () => {
            for (const customElem of await $$('my-timer')) {
                const originalValue = await customElem.getText()
                await customElem.$('>>> footer').$('span:first-child').click()
                await sleep()
                await customElem.$('>>> footer').$('span:first-child').click()
                await expect(customElem).not.toHaveTextContaining(originalValue)

                await customElem.$('>>> footer').$('span:last-child').click()
                await expect(customElem).toHaveTextContaining(originalValue)
            }
        })
    })

Integrating with Applitools

Lastly, Christian shared using Applitools with WebdriverIO. The Applitools Eyes SDK for WebdriverIO is imported to take snapshots of the app as the test suite runs and upload them to the Applitools Eyes server.

if (process.argv.find((arg) => arg.includes('applitools.e2e.ts'))) {
    config.services?.push([EyesService, {
        viewportSize: {width: 1200, height: 800},
        batch: {name: 'WebdriverIO Test'},
        useVisualGrid: true,
        browsersInfo: [
            {width: 1200, height: 800, name: 'chrome'},
            {width: 1200, height: 800, name: 'firefox'}
        ]
    }])

With this imported, our tests can be simplified to remove some of the functional assertions, because Applitools does this for you. From the Applitools Eyes Test Manager, you can see the tests have been compared against Firefox and Chrome at the same time, even though only one test was run.

WebdriverIO selectors Q&A

After Christian shared his WebdriverIO demonstration, we turned it over to our Q&A, where Christian responded to questions from the audience.

Using Chrome DevTools

Audience question: Is there documentation on how to run WebdriverIO in DevTools like this to be able to use the browser and $ and $$ commands? This would be very helpful for us for day-to-day test implementation.
Christian’s response: [The demo] is actually not using DevTools at all. It uses ChromeDriver in the background, which is automatically spun up by the test runner using the ChromeDriver service. So there’s no DevTools involved. You can also use the DevTools protocol to automate the browser. The functionality is the same. WebdriverIO executes the same XPath. There’s a compliant DevTools implementation to the WebdriverIO protocol so that the WebdriverIO APIs works on both. But you really don’t need DevTools to use all these accessibility selectors to test your application.

Integrating with Chrome Console

Question: How is WebdriverIO integrated with Chrome Console to type await browser.getHtml() etc.?
Christian’s response: With component tests, it’s similar to what other frameworks do. You actually run a website that is generated by WebdriverIO. WebdriverIO injects a couple of JavaScript scripts and it loads web within that page as well. And then it basically sends commands back to the Node.js world where it’s then executed by ChromeDriver and then the responses are being sent back to the browser. So basically WebdriverIO as a framework is being injected into the browser to give you all the access into the APIs. The actual commands are, however, run by Chrome Driver.

Testing in other local languages

Question: If we’re also testing in other languages (like English, Spanish, French, etc.), wouldn’t using the accessibility text fail? Would the ID not be faster in finding the element as well?
Christian’s response: If you have a website that has multiple languages and you have a system to inject those or to maintain the languages, you can use this tool to fetch the accessibility name of that particular language you test the website in. Otherwise, you can say I only tested one language, because I would assume it would not be different in other languages. Or you create a library or a JSON file that contains the same selectors for different languages but the same accessibility name for different languages. And then you import that JSON to your test and just reference the label to have the right language. So there are ways to go around it. It would make it more difficult in different languages, obviously, but still from the way to maintain end-to-end tests and maintain tests in general, I would always recommend accessibility names and accessibility labels.

Using div.board or .board

Question: Do you have a stance/opinion on div.board versus .board for CSS selectors?
Christian’s response: I would always prefer the first one, just more specific. You know, anyone could add another board class name to an input element or what not. And so being very specific is usually the best way to go in my recommendation.

Tracking API call execution

Question: How can we find out if an API call has executed when we click on an element?
Christian’s response: WebdriverIO has mocking capabilities, so you can look into when a certain URL pattern has been called by the browser. Unfortunately, that’s fairly based on DevTools because WebdriverIO, the first protocol, doesn’t support it. We are working with the browser vendors and the new WebdriverIO binary protocol to get mocking of URLs and stubbing and all that stuff to land in the new browsers. And so it’ll be available across not only Chrome, but also Firefox, Safari, and so on.

Working with iframes

Question: Is it possible to select elements from an iframe and work with an iframe like normal?

Christian’s response: An iframe is almost similar to a shadow DOM – just that an iframe has much more browser context than just a shadow DOM. WebdriverIO has an issue currently. To implement the deep shadow selector for iframes as well, you could do three characters and then name any CSS path and it would make finding an iframe very easy. But you can always switch to an iframe and then query it. So it’s a little bit more difficult with iframes, but it’s doable.

Learn more

So that covers our expert’s takeaways on locating web elements using WebdriverIO. In the next article in our two-part series, I’ll recap Filip’s tutorial using Cypress, as well as the Cypress Q&A with the audience.

If you want to learn more about any of these tools, frameworks like Cypress, WebDriverIO, or specifically web element locator strategies, be sure to check out Test Automation University. All the courses and content are free. You can also learn more about visual testing with Applitools in the Visual Testing learning path.
Be sure to register for the upcoming Let the Engineers Speak webinar series installment on Test Maintainability coming in May. Engineers Maaret Pyhäjärvi from Selenium and Ed Manlove from Robot Framework will be discussing test maintenance and test maintainability with a live Q&A.

Are you ready?

Get started Schedule a demo