In this part of the article series, we will learn one of the most fundamental topics in test automation: locating web elements so that we can use Selenium WebDriver to interact with them and perform different kinds of operations.

This is a really fun section, but also an extremely critical one.

7 steps of Selenium scripts

Let’s talk a little bit about Selenium scripts. A Selenium script consists of seven steps:

1. Instantiating a Web driver object

We’ve already seen this in the first part of the series, when we started the browser in our test:

WebDriver driver = new ChromeDriver();

2. Navigating to the web page 

This is done with the get() method, using the page URL.

driver.get("https://ultimateqa.com/");

3. Locating the elements on the Web page

Here’s an example for locating an element by its ID (we’ll talk about locators later on):

driver.findElement(By.id("sign-in"));

4. Ensuring that the browser is in the correct state

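We can do this by using explicit waits, where we specify the condition we want to wait for. In this case, we wait up to 10 seconds for the presence of the located element (in Selenium 4, the timeout is passed as Duration.ofSeconds(10) instead of a plain number):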

WebDriverWait wait = new WebDriverWait(driver, 10);

WebElement signIn = wait.until(ExpectedConditions.presenceOfElementLocated(By.id("sign-in")));

5. Taking action

For example, clicking on an element:

signIn.click();

6. Recording the result

Here’s an example that uses TestWatcher, which comes with JUnit and lets us record the results. If our test fails, we print a message saying that it failed, and if it succeeds, we print a message saying that it succeeded.

https://gist.github.com/ultimate-qa/70ff366adb4346a7965bcd6d26104ddf

7. Quitting the driver

Whenever you start a driver, you also want to quit it at the end of the test. It ends the session and kills the browser instance so that you don’t have browser instances building up on your machine and hogging up your RAM. But it’s also really important when we start doing parallelization.

driver.quit();

Basic understanding of HTML

As automation engineers, you’re going to have to learn how to use HTML in order to identify your web elements. The better your understanding of HTML, the easier it will be for you to identify web elements. 

The very first thing you should know about in HTML is the tag. A tag has a name and is placed between angle brackets (<>). An HTML element consists of an opening tag, content, and a closing tag. The closing tag has a forward slash before the name. If you’re familiar with XML, this will look very similar.

The other very important thing that you need to know about is HTML attributes. Attributes are placed inside the opening tag. They are basically something that you can add to help differentiate tags from each other. They can also give a tag different kinds of behavior. For example, an h1 tag (which is a heading) can have an attribute, written as a name set equal to some value, inside its opening tag. And of course, the element still ends with its closing tag.
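To make the anatomy concrete, here is a minimal sketch that takes one element apart with the XML parser that ships with the JDK. The markup, the id value, and the class name TagAnatomy are made up for illustration. Keep in mind that this parser only accepts well-formed markup, which real-world HTML often is not.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class TagAnatomy {
    // Hypothetical markup: opening tag with one attribute, content, closing tag.
    static final String MARKUP = "<h1 id=\"page-title\">Welcome</h1>";

    /** Splits a single element into its tag name, one attribute value, and its content. */
    static String describe(String markup, String attributeName) {
        try {
            Element element = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)))
                    .getDocumentElement();
            return "tag=" + element.getTagName()
                    + ", " + attributeName + "=" + element.getAttribute(attributeName)
                    + ", content=" + element.getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Markup must be well-formed: " + markup, e);
        }
    }

    public static void main(String[] args) {
        // Prints: tag=h1, id=page-title, content=Welcome
        System.out.println(describe(MARKUP, "id"));
    }
}
```

The tag name, the attribute, and the content are exactly the three pieces we will lean on later when we build locators.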

How to use Chrome for element location

Next, let’s see how to use Chrome browser so that you can inspect the HTML of different elements. There are several ways to do this. 

One way to do this is to open up the page that you want to interact with and click on the Customize menu in the top-right corner of Chrome. From here, click on More Tools -> Developer Tools. 

In Developer Tools, you can use the button with the pointer icon to pick the element on the page that you want to inspect.

You can also perform searches in Developer Tools by using the Ctrl + F key combination. Here, you can search by an element ID, XPath, or another locator.

Another important thing is the panel on the right-hand side, which shows the CSS styles for the selected element.

Another way to open up the Developer tools is to come to an element, right-click on it and select Inspect. That will open up developer tools and automatically put you on the selected element in the HTML source code. 

HTML tags have specific meanings, and that allows the HTML to function based on the appropriate tags. Of course, you can set attributes depending on what you need. We can use all of these, such as the class, the ID, and the tag name, to help us with web element identification.

A really cool thing you can do in Developer Tools is right-click an element and, under Copy, select one of these options:

  • Copy element – this will copy the element’s HTML, like this:

<a class="et_pb_button et_pb_button_5 et_pb_bg_layout_light" href="">Button</a>

  • Copy selector – this will copy the absolute CSS selector (we’ll get to that in more detail soon)

#post-579 > div > div.et-l.et-l--post > div > div > div.et_pb_row.et_pb_row_2.et_pb_row_4col > div.et_pb_column.et_pb_column_1_4.et_pb_column_3.et_pb_css_mix_blend_mode_passthrough > div.et_pb_button_module_wrapper.et_pb_button_5_wrapper.et_pb_button_alignment_left.et_pb_module > a

  • Copy full XPath

/html/body/div[1]/div/div/div/article/div/div[1]/div/div/div[3]/div[2]/div[3]/a

  • Copy XPath

//*[@id="post-579"]/div/div[1]/div/div/div[3]/div[2]/div[3]/a

Using absolute CSS and XPath is not recommended, because if the elements are moved on the page, Selenium will not find them anymore. 
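Here is a small sketch of why that is. It uses the XPath engine built into the JDK rather than a real browser, and the two page snippets, the sign-in ID, and the class name are assumptions made up for the demo. The only difference between the two pages is that a developer wrapped the content in an extra section element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class BrittleLocatorDemo {
    // Hypothetical page before and after a layout change (an extra <section> wrapper).
    static final String BEFORE =
            "<html><body><div><a id=\"sign-in\">Sign in</a></div></body></html>";
    static final String AFTER =
            "<html><body><section><div><a id=\"sign-in\">Sign in</a></div></section></body></html>";

    static final String ABSOLUTE = "/html/body/div/a";    // depends on every ancestor
    static final String RELATIVE = "//a[@id='sign-in']";  // depends only on the ID

    /** Returns how many nodes the XPath matches in the given markup. */
    static int count(String markup, String xpath) {
        try {
            Document page = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, page, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("absolute, before: " + count(BEFORE, ABSOLUTE)); // 1
        System.out.println("absolute, after:  " + count(AFTER, ABSOLUTE));  // 0
        System.out.println("relative, before: " + count(BEFORE, RELATIVE)); // 1
        System.out.println("relative, after:  " + count(AFTER, RELATIVE));  // 1
    }
}
```

The absolute path stops matching as soon as one ancestor changes, while the ID-based locator keeps working.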

Types of locators in WebDriver

Enough talk about HTML. Let’s see how all this applies to Selenium. In Selenium, there are eight types of locators that you can use to find elements so that you can interact with them. These are ID, name, CSS selector, class name, tag name, link text, partial link text, and XPath.

Here are some examples of how each of them looks in code:

https://gist.github.com/ultimate-qa/1ebee33ff19ddd94dd55eb58d6672702

XPath

XPath is going to become your best friend when it comes to locating web elements, because elements won’t always have unique or static IDs and names. 

What XPath really does is it allows you to locate elements by traversing the HTML as opposed to using an ID or a class name. You can traverse the HTML and navigate from your current location to where you want to get to. 

XPath has several important expressions that you definitely should know:

  • the forward slash (/): used to select from the root node.
  • the double forward slash (//): used to select nodes anywhere in the document that match the selection, no matter where they are. It is one of the most commonly used XPath expressions.
  • the period (.): selects the current node. To be honest, this is one of the least used XPath expressions, and you won’t need it much. I almost never use it.
  • the double period (..): selects the parent of the current node. This one is pretty useful.
  • the @ symbol: allows you to select attributes.
  • the * symbol: one of my favorite XPath expressions, which allows you to match any element.

You can always check out this awesome cheat sheet, which explains all the XPath locators.

Let’s start with an example. Any web page starts with the html tag. If we use the html tag, we begin traversing our HTML at the very top element. To get to the next element down, we use a forward slash.

So to get to the first link, we would have this XPath, starting from the html tag and using the forward slash to move to each next node:

/html/body/link[1]

I can also use the double forward-slash, and navigate directly to the link tag, regardless of its parents. It would look like this:

//link
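If you want to see the difference between the two without opening a browser, here is a rough sketch using the JDK’s built-in XPath engine. The page and its three link elements are invented for the demo, and LinkPaths is just a name I picked.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class LinkPaths {
    // Hypothetical page: one link in the head, two in the body.
    static final String PAGE =
            "<html><head><link href=\"a.css\"/></head>"
            + "<body><link href=\"b.css\"/><link href=\"c.css\"/></body></html>";

    /** Returns the href of every element matched by the XPath. */
    static List<String> hrefs(String xpath) {
        try {
            Document page = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, page, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("href"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Walks html -> body and takes only the first link child of body.
        System.out.println(hrefs("/html/body/link[1]"));
        // Matches every link element, wherever it sits in the document.
        System.out.println(hrefs("//link"));
    }
}
```

The single slash only finds what sits on the exact path you spelled out, while the double slash finds all three links regardless of their parents.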

Let’s see how to use the @ symbol in XPath. Say we have an element with the tag button and the type submit. We can identify it using this XPath:

//button[@type="submit"]

Then, using the double dot, we can navigate to its parent (the form element):

//button[@type="submit"]/..
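Both expressions can be tried out with the JDK’s XPath engine. This is only a sketch: the login form below is a made-up stand-in for a real page, and I use single quotes inside the XPath so the Java string doesn’t need escaping.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AttributeAndParent {
    // Hypothetical form with a submit button and a reset button.
    static final String PAGE =
            "<html><body><form id=\"login\">"
            + "<input type=\"text\" name=\"user\"/>"
            + "<button type=\"submit\">Log in</button>"
            + "<button type=\"reset\">Clear</button>"
            + "</form></body></html>";

    /** Returns the tag name of the first match, or "(no match)" if nothing matched. */
    static String firstTag(String xpath) {
        try {
            Document page = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, page, XPathConstants.NODESET);
            return nodes.getLength() == 0 ? "(no match)" : nodes.item(0).getNodeName();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstTag("//button[@type='submit']"));    // button
        System.out.println(firstTag("//button[@type='submit']/..")); // form
    }
}
```

In a Selenium test, the same strings would go into By.xpath(...).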

Another very powerful expression in XPath is the contains() function. This allows us to locate elements based on certain text. Let’s take a button with the text “Xpath Button 1” as an example. We can use this XPath to locate it:

//*[contains(text(), "Xpath Button 1")]

We can also combine multiple conditions to locate an element, for example the text and an attribute:

//*[contains(text(), "Xpath Button 1")][@type="submit"]

I also used the asterisk wildcard, which means that XPath will not look for a specific element tag, but for any element that contains the provided text.
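Here’s a sketch of why the second condition can matter. On the made-up page below (the IDs b1, b2, and l1 are my own), a help link happens to contain the same text as the button, so the text-only XPath matches two elements and the extra @type condition narrows it down to one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ContainsTextDemo {
    // Hypothetical page: two buttons and a link whose text overlaps with button 1.
    static final String PAGE =
            "<html><body><form>"
            + "<button id=\"b1\" type=\"submit\">Xpath Button 1</button>"
            + "<button id=\"b2\" type=\"button\">Xpath Button 2</button>"
            + "<a id=\"l1\" href=\"#\">Xpath Button 1 help</a>"
            + "</form></body></html>";

    /** Returns the id attribute of every element matched by the XPath. */
    static List<String> ids(String xpath) {
        try {
            Document page = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, page, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("id"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Text only: matches the button b1 and the link l1.
        System.out.println(ids("//*[contains(text(), 'Xpath Button 1')]"));
        // Text plus attribute: only b1 is left.
        System.out.println(ids("//*[contains(text(), 'Xpath Button 1')][@type='submit']"));
    }
}
```

Notice that the wildcard is what lets the link sneak in, so when you know the tag, writing //button instead of //* is another way to tighten the locator.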

Let’s talk about a few more cool XPath expressions: parent, following sibling, and preceding sibling.

The parent of an element is the element directly above it, and siblings are elements on the same level, under the same parent. We already know that we can navigate to the parent element using “..”, but we can also use this expression:

//button/parent::form

You can do the same thing with following siblings or preceding siblings, using following-sibling:: and preceding-sibling:: in place of parent::.
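Here is a quick sketch of all three, again using the JDK’s XPath engine on an invented form. The IDs login, user-label, user, and go are assumptions for the demo.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathAxesDemo {
    // Hypothetical form: label, input, and button are siblings under the form.
    static final String PAGE =
            "<html><body><form id=\"login\">"
            + "<label id=\"user-label\">User</label>"
            + "<input id=\"user\" type=\"text\"/>"
            + "<button id=\"go\" type=\"submit\">Log in</button>"
            + "</form></body></html>";

    /** Returns the id of the first match, or "(no match)" if nothing matched. */
    static String firstId(String xpath) {
        try {
            Document page = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, page, XPathConstants.NODESET);
            return nodes.getLength() == 0
                    ? "(no match)"
                    : ((Element) nodes.item(0)).getAttribute("id");
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstId("//button/parent::form"));             // login
        System.out.println(firstId("//input/following-sibling::button")); // go
        System.out.println(firstId("//input/preceding-sibling::label"));  // user-label
    }
}
```

Sibling navigation comes in handy when the element you need has no good attributes of its own, but something next to it does.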

CSS Selectors

The other type of selector worth mentioning is the CSS selector. CSS selectors are very similar to XPath. They give us another very flexible way to find elements.

Here are some examples:

https://gist.github.com/ultimate-qa/39c13a38ff7590b9d17dc4009f9e66ee

There are debates about CSS versus XPath. For web automation, it doesn’t really matter. XPath is nice because it can traverse the entire DOM up and down, it gives you the capability to locate elements by text, and it’s more computer-readable. CSS is nice because it traverses the DOM from parent to child and from left to right among siblings. It’s more human-readable and is really good for mobile automation.

So what do I recommend for you? Use both of them.

Which locators are the best?

OK, so which locator should you use? It’s important to make sure that your code is as stable as possible, to make sure that when something changes in your application, you don’t have to update your locators. 

The most important locator to use whenever you can is an ID, because IDs will almost always be unique. I recommend that you just ask the developers to add an ID to their HTML, whenever an ID is not available.

Another locator you might want to use is the class. However, there are instances where class may not be unique, so make sure that when you are using classes, the class you are using is unique and your locator identifies the correct web element. You could also use a name if it is supplied for an element. 
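A simple way to check this is to count how many elements carry the class before you rely on it. Below is a rough sketch with the JDK’s XPath engine. The page, the class names btn and primary, and the helper countByClass are all made up. In a real Selenium test, the analogous check would be the size of driver.findElements(By.className(...)).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ClassUniquenessCheck {
    // Hypothetical page: both links share "btn", only one has "primary".
    static final String PAGE =
            "<html><body>"
            + "<a id=\"save\" class=\"btn primary\">Save</a>"
            + "<a id=\"cancel\" class=\"btn\">Cancel</a>"
            + "</body></html>";

    /** Counts elements whose class attribute contains the given class as a whole word. */
    static int countByClass(String className) {
        // Padding with spaces avoids matching "btn" inside a longer name like "btn-group".
        String xpath = "//*[contains(concat(' ', normalize-space(@class), ' '), ' "
                + className + " ')]";
        try {
            Document page = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, page, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("btn:     " + countByClass("btn"));     // 2, not unique
        System.out.println("primary: " + countByClass("primary")); // 1, safe to use
    }
}
```

If the count is anything other than one, pick a different class or combine it with another condition.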

Data attributes

There is one other way of locating elements that I want to tell you about, and most people actually don’t know about it: data attributes, meaning custom attributes we can add to our elements. They can have any name, for example something like “test-data”, although the HTML standard reserves the data- prefix (as in data-test) for custom attributes like these.

These are something the developer needs to add, which is really good, because whenever they modify these elements, they just need to leave the test attributes unchanged, since those are only used for test automation purposes.
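As a sketch of what this could look like, assume the developers added a data-test attribute to each element. The attribute name, its values, and the page below are my own assumptions. The lookup uses the JDK’s XPath engine. In Selenium, the same element could be found with By.cssSelector("[data-test='login-button']") or with By.xpath and the expression shown here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DataAttributeLocator {
    // Hypothetical page: generated IDs and classes may change, data-test stays put.
    static final String PAGE =
            "<html><body><form>"
            + "<input id=\"input_4821\" data-test=\"username\"/>"
            + "<button class=\"x9f btn\" data-test=\"login-button\">Log in</button>"
            + "</form></body></html>";

    /** Returns the tag name of the single element with this data-test value. */
    static String tagFor(String dataTestValue) {
        String xpath = "//*[@data-test='" + dataTestValue + "']";
        try {
            Document page = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, page, XPathConstants.NODESET);
            if (nodes.getLength() != 1) {
                throw new IllegalStateException(
                        xpath + " matched " + nodes.getLength() + " elements, expected 1");
            }
            return nodes.item(0).getNodeName();
        } catch (IllegalStateException e) {
            throw e; // keep the uniqueness error as it is
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tagFor("login-button")); // button
        System.out.println(tagFor("username"));     // input
    }
}
```

The locator never mentions the ID or the class, so the developers are free to change both without breaking the test.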

TL;DR

In this article, we learned how to identify web elements using different techniques. This is extremely important in test automation, since to interact with elements, we must first identify them.

The most important things to remember are: find unique locators, use IDs whenever they are available, and avoid absolute XPath and CSS.

Hope you’re ready for the next part of the series! See you soon!