The impressive features of Selenium – Part 2

Chien Nguyen

1 – What’s New in Selenium 4

In the previous part, we covered two Selenium 4 features: relative locators, a location strategy based on the proximity of other web elements, and the Chrome DevTools Protocol (CDP), which gives native access to DevTools in Chromium-based browsers.

In this part, we continue with more new features: capturing full-page screenshots, blocking URLs, emulating geolocation (mocking geolocation coordinates), opening new tabs and windows (an improved way to navigate between windows and tabs), and WebDriver BiDi (bidirectional communication between driver and browser).

1.1 Full-page screenshots

Another use of the CDP is taking a screenshot of a full page (i.e., capturing page content beyond the viewport). This feature is available in browsers with a full implementation of the CDP, such as Chrome or Edge. Firefox supports the same capability through the method getFullPageScreenshotAs(), available in FirefoxDriver objects.

The test below demonstrates this feature in Chrome.

    @Test
    public void testTakeScreenshotFullpageUsingCDP_2() throws IOException, InterruptedException {
        DevTools devTools = ((ChromeDriver) driver).getDevTools();
        devTools.createSession();  
        Wait<WebDriver> wait = new FluentWait<WebDriver>(driver)
                .withTimeout(Duration.ofSeconds(20))
                .pollingEvery(Duration.ofSeconds(1))
                .ignoring(NoSuchElementException.class)
                .ignoring(ElementNotInteractableException.class);
        driver.get("https://ultimateqa.com/fake-landing-page");

        // Wait until the main content is loaded.
        wait.until(ExpectedConditions.visibilityOfElementLocated(By.cssSelector("#main-content")));

        // Scroll down the page 100 pixels at a time, pausing 300 milliseconds after each scroll,
        // until the end of the page (scrollDown is a helper of the test class, not shown here).
        scrollDown(100, 300);

        // Get the page layout metrics (to calculate the page dimensions).
        GetLayoutMetricsResponse metrics = devTools.send(Page.getLayoutMetrics());
        Rect rect = metrics.getContentSize();

        // Send the CDP command to take a screenshot beyond the page viewport.
        // The result is the screenshot as a Base64 string.
        String screenshotBase64 = devTools.send(Page.captureScreenshot(
                Optional.empty(), Optional.empty(),
                Optional.of(new Viewport(0, 0, rect.getWidth(), rect.getHeight(), 1)),
                Optional.empty(), Optional.of(true), Optional.empty()));

        // Decode the Base64 string and write it to a PNG file.
        Path destination = Paths.get("fullpage-screenshot.png");
        Files.write(destination, Base64.getDecoder().decode(screenshotBase64));

        assertThat(destination).exists();
        devTools.close();
    }

Result: The screenshot captures the full content of the web page.
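
The test above calls scrollDown(100, 300) without showing its body. As a minimal sketch of what such a helper might do, the code below steps through a page in fixed increments until the accumulated offset reaches the page height. The class name ScrollSketch and the two hooks (pageHeight, scrollBy) are hypothetical and introduced only for illustration; in a real test they would presumably wrap JavascriptExecutor calls such as executeScript("return document.body.scrollHeight") and executeScript("window.scrollBy(0, arguments[0])", px). A simulated 450-pixel page stands in for the browser so the sketch runs on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the scrollDown(step, pause) helper used in the test. */
final class ScrollSketch {

    /**
     * Scrolls in fixed steps until the accumulated offset reaches the page height.
     * The height is re-read on every iteration because lazy-loaded content can make the page grow.
     * Returns the number of scroll steps performed.
     */
    static int scrollDown(LongSupplier pageHeight, LongConsumer scrollBy, long stepPx, long pauseMs) {
        if (stepPx <= 0) {
            throw new IllegalArgumentException("stepPx must be positive");
        }
        long offset = 0;
        int steps = 0;
        while (offset < pageHeight.getAsLong()) {
            scrollBy.accept(stepPx);
            offset += stepPx;
            steps++;
            try {
                Thread.sleep(pauseMs); // give the page time to render new content
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag and stop scrolling
                break;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // Simulated page of 450 pixels; each scroll request is only recorded.
        List<Long> requests = new ArrayList<>();
        int steps = scrollDown(() -> 450L, px -> requests.add(px), 100, 0);
        System.out.println(steps + " scroll steps, requests = " + requests);
    }
}
```

Passing the browser access in as functions keeps the stepping logic testable without a driver; whether the original helper is structured this way is not known.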


1.2 Block URLs

CDP provides the ability to block given URLs in a session. The test below blocks the URL of the logo on the practice web page.

    @Test
    void testBlockUrl() throws InterruptedException {
        DevTools devTools = ((ChromeDriver) driver).getDevTools();
        devTools.createSession();  
        devTools.send(Network.enable(Optional.empty(), Optional.empty(),Optional.empty()));

        String urlToBlock = "https://bonigarcia.dev/selenium-webdriver-java/img/hands-on-icon.png";
        devTools.send(Network.setBlockedURLs(ImmutableList.of(urlToBlock)));

        // Create a listener to trace failed loading events. Not every failure carries a
        // blocked reason, so the event is handled only when the reason is present.
        devTools.addListener(Network.loadingFailed(), loadingFailed ->
            loadingFailed.getBlockedReason().ifPresent(reason -> {
                log.debug("Blocking reason: {}", reason);
                assertThat(reason).isEqualTo(BlockedReason.INSPECTOR);
            }));

        driver.get("https://bonigarcia.dev/selenium-webdriver-java/");  
        Thread.sleep(7000);
        assertThat(driver.getTitle()).contains("Selenium WebDriver");
    }

Result: Inspect the browser during the execution, and you will discover that this logo is not displayed on the page.


1.3 Emulating Geolocation

Selenium 4 introduced support for the Chrome DevTools Protocol (CDP), enabling us to emulate geolocation in the browser. With CDP, we can override geolocation data, setting specific latitude, longitude, and accuracy values.

Test setting custom geolocation coordinates:

    @Test
    void testEmulateLocation() throws InterruptedException {
        // Custom location, in this case, the approximate coordinates of Mount Everest
        Map<String, Object> coordinates = new HashMap<>();
        coordinates.put("latitude", 27.9881);
        coordinates.put("longitude", 86.9250);
        coordinates.put("accuracy", 8850);
        ((ChromiumDriver) driver).executeCdpCommand("Emulation.setGeolocationOverride", coordinates);

        //Open a practice page where the geolocation coordinates are displayed to the end user  
        driver.manage().window().maximize();
        driver.get("https://bonigarcia.dev/selenium-webdriver-java/geolocation.html");
        driver.findElement(By.id("get-coordinates")).click();

        WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(5));
        WebElement coordinatesElement = driver.findElement(By.id("coordinates"));

        //Assert the coordinates are visible on the page.
        wait.until(ExpectedConditions.visibilityOf(coordinatesElement));  
        Thread.sleep(7000);

    }

Result: The coordinates are visible on the page
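
Since the override is just a map of latitude, longitude, and accuracy, it can help to build and validate that map in one place before handing it to executeCdpCommand("Emulation.setGeolocationOverride", ...). The sketch below is one possible way to do that. The class GeoOverrideSketch and its params method are hypothetical names, the range checks reflect the usual latitude and longitude bounds rather than anything the CDP documentation is quoted as requiring, and the coordinates in main are arbitrary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that builds the parameter map for a geolocation override. */
final class GeoOverrideSketch {

    static Map<String, Object> params(double latitude, double longitude, double accuracyMeters) {
        // The negated form also rejects NaN values.
        if (!(latitude >= -90 && latitude <= 90)) {
            throw new IllegalArgumentException("latitude must be within [-90, 90]: " + latitude);
        }
        if (!(longitude >= -180 && longitude <= 180)) {
            throw new IllegalArgumentException("longitude must be within [-180, 180]: " + longitude);
        }
        if (!(accuracyMeters >= 0)) {
            throw new IllegalArgumentException("accuracy must not be negative: " + accuracyMeters);
        }
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("latitude", latitude);
        params.put("longitude", longitude);
        params.put("accuracy", accuracyMeters);
        return params;
    }

    public static void main(String[] args) {
        // Arbitrary example values; in a test the map would be passed to executeCdpCommand.
        System.out.println(params(10.0, 20.0, 5.0));
    }
}
```

Failing fast on an out-of-range value makes a typo in the coordinates show up in the test code instead of as an unexpected position in the browser.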

1.4 Tabs and windows

Selenium 4 can create a new window or tab and focus on it right away, so you don't need to switch to it before working with it. If other windows or tabs are already open, you can loop over all the window handles that WebDriver can see and switch to the one that is not the original.
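
The loop over window handles mentioned above can be sketched without a browser, because a handle is just a string. In the example below, WindowHandleSketch and firstOtherHandle are hypothetical names; in a real test the arguments would come from driver.getWindowHandles() and driver.getWindowHandle(), and the result would be passed to driver.switchTo().window(...).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper that picks a window handle other than the original one. */
final class WindowHandleSketch {

    /** Returns the first handle that differs from the original, if there is one. */
    static Optional<String> firstOtherHandle(Set<String> handles, String original) {
        for (String handle : handles) {
            if (!handle.equals(original)) {
                return Optional.of(handle);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up handle values standing in for what a driver would return.
        Set<String> handles = new LinkedHashSet<>(Arrays.asList("handle-A", "handle-B"));
        System.out.println(firstOtherHandle(handles, "handle-A"));
    }
}
```

Returning an Optional makes the case where only the original window is open explicit, so the caller decides what to do when there is nothing to switch to.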

Test opening a new tab:

    @Test
    void testNewTab() throws InterruptedException {
        driver.manage().window().maximize();
        driver.get("https://bonigarcia.dev/selenium-webdriver-java/");  
        
        Thread.sleep(3000);
        String initHandle = driver.getWindowHandle();
        driver.switchTo().newWindow(WindowType.TAB);

        driver.get("https://bonigarcia.dev/selenium-webdriver-java/web-form.html");
        assertThat(driver.getWindowHandles().size()).isEqualTo(2);

        Thread.sleep(3000);
        driver.switchTo().window(initHandle);  

        driver.close();  
        Thread.sleep(5000);

        assertThat(driver.getWindowHandles().size()).isEqualTo(1);  
    }

Result: A new tab opens and receives the focus. The test then closes only the original window, and the second tab remains open.


1.5 WebDriver BiDi

WebDriver BiDi, the future of browser automation! It’s a new standard browser automation protocol currently under development, aiming to combine the best of both WebDriver “Classic” and CDP. WebDriver BiDi promises bi-directional communication, making it fast by default, and it comes packed with low-level control.

WebDriver BiDi lets you write tests using any of your favorite tools and automate them in any browser or driver, giving you full flexibility.

In Selenium WebDriver, the aim is that BiDi will be a standardized replacement in the long run for advanced operations currently supported by CDP. For example, the Selenium WebDriver API supports implementing event listeners through the HasLogEvents interface. This interface works on top of CDP.

HasLogEvents allows implementing listeners for the following events:

domMutation: To capture events about changes in the DOM.
consoleEvent: To capture events about changes in the browser console, such as JavaScript traces.

Test implementing a listener for console events:

    @Test
    void testConsoleEvents() throws InterruptedException {

        HasLogEvents logger = (HasLogEvents) driver;

        // Create a listener for console events. This test expects to capture four events synchronized using a countdown latch.
        CountDownLatch latch = new CountDownLatch(4);

        logger.onLogEvent(CdpEventTypes.consoleEvent(consoleEvent -> {
            log.debug("{} {}: {}", consoleEvent.getTimestamp(), consoleEvent.getType(), consoleEvent.getMessages());
            latch.countDown();
        }));

        // Open the practice web page, which logs several messages in the JavaScript console.
        driver.get("https://ultimateqa.com/blog/");
        assertThat(latch.await(20, TimeUnit.SECONDS)).isTrue();
    }

Result: The test captures four console events, synchronized using a countdown latch.
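
The countdown latch is what lets the test thread wait for events that arrive on another thread. The sketch below isolates that pattern with plain Java: a background thread plays the role of the browser and delivers messages to a listener, while the calling thread waits on the latch. LatchSketch, awaitEvents, and the simulated messages are hypothetical and only illustrate the synchronization, not the Selenium API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical demo of waiting for a fixed number of asynchronous events. */
final class LatchSketch {

    /** Returns true if at least `expected` events were delivered within the timeout. */
    static boolean awaitEvents(int expected, List<String> simulatedMessages, long timeoutMs) {
        CountDownLatch latch = new CountDownLatch(expected);

        // The listener runs on the producer thread, like a console event callback would.
        Consumer<String> listener = message -> latch.countDown();

        Thread producer = new Thread(() -> simulatedMessages.forEach(listener));
        producer.start();

        try {
            // The calling thread blocks here until the count reaches zero or the timeout expires.
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> messages = Arrays.asList("log", "info", "warn", "error");
        System.out.println("all events seen: " + awaitEvents(4, messages, 2000));
    }
}
```

If fewer events arrive than expected, await returns false after the timeout, which is why the test in this section asserts on the return value of latch.await.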

2 – Conclusion

We have learned about the key features of Selenium 4. Selenium WebDriver provides the HasDevTools interface for sending CDP commands to the browser. This mechanism is quite powerful since it provides direct access to the CDP from Selenium WebDriver. Nevertheless, it has a significant limitation: it is tied to both the browser type and version.

The Selenium WebDriver API provides a second way to use the CDP, based on a set of wrapper classes built on top of CDP for advanced manipulation of the browsers such as network traffic interception or basic and digest authentication. The same code should work when the functionality is re-implemented with WebDriver-BiDi.

References:

The Selenium Browser Automation Project | Selenium
Hands-On Selenium WebDriver with Java
WebDriver BiDi – The future of cross-browser automation

Chien Nguyen

I am an automation test engineer with over 3 years of experience in the software testing field across various platforms. I have extensive experience with Groovy, JavaScript, Java, and Selenium.
