BiDirectional means that communication is happening in two directions simultaneously.
The traditional WebDriver model uses strict request/response commands, which allow communication in only
one direction at any given time. In most cases this is what you want, because it ensures that the browser is
doing the expected things in the right order, but there are a number of interesting things that can be done with
asynchronous interactions.
This functionality is currently available in a limited fashion with the Chrome DevTools Protocol (CDP),
but to address some of its drawbacks, the Selenium team, along with the major
browser vendors, has worked to create the new WebDriver BiDi Protocol.
This specification aims to create a stable, cross-browser API that leverages bidirectional
communication for enhanced browser automation and testing functionality,
including streaming events from the user agent to the controlling software via WebSockets.
Users will be able to listen for and record or manipulate events as they happen during the course of a Selenium session.
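To make the two message flows concrete, here is a minimal sketch in plain Python (no Selenium involved) of how a client might sort the messages it receives over the WebSocket. It assumes the message shapes used by the WebDriver BiDi protocol, where a command response carries the `id` of the command it answers and an event carries `"type": "event"` with no `id`. The `route_message` helper and the sample payloads are hypothetical, and error responses are deliberately ignored.

```python
import json


def route_message(raw, pending, listeners):
    """Dispatch one incoming message to a waiting command or to event listeners.

    pending:   dict mapping command id -> list that collects the response
    listeners: dict mapping event name -> list of callbacks
    """
    message = json.loads(raw)
    if message.get("type") == "event":
        # Events arrive unprompted, so there is no command id to match.
        for callback in listeners.get(message["method"], []):
            callback(message["params"])
        return "event"
    # Anything else answers a command the client sent earlier.
    pending.pop(message["id"]).append(message)
    return "response"


# Hypothetical traffic: one command response and one unsolicited log event.
response_slot = []
pending = {1: response_slot}
seen_logs = []
listeners = {"log.entryAdded": [lambda params: seen_logs.append(params["text"])]}

kind_a = route_message('{"type": "success", "id": 1, "result": {}}', pending, listeners)
kind_b = route_message(
    '{"type": "event", "method": "log.entryAdded", "params": {"text": "hello"}}',
    pending,
    listeners,
)
```

In the classic model only the first kind of message exists; the second kind is what bidirectional communication adds.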
Enabling BiDi in Selenium
To use WebDriver BiDi, set the following capability in the browser options:
Java
options.setCapability("webSocketUrl", true);
Python
options.enable_bidi = True
CSharp
UseWebSocketUrl = true,
Ruby
options.web_socket_url = true
JavaScript
Options().enableBidi();
Kotlin
options.setCapability("webSocketUrl", true);
This enables the WebSocket connection for bidirectional communication,
unlocking the full potential of the WebDriver BiDi protocol.
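As a rough illustration of what this setting amounts to, the sketch below builds a new-session request body that asks for the `webSocketUrl` capability. It is plain Python and starts no browser; `new_session_payload` is a hypothetical helper and `chrome` is only an example browser name.

```python
import json


def new_session_payload(browser_name, enable_bidi=True):
    """Build a WebDriver new-session body that optionally asks for BiDi."""
    capabilities = {"browserName": browser_name}
    if enable_bidi:
        # The client requests the capability with `true`; a BiDi-capable
        # driver is expected to answer with the actual WebSocket URL.
        capabilities["webSocketUrl"] = True
    return {"capabilities": {"alwaysMatch": capabilities}}


payload = new_session_payload("chrome")
print(json.dumps(payload, indent=2))
```

In normal use the language bindings build this request for you, so the option shown above is all you need to set.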
Note that Selenium is updating its entire implementation from WebDriver Classic to WebDriver BiDi (while
maintaining backwards compatibility as much as possible), but this section of documentation focuses on the new
functionality that bidirectional communication allows.
The low-level BiDi domains will be accessible in the code to the end user, but the goal is to provide
high-level APIs that are straightforward methods of real-world use cases. As such, the low-level
components will not be documented, and this section will focus only on the user-friendly
features that we encourage users to take advantage of.
If there is additional functionality you’d like to see, please raise a
feature request.
1 - WebDriver BiDi Logging Features
These features are related to logging. Because “logging” can refer to so many different things, these methods are made available via a “script” namespace.
Remember that to use WebDriver BiDi, you must enable it in Options.
For more details, see Enabling BiDi
2 - WebDriver BiDi Network Features
These features are related to networking, and are made available via a “network” namespace.
The implementation of these features is being tracked here: #13993
Authentication Handlers
Request Handlers
Response Handlers
3 - WebDriver BiDi Script Features
These features are related to scripts, and are made available via a “script” namespace.
The implementation of these features is being tracked here: #13992
Script Pinning
Execute Script
DOM Mutation Handlers
4 - Chrome DevTools Protocol
Examples of working with Chrome DevTools Protocol in Selenium. CDP support is temporary until WebDriver BiDi has been implemented.
Many browsers provide “DevTools” – a set of tools that are integrated with the browser that
developers can use to debug web apps and explore the performance of their pages. Google Chrome’s
DevTools make use of a protocol called the Chrome DevTools Protocol (or “CDP” for short).
As the name suggests, this is not designed for testing, nor to have a stable API, so functionality
is highly dependent on the version of the browser.
Selenium is working to implement a standards-based, cross-browser, stable alternative to CDP called
WebDriver BiDi. Until support for this new protocol is complete, Selenium plans to provide access
to CDP features where applicable.
Using Chrome DevTools Protocol with Selenium
Chrome and Edge have a method to send basic CDP commands.
This does not work for features that require bidirectional communication, and you need to know which domains to
enable and when, as well as the exact names and types of the domains, methods, and parameters.
var cookie = new Dictionary<string, object>
{
    { "name", "cheese" },
    { "value", "gouda" },
    { "domain", "www.selenium.dev" },
    { "secure", true }
};
((ChromeDriver)driver).ExecuteCdpCommand("Network.setCookie", cookie);
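For comparison, here is a minimal Python sketch of the same call. It assumes a Chromium-based driver that exposes `execute_cdp_cmd`, as Selenium's Python Chrome and Edge drivers do. `set_cheese_cookie` is a hypothetical helper name, and the driver is passed in so that the snippet does not start a browser by itself.

```python
def set_cheese_cookie(driver):
    """Set a cookie through the raw CDP command Network.setCookie."""
    cookie = {
        "name": "cheese",
        "value": "gouda",
        "domain": "www.selenium.dev",
        "secure": True,
    }
    # The command name and parameter names must match the CDP domain exactly;
    # nothing checks them before they reach the browser.
    return driver.execute_cdp_cmd("Network.setCookie", cookie)
```

With a real session this would be called as `set_cheese_cookie(driver)` after creating a Chrome or Edge driver.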
To make working with CDP easier, and to provide access to the more advanced features, Selenium bindings
automatically generate classes and methods for the most common domains.
CDP methods and implementations can change from version to version, though, so you want to keep the
version of Chrome and the version of DevTools matching. Selenium supports the 3 most
recent versions of Chrome at any given time,
and tries to time releases to ensure that access to the latest versions is available.
This limitation provides additional challenges for several bindings, where dynamically
generated CDP support requires users to regularly update their code to reference the proper version of CDP.
In some cases an idealized implementation has been created that should work for any version of CDP without the
user needing to change their code, but that is not always available.
Examples of how to use CDP in your Selenium tests can be found on the following pages, but
we want to call out a couple of commonly cited examples that are of limited practical value.
Geo Location — almost all sites use the IP address to determine physical location,
so setting an emulated geolocation rarely has the desired effect.
Overriding Device Metrics — Chrome provides a great API for setting Mobile Emulation
in the Options classes, which is generally superior to attempting to do this with CDP.
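As a sketch of the Options-based alternative mentioned above, the helper below sets Chrome's `mobileEmulation` option through `add_experimental_option`, which Selenium's Python Chrome options provide. The helper name `enable_mobile_emulation` and the metric values are illustrative assumptions, not recommended settings.

```python
def enable_mobile_emulation(options, width=360, height=640, pixel_ratio=3.0):
    """Configure Chrome's Mobile Emulation on an options object."""
    mobile_emulation = {
        "deviceMetrics": {
            "width": width,
            "height": height,
            "pixelRatio": pixel_ratio,
        },
    }
    # Applied when the session starts, so no CDP call is needed afterwards.
    options.add_experimental_option("mobileEmulation", mobile_emulation)
    return options
```

With Selenium installed, the `options` argument would typically be a `webdriver.ChromeOptions()` instance that is then passed to the driver constructor.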
4.1 - Chrome DevTools Logging Features
Logging features using CDP.
While Selenium 4 provides direct access to the Chrome DevTools Protocol, these
methods will eventually be removed when WebDriver BiDi is implemented.
Basic authentication
Some applications make use of browser authentication to secure pages.
It used to be common to pass the credentials in the URL, but browsers have stopped supporting this.
With this code you can insert the credentials into the header when necessary:
Predicate<URI> uriPredicate = uri -> uri.toString().contains("herokuapp.com");
Supplier<Credentials> authentication = UsernameAndPassword.of("admin", "admin");
((HasAuthentication) driver).register(uriPredicate, authentication);
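For readers unfamiliar with what "credentials in the header" means, the sketch below shows how a standard HTTP Basic `Authorization` header is formed. It illustrates the HTTP mechanism only and is not a description of how Selenium applies the registered credentials internally; `basic_auth_header` is a hypothetical helper.

```python
import base64


def basic_auth_header(username, password):
    """Return the HTTP Basic Authorization header for the given credentials."""
    # Basic auth is the Base64 encoding of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


header = basic_auth_header("admin", "admin")
```

Because Base64 is an encoding and not encryption, this scheme should only be used over HTTPS.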
var cookieCommandSettings = new SetCookieCommandSettings
{
    Name = "cheese",
    Value = "gouda",
    Domain = "www.selenium.dev",
    Secure = true
};
await domains.Network.SetCookie(cookieCommandSettings);
The following list of APIs will grow as the WebDriver BiDirectional Protocol grows
and browser vendors implement it.
Additionally, Selenium will try to support real-world use cases that internally use a combination of W3C BiDi protocol APIs.
If there is additional functionality you’d like to see, please raise a
feature request.
5.1 - Browsing Context
Commands
This section contains the APIs related to browsing context commands.
A reference browsing context is a top-level browsing context.
The API lets you pass a reference browsing context, which is used to create a new window. The implementation is operating-system specific.
In the same way, the API lets you pass a reference browsing context that is used to create a new tab. The implementation is again operating-system specific.
Getting the browsing context tree provides all browsing contexts descending from the parent browsing context, including the parent browsing context itself, up to the depth value passed.
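To illustrate what "up to the depth value passed" means, here is a small sketch in plain Python that walks a tree shaped like a `browsingContext.getTree` result, where each entry has a `context` id and a list of `children`. The helper `list_contexts` and the sample ids are hypothetical; a real tree comes from the browser.

```python
def list_contexts(contexts, max_depth=None, depth=0):
    """Flatten a browsing-context tree into (depth, context id) pairs."""
    found = []
    for info in contexts:
        found.append((depth, info["context"]))
        children = info.get("children") or []
        # Descend only while the requested depth has not been reached.
        if max_depth is None or depth < max_depth:
            found.extend(list_contexts(children, max_depth, depth + 1))
    return found


# Hypothetical tree: one tab containing an iframe that contains another iframe.
tree = [
    {"context": "tab-1", "children": [
        {"context": "frame-1", "children": [
            {"context": "frame-2", "children": []},
        ]},
    ]},
]

only_parent = list_contexts(tree, max_depth=0)
one_level = list_contexts(tree, max_depth=1)
```

A depth of 0 yields just the parent context, and each additional level includes one more generation of nested frames.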
const id = await driver.getWindowHandle()
const window1 = await BrowsingContext(driver, {
  browsingContextId: id,
})
await BrowsingContext(driver, { type: 'window' })
const res = await window1.getTopLevelContexts()

const result = await browsingContext.printPage({
  orientation: 'landscape',
  scale: 1,
  background: true,
  width: 30,
  height: 30,
  top: 1,
  bottom: 1,
  left: 1,
  right: 1,
  shrinkToFit: true,
  pageRanges: ['1-2'],
})
// Continue the authentication challenge without supplying credentials.
try (Network network = new Network(driver)) {
  network.addIntercept(new AddInterceptParameters(InterceptPhase.AUTH_REQUIRED));
  network.onAuthRequired(
      responseDetails ->
          // Does not handle the alert
          network.continueWithAuthNoCredentials(responseDetails.getRequest().getRequestId()));
  driver.get("https://the-internet.herokuapp.com/basic_auth");
}

// Cancel the authentication challenge.
try (Network network = new Network(driver)) {
  network.addIntercept(new AddInterceptParameters(InterceptPhase.AUTH_REQUIRED));
  network.onAuthRequired(
      responseDetails ->
          // Does not handle the alert
          network.cancelAuth(responseDetails.getRequest().getRequestId()));
  driver.get("https://the-internet.herokuapp.com/basic_auth");
}
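The two handlers above differ only in how the intercepted challenge is answered. The sketch below builds the parameters of a BiDi `network.continueWithAuth` command in plain Python to show the choices the protocol offers. Mapping the Java method `continueWithAuthNoCredentials` to the `default` action and `cancelAuth` to `cancel` is an assumption here, and `continue_with_auth` and the request id are hypothetical.

```python
def continue_with_auth(request_id, action="default", username=None, password=None):
    """Build the params of a network.continueWithAuth command."""
    if action not in ("default", "cancel", "provideCredentials"):
        raise ValueError(f"unsupported action: {action}")
    params = {"request": request_id, "action": action}
    if action == "provideCredentials":
        # Credentials are only sent when the client chooses to answer
        # the challenge itself.
        params["credentials"] = {
            "type": "password",
            "username": username,
            "password": password,
        }
    return params


no_credentials = continue_with_auth("request-7")
cancelled = continue_with_auth("request-7", action="cancel")
answered = continue_with_auth("request-7", "provideCredentials", "admin", "admin")
```

The high-level Selenium methods exist so that users do not have to assemble these parameters by hand.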