What is Selenium? Architecture & Components

In today’s fast‑paced digital world, websites and web applications are updated frequently. Manually testing these changes is time‑consuming, error‑prone, and simply not scalable. This is where Selenium comes in.

Selenium is a widely used open-source framework for automating web browsers. It lets testers and developers simulate real user interactions, such as clicking buttons, filling in forms, and navigating pages, and verify that web applications behave as expected.

However, code written in a programming language cannot drive a browser directly; the two have no built-in way to communicate. To make automation possible, Selenium uses a layered architecture that acts as a translator and messenger between them.

At its core, Selenium WebDriver is an API (Application Programming Interface) that acts as a bridge between your code and the browser. However, the way this bridge is built changed significantly between Selenium 3 and Selenium 4.

For many years, Selenium 3 was the standard. Its architecture relied heavily on a “middleman” to ensure communication between your code and the browser.

In Selenium 3, client code written in languages such as Java or Python and browser drivers such as ChromeDriver or GeckoDriver did not share a formally standardized protocol.

  • The Problem: Each language binding exposes its own method calls, while a browser driver only understands HTTP requests in a specific format.
  • The Solution: Selenium defined the JSON Wire Protocol, a REST-style HTTP API with JSON payloads, which the client libraries used to turn each command into a request the driver could accept.

To make this work, four distinct components interacted with each other:

  • Selenium Client Libraries (Language Bindings): This is the layer where you work. Selenium provides libraries for languages like Java, Ruby, Python, and C#. These libraries define the commands (e.g., driver.get()) that you use in your script.
  • JSON Wire Protocol (The Middleman): This is the central piece of Selenium 3. It is not a program but a specification, defined by the Selenium project, for a REST-style API over HTTP. The client library encodes each command as JSON (marshalling) and the driver decodes it (unmarshalling). Because it was never a formal web standard, drivers implemented it with small differences.
  • Browser Drivers: These are standalone executables maintained by the browser vendors (e.g., chromedriver.exe for Google Chrome). In Selenium 3, each driver ran as a local HTTP server that received the encoded JSON requests and controlled its browser.
  • Real Browsers: The actual application (Chrome, Firefox, Safari) where the automation is executed.

In Selenium 3, the data flow was indirect, often described as the “Telephone Game”:

  1. Request: You write a command in your code.
  2. Marshalling (Encoding): The Client Library converts this command into JSON format.
  3. Transmission: The JSON is sent over HTTP to the Browser Driver.
  4. Unmarshalling (Decoding): The Browser Driver receives the JSON, decodes it, and translates it into a native browser command.
  5. Execution: The browser performs the action.
  6. Response: The process is repeated in reverse to send the result back to your code.
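
Steps 1 to 3 can be sketched in plain Java with no Selenium dependency. This is a minimal illustration under stated assumptions, not Selenium's actual implementation: the class WireSketch, the HttpCommand holder, and the session id abc123 are hypothetical names chosen for this example. The endpoint POST /session/{session id}/url and the {"url": ...} body are the ones the protocol defines for navigation.

```java
class WireSketch {

    /** Hypothetical holder for the three parts of the HTTP request. */
    static final class HttpCommand {
        final String method;
        final String path;
        final String jsonBody;

        HttpCommand(String method, String path, String jsonBody) {
            this.method = method;
            this.path = path;
            this.jsonBody = jsonBody;
        }
    }

    /** Wraps a raw string in quotes, escaping characters JSON does not allow unescaped. */
    static String jsonString(String raw) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : raw.toCharArray()) {
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);          // escape quote and backslash
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c)); // control characters
            } else {
                out.append(c);
            }
        }
        return out.append('"').toString();
    }

    /** Marshals driver.get(url) into the request for the navigation endpoint. */
    static HttpCommand navigateTo(String sessionId, String url) {
        String path = "/session/" + sessionId + "/url";
        String body = "{\"url\":" + jsonString(url) + "}";
        return new HttpCommand("POST", path, body);
    }

    public static void main(String[] args) {
        HttpCommand cmd = navigateTo("abc123", "https://www.google.com");
        System.out.println(cmd.method + " " + cmd.path);
        System.out.println(cmd.jsonBody);
    }
}
```

The driver performs the reverse step: it parses the JSON body, reads the url field, and asks the browser to load that page.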

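The Downside: Because the JSON Wire Protocol was not a formal standard, client libraries and drivers sometimes had to translate between slightly different dialects. This added some overhead and was one source of inconsistent behavior across browsers, including “flaky” tests (intermittent failures).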

Figure: Selenium 3 architecture

Selenium 4 made a fundamental change by adopting the W3C (World Wide Web Consortium) WebDriver standard. This is the architecture in use today.

In Selenium 4, the “middleman” (JSON Wire Protocol) was removed entirely.

  • The Upgrade: Both the Selenium Client Libraries and the Browser Drivers now adhere to the same W3C standards.
  • The Result: They speak the same language natively. Commands still travel as JSON over HTTP, but no translation between protocol dialects is needed.

While the names of the components look similar, their behavior has changed:

  • Selenium Client Libraries: Updated to communicate directly using the W3C standard. They no longer package data into the old JSON Wire format.
  • The W3C WebDriver Protocol: Instead of a separate “translation layer,” this is now the common language. It is not a physical component but a set of rules that both the Client and the Driver strictly follow. This ensures that a command sent is exactly what the Driver expects to receive.
  • Browser Drivers: These are still the server executables (like ChromeDriver), but they now accept commands directly without needing to decode a Selenium-specific protocol.
  • Real Browsers: The major browser vendors ship drivers that implement the W3C WebDriver standard, which makes command execution more consistent across browsers.
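
One place where the two protocols visibly differ is the request that creates a session. The sketch below builds both payloads as plain strings so they can be compared side by side. The class and method names are hypothetical; the JSON keys (desiredCapabilities for the legacy protocol, capabilities with alwaysMatch for W3C) and the two element-reference keys come from the respective specifications.

```java
class SessionPayloads {

    // Key under which a located element's id is returned in each protocol.
    static final String LEGACY_ELEMENT_KEY = "ELEMENT";
    static final String W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf";

    /** Legacy JSON Wire Protocol: capabilities go under "desiredCapabilities". */
    static String legacyNewSession(String browserName) {
        // Assumes browserName is a simple identifier that needs no JSON escaping.
        return "{\"desiredCapabilities\":{\"browserName\":\"" + browserName + "\"}}";
    }

    /** W3C WebDriver: capabilities go under "capabilities" then "alwaysMatch". */
    static String w3cNewSession(String browserName) {
        return "{\"capabilities\":{\"alwaysMatch\":{\"browserName\":\"" + browserName + "\"}}}";
    }

    public static void main(String[] args) {
        System.out.println("Legacy: " + legacyNewSession("chrome"));
        System.out.println("W3C:    " + w3cNewSession("chrome"));
    }
}
```

Both payloads are JSON sent to POST /session; what changed is the agreed shape of the data, which is why a client and driver that follow the same specification need no conversion step.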

The flow in Selenium 4 is direct and streamlined:

  1. Request: You write a command in your code.
  2. Direct Transmission: The Client Library sends the command over HTTP in the format defined by the W3C WebDriver specification. The body is still JSON, but no conversion between protocol dialects takes place.
  3. Execution: The Browser Driver accepts the request (as it understands the language natively) and instructs the browser to act.
  4. Response: The driver returns the result in the W3C response format, and the Client Library hands it back to your code.

The Benefit: With a single standardized protocol, Selenium 4 behaves more consistently across browsers and is easier to debug than Selenium 3. Any speed gain from skipping dialect translation is usually modest.
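
One concrete aid to debugging is the error format. Under the W3C specification, a driver reports a failure with a non-2xx HTTP status and a JSON body whose value object contains an error code string such as no such element, whereas the legacy protocol used numeric status codes. The sketch below pulls that code out of a response. The class name is hypothetical, and the regular expression is used only to keep the example dependency-free; real code would use a JSON parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class W3cErrorSketch {

    // Matches the "error" member of the response body and captures its string value.
    private static final Pattern ERROR_FIELD =
            Pattern.compile("\"error\"\\s*:\\s*\"([^\"]*)\"");

    /** Returns the W3C error code string, or null when the response is a success. */
    static String errorCode(int httpStatus, String body) {
        if (httpStatus >= 200 && httpStatus < 300) {
            return null; // success responses carry the result under "value"
        }
        Matcher m = ERROR_FIELD.matcher(body);
        // Fall back to the generic W3C code when the body has no "error" member.
        return m.find() ? m.group(1) : "unknown error";
    }

    public static void main(String[] args) {
        String failure = "{\"value\":{\"error\":\"no such element\","
                + "\"message\":\"Unable to locate element\",\"stacktrace\":\"\"}}";
        System.out.println(errorCode(404, failure));
        System.out.println(errorCode(200, "{\"value\":null}"));
    }
}
```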

Figure: Selenium 4 architecture

To make the difference concrete, let’s trace one simple command through both architectures. Don’t worry if you don’t know what the command does; the point is only to show how information moves.

// Java code to open a website (assumes driver is an initialized WebDriver)
driver.get("https://www.google.com");

In Selenium 3 (JSON Wire Protocol):

  1. Trigger: The Java code executes the get command.
  2. Translation (Client Side): The client library creates a JSON payload: { "url": "https://www.google.com" }.
  3. Transmission: This JSON is sent over HTTP to the chromedriver server.
  4. Translation (Driver Side): chromedriver receives the request, decodes the JSON, and maps it from the JSON Wire format to its internal command.
  5. Action: The driver triggers the internal command to open the URL in Chrome.

In Selenium 4 (W3C WebDriver Protocol):

  1. Trigger: The Java code executes the get command.
  2. Handshake: The session was created using the W3C protocol, so the Client Library already knows chromedriver is W3C compliant.
  3. Transmission: The command is sent as a standard W3C HTTP request. The body is still JSON, but no wrapping/unwrapping for protocol translation is needed.
  4. Action: chromedriver recognizes the standard request and triggers the browser to open the URL.
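
To show what such a request looks like on the wire, the sketch below uses the JDK's java.net.http API (Java 11 or later) to build it without sending it. Several things here are assumptions made for illustration: a chromedriver listening on its default port at http://localhost:9515, a session id of abc123, and the class and method names. It is a sketch of the request shape, not how the Selenium client library is implemented.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class NavigateRequestSketch {

    /** Builds (but does not send) the navigation request for an existing session. */
    static HttpRequest navigate(String driverUrl, String sessionId, String pageUrl) {
        // Assumes pageUrl contains no characters that need JSON escaping.
        String body = "{\"url\":\"" + pageUrl + "\"}";
        return HttpRequest.newBuilder(URI.create(driverUrl + "/session/" + sessionId + "/url"))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request =
                navigate("http://localhost:9515", "abc123", "https://www.google.com");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Actually sending the request would require a running driver and a session created beforehand with POST /session, which is why the example stops at building the request object.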