Selenium WebDriver: Introduction (Part 1)
Selenium WebDriver is a standard way to automate browsers: you write code that opens a browser, finds elements, clicks, types, and asserts on what it sees. This post (Part 1) introduces WebDriver and outlines how to write your first script.
What is Selenium WebDriver?
WebDriver is an API that talks to a browser (Chrome, Firefox, Edge, etc.) via a driver (e.g. ChromeDriver). Your code sends commands (navigate, find element, click, get text); the browser executes them. You can use it from Java, Python, C#, JavaScript, and others.
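As a minimal sketch of that command flow in Python, assuming the selenium package (version 4) and a local Chrome install: the helper names open_and_read_title and title_in_chrome are illustrative, not part of Selenium. Recent Selenium 4 releases can usually locate a matching ChromeDriver themselves; older setups may need it on PATH.

```python
def open_and_read_title(driver, url):
    """Send two commands through the driver: navigate, then read the page title."""
    driver.get(url)      # command: navigate to the URL
    return driver.title  # command: read the current page title


def title_in_chrome(url):
    """Start Chrome, read the title of `url`, and always close the browser."""
    # Imported here so open_and_read_title can be tried without a browser installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # starts a Chrome session via ChromeDriver
    try:
        return open_and_read_title(driver, url)
    finally:
        driver.quit()  # ends the session and closes the browser
```

Calling title_in_chrome with a URL of your choice should open Chrome briefly and return that page's title. Swapping webdriver.Chrome for webdriver.Firefox targets Firefox instead, while the commands stay the same.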
Core concepts
- Driver: You create a driver instance for a browser (e.g. Chrome). It opens the browser and accepts commands.
- Find element: Locate an element by id, CSS selector, XPath, etc. (e.g. find_element(By.ID, "login-btn")).
- Actions: Click, send keys (type), clear, submit. You act on the element you found.
- Assertions: Get text, attribute, or state and assert in your test framework (JUnit, pytest, etc.).
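The concepts above might look like this in Python. This is a sketch under assumptions: the element ids "username" and "greeting" and the helper names are hypothetical, the driver is assumed to be already on a login page, and the greeting is assumed to appear without any waiting.

```python
def fill_and_click(driver, field_locator, button_locator, text):
    """Find a text field and a button, type into the field, then click the button."""
    field = driver.find_element(*field_locator)   # a locator is a (By, value) pair
    field.clear()                                 # action: clear existing text
    field.send_keys(text)                         # action: type
    driver.find_element(*button_locator).click()  # action: click


def read_text(driver, locator):
    """Return an element's visible text so the test framework can assert on it."""
    return driver.find_element(*locator).text


def check_login_greeting(driver):
    """Example check; `driver` is an already-started WebDriver on the login page."""
    from selenium.webdriver.common.by import By

    # "username" and "#greeting" are hypothetical; use your page's real locators.
    fill_and_click(driver, (By.ID, "username"), (By.ID, "login-btn"), "alice")
    assert "alice" in read_text(driver, (By.CSS_SELECTOR, "#greeting"))
```

Passing locators in as tuples keeps the find and act steps reusable across pages; in a pytest suite, the body of check_login_greeting would typically live in a test function that receives the driver from a fixture.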
First script (concept)
- Start the driver (e.g. Chrome).
- Navigate to a URL (driver.get(url)).
- Find an element (e.g. by id or CSS).
- Perform an action (e.g. click, type).
- Assert (e.g. check text or URL).
- Quit the driver.
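Put together, the steps above could be sketched in Python as follows. The function names, the URL, and the expected "/dashboard" path are placeholders chosen for illustration; only the Selenium calls themselves (get, find_element, click, current_url, quit) are real API.

```python
def run_first_script(start_driver, url, locator, expected_url_part):
    """Run the steps once; raises AssertionError if the final check fails."""
    driver = start_driver()                       # start the driver
    try:
        driver.get(url)                           # navigate to a URL
        element = driver.find_element(*locator)   # find an element
        element.click()                           # perform an action
        # Assert: the click is expected to lead to a URL containing this text.
        assert expected_url_part in driver.current_url, driver.current_url
    finally:
        driver.quit()                             # quit, even if a step failed


def main():
    """Run against Chrome; needs the selenium package and a Chrome install."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    run_first_script(
        start_driver=webdriver.Chrome,
        url="https://example.com/login",   # placeholder URL
        locator=(By.ID, "login-btn"),
        expected_url_part="/dashboard",    # placeholder expectation
    )
```

Call main() after replacing the placeholders with a page you control. The try/finally is a deliberate choice in this sketch: without it, a failed lookup or assertion would leave a browser window open after every broken run.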
Part 2 covers locators, waits, and best practices (Page Object, explicit waits).
Summary
- WebDriver = API to drive a browser; you find elements, perform actions, and assert.
- Use the Selenium bindings for your language (Java, Python, etc.) and the driver for your browser (ChromeDriver for Chrome, GeckoDriver for Firefox, etc.).
- Part 2 covers locators, waits, and structure (Page Object) for maintainable tests.