How to get HTML source of a Web Element in Selenium WebDriver
Shreya Bose, Technical Content Writer at BrowserStack - February 9, 2023
Before exploring how to get the HTML source in Selenium, let’s take a quick moment to define two key terms: HTML source and web element. The sections that follow then cover two methods for retrieving the source, with code snippets.
What is HTML Source?
This refers to the HTML code underlying a certain web element on a web page. Since HTML is the foundation of any web page, testing it both in a single browser and in cross-browser scenarios is vital. Do not confuse this with the HTML <source> tag, however.
What is a Web Element?
Anything that appears on a web page is a web element. Most obviously, this refers to text boxes, checkboxes, buttons, or any other fields that display or require data from the user. Web elements can also mean the tags within the web page’s HTML code. Essentially, interaction with the HTML code is interaction with a web element. Such elements usually have unique identifiers, such as ID, name, or unique classes.
For example, to highlight text on a page, one would have to interact with the “body”, a “div” and perhaps even a “p” element.
It is common for web elements to occur within other web elements. One can use mechanisms such as XPath or CSS selectors in Selenium to locate them.
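As a rough sketch of locating nested elements, the snippet below looks up the same paragraphs once with XPath and once with a CSS selector. The helper name find_nested_paragraphs and the div/p structure are illustrative assumptions, not something taken from a particular page. The two strategy strings are the values behind Selenium's By.XPATH and By.CSS_SELECTOR constants, spelled out here so the helper needs no import of its own; in a real script one would normally import By instead.

```python
# String values behind Selenium's By.XPATH and By.CSS_SELECTOR constants.
XPATH = "xpath"
CSS_SELECTOR = "css selector"


def find_nested_paragraphs(driver):
    """Find every <p> nested anywhere inside a <div>, once per strategy.

    `driver` is assumed to be a Selenium WebDriver that has already
    loaded a page. Both lookups target the same elements.
    """
    by_xpath = driver.find_elements(XPATH, "//div//p")
    by_css = driver.find_elements(CSS_SELECTOR, "div p")
    return by_xpath, by_css
```

Either strategy works for nested elements; CSS selectors tend to be shorter, while XPath can also walk back up to a parent.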
How to retrieve the HTML source of a web element using Python?
To start with, download the Python bindings for Selenium WebDriver.
- One can do this from the PyPI page for the Selenium package.
- Alternatively, one can use pip to install the Selenium package. Python 3.6 ships with pip by default. Install Selenium with the following command:
pip install selenium
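It is also possible to use virtualenv to create isolated Python environments. Python 3.6 includes the built-in venv module (run as python -m venv), which is quite similar to virtualenv; the older pyvenv script is deprecated as of Python 3.6.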
Notes for Windows users
- Install Python 3.6 with the Windows installer provided on the python.org download page.
- Start a command prompt using the cmd.exe program. Then run the pip command below to install Selenium, adjusting the path to match your Python installation.
C:\Python35\Scripts\pip.exe install selenium
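After installing, it can be worth confirming that the package is visible to the interpreter you plan to use. The sketch below is one way to check; the helper name installed_version is hypothetical, and it relies on importlib.metadata, which is only available in Python 3.8 and later.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the Selenium version found in the current environment, if any.
    print("selenium:", installed_version("selenium") or "not installed")
```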
Now, here’s how to get a web element:
from selenium.webdriver.common.by import By
elem = wd.find_element(By.CSS_SELECTOR, '#my-id')
The HTML source for the full page is available through the driver's page_source property, for example wd.page_source.
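For the full page, a minimal sketch might look like the following. The function name fetch_page_source, the choice of Chrome, and the example URL are assumptions made for illustration. The Selenium import is done lazily so the helper can also be called with a driver you already have.

```python
def fetch_page_source(url, driver=None):
    """Load `url` and return the page's HTML as the browser currently sees it.

    If no driver is passed in, a Chrome driver is started and shut down
    again here (assumes Selenium and a local Chrome are installed).
    """
    own_driver = driver is None
    if own_driver:
        from selenium import webdriver  # only needed when we create the driver
        driver = webdriver.Chrome()
    try:
        driver.get(url)
        # page_source is a property of the WebDriver, not a method call.
        return driver.page_source
    finally:
        if own_driver:
            driver.quit()


# Hypothetical usage:
# html = fetch_page_source("https://example.com")
```

Note that page_source reflects the document at the time of the call, so content injected later by JavaScript may require waiting before reading it.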
How to retrieve the HTML source of a web element using Selenium?
Read the innerHTML attribute to get the source of the element’s content. innerHTML is a property of a DOM element whose value is the HTML between the opening tag and ending tag.
For example, for the element below, the innerHTML property carries the value “ a text ” (everything between the tags, including the surrounding spaces):
<p> a text </p>
This property can be used to retrieve or dynamically insert content on a web page. However, if it is used to do anything beyond inserting simple text, it may behave differently across browsers, so it is good practice to test your website across browsers and devices.
innerHTML began as a proprietary Internet Explorer feature. It was later standardized alongside HTML5 and is now supported by all major browsers.
To get the HTML source in Selenium, read the element's innerHTML attribute, for example with elem.get_attribute('innerHTML').
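Put together, reading innerHTML could be sketched as below. The helper names inner_html and demo, the Chrome driver, the URL, and the #my-id selector are all illustrative assumptions; only get_attribute and find_element are Selenium's own calls.

```python
def inner_html(element):
    """Return the markup between the element's opening and closing tags."""
    return element.get_attribute("innerHTML")


def demo(url="https://example.com", selector="#my-id"):
    """Hypothetical end-to-end usage; needs Selenium and a local Chrome."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        elem = driver.find_element(By.CSS_SELECTOR, selector)
        return inner_html(elem)
    finally:
        driver.quit()
```

The returned string excludes the element's own tags, so for a paragraph it contains only the paragraph's content.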
Read the outerHTML attribute to get the source including the current element. outerHTML is an element property whose value is the HTML of the selected element itself, that is, its opening and closing tags together with everything between them.
For example, the outerHTML property of the div below carries a value that contains the div itself and the span inside it.
<div> <span>Hello there!</span> </div>
To get this HTML source in Selenium, read the element's outerHTML attribute, for example with elem.get_attribute('outerHTML').
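To make the difference between the two properties concrete, the sketch below reads both from the same element. The helper name element_sources is hypothetical, and the element is assumed to be a WebElement already obtained through find_element.

```python
def element_sources(element):
    """Return both HTML views of a Selenium WebElement.

    "outer" includes the element's own tags; "inner" holds only what
    sits between them.
    """
    return {
        "outer": element.get_attribute("outerHTML"),
        "inner": element.get_attribute("innerHTML"),
    }


# Hypothetical usage, given an element located earlier:
# sources = element_sources(elem)
# print(sources["outer"])
```

outerHTML is usually the more useful of the two when logging a failing element, since it also shows the element's own tag and attributes.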
Automated Selenium testing becomes more efficient with the techniques detailed above. Retrieving the HTML source of designated web elements makes it easy to examine them for anomalies, and identifying anomalies quickly leads to equally quick debugging and faster delivery of websites that provide a good user experience.