How to Remove HTML Tags from Text in JavaScript
In this tutorial, you will learn how to remove HTML tags from text in JavaScript. I’m going to assume that you want to get rid of all the HTML elements in a string and get just the plain text back.
Since a string may contain several nested HTML tags, removing them can be challenging, and JavaScript offers no dedicated string method for the job. For a new developer, this can be a bit tricky.
There are several solutions to this problem available online. The majority of them use regular expressions, but if you examine those solutions carefully, you’ll notice that the pattern differs from one solution to the next, and a simple pattern can be thrown off by things like a > character inside an attribute value. This makes regular expressions a less than ideal strategy.
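To make the weakness concrete, here is a minimal sketch of the kind of regex-based helper often found online. The function name naiveStrip is hypothetical and only used for this illustration; the pattern is one common variant, not a canonical one.

```javascript
// A sketch of a typical regex-based approach (for illustration only).
// `naiveStrip` is a hypothetical name, not part of any library.
function naiveStrip(html) {
  // Remove everything from a "<" up to the next ">".
  return html.replace(/<[^>]*>/g, "");
}

// Works for simple, well-behaved markup.
console.log(naiveStrip("<p>Hello <b>world</b></p>")); // Hello world

// Breaks when ">" appears inside an attribute value: the match
// stops at the first ">", so part of the tag leaks into the output.
console.log(naiveStrip('<a title="a > b">link</a>')); //  b">link

// Character references are left untouched instead of being decoded.
console.log(naiveStrip("Tom &amp; Jerry")); // Tom &amp; Jerry
```

The pattern can be extended to cover more cases, but each extension tends to expose another edge case, which is why a real HTML parser is the safer choice.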
I believe the best approach in this scenario would be to use the DOMParser() constructor because it can easily parse HTML or XML from a string. It returns a DOMParser object, and this object has the parseFromString() method. This method takes 2 parameters: a string and a MIME type.
In this situation, the HTML string produces an HTML document, and since it is an HTML document, we can read or edit its content using DOM properties and methods.
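The idea described above can be sketched as a small helper. The name stripTags is hypothetical, and the demo calls are guarded because DOMParser is a browser API that is not available in every JavaScript environment.

```javascript
// Sketch: strip tags by parsing the string into a detached HTML document.
// `stripTags` is a hypothetical helper name, not part of any library.
function stripTags(html) {
  // parseFromString() takes the markup and a MIME type;
  // "text/html" makes the browser's HTML parser build a document.
  const doc = new DOMParser().parseFromString(html, "text/html");
  // textContent concatenates the text of every node under <body>.
  return doc.body.textContent;
}

// DOMParser is a browser API, so only run the demo where it exists.
if (typeof DOMParser !== "undefined") {
  console.log(stripTags("<p>Hello <strong>world</strong></p>")); // Hello world
  console.log(stripTags('<a title="a > b">link</a>')); // link
  console.log(stripTags("Tom &amp; Jerry")); // Tom & Jerry
}
```

Because the browser's own parser does the work, quoted attribute values and character references are handled without any extra code.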
In the following example, we have one global HTML string. I simply want to remove all the HTML tags from it, grab the text content, and display that text in a paragraph element. Please have a look at the code example and the steps given below.
HTML & CSS
- We have 3 kinds of elements in the HTML file (div, button, and p). The div element is just a wrapper for the rest of the elements.
- The inner text for the button element is “Remove Tags”.
- Both paragraph elements are empty because we will populate them with JavaScript. We have given them the unique ids p1 and p2.
- We have done some basic styling using CSS and added the link to our style.css stylesheet inside the head element.
- We have also included our JavaScript file script.js with a script tag at the bottom.
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <meta http-equiv="X-UA-Compatible" content="ie=edge">
  <link rel="stylesheet" href="style.css">
  <title>Document</title>
</head>
<body>
  <div>
    <button>Remove Tags</button>
    <p id="p1"></p>
    <p id="p2"></p>
  </div>
  <script src="script.js"></script>
</body>
</html>
div {
  text-align: center;
}

button {
  padding: 10px 20px;
}

p {
  margin-top: 20px;
}
JavaScript
- We have selected the button element using the document.querySelector() method and stored it in the btnRemove variable.
- In the same way, we have selected both paragraph elements by their ids and stored them in the p1 and p2 variables.
- We have one global variable, htmlString, which contains our HTML string.
- We have attached a DOMContentLoaded event listener to the window. In it, we set the innerHTML property of the first paragraph to htmlString so that we can see how the string looks when the browser renders its HTML tags.
- We have attached a click event listener to the button element.
- In the event handler function, we call the DOMParser constructor and store the resulting DOMParser object in the parser variable.
- As mentioned above, this object has a parseFromString() method. We pass it 2 arguments: htmlString and the MIME type, which in our case is text/html. We store the document returned by this method in the doc variable.
- Finally, we extract the text content from the document’s body and display it in the second paragraph. We use the logical OR (||) operator as a fallback: if the textContent property of the body element is empty, we display “No Content” instead.
let btnRemove = document.querySelector('button');
let p1 = document.querySelector('#p1');
let p2 = document.querySelector('#p2');

// The HTML string whose tags we want to remove
let htmlString = `<p>Please remove any <u>HTML</u> Tags <strong>from</strong> this <em>text</em> quickly.</p>`;

// Show the string as rendered HTML in the first paragraph
window.addEventListener('DOMContentLoaded', () => {
  p1.innerHTML = htmlString;
});

// On click, parse the string and show only its text in the second paragraph
btnRemove.addEventListener('click', () => {
  let parser = new DOMParser();
  let doc = parser.parseFromString(htmlString, "text/html");
  p2.textContent = doc.body.textContent || "No Content";
});
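The fallback in the click handler relies on the fact that an empty string is falsy in JavaScript. The short sketch below isolates that behavior; textOrFallback is a hypothetical helper name introduced only for this illustration.

```javascript
// Sketch of the logical OR fallback used in the click handler.
// `textOrFallback` is a hypothetical name, not part of the tutorial's script.
function textOrFallback(text, fallback = "No Content") {
  // An empty string is falsy, so || returns the fallback instead.
  return text || fallback;
}

console.log(textOrFallback("Hello")); // Hello
console.log(textOrFallback(""));      // No Content

// A whitespace-only string is truthy, so it is returned as-is.
console.log(textOrFallback("   ") === "   "); // true
```

If whitespace-only results should also count as empty, one option is to call trim() on the text before applying the fallback.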