HTML Entity Encoder

Convert special characters to HTML entities for safe display in HTML.

What is HTML Entity Encoding?

HTML entity encoding converts special characters into HTML entities (text representations) that browsers can safely display. This prevents characters with special meaning in HTML from being interpreted as code.
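
As a minimal sketch of that conversion, the Python standard library's html.escape() can be used to list the entity produced for each reserved character. The variable names (reserved, mapping, snippet) are illustrative only and are not part of this tool.

```python
import html

# The five characters that carry special meaning in HTML.
reserved = ["&", "<", ">", '"', "'"]

# html.escape() encodes quotes as well by default (quote=True).
mapping = {ch: html.escape(ch) for ch in reserved}

for ch, entity in mapping.items():
    print(f"{ch!r} -> {entity}")

# A whole snippet is encoded character by character in the same way.
snippet = "<div>Hello</div>"
encoded_snippet = html.escape(snippet)
print(encoded_snippet)  # &lt;div&gt;Hello&lt;/div&gt;
```

Other encoders may choose a different but equivalent entity for the apostrophe (for example &#39; instead of &#x27;); browsers decode both to the same character.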

Critical Uses of HTML Entity Encoding

  • Preventing XSS (Cross-Site Scripting) Attacks: This is the primary reason for HTML encoding. When displaying user input (comments, usernames, search queries, form data), you must encode it to prevent attackers from injecting malicious scripts. For example, without encoding, a username like <img src=x onerror=alert(document.cookie)> would run JavaScript with access to the page's cookies; a real attack would send them to the attacker rather than show an alert. With encoding, it becomes harmless text: &lt;img src=x onerror=alert(document.cookie)&gt;. Major web frameworks (Django, Rails, Laravel, ASP.NET) include HTML encoding functions. Use them consistently.
  • Displaying User-Generated Content: Any content submitted by users (blog comments, forum posts, product reviews, chat messages, profile descriptions) must be HTML-encoded before display. Attackers will absolutely try to inject malicious code through these inputs. Even seemingly safe fields like email addresses or phone numbers can contain < > characters in attacks. Encode everything from untrusted sources.
  • Showing Code Examples and Snippets: When displaying HTML, XML, or code snippets on websites (tutorials, documentation, developer blogs), encode the code so browsers display it as text rather than rendering it. For example, to show <div>Hello</div> on a page, you must encode it as &lt;div&gt;Hello&lt;/div&gt;. Without encoding, the browser renders an actual div element instead of showing the code.
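  • Encoding HTML Attribute Values: When inserting dynamic data into HTML attributes, encode the values and always quote the attribute, so that the data cannot break out of it. For example, <a href="USERDATA"> is vulnerable if USERDATA contains " (which closes the attribute) followed by injected attributes, or "> (which also closes the tag) followed by injected markup. Encoding the quote as &quot; prevents this attack. Always encode attribute values, even for "safe" looking data. Note that HTML encoding alone is not sufficient for URL attributes such as href and src (a javascript: URL needs no special characters) or for event handlers such as onclick (the value is executed as JavaScript): validate URL schemes and keep untrusted data out of event handlers.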
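
The attribute case above can be sketched in Python as follows. The helper render_link_title and the /profile URL are hypothetical names chosen for illustration, and the example uses a title attribute rather than href so that URL-scheme validation stays out of scope.

```python
import html

def render_link_title(user_data: str) -> str:
    """Hypothetical helper: put untrusted text into a double-quoted attribute."""
    # quote=True (the default) also encodes " and ', which is what keeps
    # the value from closing the attribute early.
    return '<a href="/profile" title="' + html.escape(user_data, quote=True) + '">profile</a>'

# Payload that tries to close the attribute and the tag, then inject a script.
payload = '"><script>alert(1)</script>'

# Naive concatenation: the payload escapes the attribute.
unsafe = '<a href="/profile" title="' + payload + '">profile</a>'

# Encoded: the payload stays inside the attribute value as plain text.
safe = render_link_title(payload)

print(unsafe)
print(safe)
```

In the encoded output the quote becomes &quot;, so the browser treats the whole payload as the text of the title attribute.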

HTML Entities and Web Security

HTML entities were defined in the original HTML specification to allow displaying reserved characters. The current HTML Living Standard, maintained by WHATWG, defines the complete set of named character references. However, their critical security role emerged in the late 1990s when XSS attacks became prevalent. XSS has appeared in the OWASP (Open Web Application Security Project) Top 10 list of web vulnerabilities since its first edition, as its own category through 2017 and under the Injection category since 2021. HTML entity encoding is the primary defense when untrusted data is placed in HTML: by converting < to &lt; and > to &gt;, malicious scripts like <script>alert(1)</script> become harmless text: &lt;script&gt;alert(1)&lt;/script&gt;. Modern web security depends on consistent, correct HTML encoding of all untrusted data before output.
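
To illustrate why the encoded form is harmless, the short Python sketch below encodes a script payload and then decodes it again with html.unescape(), which approximates what a browser does when it displays entity-encoded text. The variable names are illustrative.

```python
import html

payload = "<script>alert(1)</script>"

# What gets written into the page: no literal angle brackets remain,
# so the HTML parser never sees a <script> tag.
encoded = html.escape(payload)

# What the reader sees: the browser decodes entities for display only,
# after parsing, so the original characters appear as text.
displayed = html.unescape(encoded)

print(encoded)    # &lt;script&gt;alert(1)&lt;/script&gt;
print(displayed)  # <script>alert(1)</script>
```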

Critical Security Guidelines

HTML encoding prevents XSS attacks by converting special characters to safe entities. Always encode user input before displaying it on web pages.
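
One way to follow this guideline is to keep user input in its raw form and encode it at the moment it is written into HTML. The sketch below assumes a hypothetical render_comment helper and template string; real applications would normally rely on their framework's auto-encoding templates instead.

```python
import html
from string import Template

# Hypothetical output template; $author and $body are filled with untrusted data.
COMMENT_TEMPLATE = Template("<li><b>$author</b>: $body</li>")

def render_comment(author: str, body: str) -> str:
    """Encode every untrusted value at output time, just before it enters HTML."""
    return COMMENT_TEMPLATE.substitute(
        author=html.escape(author),
        body=html.escape(body),
    )

rendered = render_comment("Ann & Bob", "<img src=x onerror=alert(1)>")
print(rendered)
# <li><b>Ann &amp; Bob</b>: &lt;img src=x onerror=alert(1)&gt;</li>
```

Encoding at output rather than at input keeps the stored data usable for other contexts (JSON, plain-text email, search indexing) that need different escaping or none.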

HTML Entity Encoding in Programming Languages

Most languages and web frameworks provide HTML encoding functions to prevent XSS. Examples for PHP, JavaScript, Python, Go, Java, Ruby, and C# follow:

PHP

// htmlspecialchars() - standard encoding (recommended)
$encoded = htmlspecialchars($data, ENT_QUOTES, 'UTF-8');
// Encodes: < > & " '

// htmlentities() - encodes ALL special characters
$encoded = htmlentities($data, ENT_QUOTES, 'UTF-8');
// Encodes: < > & " ' plus accented characters, etc.

// For attribute values (always use ENT_QUOTES)
echo '<div title="' . htmlspecialchars($userInput, ENT_QUOTES, 'UTF-8') . '">';

// Laravel Blade templates (auto-encoding)
// {{ $userInput }} - automatically HTML encoded

JavaScript

// Browser: textContent (inserted as text, never parsed as HTML - recommended)
element.textContent = userInput; // Safe

// Creating elements safely
const div = document.createElement('div');
div.textContent = userInput; // Safe

// Manual encoding function (browser only)
// Encodes < > & but NOT quotes, so do not use the result in attribute values
function htmlEncode(str) {
    const div = document.createElement('div');
    div.textContent = str;
    return div.innerHTML;
}

// Library: DOMPurify (for sanitizing HTML that must keep some markup)
const clean = DOMPurify.sanitize(dirtyHTML);

Python

import html

# html.escape() - standard encoding (quote=True is the default)
encoded = html.escape(userInput)
# Encodes: < > & " '

# Without quote encoding (element content only, not attribute values)
encoded = html.escape(userInput, quote=False)
# Encodes: < > &

# Django templates (auto-encoding)
# {{ user_input }} - automatically HTML encoded
# {{ user_input|safe }} - NO encoding (dangerous!)

# Jinja2 templates (auto-encoding only when autoescape is enabled)
# {{ user_input }} - automatically encoded

Go

import "html"

// html.EscapeString() - standard encoding
encoded := html.EscapeString(userInput)
// Encodes: < > & " '

// In Go templates (html/template auto-encodes; text/template does NOT)
// Requires: import "html/template"
tmpl := template.Must(template.New("page").Parse("<p>{{.}}</p>"))
tmpl.Execute(w, userInput) // Auto-encoded safely

Java

// Apache Commons Text (recommended)
import org.apache.commons.text.StringEscapeUtils;
String encoded = StringEscapeUtils.escapeHtml4(userInput);
// Encodes: < > & " (but NOT '), plus characters that have HTML 4 named entities

// OWASP Java Encoder (most secure)
import org.owasp.encoder.Encode;
String encoded = Encode.forHtml(userInput);

// JSP with JSTL, as commonly used with Spring MVC (auto-encoding)
// <c:out value="${userInput}"/> - auto-encoded (escapeXml defaults to true)

Ruby

require 'cgi'

# CGI.escapeHTML() - standard encoding
encoded = CGI.escapeHTML(user_input)
# Encodes: < > & " '

# Rails (ERB templates auto-encode)
# <%= user_input %> - automatically HTML encoded
# <%== user_input %> - NO encoding (dangerous!)

# html_safe marker (tells Rails the string is already safe)
# safe_html.html_safe - skips encoding (dangerous on untrusted data!)

C#

using System.Web;
using System.Net;

// HttpUtility.HtmlEncode() - standard encoding
string encoded = HttpUtility.HtmlEncode(userInput);
// Encodes: < > & " '

// WebUtility (no System.Web dependency)
string encoded = WebUtility.HtmlEncode(userInput);

// Razor views (auto-encoding)
// @Model.UserInput - automatically HTML encoded
// @Html.Raw(Model.UserInput) - NO encoding (dangerous!)

Related Tools

Need to decode HTML entities? Use our HTML Entity Decoder to convert &lt; &gt; &amp; back to characters.

Encoding data for URLs? Try our URL Encoder to make text URL-safe.

Encoding binary data? Use our Base64 Encoder for binary-to-text conversion.