Free Online HTML Entity Decoder
Decode HTML entities back to their original characters.
What Is HTML Entity Decoding?
HTML entity decoding converts HTML entities (such as &lt;, &gt;, &amp;) back to their original characters. This is useful when you need to turn encoded HTML back into readable text or real HTML code.
Common Uses of HTML Entity Decoding
- Web Scraping and Data Extraction: When scraping web pages or parsing HTML documents, extracted text sometimes contains HTML entities. Decode these entities to get clean, readable text for analysis, search indexing, or storage. For example, a scraped product description containing 5&quot; screen should decode to 5" screen.
- API Response Processing: Some APIs return HTML-entity-encoded data in JSON or XML responses. When consuming these APIs, decode the encoded strings to get usable text. For example, an API might return {"title": "Q&amp;A Session"}, whose title needs decoding to "Q&A Session" for display or further processing.
- Content Migration and Data Import: When migrating content between systems (moving from WordPress to another CMS, importing blog posts from XML, converting legacy databases), content is often stored with HTML entities. Decoding these entities to plain text or properly formatted HTML is a necessary step in the migration. For example, a blog title stored as "How to Use &lt;div&gt; Tags" needs decoding to "How to Use <div> Tags" for display or search indexing.
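As a rough illustration of the uses listed above, here is a minimal Python sketch built on the standard library's html.unescape and json.loads. The sample strings mirror the examples in the list; the variable names are hypothetical and the inputs are illustrative, not real scraped or API data.

```python
import html
import json

# Sample inputs (illustrative values only).
scraped_description = "5&quot; screen"
api_body = '{"title": "Q&amp;A Session"}'
legacy_title = "How to Use &lt;div&gt; Tags"

# Web scraping: decode the extracted text.
clean_description = html.unescape(scraped_description)

# API response: parse the JSON first, then decode the string field.
api_title = html.unescape(json.loads(api_body)["title"])

# Content migration: decode a stored title.
migrated_title = html.unescape(legacy_title)

print(clean_description)
print(api_title)
print(migrated_title)
```

Parsing the JSON before decoding keeps the two layers separate: JSON escaping is handled by the JSON parser, and HTML entities are handled afterwards on the extracted string.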
How Does HTML Entity Decoding Work?
HTML entity decoding reverses the encoding process by converting entity references back to their original characters. The decoder recognizes both named entities (such as &amp;) and numeric entities (such as &#38; or &#x26;). The technical process is:
- Step 1 - Scan for entity references: The decoder reads the text looking for ampersand characters (&) that begin entity references, which end with a semicolon (;).
- Step 2 - Identify the entity type: Named entities (e.g., &amp;, &lt;, &gt;) are looked up in the HTML entity table. Numeric entities that begin with &# are parsed as decimal; those that begin with &#x are parsed as hexadecimal.
- Step 3 - Convert to character: Each entity is replaced with its corresponding Unicode character. For example, &amp; becomes &, &lt; becomes <, and &eacute; becomes é.
- Step 4 - Pass through non-entity text: Text that does not match any entity pattern is left unchanged. Invalid or unknown entities may be left as-is or handled according to HTML parsing rules.
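The four steps above can be sketched in Python roughly as follows. This is a simplified illustration, not the algorithm any particular decoder uses: the function name decode_entities and the regular expression are assumptions made for this example, only semicolon-terminated references are handled, and the special cases in the HTML parsing rules (for instance code point 0 or legacy references without a semicolon) are ignored. The named-entity table comes from the standard library's html.entities.html5 mapping.

```python
import re
from html.entities import html5  # maps names such as "amp;" to characters

# Step 1: find references that start with "&" and end with ";".
ENTITY_RE = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def decode_entities(text: str) -> str:
    """Simplified entity decoder (hypothetical helper for illustration)."""

    def replace(match: re.Match) -> str:
        body = match.group(1)
        # Step 2: identify the entity type.
        if body.startswith(("#x", "#X")):
            code_point = int(body[2:], 16)   # hexadecimal numeric entity
        elif body.startswith("#"):
            code_point = int(body[1:], 10)   # decimal numeric entity
        else:
            # Named entity: look it up; unknown names are left unchanged.
            return html5.get(body + ";", match.group(0))
        # Step 3: convert the code point to a character.
        try:
            return chr(code_point)
        except (ValueError, OverflowError):
            # Out-of-range code points are left as written.
            return match.group(0)

    # Step 4: text outside the matches passes through untouched.
    return ENTITY_RE.sub(replace, text)


print(decode_entities("&lt;p&gt;Hello &amp; welcome&lt;/p&gt;"))
print(decode_entities("&#38; &#x26; &eacute; &bogus; AT&T"))
```

In practice a library routine such as Python's html.unescape is the better choice, since it follows the HTML rules for edge cases that this sketch skips.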
Example: "&lt;p&gt;Hello &amp; welcome&lt;/p&gt;" decodes to "<p>Hello & welcome</p>". The named entities are replaced with their corresponding characters.
Understanding HTML Entity Decoding
HTML entity decoding interprets both named entities (&lt; &gt; &amp;) and numeric entities (&#60; &#62; &#38; in decimal, &#x3C; &#x3E; &#x26; in hexadecimal). The decoder recognizes entity patterns (& followed by an entity name, or by # and a number, terminated by ;), looks up the corresponding character in the HTML entity table, and replaces the entity with the actual character. Named entities are defined by the HTML specifications (over 2,000 entities, including mathematical symbols, Greek letters, and special characters). Numeric entities can represent any Unicode character by code point.
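To make the equivalence of the three notations concrete, the short Python sketch below decodes the same character written as a named, a decimal, and a hexadecimal entity using the standard library's html.unescape. The variable names are illustrative only.

```python
import html

# The same character written three ways: named, decimal, hexadecimal.
variants = ["&lt;", "&#60;", "&#x3C;"]
decoded_variants = [html.unescape(v) for v in variants]

# Named entities for a Greek letter and a mathematical symbol.
alpha = html.unescape("&alpha;")
summation = html.unescape("&sum;")

# A numeric entity for a code point outside the Basic Multilingual Plane.
emoji = html.unescape("&#x1F600;")

print(decoded_variants, alpha, summation, emoji)
```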
Security Warnings for HTML Decoding
HTML decoding converts entities back to special characters such as < and >, which browsers interpret as markup. Never decode untrusted content and display it without re-encoding it first, as this can enable XSS attacks.
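One way to follow this advice is to decode for processing and re-encode at the point of output. The Python sketch below assumes the decoded value is later inserted into an HTML page; the function name prepare_for_display is hypothetical, while html.unescape and html.escape are standard library calls.

```python
import html


def prepare_for_display(encoded: str) -> str:
    """Decode entities, then re-encode before the text is placed in HTML."""
    text = html.unescape(encoded)          # plain text, may contain < > & "
    return html.escape(text, quote=True)   # safe to embed in HTML again


untrusted = "&lt;script&gt;alert(1)&lt;/script&gt;"
decoded_text = html.unescape(untrusted)    # contains a literal <script> tag
safe_markup = prepare_for_display(untrusted)

print(decoded_text)
print(safe_markup)
```

The decoded intermediate value is suitable for storage, search, or text processing; only the re-escaped string should be written into HTML markup.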
HTML Entity Decoding in Programming Languages
HTML entity decoding is available in the standard library or a widely used package of most major programming languages. The examples below cover PHP, JavaScript, Python, Go, Java, Ruby, and C#:
PHP
// html_entity_decode() - standard decoding
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');
// Decodes: &lt; &gt; &amp; &quot; &#039; and the other HTML 4.01 entities (add ENT_HTML5 for the HTML5 set)
// htmlspecialchars_decode() - decodes only the basic entities
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES);
// Decodes: &lt; &gt; &amp; &quot; &#039; only
// Example: Processing API response
$apiData = json_decode($response, true);
$title = html_entity_decode($apiData['title'], ENT_QUOTES, 'UTF-8');
JavaScript
// Safe decoding using DOMParser (scripts in the parsed document are not executed)
function htmlDecode(str) {
  const parser = new DOMParser();
  const doc = parser.parseFromString(str, 'text/html');
  return doc.documentElement.textContent;
}
// Using a textarea element (alternative; renamed so it does not overwrite htmlDecode)
function htmlDecodeWithTextarea(str) {
  const textarea = document.createElement('textarea');
  textarea.innerHTML = str;
  return textarea.value;
}
// WARNING: Never assign untrusted data to the innerHTML of a regular page element
// element.innerHTML = encoded; // DANGEROUS if untrusted
Python
import html
# html.unescape() - decodes all HTML entities
decoded = html.unescape(encoded)
# Decodes: &lt; &gt; &amp; &#60; &#x3C; etc.
# Example: Processing API response
import json
data = json.loads(response)
title = html.unescape(data['title'])
# BeautifulSoup (for HTML parsing with auto-decode)
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_content, 'html.parser')
text = soup.get_text() # Automatically decodes entities
Go
import "html"
// html.UnescapeString() - decodes HTML entities
decoded := html.UnescapeString(encoded)
// Decodes: &lt; &gt; &amp; &quot; &#39; etc.
// Example: Processing XML feed
type RSSItem struct {
	Title       string `xml:"title"`
	Description string `xml:"description"`
}
// Decode after parsing
item.Title = html.UnescapeString(item.Title)
Java
// Apache Commons Text (recommended)
import org.apache.commons.text.StringEscapeUtils;
String decoded = StringEscapeUtils.unescapeHtml4(encoded);
// Decodes all HTML 4 entities
// Example: Processing API response
import com.google.gson.Gson;
ApiResponse response = new Gson().fromJson(json, ApiResponse.class);
String decodedTitle = StringEscapeUtils.unescapeHtml4(response.getTitle());
// Jsoup (for HTML parsing with auto-decode)
import org.jsoup.Jsoup;
String decodedText = Jsoup.parse(encoded).text();
Ruby
require 'cgi'
# CGI.unescapeHTML() - decodes HTML entities
decoded = CGI.unescapeHTML(encoded)
# Decodes: &lt; &gt; &amp; &quot; etc.
# Nokogiri (for HTML parsing with auto-decode)
require 'nokogiri'
doc = Nokogiri::HTML(html_content)
text = doc.text # Automatically decodes entities
# Example: Processing RSS feed
require 'rss'
rss = RSS::Parser.parse(feed_content)
rss.items.each do |item|
  title = CGI.unescapeHTML(item.title)
end
C#
using System.Web; // HttpUtility
using System.Net; // WebUtility
using Newtonsoft.Json; // JsonConvert
// HttpUtility.HtmlDecode() - decodes HTML entities
string decoded = HttpUtility.HtmlDecode(encoded);
// Decodes all HTML entities
// WebUtility (no System.Web dependency)
string decodedWithWebUtility = WebUtility.HtmlDecode(encoded);
// Example: Processing API response
var jsonData = JsonConvert.DeserializeObject<ApiResponse>(json);
string decodedTitle = HttpUtility.HtmlDecode(jsonData.Title);
Related Tools
Need to encode text to HTML entities? Use our HTML Entity Encoder to convert < > & to &lt; &gt; &amp; for safe display.
Decoding URL-encoded strings? Try our URL Decoder to convert %XX sequences back to characters.
Decoding Base64 data? Use our Base64 Decoder to convert Base64 strings to original text or binary.