HTML Escape / Unescape

Escapes <, >, &, double quotes, and apostrophes to HTML entities, and decodes entities back to text.

Overview

Useful when embedding snippets in HTML, templates, and attributes. Decoding uses the browser DOM for common entities.

Technical deep dive

The five characters that underpin HTML security

  • `<` (less than) → `&lt;`: This is the character that opens any HTML tag. If it is left unescaped, dynamic content can inject arbitrary tags. It must always be escaped in untrusted text (after `&`, as explained in the `&` entry below).
  • `>` (greater than) → `&gt;`: Closes tags. Browsers tolerate an unescaped `>` in text content, but escaping it consistently guards against edge cases such as unquoted attribute values, where `>` ends the tag. Note that entities are not decoded inside `<script>` elements, so HTML escaping alone does not make untrusted data safe inside inline JavaScript blocks.
  • `&` (ampersand) → `&amp;`: Starts HTML entities. It must be escaped before the others: if `<` is escaped first, the `&` in the resulting `&lt;` is then re-escaped, producing `&amp;lt;`. The correct order is always `&` first.
  • `"` (double quote) → `&quot;`: Critical in double-quote-delimited attributes. `<input value="{{userInput}}">` without escaping allows a user to enter `" onclick="alert(1)` and prematurely close the attribute, injecting event handlers.
  • `'` (single quote) → `&#39;`: Relevant in single-quote-delimited attributes. These are less common in HTML (double quotes are the prevailing convention), but they appear frequently in PHP, Django, and Rails templates that use single quotes when generating attributes dynamically.
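
The ordering concern for `&` and the attribute-injection case above can be illustrated with a minimal sketch. It assumes a single-pass variant that looks each character up in a table, so no replacement ever sees the output of another. The names `ENTITY_MAP`, `escapeHtmlOnce`, and `renderInput` are hypothetical and are not this tool's actual implementation.

```javascript
// Lookup table for the five characters discussed above.
const ENTITY_MAP = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical helper: a single pass over the input, so the order of
// replacements cannot cause double-escaping.
function escapeHtmlOnce(value) {
  return String(value).replace(/[&<>"']/g, (ch) => ENTITY_MAP[ch]);
}

// Hypothetical template function for a double-quoted attribute.
function renderInput(userInput) {
  return `<input value="${escapeHtmlOnce(userInput)}">`;
}

// The payload from the double-quote entry above stays inside the attribute.
const payload = '" onclick="alert(1)';
console.log(renderInput(payload));
// → <input value="&quot; onclick=&quot;alert(1)">
```

Because the quotes are encoded, the browser would treat the whole payload as the attribute value rather than as a new `onclick` attribute.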

Stored, reflected, and DOM-based XSS: three attack vectors

  • Stored XSS (persistent): the malicious payload is stored in the database and displayed to every user who visits the page. It is generally the most dangerous form: a single forum post or product comment can compromise all visitors. The Samy worm on MySpace (2005) was a stored XSS attack.
  • Reflected XSS: the payload comes from a URL or parameter and is reflected immediately in the response. Example: `https://site.com/search?q=<script>stealCookies()</script>`. Requires the victim to click the malicious link, but trivial to distribute via phishing or URL shorteners.
  • DOM-based XSS: the site's own JavaScript reads a value from an untrusted source (`location.hash`, `document.cookie`, `window.name`) and writes it directly to the DOM with `innerHTML` or `document.write` without sanitization. The payload may never reach the server (a URL fragment, for example, is not sent in the request), so server-side analysis tools often fail to detect it.
  • Content Security Policy (CSP) is the second line of defense: an HTTP header that tells the browser which scripts are allowed. `Content-Security-Policy: default-src 'self'` blocks scripts from other origins and all inline scripts; specific inline scripts can be allowed by adding a nonce or hash to the policy. It does not replace escaping, but mitigates the impact of residual XSS.
  • XSS is different from CSRF (Cross-Site Request Forgery): XSS executes code in the victim's browser in the context of the target site; CSRF forges authenticated requests without executing code. Both require separate defenses: escaping + CSP for XSS, CSRF tokens for CSRF.
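
As a rough illustration of how escaping and CSP combine against the reflected case above, the following sketch builds a response for a search page. The `renderSearchResponse` helper and its `nonce` parameter are hypothetical; the nonce is assumed to be a fresh, cryptographically random value generated by the server for every response.

```javascript
// Escapes the five HTML-significant characters, & first.
function escapeHtml(str) {
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: returns headers and body for a search results page.
// `nonce` is assumed to be unique and unpredictable per response.
function renderSearchResponse(query, nonce) {
  const headers = {
    'Content-Type': 'text/html; charset=utf-8',
    // Only same-origin scripts and inline scripts carrying this nonce may run.
    'Content-Security-Policy':
      `default-src 'self'; script-src 'self' 'nonce-${nonce}'`,
  };
  const body = [
    '<!doctype html>',
    '<title>Search</title>',
    // First line of defense: the reflected query is escaped.
    `<p>Results for: ${escapeHtml(query)}</p>`,
    // The site's own inline script is allowed through its nonce.
    `<script nonce="${nonce}">console.log("trusted inline script");</script>`,
  ].join('\n');
  return { headers, body };
}

const response = renderSearchResponse(
  '<script>stealCookies()</script>',
  'exampleNonceValue' // placeholder only; never reuse a fixed nonce
);
console.log(response.headers['Content-Security-Policy']);
console.log(response.body);
```

If the escaping were ever missed, an injected `<script>` without the nonce would still be refused by a browser that enforces the policy.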

Tool guide

  • What HTML is: Markup for pages, made up of tags like <p>, attributes, and entities. Special characters inside text or attributes must be escaped so they are not parsed as markup.

  • What HTML entities are: Representations such as &lt; for <, &amp; for &, etc., used to insert literal text without breaking the document.

  • What the tool does: Converts text to entities (escape) or decodes entities back to text (unescape).

  • Why use it: Embed examples in docs, templates, HTML email, or value attributes without accidentally running tags.
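
The decode direction can be sketched without a DOM as follows. This is an assumption-laden alternative to the browser-DOM decoding mentioned in the overview, not the tool's own code: the hypothetical `unescapeHtml` helper only knows the five named entities in `NAMED_ENTITIES` plus numeric references, and leaves anything else untouched.

```javascript
// Named entities handled by this sketch (a small subset of HTML's full list).
const NAMED_ENTITIES = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'" };

// Hypothetical helper: decodes entities in a single pass, so the text
// produced by one replacement is never decoded a second time.
function unescapeHtml(str) {
  return String(str).replace(
    /&(?:#(\d+)|#[xX]([0-9a-fA-F]+)|([a-zA-Z]+));/g,
    (match, dec, hex, name) => {
      if (name !== undefined) {
        // Unknown names are left exactly as written.
        return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, name)
          ? NAMED_ENTITIES[name]
          : match;
      }
      const codePoint = dec !== undefined ? parseInt(dec, 10) : parseInt(hex, 16);
      // Out-of-range references are left untouched rather than throwing.
      return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
    }
  );
}

console.log(unescapeHtml('&lt;script&gt;'));
// → <script>
console.log(unescapeHtml('&amp;lt;'));
// → &lt;
```

The second call shows why a single pass matters: decoding `&amp;` and then `&lt;` in separate steps would turn `&amp;lt;` into `<`, which is not what the author of the text wrote.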

Code Snippets

escapeHtml() in JavaScript (why & must come first)
// WRONG: escaping < before & can create double-escaping
// Input: '<b>'
// Step 1 (< → &lt;): '&lt;b>'
// Step 2 (& → &amp;): '&amp;lt;b>' ← double-escaped!

// CORRECT: & always first
function escapeHtml(str) {
  return String(str)
    .replace(/&/g, '&amp;')   // 1. & FIRST
    .replace(/</g, '&lt;')    // 2. <
    .replace(/>/g, '&gt;')    // 3. >
    .replace(/"/g, '&quot;')  // 4. "
    .replace(/'/g, '&#39;');  // 5. '
}

// Examples:
console.log(escapeHtml('<script>alert(1)</script>'));
// → &lt;script&gt;alert(1)&lt;/script&gt;

console.log(escapeHtml('5 < 10 & "two" > \'one\''));
// → 5 &lt; 10 &amp; &quot;two&quot; &gt; &#39;one&#39;

Sanitization with DOMPurify (common configurations)
import DOMPurify from 'dompurify';

// Basic: removes everything dangerous, keeps formatting
const safeHtml = DOMPurify.sanitize(inputHtml);

// Restrictive: only text and basic formatting
const textOnly = DOMPurify.sanitize(inputHtml, {
  ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'p', 'br'],
  ALLOWED_ATTR: [],
});

// With links allowed (DOMPurify strips javascript: URLs by default)
const withLinks = DOMPurify.sanitize(inputHtml, {
  ALLOWED_TAGS: ['a', 'p', 'strong', 'em', 'ul', 'ol', 'li'],
  ALLOWED_ATTR: ['href', 'title', 'target'],
  // Redundant with the allowlist above, kept as an explicit safeguard
  FORBID_ATTR: ['onerror', 'onload', 'onclick'],
});

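// Check if input was modified by sanitization:
const original = '<p onclick="alert(1)">Text</p>';
const clean = DOMPurify.sanitize(original);
console.log(original === clean); // false → the sanitizer changed the input (onclick was removed)
// A difference alone does not prove malice: harmless markup can also be normalized.
console.log(DOMPurify.removed);  // lists what the last sanitize() call removed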

Before / after

<script> → &lt;script&gt;

FAQ

What is this tool for?

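It escapes HTML special characters to entities and decodes entities back to text. It runs fully in your browser and is useful for preparing snippets for HTML, templates, and attributes in everyday development.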

Are my inputs sent to a server?

Processing happens locally with JavaScript. We do not store what you paste into the text areas.

Can I use this for real production data?

Use at your own risk. For secrets (passwords, tokens), prefer controlled environments and follow your company's policies. Always review the generated output, and do not blindly trust content you find online.