Markdown to HTML

Build an HTML fragment from simple Markdown (headings, lists, bold, links, inline code) for docs and web content.

Overview

John Gruber created Markdown in 2004, in collaboration with Aaron Swartz, who also co-founded Reddit, co-authored the RSS 1.0 specification, and was one of the most significant activists for internet openness. The idea was to solve a problem every technical writer knew: HTML tags are visual noise when you read the source as plain text. An article marked up with `<p>`, `<strong>`, and `<a href=...>` is still text, but the tags get in the way of reading. Gruber wanted a format that was readable as plain text and easy to convert to HTML, and he largely succeeded.

Adoption was quiet but transformative. GitHub adopted Markdown for READMEs in 2009 and the format's status changed: from then on, open-source repositories could carry documentation that was readable both as raw text and rendered. Stack Overflow, Reddit, Discord, Notion, Obsidian, Jira: in 2024 it is easier to list tools that do not support Markdown than those that do. The format became the lingua franca of technical documentation. Dialects emerged: CommonMark (a formal spec from 2014), GitHub Flavored Markdown (with tables and task lists), and MDX (Markdown with JSX components). This tool implements the basic CommonMark syntax.

The output is an HTML fragment, with no `<html>`, `<head>`, or `<body>`. If you plan to display that HTML directly in the browser with third-party content, watch for XSS: a link like `[click](javascript:alert(1))` is valid Markdown and can produce an executable `href`. In production with untrusted content, always pass the generated HTML through a sanitization library such as DOMPurify before inserting it into the DOM.
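To make the conversion and the XSS concern concrete, here is a minimal sketch of how a converter for this subset (headings, unordered lists, bold, links, inline code) might work. It is not this tool's actual implementation and it is far from CommonMark-conformant. The function names `escapeHtml`, `renderInline`, and `markdownToHtml` are hypothetical, and the link allow-list (`http`, `https`, `mailto`) is an assumption chosen for illustration.

```javascript
// A sketch only: handles a small Markdown subset, not the full CommonMark spec.

// Escape characters that are special in HTML text and attribute values.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Convert inline syntax: `code`, **bold**, and [text](url).
// Known limitation: bold markers inside inline code are also converted.
function renderInline(text) {
  return escapeHtml(text)
    .replace(/`([^`]+)`/g, '<code>$1</code>')
    .replace(/\*\*([^*]+)\*\*/g, '<strong>$1</strong>')
    .replace(/\[([^\]]+)\]\(([^)\s]+)\)/g, (match, label, url) =>
      // Assumed allow-list: any other scheme (and any relative URL)
      // is left as literal text instead of becoming a link.
      /^(https?:|mailto:)/i.test(url)
        ? `<a href="${url}">${label}</a>`
        : match);
}

// Convert block syntax: ATX headings, "-" or "*" list items, paragraphs.
function markdownToHtml(markdown) {
  const out = [];
  let inList = false;
  let paragraph = [];

  const flushParagraph = () => {
    if (paragraph.length > 0) {
      out.push(`<p>${renderInline(paragraph.join('\n'))}</p>`);
      paragraph = [];
    }
  };
  const closeList = () => {
    if (inList) {
      out.push('</ul>');
      inList = false;
    }
  };

  for (const line of markdown.split(/\r?\n/)) {
    const heading = /^(#{1,6})\s+(.*)$/.exec(line);
    const item = /^[-*]\s+(.*)$/.exec(line);
    if (heading) {
      flushParagraph();
      closeList();
      const level = heading[1].length;
      out.push(`<h${level}>${renderInline(heading[2])}</h${level}>`);
    } else if (item) {
      flushParagraph();
      if (!inList) {
        out.push('<ul>');
        inList = true;
      }
      out.push(`<li>${renderInline(item[1])}</li>`);
    } else if (line.trim() === '') {
      // A blank line ends the current paragraph or list.
      flushParagraph();
      closeList();
    } else {
      closeList();
      paragraph.push(line.trim());
    }
  }
  flushParagraph();
  closeList();
  return out.join('\n'); // a fragment: no <html>, <head>, or <body>
}

console.log(markdownToHtml('# Guide\n\n- step **one**\n- [site](https://example.com)'));
console.log(markdownToHtml('[click](javascript:alert(1))'));
```

The first call returns an `<h1>` followed by a `<ul>` with two `<li>` elements. The second returns a paragraph containing the link source as literal text, with no `href`. Escaping everything first means raw HTML in the input is shown as text rather than passed through, which is stricter than most real parsers. A sketch like this does not replace sanitizing the output of a real parser.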

Technical deep dive

CommonMark, GitHub Flavored Markdown, and MDX: one format, many dialects

  • The original Markdown specification (2004) was deliberately informal: a text file with examples, not a formal grammar. This led to years of incompatibilities between parsers, so text that worked on GitHub would not render the same way on Reddit or in Pandoc. In 2012, Jeff Atwood (co-founder of Stack Overflow) and John MacFarlane started the project that became CommonMark: a formal specification with over 600 conformance tests.
  • GitHub Flavored Markdown (GFM) is a superset of CommonMark that adds tables (with columns delimited by `|`), task lists (`- [x]`), strikethrough (`~~text~~`), and extended autolinks. User mentions (`@name`) and issue references (`#123`) are features of GitHub's own rendering, not part of the GFM specification. Most modern tools support GFM or CommonMark as a baseline.
  • MDX emerged as a solution for the React ecosystem: it allows writing JSX components directly inside Markdown files, making it possible to use `<Chart data={data} />` or `<Callout>Warning</Callout>` inside documentation. It is supported by Next.js, Docusaurus, Astro, and Gatsby.
  • The most widely used JavaScript parsers are: marked.js (simple, fast, no dependencies), markdown-it (extensible, plugin support), commonmark.js (reference implementation of the CommonMark spec), and remark (unified plugin ecosystem for processing and transforming Markdown as an AST).
  • Pandoc is the universal document converter for the command line. It supports Markdown to HTML, PDF (via LaTeX), DOCX, EPUB, and over 60 formats. It is the right choice when you need full spec support, footnotes, bibliographic citations, or complex tables that simpler tools do not cover.
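As an illustration of how a dialect extends the base syntax, the GFM task-list marker mentioned above can be recognized with a small pattern. This is a hedged sketch, not how any particular parser implements it: `renderTaskItem` is a hypothetical helper, it only handles bullet markers (`-`, `*`, `+`), and it ignores nesting and inline formatting.

```javascript
// Sketch: render one bullet line as a GFM-style task list item.
// Returns null when the line is not a task list item.
function renderTaskItem(line) {
  // Bullet marker, then "[ ]", "[x]" or "[X]", then the item text.
  const match = /^\s*[-*+]\s+\[([ xX])\]\s+(.*)$/.exec(line);
  if (!match) return null;

  const checked = match[1] !== ' ' ? ' checked' : '';
  // Escape the label so item text cannot inject markup.
  const label = match[2]
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');

  return `<li><input type="checkbox" disabled${checked}> ${label}</li>`;
}

console.log(renderTaskItem('- [x] ship docs'));
console.log(renderTaskItem('- [ ] write tests'));
console.log(renderTaskItem('- plain item')); // not a task item
```

A parser that only implements CommonMark treats the same lines as ordinary list items whose text starts with `[x]` or `[ ]`, which is why the same source can render differently across tools.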

HTML sanitization: why Markdown output can be dangerous

  • Most Markdown parsers allow raw inline HTML in the document. A block like `<script>alert('xss')</script>` inside a Markdown file is technically valid and will be passed directly to the HTML output. If that output is displayed without sanitization, the script executes in the visitor's browser.
  • The most common attack vector is through links: `[click here](javascript:alert(1))` is valid Markdown and can produce `<a href="javascript:alert(1)">`. Defaults differ between parsers: markdown-it rejects `javascript:` links by default, while marked passes them through unchanged. Check your parser's documentation.
  • DOMPurify is the most widely used client-side library for sanitizing HTML in the browser. It removes scripts, event attributes like `onclick`, `javascript:` URLs, and other XSS vectors, while preserving safe formatting. It is lightweight, fast, and actively maintained. The basic call is `DOMPurify.sanitize(html)`.
  • On the server side, the options include sanitize-html (Node.js, highly configurable: you define which tags and attributes are permitted), Bleach (Python, originally developed by Mozilla and now deprecated by its maintainers), HtmlSanitizer (.NET), and HTMLPurifier (PHP). Use server-side sanitization when the HTML is persisted in a database or sent by email.
  • When you can skip sanitization: if the Markdown is written exclusively by trusted, authenticated users (such as authors in your own CMS), if you control all the content and it never comes from third parties, or if the generated HTML is used only in an internal context (PDF, email to a closed list). When in doubt, always sanitize.
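The link-scheme check described above can be sketched with the standard `URL` parser instead of a hand-written pattern. The parser normalizes case and strips embedded tabs and newlines, which simple string checks tend to miss. This is an illustrative sketch rather than a replacement for a sanitizer: the name `isSafeUrl`, the allow-list, and the placeholder base URL are all assumptions.

```javascript
// Assumed allow-list of URL schemes; everything else is rejected.
const ALLOWED_SCHEMES = new Set(['http:', 'https:', 'mailto:']);

// Placeholder base so relative URLs such as "/docs" can be parsed.
// "example.invalid" is a reserved name that never resolves.
const BASE_URL = 'https://example.invalid/';

// Returns true when the URL's scheme is on the allow-list.
// Relative URLs inherit the https: scheme of the base and are accepted.
function isSafeUrl(value) {
  let parsed;
  try {
    parsed = new URL(value, BASE_URL);
  } catch {
    return false; // unparseable input is treated as unsafe
  }
  return ALLOWED_SCHEMES.has(parsed.protocol);
}

console.log(isSafeUrl('https://example.com'));      // true
console.log(isSafeUrl('/docs/intro'));               // true (relative)
console.log(isSafeUrl('javascript:alert(1)'));       // false
console.log(isSafeUrl(' JaVa\tScRiPt:alert(1)'));    // false
```

Note that this check applies to the URL value itself. The value must still be HTML-escaped when it is written into an attribute. Otherwise an input such as `javascript&colon;alert(1)` passes the check as a relative path, and the browser then decodes it into a `javascript:` URL.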

Code Snippets

Markdown syntax reference (CommonMark)
# Heading h1
## Heading h2
### Heading h3

Paragraph with **bold**, *italic*, and `inline code`.

- Unordered list item
- Another item
  - Sub-item with 2 spaces of indentation

1. Numbered list
2. Second item

[Link text](https://example.com)
![Image alt](path/image.png)

> Block quote

```javascript
// Code block with syntax highlighting
const x = 42;
```

---
<!-- Horizontal separator -->

marked.js + DOMPurify (safe browser sanitization)
import { marked } from 'marked';
import DOMPurify from 'dompurify';

// marked configuration
marked.setOptions({
  gfm: true,     // GitHub Flavored Markdown
  breaks: false, // Single line break does not become <br>
});

function markdownToSafeHtml(markdown) {
  // 1. Convert Markdown to HTML
  const rawHtml = marked.parse(markdown);

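  // 2. Sanitize HTML (removes scripts, javascript: URLs, onclick, etc.).
  //    Tags outside ALLOWED_TAGS are stripped but their text is kept, so
  //    extend the list (table elements, 'del') if you rely on GFM tables
  //    or strikethrough.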
  const safeHtml = DOMPurify.sanitize(rawHtml, {
    ALLOWED_TAGS: ['p', 'strong', 'em', 'code', 'pre', 'blockquote',
                   'ul', 'ol', 'li', 'a', 'h1', 'h2', 'h3', 'h4',
                   'img', 'hr', 'br'],
    ALLOWED_ATTR: ['href', 'src', 'alt', 'title', 'class'],
  });

  return safeHtml;
}

// Usage (inputText is the Markdown source string, e.g. a textarea value):
document.getElementById('content').innerHTML =
  markdownToSafeHtml(inputText);

Sample

# Guide

- step **one**
- [site](https://example.com)

FAQ

What is this tool for?

It converts simple Markdown into an HTML fragment and runs fully in your browser, which makes it useful for preparing docs and web content in everyday development.

Are my inputs sent to a server?

Processing happens locally with JavaScript. We do not store what you paste into the text areas.

Can I use this for real production data?

Use at your own risk. For secrets (passwords, tokens), prefer controlled environments and follow your company's policies. Always review the generated output, and do not blindly trust tools you find on the internet.