Remove accents

Strip combining marks for ASCII-ish output (ã→a, é→e). Runs locally.

Overview

Unicode, the standard that defines how computers represent text in virtually every language, was first published in 1991 by a consortium whose members included Apple, Microsoft, and IBM. Before Unicode, each vendor or region used mutually incompatible character tables: Brazil used ISO 8859-1 (Latin-1) to represent ã, é, ç and other Portuguese accented letters; Japan used Shift-JIS; Arabic-speaking countries had their own standards. Transferring a text file between systems with different encodings produced the dreaded mojibake: unreadable sequences such as Ã£ appearing where ã should be. Unicode solved this by giving every character a unique code point in the range U+0000 to U+10FFFF. The most common encodings today are UTF-8 (the web standard) and UTF-16 (used internally for strings by JavaScript and Java).
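The mojibake described above can be reproduced in a few lines. This is a minimal sketch, assuming a runtime that provides the standard `TextEncoder` (modern browsers and Node.js); the variable names are illustrative only.

```javascript
// Encode the single character ã (U+00E3) as UTF-8 bytes.
// TextEncoder always produces UTF-8.
const utf8Bytes = new TextEncoder().encode('\u00E3');
console.log(Array.from(utf8Bytes, (b) => b.toString(16)).join(' ')); // c3 a3

// Misread each byte as one Latin-1 character, which is what a
// Latin-1 system does when it receives UTF-8 input.
const misread = String.fromCharCode(...utf8Bytes);
console.log(misread); // Ã£
```

One character became two because UTF-8 stores ã in two bytes, while Latin-1 treats every byte as a separate character.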

Stripping accents, or normalizing to ASCII, solves a specific set of problems that did not disappear with Unicode. Legacy systems and older databases sometimes have columns indexed in ASCII without proper collation support, so a search with accents returns different results from a search without them. Human-friendly URLs typically convert accented titles to unaccented slugs: `Ação pré-venda` becomes `acao-pre-venda`. Registration systems need to compare proper names where one user typed `João` and another typed `Joao`; without normalization, these are different strings. CSV exports may feed systems that do not properly support UTF-8. Form validators may accept only `[a-z0-9]` through a simple regex. In these contexts, removing diacritics is the pragmatic solution.
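Two of these use cases, slugs and name comparison, can be sketched as follows. The helper names `stripAccents`, `slugify`, and `sameName` are hypothetical and are not part of this tool; the slug rules (lowercase, hyphen-separated) are an assumption for illustration.

```javascript
// Shared helper: NFD decomposition, then removal of nonspacing marks (Mn).
function stripAccents(text) {
  return text.normalize('NFD').replace(/\p{Mn}/gu, '');
}

// Hypothetical slug helper: lowercase ASCII words joined by hyphens.
function slugify(title) {
  return stripAccents(title)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse every other run into one hyphen
    .replace(/^-+|-+$/g, '');    // trim hyphens at the edges
}

// Hypothetical accent-insensitive and case-insensitive name comparison.
function sameName(a, b) {
  return stripAccents(a).toLowerCase() === stripAccents(b).toLowerCase();
}

console.log(slugify('Ação pré-venda')); // acao-pre-venda
console.log(sameName('João', 'Joao'));  // true
```

A comparison like `sameName` is best used for matching only; the original spelling should still be what gets stored and displayed.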

The standard approach is Unicode NFD (Canonical Decomposition) normalization: each precomposed character like `ã` is decomposed into two separate code points, the base `a` (U+0061) and the combining tilde (U+0303). After decomposition, filtering out all characters in Unicode category Mn (Mark, Nonspacing) removes the diacritics and leaves only the base letters. In JavaScript: `str.normalize('NFD').replace(/\p{Mn}/gu, '')`. This technique works well for Portuguese and Spanish text, with two caveats. First, letters that are distinct in their own right are flattened too: `ç` becomes `c` and `ñ` becomes `n`, which is fine for URLs but incorrect if you need to preserve text in languages such as French, Turkish, or Spanish, where these letters carry their own meaning. Second, letters with no canonical decomposition, such as `ø`, `ł`, and `ß`, pass through unchanged. This tool applies NFD normalization and removes combining marks; manually review cases where specific letters should be preserved.
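The decomposition and filtering steps can be sketched like this. `stripAccents` wraps the one-liner quoted above; `stripAccentsExcept` and its `keep` parameter are hypothetical additions showing one way to preserve selected letters, not something the tool itself exposes.

```javascript
// Remove diacritics: decompose with NFD, then drop nonspacing marks (Mn).
function stripAccents(text) {
  return text.normalize('NFD').replace(/\p{Mn}/gu, '');
}

// Hypothetical variant: leave selected precomposed letters (such as ç) intact.
function stripAccentsExcept(text, keep = 'çÇ') {
  const keepSet = new Set(keep.normalize('NFC'));
  let out = '';
  // Iterate over NFC text so each kept letter is a single code point.
  for (const ch of text.normalize('NFC')) {
    out += keepSet.has(ch) ? ch : stripAccents(ch);
  }
  return out;
}

// List the code points of a string as U+XXXX labels.
function codePoints(text) {
  return Array.from(text, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

console.log(codePoints('ã'.normalize('NFD')).join(' ')); // U+0061 U+0303
console.log(stripAccents('Ação pré-renovação'));         // Acao pre-renovacao
console.log(stripAccentsExcept('Ação façade'));          // Açao façade
```

The `u` flag is required for `\p{Mn}` to be recognized as a Unicode property escape; without it the pattern would not match combining marks.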

Technical deep dive

Common questions summarized

  • What is this tool for?: It strips accents and other diacritics from text, entirely in your browser. It is useful for preparing slugs, search keys, and ASCII-only exports in everyday development.
  • Are my inputs sent to a server?: No. Processing happens locally with JavaScript, and we do not store what you paste into the text areas.
  • Can I use this for real production data?: Use at your own risk. For secrets (passwords, tokens), prefer controlled environments and follow your company's policies. Always review the generated output, and do not blindly trust anything you find on the internet.

Sample payload to try

  • Paste this excerpt to try it locally (see also the larger "Code Snippets" sample): input `Ação pré-renovação`, expected output `Acao pre-renovacao`.

Tool guide

  • What diacritics are: Marks such as acute, tilde, or cedilla that change pronunciation or meaning.

  • What the tool does: Applies Unicode NFD and strips combining marks (e.g. á → a).

  • Why use it: Normalize text for simple search, legacy ASCII-ish pipelines, or slugs. Proofread when correct spelling matters (proper names).

Code Snippets

Code example
Ação pré-renovação → Acao pre-renovacao
