Remove extra spaces

Collapse runs of spaces or tabs to one; optional per-line trim.

Overview

Extra spaces appear in text for reasons we rarely control. When copying from a PDF, the extraction program frequently inserts multiple spaces where there was a larger typographic space or a tab. When pasting from an HTML page, the editor may preserve multiple spaces that the browser would automatically collapse during rendering. Files processed by OCR (optical character recognition) are especially prone to this: the algorithm sometimes separates words with two or three spaces because the letters were slightly apart in the scanned image.

What we call a space in digital text is actually an entire family of Unicode characters. The common space (U+0020) is what you type with the spacebar. The non-breaking space (U+00A0) is what some word processors insert automatically between a number and its unit of measure to prevent awkward line breaks. The thin space (U+2009) is used in typography to separate groups of digits in large numbers. Each of these has its own code point, and many trim and collapse tools handle only U+0020, so a stubborn space that will not go away may be one of the other types.
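To see which kind of space is actually present in a string, it helps to look at code points rather than at the rendered text. The following is a minimal sketch, not part of the tool: the function name `findUnusualSpaces` and the `SPACE_NAMES` table are hypothetical, and the table covers only a handful of the space variants, not the full Unicode set.

```javascript
// Illustrative subset of space-like code points (not a complete Unicode list).
const SPACE_NAMES = {
  0x00a0: 'non-breaking space',
  0x2009: 'thin space',
  0x3000: 'ideographic space',
  0xfeff: 'BOM / zero-width no-break space',
};

// Report each occurrence of a character listed in SPACE_NAMES.
function findUnusualSpaces(text) {
  const found = [];
  // Spreading a string iterates by code point, so `index` counts code points.
  [...text].forEach((ch, index) => {
    const code = ch.codePointAt(0);
    if (code in SPACE_NAMES) {
      found.push({
        index,
        codePoint: 'U+' + code.toString(16).toUpperCase().padStart(4, '0'),
        name: SPACE_NAMES[code],
      });
    }
  });
  return found;
}

// A string that looks ordinary but hides an NBSP and a thin space.
console.log(findUnusualSpaces('10\u00A0kg and 1\u2009000'));
```

Running this on text that refuses to collapse shows whether the culprit is something other than U+0020 or a tab.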

Collapsing spaces has unexpected consequences in some contexts. Python source code is indentation-sensitive, so collapsing tabs and spaces at the start of lines would break the program. YAML files and Makefiles also give whitespace structural meaning. For prose, normalization is generally safe; for code snippets that need to run, extra caution is required. This tool was designed for text, not for code.

Technical deep dive

Space characters that go unnoticed

  • U+0020 (common space): the spacebar space, the only one most trim and collapse tools recognize by default.
  • U+00A0 (non-breaking space): prevents line breaks between adjacent words. Invisible to the naked eye but causes surprises in string comparisons — '10 kg' with a non-breaking space is not equal to '10 kg' with a regular space.
  • U+2009 (thin space): used in typography to separate digit groups (1 000 000) and between a number and its SI unit symbol. PDFs generated by desktop publishing tools are full of this character.
  • U+3000 (ideographic space): has the width of a CJK character and appears in Asian documents. When copying Japanese or Chinese text containing this space, it can mix into the content.
  • U+FEFF (BOM / zero-width no-break space): technically not a space, but it appears invisibly at the start of UTF-8 files saved with a BOM and can cause hard-to-diagnose errors in JSON and XML parsers that do not expect it.
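The characters listed above can be mapped to the common space before collapsing. This is a minimal sketch under stated assumptions: `normalizeSpaces` is a hypothetical helper, not the tool's own code, and it covers only the code points named in this list. Replacing thin or non-breaking spaces discards typographic intent, so it suits comparison or cleanup rather than typeset output.

```javascript
// Map a few space variants to U+0020 and drop a leading BOM.
function normalizeSpaces(text) {
  return text
    .replace(/^\uFEFF/, '')                  // strip a BOM only at the very start
    .replace(/[\u00A0\u2009\u3000]/g, ' ');  // NBSP, thin, ideographic → common space
}

// Sample with a BOM, an NBSP, two ideographic spaces, and a thin space.
const sample = '\uFEFF10\u00A0kg\u3000\u30001\u2009000';

// Normalize first, then collapse runs of spaces or tabs.
const cleaned = normalizeSpaces(sample).replace(/[ \t]+/g, ' ');
console.log(JSON.stringify(cleaned));
```

In JavaScript the `\s` class already matches these characters, but it also matches line breaks, so an explicit character class is the safer choice when line structure must survive.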

When collapsing spaces can break things

  • Python, YAML, and Makefile: indentation has syntactic meaning. Collapsing indentation or converting tabs to spaces can change the program's behavior or break it outright; Makefile recipe lines, for example, must begin with a tab.
  • Markdown: two trailing spaces create a hard line break. Removing those spaces changes the rendered layout.
  • SQL strings: spaces inside SQL strings ('John Smith') are part of the data and should not be collapsed along with formatting whitespace.
  • Regular expressions: patterns like '  ' (two spaces) are literal, so collapsing spaces in the expression changes what it matches.
  • Space-delimited CSV: rare but real. In a space-delimited file, consecutive spaces can mark an empty field, so collapsing them can shift or merge columns.
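When the input mixes prose with indented content, one cautious approach is to collapse only the interior of each line. The sketch below is an illustration, not the tool's behavior: `collapseInterior` and its `keepMarkdownBreaks` option are hypothetical names. It leaves leading whitespace untouched and can keep a Markdown hard break, but it does not protect spaces inside string literals, regular expressions, or delimited data.

```javascript
// Collapse interior runs of spaces/tabs while leaving indentation untouched.
// Optionally keeps a Markdown hard break (two or more trailing spaces).
function collapseInterior(text, { keepMarkdownBreaks = true } = {}) {
  return text
    .split('\n')
    .map((line) => {
      const indent = line.match(/^[ \t]*/)[0];   // leading whitespace, kept as-is
      let body = line.slice(indent.length);
      // Remember whether the line ended with a Markdown hard break.
      const hardBreak = keepMarkdownBreaks && / {2,}$/.test(body);
      body = body
        .replace(/[ \t]+$/, '')                  // drop trailing whitespace
        .replace(/[ \t]+/g, ' ');                // collapse interior runs
      return indent + body + (hardBreak ? '  ' : '');
    })
    .join('\n');
}

console.log(JSON.stringify(collapseInterior('def f():\n    return   1')));
```

Keeping the indentation as a separate slice means tabs are never converted to spaces, which matters for formats where the two are not interchangeable.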

Tool guide

  • What you are working with: text with repeated spaces or tabs, common when copying from PDF or the web.
  • What the tool does: optionally collapses runs of spaces/tabs to a single space and can trim each line.
  • Why use it: normalize pasted paragraphs, prep space-separated data or CSV, improve readability without manual edits.

Code Snippets

Collapse spaces and tabs in JavaScript

// Collapse multiple spaces or tabs into one, then trim each line
const result = text
  .split('\n')
  .map(line => line.replace(/[ \t]+/g, ' ').trim())
  .join('\n');

Handle non-breaking spaces (U+00A0)

// Normalize non-breaking spaces before collapsing
const normalized = text
  .replace(/\u00A0/g, ' ')   // NBSP → common space
  .replace(/[ \t]+/g, ' ')   // collapse runs
  .trim();                   // trims the whole text, not each line

Before and after

a    b  	 c  → a b c

FAQ

What is this tool for?

It collapses runs of spaces or tabs to a single space and can optionally trim each line. It runs fully in your browser and is useful for cleaning up pasted text in everyday development.

Are my inputs sent to a server?

Processing happens locally with JavaScript. We do not store what you paste into the text areas.

Can I use this for real production data?

Use at your own risk. For secrets (passwords, tokens), prefer controlled environments and follow your company's policies. Always review the generated output, and do not blindly trust what you find on the internet.