How to Clean Messy Data in Excel
Transform chaotic spreadsheets into analysis-ready data. Learn essential data cleaning techniques including removing duplicates, fixing formatting, and standardizing entries.
## Removing Duplicates
### Quick Method: Remove Duplicates Tool
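1. Select your data range (including headers)
2. Go to Data > Remove Duplicates
3. Check "My data has headers" and choose the columns to compare
4. Click OK
Excel shows how many duplicates were removed and how many unique values remain.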
### Formula Method: Identify Before Deleting
Sometimes you want to see duplicates before removing them. Use COUNTIF:
=COUNTIF($A$2:$A$1000, A2) > 1
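This returns TRUE for every row whose value appears more than once in the range, including the first occurrence. Put it in a helper column, filter to show only TRUE, then decide what to delete.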
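If you ever need the same check outside Excel, the COUNTIF logic can be sketched in Python. This is a minimal illustration, not part of Excel: the function name `flag_duplicates` and the sample list are hypothetical, and values are lowercased because COUNTIF ignores case.

```python
from collections import Counter

def flag_duplicates(values):
    """Return True for each value that occurs more than once (case-insensitive)."""
    # Lowercase the keys so "A@x.com" and "a@x.com" count as the same entry,
    # mirroring COUNTIF's case-insensitive matching.
    keys = [str(v).lower() for v in values]
    counts = Counter(keys)
    return [counts[k] > 1 for k in keys]

# Hypothetical sample data
emails = ["a@x.com", "b@x.com", "A@x.com"]
flags = flag_duplicates(emails)
```

Like the formula, this flags every occurrence of a repeated value, so the first copy is marked too.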
## Fixing Text Issues
### Remove Extra Spaces: TRIM
=TRIM(A2)
The TRIM function removes leading spaces, trailing spaces, and extra spaces between words (leaving a single space).
### Standardize Capitalization
=UPPER(A2) → JOHN SMITH
=LOWER(A2) → john smith
=PROPER(A2) → John Smith
### Remove Non-Printable Characters: CLEAN
=CLEAN(A2)
Removes ASCII characters 0-31 (non-printable characters that sometimes sneak in from imports).
### The Ultimate Cleanup Combo
Combine TRIM and CLEAN:
=TRIM(CLEAN(A2))
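To make the behavior of the combo concrete, here is a rough Python approximation of what TRIM and CLEAN do to a string. It is a sketch under the assumptions stated in the comments; the helper names `clean`, `trim`, and `trim_clean` are made up for this example.

```python
import re

def clean(text):
    """Drop characters with codes 0-31, approximating Excel's CLEAN."""
    return "".join(ch for ch in text if ord(ch) > 31)

def trim(text):
    """Strip leading/trailing spaces and collapse repeated spaces, approximating TRIM."""
    # Only the ordinary space character is handled here; other
    # whitespace (such as non-breaking spaces) is left untouched.
    return re.sub(r" {2,}", " ", text.strip(" "))

def trim_clean(text):
    """Apply clean first, then trim, in the same order as =TRIM(CLEAN(A2))."""
    return trim(clean(text))

tidy = trim_clean("  John \x07  Smith\n ")
```

Running CLEAN first matters: removing a control character can leave two spaces side by side, which TRIM then collapses.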
## Extracting and Splitting Data
### Text to Columns
When names are "LastName, FirstName" and you need separate columns:
1. Select the column
2. Go to Data > Text to Columns
3. Choose Delimited
4. Select comma as delimiter
5. Finish
### Formula Extraction
Extract first name from "FirstName LastName":
=LEFT(A2, FIND(" ", A2) - 1)
Extract last name (everything after the first space):
=RIGHT(A2, LEN(A2) - FIND(" ", A2))
With TEXTSPLIT (Excel 365):
=TEXTSPLIT(A2, " ")
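The LEFT/RIGHT pair splits at the first space only, which matters for names with more than two words. A small Python sketch of the same rule shows the effect; `split_name` is a hypothetical helper, not an Excel feature.

```python
def split_name(full_name):
    """Split at the first space, like the LEFT/FIND and RIGHT/LEN formulas."""
    first, sep, rest = full_name.partition(" ")
    if not sep:
        # The Excel formulas return an error when FIND cannot locate a space;
        # raising here plays the same role.
        raise ValueError(f"No space found in {full_name!r}")
    return first, rest

simple = split_name("John Smith")
multi = split_name("Mary Ann Jones")
```

For "Mary Ann Jones" the first part is "Mary" and the remainder is "Ann Jones", so middle names end up in the last-name column unless you handle them separately.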
## Fixing Date Problems
### Dates Stored as Text
When dates look like dates but Excel treats them as text:
Method 1: Multiply by 1
=A2 * 1
Method 2: DATEVALUE
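=DATEVALUE(A2)
Methods 1 and 2 return a date serial number; apply a Date format to the result cells to display it as a date.
Method 3: Text to Columns
1. Select the date column
2. Data > Text to Columns
3. Next > Next
4. Choose Date format (MDY, DMY, etc.)
5. Finish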
### Inconsistent Date Formats
When you have "1/5/2024" and "January 5, 2024" mixed:
Create a helper column with:
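=TEXT(DATEVALUE(A2), "YYYY-MM-DD")
DATEVALUE expects text and interprets it using your regional date settings, so spot check a few results. The output of TEXT is itself text; drop the TEXT wrapper and apply a date format if you need real dates.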
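The same normalization idea, trying each known format until one parses, can be sketched in Python. The format list is an assumption: it covers only the two styles mentioned above and reads "1/5/2024" as month first. `to_iso` is a hypothetical name.

```python
from datetime import datetime

# Assumed input formats: month/day/year and "Month day, year".
FORMATS = ("%m/%d/%Y", "%B %d, %Y")

def to_iso(text):
    """Return the date as YYYY-MM-DD, trying each assumed format in turn."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # Not this format; try the next one
    raise ValueError(f"Unrecognized date: {text!r}")

a = to_iso("1/5/2024")
b = to_iso("January 5, 2024")
```

Anything that matches neither format raises an error instead of being guessed at, which is usually what you want when cleaning dates.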
## Find and Replace Magic
Ctrl + H opens Find and Replace. It's more powerful than most realize.
### Remove All Spaces
Find: (one space) Replace: (empty)
### Standardize Abbreviations
Find: Street Replace: St
Find: Avenue Replace: Ave
### Remove Line Breaks
Find: Ctrl + J (inserts line break character) Replace: (space or empty)
### Use Wildcards
Excel's Find and Replace recognizes wildcards by default:
- * matches any sequence of characters
- ? matches any single character
- ~ escapes special characters (~* finds a literal asterisk)
Find: Mr.*
## Handling Blank Cells
### Find and Select Blanks
1. Select your data range
2. Press F5 (or Ctrl+G) > Special
3. Choose "Blanks"
4. All blank cells are selected
Now you can fill, delete, or format all of them at once.
### Fill Blanks with Value Above
1. Select data range
2. Press F5 (or Ctrl+G) > Special > Blanks
3. Type = and click the cell above (or press Up Arrow)
4. Press Ctrl + Enter
## Number Issues
### Text to Numbers
When numbers have the green triangle warning:
1. Select the range
2. Click the warning icon
3. Choose "Convert to Number"
Or use formula:
=VALUE(A2)
### Remove Number Formatting Symbols
Strip currency symbols and thousands separators, then convert the result to a number:
=VALUE(SUBSTITUTE(SUBSTITUTE(A2, "$", ""), ",", ""))
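The nested SUBSTITUTE calls read inside out: remove the dollar sign, remove the commas, then convert. A minimal Python sketch of the same sequence follows; `to_number` is a hypothetical helper and assumes US-style formatting with "." as the decimal separator.

```python
def to_number(text):
    """Remove "$" and "," then convert to a float, like VALUE(SUBSTITUTE(SUBSTITUTE(...)))."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return float(cleaned)

amount = to_number("$1,234.50")
```

Other symbols (a different currency sign, a trailing percent) would need their own replacement step, just as they would need another SUBSTITUTE in the formula.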
## Power Query: The Professional Choice
For complex or recurring cleaning tasks, Power Query is the answer.
### Access Power Query
1. Select your data
2. Go to Data > From Table/Range
3. Power Query Editor opens
### Common Power Query Transformations
- Remove duplicates: Right-click column header > Remove Duplicates
### Why Power Query?
1. Repeatable: Steps are recorded and can be applied to new data
## Data Validation: Prevent Future Messes
After cleaning, prevent new errors:
### Dropdown Lists
1. Select target cells
2. Data > Data Validation
3. Allow: List
4. Source: Your list of valid options
### Number Ranges
1. Data > Data Validation
2. Allow: Whole number (or Decimal)
3. Set minimum and maximum
### Date Ranges
Restrict to valid date ranges to prevent typos.
## Automation with Macros
If you clean similar data repeatedly, record a macro:
1. Go to View > Macros > Record Macro
2. Perform your cleaning steps
3. Stop recording
4. Run the macro on new data
## Final Steps
After cleaning:
1. Spot check: Manually review a sample
2. Validate: Use COUNTBLANK, COUNTIF to verify
3. Document: Note what you did (for next time)
4. Backup: Keep original data in a separate sheet
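The validation step amounts to counting things that should no longer exist after cleaning. As a rough illustration, here are Python counterparts of COUNTBLANK and a plain-criterion COUNTIF; the function names and the sample column are hypothetical.

```python
def count_blank(values):
    """Count None or empty-string entries, roughly like COUNTBLANK."""
    return sum(1 for v in values if v is None or v == "")

def count_matching(values, target):
    """Count case-insensitive exact matches, roughly like COUNTIF(range, "text")."""
    wanted = str(target).lower()
    return sum(1 for v in values if v is not None and str(v).lower() == wanted)

# Hypothetical column with two blanks and inconsistent capitalization
cities = ["Boston", "", "boston", None, "NYC"]
blanks = count_blank(cities)
boston_rows = count_matching(cities, "Boston")
```

A nonzero blank count, or a count that differs from what you expect for a known value, is a signal to go back and review that column.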
## Conclusion
Data cleaning is where real analysis begins. A well-cleaned dataset reveals insights that messy data hides. Master these techniques, and you'll spend less time fighting your data and more time finding answers in it.
Start with the basics: TRIM, Remove Duplicates, Find & Replace. Graduate to Power Query for complex transformations. Your future self, and anyone else who uses your spreadsheets, will thank you.