Address Normalization Fundamentals
Address normalization converts addresses to a standard format for consistent storage, comparison, and matching. Without normalization, the same address can appear differently in your database.
Why Normalize?
The same physical location can be written many ways:
```
123 Main Street, Apartment 4B, New York, NY 10001
123 Main St., Apt. 4B, New York, NY 10001
123 MAIN ST APT 4B NEW YORK NY 10001
123 main street, apartment 4-b, new york, ny 10001
```
Without normalization:
- Duplicate detection fails
- Address matching is unreliable
- Database searches miss valid matches
- Shipping systems may treat these as different addresses
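To make the failure concrete, the sketch below compares the four spellings listed above using plain string equality. The variable names are illustrative only.

```javascript
// The four spellings of one address from the list above.
const variants = [
  '123 Main Street, Apartment 4B, New York, NY 10001',
  '123 Main St., Apt. 4B, New York, NY 10001',
  '123 MAIN ST APT 4B NEW YORK NY 10001',
  '123 main street, apartment 4-b, new york, ny 10001'
];

// A Set keeps one entry per distinct string, so exact-match
// deduplication sees four different addresses.
const distinctRaw = new Set(variants);
console.log(distinctRaw.size); // prints: 4

// Lowercasing alone does not help: abbreviations and punctuation still differ.
const distinctLower = new Set(variants.map((v) => v.toLowerCase()));
console.log(distinctLower.size); // prints: 4
```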
USPS Publication 28
The US Postal Service defines address standards in Publication 28. Key rules:
Casing
- All uppercase (preferred for mail)
- Or consistent title case for display
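If you store addresses in uppercase but want title case on screen, a small display helper may be enough. The following is a minimal sketch, not a USPS-defined algorithm: `toDisplayCase` and `KEEP_UPPER` are hypothetical names, and the list of tokens that stay uppercase is an assumption you would extend (for example, with all state codes).

```javascript
// Assumption: these tokens should stay uppercase on display
// (one state code and the compound directionals, for illustration).
const KEEP_UPPER = new Set(['NY', 'NE', 'NW', 'SE', 'SW']);

// Hypothetical helper: convert a space-separated address to title case.
function toDisplayCase(address) {
  return address
    .split(' ')
    .map((token) => {
      const upper = token.toUpperCase();
      // Leave listed tokens and anything starting with a digit (123, 4B, 10001) uppercase.
      if (KEEP_UPPER.has(upper) || /^\d/.test(upper)) return upper;
      // Capitalize the first letter and lowercase the rest.
      return upper.charAt(0) + upper.slice(1).toLowerCase();
    })
    .join(' ');
}

console.log(toDisplayCase('123 MAIN ST APT 4B NEW YORK NY 10001'));
// prints: 123 Main St Apt 4B New York NY 10001
```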
Street Type Abbreviations
| Full Name | Abbreviation |
|---|---|
| Avenue | AVE |
| Boulevard | BLVD |
| Circle | CIR |
| Court | CT |
| Drive | DR |
| Highway | HWY |
| Lane | LN |
| Parkway | PKWY |
| Place | PL |
| Road | RD |
| Square | SQ |
| Street | ST |
| Terrace | TER |
| Trail | TRL |
| Way | WAY |
Directional Abbreviations
| Full | Abbreviation |
|---|---|
| North | N |
| South | S |
| East | E |
| West | W |
| Northeast | NE |
| Northwest | NW |
| Southeast | SE |
| Southwest | SW |
Unit Designators
| Full | Abbreviation |
|---|---|
| Apartment | APT |
| Building | BLDG |
| Department | DEPT |
| Floor | FL |
| Room | RM |
| Suite | STE |
| Unit | UNIT |
Normalization Steps
Step 1: Case Normalization
```javascript
function normalizeCase(address) {
  return address.toUpperCase();
}

normalizeCase('123 Main Street');
// '123 MAIN STREET'
```
Step 2: Street Type Normalization
```javascript
const STREET_TYPES = {
  'STREET': 'ST',
  'AVENUE': 'AVE',
  'BOULEVARD': 'BLVD',
  'DRIVE': 'DR',
  'LANE': 'LN',
  'ROAD': 'RD',
  'COURT': 'CT',
  'CIRCLE': 'CIR',
  'PLACE': 'PL',
  'TERRACE': 'TER',
  'HIGHWAY': 'HWY',
  'PARKWAY': 'PKWY',
  'WAY': 'WAY'
};

function normalizeStreetType(street) {
  let normalized = street.toUpperCase();

  for (const [full, abbrev] of Object.entries(STREET_TYPES)) {
    // Match at word boundary
    const pattern = new RegExp(`\\b${full}\\b`, 'g');
    normalized = normalized.replace(pattern, abbrev);
  }

  return normalized;
}

normalizeStreetType('123 MAIN STREET');
// '123 MAIN ST'
```
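One limitation of replacing every whole-word match is that a street whose name is itself a street-type word gets abbreviated too: a global replace like the one above turns `123 COURT STREET` into `123 CT ST`. A possible refinement, sketched here under the assumption that the input is the street line alone (no unit or trailing directional), is to abbreviate only the last token. `abbreviateStreetSuffix` and `SUFFIXES` are hypothetical names, and the map is a subset of the table.

```javascript
// Subset of the street-type table, for illustration.
const SUFFIXES = new Map([
  ['STREET', 'ST'],
  ['AVENUE', 'AVE'],
  ['COURT', 'CT'],
  ['DRIVE', 'DR'],
  ['ROAD', 'RD']
]);

// Hypothetical helper: abbreviate only the final token of a street line.
// Assumption: nothing follows the street type (no unit, no post-directional).
function abbreviateStreetSuffix(streetLine) {
  const tokens = streetLine.toUpperCase().trim().split(/\s+/);
  const lastIndex = tokens.length - 1;
  const last = tokens[lastIndex];
  if (SUFFIXES.has(last)) {
    tokens[lastIndex] = SUFFIXES.get(last);
  }
  return tokens.join(' ');
}

console.log(abbreviateStreetSuffix('123 Court Street')); // prints: 123 COURT ST
console.log(abbreviateStreetSuffix('45 Park Avenue'));   // prints: 45 PARK AVE
```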
Step 3: Directional Normalization
```javascript
const DIRECTIONALS = {
  'NORTH': 'N',
  'SOUTH': 'S',
  'EAST': 'E',
  'WEST': 'W',
  'NORTHEAST': 'NE',
  'NORTHWEST': 'NW',
  'SOUTHEAST': 'SE',
  'SOUTHWEST': 'SW'
};

function normalizeDirectionals(street) {
  let normalized = street.toUpperCase();

  // Word boundaries (\b) keep NORTH from matching inside NORTHEAST,
  // so the order of the entries does not matter here.
  for (const [full, abbrev] of Object.entries(DIRECTIONALS)) {
    const pattern = new RegExp(`\\b${full}\\b`, 'g');
    normalized = normalized.replace(pattern, abbrev);
  }

  return normalized;
}

normalizeDirectionals('123 NORTH MAIN STREET');
// '123 N MAIN STREET'
```
Step 4: Unit Designator Normalization
```javascript
const UNIT_TYPES = {
  'APARTMENT': 'APT',
  'BUILDING': 'BLDG',
  'DEPARTMENT': 'DEPT',
  'FLOOR': 'FL',
  'ROOM': 'RM',
  'SUITE': 'STE',
  'UNIT': 'UNIT'
};

function normalizeUnit(unit) {
  if (!unit) return unit;

  let normalized = unit.toUpperCase();

  for (const [full, abbrev] of Object.entries(UNIT_TYPES)) {
    const pattern = new RegExp(`\\b${full}\\b`, 'g');
    normalized = normalized.replace(pattern, abbrev);
  }

  return normalized;
}

normalizeUnit('APARTMENT 4B');
// 'APT 4B'
```
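Once unit designators are abbreviated, the unit can be separated from the street line with a single pattern. This is a rough sketch that assumes the line is already uppercased and abbreviated and that the unit comes last; `splitUnit`, `UNIT_ABBREVS`, and `UNIT_PATTERN` are hypothetical names.

```javascript
// Abbreviated unit designators from the table above.
const UNIT_ABBREVS = ['APT', 'BLDG', 'DEPT', 'FL', 'RM', 'STE', 'UNIT'];

// Matches "<street part> <designator> <identifier>" at the end of the line.
const UNIT_PATTERN = new RegExp(
  `^(.*?)\\s+((?:${UNIT_ABBREVS.join('|')})\\s+\\S+)$`
);

// Hypothetical helper: split a normalized line into street and unit parts.
function splitUnit(line) {
  const match = UNIT_PATTERN.exec(line);
  if (!match) return { street: line, unit: null };
  return { street: match[1], unit: match[2] };
}

console.log(splitUnit('123 MAIN ST APT 4B'));
// street is '123 MAIN ST', unit is 'APT 4B'
console.log(splitUnit('123 MAIN ST'));
// street is '123 MAIN ST', unit is null
```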
Step 5: Punctuation Removal
```javascript
function removePunctuation(text) {
  // Remove periods, commas (but keep hyphens in unit numbers)
  return text.replace(/[.,]/g, '');
}

removePunctuation('123 MAIN ST., APT. 4B');
// '123 MAIN ST APT 4B'
```
Step 6: Whitespace Normalization
```javascript
function normalizeWhitespace(text) {
  return text.trim().replace(/\s+/g, ' ');
}

normalizeWhitespace('  123   MAIN  ST  ');
// '123 MAIN ST'
```
Complete Normalization Pipeline
```javascript
// Uses STREET_TYPES, DIRECTIONALS, and UNIT_TYPES from the steps above.
function normalizeAddress(address) {
  let normalized = address;

  // 1. Uppercase
  normalized = normalized.toUpperCase();

  // 2. Remove punctuation
  normalized = normalized.replace(/[.,]/g, '');

  // 3. Normalize whitespace
  normalized = normalized.trim().replace(/\s+/g, ' ');

  // 4. Abbreviate street types
  for (const [full, abbrev] of Object.entries(STREET_TYPES)) {
    normalized = normalized.replace(new RegExp(`\\b${full}\\b`, 'g'), abbrev);
  }

  // 5. Abbreviate directionals
  for (const [full, abbrev] of Object.entries(DIRECTIONALS)) {
    normalized = normalized.replace(new RegExp(`\\b${full}\\b`, 'g'), abbrev);
  }

  // 6. Abbreviate unit types
  for (const [full, abbrev] of Object.entries(UNIT_TYPES)) {
    normalized = normalized.replace(new RegExp(`\\b${full}\\b`, 'g'), abbrev);
  }

  return normalized;
}
```
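As a quick check, the four spellings from the start of this lesson can be run through the pipeline. The sketch below uses a condensed copy of the pipeline with a merged, partial rule map (`RULES` is a hypothetical name) so that it runs on its own.

```javascript
// Partial rule map, merged from the tables above for illustration.
const RULES = { STREET: 'ST', AVENUE: 'AVE', NORTH: 'N', APARTMENT: 'APT', SUITE: 'STE' };

// Condensed copy of the pipeline: uppercase, strip punctuation,
// collapse whitespace, then abbreviate whole-word matches.
function normalizeAddress(address) {
  let normalized = address
    .toUpperCase()
    .replace(/[.,]/g, '')
    .trim()
    .replace(/\s+/g, ' ');
  for (const [full, abbrev] of Object.entries(RULES)) {
    normalized = normalized.replace(new RegExp(`\\b${full}\\b`, 'g'), abbrev);
  }
  return normalized;
}

const spellings = [
  '123 Main Street, Apartment 4B, New York, NY 10001',
  '123 Main St., Apt. 4B, New York, NY 10001',
  '123 MAIN ST APT 4B NEW YORK NY 10001',
  '123 main street, apartment 4-b, new york, ny 10001'
];

const keys = spellings.map(normalizeAddress);
console.log(keys[0]); // prints: 123 MAIN ST APT 4B NEW YORK NY 10001
console.log(keys[3]); // prints: 123 MAIN ST APT 4-B NEW YORK NY 10001
console.log(new Set(keys).size); // prints: 2
```

The first three spellings collapse to one key. The fourth still differs because the pipeline keeps hyphens, so `4-B` and `4B` remain distinct; whether to strip hyphens from unit identifiers is a policy decision for your own data.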
International Considerations
Different countries have different conventions:
| Country | Casing | Street Type Position |
|---|---|---|
| US | Uppercase | After street name (Main ST) |
| UK | Title case common | After street name (High Street) |
| Germany | Title case | After street name, usually joined to it (Hauptstraße) |
| France | Uppercase | Before street name (RUE) |
For international addresses, apply only the steps that are safe everywhere, such as trimming and collapsing whitespace, and preserve each country's casing, word order, and street-type conventions.
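One way to act on this is to branch on the country before applying any US-specific rule. The following is a minimal sketch, assuming the caller supplies an ISO 3166-1 alpha-2 country code; `normalizeByCountry`, `collapseWhitespace`, and `US_RULES` are hypothetical names, and the rule map is only a subset of the USPS tables.

```javascript
// Subset of the USPS tables, for illustration only.
const US_RULES = { STREET: 'ST', AVENUE: 'AVE', APARTMENT: 'APT' };

function collapseWhitespace(text) {
  return text.trim().replace(/\s+/g, ' ');
}

// Hypothetical dispatcher. Assumption: `country` is an ISO 3166-1 alpha-2 code.
function normalizeByCountry(address, country) {
  if (country !== 'US') {
    // Country-neutral cleanup only. Uppercasing is skipped on purpose:
    // in JavaScript, 'Straße'.toUpperCase() returns 'STRASSE', which changes the spelling.
    return collapseWhitespace(address);
  }
  // US path: uppercase, strip periods and commas, then abbreviate whole words.
  let normalized = collapseWhitespace(address.toUpperCase().replace(/[.,]/g, ''));
  for (const [full, abbrev] of Object.entries(US_RULES)) {
    normalized = normalized.replace(new RegExp(`\\b${full}\\b`, 'g'), abbrev);
  }
  return normalized;
}

console.log(normalizeByCountry('  Hauptstraße   5 ', 'DE'));
// prints: Hauptstraße 5
console.log(normalizeByCountry('123 Main Street, Apartment 4B', 'US'));
// prints: 123 MAIN ST APT 4B
```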
Storage Strategy
Store both versions:
```javascript
{
  raw: "123 Main Street, Apt. 4B, New York, NY 10001",
  normalized: "123 MAIN ST APT 4B NEW YORK NY 10001",
  components: {
    street: "123 MAIN ST",
    unit: "APT 4B",
    city: "NEW YORK",
    state: "NY",
    zip: "10001"
  }
}
```
Why both?
- Raw for display to users
- Normalized for matching and deduplication
- Components for flexible querying
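To illustrate how the normalized form supports deduplication while the raw form is kept for display, here is a rough sketch using an in-memory `Map` keyed by the normalized string. `addAddress`, `byNormalized`, and the stand-in `normalize` function are hypothetical; a real system would use its full normalization pipeline and a database index instead.

```javascript
// Hypothetical in-memory index keyed by the normalized string.
const byNormalized = new Map();

// Minimal stand-in for a full normalizer so the example runs on its own:
// uppercase, strip periods and commas, collapse whitespace.
function normalize(address) {
  return address.toUpperCase().replace(/[.,]/g, '').trim().replace(/\s+/g, ' ');
}

// Store the raw and normalized forms together; report duplicates by normalized key.
function addAddress(raw) {
  const normalized = normalize(raw);
  if (byNormalized.has(normalized)) {
    return { record: byNormalized.get(normalized), duplicate: true };
  }
  const record = { raw, normalized };
  byNormalized.set(normalized, record);
  return { record, duplicate: false };
}

const first = addAddress('123 Main St., Apt. 4B, New York, NY 10001');
const second = addAddress('123 MAIN ST APT 4B NEW YORK NY 10001');

console.log(first.duplicate);   // prints: false
console.log(second.duplicate);  // prints: true
console.log(second.record.raw); // prints: 123 Main St., Apt. 4B, New York, NY 10001
```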
What's Next
In the workshop, you'll build an AddressNormalizer class that applies USPS rules and handles international formats.