Address Normalization Fundamentals

Address normalization converts addresses to a standard format for consistent storage, comparison, and matching. Without it, one physical address can be stored as several different strings that look like distinct records.

Why Normalize?

The same physical location can be written many ways:

123 Main Street, Apartment 4B, New York, NY 10001
123 Main St., Apt. 4B, New York, NY 10001
123 MAIN ST APT 4B NEW YORK NY 10001
123 main street, apartment 4-b, new york, ny 10001

Without normalization:

  • Duplicate detection fails
  • Address matching is unreliable
  • Database searches miss valid matches
  • Shipping systems may treat these as different addresses
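
A quick way to see the problem is to compare the four variants above as raw strings. The sketch below uses a Set as a stand-in for an exact-match duplicate check; the names `variants`, `distinctRaw`, and `distinctLower` are illustrative only.

```javascript
// The four spellings of the same address from the example above.
const variants = [
  '123 Main Street, Apartment 4B, New York, NY 10001',
  '123 Main St., Apt. 4B, New York, NY 10001',
  '123 MAIN ST APT 4B NEW YORK NY 10001',
  '123 main street, apartment 4-b, new york, ny 10001'
];

// A Set only collapses exact matches, so all four strings survive.
const distinctRaw = new Set(variants);
console.log(distinctRaw.size); // 4

// Ignoring case alone does not help: abbreviations and punctuation still differ.
const distinctLower = new Set(variants.map((v) => v.toLowerCase()));
console.log(distinctLower.size); // 4
```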

USPS Publication 28

The US Postal Service defines address standards in Publication 28. Key rules:

Casing

  • All uppercase (preferred for mail)
  • Or consistent title case for display

Street Type Abbreviations

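| Full Name | Abbreviation |
| --- | --- |
| Avenue | AVE |
| Boulevard | BLVD |
| Circle | CIR |
| Court | CT |
| Drive | DR |
| Highway | HWY |
| Lane | LN |
| Parkway | PKWY |
| Place | PL |
| Road | RD |
| Square | SQ |
| Street | ST |
| Terrace | TER |
| Trail | TRL |
| Way | WAY |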

Directional Abbreviations

| Full | Abbreviation |
| --- | --- |
| North | N |
| South | S |
| East | E |
| West | W |
| Northeast | NE |
| Northwest | NW |
| Southeast | SE |
| Southwest | SW |

Unit Designators

| Full | Abbreviation |
| --- | --- |
| Apartment | APT |
| Building | BLDG |
| Department | DEPT |
| Floor | FL |
| Room | RM |
| Suite | STE |
| Unit | UNIT |

Normalization Steps

Step 1: Case Normalization

```javascript
function normalizeCase(address) {
  return address.toUpperCase();
}

normalizeCase('123 Main Street');
// '123 MAIN STREET'
```

Step 2: Street Type Normalization

```javascript
const STREET_TYPES = {
  'STREET': 'ST',
  'AVENUE': 'AVE',
  'BOULEVARD': 'BLVD',
  'DRIVE': 'DR',
  'LANE': 'LN',
  'ROAD': 'RD',
  'COURT': 'CT',
  'CIRCLE': 'CIR',
  'PLACE': 'PL',
  'TERRACE': 'TER',
  'HIGHWAY': 'HWY',
  'PARKWAY': 'PKWY',
  'WAY': 'WAY'
};

function normalizeStreetType(street) {
  let normalized = street.toUpperCase();

  for (const [full, abbrev] of Object.entries(STREET_TYPES)) {
    // Match whole words only (word boundary on both sides)
    const pattern = new RegExp(`\\b${full}\\b`, 'g');
    normalized = normalized.replace(pattern, abbrev);
  }

  return normalized;
}

normalizeStreetType('123 MAIN STREET');
// '123 MAIN ST'
```
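
One caveat with replacing every match: a street named after a street type, such as "Court Street", would lose its name and become "CT ST". A possible refinement, sketched below, abbreviates only the final word. It assumes the input is the street line alone (no unit, city, or state) and that the street type is the last token. The helper name `abbreviateTrailingStreetType` and the reduced lookup table are hypothetical.

```javascript
// Reduced lookup table for illustration; extend it with the full list as needed.
const TRAILING_STREET_TYPES = new Map([
  ['STREET', 'ST'],
  ['AVENUE', 'AVE'],
  ['BOULEVARD', 'BLVD'],
  ['COURT', 'CT'],
  ['DRIVE', 'DR'],
  ['ROAD', 'RD']
]);

// Hypothetical helper: abbreviates only the last word of a street line.
// Assumes the line holds just number + name + type, e.g. '123 Court Street'.
function abbreviateTrailingStreetType(streetLine) {
  const tokens = streetLine.toUpperCase().trim().split(/\s+/);
  const lastIndex = tokens.length - 1;
  const abbrev = TRAILING_STREET_TYPES.get(tokens[lastIndex]);

  // Swap the final token only when it is a known street type.
  if (abbrev !== undefined) {
    tokens[lastIndex] = abbrev;
  }
  return tokens.join(' ');
}

console.log(abbreviateTrailingStreetType('123 Court Street')); // '123 COURT ST'
console.log(abbreviateTrailingStreetType('45 Park Avenue'));   // '45 PARK AVE'
```

The trade-off is that this version needs the street line to be separated from the rest of the address first, whereas the replace-everything approach works on an unparsed string.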

Step 3: Directional Normalization

```javascript
const DIRECTIONALS = {
  'NORTH': 'N',
  'SOUTH': 'S',
  'EAST': 'E',
  'WEST': 'W',
  'NORTHEAST': 'NE',
  'NORTHWEST': 'NW',
  'SOUTHEAST': 'SE',
  'SOUTHWEST': 'SW'
};

function normalizeDirectionals(street) {
  let normalized = street.toUpperCase();

  // \b matches whole words only, so NORTH never matches inside NORTHEAST
  // and the order of the entries does not matter here
  for (const [full, abbrev] of Object.entries(DIRECTIONALS)) {
    const pattern = new RegExp(`\\b${full}\\b`, 'g');
    normalized = normalized.replace(pattern, abbrev);
  }

  return normalized;
}

normalizeDirectionals('123 NORTH MAIN STREET');
// '123 N MAIN STREET'
```
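
An alternative to looping over the table is a single regular expression built from its keys. The sketch below sorts the keys longest-first so that compound directionals are listed before the single ones, then replaces everything in one pass. The names `DIRECTIONAL_ABBREVIATIONS`, `DIRECTIONAL_PATTERN`, and `abbreviateDirectionals` are hypothetical, and the table is repeated so the block runs on its own.

```javascript
const DIRECTIONAL_ABBREVIATIONS = {
  NORTH: 'N',
  SOUTH: 'S',
  EAST: 'E',
  WEST: 'W',
  NORTHEAST: 'NE',
  NORTHWEST: 'NW',
  SOUTHEAST: 'SE',
  SOUTHWEST: 'SW'
};

// Longest keys first, so NORTHEAST is tried before NORTH in the alternation.
const alternation = Object.keys(DIRECTIONAL_ABBREVIATIONS)
  .sort((a, b) => b.length - a.length)
  .join('|');

// One whole-word pattern covering every directional.
const DIRECTIONAL_PATTERN = new RegExp(`\\b(?:${alternation})\\b`, 'g');

function abbreviateDirectionals(street) {
  // The callback looks up the abbreviation for whichever word matched.
  return street
    .toUpperCase()
    .replace(DIRECTIONAL_PATTERN, (match) => DIRECTIONAL_ABBREVIATIONS[match]);
}

console.log(abbreviateDirectionals('500 Northeast Pine Road')); // '500 NE PINE ROAD'
console.log(abbreviateDirectionals('123 North Main Street'));   // '123 N MAIN STREET'
```

A single pass also means the output of one replacement is never scanned again by a later one.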

Step 4: Unit Designator Normalization

```javascript
const UNIT_TYPES = {
  'APARTMENT': 'APT',
  'BUILDING': 'BLDG',
  'DEPARTMENT': 'DEPT',
  'FLOOR': 'FL',
  'ROOM': 'RM',
  'SUITE': 'STE',
  'UNIT': 'UNIT'
};

function normalizeUnit(unit) {
  // Pass through null, undefined, or empty values unchanged
  if (!unit) return unit;

  let normalized = unit.toUpperCase();

  for (const [full, abbrev] of Object.entries(UNIT_TYPES)) {
    const pattern = new RegExp(`\\b${full}\\b`, 'g');
    normalized = normalized.replace(pattern, abbrev);
  }

  return normalized;
}

normalizeUnit('APARTMENT 4B');
// 'APT 4B'
```

Step 5: Punctuation Removal

```javascript
function removePunctuation(text) {
  // Remove periods and commas (but keep hyphens in unit numbers)
  return text.replace(/[.,]/g, '');
}

removePunctuation('123 MAIN ST., APT. 4B');
// '123 MAIN ST APT 4B'
```
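
Keeping hyphens means a unit written as 4-B will not match 4B after normalization. If that matters for matching, one option is an extra step applied to the unit value only. The sketch below reflects an assumed matching policy, not a USPS rule: `compactUnitId` is a hypothetical helper that drops a hyphen between a digit and a trailing letter and leaves digit-to-digit hyphens (as in hyphenated house numbers like 112-10) untouched.

```javascript
// Hypothetical helper: '4-B' -> '4B', but '112-10' stays as it is.
// Intended for the unit value only, not the whole address line.
function compactUnitId(unit) {
  // (\d)-([A-Z])\b : a digit, a hyphen, then a single letter that ends the word
  return unit.toUpperCase().replace(/(\d)-([A-Z])\b/g, '$1$2');
}

console.log(compactUnitId('apt 4-b')); // 'APT 4B'
console.log(compactUnitId('112-10'));  // '112-10'
```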

Step 6: Whitespace Normalization

```javascript
function normalizeWhitespace(text) {
  // Trim the ends, then collapse any run of whitespace into a single space
  return text.trim().replace(/\s+/g, ' ');
}

normalizeWhitespace('  123   MAIN  ST  ');
// '123 MAIN ST'
```

Complete Normalization Pipeline

```javascript
// Assumes STREET_TYPES, DIRECTIONALS, and UNIT_TYPES from the steps above are in scope
function normalizeAddress(address) {
  let normalized = address;

  // 1. Uppercase
  normalized = normalized.toUpperCase();

  // 2. Remove punctuation
  normalized = normalized.replace(/[.,]/g, '');

  // 3. Normalize whitespace
  normalized = normalized.trim().replace(/\s+/g, ' ');

  // 4. Abbreviate street types
  for (const [full, abbrev] of Object.entries(STREET_TYPES)) {
    normalized = normalized.replace(new RegExp(`\\b${full}\\b`, 'g'), abbrev);
  }

  // 5. Abbreviate directionals
  for (const [full, abbrev] of Object.entries(DIRECTIONALS)) {
    normalized = normalized.replace(new RegExp(`\\b${full}\\b`, 'g'), abbrev);
  }

  // 6. Abbreviate unit types
  for (const [full, abbrev] of Object.entries(UNIT_TYPES)) {
    normalized = normalized.replace(new RegExp(`\\b${full}\\b`, 'g'), abbrev);
  }

  return normalized;
}
```
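
To check the pipeline against the four variants from the start of the lesson, the sketch below repeats a reduced version of the lookup tables and a condensed `normalizeAddress` so that it runs on its own. The condensed loop over the three tables is a rewrite for brevity, not a different algorithm, and the `variants` and `results` names are illustrative.

```javascript
// Reduced tables: just enough entries for the sample addresses.
const STREET_TYPES = { STREET: 'ST', AVENUE: 'AVE', ROAD: 'RD' };
const DIRECTIONALS = { NORTH: 'N', SOUTH: 'S', EAST: 'E', WEST: 'W' };
const UNIT_TYPES = { APARTMENT: 'APT', SUITE: 'STE' };

function normalizeAddress(address) {
  // Uppercase, strip periods and commas, collapse whitespace.
  let normalized = address
    .toUpperCase()
    .replace(/[.,]/g, '')
    .trim()
    .replace(/\s+/g, ' ');

  // Apply each abbreviation table with whole-word matching.
  for (const table of [STREET_TYPES, DIRECTIONALS, UNIT_TYPES]) {
    for (const [full, abbrev] of Object.entries(table)) {
      normalized = normalized.replace(new RegExp(`\\b${full}\\b`, 'g'), abbrev);
    }
  }
  return normalized;
}

const variants = [
  '123 Main Street, Apartment 4B, New York, NY 10001',
  '123 Main St., Apt. 4B, New York, NY 10001',
  '123 MAIN ST APT 4B NEW YORK NY 10001',
  '123 main street, apartment 4-b, new york, ny 10001'
];

const results = variants.map(normalizeAddress);
console.log(results[0]);            // '123 MAIN ST APT 4B NEW YORK NY 10001'
console.log(results[3]);            // '123 MAIN ST APT 4-B NEW YORK NY 10001'
console.log(new Set(results).size); // 2
```

The first three variants collapse to the same string. The fourth still differs because the pipeline keeps the hyphen in "4-B", so whether to strip unit hyphens is a decision to make for your own matching rules.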

International Considerations

Different countries have different conventions:

| Country | Casing | Street Type Position |
| --- | --- | --- |
| US | Uppercase | After street name (MAIN ST) |
| UK | Title case common | After street name (High Street) |
| Germany | Title case | After street name, often joined to it (Hauptstraße) |
| France | Uppercase | Before street name (RUE DE LA PAIX) |

For international addresses, normalize what you can while preserving country-specific formatting.
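
One way to "normalize what you can" is to apply country-neutral cleanup everywhere and country-specific rules only where they are known. The sketch below is an assumed design, not a standard API: `COUNTRY_NORMALIZERS` and `normalizeByCountry` are hypothetical names, the keys are ISO 3166-1 alpha-2 country codes, and only a simplified US rule is filled in.

```javascript
// Hypothetical registry: country code -> normalizer for that country's rules.
// Only a simplified US rule is sketched; other countries would add entries.
const COUNTRY_NORMALIZERS = new Map([
  ['US', (line) => line.toUpperCase().replace(/[.,]/g, '')]
]);

function normalizeByCountry(line, countryCode) {
  // Whitespace cleanup does not depend on the country.
  const cleaned = line.trim().replace(/\s+/g, ' ');
  const normalizer = COUNTRY_NORMALIZERS.get(countryCode);

  // Unknown country: keep the cleaned line, leaving case and wording alone.
  return normalizer ? normalizer(cleaned) : cleaned;
}

console.log(normalizeByCountry('123 Main St., Apt. 4B', 'US')); // '123 MAIN ST APT 4B'
console.log(normalizeByCountry('  Hauptstraße   5 ', 'DE'));    // 'Hauptstraße 5'
```

Leaving unknown countries untouched avoids surprises such as JavaScript's `toUpperCase()` turning "ß" into "SS".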

Storage Strategy

Store the raw input alongside the normalized string and the parsed components:

```javascript
// Example record; the variable name is illustrative
const addressRecord = {
  raw: "123 Main Street, Apt. 4B, New York, NY 10001",
  normalized: "123 MAIN ST APT 4B NEW YORK NY 10001",
  components: {
    street: "123 MAIN ST",
    unit: "APT 4B",
    city: "NEW YORK",
    state: "NY",
    zip: "10001"
  }
};
```

Why all three?

  • Raw for display to users
  • Normalized for matching and deduplication
  • Components for flexible querying
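
As a minimal illustration of using the normalized form for deduplication, the sketch below keeps an in-memory index keyed by the normalized string. The names `addressIndex` and `addAddress` are hypothetical, and a real system would more likely use a unique database column than a Map.

```javascript
// Hypothetical in-memory index: normalized string -> first record seen.
const addressIndex = new Map();

function addAddress(record) {
  const existing = addressIndex.get(record.normalized);
  if (existing) {
    // Same normalized form already stored: report the earlier record.
    return { record: existing, isDuplicate: true };
  }
  addressIndex.set(record.normalized, record);
  return { record, isDuplicate: false };
}

const first = addAddress({
  raw: '123 Main Street, Apt. 4B, New York, NY 10001',
  normalized: '123 MAIN ST APT 4B NEW YORK NY 10001'
});
const second = addAddress({
  raw: '123 MAIN ST APT 4B NEW YORK NY 10001',
  normalized: '123 MAIN ST APT 4B NEW YORK NY 10001'
});

console.log(first.isDuplicate);  // false
console.log(second.isDuplicate); // true
console.log(second.record.raw);  // '123 Main Street, Apt. 4B, New York, NY 10001'
```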

What's Next

In the workshop, you'll build an AddressNormalizer class that applies USPS rules and handles international formats.