Postcode Structure Validation


Validation versus Verification

We do not verify that a postcode is correct for its address. The only check our system performs is that the postcode conforms to the structure expected for its country. So while DM will accept "ZZ0 0ZZ" as a postcode in GBR (its structure is correct), it is not a real postcode according to Royal Mail, because no address has that postcode.

We have partners that can provide some degree of address correction, and that functionality is available through the verifyAddresses API call in DM API 3.x.

 

Problem

Postcodes (or "zipcodes" to Americans) are used by many countries, but not all. Each country has adopted its own standard, and this can sometimes be awkward. Some countries, it seems, have gone for a half-and-half option. For example, Ireland for many years had postal district codes within Dublin, but nowhere else (a national system, Eircode, was only introduced in 2015).

Consider American zipcodes for a moment. There are actually two forms: the familiar 5-digit form ("Zip") and another with an additional 4 digits tagged onto the end ("Zip+4"). Moreover, people separate the two groups of digits differently: some use "-" and some use "/". Alarmingly, some people even include the state code as part of the zipcode!

The point is this: you can't trust the user (or customer) to enter a postcode in a precise format.

For more information about postal codes, see the Wikipedia article on the subject, which also describes the structures used by different countries.

 

Solution

We apply Postel's Law to the postcodes you provide: be liberal in what you accept. This means we do everything we can to work out what was meant. We do this by removing any characters that couldn't possibly be part of a valid postcode, and then examining what is left.

For example, for the USA, we know that zipcodes are made up solely of digits. Given that fact, we strip out dots, letters, slashes, dashes, spaces and so on to arrive at a series of digits. If we have 5 or 9 (5+4) digits, we use those as the zipcode. If we get any other number of digits, we report the invalid postcode error (E20006 ERROR_INVALID_POSTCODE_FORMAT). This simple rule means that all of the following can be successfully validated:

90210
Beverly hills 90210
90210 CA.
9/...0//2----1&*&*&0
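The stripping rule for the USA could be sketched in Python roughly as follows. This is an illustration only, not DM's actual code: the function name normalise_us_zip and the exception class InvalidPostcodeFormat are hypothetical, and writing a 9-digit result as "12345-6789" is an assumption about the output format.

```python
import re


class InvalidPostcodeFormat(ValueError):
    """Hypothetical stand-in for error E20006 ERROR_INVALID_POSTCODE_FORMAT."""


def normalise_us_zip(raw: str) -> str:
    """Strip everything except ASCII digits, then accept only 5 or 9 digits."""
    digits = re.sub(r"[^0-9]", "", raw)
    if len(digits) == 5:
        return digits
    if len(digits) == 9:
        # Assumption: Zip+4 is reported with a hyphen between the two groups.
        return f"{digits[:5]}-{digits[5:]}"
    raise InvalidPostcodeFormat(f"expected 5 or 9 digits, got {len(digits)}")


if __name__ == "__main__":
    for raw in ["90210", "Beverly hills 90210", "90210 CA.", "9/...0//2----1&*&*&0"]:
        print(repr(raw), "->", normalise_us_zip(raw))
```

Because only the digit count is checked, an input that happens to contain other digits (a house number, say) would be rejected or misread, which is the price of such a tolerant rule.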

When the same methods are applied to UK postcodes (which consist of letters and numbers), you get equally tolerant behaviour:

EC1R 4PF
EC 1R 4 PF
ec1r4pf
e.c.1.r.4.p.f

All of those are regarded as valid. But once the non-alphanumeric characters are stripped out of the next lot, we cannot form a valid UK postcode:

ECR 4PF
EC1RX4PF

Whenever we report a postcode back to you (either on the screen or in the API), we will produce it in the most commonly accepted and unambiguous format.

You give EC1R-4 p f
You get EC1R 4PF
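
The UK behaviour described above (strip, check the structure, report in a standard format) could be sketched in Python along these lines. It is a minimal illustration under stated assumptions, not DM's implementation: the function name, the exception class, and the pattern itself are hypothetical. The pattern is a deliberately loose structural check (one or two letters, a digit, an optional letter or digit, then a digit and two letters) and does not cover special cases such as "GIR 0AA".

```python
import re


class InvalidPostcodeFormat(ValueError):
    """Hypothetical stand-in for error E20006 ERROR_INVALID_POSTCODE_FORMAT."""


# Assumed structure: outward code (2 to 4 characters) followed by the
# inward code, which is always one digit plus two letters.
UK_STRUCTURE = re.compile(r"[A-Z]{1,2}[0-9][A-Z0-9]?[0-9][A-Z]{2}")


def normalise_uk_postcode(raw: str) -> str:
    """Strip non-alphanumerics, check the structure, return 'OUT INW' form."""
    compact = re.sub(r"[^A-Za-z0-9]", "", raw).upper()
    if not UK_STRUCTURE.fullmatch(compact):
        raise InvalidPostcodeFormat(f"not a UK postcode structure: {raw!r}")
    # The inward code is the last three characters; put one space before it.
    return f"{compact[:-3]} {compact[-3:]}"


if __name__ == "__main__":
    for raw in ["EC1R 4PF", "EC 1R 4 PF", "ec1r4pf", "e.c.1.r.4.p.f", "EC1R-4 p f"]:
        print(repr(raw), "->", normalise_uk_postcode(raw))
```

Under this sketch "ZZ0 0ZZ" passes, since only the structure is checked, while "ECR 4PF" and "EC1RX4PF" are rejected.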