
@colinyoung
Created June 22, 2011 06:29
Parses a postal address into its components, with (limited) i18n support and other features. Inspired by this Forrst post: http://forr.st/~zDa
#
# Most of the regexes here are from http://twitter.com/schuyler.
# They were posted on Forrst (http://forr.st/~zDa) as a public post,
# so I reworked them into a full class.
#
# Added:
# => (limited) international support -- should handle US/Canada and most of Western Europe
# => :recipient, :postcode, and :country fields
# => built-in state/province detection
require 'yaml'
class PostalAddress
  attr_accessor :addr

  # Each entry is either a bare regex or a hash holding a :regex plus an
  # optional :subexpression (capture group to return) or :block (post-processing).
  Match = {
    :number => /^(\d+\W|[a-z]+)?(\d+)([a-z]?)\b/io,
    :unit => /\W?(?:Ste|Suite|#|Unit|Apt|Apartment)\s?[0-9]+/,
    :street =>
      {
        :regex => /(?:\b(?:\d+\w*|[a-z'-]+)\s*)+/io,
        # Pick the first line that starts with a digit; fall back to the whole
        # match instead of calling strip on nil when no such line exists.
        :block => Proc.new {|v| (v.lines.find {|l| l.match(/\A[0-9]/) } || v).strip }
      },
    :city =>
      {
        :regex => /(?:\b(?:\d+\w*|[a-z'-]+)\s*)+/io,
        :block => Proc.new {|v| v.lines.last }
      },
    :state =>
      {
        :regex => /\W(A[BLKSZRAEP]|BC|C[AOT]|D[EC]|F[LM]|G[ANU]|HI|I[ADLN]|K[SY]|LA|M[ABDEHINOPST]|N[BCDEHJLMSUVY]|O[HKNR]|P[AERW]|QC|RI|S[CDK]|T[NX]|UT|V[AIT]|W[AIVY]|YT)\s/i,
        :subexpression => 1
      },
    :zip => /(\d{5})(?:-\d{4})?\s*$/o,
    :postcode => /(?:(?:[A-Z0-9]{3,}) (?:[A-Z0-9]{3,})\s*$|(\d{5})(?:-\d{4})?\s*$)/o,
    :at => /\s(at|@|and|&)\s/io,
    # "|" inside a character class is a literal pipe, so [P|p] became [Pp], etc.
    :po_box => /\b[Pp]*(OST|ost)*\.*\s*[Oo0]*(ffice|FFICE)*\.*\s*[Bb][Oo0][Xx]\b/,
    :recipient => /\A[a-zA-Z'\s]+$/iox,
    :country => /^[A-Z\s]+\z/io
  }
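  def initialize(addr)
    @addr = addr
  end

  # Guess the country from the detected state/province or postcode.
  def country
    return "Canada" if /AB|BC|MB|N[BLTSU]|ON|PE|QC|SK|YT/.match(self.state.to_s)
    return "USA" if Match[:state][:regex].match(@addr)
    # postcode returns "" when nothing matches, and "" is truthy in Ruby,
    # so test for emptiness explicitly.
    return "United Kingdom" unless self.postcode.to_s.empty?
    self.class.send(:country, @addr)
  end

  # Instance-level dispatch: addr.zip, addr.city, ... for every key in Match.
  def method_missing(name, *args)
    return super unless Match.key?(name)
    self.class.method_missing(name, @addr)
  end

  def respond_to_missing?(name, include_private = false)
    Match.key?(name) || super
  end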
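  # Class-level dispatch: PostalAddress.zip(addr), PostalAddress.city(addr), ...
  def self.method_missing(name, *args)
    return super unless Match.key?(name) # unknown names raise NoMethodError as usual
    addr = args.first.to_s
    m = Match[name]
    if m.is_a?(Hash)
      matched = m[:regex].match(addr)
      if matched
        result = matched[m[:subexpression]].to_s.strip if m[:subexpression]
        if m[:block]
          result ||= matched.to_s.strip
          result = m[:block].call(result)
        end
        return result unless result.nil?
      end
      m = m[:regex] # Fall through to a plain regex match
    end
    result = m.match(addr).to_s.strip
    # Note: numeric strings become Integers, so a ZIP code with a leading
    # zero ("02134") loses that zero here.
    result = result.to_i if result.is_numeric?
    result
  end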
  def self.keys
    Match.keys
  end

  def to_hash
    h = {}
    self.class.keys.each do |part|
      h[part.to_s] = self.class.send(part, @addr)
    end
    h
  end

  def to_yaml
    to_hash.to_yaml
  end
end
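# NOTE: these monkey-patch String globally. String#lines replaces Ruby's
# built-in version, which keeps the trailing newline on each line.
class String
  def is_numeric? # "222" or "4.5" counts
    !match(/\A[+-]?\d+?(\.\d+)?\Z/).nil?
  end

  def lines
    split("\n")
  end
end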
puts "US Address"
addr = "Colin Young
222 N Ashland Suite 300
Chicago, IL 60657"
puts PostalAddress.new(addr).to_yaml
puts "Detected Country: #{PostalAddress.new(addr).country}"
puts "\n\nCanadian Address"
addr = "230 Riot Boulevard
Vancouver, BC V5Y 1K6"
puts PostalAddress.new(addr).to_yaml
puts "Detected Country: #{PostalAddress.new(addr).country}"
puts "\n\nUK Address"
addr = "Colin Young
23 Herefordshire Ln
LONDON
SW1 P01"
puts PostalAddress.new(addr).to_yaml
puts "Detected Country: #{PostalAddress.new(addr).country}"
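
The country detection above can also be sketched as a standalone function, without method_missing or the String patches. This is only a rough illustration, assuming the last line of the address carries the state/province code or postcode. The name guess_country and the three constants are hypothetical and not part of the class; the patterns are simplified from Match, so they share its limits (for example, UK postcodes with a two-character first part are not recognised).

```ruby
# Hypothetical standalone helper, not part of PostalAddress.
# Patterns are simplified from the Match table in the gist.
CA_PROVINCE = /\b(?:AB|BC|MB|N[BLTSU]|ON|PE|QC|SK|YT)\b/
US_ZIP      = /\b\d{5}(?:-\d{4})?\s*\z/
UK_POSTCODE = /\b[A-Z0-9]{3,} [A-Z0-9]{3,}\s*\z/

# Returns a country name, or nil when none of the patterns match.
def guess_country(address)
  # Only the last line is inspected; an empty address yields "".
  last_line = address.split("\n").last.to_s
  return "Canada" if last_line.match?(CA_PROVINCE)
  return "USA" if last_line.match?(US_ZIP)
  return "United Kingdom" if last_line.match?(UK_POSTCODE)
  nil
end

puts guess_country("230 Riot Boulevard\nVancouver, BC V5Y 1K6")
```

Returning nil for "no match" keeps the caller from tripping over Ruby's truthy empty string, which is easy to do with the string-returning accessors in the class.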
@mager
mager commented Jul 1, 2011

Awesome

