Yesterday I was working with a colleague at Imbibe on a mapping platform. It sent lookup requests to the Google Maps Geocoding API and processed the results with an overlap approximation to figure out whether the resulting areas were serviceable by the platform’s partners (we had each partner’s serviceable radius around the partner’s location and needed to determine whether a user’s location fell inside a partner’s serviceable polygon). As we worked on the logic, it turned out that some partners actually served geographical areas (e.g. an entire city or a postal code), which were not convenient to express as a serviceable polygon. Instead, we needed to check whether the location returned by the Geocoding API matched the partner’s city, postal code, state/province etc. to decide whether the location was serviceable by that partner.
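To make the radius case concrete, here is a minimal sketch of a "is the user within a partner’s serviceable radius" check using the haversine great-circle distance. It is not the platform’s actual code: the `distanceKm` and `isWithinServiceRadius` names, the partner object shape and the sample coordinates are all assumptions made for illustration.

```javascript
// Great-circle distance in kilometres between two {lat, lng} points (haversine formula).
function distanceKm(a, b) {
    var R = 6371; // mean Earth radius in km
    var toRad = function (deg) { return deg * Math.PI / 180; };
    var dLat = toRad(b.lat - a.lat);
    var dLng = toRad(b.lng - a.lng);
    var h = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
            Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) *
            Math.sin(dLng / 2) * Math.sin(dLng / 2);
    return 2 * R * Math.asin(Math.sqrt(h));
}

// Hypothetical partner shape: { location: {lat, lng}, radiusKm: number }
function isWithinServiceRadius(partner, userLocation) {
    return distanceKm(partner.location, userLocation) <= partner.radiusKm;
}

// Approximate coordinates, for illustration only.
var partner = { location: { lat: 29.6857, lng: 76.9905 }, radiusKm: 10 }; // roughly Karnal
var nearby = isWithinServiceRadius(partner, { lat: 29.70, lng: 77.00 });      // true
var farAway = isWithinServiceRadius(partner, { lat: 28.6139, lng: 77.2090 }); // false (roughly New Delhi)
```

A plain distance check like this only covers the circular case; partners that serve a whole city or postal code need the name-based matching described in this post.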
I expected the Geocoding API response to give me address components (e.g. street address, city etc.). However, Google’s representation of these (as `locality` / `administrative_area` etc.), although it makes sense from the API’s perspective, which has to represent the hugely diverse nomenclature used for geographical limits, made it a bit difficult for us to accurately write our logic for country vs. province vs. city vs. postal code vs. a polygon overlap search.
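For reference, this is roughly what reading those typed components looks like. The `address_components`, `long_name`, `short_name` and `types` fields are part of the Geocoding API response; the `componentByType` helper is a hypothetical name, and the sample object is an abridged, hand-written illustration rather than real API output.

```javascript
// Returns the long_name of the first component carrying the given type, or "".
function componentByType(components, type) {
    for (var i = 0; i < components.length; i++) {
        if (components[i].types.indexOf(type) != -1) {
            return components[i].long_name;
        }
    }
    return "";
}

// Abridged, hand-written sample in the shape of one geocoding result (not real API output).
var result = {
    formatted_address: "Karnal, Haryana 132001, India",
    address_components: [
        { long_name: "Karnal", short_name: "Karnal", types: ["locality", "political"] },
        { long_name: "Haryana", short_name: "HR", types: ["administrative_area_level_1", "political"] },
        { long_name: "India", short_name: "IN", types: ["country", "political"] },
        { long_name: "132001", short_name: "132001", types: ["postal_code"] }
    ]
};

var city = componentByType(result.address_components, "locality");                        // "Karnal"
var province = componentByType(result.address_components, "administrative_area_level_1"); // "Haryana"
var sublocality = componentByType(result.address_components, "sublocality");              // "" (not present)
```

The catch is that which type plays the role of "city" or "province" varies between countries and results, which is what made this route awkward for us.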
What we really needed was a simple parser for the full address (returned as `formatted_address` by the geocoder) that split it up into `street address` / `city` / `province` / `postal code` and `country`. How difficult could it be, I thought; there must be another Google API or a third-party API providing such parsing. To my surprise, I found none from Google and only expensive paid APIs from third parties.
So we took it upon ourselves to write such a parser. In a very brief coding session, we came up with the following, which worked for almost all of our use cases:
```javascript
// Returns the lowest index at which any string in arr occurs in s (from begin), or -1.
String.indexOfAny = function (s, arr, begin) {
    var minIndex = -1;
    for (var i = 0; i < arr.length; i++) {
        var index = s.indexOf(arr[i], begin);
        if (index != -1) {
            if (minIndex == -1 || index < minIndex) {
                minIndex = index;
            }
        }
    }
    return (minIndex);
}

// Splits s on any of the single-character separators in arr.
String.splitByAny = function (s, arr) {
    var parts = [];
    var index;
    do {
        index = String.indexOfAny(s, arr);
        if (index != -1) {
            parts.push(s.substr(0, index));
            s = s.substr(index + 1);
        } else {
            parts.push(s);
        }
    } while (index != -1);
    return (parts);
}

function parseAddress(address) {
    var obj = { address: "", city: "", province: "", postalCode: "", country: "" };

    if (!address) {
        return (obj);
    }

    // Split on commas and trim each part.
    var parts = address.split(',');
    for (var i = 0; i < parts.length; i++) {
        parts[i] = parts[i].trim();
    }
    var i = parts.length - 1;

    // Postal codes are assumed to be all digits.
    var fnIsPostalCode = function (value) {
        return (/^\d+$/.test(value));
    }

    // Looks for a postal code inside a part such as "Haryana 132001".
    // Stores the code in obj.postalCode and returns the text before it,
    // or the whole part if it holds no postal code.
    var fnParsePostalCode = function (value) {
        var subParts = String.splitByAny(value, [' ', '-']);
        for (var j = 0; j < subParts.length; j++) {
            if (fnIsPostalCode(subParts[j].trim())) {
                obj.postalCode = subParts[j].trim();
                if (j > 0) {
                    // Keep everything before the code, so multi-word names survive.
                    return (subParts.slice(0, j).join(' ').trim());
                }
            }
        }
        return (value);
    }

    // Last part: country.
    if (i >= 0) {
        if (fnIsPostalCode(parts[i])) { obj.postalCode = parts[i]; i--; }
        // i can drop below 0 here if the only part left was a postal code.
        var part = (i >= 0) ? fnParsePostalCode(parts[i]) : "";
        if (part) {
            obj.country = part;
        }
        i--;
    }

    // Next part: province.
    if (i >= 0) {
        if (fnIsPostalCode(parts[i])) { obj.postalCode = parts[i]; i--; }
        var part = (i >= 0) ? fnParsePostalCode(parts[i]) : "";
        if (part) {
            obj.province = part;
        }
        i--;
    }

    // Next part: city.
    if (i >= 0) {
        if (fnIsPostalCode(parts[i])) { obj.postalCode = parts[i]; i--; }
        var part = (i >= 0) ? fnParsePostalCode(parts[i]) : "";
        if (part) {
            obj.city = part;
        }
        i--;
    }

    // Whatever remains is the street address.
    if (i >= 0) {
        parts = parts.slice(0, i + 1);
        obj.address = parts.join(', ');
    }

    return (obj);
}
```
Here’s a sample invocation of the method and the result returned:
```javascript
var address = parseAddress('SCF - 96, Main Market Road, Sector 6, Karnal, Haryana 132001, India')

//address contains the following json:
/*{
    "address":"SCF - 96, Main Market Road, Sector 6",
    "city":"Karnal",
    "province":"Haryana",
    "postalCode":"132001",
    "country":"India"
}*/
```
The method works for partial addresses too. Here are some examples:
```javascript
parseAddress('New York, NY, USA'); //{"address":"", "city":"New York", "province":"NY", "postalCode":"", "country":"USA"}
parseAddress('New York, NY, USA 10021'); //{"address":"", "city":"New York", "province":"NY", "postalCode":"10021", "country":"USA"}
parseAddress('USA'); //{"address":"", "city":"", "province":"", "postalCode":"", "country":"USA"}
```
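Once an address is split into fields like these, checking it against the area a partner serves becomes a plain comparison. The following is only a sketch of that idea: the `serviceArea` shape and the `servesParsedAddress` name are assumptions for illustration, not part of our platform’s code.

```javascript
// Hypothetical partner shape:
// { serviceArea: { type: "city" | "province" | "postalCode" | "country", value: string } }
// parsed is an object in the shape returned by an address parser such as the one above.
function servesParsedAddress(partner, parsed) {
    var area = partner.serviceArea;
    var actual = parsed[area.type] || "";
    // Case-insensitive match; an empty parsed field never matches.
    return actual !== "" && actual.toLowerCase() === area.value.toLowerCase();
}

var parsed = { address: "", city: "New York", province: "NY", postalCode: "10021", country: "USA" };

var cityPartner = { serviceArea: { type: "city", value: "new york" } };
var zipPartner = { serviceArea: { type: "postalCode", value: "10001" } };

var servedByCity = servesParsedAddress(cityPartner, parsed); // true
var servedByZip = servesParsedAddress(zipPartner, parsed);   // false
```

An exact string match is the simplest possible rule; real data may need normalisation (e.g. "NY" vs. "New York" for a province) before comparing.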
NOTE: The method here assumes that postal codes are all numeric (which is true for many but not all countries). Our target audience had only numeric postal codes, so the method worked fine for our needs. However, if your audience resides in a country like Canada where postal codes are alphanumeric, you would need to change the regex in the `fnIsPostalCode` method (and, where a code itself contains a space, as Canadian codes do, how each part is split on spaces); the rest of the logic should still work fine for your needs.
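As a rough illustration for the Canadian case: codes there follow the pattern letter-digit-letter, an optional space, then digit-letter-digit (e.g. "M5V 2T6"). Because the code can contain a space, testing each space-separated sub-part on its own would not recognise it, so the sketch below matches the code at the end of the whole part instead. The `splitTrailingPostalCode` name and its return shape are assumptions for illustration, and the regex checks the shape only, not whether the code actually exists.

```javascript
// Shape of a Canadian postal code at the end of a string:
// letter-digit-letter, optional space or hyphen, digit-letter-digit.
var CANADIAN_POSTAL_CODE = /\b[A-Za-z]\d[A-Za-z][ -]?\d[A-Za-z]\d$/;

// Splits a part such as "ON M5V 2T6" into the leading text and the trailing postal code.
function splitTrailingPostalCode(part) {
    var match = CANADIAN_POSTAL_CODE.exec(part);
    if (!match) {
        return { rest: part, postalCode: "" };
    }
    return {
        rest: part.substr(0, match.index).trim(),
        postalCode: match[0].toUpperCase()
    };
}

var withCode = splitTrailingPostalCode("ON M5V 2T6"); // { rest: "ON", postalCode: "M5V 2T6" }
var withoutCode = splitTrailingPostalCode("Canada");  // { rest: "Canada", postalCode: "" }
```

A helper like this could stand in for the per-sub-part check inside `fnParsePostalCode`, with `rest` used as the province or country name.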
Hope this helps someone…
Can you please tell me the algorithm for this?
Hi Udit, the entire code in javascript is there in the blog post, which can be easily translated into an algorithm. On a quick note:
1) If address is empty, exit
2) Split the address on commas and trim whitespace from each part
3) Starting from the last part in step 2, split each part on a space and a hyphen. Check the sub-parts for a postal code; the remaining sub-part (if any) is the country, province and city, in that order.
4) If something still remains after you have parsed out the city, that’s the street address.
Let me know if this helps.
Hi Rahul,
Can we restrict address through pincode address?
Hi AK, I think there might be an option in Google’s appropriate api for that. If not, you can always manually implement checks in your own code during form submission.
I’ve added unit tests and support for Canada & UK
https://gist.github.com/nalbion/5097eb072c97204c87b3878b129f6fb8
Hi Nicholas, thanks for sharing the same.