Standardising civic addresses for emergency dispatch

Use forward and reverse geocoding to standardise civic addresses, back-fill coordinates, and build a confidence-driven review queue for dispatch systems.

July 05, 2026

A 911 call comes in. The caller gives an address. A call-taker types it into the CAD system. The CAD system looks it up. If that address record is misspelled, structurally malformed, or missing coordinates, the nearest unit gets routed to the wrong parcel — or nowhere at all. The call-taker improvises. Time passes.

Emergency dispatch is the sharpest edge of address-data quality. Every other industry absorbs bad address data in the form of a returned parcel or a failed delivery or a wasted field-service visit. Dispatch absorbs it in the form of delayed response. That is a different category of consequence, and it demands a different level of rigour in how you maintain address records.

This post is for the engineers and technical project leads building or maintaining the address database that feeds a dispatch or public-safety records system. It covers how to use forward and reverse geocoding to standardise civic address strings, back-fill missing or stale coordinates, and route low-confidence records to a human review queue — in production, with working code, and with honest scope about what a geocoding API can and cannot do in a 911 context.

The actual problem: four failure modes in civic address databases

Before touching an API, it is worth naming the failure modes precisely. They are different problems with different treatments.

1. Malformed or non-standard address strings. Addresses in government records are entered by humans over decades, across staff turnover, across system migrations. "123 Main St" and "123 MAIN STREET" and "123 Main St." are the same address, but string-matching treats them as different records. "Apt 4B" and "Unit 4B" and "#4B" are the same unit. A geocoder normalises these into a canonical form — which is the first thing you want, independent of coordinates.

2. Missing coordinates. Many legacy civic address databases have an address string but no latitude/longitude. The CAD system cannot route without coordinates. Forward geocoding converts the string to coordinates. The question is how confident you are in the match.

3. Stale or drifted coordinates. A coordinate captured in 2009 may be correct. It may also have been geocoded against a road network that has since been reclassified or extended by a new development, or it may simply have been captured with less precision than today's data allows. Reverse geocoding the stored coordinate and comparing the returned address to the stored address is a cheap consistency check that surfaces most of the drift.

4. Address–parcel mismatches. A coordinate that snaps to the road centreline in front of a parcel is not the same as a coordinate on the parcel itself. For dispatch, the difference matters when the parcel is large — a farm, a hospital campus, an industrial estate — because units approach from one end and the incident is at the other. Centroid-on-parcel is better than centroid-on-road; knowing which one you have is essential.

CSV2GEO's forward and reverse geocoding endpoints return a confidence score and a granularity field on every response. Those two fields are the instruments you use to triage which records need human review. The API does not itself decide which records are authoritative for dispatch purposes — that is a human and governance decision — but it gives you the signal to build the queue correctly.

A clear scope boundary before you build

One thing to state plainly before writing a line of code: CSV2GEO is a geocoding and address-standardisation platform. It is not an authoritative NG911 or MSAG system of record, it does not hold PSAP-certified address layers, and it does not provide an emergency SLA. If your jurisdiction requires data certified by a specific standards body, that certification must come from the appropriate authority.

What CSV2GEO does in this context: it takes an address string and returns a normalised, structured form of it, with coordinates and a confidence score. For a records-management workflow — cleaning a legacy database, back-filling coordinates, building a review queue, standardising free-text intake — that is exactly the right tool. The canonical output feeds the authoritative system; it does not replace it.

What the API gives you

Two endpoints do the work described in this post.

`GET /api/v1/geocode` — forward geocoding. Takes a free-text or structured address query and returns a normalised address string, coordinates, confidence score, granularity level, and component breakdown (street number, street name, city, postcode, country). The 461M+ address index covers 39 countries.

`GET /api/v1/reverse` — reverse geocoding. Takes a lat/lng and returns the nearest address, distance in metres from the query point to the matched address, confidence score, and a granularity level.

Both endpoints return a confidence field in the range 0.0–1.0 and a granularity field that describes the precision of the match. Granularity values indicate whether the geocoder matched at address point level, road/street level, suburb/postcode level, or something coarser. For dispatch purposes:

address_point + confidence ≥ 0.8 → high confidence, accept automatically
address_point with confidence 0.6–0.8, or road with confidence ≥ 0.6 → moderate confidence, flag for review
Confidence below 0.6, or granularity coarser than road → low confidence, mandatory human review before the record is used in dispatch

These thresholds are a starting point. Calibrate against your jurisdiction's address data once you have run a pilot — a well-maintained urban database will have a higher share of high-confidence matches than a rural county with sparse address coverage.
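One way to approach that calibration is to sweep a few candidate thresholds over the pilot output and see what share of records each would auto-accept. The sketch below assumes the pilot rows are dicts carrying the `confidence` and `granularity` keys used in this post; the function name `auto_accept_share` and the sample rows are made up for illustration.

```python
def auto_accept_share(rows, threshold, granularity="address_point"):
    """Fraction of rows that would be auto-accepted at a given confidence threshold."""
    if not rows:
        return 0.0
    accepted = sum(
        1
        for row in rows
        if row.get("granularity") == granularity
        and float(row.get("confidence") or 0.0) >= threshold
    )
    return accepted / len(rows)


# Made-up pilot rows, for illustration only.
pilot_rows = [
    {"confidence": 0.95, "granularity": "address_point"},
    {"confidence": 0.85, "granularity": "address_point"},
    {"confidence": 0.75, "granularity": "address_point"},
    {"confidence": 0.90, "granularity": "road"},
]

for threshold in (0.7, 0.8, 0.9):
    share = auto_accept_share(pilot_rows, threshold)
    print(f"threshold {threshold:.1f}: {share:.0%} auto-accepted")
```

Pair each candidate threshold with a manual spot-check of the records it would accept; the share alone says nothing about how many of those matches are actually right.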

A thorough explanation of how the confidence score is computed lives in "Geocoding confidence scores explained". The accuracy discussion in "Reverse geocoding accuracy in meters" is required reading before you decide on a distance threshold for your consistency-check step.

Step 1: Normalise and geocode existing address records

The first pass: take every address string in your records system and run it through forward geocoding to get a canonical form and coordinates.

curl -s "https://csv2geo.com/api/v1/geocode" \
  --get \
  --data-urlencode "q=123 main st apt 4b springfield il 62701" \
  --data-urlencode "api_key=$CSV2GEO_API_KEY" \
  | jq '{formatted_address, lat, lng, confidence, granularity}'

The formatted_address in the response is the canonical form. For civic address standardisation, this is the string you write back to the record — not the caller-typed original, not the legacy database value that nobody has cleaned since 2014.

For bulk processing, the REST endpoint is the same call in a loop. Python with requests, throttled to respect your rate limit:

import csv
import os
import time
import requests

API = "https://csv2geo.com/api/v1/geocode"
KEY = os.environ["CSV2GEO_API_KEY"]
RATE_LIMIT_DELAY = 0.05  # 20 req/s, adjust for your plan

results = []

with open("civic_addresses.csv", newline="") as fin:
    reader = csv.DictReader(fin)
    for row in reader:
        r = requests.get(
            API,
            params={"q": row["address_string"], "api_key": KEY},
            timeout=30,
        )
        if r.status_code == 429:
            time.sleep(2)
            r = requests.get(
                API,
                params={"q": row["address_string"], "api_key": KEY},
                timeout=30,
            )
        r.raise_for_status()
        data = r.json()
        best = data["results"][0] if data.get("results") else {}
        results.append({
            "record_id":          row["record_id"],
            "original_address":   row["address_string"],
            "canonical_address":  best.get("formatted_address"),
            "lat":                best.get("lat"),
            "lng":                best.get("lng"),
            "confidence":         best.get("confidence"),
            "granularity":        best.get("granularity"),
        })
        time.sleep(RATE_LIMIT_DELAY)

with open("geocoded_addresses.csv", "w", newline="") as fout:
    writer = csv.DictWriter(fout, fieldnames=list(results[0].keys()))
    writer.writeheader()
    writer.writerows(results)

The output CSV has everything you need to decide what happens next: a canonical address, coordinates, a confidence score, and a granularity level per record. Nothing in this step modifies your authoritative database — the output is a candidate dataset that your review workflow consumes.

The same logic in Node using fetch, for teams running a JavaScript pipeline:

import { createReadStream, createWriteStream } from 'node:fs';
import { parse } from 'csv-parse';
import { stringify } from 'csv-stringify';

const API = 'https://csv2geo.com/api/v1/geocode';
const KEY = process.env.CSV2GEO_API_KEY;

async function geocodeAddress(addressString) {
  const url = `${API}?q=${encodeURIComponent(addressString)}&api_key=${KEY}`;
  const r = await fetch(url);
  if (!r.ok) throw new Error(`HTTP ${r.status}`);
  const data = await r.json();
  return data.results?.[0] ?? null;
}

// Wire into your CSV pipeline with p-limit or similar concurrency control

For large record sets — tens of thousands of rows — the batch web tool on the CSV2GEO dashboard is often faster for a one-off clean-up. Upload a CSV, map the address column, download the enriched result. Credits are consumed per address row, identical to the API. The batch tool is the right choice when the team doing the clean-up is not the team that maintains the integration — a records officer can run it without touching production code.

Step 2: Run the consistency check on records that already have coordinates

For records that have a stored coordinate, reverse geocode the coordinate and compare the returned address to the stored address string. The distance in metres from the query point to the matched address is the key signal.

REVERSE_API = "https://csv2geo.com/api/v1/reverse"

def check_coordinate(record_id, stored_address, lat, lng):
    r = requests.get(
        REVERSE_API,
        params={"lat": lat, "lng": lng, "api_key": KEY},
        timeout=30,
    )
    r.raise_for_status()
    data = r.json()
    result = data.get("result") or {}
    return {
        "record_id":       record_id,
        "stored_address":  stored_address,
        "reverse_address": result.get("formatted_address"),
        "distance_m":      result.get("distance_m"),
        "confidence":      result.get("confidence"),
        "granularity":     result.get("granularity"),
        "flag":            _flag(result),
    }

def _flag(result):
    dist = result.get("distance_m")
    if dist is None:
        dist = 999  # no match returned: treat as far away so the record is reviewed
    conf = result.get("confidence") or 0.0
    gran = result.get("granularity") or "unknown"
    if dist > 200 or conf < 0.6 or gran not in ("address_point", "road"):
        return "REVIEW"
    if dist > 50 or conf < 0.8:
        return "CHECK"
    return "OK"

The distance_m threshold matters enormously in a dispatch context. A 15 m offset — point on the road centreline versus point on the parcel — is normal and acceptable for most urban addresses. A 200 m offset could mean a unit is sent to the wrong building on a shared estate road. A 500 m offset is a different street. Set your thresholds from local knowledge, not from defaults.

The reverse geocoding accuracy post has a detailed treatment of what distance offsets mean in practice and which offset ranges should trigger what actions.
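The distance check covers the coordinate side of the comparison; the address side (does the reverse-geocoded address read like the stored one?) can be approximated with a string similarity score. The following is a minimal sketch using only the standard library. The abbreviation table, the function names, and the idea of a similarity cut-off are assumptions for illustration, not part of the CSV2GEO API.

```python
import re
from difflib import SequenceMatcher

# Small, illustrative abbreviation table; extend it for your jurisdiction.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "apartment": "apt",
    "unit": "apt",
}


def normalise(address):
    """Lowercase, replace punctuation with spaces, and collapse common suffix variants."""
    tokens = re.sub(r"[^\w\s]", " ", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def address_similarity(stored, returned):
    """Similarity ratio in [0, 1] between two normalised address strings."""
    return SequenceMatcher(None, normalise(stored), normalise(returned)).ratio()


print(address_similarity("123 Main St.", "123 MAIN STREET"))
print(address_similarity("123 Main St", "456 Oak Ave"))
```

A low similarity combined with a small distance usually points at a naming problem (renamed road, misspelling) rather than a coordinate problem, which is a useful distinction for whoever works the review queue.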

Step 3: Build the manual-review queue

The forward and reverse geocoding passes produce a confidence-stratified dataset. The review queue is just a filter:

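# geocoded_results is the list of result dicts built in Step 1 (`results`).
# If you reload it from the CSV, convert confidence back to float first.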

REVIEW_QUEUE = []
AUTO_ACCEPT  = []

for record in geocoded_results:
    conf = record.get("confidence") or 0.0
    gran = record.get("granularity") or "unknown"

    if conf >= 0.8 and gran == "address_point":
        AUTO_ACCEPT.append(record)
    else:
        priority = "HIGH" if (conf < 0.6 or gran not in ("address_point", "road")) else "MEDIUM"
        REVIEW_QUEUE.append({**record, "review_priority": priority})

print(f"Auto-accept: {len(AUTO_ACCEPT)}")
print(f"Review queue: {len(REVIEW_QUEUE)}")

The review queue is the product. A records officer opens it, sorts by priority, and for each HIGH record either corrects the source address string and re-submits, or escalates to the field for a physical verification. MEDIUM records get a second pair of eyes before being accepted. Only AUTO_ACCEPT records land in the dispatch database without human sign-off.

This is not extra process for the sake of process. It is the explicit audit trail that a PSAP audit or a post-incident review will ask for. "We accepted this coordinate automatically because confidence was 0.92 at address-point granularity" is a defensible answer. "We bulk-loaded from the geocoder and did not check" is not.
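To make that audit trail concrete, each accept or queue decision can be written out as a small record that states why it was made. This is a sketch only: the field names and the `audit_entry` helper are hypothetical, and where the rows are stored (a table, a log file) is left to your records system.

```python
from datetime import datetime, timezone


def audit_entry(record, decision, decided_by="auto"):
    """Build one audit-log row explaining why a record was accepted or queued."""
    conf = record.get("confidence") or 0.0
    gran = record.get("granularity") or "unknown"
    return {
        "record_id": record.get("record_id"),
        "decision": decision,
        "decided_by": decided_by,
        "reason": f"confidence {conf:.2f} at {gran} granularity",
        "decided_at": datetime.now(timezone.utc).isoformat(),
    }


entry = audit_entry(
    {"record_id": "A1", "confidence": 0.92, "granularity": "address_point"},
    decision="AUTO_ACCEPT",
)
print(entry["reason"])
```

Writing the reason at decision time, rather than reconstructing it later from current thresholds, keeps the record honest if the thresholds are recalibrated.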

Step 4: Handle the edge cases — rural routes, unnamed roads, large parcels

Urban address databases are relatively tractable. The interesting failures happen at the margins.

Rural route and highway contract addresses. "RR 4 Box 12" is a valid USPS address in many rural US counties. A geocoder that has never heard of that format will snap it to the nearest road segment, which may be kilometres away. Forward geocoding these returns low confidence and coarse granularity — they land correctly in your REVIEW queue and should stay there. Resolution is a field verification, not an API call.

Unnamed or newly-named roads. A subdivision platted 18 months ago may have road names that have not yet propagated into every data source. Low confidence is the right signal here too. Build a process for local staff to submit corrections — a simple Google Form feeding a spreadsheet that your DBA batch-imports monthly is enough for most agencies.

Large parcels and campus addresses. A hospital complex, a university campus, or a multi-building industrial estate may have a single mailing address but dozens of meaningful access points. The geocoder will return one coordinate for the canonical address. For dispatch, you need the specific building or gate — which is a problem that geocoding alone cannot solve. The right answer is a supplemental address or cross-street field in your CAD intake form, not a geocoder feature.

Apartment and unit numbers. The geocoder normalises unit numbers in the formatted address string, but the coordinate it returns is typically the centroid of the building or the road-access point, not the specific unit. Unit-level coordinates do not exist in most address datasets. This is fine for dispatch — units approach the building, not the flat — but it means the address string (including the unit) is the accurate record and the coordinate is an approximation at building level. Document this in your data dictionary so future staff know what the coordinate represents.
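Because rural route and highway contract strings are known in advance to need field verification, it can be worth tagging them before the geocoding call so reviewers see why a record is in the queue. The pattern below is a rough sketch, not an exhaustive USPS grammar; `RURAL_ROUTE` and `is_rural_route` are illustrative names.

```python
import re

# Matches strings such as "RR 4 Box 12", "R.R. 4, Box 12" or "HC 2 Box 7".
# Illustrative only: real-world variants will need additions.
RURAL_ROUTE = re.compile(
    r"^\s*(?:R\.?\s?R\.?|RURAL\s+ROUTE|H\.?\s?C\.?|HIGHWAY\s+CONTRACT)"
    r"\s*\d+\s*,?\s*BOX\s*\d+",
    re.IGNORECASE,
)


def is_rural_route(address):
    """True if the address starts with a rural-route or highway-contract box form."""
    return bool(RURAL_ROUTE.match(address))


for sample in ("RR 4 Box 12", "HC 2 Box 7", "123 Main St"):
    print(sample, is_rural_route(sample))
```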

Step 5: Build the back-fill loop for ongoing intake

The one-time clean-up of legacy records is a project. The ongoing back-fill of new records entering the system is a pipeline. Wire the geocoding call into whatever creates new address records.

In practice, new civic addresses enter through a handful of paths: a 911 caller gives a new address that is not in the database, a permit office adds a new structure, a rural addressing coordinator assigns a new number. Each of these is a moment where the geocoding call should fire automatically and its output should be logged for review rather than immediately accepted.

The pattern is straightforward: new record created → geocode fired asynchronously → result written to a geocoding_results staging table → nightly job promotes high-confidence results to the production address table and surfaces low-confidence results to the review queue.

import threading

def on_new_address_record(record_id: str, address_string: str, db):
    """Called on every new address record insert."""
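    def background_geocode():
        try:
            r = requests.get(
                API,
                params={"q": address_string, "api_key": KEY},
                timeout=30,
            )
        except requests.RequestException:
            # Timeouts and connection errors carry no status code; log them
            # too (None is a placeholder) so the nightly retry picks them up.
            db.log_geocode_failure(record_id, None)
            return
        if not r.ok:
            db.log_geocode_failure(record_id, r.status_code)
            return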
        data = r.json()
        best = data["results"][0] if data.get("results") else None
        if best:
            db.write_geocode_result(
                record_id=record_id,
                canonical_address=best["formatted_address"],
                lat=best["lat"],
                lng=best["lng"],
                confidence=best["confidence"],
                granularity=best["granularity"],
            )
    threading.Thread(target=background_geocode, daemon=True).start()

Asynchronous so the record insert does not block on the API call. The staging table accumulates results; the nightly job is the gate. The observability post covers what metrics to instrument on this kind of pipeline — geocode failure rate, confidence distribution over time, review queue depth — so that deterioration is visible before it becomes a dispatch problem.
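The nightly gate itself can be quite small. The sketch below uses SQLite purely so that it runs anywhere; the table layout, the `status` column, and the function name are assumptions for illustration, and your production schema will differ. It promotes pending rows that meet the auto-accept rule and marks every other pending row for review.

```python
import sqlite3

# Hypothetical schema: a staging table and a production address table.
SCHEMA = """
CREATE TABLE geocoding_results (
    record_id TEXT PRIMARY KEY, canonical_address TEXT,
    lat REAL, lng REAL, confidence REAL, granularity TEXT,
    status TEXT NOT NULL DEFAULT 'pending'
);
CREATE TABLE addresses (
    record_id TEXT PRIMARY KEY, canonical_address TEXT, lat REAL, lng REAL
);
"""

# COALESCE ensures rows with missing confidence or granularity are never promoted.
ACCEPT = "COALESCE(confidence, 0) >= ? AND COALESCE(granularity, '') = 'address_point'"


def promote_staged_results(conn, min_confidence=0.8):
    """Promote pending high-confidence rows; mark every other pending row for review."""
    cur = conn.cursor()
    cur.execute(
        f"""INSERT OR REPLACE INTO addresses (record_id, canonical_address, lat, lng)
            SELECT record_id, canonical_address, lat, lng
            FROM geocoding_results
            WHERE status = 'pending' AND {ACCEPT}""",
        (min_confidence,),
    )
    cur.execute(
        f"UPDATE geocoding_results SET status = 'promoted' "
        f"WHERE status = 'pending' AND {ACCEPT}",
        (min_confidence,),
    )
    promoted = cur.rowcount
    # Whatever is still pending did not meet the rule and goes to the reviewers.
    cur.execute("UPDATE geocoding_results SET status = 'review' WHERE status = 'pending'")
    queued = cur.rowcount
    conn.commit()
    return promoted, queued
```

Keeping reviewed rows in the staging table with a status, rather than deleting them, means the reviewer still sees the geocoder's candidate address and coordinate when deciding.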

Caching: what to cache and for how long

Addresses rarely move. A canonical address geocoded today will return the same coordinates tomorrow, next month, and almost certainly next year. Caching geocoding results aggressively is correct behaviour for a records system, not a corner-cutting shortcut.

The practical rule: cache the (canonical address → lat/lng/confidence/granularity) tuple indefinitely, keyed on the normalised address string. Invalidate the cache entry only when the source address string changes or when a human reviewer overrides the geocoded coordinate. The caching post has the full treatment; the short version is that a records system that re-geocodes the same stable address on every CAD lookup is wasting budget and latency that belongs to the dispatcher.
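A minimal sketch of that rule, assuming an in-memory dict as the store and a caller-supplied `geocode_fn` (a hypothetical callable that wraps the forward geocoding request); a real records system would back this with a table rather than process memory.

```python
class GeocodeCache:
    """Cache keyed on a normalised address string, with no expiry."""

    def __init__(self, geocode_fn):
        self._geocode_fn = geocode_fn  # callable: address string -> result dict
        self._store = {}

    @staticmethod
    def key(address):
        # Case-fold and collapse whitespace so trivial variants share one entry.
        return " ".join(address.casefold().split())

    def get(self, address):
        k = self.key(address)
        if k not in self._store:
            self._store[k] = self._geocode_fn(address)
        return self._store[k]

    def invalidate(self, address):
        """Call when the source string changes or a reviewer overrides the coordinate."""
        self._store.pop(self.key(address), None)
```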

For the reverse geocoding consistency check, cache the result keyed on the (lat/lng rounded to 5 decimal places) tuple, with a TTL of 90 days. A coordinate that passed the consistency check 60 days ago has not moved. A coordinate that was flagged for review 60 days ago should stay flagged until a human resolves it, regardless of the cache.
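The reverse-check cache differs only in its key and its expiry. The sketch below is again in-memory and illustrative: the class name and the injectable `now` clock are assumptions made to keep it testable. The review flag itself should live on the address record, not in this cache, so an expired or missing cache entry never clears a flag.

```python
import time

TTL_SECONDS = 90 * 24 * 3600  # the 90-day TTL described above


class ReverseCheckCache:
    """Cache of consistency-check results keyed on rounded coordinates."""

    def __init__(self, now=time.time):
        self._now = now
        self._store = {}  # (lat, lng) rounded to 5 places -> (stored_at, result)

    @staticmethod
    def key(lat, lng):
        return (round(lat, 5), round(lng, 5))

    def get(self, lat, lng):
        k = self.key(lat, lng)
        entry = self._store.get(k)
        if entry is None:
            return None
        stored_at, result = entry
        if self._now() - stored_at > TTL_SECONDS:
            del self._store[k]  # expired: caller should re-run the check
            return None
        return result

    def put(self, lat, lng, result):
        self._store[self.key(lat, lng)] = (self._now(), result)
```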

Cost model for a real civic address database

A county with 80,000 address records, running the one-time clean-up plus ongoing back-fill for a year.

One-time clean-up:

80,000 forward geocoding calls (one per record): 80,000 credits
80,000 reverse geocoding consistency checks: 80,000 credits
Total: 160,000 credits

Ongoing back-fill (year 1):

New addresses in a county of 80,000 records: roughly 500–1,000 per year
Each new record: 1 forward geocode on intake + 1 reverse consistency check on first review = 2 credits
Annual ongoing cost: 1,000–2,000 credits

Total year 1: ~162,000 credits. At paid pricing starting from $54/month for 100,000 calls, the entire year-one project fits within two or three months of the entry plan. Year two is essentially free on any reasonable plan.

The free tier (3,000 calls/day, no credit card) is enough to run a pilot on a subset of records — 3,000 forward geocodes per day means a 10,000-record pilot completes in under four days before you commit budget. See csv2geo.com/pricing/api for the current bracket structure.
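The arithmetic behind these figures is easy to rerun with your own record counts. The numbers below are the ones quoted in this section; treating the 100,000-call quota as a monthly allowance is an assumption, so check the pricing page before budgeting.

```python
import math

records = 80_000
one_time = records * 2                    # forward + reverse call per record

new_low, new_high = 500, 1_000            # new addresses per year
ongoing_low, ongoing_high = new_low * 2, new_high * 2

year_one_low = one_time + ongoing_low
year_one_high = one_time + ongoing_high

plan_calls = 100_000                      # entry-plan quota (assumed monthly)
months_needed = math.ceil(year_one_high / plan_calls)

free_tier_per_day = 3_000
pilot_days = math.ceil(10_000 / free_tier_per_day)

print(f"one-time clean-up: {one_time:,} credits")
print(f"year one: {year_one_low:,} to {year_one_high:,} credits")
print(f"entry-plan months needed: {months_needed}")
print(f"10,000-record pilot on the free tier: {pilot_days} days")
```

The pilot figure counts forward geocodes only; adding a reverse consistency check per record doubles the calls and the days.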

What to monitor after you ship

The geocoding layer is not a one-time fix. Build three metrics into your observability stack from day one.

Confidence distribution of new records. If the percentage of new records landing in the REVIEW queue starts climbing, something is changing — a new development is using address formats the geocoder has not seen, a data-entry workflow is producing malformed strings, or the address index has a coverage gap in a recently-annexed area. A graph of confidence percentiles over time surfaces this before it becomes a backlog.
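As a rough sketch of that graph's underlying data, the function below groups records by a period label and returns the 10th, 50th and 90th percentile of confidence for each. The `period` key (for example a `YYYY-MM` string) and the function name are assumptions; most teams would compute this in their metrics stack rather than in application code.

```python
from collections import defaultdict
from statistics import quantiles


def confidence_percentiles(rows):
    """Return {period: (p10, p50, p90)} of confidence, grouped by period."""
    by_period = defaultdict(list)
    for row in rows:
        by_period[row["period"]].append(row["confidence"])
    out = {}
    for period, values in sorted(by_period.items()):
        if len(values) < 2:
            continue  # quantiles() needs at least two data points
        deciles = quantiles(values, n=10, method="inclusive")
        out[period] = (deciles[0], deciles[4], deciles[8])
    return out
```

A falling p10 with a steady median is the typical early sign: most records still match well, but a growing tail is not.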

Review queue age. Records sitting in the REVIEW queue for more than 30 days are a risk. Either the review process does not have enough capacity, or the records are genuinely unresolvable and need a policy decision. Instrument queue age, not just queue depth.
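Queue age is cheap to compute if each queue item records the date it was queued. A minimal sketch, assuming a `queued_on` date on every item (a hypothetical field name) and the 30-day limit mentioned above:

```python
from datetime import date


def queue_age_report(queue, today, max_age_days=30):
    """Summarise depth, oldest item age, and how many items exceed the age limit."""
    ages = [(today - item["queued_on"]).days for item in queue]
    return {
        "depth": len(ages),
        "oldest_days": max(ages, default=0),
        "over_limit": sum(1 for age in ages if age > max_age_days),
    }


sample_queue = [
    {"record_id": "A", "queued_on": date(2026, 6, 1)},
    {"record_id": "B", "queued_on": date(2026, 6, 28)},
]
print(queue_age_report(sample_queue, today=date(2026, 7, 5)))
```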

Reverse-geocoding distance distribution. The distribution of distance_m values from the consistency check should be roughly stable for a mature address database. A spike in high-distance results means coordinates are drifting — possibly a batch import from a source with a different coordinate system, possibly a projection error in a GIS export.

The observability post has the full list of metrics worth instrumenting in a geocoding pipeline. For dispatch, the three above are the highest-signal ones.

Frequently Asked Questions

Is CSV2GEO a certified NG911 or MSAG address authority? No. CSV2GEO standardises and geocodes address strings; it does not hold PSAP certification or provide addresses that are authoritative for 911 dispatch by regulatory definition. The geocoded output feeds into your authoritative system — it does not replace it. For certification, contact the relevant state or national 911 authority in your jurisdiction.

What confidence threshold should we use for auto-accepting records into dispatch? A reasonable starting point is confidence ≥ 0.8 at address-point granularity for auto-acceptance, with everything below that going to manual review. Calibrate against your own data during the pilot — urban databases with well-maintained input typically achieve 70–80% auto-acceptance rates; rural databases with legacy free-text entry may be closer to 50%.

How do we handle addresses that return low confidence and cannot be resolved by re-geocoding? These require field verification — someone physically checks the location, captures a GPS coordinate from a mobile device, and a records officer enters the verified coordinate manually with a note explaining the override. The geocoder surfaces the problem; it cannot solve the problem of an address that does not yet exist in any dataset.

Can we use the batch web tool instead of the API for the initial clean-up? Yes. The batch web tool on the dashboard accepts a CSV, lets you map the address column, and returns the enriched result with canonical addresses, coordinates, confidence, and granularity. Credits consumed are per address row, same as the API. It is the right choice when a records officer is running the clean-up without engineering support.

Does the geocoder handle apartment and unit numbers? The forward geocoder normalises unit numbers into the canonical address string. The coordinate returned is typically the building centroid or road-access point — unit-level coordinates are not available in most address datasets. For dispatch, this is appropriate: units approach the building, not the flat.

How should we handle addresses that span multiple jurisdictions? Addresses near jurisdictional boundaries — a county line, a municipal boundary, a PSAP boundary — can geocode to the correct coordinate but be assigned to the wrong administrative area if the geocoder's boundary data differs from your CAD system's authority areas. Run a spatial join of geocoded coordinates against your authoritative boundary layer as a post-processing step, and flag any records where the geocoder's returned county or municipality does not match your CAD's assignment. This is a boundary-data problem, not a geocoding problem.
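In production this check belongs in a GIS library or a spatial database, but the idea fits in a few lines. The sketch below is a plain ray-casting point-in-polygon test that treats longitude and latitude as planar coordinates, which is a simplification that is only reasonable for small areas; the `cad_area` key and the `boundaries` mapping are hypothetical names.

```python
def point_in_polygon(lng, lat, polygon):
    """Ray-casting test; polygon is a list of (lng, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed by the ray.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lng < x_cross:
                inside = not inside
    return inside


def boundary_mismatch(record, boundaries):
    """True if the coordinate falls outside the area the CAD assigns the record to."""
    polygon = boundaries[record["cad_area"]]
    return not point_in_polygon(record["lng"], record["lat"], polygon)
```

Records where `boundary_mismatch` is true go to the review queue with a note that the disagreement is about the boundary, not the coordinate.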

What happens to a record if the geocoding API is unavailable when a new address is created? The asynchronous back-fill pattern handles this gracefully: if the API call fails, log the failure and the record_id to a retry queue. The nightly job retries all failed geocodes before promoting results. The record exists in your system from the moment it is created; geocoding enrichment is a background enhancement, not a blocking gate. See Exponential backoff — when to retry, when to stop for the retry logic.
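For reference, a retry helper along those lines might look like the sketch below. It is a generic pattern, not CSV2GEO client code: `retry_with_backoff` and its parameters are illustrative, and it catches every exception for brevity where a real pipeline would catch only transient errors such as timeouts and HTTP 429 or 5xx responses.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call() until it succeeds or attempts run out; re-raise the last error."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # The base delay doubles on each attempt; jitter spreads out retries.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```

The `sleep` parameter exists so the nightly job can be tested without actually waiting.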