Reverse Geocoding Accuracy: How Far Off Are Your Results?

How to measure reverse geocoding accuracy in meters: the geometry of 'nearest match,' rooftop vs interpolated, and what 50m of error means downstream.

May 19, 2026

A coordinate goes into a reverse geocoder; an address comes out. The address is *somewhere near* the coordinate — but how near? The answer depends on the coordinate's source, the geocoder's index density in that area, and what "match" means in this dataset. In some neighborhoods you get the right house; in others the geocoder returns "the nearest known address," which might be 80 meters away.

This post is the practical version: what reverse geocoding accuracy actually depends on, how to measure it, the four buckets that explain >95% of real-world error, and the patterns for handling each. By the end you should be able to look at a ?lat=…&lng=… result and know whether to trust it down to the building or treat it as "approximately this block."

Two distance metrics that matter

When you reverse-geocode, two distances are interesting:

  1. Distance from input coord to returned address coord — how far did the geocoder "snap" your input?
  2. Distance from input coord to the actual building the user cared about — the truth distance, only knowable if you have ground truth.

The first is observable from any reverse geocoder response. The second requires a labelled dataset (input coord + known correct address). Most teams care about #1 as a proxy and only validate #2 in spot checks.

A typical reverse-geocode response (csv2geo):

{
  "query": { "lat": 38.8977, "lng": -77.0365 },
  "results": [{
    "formatted_address": "1600 PENNSYLVANIA Avenue Northwest, Washington, DC 20500, US",
    "location": { "lat": 38.8976750934991, "lng": -77.0365468377218 },
    "accuracy": "rooftop",
    "accuracy_score": 1.0
  }]
}

The geocoder returned coords ~5m from the input. With accuracy: "rooftop" and accuracy_score: 1.0, this is high-confidence. Compute the distance:

from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters."""
    R = 6_371_000   # Earth radius in meters
    lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
    dlat = lat2 - lat1; dlng = lng2 - lng1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlng/2)**2
    return 2 * R * asin(sqrt(a))

snap = haversine_m(38.8977, -77.0365, 38.8976750934991, -77.0365468377218)
# ≈ 4.9 m

Anything under ~10m is essentially a direct hit. Between 10m and 50m, check the accuracy field before trusting the house number. Above 50m you're in "nearest building" or "interpolated street range" territory. Above 200m you're getting "nearest known anything" and should treat the result skeptically.

The four error buckets

After watching enough real-world reverse-geocode logs, errors fall into four buckets:

Bucket 1: Rooftop match (0–10m)

The input coord falls inside or right next to a known building polygon. Geocoder returns the address with high confidence and the snap distance is essentially measurement noise.

What you can trust:

  • The house_number and road are correct.
  • The postcode, city, state, country are correct.
  • Use this for billing, deliveries, anything address-critical.

When this happens:

  • US/Canada urban + suburban (very high coverage).
  • Western Europe (Germany, UK, France, NL — same).
  • Developed APAC (Japan, AU/NZ — same).

Bucket 2: Street centerline interpolation (10–30m)

The geocoder doesn't have the building footprint but has the street. It projects the input coordinate onto the street segment and estimates the house number from that position within the segment's address range. Snap distance is small, but the *real* address might be 15m away because the street range estimate is imperfect.

What you can trust:

  • The street name and city are correct.
  • The house number is approximately correct (off by 1–2 in dense areas; off by 10–20 in sparse rural).
  • Use this for "near here" routing, market analysis, mapping at city zoom levels.
  • Don't use this for billing or deliveries without confirmation.

When this happens:

  • US suburbs without rooftop data.
  • Anywhere coverage is street-level but not building-level.
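
As a rough illustration of how street-range interpolation works in the reverse direction, the sketch below projects a point onto a straight street segment and reads a house number off the segment's address range. The segment endpoints, the 100 to 198 range, and the function name `interpolate_house_number` are all hypothetical. Real geocoders deal with curved centerlines, odd/even street sides, and building setbacks, which is where the residual error comes from.

```python
from math import cos, radians

def interpolate_house_number(lat, lng, seg_start, seg_end, num_start, num_end):
    """Estimate a house number by projecting (lat, lng) onto a straight segment.

    seg_start and seg_end are (lat, lng) tuples; num_start and num_end are the
    house numbers at those endpoints (same parity, e.g. both even).
    """
    # Local flat-earth approximation: scale longitude by cos(latitude).
    k = cos(radians(seg_start[0]))
    ax, ay = seg_start[1] * k, seg_start[0]
    bx, by = seg_end[1] * k, seg_end[0]
    px, py = lng * k, lat

    seg_len_sq = (bx - ax) ** 2 + (by - ay) ** 2
    if seg_len_sq == 0:
        return num_start  # degenerate segment
    # Fraction of the way along the segment, clamped to [0, 1].
    t = ((px - ax) * (bx - ax) + (py - ay) * (by - ay)) / seg_len_sq
    t = max(0.0, min(1.0, t))

    # Snap to the nearest number with the same parity as num_start.
    steps = round(t * (num_end - num_start) / 2)
    return num_start + 2 * steps

# Hypothetical block: even numbers 100..198 along a west-to-east segment.
estimate = interpolate_house_number(
    40.0001, -75.00075, (40.0, -75.001), (40.0, -75.0), 100, 198)
print(estimate)
```

The estimate assumes houses are evenly spaced along the range, which is exactly the assumption that breaks in sparse rural areas.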

Bucket 3: Nearest postcode/locality centroid (30–500m)

Geocoder doesn't have the street or building. Returns the nearest known centroid (postcode or locality). Snap distance can be large because centroids are administrative averages, not building positions.

What you can trust:

  • The postcode and city are usually correct.
  • The street name might be a *random* street in the postcode — treat as null.
  • The house number is meaningless.
  • Use for analytics aggregation ("how many users in postcode X"), nothing per-address.

When this happens:

  • Rural areas of any country.
  • Countries with weak coverage (most of Africa, parts of South America, much of Central Asia).
  • Inputs whose coords are slightly off (your phone GPS gave a 50m-error reading and the geocoder snapped to the wrong neighborhood).

Bucket 4: Bad input or no coverage (>500m or no result)

Geocoder returns nothing, or the snap distance is so large the result is meaningless. Examples:

  • Coordinates in the middle of an ocean.
  • Coordinates in a country with no Overture/OSM coverage.
  • Coordinates that are reversed ({lat: -77, lng: 38} instead of {lat: 38, lng: -77}).

What to do: treat as no-match. Don't show a stale "nearest known thing" 5km away as if it were the user's address.
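
Swapped coordinates are worth catching before the request goes out. A range check alone won't do it: {lat: -77, lng: 38} is a valid point, just not where you meant. One possible heuristic, sketched here, is to compare the input against the area you expect to serve. `SERVICE_BBOX` (a rough box around the contiguous US) and `check_input` are hypothetical names, and the box would need to match your own coverage.

```python
# Hypothetical service area: rough bounding box for the contiguous US.
SERVICE_BBOX = {'min_lat': 24.0, 'max_lat': 50.0,
                'min_lng': -125.0, 'max_lng': -66.0}

def in_bbox(lat, lng, bbox=SERVICE_BBOX):
    """True if the point falls inside the bounding box."""
    return (bbox['min_lat'] <= lat <= bbox['max_lat']
            and bbox['min_lng'] <= lng <= bbox['max_lng'])

def check_input(lat, lng, bbox=SERVICE_BBOX):
    """Classify an input pair before sending it to the geocoder."""
    if in_bbox(lat, lng, bbox):
        return 'ok'
    # Outside the service area as given, but inside it when swapped.
    if in_bbox(lng, lat, bbox):
        return 'likely_swapped'
    if not (-90 <= lat <= 90 and -180 <= lng <= 180):
        return 'invalid'
    return 'outside_service_area'

print(check_input(38.8977, -77.0365))   # ok
print(check_input(-77.0365, 38.8977))   # likely_swapped
```

This only works when the service area is small enough that a swapped point lands outside it. A global product would need a different signal.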

Reading the accuracy fields

Most reverse geocoders return some indication of which bucket you're in. csv2geo returns accuracy as a string and accuracy_score as a float between 0 and 1:

| accuracy | Bucket | accuracy_score |
|---|---|---|
| rooftop | 1 (rooftop match) | 0.9–1.0 |
| range_interpolation | 2 (street interpolation) | 0.7–0.9 |
| geometric_center | 3 (postcode/locality centroid) | 0.4–0.7 |
| approximate | 4 (rough estimate) | 0.0–0.4 |

The combination tells you what to trust. A rooftop with accuracy_score: 1.0 is gold; an approximate with accuracy_score: 0.3 should be flagged in your downstream system as low-trust.
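
One way to turn the two fields into downstream flags is sketched below. The score ranges are copied from the table above; treating a score outside its range as suspicious is an assumption of this sketch, not documented API behavior, and `trust_flags` is a hypothetical helper name.

```python
# Score ranges per accuracy type, copied from the table in this post.
EXPECTED_SCORE_RANGE = {
    'rooftop': (0.9, 1.0),
    'range_interpolation': (0.7, 0.9),
    'geometric_center': (0.4, 0.7),
    'approximate': (0.0, 0.4),
}
LOW_TRUST_TYPES = {'geometric_center', 'approximate'}

def trust_flags(result):
    """Return a list of warning flags for one reverse-geocode result."""
    accuracy = result.get('accuracy')
    score = result.get('accuracy_score')
    flags = []
    if accuracy not in EXPECTED_SCORE_RANGE:
        flags.append('unknown_accuracy_type')
    elif score is None:
        flags.append('missing_score')
    else:
        low, high = EXPECTED_SCORE_RANGE[accuracy]
        if not low <= score <= high:
            # Assumption: type and score that disagree deserve a second look.
            flags.append('score_outside_expected_range')
    if accuracy in LOW_TRUST_TYPES:
        flags.append('low_trust')
    return flags

print(trust_flags({'accuracy': 'rooftop', 'accuracy_score': 1.0}))      # []
print(trust_flags({'accuracy': 'approximate', 'accuracy_score': 0.3}))  # ['low_trust']
```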

The detailed scoring math is in Geocoding Confidence Scores Explained.

Measuring accuracy at the pipeline level

To know how accurate your reverse-geocoding is on YOUR data, build a simple distribution:

from collections import Counter

def bucket_distance(snap_m):
    if snap_m < 10: return 'rooftop'
    if snap_m < 30: return 'street'
    if snap_m < 500: return 'centroid'
    return 'far'

buckets = Counter()

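# reverse_geocode_log: your own iterable of (input_coord, result) pairs,
# where result is the first entry of "results" or None when there was none.
for input_coord, result in reverse_geocode_log: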
    if not result:
        buckets['no_match'] += 1
        continue
    snap = haversine_m(
        input_coord['lat'], input_coord['lng'],
        result['location']['lat'], result['location']['lng'],
    )
    buckets[bucket_distance(snap)] += 1

total = sum(buckets.values())
for label, n in buckets.most_common():
    print(f'{label}: {n} ({100*n/total:.1f}%)')

A healthy distribution on US data:

rooftop: 8240 (82.4%)
street: 1100 (11.0%)
centroid: 480 (4.8%)
far: 60 (0.6%)
no_match: 120 (1.2%)

If yours skews toward centroid or far, your input coords are noisy or you're querying in low-coverage regions. Either fix the input source or accept the lower accuracy.

Three operational questions

Pipelines built on reverse geocoding need to answer:

Q1: What snap distance do I tolerate before rejecting?

Depends on use case:

| Use case | Reject above |
|---|---|
| Address validation for forms | 20m |
| "Find nearest store" UI | 100m (you're showing a list anyway) |
| Delivery confirmation | 10m |
| Marketing analytics | 500m (aggregate is fine) |
| Insurance underwriting | 5m, then human review |

Encode this as a config knob, not magic numbers buried in code. Different products have different tolerances.
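
A minimal sketch of that config knob, using the thresholds from the table above. The dictionary keys and the `accept` helper are hypothetical names; in a real system the mapping would more likely be loaded from a config file or environment than hardcoded.

```python
# Maximum tolerated snap distance in meters, per use case (from the table).
MAX_SNAP_M = {
    'address_validation': 20,
    'store_finder': 100,
    'delivery_confirmation': 10,
    'marketing_analytics': 500,
    'insurance_underwriting': 5,   # accepted results still go to human review
}

def accept(snap_m, use_case, thresholds=MAX_SNAP_M):
    """True if the snap distance is within tolerance for this use case."""
    return snap_m <= thresholds[use_case]

print(accept(12.0, 'delivery_confirmation'))  # False
print(accept(12.0, 'address_validation'))     # True
```

An unknown use case raises `KeyError` on purpose: a missing tolerance is better surfaced loudly than silently defaulted.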

Q2: What if the GPS coord I have was already noisy?

Phone GPS in good conditions: ±5m. Indoors or in urban canyons: ±50m, sometimes ±100m. If your input is GPS, the snap distance you observe includes the GPS error PLUS the geocoder's snap.

Practical implication: in a noisy-input pipeline, the "rooftop" bucket might mean "rooftop given the input I had" — which could still be the wrong building.

Mitigation: collect a few seconds of GPS before reverse-geocoding (averaging filters out a lot of noise), or use a weighted multi-source approach (GPS + WiFi + cell tower triangulation).
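
The first mitigation can be sketched as follows, assuming each GPS sample carries a reported horizontal accuracy in meters (most mobile location APIs expose one; the `accuracy_m` key and the 50m cutoff are assumptions of this sketch). It uses a median instead of a plain mean so that a single wild fix does not drag the result.

```python
from statistics import median

def smooth_fix(samples, max_accuracy_m=50.0):
    """Combine a few seconds of GPS samples into one coordinate.

    samples: list of {'lat', 'lng', 'accuracy_m'} dicts.
    Returns None when no sample is good enough to use.
    """
    usable = [s for s in samples if s['accuracy_m'] <= max_accuracy_m]
    if not usable:
        return None
    return {
        'lat': median(s['lat'] for s in usable),
        'lng': median(s['lng'] for s in usable),
        'n': len(usable),
    }

# Hypothetical samples: three tight fixes and one poor one.
samples = [
    {'lat': 38.8977, 'lng': -77.0365, 'accuracy_m': 6.0},
    {'lat': 38.8978, 'lng': -77.0366, 'accuracy_m': 8.0},
    {'lat': 38.8976, 'lng': -77.0364, 'accuracy_m': 5.0},
    {'lat': 38.9100, 'lng': -77.0500, 'accuracy_m': 120.0},
]
fix = smooth_fix(samples)
print(fix)
```

Taking the median of longitudes breaks for samples that straddle the antimeridian (±180°); that case is ignored here.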

Q3: How do I batch-validate reverse-geocoded results?

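def validate_batch(rows):
    """rows: [{lat, lng, expected_address}, ...]"""
    correct = 0
    distances = []
    for row in rows:
        # reverse_geocode: your client wrapper; assumed to return the first
        # result dict, or None when the API returned no results.
        result = reverse_geocode(row['lat'], row['lng'])
        if not result:
            continue  # counts against the match rate, adds no distance
        # Exact string comparison: normalize case and abbreviations first if
        # your ground truth is formatted differently from the API output.
        if result['formatted_address'] == row['expected_address']:
            correct += 1
        snap = haversine_m(
            row['lat'], row['lng'],
            result['location']['lat'], result['location']['lng'],
        )
        distances.append(snap)
    if not distances:
        # Covers an empty batch and a batch with zero matches.
        print(f'Match rate: 0/{len(rows)}; no snap distances to report')
        return
    distances.sort()
    p50 = distances[len(distances)//2]
    p99 = distances[int(len(distances)*0.99)]
    print(f'Match rate: {correct}/{len(rows)} ({100*correct/len(rows):.1f}%)')
    print(f'p50 snap: {p50:.1f}m, p99 snap: {p99:.1f}m')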

Run this monthly with a curated sample of 500 ground-truth (coord, address) pairs. Track the trend. A monthly drop in match rate or growth in p99 snap distance is your early warning that something regressed.
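
The trend check itself can be very small. The sketch below compares each month with the previous one and flags a drop in match rate or growth in p99 snap distance. The thresholds (2 percentage points, 25% growth), the record shape, and the sample numbers are all made up for illustration.

```python
def regression_alerts(history, max_rate_drop=0.02, max_p99_growth=1.25):
    """history: oldest-first list of {'month', 'match_rate', 'p99_snap_m'}."""
    alerts = []
    for prev, cur in zip(history, history[1:]):
        # Match rate fell by more than the tolerated amount.
        if prev['match_rate'] - cur['match_rate'] > max_rate_drop:
            alerts.append((cur['month'], 'match_rate_drop'))
        # p99 snap distance grew by more than the tolerated factor.
        if prev['p99_snap_m'] > 0 and \
                cur['p99_snap_m'] / prev['p99_snap_m'] > max_p99_growth:
            alerts.append((cur['month'], 'p99_growth'))
    return alerts

# Made-up monthly results, for illustration only.
history = [
    {'month': '2026-03', 'match_rate': 0.91, 'p99_snap_m': 140.0},
    {'month': '2026-04', 'match_rate': 0.90, 'p99_snap_m': 150.0},
    {'month': '2026-05', 'match_rate': 0.85, 'p99_snap_m': 210.0},
]
print(regression_alerts(history))
```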

Common mistakes

Treating "no match" as an error

A reverse geocoder returning results: [] is a *correct* response — there's genuinely no address near that coord (or your coverage is thin). Treating empty results as a 200 OK and writing "unknown" downstream is fine; treating it as an error and retrying 5 times burns API quota for no benefit.
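
In code, the distinction looks roughly like this: retry only on transient failures, and return an empty result immediately. `TransientError` and the `fetch` callable are hypothetical stand-ins for whatever your HTTP client raises and calls; the response shape follows the example earlier in this post.

```python
import time

class TransientError(Exception):
    """Hypothetical: raised by your client on timeouts or 5xx responses."""

def reverse_with_retry(fetch, lat, lng, retries=3, backoff_s=0.5,
                       sleep=time.sleep):
    """Call fetch(lat, lng); retry transient failures, never empty results."""
    for attempt in range(retries + 1):
        try:
            response = fetch(lat, lng)
        except TransientError:
            if attempt == retries:
                raise  # out of retries: surface the failure
            sleep(backoff_s * 2 ** attempt)  # exponential backoff
            continue
        results = response.get('results', [])
        # An empty list is a valid answer: return None without retrying.
        return results[0] if results else None

# Usage with a stub that always returns an empty result set.
calls = []
def empty_fetch(lat, lng):
    calls.append((lat, lng))
    return {'results': []}

print(reverse_with_retry(empty_fetch, 0.0, 0.0), len(calls))
```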

Using snap distance as the only quality signal

Snap distance only tells you how far the input was from the result coord. It doesn't tell you whether the result is the *correct* answer for that input. A 5m snap to the wrong neighbor's house counts as "rooftop" but is wrong. The accuracy_score and accuracy field together are richer signals — use both.

Aggregating across very different geographies

A pipeline serving global users will have rooftop hits in NYC and centroid-only results in rural Bolivia, both flowing through the same logic. The "average snap distance" across both is meaningless. Bucket by country (or even by region) before computing accuracy stats.
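
A minimal sketch of the per-country grouping, assuming you already have (country code, snap distance) pairs from your logs. The function name and the sample values are illustrative only.

```python
from collections import defaultdict
from statistics import median

def snap_stats_by_country(rows):
    """rows: iterable of (country_code, snap_m) pairs."""
    by_country = defaultdict(list)
    for country, snap_m in rows:
        by_country[country].append(snap_m)
    return {
        country: {'n': len(snaps), 'median_snap_m': median(snaps)}
        for country, snaps in by_country.items()
    }

# Illustrative values, not real measurements.
rows = [('US', 4.0), ('US', 6.0), ('US', 9.0), ('BO', 180.0), ('BO', 420.0)]
stats = snap_stats_by_country(rows)
print(stats)
```

With these made-up rows the overall median would be 9m, which hides the fact that every Bolivian result is a centroid-level match.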

Trusting reverse geocoding for legal addresses

If your pipeline is generating mailing addresses or legal-document addresses from coords, you need a human in the loop on low-confidence results. Even rooftop snaps can be off by one unit number; an Apt 3B may end up labeled Apt 3A in dense buildings. Reverse geocoding is a *suggestion*, not a source of truth for legal contexts.

What csv2geo returns

For reference, a complete csv2geo /v1/reverse response with all the fields you'd need to make accuracy decisions:

{
  "query": { "lat": 38.8977, "lng": -77.0365 },
  "results": [{
    "formatted_address": "1600 PENNSYLVANIA Avenue Northwest, Washington, DC 20500, US",
    "location": { "lat": 38.8976750934991, "lng": -77.0365468377218 },
    "accuracy": "rooftop",
    "accuracy_score": 1.0,
    "components": {
      "house_number": "1600",
      "street": "PENNSYLVANIA Avenue Northwest",
      "city": "Washington",
      "state": "DC",
      "postcode": "20500",
      "country": "US"
    }
  }],
  "meta": { "version": "1.0.0", "timestamp": "2026-05-19T12:00:00Z" }
}

The four fields that matter for accuracy:

  • location — for computing snap distance vs the input
  • accuracy — the bucket
  • accuracy_score — the confidence within that bucket
  • components.house_number — non-empty if the geocoder confidently identifies a building; empty/null if it only knows the street

A small classifier over snap distance (computed from location) and accuracy is a reasonable starting point:

def quality(input_lat, input_lng, result):
    if not result:
        return 'no_match'
    snap = haversine_m(input_lat, input_lng,
                       result['location']['lat'], result['location']['lng'])
    if result.get('accuracy') == 'rooftop' and snap < 15:
        return 'gold'
    if result.get('accuracy') in ('rooftop', 'range_interpolation') and snap < 50:
        return 'silver'
    if snap < 500:
        return 'bronze'
    return 'unreliable'

Tag every reverse-geocoded row with this quality label and let downstream code make per-quality decisions. Gold gets used as-is. Silver gets soft-confirmed (e.g., shown to user with "Is this right?"). Bronze gets aggregated only. Unreliable gets dropped.
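
The classifier above reads only location and accuracy. If you also want accuracy_score and components.house_number to count, one option is a second pass that downgrades a gold label when either disagrees. This is a sketch: `refine_quality` and the 0.9 cutoff are assumptions, with the cutoff borrowed from the rooftop score range in the accuracy table.

```python
def refine_quality(label, result, min_gold_score=0.9):
    """Downgrade 'gold' to 'silver' when score or house number is weak."""
    if label != 'gold':
        return label
    score = result.get('accuracy_score') or 0.0
    house_number = (result.get('components') or {}).get('house_number')
    if score < min_gold_score or not house_number:
        return 'silver'
    return 'gold'

result = {
    'accuracy': 'rooftop',
    'accuracy_score': 1.0,
    'components': {'house_number': '1600'},
}
print(refine_quality('gold', result))                                  # gold
print(refine_quality('gold', {**result, 'components': {}}))           # silver
```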

FAQ

What's an acceptable accuracy in meters for reverse geocoding?

It depends entirely on what you do with the result. Last-mile delivery wants <15m (gold-tier in our schema). Marketing/territory analysis is fine with 50–200m. Country-level aggregation tolerates 5 km+. The right question isn't "what's good" — it's "how much error does my downstream system absorb before someone notices?"

Does the confidence score correlate with the actual distance error?

Loosely. A 0.95-confidence reverse-geocode is more likely to be within 10m than a 0.7 one — but confidence is computed from how cleanly the input matched, not from haversine distance to the returned coordinate. If real-world precision matters, compute the snap distance (haversine from input lat/lng to the returned address's lat/lng) alongside the confidence score, not instead of it.

Is rooftop always more accurate than interpolated?

No. Rooftop means "the building footprint was in the dataset" — but the dataset itself might be five years stale, list a building that was demolished, or have a wrong centroid. Interpolated results computed from accurate street centerline geometry are often within 20m of the true address. The accuracy field tells you the *type* of match, not the absolute quality of the geometry.

How do I measure reverse-geocoding error without ground truth?

You can't measure absolute error, but you can measure consistency. Send the same coordinate to two independent providers (or to the same provider at two timestamps) and measure the haversine distance between their returned coordinates. If two providers agree within ~30m, the result is probably solid. If they disagree by 200m+, treat it as unreliable regardless of either provider's confidence score.
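
A sketch of that consistency check, using the ~30m and 200m thresholds from the answer above. The two coordinates would come from whichever providers you query; the `agreement` helper and the "uncertain" middle band are assumptions of this sketch.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters."""
    R = 6_371_000  # Earth radius in meters
    lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
    dlat, dlng = lat2 - lat1, lng2 - lng1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlng / 2) ** 2
    return 2 * R * asin(sqrt(a))

def agreement(coord_a, coord_b, agree_m=30, disagree_m=200):
    """Compare two providers' returned coordinates for the same input."""
    d = haversine_m(coord_a['lat'], coord_a['lng'],
                    coord_b['lat'], coord_b['lng'])
    if d <= agree_m:
        return 'agree', d
    if d >= disagree_m:
        return 'disagree', d
    return 'uncertain', d

# Two hypothetical provider results roughly 111 m apart in latitude.
verdict, dist = agreement({'lat': 38.8977, 'lng': -77.0365},
                          {'lat': 38.8987, 'lng': -77.0365})
print(verdict, round(dist))
```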

Why does the same coordinate return different addresses across providers?

Each provider draws from a different source dataset (OSM, Overture, HERE, government parcels) with different vintages, different building footprint accuracy, and different "nearest match" tiebreaking rules. The same lat/lng can sit equidistant between two valid addresses, and providers may pick differently. This is a feature, not a bug — but it means single-provider reverse geocoding without consistency checking is a single point of failure.

Summary

Three rules:

  1. Always compute snap distance. It's a 5-line haversine and tells you most of what you need to know about a single result's quality.
  2. Don't treat all results as equal. A rooftop with snap <10m is qualitatively different from a geometric_center with snap 200m. Encode that downstream.
  3. Track accuracy distribution over time. A monthly trend report on the four buckets catches coverage regressions and input-noise increases before they cost you.

Reverse geocoding is rarely "wrong" — it's "approximately right at varying scales." Treat the accuracy signal as a first-class part of the result, not metadata to ignore. Pair with observability metrics so trends are visible, and with confidence scores so each call is interpretable on its own.
