Bulk site risk scoring across thousands of addresses at once

Score thousands of site addresses for risk in one batch pipeline. Geocode, enrich with elevation, then rank by exposure — no GIS team required.

June 22, 2026

Risk teams run the same workflow quarter after quarter: someone exports a CSV of site addresses, someone else geocodes it manually or in a clunky desktop GIS, a third person joins in elevation data from a separate source, and a fourth writes the final score formula in a spreadsheet that nobody else understands. The whole exercise takes a week and produces a result that is already stale by the time it lands in the underwriting or portfolio review meeting.

This post replaces that workflow with a single batch pipeline. In roughly 200 lines of Python, you will geocode thousands of addresses, enrich each one with a verified ground elevation, derive a terrain-risk class, and emit a scored, ranked CSV ready for review. The same pattern works in Node. The whole job runs in minutes, not days, and the code is trivially re-runnable when the address list changes next quarter.

No GIS licences. No spatial database. No specialist on the team. One API key.

Why batch geocoding is the right primitive for risk

The temptation in risk assessment is to reach for a heavyweight spatial analysis platform. You end up with a six-month integration, a GIS vendor dependency, and a pipeline that only one engineer on the team can debug. That is the wrong starting point for a workflow whose core question is "which of these ten thousand addresses sits in an elevated, exposed, or low-lying position?"

The right primitive is a geocoding API that returns coordinates plus enrichment fields in a single call, at batch scale, against a global address database. CSV2GEO covers 461 million addresses across 39 countries. A single POST /api/v1/batch/geocode call accepts up to 100 addresses at once. Chain a hundred of those together and a ten-thousand-row portfolio runs in under five minutes on a standard connection.

Three fields drive the bulk of the risk signal available from a pure-address pipeline:

Coordinates — every other field is downstream of getting lat/lng right. A wrong geocode is an invisible error that corrupts every score on that row silently.
Ground elevation — the primary terrain signal for flood, wind-exposure, and hail risk. Globally consistent, queryable from the same key.
Confidence score — the geocoder's own assessment of how certain it is. A low-confidence geocode on a high-value site is itself a risk signal: if you do not know where the site actually is, neither does the model.

This post covers all three and how to compose them into a production-grade batch job.

What the API surface looks like

CSV2GEO exposes 56 endpoints. The three that matter for this pipeline:

`GET /api/v1/geocode`: single-address geocoding. Accepts a free-text string via q or structured fields. Returns lat, lng, confidence, formatted address, and optional enrichment fields. Use this in the streaming/interactive path.

`POST /api/v1/batch/geocode`: batched geocoding. Accepts an array of address strings in a single POST body, up to 100 per call. Returns the same fields per address, preserving input order. Use this in the bulk-job path.

`GET /api/v1/elevation`: elevation lookup. Accepts up to 500 lat,lng points per call, semicolon-separated. Returns a height in metres per point, in input order. Use this after geocoding to enrich each row.

For the risk-scoring pipeline, the call sequence per batch of rows is:

POST 100 addresses → /api/v1/batch/geocode → get lat/lng/confidence per row
GET with up to 500 coordinates → /api/v1/elevation → get elevation_m per row
Apply your scoring function locally — no more API calls needed

The elevation call can batch up to 5× more points than the geocoding call, so a natural pipeline rhythm is: geocode 100 addresses, accumulate coordinates, flush to elevation when you have 500. In practice, the difference matters less than keeping the retry logic clean. We will do it simply: geocode in batches of 100, then elevation in batches of 500, as two sequential passes over the data.
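To make the batch arithmetic concrete, here is a minimal sketch that counts the HTTP calls each pass needs for a given row count. The helper name plan_calls is illustrative and not part of any CSV2GEO client. It assumes every row geocodes successfully, so the elevation figure is an upper bound.

```python
import math

GEOCODE_BATCH = 100    # addresses per batch geocode call (limit quoted above)
ELEVATION_BATCH = 500  # points per elevation call (limit quoted above)

def plan_calls(n_rows: int) -> dict:
    """Return the number of HTTP calls each pass needs for n_rows addresses."""
    geocode_calls = math.ceil(n_rows / GEOCODE_BATCH)
    # Upper bound: rows that fail geocoding never reach the elevation pass.
    elevation_calls = math.ceil(n_rows / ELEVATION_BATCH)
    return {
        "geocode_calls": geocode_calls,
        "elevation_calls": elevation_calls,
        "total_calls": geocode_calls + elevation_calls,
    }

print(plan_calls(10_000))
```

For a 10,000-row portfolio this gives 100 geocoding calls and 20 elevation calls.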

The pipeline, end to end

Step 1: Structure your input

The pipeline expects a CSV with at minimum an id column and an address column. Everything else is preserved through the pipeline unchanged. A minimal example:

id,address,portfolio,insured_value
SITE-001,"1 Bay Street, Toronto, ON M5H 2Y4",commercial,12400000
SITE-002,"400 W 15th St, Austin TX 78701",commercial,8750000
SITE-003,"300 Alton Rd, Miami Beach FL 33139",residential,4100000

The id column is your join key throughout. Every intermediate file and the final output is keyed by id. If the source system does not emit a stable identifier, generate one from a SHA-256 hash of the raw address string. That is idempotent enough for a quarterly re-run, and is discussed further in Idempotent Geocoding — Safe to Retry.
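A minimal sketch of the hashed identifier, using only the standard library. The function name stable_site_id, the SITE- prefix, and the truncation to 16 hex characters are assumptions made for readability. Keep the full digest if collisions are a concern for your portfolio size.

```python
import hashlib

def stable_site_id(raw_address: str, prefix: str = "SITE-") -> str:
    """Derive a deterministic id from the raw address string.

    The same input always yields the same id, so a re-run over an
    unchanged address list produces the same join keys.
    """
    digest = hashlib.sha256(raw_address.encode("utf-8")).hexdigest()
    # Truncation is an illustrative choice, not a requirement.
    return f"{prefix}{digest[:16]}"

print(stable_site_id("300 Alton Rd, Miami Beach FL 33139"))
```

Because the hash is taken over the raw string, any edit to the address (even whitespace) produces a new id. Whether that is desirable depends on how corrections should be tracked.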

Step 2: Batch geocode the address list

The core geocoding loop. Note requests.Session() — reusing a connection pool across hundreds of calls matters.

import csv
import os
import time
import requests

API = "https://csv2geo.com/api/v1"
KEY = os.environ["CSV2GEO_API_KEY"]
GEOCODE_BATCH = 100
RETRY_WAIT = 2  # seconds base for exponential backoff

session = requests.Session()

def geocode_batch(addresses: list[str]) -> list[dict]:
    """POST up to 100 addresses; return one result dict per address."""
    payload = {"addresses": addresses, "api_key": KEY}
    for attempt in range(4):
        r = session.post(f"{API}/batch/geocode", json=payload, timeout=30)
        if r.status_code == 429:
            time.sleep(RETRY_WAIT * (2 ** attempt))
            continue
        r.raise_for_status()
        return r.json()["results"]
    raise RuntimeError("geocode_batch: exhausted retries")

def chunks(seq, n):
    for i in range(0, len(seq), n):
        yield seq[i:i+n]

with open("sites.csv") as fin:
    reader = csv.DictReader(fin)
    rows = list(reader)

geocoded = []
for batch in chunks(rows, GEOCODE_BATCH):
    addrs = [r["address"] for r in batch]
    results = geocode_batch(addrs)
    for row, geo in zip(batch, results):
        row["lat"] = geo.get("lat")
        row["lng"] = geo.get("lng")
        row["confidence"] = geo.get("confidence")
        row["formatted_address"] = geo.get("formatted_address")
    geocoded.extend(batch)
    time.sleep(0.1)  # gentle pacing; adjust to your rate limit tier

The backoff on 429 is intentionally simple here. For a fuller treatment of when to retry versus when to surface the failure to the operator, see Exponential Backoff — When to Retry, When to Stop.

After this step, every row in geocoded either has a lat/lng pair or has None in both fields, indicating a geocoding failure. Flag the None rows immediately:

failed_geocode = [r for r in geocoded if r["lat"] is None]
succeeded = [r for r in geocoded if r["lat"] is not None]
print(f"Geocoded: {len(succeeded)} / {len(geocoded)} — failures: {len(failed_geocode)}")

Failures at geocoding are risk signals in themselves. An address that cannot be geocoded at reasonable confidence almost certainly has a data-quality problem at source: a PO Box where a street address should be, a mistyped street name, a suite number that was pasted into the street field. Write the failures to a separate sites_failed.csv for the risk team to review manually. Do not silently drop them.

Step 3: Enrich with ground elevation

Now the elevation pass. Five hundred points per call means a 10,000-site portfolio requires only 20 elevation API calls:

ELEVATION_BATCH = 500

def fetch_elevations(coords: list[tuple]) -> list[float | None]:
    """coords: list of (lat, lng). Returns elevation_m per point."""
    pts = ";".join(f"{lat},{lng}" for lat, lng in coords)
    r = session.get(
        f"{API}/elevation",
        params={"points": pts, "api_key": KEY},
        timeout=30,
    )
    r.raise_for_status()
    return [e.get("elevation_m") for e in r.json()["results"]]

coord_pairs = [(float(r["lat"]), float(r["lng"])) for r in succeeded]
elevations = []
for batch in chunks(coord_pairs, ELEVATION_BATCH):
    elevations.extend(fetch_elevations(batch))
    time.sleep(0.05)

for row, ele in zip(succeeded, elevations):
    row["elevation_m"] = ele

The same pattern in Node, if your pipeline is JavaScript or TypeScript:

const API = 'https://csv2geo.com/api/v1';
const KEY = process.env.CSV2GEO_API_KEY;

async function fetchElevations(coords) {
  // coords: [{lat, lng}, ...]
  const pts = coords.map(c => `${c.lat},${c.lng}`).join(';');
  const url = `${API}/elevation?points=${encodeURIComponent(pts)}&api_key=${KEY}`;
  const r = await fetch(url, { signal: AbortSignal.timeout(30_000) });
  if (!r.ok) throw new Error(`elevation HTTP ${r.status}`);
  const body = await r.json();
  return body.results.map(e => e.elevation_m ?? null);
}

The key correctness note: null is a legitimate response for a point where the terrain model has no data (open ocean, missing tile). A real elevation of 0 m is encoded as the integer 0. In Python, branch on e is None; in JavaScript, on e === null. Do not use a falsy check — coastal sites at 0 m are real, and you do not want to flag them as "no data" or silently re-score them.
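A short illustration of the difference, using made-up elevation values. The identity check keeps sea-level sites; the falsy check wrongly discards them.

```python
def has_elevation(ele_m) -> bool:
    """True for any real reading, including 0 m; False only for missing data."""
    return ele_m is not None

# Illustrative values: two sea-level readings, one normal, one missing tile.
elevations = [0, 12.5, None, 0.0]

missing = [i for i, e in enumerate(elevations) if not has_elevation(e)]
falsy = [i for i, e in enumerate(elevations) if not e]  # the bug to avoid

print(missing)
print(falsy)
```

The first list contains only the index of the missing point. The second also sweeps up both 0 m readings.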

As a sanity check after the elevation pass, probe a few known values against published figures:

Denver should return roughly 1,597 m
Miami should return roughly 1 m
Paris should return roughly 46 m
Sydney should return roughly 64 m
Tokyo should return roughly 40 m

If your test addresses come back wildly off, the issue is almost always a lat/lng swap — coordinates fed as lng,lat instead of lat,lng. The elevation model does not lie; the coordinate order is the bug.
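One cheap guard against the swap is a range check before the elevation call. This is a heuristic sketch, and looks_swapped is a hypothetical helper name. It only catches swaps where the longitude falls outside the valid latitude range, so a swapped Paris pair (2.35, 48.86) passes unnoticed.

```python
def looks_swapped(lat: float, lng: float) -> bool:
    """True when lat is impossible as a latitude but the pair is valid once swapped."""
    return 90 < abs(lat) <= 180 and abs(lng) <= 90

# Denver, correct order and swapped order.
print(looks_swapped(39.74, -104.99))
print(looks_swapped(-104.99, 39.74))
```

Sites with longitudes between -90 and 90 need the published-elevation probe instead, since both orderings are valid coordinates.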

Step 4: Derive terrain-risk class

The terrain classifier is local logic. The API gives you the number; your actuarial or risk team owns the thresholds. A reasonable starting schema for a general commercial portfolio:

def terrain_class(ele_m, confidence):
    if ele_m is None:
        return "unclassified"
    if confidence is not None and float(confidence) < 0.7:
        return "low_confidence"  # geocode uncertainty dominates
    if ele_m < 3:
        return "coastal_critical"    # storm surge / tidal flood exposure
    if ele_m < 10:
        return "coastal_elevated"    # flood consideration
    if ele_m < 30:
        return "lowland"             # floodplain potential
    if ele_m > 2000:
        return "high_altitude"       # snow load, wind, access
    if ele_m > 800:
        return "upland"              # moderate wind exposure
    return "standard"

for row in succeeded:
    row["terrain_class"] = terrain_class(
        row.get("elevation_m"),
        row.get("confidence"),
    )

The low_confidence class is worth expanding on. When the geocoder returns a confidence below 0.7, the lat/lng itself is uncertain — perhaps resolved to a street centroid rather than a parcel centroid, or matched to a ZIP code rather than a specific address. Deriving an elevation from a street centroid four blocks from the actual parcel boundary is fine for most sites; for a coastal site it could mean the difference between +2 m and +12 m, which is the difference between "flag for review" and "standard policy". The low_confidence flag forces that site into the manual-review bucket regardless of its elevation number.

See Geocoding Confidence Scores Explained for a deeper treatment of what the score measures, where it saturates, and how to use it as a quality gate in enrichment pipelines.

Step 5: Score, rank, and emit the output

The final scoring step combines terrain class, elevation, and insured value into a sortable risk score. The formula below is illustrative — replace it with whatever your portfolio model specifies:

TERRAIN_WEIGHT = {
    "coastal_critical": 1.0,
    "coastal_elevated": 0.7,
    "lowland":          0.4,
    "high_altitude":    0.35,
    "upland":           0.2,
    "standard":         0.0,
    "low_confidence":   0.5,  # uncertainty penalty
    "unclassified":     0.6,  # treat as elevated risk
}

import math

def risk_score(row):
    terrain = TERRAIN_WEIGHT.get(row.get("terrain_class", "unclassified"), 0.5)
    try:
        value = float(row.get("insured_value", 0))
    except (ValueError, TypeError):
        value = 0.0
    # Simple composite: weighted sum of terrain exposure and log-scaled value
    value_factor = math.log10(max(value, 1)) / 10  # normalised 0-1 for realistic values
    return round((0.6 * terrain) + (0.4 * value_factor), 4)

for row in succeeded:
    row["risk_score"] = risk_score(row)

# Sort descending by risk score for the review queue
succeeded.sort(key=lambda r: r["risk_score"], reverse=True)

# Write output
output_fields = [
    "id", "address", "formatted_address", "lat", "lng",
    "confidence", "elevation_m", "terrain_class", "risk_score",
    "portfolio", "insured_value",
]
with open("sites_scored.csv", "w", newline="") as fout:
    writer = csv.DictWriter(fout, fieldnames=output_fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(succeeded)

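# Also write the failures. Failed rows still carry the formatted_address key
# added in Step 2, so extrasaction="ignore" stops DictWriter raising ValueError.
with open("sites_failed.csv", "w", newline="") as fout:
    writer = csv.DictWriter(
        fout,
        fieldnames=reader.fieldnames + ["lat", "lng", "confidence"],
        extrasaction="ignore",
    )
    writer.writeheader()
    writer.writerows(failed_geocode)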

print(f"Scored {len(succeeded)} sites → sites_scored.csv")
print(f"Failed {len(failed_geocode)} sites → sites_failed.csv")

The top of the ranked output is the review queue. The bottom is the auto-approve bucket. The split threshold is a business decision; the pipeline just gives you the sorted list.

Observability in production

A batch pipeline that runs unattended quarterly needs instrumentation. Three metrics worth logging to whatever APM you already use:

Geocode success rate. Successful geocodes divided by total input rows. A well-formed commercial address list should succeed above 95%. Anything below 90% suggests a source-data quality problem, not an API problem.

Mean confidence of successful geocodes. If this drifts below 0.85 on a consistent run, the address data quality is degrading — a signal to feed back to whoever owns the source system.

Elevation null rate. For a US-focused portfolio, a null elevation rate above 2% is unusual and warrants investigation. For an international portfolio across 39 countries, spot-check the nulls against the known coverage limits of the terrain model.
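The three metrics can be computed from the in-memory rows at the end of a run. The sketch below assumes rows shaped like the ones the pipeline builds (lat, confidence, elevation_m keys). The function name run_metrics is illustrative, and how the numbers reach your APM is left out.

```python
from statistics import mean

def run_metrics(rows: list[dict]) -> dict:
    """Compute success rate, mean confidence, and elevation null rate for one run."""
    total = len(rows)
    located = [r for r in rows if r.get("lat") is not None]
    confidences = [
        float(r["confidence"]) for r in located if r.get("confidence") is not None
    ]
    null_elevation = [r for r in located if r.get("elevation_m") is None]
    return {
        # Successful geocodes over all input rows.
        "geocode_success_rate": len(located) / total if total else 0.0,
        # Mean confidence over successful geocodes only.
        "mean_confidence": mean(confidences) if confidences else None,
        # Null elevations over rows that were actually sent for elevation.
        "elevation_null_rate": len(null_elevation) / len(located) if located else 0.0,
    }

sample = [
    {"lat": 43.65, "confidence": 0.9, "elevation_m": 85},
    {"lat": 30.27, "confidence": 0.8, "elevation_m": 0},
    {"lat": 25.77, "confidence": 1.0, "elevation_m": None},
    {"lat": None, "confidence": None},
]
print(run_metrics(sample))
```

Note that the 0 m reading counts as a valid elevation here, in line with the null-versus-zero rule from Step 3.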

For the full observability treatment — what to log, what to alert on, and how to structure dashboards for a geocoding pipeline — see Observability for Geocoding Pipelines.

Cost arithmetic for a real portfolio

The numbers are straightforward. Each geocoded address costs 1 credit, and each elevation point costs 1 credit, however many are batched into one call. A 10,000-address portfolio:

Geocoding: 10,000 credits (100 batch calls × 100 addresses)
Elevation: 10,000 credits (20 batch calls × 500 points)
Total: 20,000 credits

On the entry paid plan ($54/month for 100,000 calls), a full quarterly run of 10,000 sites costs roughly $10.80 in API spend. A 50,000-site portfolio runs to roughly $54 — the entry tier's entire monthly allotment in one job, which means a 50,000-site shop should be on the next bracket up. The free tier (3,000 calls/day) covers pilot runs of 1,500 addresses per day at 2 credits per row. See the API pricing page for the full bracket table.
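The same arithmetic as a small function, handy for sizing a run before launching it. The prices and limits are the figures quoted above. The names are illustrative, and the calculation assumes a flat per-credit rate within the entry plan.

```python
PLAN_PRICE_USD = 54.0      # entry paid plan, per month
PLAN_CREDITS = 100_000     # credits included in that plan
CREDITS_PER_SITE = 2       # 1 geocode + 1 elevation point per address
FREE_TIER_DAILY = 3_000    # free-tier credits per day

def run_cost(n_sites: int) -> tuple[int, float]:
    """Return (credits used, approximate USD) for scoring n_sites once."""
    credits = n_sites * CREDITS_PER_SITE
    usd = credits * PLAN_PRICE_USD / PLAN_CREDITS
    return credits, round(usd, 2)

print(run_cost(10_000))
print(run_cost(50_000))
print(FREE_TIER_DAILY // CREDITS_PER_SITE)  # addresses per day on the free tier
```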

One thing that meaningfully reduces the cost on repeated runs: caching. Addresses do not move; elevation does not change on the timescales of a quarterly review cycle. A Redis cache keyed on the normalised address string will absorb the overwhelming majority of geocoding credits on the second and subsequent runs. Caching Geocoding Results — 90% Cost Reduction covers the exact pattern in production detail.
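A minimal sketch of the cache-first lookup. A plain dict stands in for Redis here, and the normalisation (lowercase, collapsed whitespace) is a deliberately simple assumption. The fetch argument is any callable with the same shape as geocode_batch from Step 2, and the caller is still responsible for chunking to 100 addresses.

```python
import re

def normalise(address: str) -> str:
    """Collapse whitespace and lowercase so trivial variants share a cache key."""
    return re.sub(r"\s+", " ", address).strip().lower()

def geocode_with_cache(addresses, cache, fetch):
    """Return one result per address, calling fetch only for cache misses."""
    keys = [normalise(a) for a in addresses]
    misses, seen = [], set()
    for key, address in zip(keys, addresses):
        if key not in cache and key not in seen:
            seen.add(key)
            misses.append((key, address))
    if misses:
        # fetch must return results in the same order as its input.
        results = fetch([address for _, address in misses])
        for (key, _), result in zip(misses, results):
            cache[key] = result
    return [cache[key] for key in keys]

# Stand-in for the real API call, recording what was actually requested.
calls = []
def fake_fetch(addrs):
    calls.append(list(addrs))
    return [{"lat": 1.0, "lng": 2.0, "query": a} for a in addrs]

cache = {}
out = geocode_with_cache(
    ["1 Bay Street", " 1  bay street ", "400 W 15th St"], cache, fake_fetch
)
print(len(out), calls)
```

Only two of the three inputs reach the API, and a second run over the same list reaches it zero times. Whether failed geocodes should be cached is a policy choice this sketch does not make.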

Failure modes and how to handle them

Three failure patterns that trip up teams who treat this as a happy-path-only problem.

Partial batch failure on geocoding. The batch endpoint returns a result for every input address, but some results will have lat: null when the address cannot be resolved. These are not HTTP errors — they are valid responses. A team that checks only for 4xx/5xx and assumes all rows succeeded will silently pass null coordinates downstream, producing a null elevation and an "unclassified" terrain class that gets treated as a risk flag rather than a data-quality flag. Check each row's lat/lng after the batch call, not just the HTTP status.

Rate limiting during a large run. If you push a 100,000-row portfolio through the geocoding endpoint without pacing, you will hit the rate limiter and start seeing 429 responses. The retry-with-backoff in Step 2 handles the occasional 429; it does not handle sustained 429s from running 1,000 concurrent requests. The right approach is to cap concurrency to a level the API will accept cleanly for your tier, and to pipeline the work across multiple runs if the volume exceeds the daily call budget. See Concurrency Tuning for Geocoding — the Sweet Spot for the measurement approach.

Address format mismatch across countries. A portfolio that spans multiple countries will contain address strings formatted for their home country's conventions — postcode placement, prefecture notation, suburb-before-city ordering, and so on. The geocoder handles this across 39 countries, but the input format has to be parseable. Feeding a Japanese address formatted for a US parser will produce a low-confidence geocode, not a geocoding error. The fix is to normalise the address format per country before the batch call, or to pass structured fields (street, city, country) rather than a single free-text string for international addresses. See Geocoding Addresses in 200+ Countries for the per-country formatting guide.
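For the rate-limiting case above, one way to cap concurrency is a bounded thread pool from the standard library. This is a sketch under assumptions: run_capped is a hypothetical helper, the default of 4 workers is a placeholder to tune against your tier, and worker stands in for a function such as geocode_batch.

```python
from concurrent.futures import ThreadPoolExecutor

def run_capped(batches, worker, max_workers: int = 4):
    """Apply worker to each batch with at most max_workers calls in flight.

    Executor.map returns results in input order, so rows can still be
    zipped back against their source batches.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, batches))

# Stand-in worker: returns the batch size instead of calling the API.
print(run_capped([["a", "b"], ["c"], ["d", "e", "f"]], len, max_workers=2))
```

If a worker raises, the exception surfaces when its result is collected, so the retry-with-backoff from Step 2 should stay inside the worker.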

What the output enables

A scored, ranked, elevation-enriched CSV is the input to several downstream risk workflows that risk teams typically assemble separately:

Portfolio triage. The top decile by risk score goes to the senior underwriting team for manual review. The bottom quartile auto-renews. The middle gets a rules-based review trigger. None of this requires anything beyond WHERE risk_score > X.

Renewal pricing signals. If a site's terrain class changed since last quarter — because the address data was corrected and the geocode resolved to a different point — that is a signal for a re-rating conversation. The pipeline produces a consistent, versioned record that can be diffed quarter-over-quarter.

Concentration risk. Group the scored output by a bounding-box grid and sum insured values per cell. Cells with high total insured value and high average terrain risk are concentration hotspots. A pivot table on terrain_class and portfolio gives the same signal in three clicks. The pipeline does not draw the maps — you do that in whatever BI tool the risk team already uses — but it produces the data in a form that every BI tool can ingest.

Input to a more sophisticated model. The four fields this pipeline adds (lat, lng, elevation, and confidence) are the cheap, reliable feature layer that better models sit on top of. A gradient-boosted loss model that adds weather history, crime indices, and construction type still needs clean coordinates and verified elevation. This pipeline is the foundation layer; it is not the whole edifice.
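Two of the workflows above, triage and concentration, reduce to a few lines over the scored rows. The sketch below is illustrative only. The bucket names, the 10% and 25% shares, and the 0.1-degree grid cell (roughly 11 km north to south) are assumptions to replace with your own policy.

```python
import math
from collections import defaultdict

def triage(rows, top_share=0.10, bottom_share=0.25):
    """Label each row manual_review, rules_review, or auto_renew by rank."""
    ranked = sorted(rows, key=lambda r: r["risk_score"], reverse=True)
    n = len(ranked)
    top_cut = math.ceil(n * top_share)
    bottom_cut = n - math.floor(n * bottom_share)
    labelled = []
    for i, row in enumerate(ranked):
        if i < top_cut:
            bucket = "manual_review"
        elif i >= bottom_cut:
            bucket = "auto_renew"
        else:
            bucket = "rules_review"
        labelled.append({**row, "bucket": bucket})  # copy; input rows untouched
    return labelled

def concentration(rows, cell_deg=0.1):
    """Sum insured value and average risk score per lat/lng grid cell."""
    cells = defaultdict(lambda: {"sites": 0, "insured_value": 0.0, "risk_sum": 0.0})
    for r in rows:
        key = (
            math.floor(float(r["lat"]) / cell_deg),
            math.floor(float(r["lng"]) / cell_deg),
        )
        cell = cells[key]
        cell["sites"] += 1
        cell["insured_value"] += float(r["insured_value"])
        cell["risk_sum"] += float(r["risk_score"])
    return {
        key: {
            "sites": c["sites"],
            "insured_value": c["insured_value"],
            "avg_risk": c["risk_sum"] / c["sites"],
        }
        for key, c in cells.items()
    }
```

Sorting the concentration output by insured_value and filtering on avg_risk gives the hotspot list. The same grouping can equally be done in a BI tool once the CSV is loaded.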

Frequently Asked Questions

How many addresses can I process in a single job?

There is no upper limit on job size — it is your loop, not ours. The geocoding batch endpoint takes up to 100 addresses per call; the elevation endpoint takes up to 500 points per call. A 1,000,000-row book is 10,000 geocoding calls and 2,000 elevation calls. Whether you run that in one overnight job or spread it across a week depends on your call-rate tier and your operational preference. The free tier gives you 3,000 calls per day; paid tiers start at 100,000 calls per month for $54.

What do I do with addresses that fail geocoding?

Write them to a separate failure CSV immediately. Do not silently drop them or let them propagate with null coordinates. The failure CSV is a data-quality work item for whoever owns the source address list, and it is a risk audit item — if you cannot locate the site, you cannot properly rate it.

The confidence score is below 0.7 for some addresses. Should I still use the elevation?

Use it cautiously. A low-confidence geocode means the coordinate is uncertain — possibly resolved to a street centroid or a postcode centroid rather than the parcel itself. For a site in the middle of a continent at high elevation, a 500-metre coordinate error changes nothing. For a coastal site at +2 m where +2 m versus +12 m changes the flood-zone interpretation, the uncertainty dominates the signal. Flag those rows for manual review rather than auto-scoring them.

Does the elevation API cover all 39 countries in the address database?

The elevation model is global and covers all populated landmass. The 39 countries figure refers to the address geocoding coverage. A site in a country outside those 39 may still geocode successfully via free-text parsing, but confidence will be lower. The elevation lookup itself has no country restriction — it covers any lat/lng on Earth, including probe points like Mauna Kea at 4,198 m or the Dead Sea shore at −415 m.

Should I refresh the scores every quarter or every year?

Quarterly is reasonable for a commercial portfolio where sites change hands, policies lapse, and new sites are added. Elevation and terrain class do not change on any human timescale — the score change on a quarterly refresh will come from new sites in the portfolio, corrected addresses, and changes to your scoring formula, not from the underlying terrain shifting. Cache the elevation results aggressively; re-geocode only new or corrected addresses.

Can I run this on a serverless function or does it need a long-running process?

Either works. A serverless function with a 15-minute execution limit can process 5,000–10,000 addresses per invocation at a relaxed pace. For larger books, the natural split is one function per batch of 1,000 addresses, orchestrated by a step function or queue. The pipeline is stateless between batches: each batch reads from the source CSV and writes to the output, so re-running a failed batch is safe. See Idempotent Geocoding — Safe to Retry for the key-design pattern that makes batch retries safe.