Trade area isochrones for new store screening at scale

Screen candidate retail sites with drive-time isochrones and POI counts. REST patterns in Python and Node, cost math, and failure modes.

July 03, 2026

Site selection is one of the more expensive decisions a retail business makes. A bad site choice is not obviously wrong on day one — it shows up two years later in a lease you cannot exit, a store that does the same revenue as the one you opened ten kilometres away for a third of the rent. The canonical defence is a trade area analysis: define a polygon around each candidate site that represents "the area from which this store would realistically draw customers," count what is inside that polygon, and compare candidates on the same basis.

In practice most teams do this in one of two ways. Either someone drives the candidate neighbourhoods and forms a gut impression — which scales to perhaps five serious candidates before it becomes theatre — or an analyst spends three weeks pulling census geographies, clipping them to a hand-drawn circle, and joining them to a third-party POI dataset with a vendor licence that costs more than the analyst's time. Neither approach is replicable at 200 candidates, and neither is scriptable enough to re-run when the shortlist changes.

This post shows a third path. Two REST endpoints, one for drive-time isochrones and one for points of interest, combine with your own candidate list to produce a ranked, replicable screen in an afternoon. The geometry is computed server-side. The polygon comparison is done client-side in about 40 lines of Python. The whole thing costs less per candidate than a tank of fuel for the site visit it replaces.

What a trade area actually is and is not

Before the code, a definition worth agreeing on.

A trade area is the geographic zone from which a retail location draws the majority of its customers. "Majority" is usually taken to mean the area that accounts for roughly 60 to 80 percent of expected customers: not every customer, just the realistic catchment. The boundary is not a circle. A 10-minute drive covers a much larger radius on a motorway than it does in a dense urban grid, and it covers zero land on the other side of a river with no bridge within ten kilometres. The boundary is an isochrone: the set of all points reachable from the candidate site within a given travel time by a given mode.

What a trade area is not is a demographic data product. The polygon tells you the geography. What is inside that geography — population, household income, daytime worker count, competitor density — comes from separate data layers that you join to the polygon. This post uses the Places API to count POIs inside the isochrone. Demographic joins are your own data problem; nothing in this pipeline invents or sells census figures that do not exist.

Understanding this distinction matters before you build anything. If your site-selection model requires demographic data, you need a demographic data source — and you need to join it to the polygon yourself. What the isochrone endpoint gives you is the correct polygon for that join. That is the right separation of concerns.

The two endpoints

`GET /api/v1/isochrone` takes a coordinate, a travel time in minutes, and a travel mode (driving, walking, cycling) and returns a GeoJSON polygon representing the reachable area. Multiple contour times can be requested in one call (e.g. 5, 10, 15 minutes) and each returns a separate polygon. The response is a standard GeoJSON FeatureCollection, so it drops into any geospatial library without conversion.

`GET /api/v1/places/nearby` — takes a coordinate, a radius in metres, a category filter, and a limit, and returns the POIs within that radius. The important detail for this use case: the radius is Euclidean (point-and-radius), not polygon-aware. You do not pass the isochrone polygon to the Places API. Instead, you query Places with the largest radius that meaningfully bounds the isochrone, then do the polygon containment check client-side. This is the right architecture because it keeps the API simple and keeps your filtering logic in code you own and can modify.

CSV2GEO exposes 56 endpoints in total. The geocoding, isochrone, and Places endpoints are part of the same API key and billing account — there is no separate credential or contract for spatial queries.

The pipeline, step by step

A candidate list arrives as a CSV: site_id, address, city, state. By the end of the pipeline each row carries an isochrone polygon, a POI count by category, and a composite score that ranks the candidates. Five distinct steps.

Step 1: Geocode every candidate address

The isochrone endpoint takes coordinates, not addresses. Geocode the entire candidate list first so you have lat, lng on every row. Batch geocoding keeps the credit count manageable.

import csv, os, time
import requests

API = "https://csv2geo.com/api/v1"
KEY = os.environ["CSV2GEO_API_KEY"]

def geocode(address, city, state):
    q = f"{address}, {city}, {state}"
    r = requests.get(
        f"{API}/geocode",
        params={"q": q, "api_key": KEY},
        timeout=15,
    )
    r.raise_for_status()
    results = r.json().get("results", [])
    if not results:
        return None, None, None
    top = results[0]
    return top["lat"], top["lng"], top.get("confidence", 0)

candidates = []
# newline="" is what the csv module expects; set encoding explicitly
with open("candidates.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        lat, lng, conf = geocode(row["address"], row["city"], row["state"])
        row.update({"lat": lat, "lng": lng, "confidence": conf})
        candidates.append(row)
        time.sleep(0.05)  # gentle rate pacing; tune per your plan

Low-confidence results — anything below roughly 0.7 — are worth flagging for manual address verification before you spend an isochrone call on them. A mis-geocoded address produces an isochrone for the wrong location, and that error propagates silently through the rest of the pipeline. See Geocoding Confidence Scores Explained for how to interpret the confidence field and where the threshold should sit for your use case.
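The gate described above can be sketched in a few lines. This is a minimal illustration, not part of the original pipeline: the function name `split_by_confidence` and the sample rows are hypothetical, and the 0.7 cut-off is simply the rough figure quoted in the text.

```python
MIN_CONFIDENCE = 0.7  # rough threshold from the text; tune for your own data

def split_by_confidence(rows, min_confidence=MIN_CONFIDENCE):
    """Separate rows that are safe to send to the isochrone step from rows
    that need a human to check the address first."""
    ready, review = [], []
    for row in rows:
        conf = row.get("confidence")
        # Rows with no geocode at all (lat is None) also go to review.
        if row.get("lat") is None or conf is None or conf < min_confidence:
            review.append(row)
        else:
            ready.append(row)
    return ready, review

# Hypothetical sample rows in the shape produced by the geocoding loop.
sample = [
    {"site_id": "A", "lat": 51.5, "lng": -0.1, "confidence": 0.93},
    {"site_id": "B", "lat": 51.6, "lng": -0.2, "confidence": 0.55},
    {"site_id": "C", "lat": None, "lng": None, "confidence": None},
]
ready, review = split_by_confidence(sample)
print([r["site_id"] for r in ready])   # ['A']
print([r["site_id"] for r in review])  # ['B', 'C']
```

Only the `ready` list would move on to Step 2; the `review` list is the manual verification queue.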

Step 2: Pull an isochrone for each candidate site

One call per candidate, requesting a 10-minute drive-time polygon. Adjust the contours_min list to match whatever your commercial model defines as the primary trade area — convenience retail might use 5 minutes, a destination furniture store might use 20.

import json

def get_isochrone(lat, lng, contours_min=(10,), mode="driving"):
    r = requests.get(
        f"{API}/isochrone",
        params={
            "lat": lat,
            "lng": lng,
            "contours_min": ",".join(str(c) for c in contours_min),
            "mode": mode,
            "api_key": KEY,
        },
        timeout=30,
    )
    if r.status_code == 429:
        raise RuntimeError("rate_limited")
    r.raise_for_status()
    return r.json()  # GeoJSON FeatureCollection

for c in candidates:
    if c["lat"] is None:
        c["isochrone"] = None
        continue
    try:
        c["isochrone"] = get_isochrone(c["lat"], c["lng"])
    except Exception as e:
        c["isochrone"] = None
        c["isochrone_error"] = str(e)
    time.sleep(0.1)

The same call in Node, for teams whose pipeline is JavaScript:

const API = 'https://csv2geo.com/api/v1';
const KEY = process.env.CSV2GEO_API_KEY;

async function getIsochrone(lat, lng, contoursMins = [10], mode = 'driving') {
  const params = new URLSearchParams({
    lat, lng,
    contours_min: contoursMins.join(','),
    mode,
    api_key: KEY,
  });
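  // 30-second timeout to match the Python call (AbortSignal.timeout needs Node 18+)
  const r = await fetch(`${API}/isochrone?${params}`, {
    signal: AbortSignal.timeout(30_000),
  });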
  if (r.status === 429) throw new Error('rate_limited');
  if (!r.ok) throw new Error(`http_${r.status}`);
  return r.json(); // GeoJSON FeatureCollection
}

A few production notes on the isochrone call itself.

The polygon size varies enormously by location. A 10-minute driving isochrone from a suburban candidate on a motorway interchange might cover 35 km². A 10-minute isochrone from a dense urban site blocked by one-way streets and a river might cover 4 km². This is correct behaviour, not a bug — it is the main reason isochrones are more useful than fixed-radius circles for site selection. A circle would give both sites the same "trade area" and that would be wrong.

The polygon is a single connected ring in the typical case. In rare cases (an island site, a peninsula with no bridge) you may get a multipolygon. Handle both — the GeoJSON geometry.type field will tell you which.
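To see the size difference between sites for yourself, it can help to log an approximate area for each returned feature. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and the area uses the shoelace formula on a local equirectangular projection, which is a reasonable approximation at trade-area scale but not a geodesic calculation. It handles both Polygon and MultiPolygon geometries and subtracts interior rings.

```python
from math import cos, radians

EARTH_RADIUS_KM = 6371.0

def ring_area_km2(ring):
    """Approximate area of one GeoJSON ring ([lng, lat] positions)."""
    if len(ring) < 3:
        return 0.0
    lat0 = sum(p[1] for p in ring) / len(ring)
    kx = EARTH_RADIUS_KM * radians(1) * cos(radians(lat0))  # km per degree of lng
    ky = EARTH_RADIUS_KM * radians(1)                       # km per degree of lat
    pts = [(p[0] * kx, p[1] * ky) for p in ring]
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        twice_area += x1 * y2 - x2 * y1  # shoelace term for one edge
    return abs(twice_area) / 2

def feature_areas_km2(feature_collection):
    """Return one approximate area per feature, in response order."""
    areas = []
    for feat in feature_collection["features"]:
        geom = feat["geometry"]
        if geom["type"] == "Polygon":
            polygons = [geom["coordinates"]]
        elif geom["type"] == "MultiPolygon":
            polygons = geom["coordinates"]
        else:
            polygons = []
        total = 0.0
        for rings in polygons:
            outer, holes = rings[0], rings[1:]
            total += ring_area_km2(outer) - sum(ring_area_km2(h) for h in holes)
        areas.append(total)
    return areas

# A 0.01-degree square at the equator, roughly 1.11 km on a side.
square = [[0, 0], [0.01, 0], [0.01, 0.01], [0, 0.01], [0, 0]]
fc = {"features": [{"geometry": {"type": "Polygon", "coordinates": [square]}}]}
print(feature_areas_km2(fc))  # one value, about 1.24 km²
```

Logging this number next to each candidate makes an unexpectedly small or large polygon easy to spot before it feeds the POI counts.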

Step 3: Query Places within a bounding radius

Now the POI count. The Places API takes a point and a radius; it does not accept a GeoJSON polygon. The pattern is: compute a bounding circle for your isochrone polygon (the smallest circle centred on the candidate site that contains the whole polygon), query Places with that radius, then filter the results to only those whose coordinates fall inside the isochrone polygon.

Computing the bounding circle in Python:

from math import radians, cos, sin, asin, sqrt

def haversine_km(lat1, lng1, lat2, lng2):
    R = 6371
    dlat = radians(lat2 - lat1)
    dlng = radians(lng2 - lng1)
    a = sin(dlat/2)**2 + cos(radians(lat1))*cos(radians(lat2))*sin(dlng/2)**2
    return 2 * R * asin(sqrt(a))

def bounding_radius_m(center_lat, center_lng, geojson_feature_collection):
    coords = []
    for feat in geojson_feature_collection["features"]:
        geom = feat["geometry"]
        if geom["type"] == "Polygon":
            coords.extend(geom["coordinates"][0])
        elif geom["type"] == "MultiPolygon":
            for poly in geom["coordinates"]:
                coords.extend(poly[0])
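    if not coords:
        raise ValueError("isochrone has no Polygon or MultiPolygon coordinates")
    max_d = max(
        haversine_km(center_lat, center_lng, lat, lng)
        for lng, lat, *_ in coords  # GeoJSON is [lng, lat], optionally with altitude
    )
    return int(max_d * 1000 * 1.05)  # convert km → m, add 5% buffer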

Then query Places with that radius:

def get_places(lat, lng, radius_m, category, limit=200):
    r = requests.get(
        f"{API}/places/nearby",
        params={
            "lat": lat, "lng": lng,
            "radius": min(radius_m, 50000),  # hard cap; adjust per category
            "categories": category,
            "limit": limit,
            "api_key": KEY,
        },
        timeout=20,
    )
    r.raise_for_status()
    return r.json().get("results", [])

The call returns every POI within the bounding circle. Some of those POIs will be outside the isochrone polygon — they are reachable within the circle's Euclidean radius but not within the drive-time limit. Filter them out client-side in Step 4.

Step 4: Point-in-polygon filtering client-side

A minimal ray-casting point-in-polygon test. No geospatial library dependency — useful if you want to keep the pipeline to pure stdlib plus requests.

def point_in_polygon(lat, lng, polygon_coords):
    # polygon_coords: list of [lng, lat] pairs (GeoJSON convention)
    x, y = lng, lat
    n = len(polygon_coords)
    inside = False
    px, py = polygon_coords[0]
    for i in range(1, n + 1):
        qx, qy = polygon_coords[i % n]
        # The first test guarantees qy != py, so the division below cannot be by zero.
        if (py > y) != (qy > y) and x < (qx - px) * (y - py) / (qy - py) + px:
            inside = not inside
        px, py = qx, qy
    return inside

def poi_count_inside_isochrone(places, geojson_feature_collection):
    # Collect the outer ring of every polygon in the isochrone GeoJSON.
    # Interior rings (holes) are ignored in this minimal version.
    polys = []
    for feat in geojson_feature_collection["features"]:
        geom = feat["geometry"]
        if geom["type"] == "Polygon":
            polys.append(geom["coordinates"][0])
        elif geom["type"] == "MultiPolygon":
            for poly in geom["coordinates"]:
                polys.append(poly[0])
    count = 0
    for poi in places:
        plat = poi["location"]["lat"]
        plng = poi["location"]["lng"]
        if any(point_in_polygon(plat, plng, poly) for poly in polys):
            count += 1
    return count

If you are already running shapely in your environment, use shapely.geometry.shape(geom).contains(Point(lng, lat)) — it is faster and handles edge cases more robustly. The ray-casting version above is the fallback for lean environments.

Step 5: Score and rank candidates

With isochrones pulled and POI counts computed, the final step is the scoring model. What follows is a deliberately simple version — replace the weights and categories with whatever your commercial data supports.

CATEGORIES = {
    "cafe":        {"weight": 0.3, "signal": "footfall_proxy"},
    "supermarket": {"weight": 0.4, "signal": "co_anchor"},
    "gym":         {"weight": 0.2, "signal": "demographic_proxy"},
    "pharmacy":    {"weight": 0.1, "signal": "co_anchor"},
}

def score_candidate(candidate):
    if not candidate.get("isochrone"):
        return None
    iso = candidate["isochrone"]
    lat, lng = candidate["lat"], candidate["lng"]
    radius_m = bounding_radius_m(lat, lng, iso)
    total = 0.0
    for cat, cfg in CATEGORIES.items():
        places = get_places(lat, lng, radius_m, cat)
        count  = poi_count_inside_isochrone(places, iso)
        total += count * cfg["weight"]
    return round(total, 2)

for c in candidates:
    c["score"] = score_candidate(c)

ranked = sorted(
    [c for c in candidates if c["score"] is not None],
    key=lambda x: x["score"],
    reverse=True,
)

The output is a ranked list. The top five candidates go to the next stage of the site-selection process — lease negotiation, foot-count study, manager interviews. The bottom of the list is culled without spending a site visit on them.
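A worked example makes the scoring arithmetic concrete. The POI counts below are invented for illustration; only the weights come from the CATEGORIES table above, and `weighted_score` is a hypothetical helper that repeats the same sum without any API calls.

```python
WEIGHTS = {"cafe": 0.3, "supermarket": 0.4, "gym": 0.2, "pharmacy": 0.1}

def weighted_score(counts, weights=WEIGHTS):
    """Sum of count × weight over all categories, rounded to 2 decimals."""
    return round(sum(counts.get(cat, 0) * w for cat, w in weights.items()), 2)

# Hypothetical in-isochrone POI counts for two candidate sites.
site_a = {"cafe": 40, "supermarket": 5, "gym": 8, "pharmacy": 6}
site_b = {"cafe": 12, "supermarket": 9, "gym": 3, "pharmacy": 4}

print(weighted_score(site_a))  # 12.0 + 2.0 + 1.6 + 0.6 = 16.2
print(weighted_score(site_b))  # 3.6 + 3.6 + 0.6 + 0.4 = 8.2
```

Site A wins almost entirely on cafe count even though Site B has more supermarkets, the highest-weighted category. Because the score uses raw counts, a site with a larger isochrone will also tend to score higher; whether to divide by polygon area is a modelling decision for your own commercial data.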

Failure modes that will bite you in production

Six months into running this pipeline, these are the problems that show up.

Isochrone timeout on complex urban networks. Dense city centres with many turn restrictions take longer to compute. If your timeout is 10 seconds and the routing engine needs 12, you get a timeout error and a null isochrone, not a degraded polygon. Set a 30-second timeout on the isochrone call. Log every timeout, not just every error — a pattern of timeouts concentrated in one metro is a signal to pre-fetch those isochrones overnight rather than in the screening loop.
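One way to handle this is to retry a timed-out call a couple of times with a growing delay, logging every attempt so that the metro-level pattern is visible afterwards. The wrapper below is a generic sketch: `call_with_retries` and its parameters are hypothetical, and it retries on the built-in `TimeoutError` only so that the example stays self-contained.

```python
import time

def call_with_retries(fn, attempts=3, base_delay_s=2.0,
                      retry_on=(TimeoutError,), sleep=time.sleep, log=print):
    """Call fn(); on a retryable error wait 2 s, 4 s, ... and try again.
    Every failure is logged, and the last failure is re-raised."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on as exc:
            log(f"attempt {attempt}/{attempts} failed: {exc!r}")
            if attempt == attempts:
                raise
            sleep(base_delay_s * 2 ** (attempt - 1))

# Demo with a stand-in function that times out twice, then succeeds.
state = {"calls": 0}

def flaky_isochrone():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TimeoutError("routing engine too slow")
    return {"type": "FeatureCollection", "features": []}

waits = []  # record the delays instead of actually sleeping in the demo
result = call_with_retries(flaky_isochrone, sleep=waits.append)
print(result["type"], waits)  # FeatureCollection [2.0, 4.0]
```

With the `requests`-based calls used in this post you would pass `retry_on=(requests.exceptions.Timeout,)` and wrap the call as `lambda: get_isochrone(lat, lng)`.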

Places results that hit the `limit` cap. If you set limit=200 and a candidate site in a dense CBD returns exactly 200 results, you almost certainly have more POIs inside the bounding radius that you did not retrieve. For very dense candidates, either increase the limit or break the query into multiple narrower category requests. The count you compute will be a floor, not a total, which is fine for comparative ranking but wrong if you are asserting an exact number in a board presentation.

The bounding radius is much larger than the isochrone. On a motorway interchange, the isochrone can extend 15 km in one direction and 3 km in another. The bounding circle covers 15 km in all directions. Every Places result in that circle counts against the limit, and you then discard two-thirds of them in the polygon filter. For very elongated isochrones, consider querying in two or three sub-circles rather than one large one. That adds calls, so weigh it against the per-call credit count in the cost section, but each response carries fewer wasted results and is less likely to hit the limit cap.

Cached isochrones with stale road data. If you are running this pipeline on Monday and re-running against the same coordinates on Thursday, your caching layer may serve the Monday isochrone. Road changes — a new bypass, a bridge closure — are not frequent, but in fast-developing suburban corridors near your candidate sites they matter. Cache isochrones for 7 days maximum, not indefinitely.
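A 7-day ceiling is easy to enforce with a small time-to-live cache. The sketch below is an in-memory illustration only; the class name, the key function, and the choice to round coordinates to five decimals are assumptions, and a production version would persist to disk or a key-value store.

```python
import time

SEVEN_DAYS_S = 7 * 24 * 3600

class TTLCache:
    """In-memory cache whose entries expire after ttl_s seconds."""

    def __init__(self, ttl_s=SEVEN_DAYS_S, clock=time.time):
        self.ttl_s = ttl_s
        self.clock = clock
        self._store = {}

    def get(self, key):
        hit = self._store.get(key)
        if hit is None:
            return None
        stored_at, value = hit
        if self.clock() - stored_at > self.ttl_s:
            del self._store[key]  # stale: force a fresh API call
            return None
        return value

    def put(self, key, value):
        self._store[key] = (self.clock(), value)

def isochrone_key(lat, lng, contours_min, mode):
    # Round to 5 decimals (about 1 m) so float noise does not fragment the cache.
    return (round(lat, 5), round(lng, 5), tuple(sorted(contours_min)), mode)

cache = TTLCache()
key = isochrone_key(51.500012, -0.120003, [10], "driving")
cache.put(key, {"type": "FeatureCollection", "features": []})
print(cache.get(key) is not None)  # True while the entry is under 7 days old
```

In the screening loop you would check `cache.get(key)` before calling the isochrone endpoint and `cache.put(key, response)` after a successful call.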

Low-confidence geocodes producing wrong polygons. A candidate address geocoded to the wrong block produces an isochrone centred on the wrong point. The polygon looks plausible — there is no error code — but the POI counts are for the wrong neighbourhood. The fix is enforcing a minimum confidence threshold before calling the isochrone endpoint at all. Anything that cannot be geocoded to at least 0.7 confidence goes to a manual verification queue before the pipeline continues.

Rate limiting during bulk screening. A list of 200 candidates with four Places queries per candidate is 800 API calls in a short burst. On most paid plans that is not a problem spread over an hour; it can be a problem in a 60-second burst. Add a small sleep between iterations, run the batch overnight, or see Concurrency Tuning for Geocoding Pipelines for the concurrency model that keeps you comfortably below the rate limit without sacrificing throughput.
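The "small sleep between iterations" can be made a little more precise with a pacer that enforces a minimum interval between calls, so fast responses do not turn into a burst. This is a single-process sketch; the `Pacer` class is hypothetical and the rate you pass in should come from your own plan's limits, which this post does not specify.

```python
import time

class Pacer:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)  # only sleep for the time still owed
                now = self.clock()
        self._last = now

pacer = Pacer(max_per_second=5)  # assumed rate; check your plan
pacer.wait()  # the first call never sleeps
```

Calling `pacer.wait()` immediately before each `get_places` or `get_isochrone` request replaces the fixed `time.sleep` calls: slow responses incur no extra delay, fast ones are held back to the configured rate.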

What this pipeline does not do

Honest scope, the same way every post in this series is honest about what the tool covers.

It does not produce demographic data. The isochrone gives you the polygon. Population, household income, age distribution — those come from whichever demographic data source your company already licences or from a census join you build yourself. The pipeline is intentionally designed so that you can join your own demographic data to the isochrone GeoJSON in exactly the same step where you join the Places POI counts.

It does not predict revenue. A score built from POI density is a proxy for customer opportunity and co-anchor strength. It is not a regression model trained on your own store performance data. The ranking tells you which candidates have better surrounding context; it does not tell you which one will do £2.3M in year one. That model belongs in your FP&A team's spreadsheet, not in the screening pipeline.

It does not replace the site visit. The screening pipeline reduces the shortlist from 200 candidates to 10. The site visit still happens. It is just happening at sites that have already passed a quantitative filter, which changes what the site visitor is looking for — they are validating a hypothesis, not forming one from scratch.

The Places API covers POIs, not your competitor's private fleet. If your business competes primarily with one specific chain and their location database is not fully represented in the Places categories, the POI count will undercount them. Supplement with whatever proprietary data you have on competitor locations by testing your own competitor-location CSV against the isochrone polygon using the same point-in-polygon function from Step 4. The isochrone is a polygon in GeoJSON — your own data joins to it in plain Python.
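As a sketch of that competitor join, the snippet below counts rows of a CSV that fall inside an isochrone's outer rings. The column names `lat` and `lng`, the helper names, and the sample data are all assumptions for illustration; the containment test is the same ray-casting idea as Step 4, restated here so the example runs on its own.

```python
import csv
import io

def point_in_ring(lat, lng, ring):
    """Ray-casting test against one ring of [lng, lat] pairs."""
    x, y = lng, lat
    inside = False
    for (px, py), (qx, qy) in zip(ring, ring[1:] + ring[:1]):
        # The first condition guarantees qy != py, so the division is safe.
        if (py > y) != (qy > y) and x < (qx - px) * (y - py) / (qy - py) + px:
            inside = not inside
    return inside

def count_rows_inside(csv_file, outer_rings, lat_col="lat", lng_col="lng"):
    """Count CSV rows whose coordinates fall inside any of the rings."""
    count = 0
    for row in csv.DictReader(csv_file):
        lat, lng = float(row[lat_col]), float(row[lng_col])
        if any(point_in_ring(lat, lng, ring) for ring in outer_rings):
            count += 1
    return count

# Hypothetical isochrone ring and competitor list.
square = [[0.0, 0.0], [0.1, 0.0], [0.1, 0.1], [0.0, 0.1], [0.0, 0.0]]
competitors = io.StringIO(
    "name,lat,lng\n"
    "Store 1,0.05,0.05\n"
    "Store 2,0.50,0.50\n"
    "Store 3,0.02,0.09\n"
)
print(count_rows_inside(competitors, [square]))  # 2
```

With a real file you would pass `open("competitors.csv", newline="")` instead of the `StringIO` object, and the rings extracted from the isochrone GeoJSON instead of `square`.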

Cost arithmetic for a screening run

200 candidates. One geocoding call each. One isochrone call each. Four Places calls per candidate (four categories). That is:

| Operation | Calls | Credits |
|---|---|---|
| Geocoding | 200 | 200 |
| Isochrone | 200 | 200 |
| Places queries | 200 × 4 | 800 |
| Total | | 1,200 |

At paid pricing starting from $54/month for 100,000 calls, the 1,200 calls of a full 200-candidate screen use about 1.2% of the monthly allowance, well under the cost of one tank of fuel for a site visit. The free tier (3,000 calls/day) covers that same 1,200-call run in a single day without a credit card, which makes the pipeline accessible for an analyst who wants to validate the approach before their manager approves the API budget.
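The same arithmetic as a small function, for plugging in your own candidate and category counts. It assumes one credit per call, as the table above does; the function name is hypothetical.

```python
def screening_calls(n_candidates, n_categories,
                    geocode_calls=1, isochrone_calls=1):
    """Total API calls for one screening run: per candidate, one geocode,
    one isochrone, and one Places query per category."""
    per_candidate = geocode_calls + isochrone_calls + n_categories
    return n_candidates * per_candidate

FREE_TIER_CALLS_PER_DAY = 3000  # figure quoted in the text

run = screening_calls(n_candidates=200, n_categories=4)
print(run)                                  # 1200
print(run <= FREE_TIER_CALLS_PER_DAY)       # True: fits in one free-tier day
print(FREE_TIER_CALLS_PER_DAY // (1 + 1 + 4))  # 500 candidates per day at most
```

Adding a fifth category raises the run to 1,400 calls, so the category list is the main lever on cost.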

For ongoing screening, say a retailer that evaluates 50 new candidates per quarter, the annual consumption is around 1,200 calls (50 candidates × 6 calls × 4 quarters), well inside the entry paid tier's monthly inclusion. See the live pricing brackets at csv2geo.com/pricing/api.

The one cost category that can surprise you is caching behaviour. Isochrones for the same coordinate and travel time are deterministic, so if you re-run the pipeline against the same candidate list a few days later, a good caching layer serves the stored polygon rather than re-calling the API. See Caching Geocoding Results — 90% Cost Reduction for the caching architecture; the same patterns apply to isochrone and Places responses. Cache aggressively within that window: a candidate site's coordinates do not change, and a 10-minute isochrone in a stable road network stays valid for the 7-day maximum recommended in the failure modes above.

Observability: what to measure once this is in production

A screening pipeline that runs unsupervised needs instrumentation. Three metrics worth tracking from day one.

Geocode confidence distribution. Track the histogram of confidence scores across your candidate list. A lot of low-confidence results is a data-quality signal about your address list, not a geocoding API problem — and it tells you how much of the subsequent pipeline output to trust.

Isochrone null rate. What fraction of candidates come back with no isochrone (timeout, error, or low-confidence geocode upstream)? Above 2% is worth investigating. Below 2% is normal noise. The null-rate trend over time tells you if the routing engine is getting slower on a particular metro.

POI count variance across candidates. If two adjacent candidates return wildly different POI counts, sanity-check whether the isochrone polygons are genuinely different shapes or whether one hit the limit cap. A Places response that returns exactly limit results is a query-saturation flag, not a real count.
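The three metrics can be computed directly from the in-memory candidate list. The sketch below assumes the row shape built up in Steps 1 and 2 (a `confidence` value and an `isochrone` entry per candidate); the function names and the fixed ten-point histogram bins are illustrative choices.

```python
from collections import Counter

def confidence_histogram(candidates):
    """Count candidates per 0.1-wide confidence bin; None gets its own bin."""
    bins = Counter()
    for c in candidates:
        conf = c.get("confidence")
        if conf is None:
            bins["ungeocoded"] += 1
            continue
        # Work in whole percentage points to avoid float-division surprises.
        bucket = min(int(round(conf * 100)) // 10, 9)
        bins[f"{bucket / 10:.1f}-{(bucket + 1) / 10:.1f}"] += 1
    return dict(bins)

def isochrone_null_rate(candidates):
    """Fraction of candidates with no usable isochrone."""
    if not candidates:
        return 0.0
    return sum(1 for c in candidates if not c.get("isochrone")) / len(candidates)

def is_saturated(places, limit):
    """True when a Places response filled the limit, so the count is a floor."""
    return len(places) >= limit

rows = [
    {"confidence": 0.93, "isochrone": {"features": [{}]}},
    {"confidence": 0.70, "isochrone": {"features": [{}]}},
    {"confidence": 0.55, "isochrone": None},
    {"confidence": None, "isochrone": None},
]
print(confidence_histogram(rows))
print(isochrone_null_rate(rows))          # 0.5
print(is_saturated([{}] * 200, limit=200))  # True
```

Emitting these three values at the end of every run, alongside the ranked list, is enough to spot a bad address file or a saturated category before anyone acts on the scores.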

For a deeper treatment of the observability model, Observability for Geocoding Pipelines covers the metrics and alert design that applies here directly.

Frequently Asked Questions

Can I request multiple drive-time contours in a single isochrone call? Yes. Pass a comma-separated list to contours_min (e.g. 5,10,15) and the response includes a separate polygon feature for each contour. The features are ordered from smallest to largest travel time. You can use the 5-minute polygon as a "primary trade area" and the 10-minute polygon as a "secondary trade area" without making a second API call.
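A small helper can pair each requested travel time with its polygon. This sketch relies only on the ordering described in the answer above (smallest to largest travel time); the function name is hypothetical, and if the response carries a property that names the contour, reading that property would be more robust than relying on position.

```python
def polygons_by_contour(feature_collection, contours_min):
    """Map each requested travel time (minutes) to its GeoJSON feature,
    assuming features arrive ordered from smallest to largest contour."""
    features = feature_collection["features"]
    minutes = sorted(contours_min)
    if len(features) != len(minutes):
        raise ValueError(
            f"expected {len(minutes)} features, got {len(features)}"
        )
    return dict(zip(minutes, features))

# Stand-in response with two features; real ones carry Polygon geometries.
response = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": []}, "id": "inner"},
        {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": []}, "id": "outer"},
    ],
}
by_minutes = polygons_by_contour(response, contours_min=(10, 5))
primary, secondary = by_minutes[5], by_minutes[10]
print(primary["id"], secondary["id"])  # inner outer
```

The length check is deliberate: if a contour is missing from the response, failing loudly is safer than silently treating the 10-minute polygon as the 5-minute one.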

The Places API uses a radius, but my isochrone is a polygon — how do I reconcile the two? Query Places with the bounding radius of the isochrone (the distance from the candidate site to the farthest polygon vertex, plus a small buffer), retrieve the POIs within that circle, then filter to only those whose coordinates fall inside the actual isochrone polygon using a point-in-polygon test client-side. Step 3 and Step 4 above show the exact implementation. This pattern is correct and efficient: the API handles the radius query at scale, and your code handles the geometric precision.

What travel modes are supported? Driving, walking, and cycling. Driving is the right default for most retail site selection. Walking is appropriate for urban convenience formats (coffee, pharmacy, newsagent). There is no public-transit mode — if your candidates are transit-oriented retail, compute the transit catchment separately from a transit GTFS feed and use driving as a proxy for the initial screen.

How fresh is the routing data that underlies the isochrones? The routing network is updated periodically. It is not live traffic — it reflects typical travel-time conditions, not peak-hour congestion on a Tuesday afternoon. For retail site selection this is the right model: you want typical reachability, not rush-hour worst-case. If you are screening for a petrol station or a drive-through and peak-hour access matters, supplement the isochrone with a manual peak-hour spot check on the top candidates.

My candidate list has 2,000 sites. How should I batch and pace the pipeline? Run the geocoding pass first (all 2,000 in one scripted loop). Then run the isochrone pass overnight — one call per site with a 0.1 s sleep between calls keeps you well within rate limits on any paid plan. Then run the Places pass the following morning. Total wall-clock time for 2,000 candidates with four Places queries each is roughly two to three hours on a single process. Parallelise to four workers and you are done in under an hour. See Concurrency Tuning for Geocoding Pipelines for the safe concurrency ceiling.
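The four-worker version can be sketched with the standard library's thread pool. The helper name and result shape are hypothetical; the worker you pass in would be whichever per-candidate function you already have, and four workers is simply the figure mentioned above, not a verified safe ceiling for any plan.

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(items, worker, max_workers=4):
    """Apply worker to every item on a small thread pool.
    Results come back in input order; one failure does not stop the batch."""
    def safe(item):
        try:
            return {"item": item, "result": worker(item), "error": None}
        except Exception as exc:  # record the error and keep going
            return {"item": item, "result": None, "error": repr(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe, items))

# Stand-in worker; in the pipeline this would call the isochrone or Places step.
def fake_worker(site_id):
    if site_id == 3:
        raise ValueError("boom")
    return site_id * 2

results = run_parallel([1, 2, 3, 4], fake_worker)
print([r["result"] for r in results])  # [2, 4, None, 8]
print(results[2]["error"])             # ValueError('boom')
```

Threads suit this workload because each task spends nearly all its time waiting on the network. Any pacing you apply needs to be shared across workers, otherwise four workers each sleeping 0.1 s still send four times the request rate of one.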

Can I use the isochrone polygon to clip my own customer address data? Yes, and this is one of the most powerful uses of the polygon. If you have a CRM export of existing customer addresses (geocoded), you can run the point-in-polygon test from Step 4 against each candidate's isochrone to count how many of your existing customers already live within the drive-time catchment. That is a cannibalisation-risk signal for a network-expansion retailer, and a validation signal for a first-mover entry — both derived from your own data, not from any demographic layer we sell.

Is there an SDK that wraps this workflow? Python and Node SDKs are available. For a pipeline like this one — where the logic lives in your own scoring model and the API calls are a small fraction of the total code — most teams prefer the REST approach shown above. You own the retry logic, the caching layer, and the polygon filtering code, and none of that benefits from SDK abstraction. The SDK is worth evaluating if you are integrating into an existing application that already manages its own HTTP client.

Benchmarking geocoding APIs — honest numbers — how to measure the accuracy and latency characteristics that actually matter for a site-selection pipeline
Caching geocoding results — 90% cost reduction — cache isochrones, Places results, and geocodes aggressively; none of them changes faster than your candidate list
Concurrency tuning for geocoding pipelines — the right concurrency model for a 2,000-candidate screening run without hitting rate limits
Geocoding confidence scores explained — how to interpret confidence, where to set your minimum threshold, and what to do with candidates that fall below it
Observability for geocoding pipelines — the metrics, logs, and alerts that make an unsupervised screening pipeline trustworthy

---

*I.A. / CSV2GEO Creator*