Agricultural-parcel targeting
parcelpump can run in two modes per source (county):
| Mode | Behavior |
|---|---|
| All-parcels (default) | Scrape every parcel in the county per the cadence policy. |
| Agricultural-only | Scrape only parcels where parcels.is_agricultural = true. |
The ag-only mode is what you want when scaling national coverage to all ~7M agricultural parcels without paying to scrape ~143M residential / commercial / industrial parcels you don't care about.
How it works
- CSB ingest: src/ingest/csb-national.ts loads USDA's Crop Sequence Boundaries (annual national release) into csb_fields, one polygon per CSB ID with the 8-year crop sequence (R16…R24 for the 2024 release).
- Spatial join: src/ingest/derive-ag-flags.ts finds each parcel's largest-overlap CSB polygon and copies its ag_year_count + dominant crop onto the parcel row.
- Threshold: is_agricultural = (ag_year_count >= threshold). Default threshold is 3 of 8 years coded as ag (NASS guidance for "real farmland, not transient pasture rotation").
- Worker filter: when source_refresh_policies.agricultural_only = true, the scrape worker skips parcels with is_agricultural = false.
- Bulk enqueue: POST /scrape-jobs/enqueue-ag-county adds every ag parcel in a source to the queue in one call. Defaults to skip_existing=true so re-runs only enqueue parcels that are due per the cadence policy.
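The spatial-join, threshold, and worker-filter steps can be sketched in a few lines of TypeScript. This is an illustration only, not the code in derive-ag-flags.ts: the CsbOverlap and AgFlags shapes, the field names, and the function names are hypothetical, and treating a parcel with no CSB overlap as non-agricultural is an assumption.

```typescript
// Hypothetical shapes for illustration; the real schema may differ.
interface CsbOverlap {
  csbId: string;
  overlapAreaM2: number; // area of the parcel/CSB polygon intersection
  agYearCount: number;   // years in the sequence coded as ag
  dominantCrop: string;
}

interface AgFlags {
  agYearCount: number;
  dominantCrop: string | null;
  isAgricultural: boolean;
}

// Pick the largest-overlap CSB polygon, copy its values, apply the threshold.
function deriveAgFlags(overlaps: CsbOverlap[], threshold = 3): AgFlags {
  if (overlaps.length === 0) {
    // Assumption: a parcel with no overlapping CSB polygon is not agricultural.
    return { agYearCount: 0, dominantCrop: null, isAgricultural: false };
  }
  const best = overlaps.reduce((a, b) =>
    b.overlapAreaM2 > a.overlapAreaM2 ? b : a
  );
  return {
    agYearCount: best.agYearCount,
    dominantCrop: best.dominantCrop,
    isAgricultural: best.agYearCount >= threshold,
  };
}

// Worker filter: non-ag parcels are skipped only when the policy is ag-only.
function shouldScrape(agriculturalOnly: boolean, isAgricultural: boolean): boolean {
  return !agriculturalOnly || isAgricultural;
}
```

Note that the largest-overlap polygon wins even when a smaller overlapping polygon has a higher ag_year_count, which matches the "largest-overlap" wording above.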
End-to-end workflow per state
# Day 1 setup (download CSB once — ~3.7 GB compressed):
curl -L -o /tmp/csb1624.zip 'https://www.nass.usda.gov/Research_and_Science/Crop-Sequence-Boundaries/CSB1624.zip'
unzip -d /tmp /tmp/csb1624.zip # → /tmp/CSB1624.gpkg
# Per-state, after the bulk parcel ingest is done:
# 1. Load CSB for the state
npx tsx src/ingest/csb-national.ts --state-fips=53 --gpkg=/tmp/CSB1624.gpkg
# 2. Spatial-join to parcels and set is_agricultural / dominant_crop
npx tsx src/ingest/derive-ag-flags.ts --state-fips=53 --threshold=3
# 3. Flip the per-county policy
psql "$DATABASE_URL" -c "
UPDATE source_refresh_policies
SET agricultural_only = true, updated_at = now()
WHERE source LIKE '%-county-wa';
"
# 4. Enqueue every ag parcel in a county
curl -X POST "$PARCELPUMP_API_URL/scrape-jobs/enqueue-ag-county" \
  -H "X-Parcelpump-Key: $PARCELPUMP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"source":"franklin-county-wa"}'
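Step 4 can also be issued from TypeScript, for example when looping over many counties. The sketch below mirrors the curl call above (same path, headers, and body); the function names are hypothetical, the response body is not inspected because its shape is not documented here, and a runtime with a global fetch (Node 18+) is assumed.

```typescript
interface EnqueueRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the same request the curl example sends. Kept separate from the
// network call so it can be inspected without hitting the API.
function buildEnqueueAgCountyRequest(
  apiUrl: string,
  apiKey: string,
  source: string
): EnqueueRequest {
  const base = apiUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${base}/scrape-jobs/enqueue-ag-county`,
    init: {
      method: "POST",
      headers: {
        "X-Parcelpump-Key": apiKey,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ source }),
    },
  };
}

// Send the request and return the HTTP status; throws on a non-2xx response.
async function enqueueAgCounty(
  apiUrl: string,
  apiKey: string,
  source: string
): Promise<number> {
  const { url, init } = buildEnqueueAgCountyRequest(apiUrl, apiKey, source);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`enqueue failed for ${source}: HTTP ${res.status}`);
  }
  return res.status;
}
```

Calling enqueueAgCounty once per source in a state replaces the manual curl per county.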
Why latest-only
Each USDA CSB release covers 8 years. The 2024 release covers 2016–2024;
the 2025 release will cover 2017–2025. For ag-targeting purposes, only
the latest release matters. Historical sequence comparison would
require an append-yearly strategy, which grows csb_fields by a full
release's worth of rows every year. That is out of scope for v0 and
documented as a future enhancement.
When a new USDA release lands, re-run csb-national.ts per state. The
script DELETEs prior rows for the same (state_fips, release_year)
tuple before inserting, so it's idempotent.
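The delete-then-insert pattern is what makes a re-run safe. A minimal in-memory sketch of the idea follows; the CsbRow shape and reloadRelease name are hypothetical, and the real script works against the database rather than an array.

```typescript
// Hypothetical row shape, reduced to the columns that matter here.
interface CsbRow {
  stateFips: string;
  releaseYear: number;
  csbId: string;
}

// Drop every row for (stateFips, releaseYear), then append the incoming rows.
// Running this twice with the same input leaves the table unchanged.
function reloadRelease(
  table: CsbRow[],
  stateFips: string,
  releaseYear: number,
  incoming: CsbRow[]
): CsbRow[] {
  const kept = table.filter(
    (r) => !(r.stateFips === stateFips && r.releaseYear === releaseYear)
  );
  return [...kept, ...incoming];
}
```

Rows for other states, or for the same state under a different release_year, are untouched by the delete step in this sketch.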
Storage scaling
- WA + OR ag-only: ~70 GB working set on the t4g.medium RDS.
- National ag-only (7M parcels): 285 GB. Budget t4g.large + 500 GB gp3 ($200/mo) when adapter coverage approaches all 50 states.
- The csb_fields table is the largest contributor; latest-only keeps it at ~150 GB national. Storage scales linearly per-state, so adding one state ≈ +3–8 GB.
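As a rough sanity check on the linear-scaling claim, the per-state range can be multiplied out. This is back-of-envelope arithmetic under the assumption that the +3–8 GB figure applies uniformly to every state; the function name is hypothetical.

```typescript
interface GbRange {
  lowGb: number;
  highGb: number;
}

// Linear estimate: stateCount × per-state range (defaults from the text above).
function estimateStorageGb(
  stateCount: number,
  perStateLowGb = 3,
  perStateHighGb = 8
): GbRange {
  if (!Number.isInteger(stateCount) || stateCount < 0) {
    throw new RangeError("stateCount must be a non-negative integer");
  }
  return {
    lowGb: stateCount * perStateLowGb,
    highGb: stateCount * perStateHighGb,
  };
}

// The 48 contiguous states covered by CSB.
const national = estimateStorageGb(48);
console.log(`${national.lowGb}-${national.highGb} GB`);
```

For 48 states this gives 144 to 384 GB, which brackets the 285 GB national figure and stays under the 500 GB gp3 budget.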
Tuning the threshold
ag_year_count ranges 0–8. Default threshold is 3. Tradeoffs:
| Threshold | Counts as ag | Captures | Excludes |
|---|---|---|---|
| ≥1 | Any year ag | All farmland + transient hay/pasture | Pure non-ag |
| ≥3 (default) | 3+ years ag of 8 | "Real farmland" per NASS | Rotational pasture |
| ≥5 | Majority ag | High-confidence working farms | Diversified rotations |
| ≥7 | Almost-always ag | Permanent crops + monocrop fields | Anything with rotation |
For ownership-graph + tax-appeal analytics, threshold ≥ 3 is the right balance — it excludes one-off pasture conversions but keeps real crop rotations in scope.
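Before changing --threshold for a state, it can help to see how many parcels each candidate value would keep. The sketch below counts qualifying parcels per threshold from a list of ag_year_count values; the function name and the sample data are made up for illustration.

```typescript
// For each threshold, count parcels whose ag_year_count meets it.
function countAgAtThresholds(
  agYearCounts: number[],
  thresholds: number[] = [1, 3, 5, 7]
): Record<number, number> {
  const result: Record<number, number> = {};
  for (const t of thresholds) {
    result[t] = agYearCounts.filter((c) => c >= t).length;
  }
  return result;
}

// Made-up sample: one ag_year_count value per parcel.
const sample = [0, 1, 2, 3, 4, 6, 8, 8];
console.log(countAgAtThresholds(sample));
```

A large drop between two adjacent thresholds indicates many parcels sitting right at the boundary, which is where the choice matters most.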
Counties without CSB coverage
CSB covers all 48 contiguous states. Alaska + Hawaii are excluded;
parcelpump's ag-targeting won't work there. Set
source_refresh_policies.agricultural_only = false for AK + HI sources
so the worker scrapes everything (or skip those states entirely).
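A small guard can keep agricultural_only from being switched on for sources without CSB coverage. The sketch assumes source names end in a two-letter state abbreviation, as in franklin-county-wa; that naming rule, the function name, and the example source names in the comments are assumptions.

```typescript
// States with no CSB coverage, per the section above.
const NO_CSB_STATES = new Set(["ak", "hi"]);

// True when ag-only mode is usable for this source.
// Assumes the source name ends in "-<state abbreviation>".
function agriculturalOnlyAllowed(source: string): boolean {
  const suffix = source.split("-").pop()?.toLowerCase() ?? "";
  return !NO_CSB_STATES.has(suffix);
}

// Example: "franklin-county-wa" is allowed; a source ending in "-ak" or "-hi" is not.
```

Checking this before running the UPDATE in step 3 avoids a policy that would make the worker skip every parcel in an AK or HI source.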