
During a project overhauling a real estate data platform, we realized the analytics engineers were struggling to maintain their scattered spatial SQL scripts. Their pipeline consisted of cron jobs executing raw PostGIS queries to generate market heatmaps. It was fragile, impossible to test, and lacked clear lineage.
We decided to bring their spatial transformations into their existing dbt (data build tool) environment.
While dbt doesn’t have native “geospatial” modules out of the box, it plays incredibly well with data warehouses that do (like Snowflake, BigQuery, or PostGIS via the Postgres adapter). By wrapping complex spatial logic into dbt macros, we made the pipeline modular, testable, and version-controlled.
Here is an example of how we standardized their polygon centroid extraction using a custom dbt macro.
macros/spatial/get_centroid.sql)
{% macro get_centroid(geom_column) %}
-- ...
-- Handle invalid geometries smoothly while extracting the centroid
CASE
WHEN ST_IsValid({{ geom_column }}) THEN ST_Centroid({{ geom_column }})
-- If invalid, attempt a zero-buffer fix before failing
ELSE ST_Centroid(ST_MakeValid({{ geom_column }}))
END
-- ...
{% endmacro %}
models/marts/dim_parcels_enriched.sql)
-- ...
WITH raw_parcels AS (
SELECT * FROM {{ ref('stg_county_parcels') }}
),
transformed AS (
SELECT
parcel_id,
zoning_type,
assessed_value,
geometry,
-- Utilize our dbt macro to ensure clean centroid generation!
{{ get_centroid('geometry') }} AS centroid_geom
FROM
raw_parcels
)
SELECT * FROM transformed
-- ...
tests/assert_valid_geometries.sql)We also implemented dbt tests to actively monitor data quality, failing the pipeline if upstream shapefiles contained corrupted coordinates.
-- ...
-- Test to ensure all polygons in our output table are valid OGC geometries
SELECT
parcel_id
FROM
{{ ref('dim_parcels_enriched') }}
WHERE
NOT ST_IsValid(geometry)
-- ...
Treating geospatial transformations as first-class citizens within dbt not only stabilized the client’s pipelines but also allowed their standard analytics team (who weren’t GIS experts) to safely interact with complex spatial data.
Godspeed!