LLMs + GIS: How to “Talk” to Your PostGIS Database Using GenAI

A few weeks ago, we got an intriguing question: “Why do I have to submit a ticket to the GIS department to pull a list of high-density commercial parcels every time? Why can’t I just type that into our dashboard like ChatGPT?”

It sounded like science fiction a year ago, but with recent advancements in Large Language Models (LLMs), enabling Text-to-Spatial-SQL is entirely feasible right now.

We built them a prototype that connects OpenAI’s GPT-4 directly to their read-only PostGIS database. By passing the exact database schema and using a solid prompt that defines the behavior of PostGIS functions, the LLM translates plain English questions into valid, complex spatial queries.

Here is a simplified Python example of how we wired this up using the langchain library:

from langchain_community.utilities import SQLDatabase
from langchain_community.chat_models import ChatOpenAI
from langchain.chains import create_sql_query_chain

# ... Setup OpenAI API keys and environment ...

# Connect to the PostGIS instance
# CRITICAL: We explicitly use an engine user with strict READ-ONLY permissions!
db = SQLDatabase.from_uri("postgresql+psycopg2://reader:password@localhost:5432/realestate_db")

# Initialize the LLM
llm = ChatOpenAI(model="gpt-4", temperature=0)

# Create a prompt that specifically guides the LLM to use PostGIS functions
system_prompt = """
You are a PostGIS expert. Given an input question, create a syntactically correct PostgreSQL query to run.
...
Always use proper spatial functions like ST_Intersects and ST_DWithin.
Always cast geometries to Geography types when calculating distances in meters!
"""

# Create the LangChain SQL chain
chain = create_sql_query_chain(llm, db)

# Have the user ask a spatial question
response = chain.invoke({
    "question": "Which commercial parcels are within 500 meters of the new subway station?"
})

print(f"Generated Query:\n{response}")

The LLM effortlessly returns the correct spatial SQL, handling the underlying table joins and PostGIS predicate logic:

SELECT
    p.parcel_id,
    p.address
FROM
    parcels p
JOIN
    subway_stations s
ON
    s.station_name = 'New Subway Station'
WHERE
    p.zoning = 'Commercial'
    AND ST_DWithin(p.geom::geography, s.geom::geography, 500);

While you still need strict guardrails (like restricting the database connection to a read-only role or implementing an intermediate semantic validation layer), seeing an executive ask a map a complex spatial question and get the right answer immediately feels like the future of geospatial data interaction has finally arrived.

Godspeed!

 

Cart (0 items)

Create your account