So with all the recent froth about ChatGPT and Clippy 2.01, err, I mean the new Bing, I thought it might be fun to do a deeper dive and think about how all this might affect the geospatial industry.
In other words, what does the future hold for ‘Map Happenings’ powered by generative AI?
In order to write this article I started by doing a little research and investigation. I wanted to discover just how much these nascent assistants might be able to help in their current form. Now unfortunately I don’t yet have access to the new Clippy, so I had to resort to performing my tests on ChatGPT. However, while I suspect the new Bing might provide better answers, it also might decide that it loves me or wants to kill me or something2, so for now I’m happy to stay talking to ChatGPT.
I picked a number of different geospatial scenarios — consumer based as well as enterprise based.
The first scenario is based on a travel premise.
I imagined I was planning a trip to an unfamiliar city, in this case to Madrid. I was pleasantly surprised with the results — they weren’t too bad:
But if you try using ChatGPT for something a little more taxing than searching all known written words in the universe, like, for example, calculating driving directions, you will quickly be underwhelmed.
Take this example of driving from Apple’s Infinite Loop campus to Apple Park. At first the directions look innocuous enough:
However, digging in, you’ll find the directions are completely and utterly wrong.
It turns out ChatGPT lives in an alternate maps universe.
Diagnosing each step:
- “Head east on Infinite Loop toward Homestead Rd”: Infinite Loop does not connect to Homestead Rd. Get your catapult!
- “Turn right onto Homestead Rd”: so after catapulting from Infinite Loop over the freeway to Homestead you turn right. OK.
- “Use the left 2 lanes to turn left onto N Tantau Ave”: Err, you can’t turn left from Homestead to Tantau … unless the wind blows your balloon east of Tantau.
- “Use the left 2 lanes to turn left onto Pruneridge Ave”: Really? Hmm. Wrong direction!
- “Use the right lane to merge onto CA-280 S via the ramp to San Jose”: It’s actually I-280, but wait … Pruneridge doesn’t connect to the freeway… get out your catapult again!
- “Take the Wolfe Rd exit”: but if you took “CA-280” towards San Jose then you were traveling east, so now you’re suddenly west of Wolfe Rd. The winds must have blown your balloon again!
- “Keep right at the fork and merge onto Wolfe Rd”: Ok, I think.
- “Turn left onto Tantau Ave”: You’ll be stumbling on this one. Wolfe and Tantau don’t connect.
- “Turn right onto Apple Park Way”: wait, what?
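The "catapult" problems in that list are the kind of thing a machine can check mechanically: does each road in the sequence actually meet the next one? Here is a minimal Python sketch of that check. The `ROAD_LINKS` graph is a hand-built toy based only on the notes above (plus my assumption that Tantau meets Apple Park Way); it is not real map data, and `find_broken_steps` is a name made up for this illustration.

```python
# Toy road-adjacency graph: each entry says "these two roads meet".
# Hand-built for illustration from the notes above. NOT real map data.
ROAD_LINKS = {
    frozenset({"Homestead Rd", "N Tantau Ave"}),
    frozenset({"N Tantau Ave", "Pruneridge Ave"}),
    frozenset({"I-280", "Wolfe Rd"}),
    frozenset({"N Tantau Ave", "Apple Park Way"}),  # assumption
}


def find_broken_steps(roads, links):
    """Return (from_road, to_road) pairs that do not meet in the graph."""
    broken = []
    for current, following in zip(roads, roads[1:]):
        if frozenset({current, following}) not in links:
            broken.append((current, following))
    return broken


# The sequence of roads named in ChatGPT's directions, in order.
CHATGPT_ROUTE = [
    "Infinite Loop", "Homestead Rd", "N Tantau Ave", "Pruneridge Ave",
    "I-280", "Wolfe Rd", "N Tantau Ave", "Apple Park Way",
]

for start, end in find_broken_steps(CHATGPT_ROUTE, ROAD_LINKS):
    print(f"Catapult required: {start} -> {end}")
```

Note that a connectivity check like this only catches the catapult steps. It says nothing about turning the wrong way or ending up on the wrong side of an exit; for that you need a real routing engine with directed edges and geometry.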
But wait, it gets worse:
ChatGPT runs out of energy at step 47 somewhere in New Jersey, presumably completely befuddled and lost.
Now this authoritative nonsense isn’t limited to directions.
Let’s look at some maths3.
First a simple multiplication:
So far, so good. But now let’s make it a little more challenging:
ChatGPT certainly sounds confident. But is the answer correct?
Well, here’s the answer you’ll get from your calculator, or in this example, Wolfram|Alpha:
Huh? It looks like ChatGPT not only lives in an alternate maps universe, it also lives in an alternate maths universe.
Now the founder of Wolfram|Alpha, Stephen Wolfram, recently authored an excellent and fascinating article about this: “Wolfram|Alpha as the Way to Bring Computational Knowledge Superpowers to ChatGPT”. In it he’s lobbying for ChatGPT to use Wolfram to solve its alternate maths universe woes.
Stephen points out not only ChatGPT’s inability to do simple maths, but also its inability to calculate geographic distances, rank countries by size or determine which planets are above the horizon.
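For contrast, geographic distance is a solved problem for a computational engine. A minimal sketch of the standard haversine formula in Python follows. It treats the Earth as a sphere with the mean radius shown, which is an approximation; the function name `great_circle_km` is just my choice for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 so rounding error can never push asin out of its domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Sanity check: a quarter of the way around the equator should be
# one quarter of the sphere's circumference.
quarter = great_circle_km(0, 0, 0, 90)
print(quarter, math.pi / 2 * EARTH_RADIUS_KM)
```

The point is not that this is hard. It is that the answer is computed, not recalled, so it is the same every time you ask.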
Stephen’s big takeaway:
In many ways, one might say that ChatGPT never “truly understands” things
ChatGPT doesn’t understand maths. ChatGPT doesn’t understand geospatial. In fact all it understands is how to pull seemingly convincing answers out of what is essentially a large text database. You can sort of see this in its response to the question about what to do in Madrid — this is likely summarized from the numerous travel guides that have been written about Madrid.
But even that is flawed.
In order to work efficiently the information store from which ChatGPT pulls its answers has to be compressed. And it’s not a lossless compression. It therefore is vulnerable to suffering from the same kind of side effects as audio, video or images that use a lossy compression.
Ted Chiang covers this in his New Yorker article: “ChatGPT is a Blurry JPEG of the Web”
Think of ChatGPT as a blurry jpeg of all the text on the Web. It retains much of the information on the Web, in the same way that a jpeg retains much of the information of a higher-resolution image, but, if you’re looking for an exact sequence of bits, you won’t find it; all you will ever get is an approximation. But, because the approximation is presented in the form of grammatical text, which ChatGPT excels at creating, it’s usually acceptable. You’re still looking at a blurry jpeg, but the blurriness occurs in a way that doesn’t make the picture as a whole look less sharp.
In other words, don’t let ChatGPT’s skills at forming sentences fool you.
Clearly ChatGPT’s loquacious front-end needs to be able to connect to computational engines. That is what Stephen Wolfram argues for, in his case for a connection to his Wolfram|Alpha computational engine.
I can easily imagine a world where a natural language interface like ChatGPT could be connected to a wide variety of computational engines.
There might even be an internationally adopted standard for such interfaces. Let’s call that interface CENLI (“sen-ly”), short for “Computational Engine Natural Language Interface”.
I challenge folks like Stephen @ Wolfram|Alpha and Nadine @ OGC to push such a CENLI standard. In that way we could build natural language interfaces to all sorts of computational engines. This might include:
- All branches of Mathematics
- Financial Modeling
- Architectural Design
- Aeronautical Design
- Component Design
- … and — of course — all manner of Geospatial
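To make the CENLI idea a little more concrete, here is a purely hypothetical Python sketch. Nothing like this exists as a standard: `EngineRequest`, `EngineRegistry` and `exact_maths_engine` are names invented for illustration. The assumption is that the chatty front-end turns a question into a small structured request, and a registry hands that request to whichever engine actually knows how to compute the answer.

```python
from dataclasses import dataclass, field
from fractions import Fraction
from typing import Any, Callable, Dict


@dataclass
class EngineRequest:
    """Structured request a language front-end would emit (hypothetical)."""
    engine: str
    operation: str
    args: Dict[str, Any] = field(default_factory=dict)


class EngineRegistry:
    """Routes requests to registered computational engines (hypothetical)."""

    def __init__(self) -> None:
        self._engines: Dict[str, Callable[[str, Dict[str, Any]], Any]] = {}

    def register(self, name: str, handler: Callable[[str, Dict[str, Any]], Any]) -> None:
        self._engines[name] = handler

    def dispatch(self, request: EngineRequest) -> Any:
        if request.engine not in self._engines:
            raise LookupError(f"no engine registered for {request.engine!r}")
        return self._engines[request.engine](request.operation, request.args)


def exact_maths_engine(operation: str, args: Dict[str, Any]) -> Fraction:
    """Stand-in maths engine: exact rational arithmetic, no guessing."""
    x, y = Fraction(args["x"]), Fraction(args["y"])
    if operation == "multiply":
        return x * y
    if operation == "divide":
        return x / y
    raise ValueError(f"unsupported operation: {operation!r}")


registry = EngineRegistry()
registry.register("maths", exact_maths_engine)

request = EngineRequest("maths", "multiply", {"x": "1234567", "y": "7654321"})
print(registry.dispatch(request))
```

A geospatial engine would plug in the same way: register a handler under a name like "routing" or "site-selection" and let the language model worry only about producing the request.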
It turns out making a connection between a generative AI and a computational engine has been done already — by NASA. A chap called Ryan McClelland, a research engineer at NASA’s Goddard Space Flight Center in Maryland has been using generative AI for a few years now to design components for space hardware. The results look like something from an alien spaceship:
Jesus Diaz recently wrote a great article for Fast Company about Ryan’s work:
NASA is taking generative AI to space. The organization just unveiled a series of spacecraft and mission hardware designed with the same kind of artificial intelligence that creates images, text, and music out of human prompts. Called Evolved Structures, these specialized parts are being implemented in equipment including astrophysics balloon observatories, Earth-atmosphere scanners, planetary instruments, and space telescopes.
The components look as if they were extracted from an extraterrestrial ship secretly stored in an Area 51 hangar—appropriate given the engineer who started the project says he got the inspiration from watching sci-fi shows. “It happened during the pandemic. I had a lot of extra time and I was watching shows like The Expanse,” says Ryan McClelland, a research engineer at NASA’s Goddard Space Flight Center in Greenbelt, Maryland. “They have these huge structures in space, and it got me thinking . . . we are not gonna get there the way we are doing things now.”
As with most generative AI software, NASA’s design process begins with a prompt. “To get a good result you need a detailed prompt,” McClelland explains. “It’s kind of like prompt engineering.” Except that, in this case, he’s not typing a two-paragraph request hoping the AI will come up with something that doesn’t have an extra five more limbs. Rather, he uses geometric information and physical specifications as his inputs.
“So, for instance, I didn’t design any of this,” [McClelland] says, moving his hands over the intricate arms and curves. “I gave it these interfaces, which are just simple blocks [pointing at the little cube-like shapes you can see in the part], and said there’s a mass of five kilograms hanging off here, and it’s going to experience an acceleration of 60G.” After that, the generative AI comes up with the design. McClelland says that “getting the right prompt is sort of the skill set.”
What’s really interesting about McClelland’s work is that it is streamlining the long cycle of design -> engineering -> manufacturing. No longer does he need to pass off the designs to an engineering team that iterates on them and then passes them on to a manufacturing team that iterates even further. No. Now the generative AI tool compresses that process:
It does all of it internally, on its own, coming up with the design, analyzing it, assessing it for manufacturability, doing 30 or 40 iterations in just an hour. “A human team might get a couple iterations in a week.”
Jesus Diaz sums it up perfectly:
Indeed, to me, it feels like we are the hominids who found the monolith in 2001: A Space Odyssey. Generative AI is our new obsidian block, opening a hyper-speed path to a completely new industrial future.
So, given that a natural language interface to all sorts of computational engines is both possible and inevitable, what might a natural language interface to a geospatial computational engine look like and what might it be capable of doing?
First, let’s start with a consumer example.
I don’t know about you, but I love road trips. But I abhor insanely boring freeways and much prefer two lane back roads.
Many years ago when I lived in California I discovered the wonderful world of MadMaps4
MadMaps has developed a series of maps for people of my ilk. Originally they were designed for those strange people who for some reason like motorbikes, but for me, at the time when I had my trusty Subaru WRX, they were also perfect.
You see MadMaps’ one goal was to tell you about the interesting routes from A to B. So, when I was driving back to Redlands from my annual pilgrimage to the Esri user conference in San Diego, I would be guided by MadMaps to take the windy back roads over the mountains. It would take me about twice as long, but it was hellish fun.
Imagine if the knowledge of MadMaps was integrated into a geographic search engine or your favorite consumer mapping app. And imagine if it also happened to know something about your preferences and interests so that it could incorporate fun places to stop along the way.
It turns out I’m not the first person to think of this.
It was only recently that Porsche announced a revamped version of its ROADS driving app.
ROADS is a valiant attempt to use AI to do what MadMaps does but in an interactive app. Unfortunately the generated routes are, well, pretty simplistic and not particularly enthralling. They lack the reasoning and context that you get from studying a MadMap.
However, I don’t think it would take a huge amount of work by the smart boys and girls at Google Maps and Apple Maps to do something similar, but much more powerful. Imagine this prompt:
“Hey Siri, I’m looking to drive from Tucson to Colorado Springs. I’m traveling with my dog and I’d love to take my time, but I want to do the trip in two days. Can you recommend a route that takes in some beautiful scenery and some great places to eat and stop for good coffee? And by “good coffee” I mean good coffee, not brown water or chain coffee schlock. I’d obviously like to find good places to stop for walks to exercise the dog and I’d love to spend the night at some cute boutique hotel or motel close to some eclectic restaurants.”
If you try it today5 you will find what first appears to be a good answer, but on closer analysis it’s lacking in detail and is very vague in some places.
More importantly perhaps: it’s also just a text answer.
It’s not a detailed trip plan displayed on an interactive map that you can then tweak and edit. In other words, it’s only about 50% of the way there.
Switching gears, now let’s imagine a natural language interface to a complex geospatial analytics problem, this time applied to business.
As an example I’ll use the geospatial problem of something called “site selection”. This is a process of determining the best location for some object, some business or some facility. Traditionally this is performed with huge amounts of geospatial data about things like roads, neighborhoods, terrain, geology, climate, demographics, soils, zoning laws … the list goes on.
Organizations like Starbucks and Walmart have used these geospatial and geo-demographic analysis methods for decades to help determine the optimal location for their next store. Organizations like Verizon have used similar processes to help determine the best locations for cell phone towers based on where the population centers are and what the surrounding terrain looks like.
This methodology has not been limited to commercial use cases.
A long time ago I remember someone performing a complex geospatial analysis on the location of Iran’s Natanz uranium enrichment facility. They looked at things like the geology, the climate, the topography, access to transportation and energy. Using this information they spent a significant amount of time, energy and brainpower to determine other locations in Iran that might have similar characteristics. In other words: where else might Iran be hiding another such facility? I think there were only one or two places that the algorithm found.
What’s common about all these enterprise use cases is the complexity of getting to the answer. You have to set up all the right databases, you have to invent, develop and test your algorithms. And just like with the design -> engineering -> manufacturing process that NASA faces with component design, there is a feedback loop — for example, one of the challenges for locating a Starbucks is determining exactly what factors are driving the success of its most profitable stores.
All of this is compounded by the horrible complexity of the user interfaces to these systems. To get the best results you not only need to be well educated in something called ‘GIS’ 6, but it also doesn’t hurt to be an accomplished data analyst. My good friend, Shawn Hanna, who also happens to be a super sharp data analyst, used to work on these site selection scenarios for Petco. He can attest to the complexity of the problem.
But imagine if instead data analysts could issue a prompt to a geospatial computational engine to help them find the optimal answers more quickly:
“I’m looking to figure out the best location to open a new Petco store in the Atlanta metropolitan area. I’d like you to take into account the locations of current Petco stores, their sales and profitability and the location of competitive stores. I’d also like you to take into account the demographics of each potential location and match that against the demographics of my best performing stores. Also take into account likely population growth and predicted trends in the respective local economies. And, of course, information on which households own pets. When you’ve derived some answers, match that against suitable available commercial properties in the area. Rank the results and explain why you chose each location” 7
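Behind a prompt like that, the engine still has to do something concrete with "rank the results and explain why". One very simple way, sketched below in Python, is a weighted score over normalised factors, keeping each factor's contribution so the ranking can be explained. This is only an illustration: the site names, the numbers and the weights are all made up, real site selection models are far richer, and `rank_sites` is a name invented for this sketch.

```python
def rank_sites(candidates, weights):
    """Rank sites by a weighted sum of min-max normalised factors.

    Returns (name, score, contributions) tuples, best first. Negative
    weights penalise a factor (e.g. nearby competitors).
    """
    factors = list(weights)
    lows = {f: min(site[f] for site in candidates.values()) for f in factors}
    highs = {f: max(site[f] for site in candidates.values()) for f in factors}

    ranked = []
    for name, values in candidates.items():
        contributions = {}
        for f in factors:
            span = highs[f] - lows[f]
            # If every site has the same value the factor cannot discriminate.
            normalised = (values[f] - lows[f]) / span if span else 0.0
            contributions[f] = weights[f] * normalised
        ranked.append((name, sum(contributions.values()), contributions))
    return sorted(ranked, key=lambda entry: entry[1], reverse=True)


# Entirely made-up candidate data, for illustration only.
CANDIDATES = {
    "Site A": {"pet_households": 4200, "growth_pct": 1.5, "competitors": 3},
    "Site B": {"pet_households": 3100, "growth_pct": 3.0, "competitors": 1},
    "Site C": {"pet_households": 5000, "growth_pct": 0.5, "competitors": 4},
}
WEIGHTS = {"pet_households": 0.4, "growth_pct": 0.4, "competitors": -0.2}

for name, score, why in rank_sites(CANDIDATES, WEIGHTS):
    print(f"{name}: {score:.3f} {why}")
```

Choosing those weights is, of course, exactly the feedback-loop problem described earlier: somebody has to work out which factors really drive the best stores.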
The trick, as McClelland at NASA says, will be in good prompt engineering.
And of course, you’ll have to have the confidence that your chatty interface is connected to a reliable, dependable and knowledgeable computational engine.
It’s not going to eliminate your job, but it sure as hell is going to make you tons more productive.
We’re not there yet. But it’s coming.
Hell, we might even be able to do this:
1 For those of you that don’t remember, here is Clippy 1.0 in action:
2 By now many of you will have read Kevin Roose’s conversation with Bing in his New York Times article. If you don’t have access to the New York Times then you can see a reasonably good summary of the conversation in The Guardian.
3 If you live in the United States, that translates to ‘Math’. Why I’m not sure. People generally don’t study ‘Mathematic’. Perhaps that’s why people from the US sometimes have a reputation for not being as good at mathematics as people in other countries? They don’t realize there’s a number bigger than one.
4 Here is one of my favorite MadMaps:
5 ChatGPT’s answer to a road trip challenge. It’s a reasonably good start, but the directions are pretty vague:
6 GIS stands for ‘Geographically Insidious System’
7 FWIW, here is ChatGPT’s answer to this prompt:
- The folks at OpenAI for letting me highlight ChatGPT
- Stephen Wolfram for his article making the case to connect Wolfram|Alpha to ChatGPT: “Wolfram|Alpha as the Way to Bring Computational Knowledge Superpowers to ChatGPT”
- Ted Chiang for his excellent New Yorker article: “ChatGPT is a Blurry JPEG of the Web”
- Jesus Diaz for his fascinating Fast Company article: “NASA’s new AI-designed parts look like they’re from an alien starship”
- The great folks at MadMaps!
- Shawn Hanna for teaching me a few things along the way
- All the folks who came up with the Matrix
- My good friend, Dr. Barry Glick, who is always an inspiration!