Google uses old news reports and artificial intelligence to predict flash floods


Flash floods are among the deadliest weather events in the world, killing more than 5,000 people each year. They are also among the most difficult to predict. But Google believes it has solved that problem in an unlikely way: by reading the news.

While humans have collected a wealth of weather data, flash floods are too brief and localized to be measured comprehensively, in the same way that temperature or even river flows are monitored over time. That data gap means that deep learning models, which are increasingly capable of forecasting weather, cannot predict flash floods.

To solve that problem, Google researchers used Gemini, Google’s large language model, to classify 5 million news articles from around the world, isolating reports of 2.6 million different floods and converting those reports into a geotagged time series called “Groundsource”. It’s the first time the company has used language models for this type of work, according to Gila Loike, product manager at Google Research. The research and data set were shared publicly Thursday morning.
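The pipeline described above can be sketched in miniature. In this hypothetical sketch, a keyword check stands in for the Gemini classification call, and the coordinates and grid-cell scheme are illustrative assumptions, not details from Google's actual system:

```python
from dataclasses import dataclass
from collections import defaultdict


@dataclass
class FloodEvent:
    lat: float
    lon: float
    date: str  # ISO date of the reported flood


def classify_article(text: str):
    """Stand-in for the LLM call described in the article.

    In Google's pipeline, a large language model reads each news report
    and decides whether it describes a flood; here a simple keyword check
    plays that role, and the returned coordinates are made up.
    """
    if "flood" in text.lower():
        return FloodEvent(lat=-19.8, lon=34.8, date="2019-03-15")
    return None


def build_time_series(articles):
    """Aggregate per-article flood events into a geotagged time series,
    keyed by a coarse (lat, lon) grid cell and date."""
    series = defaultdict(int)
    for text in articles:
        event = classify_article(text)
        if event:
            cell = (round(event.lat), round(event.lon))
            series[(cell, event.date)] += 1
    return dict(series)


articles = [
    "Flash flood sweeps through the city after heavy rain",
    "Local elections held on schedule",
]
print(build_time_series(articles))
# one flood report counted in grid cell (-20, 35) on 2019-03-15
```

The key idea is the conversion step: unstructured text in, a countable, geotagged event record out, which is what makes the result usable as training data.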

Using Groundsource as ground truth, researchers trained a model built on a long short-term memory (LSTM) neural network that ingests global weather forecasts and outputs the probability of flash flooding in a given area.
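To make the LSTM step concrete, here is a minimal forward pass of a single LSTM cell over a short sequence of daily weather features, ending in a flood probability. The feature choices, dimensions, and random weights are all illustrative assumptions; Google's production model is far larger and trained on the Groundsource data:

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# Toy dimensions: 3 weather features per day (e.g. precipitation,
# temperature, soil moisture -- illustrative, not Google's inputs),
# hidden size 8, one scalar probability output.
n_in, n_hid = 3, 8
W = rng.normal(0, 0.1, (4 * n_hid, n_in + n_hid))  # stacked gate weights
b = np.zeros(4 * n_hid)
w_out = rng.normal(0, 0.1, n_hid)


def lstm_flood_probability(forecast_sequence):
    """Run one LSTM cell over a sequence of daily weather vectors and
    squash the final hidden state into a flood probability in (0, 1)."""
    h = np.zeros(n_hid)  # hidden state
    c = np.zeros(n_hid)  # cell state (the "long short-term memory")
    for x in forecast_sequence:
        z = W @ np.concatenate([x, h]) + b
        i, f, o, g = np.split(z, 4)          # input, forget, output, candidate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)           # update cell memory
        h = o * np.tanh(c)                   # expose gated memory
    return float(sigmoid(w_out @ h))


forecast = rng.normal(size=(7, n_in))  # a week of forecast features
print(lstm_flood_probability(forecast))
```

The recurrent cell state is what lets the model carry the effect of, say, several days of accumulated rainfall forward to the moment a flash flood becomes likely.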

Google’s flash flood forecasting model now highlights risks for urban areas in 150 countries on the company’s Flood Center platform, and its data is shared with emergency response agencies around the world. António José Beleza, an emergency response official at the Southern African Development Community who tested the forecasting model with Google, said it helped his organization respond to floods more quickly.

The model still has limitations. For one, it has fairly low resolution, identifying risks only at the level of 20-square-kilometer areas. And it’s not as accurate as the U.S. National Weather Service’s flood warning system, in part because Google’s model doesn’t incorporate local radar data, which allows real-time monitoring of rainfall.

Part of the point, however, is that the project was designed to work in places where local governments can’t afford to invest in expensive weather-sensing infrastructure or don’t have extensive records of weather data.


“Because we’re aggregating millions of reports, the Groundsource data set actually helps rebalance the map,” Juliet Rothenberg, a program manager on Google’s Resilience team, told reporters this week. “It allows us to extrapolate to other regions where there is not as much information.”

Rothenberg said the team hopes its approach of using LLMs to derive quantitative data sets from qualitative written sources can be applied to other ephemeral but important forecasting phenomena, such as heat waves and mudslides.

Marshall Moutenot, CEO of Upstream Tech, a company that uses similar deep learning models to forecast river flows for clients such as hydroelectric companies, said Google’s contribution is part of a growing effort to gather data for deep learning-based weather forecasting models. Moutenot co-founded dynamic.org, a group curating a collection of machine learning-ready weather data for researchers and startups.

“Data scarcity is one of the most difficult challenges in geophysics,” Moutenot said. “There’s too much data about Earth, and then when you want to compare it to ground truth, there’s not enough. This was a really creative approach to getting that data.”
