How Google’s Gemini Extracted 2.6 Million Flood Events from 150 Countries’ News

Google Research released the open‑source Groundsource flood dataset, built by automatically processing more than 5 million news articles from over 150 countries with the Gemini large‑language model, yielding over 2.6 million verified flood event records that are evaluated against GDACS and DFO for precision, recall, and spatial resolution.


Google Research announced the open‑source Groundsource flood dataset, which extracts verified ground‑truth flood observations from unstructured data to map historical disaster footprints with unprecedented precision. The team processed over 5 million news articles from more than 150 countries, ultimately curating more than 2.6 million historical flood event records for global flood research.

Motivation and Context

Floods are among the most frequent and destructive natural hazards, making high‑quality historical flood data a cornerstone for hydrological modeling, climate impact analysis, risk assessment, and policy making. Traditional observation networks are sparse and uneven, and existing global flood databases cover only a limited set of events, leaving many regions under‑represented.

Dataset Construction Pipeline

The construction follows a standardized automated workflow. In the data‑collection stage, web crawlers gathered publicly available news reports dating back to 2000. Each article received a flood‑relevance score from Google’s WebRef named‑entity‑recognition system; pages scoring above a threshold of 0.6 were retained, leaving roughly 9.5 million candidate webpages. Manual checks showed that about half of these actually reported flood events.
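The threshold step above can be sketched as a simple score-and-filter pass. WebRef itself is not publicly available, so `score_flood_relevance` below is a hypothetical keyword-based stand-in; only the 0.6 cutoff comes from the article.

```python
# Minimal sketch of the relevance-threshold filter described above.
# `score_flood_relevance` is a hypothetical stand-in for Google's WebRef
# NER scoring, which is not public; the 0.6 threshold is from the article.

def score_flood_relevance(text: str) -> float:
    """Toy score: fraction of flood-related keywords present in the text."""
    keywords = {"flood", "inundation", "deluge", "overflow", "submerged"}
    words = set(text.lower().split())
    return len(keywords & words) / len(keywords)

def filter_candidates(pages, threshold=0.6):
    """Keep only pages whose relevance score clears the threshold."""
    return [p for p in pages if score_flood_relevance(p["text"]) >= threshold]

pages = [
    {"url": "a", "text": "severe flood left streets submerged after river overflow caused inundation and deluge"},
    {"url": "b", "text": "quarterly earnings report for the tech sector"},
]
kept = filter_candidates(pages)
```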

During text extraction, the system stripped ads and navigation elements, retaining only the article body and publication date, and discarded unreadable or inaccessible pages, resulting in approximately 7.5 million usable articles. Non‑English texts were translated to English, and geographic names were extracted to build a candidate location pool.

Identifying specific flood events from the news text was the most complex step. Reports often contain multiple locations and vague temporal expressions such as “yesterday” or “last week.” To address this, the researchers designed a structured prompting framework for the Gemini large‑language model and tuned it on 250 manually annotated articles. Raw text in some 80 languages was captured with Google Read Aloud and standardized to English via the Cloud Translation API. The model performed four tasks sequentially: (1) determine whether the article describes a genuine flood event, (2) extract and normalize the event date, (3) identify the affected locations, and (4) map location names to standard geographic identifiers.
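The four sequential tasks can be framed as one structured-output request plus validation of the model's reply. The actual prompts used with Gemini are not published, so the template and field names below are illustrative assumptions, not the team's prompt.

```python
import json

# Sketch of the four-task structured prompt described above. The real
# Gemini prompts are not published; keys and wording here are illustrative.
PROMPT_TEMPLATE = """You are an information-extraction system.
Article published on {pub_date}:
{article}

Return JSON with exactly these keys:
  "is_flood_event": true/false                                   (task 1)
  "event_date": "YYYY-MM-DD", resolving phrases like "yesterday" (task 2)
  "locations": list of affected place names                      (task 3)
  "location_ids": standard geographic identifiers for them       (task 4)
"""

def parse_extraction(model_output: str):
    """Validate the model's JSON reply; reject malformed or non-flood results."""
    try:
        record = json.loads(model_output)
    except json.JSONDecodeError:
        return None
    if not record.get("is_flood_event"):
        return None
    required = {"event_date", "locations", "location_ids"}
    return record if required <= record.keys() else None

reply = ('{"is_flood_event": true, "event_date": "2024-09-14", '
         '"locations": ["Valencia"], "location_ids": ["id-hypothetical"]}')
record = parse_extraction(reply)
```

Validating the reply before use matters because LLM outputs can be malformed; the parser simply drops anything that fails to decode or lacks a required field.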

Applying this pipeline, about 5 million of the 7.5 million articles were identified as containing real flood events. Compared with the manually annotated sample, event‑recognition achieved a precision of ~75 % and a recall of ~90 %; date and location extraction were slightly less accurate but still provided useful spatiotemporal cues.

For geocoding, the system matched recognized places to existing geographic entities when possible; otherwise, it used a geocoding service to convert names to coordinates and generated small buffer zones for spatial analysis.
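The buffer-zone step can be approximated by expanding each geocoded point into a small bounding box and testing boxes for intersection. The 5 km radius below is an assumption; the article does not specify the buffer size used.

```python
import math

# Sketch of the buffer-zone step: convert a geocoded point into a small
# bounding box for spatial overlap tests. The 5 km radius is illustrative.

def buffer_box(lat: float, lon: float, radius_km: float = 5.0):
    """Return (min_lat, min_lon, max_lat, max_lon) around a point."""
    dlat = radius_km / 111.0                                       # ~111 km per degree of latitude
    dlon = radius_km / (111.0 * math.cos(math.radians(lat)))       # degrees of longitude shrink with latitude
    return (lat - dlat, lon - dlon, lat + dlat, lon + dlon)

def boxes_overlap(a, b) -> bool:
    """Axis-aligned bounding-box intersection test."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

box1 = buffer_box(39.47, -0.38)   # a point near Valencia, Spain
box2 = buffer_box(39.50, -0.40)   # a nearby report: boxes overlap
```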

Finally, records with overlapping time‑space information were merged into single flood events, and quality control removed records with excessively large extents or anomalous timestamps. The resulting dataset contains over 2.64 million independent records, each representing a flood observation captured by news coverage at a specific time and place.
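The merge step can be sketched as a greedy pass that collapses records close in both time and place into one event. The 3-day window and the exact-location match are assumptions; the article says only that records with overlapping time-space information were merged.

```python
from datetime import date

# Sketch of the merge step: collapse records that overlap in time and
# space into a single event. The 3-day window and exact location match
# are illustrative assumptions, not the published merging rule.

def same_event(a: dict, b: dict, window_days: int = 3) -> bool:
    close_in_time = abs((a["date"] - b["date"]).days) <= window_days
    return close_in_time and a["location_id"] == b["location_id"]

def merge_records(records):
    """Greedy single-pass merge: attach each record to the first matching event."""
    events = []
    for rec in sorted(records, key=lambda r: r["date"]):
        for ev in events:
            if same_event(ev, rec):
                ev["sources"] += rec["sources"]
                break
        else:
            events.append(dict(rec))
    return events

records = [
    {"date": date(2024, 9, 14), "location_id": "valencia", "sources": 1},
    {"date": date(2024, 9, 15), "location_id": "valencia", "sources": 1},
    {"date": date(2024, 9, 14), "location_id": "chennai",  "sources": 1},
]
events = merge_records(records)   # two Valencia records collapse into one event
```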

Dataset Evaluation

The authors evaluated Groundsource on three dimensions: precision, spatiotemporal distribution, and consistency with external databases (GDACS and Dartmouth Flood Observatory, DFO). Randomly sampling 400 records, they verified original news sources and found that strictly “accurate” records accounted for 60 % (95 % CI ±5 %). Including records with minor deviations raised usable events to ~82 %.
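The reported ±5 % margin is consistent with the standard normal approximation for a binomial proportion at the stated sample size, which can be checked in two lines:

```python
import math

# Reproduce the reported 95% confidence interval for the manual audit:
# 60% strictly accurate out of n = 400 sampled records, using the
# normal approximation for a binomial proportion.
p, n = 0.60, 400
half_width = 1.96 * math.sqrt(p * (1 - p) / n)   # ~0.048, i.e. roughly +/- 5 points
```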

Temporal analysis revealed a recency bias: about 64 % of records fall between 2020 and 2025, with 2025 alone contributing 15 % of entries, reflecting the rapid growth of digital news rather than an actual increase in flood frequency.

Spatially, coverage mirrors media density—regions with dense news reporting show more records, while low‑coverage areas suffer from under‑representation. Nevertheless, the dataset captures major flood‑prone zones such as Europe, South Asia, and Southeast Asia, aligning closely with GDACS’s high‑impact flood locations.

Resolution analysis shows that the average event footprint is 142 km², with 82 % of records smaller than 50 km², enabling block‑level or community‑scale analyses that traditional global disaster databases often miss.

Comparisons with GDACS and DFO demonstrate high recall: since 2020, Groundsource recalls 85 %–100 % of GDACS events, and in well‑instrumented regions like the United States, recall rates reach 96 % (GDACS) and 91 % (DFO). Recall correlates strongly with disaster severity, with major floods achieving near‑or‑above 90 % recall.
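Recall against a reference catalogue such as GDACS reduces to counting reference events that have at least one matching dataset record. The matching predicate below is deliberately simplified; in practice it would combine the spatial-overlap and temporal-window tests described earlier.

```python
# Sketch of the recall comparison against a reference catalogue such as
# GDACS: a reference event counts as recalled if any dataset record
# matches it. The matching rule here is a simplified illustration.

def recall(reference_events, dataset_records, match) -> float:
    hits = sum(1 for ev in reference_events
               if any(match(ev, rec) for rec in dataset_records))
    return hits / len(reference_events)

ref = [{"id": 1, "country": "US"}, {"id": 2, "country": "IN"}]
data = [{"event": 1, "country": "US"}]
rate = recall(ref, data, lambda ev, rec: ev["country"] == rec["country"])
```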

AI‑Driven Flood Research Landscape

Beyond dataset creation, the article highlights broader AI‑driven flood research. MIT researchers tackled temporal ambiguity and place‑name disambiguation in LLM‑based extraction, improving date‑extraction accuracy to over 80 % and adding multilingual support. The National University of Singapore combined AI‑extracted flood events with urban drainage and high‑resolution terrain data to build city‑scale flood risk models, linking event frequency and impact to infrastructure.

In industry, Microsoft Research and NASA collaborated on the Hydrology Copilot platform, integrating news‑derived flood events, satellite imagery, and real‑time hydrological monitoring to predict flood probabilities and impact zones, now piloted in the United States and several other countries.

Overall, automatic extraction of flood events from news text is emerging as a valuable complement to traditional observation data, offering richer, higher‑resolution information for global flood risk studies as model capabilities and data scales continue to improve.

Google · large language model · geospatial analysis · AI extraction · flood dataset · Groundsource
Written by

HyperAI Super Neural

Deconstructing the sophistication and universality of technology, covering cutting-edge AI for Science case studies.
