Alternative Data Mining: From 19th‑Century Cholera Mapping to Modern AI‑Driven Risk Modeling

This talk reviews the concept of alternative data, illustrates its early use in John Snow's cholera map, explores contemporary AI‑powered systems such as IBM's Debater and satellite‑based poverty estimation, and presents the speaker's own research on using unconventional data for financial‑market risk detection and prediction.

DataFunTalk

The presentation begins by defining "alternative data" as niche, under‑exploited datasets and outlines the agenda: historical examples, cutting‑edge engineering advances, and the speaker's own research on risk modeling.

1. Historical example – John Snow's cholera map (19th century): During the 1854 London outbreak, Snow surveyed households, plotted cholera cases on a street map, and traced the cluster to a contaminated public water pump on Broad Street. He then persuaded the authorities to remove the pump handle, an early demonstration of data-driven epidemiology.

2. Modern AI applications – IBM Debater: The Debater system, described as a decade-long, multi-nation effort, combines deep learning, natural-language processing, and data mining to generate arguments from news articles and historical debate transcripts. The talk presents it as evidence that AI can emulate, and in places surpass, human debaters.

3. Satellite imagery for poverty estimation (Science, 2016): Researchers used publicly available night-time light intensity and high-resolution satellite images to derive features (e.g., building density) and predict poverty indicators in African countries, working around the lack of reliable ground-truth socioeconomic data.

4. Risk modeling with unconventional data: The speaker's group uses alternative data to monitor sudden risk events (terrorist attacks, natural disasters, pandemics) and to predict how secondary markets react to them. They build event-driven market models from historical incident databases and real-time news extraction, use night-light data as a proxy for economic development when weighting events, and report about 70% accuracy with decision-tree classifiers for event-impact prediction.

5. Political-tweet analysis for market forecasting: By extracting entities, linking them to knowledge bases, and performing sentiment analysis and causal reasoning on tweets from high-profile officials (e.g., former U.S. President Trump), the team aims to forecast market movements and generate early risk warnings.
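To make the idea in item 3 concrete, here is a minimal sketch of regressing a poverty indicator on a night-light feature. It is a deliberately simplified stand-in, not the method of the Science paper: it fits ordinary least squares on a single feature, and the numbers are synthetic toy values invented for illustration. The names `fit_line`, `r_squared`, `night_light`, and `consumption` are assumptions of this sketch.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic toy values, NOT real survey or satellite data.
night_light = [0.0, 1.0, 2.0, 3.0, 4.0]   # mean night-time brightness per village
consumption = [1.1, 2.9, 5.2, 6.8, 9.0]   # surveyed consumption indicator

intercept, slope = fit_line(night_light, consumption)
fit_quality = r_squared(night_light, consumption, intercept, slope)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={fit_quality:.3f}")
```

In practice the interesting work lies in the features themselves (for example, building density derived from imagery); the regression step is the easy part.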
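The decision-tree classifier mentioned in item 4 can be illustrated with a small, self-contained sketch. Everything specific here is an assumption: the two features (an event severity score and a night-light development weight), the binary label (whether the market fell), and the eight synthetic events. The talk does not describe its feature set or tree settings, and this toy data says nothing about the roughly 70% accuracy it reports.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, max_depth=2):
    """Internal nodes are (feature, threshold, left, right); leaves are labels."""
    split = best_split(rows, labels) if max_depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], max_depth - 1),
            build_tree([r for r, _ in right], [y for _, y in right], max_depth - 1))


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Synthetic events: (severity score, night-light development weight in [0, 1]).
events = [(5, 0.9), (80, 0.8), (120, 0.2), (3, 0.1),
          (200, 0.7), (60, 0.1), (10, 0.6), (150, 0.9)]
market_fell = [0, 1, 0, 0, 1, 0, 0, 1]   # hypothetical labels, 1 = index dropped

tree = build_tree(events, market_fell, max_depth=2)
print(predict(tree, (100, 0.85)), predict(tree, (100, 0.1)))
```

A shallow tree is used here because its splits stay readable, which matters when a risk analyst needs to see why an event was flagged.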
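The tweet pipeline in item 5 (entity extraction, knowledge-base linking, sentiment) can be sketched in a few lines. This is a toy under stated assumptions: the knowledge base is a hypothetical three-entry dictionary, sentiment is a simple word-count lexicon, the example texts are made up rather than real tweets, and the causal-reasoning step from the talk is not modeled at all.

```python
import re

# Hypothetical mini knowledge base: surface form -> (canonical entity, exposed sector).
KNOWLEDGE_BASE = {
    "china": ("China", "trade-exposed equities"),
    "tariffs": ("Tariffs", "trade-exposed equities"),
    "fed": ("Federal Reserve", "rates and banks"),
}
# Hypothetical sentiment lexicons; a real system would use a trained model.
POSITIVE = {"great", "strong", "win"}
NEGATIVE = {"bad", "unfair", "weak"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def link_entities(tokens):
    """Map tokens to knowledge-base entries, keeping first-seen order."""
    linked = []
    for tok in tokens:
        entry = KNOWLEDGE_BASE.get(tok)
        if entry is not None and entry not in linked:
            linked.append(entry)
    return linked


def sentiment(tokens):
    """Positive-word count minus negative-word count."""
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def risk_signal(text):
    """Raise a warning when a linked entity appears with negative sentiment."""
    tokens = tokenize(text)
    linked = link_entities(tokens)
    score = sentiment(tokens)
    return {
        "entities": [name for name, _ in linked],
        "sectors": sorted({sector for _, sector in linked}),
        "sentiment": score,
        "warning": bool(linked) and score < 0,
    }


# Made-up example texts, not real tweets.
alert = risk_signal("Tariffs on China are going up. Very unfair trade, bad for everyone!")
calm = risk_signal("Great meeting today, strong jobs numbers!")
print(alert)
print(calm)
```

The warning rule is intentionally crude; it only shows where entity linking and sentiment meet before any market model is consulted.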

The talk concludes with a brief recap, encouraging further exploration of alternative data to uncover hidden “water pumps” that can drive societal and technological progress.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
