How to Build a Northbound Capital Timing Strategy with Python and Tushare
This article explains a simple investment timing strategy that adjusts regular contributions based on daily northbound capital flows, walks through the financial concepts, provides complete Python code using Tushare to fetch and process the data, and shows back‑tested results and practical tips.
The author, a data‑analysis enthusiast, shares personal finance reflections and introduces a timing strategy that increases regular investments when the market is in a prolonged trough and reduces them when the market is hot, using daily northbound capital data.
What Is Northbound Capital?
In the Chinese stock market, "north" refers to mainland stocks and "south" to Hong Kong stocks. Northbound capital is the money flowing from Hong Kong (and international investors) into the mainland market, typically via the Stock Connect channels.
Historically, northbound capital has tended to outperform most domestic investors, possibly because of richer experience, broader information channels, or the presence of speculative foreign capital.
Investors can track northbound buying during the trading day, but two figures are often confused: net inflow and net buy. Net inflow counts both executed trades and buy orders that have not yet been filled, while net buy counts only executed trades (buys minus sells).
Because net inflow is always greater than or equal to net buy, the author recommends focusing on net buy amounts for a more accurate view of actual foreign participation.
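To make the distinction concrete, the sketch below uses made-up figures and a simplified model in which net inflow counts every submitted buy order while net buy counts only executed ones. All variable names and numbers are hypothetical and are not taken from exchange data.

```python
# Hypothetical single-day figures, in units of 100 million yuan.
buy_orders_submitted = 185.0   # all northbound buy orders, filled or not
buy_executed = 180.0           # the part of those orders that actually traded
sell_executed = 160.0          # executed northbound sell trades

# Simplified assumption: net inflow is driven by submitted buy orders,
# so unfilled orders still count toward it.
net_inflow = buy_orders_submitted - sell_executed

# Net buy only counts trades that were actually executed.
net_buy = buy_executed - sell_executed

# The gap between the two figures is the unfilled buy volume.
unfilled = net_inflow - net_buy
print(f"net inflow: {net_inflow:.1f}, net buy: {net_buy:.1f}, unfilled: {unfilled:.1f}")
```

Under this model the two figures coincide only when every buy order is filled, which is why net buy is the more conservative measure.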
1. Import Libraries
# Import required libraries
import tushare as ts
import datetime
import pandas as pd
import numpy as np

2. Connect to Tushare with Token
# Replace the token with your own
token = 'replace_with_your_token'
pro = ts.pro_api(token)

3. Get Trading Calendar
# Get all trading days
trade_date = pro.trade_cal(start_date='20180101', end_date=datetime.datetime.today().strftime('%Y%m%d'))
# Keep open days only, sorted ascending so later slices run from oldest to newest
date_list = sorted(trade_date[trade_date.is_open == 1]['cal_date'].values)

4. Fetch Daily Northbound Capital Data
Because each request is limited to 300 rows, the data is retrieved in three batches and then concatenated.
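Three fixed slices are enough only while the calendar holds at most 900 trading days; beyond that, the last request would exceed the 300-row limit. A minimal sketch of a more general approach follows, assuming a `fetch` callable that accepts the same `start_date` and `end_date` keywords as `pro.moneyflow_hsgt`. The names `fetch_in_batches` and `fake_fetch` are hypothetical, and the stub stands in for the API so the example runs offline.

```python
import pandas as pd

def fetch_in_batches(dates, fetch, batch_size=300):
    """Call fetch(start_date=..., end_date=...) once per chunk of trading days."""
    frames = []
    for i in range(0, len(dates), batch_size):
        chunk = dates[i:i + batch_size]
        frames.append(fetch(start_date=chunk[0], end_date=chunk[-1]))
    if not frames:
        return pd.DataFrame()
    combined = pd.concat(frames, ignore_index=True)
    return combined.sort_values('trade_date').reset_index(drop=True)


# Offline stand-in for pro.moneyflow_hsgt (hypothetical data, two rows per call).
def fake_fetch(start_date, end_date):
    return pd.DataFrame({'trade_date': [start_date, end_date],
                         'north_money': [1.0, 2.0]})


# 336 fake "trading days", which forces two batches of at most 300 dates.
demo_dates = [f"2018{m:02d}{d:02d}" for m in range(1, 13) for d in range(1, 29)]
demo = fetch_in_batches(demo_dates, fake_fetch)
print(demo)
```

With a real client, the call would look like `fetch_in_batches(date_list, pro.moneyflow_hsgt)`, which issues as many requests as the date range needs.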
# First batch
df_data1 = pro.moneyflow_hsgt(start_date=date_list[0:300][0], end_date=date_list[0:300][-1])
# Second batch
df_data2 = pro.moneyflow_hsgt(start_date=date_list[300:600][0], end_date=date_list[300:600][-1])
# Third batch
df_data3 = pro.moneyflow_hsgt(start_date=date_list[600:][0], end_date=date_list[600:][-1])
# Combine and sort
# pd.concat replaces DataFrame.append, which pandas 2.0 no longer provides
df_data = pd.concat([df_data1, df_data2, df_data3], ignore_index=True)
df_data = df_data.sort_values('trade_date', ascending=True).reset_index(drop=True)
# Rename columns for readability
df_data = df_data.rename(columns={'ggt_ss':'Shanghai', 'ggt_sz':'Shenzhen', 'hgt':'Shanghai Connect', 'sgt':'Shenzhen Connect', 'north_money':'Northbound', 'south_money':'Southbound'})

5. Unit Conversion
The raw values are in millions of yuan; multiplying by 0.01 converts them to units of 100 million yuan, which are easier to read.
# Convert from millions of yuan to units of 100 million yuan
for col in ['Shanghai', 'Shenzhen', 'Shanghai Connect', 'Shenzhen Connect', 'Northbound', 'Southbound']:
    df_data[col] = df_data[col] * 0.01

6. Remove Non‑Trading Days
Some mainland trading days have no northbound activity because Stock Connect is closed for holidays (e.g., Hong Kong public holidays). Rows where the Shanghai Connect column is NaN are filtered out.
# Filter out days without northbound trading
df_data2 = df_data.loc[~df_data['Shanghai Connect'].isna(), :].reset_index(drop=True)

7. Core Strategy Implementation
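The strategy computes the mean and standard deviation of northbound net buy over the previous 252 trading days (roughly one year), then sets the upper and lower thresholds at mean ± 1.5 × std. A day at or above the upper threshold is flagged bullish, and a day at or below the lower threshold is flagged bearish: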
signal = 'No Signal'
for index, row in df_data2.iterrows():
    # Skip rows until a full 252-day look-back window is available
    if index < 252:
        continue
    # Statistics over the previous 252 trading days (current day excluded)
    recent = df_data2.iloc[index-252:index]
    avg = recent['Northbound'].sum() / 252
    std = recent['Northbound'].std()
    up_line = round(float(avg + std * 1.5), 4)
    down_line = round(float(avg - std * 1.5), 4)
    # Values are in units of 100 million yuan (raw millions * 0.01)
    detail = (f"Northbound net buy: {row['Northbound']:.4f}, "
              f"Upper: {up_line}, Lower: {down_line} (unit: 100M CNY)")
    if row['Northbound'] >= up_line:
        signal = 'Bullish'
        print(f"{row['trade_date']}: <{signal}> {detail}")
    elif row['Northbound'] <= down_line:
        signal = 'Bearish'
        print(f"{row['trade_date']}: <{signal}> {detail}")
    # On the last row, report the most recently triggered signal
    if index == df_data2.shape[0] - 1:
        print(f"\nLatest data\n{row['trade_date']}: <{signal}>\n{detail}\n")

The back‑test in the original report found that using an upper bound of +1.5 × std and a lower bound of –1.5 × std yielded the highest annualized return, 37.54%.
Sample output (as of August 30) shows a continuous bearish signal since late July.
Note that the report’s original condition referred to "northbound capital inflow" while the Tushare API provides "net buy"; therefore, threshold values may need adjustment (e.g., using ±0.8 × std for net buy).
Readers are encouraged to experiment with different threshold parameters, back‑test the results, and adapt the strategy to their own dollar‑cost averaging plans.
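One way to run such experiments is to express the band logic as a function of the window length and the std multiplier. The sketch below is an assumed vectorized variant, not the original report's code: `band_signals`, its parameters, and the synthetic series are hypothetical, and thresholds are left unrounded. The `state` column carries the last triggered signal forward, mirroring how a signal persists until the opposite band is hit.

```python
import numpy as np
import pandas as pd

def band_signals(net_buy, window=252, k=1.5):
    """Compare each day's net buy with mean ± k·std of the previous `window` days."""
    prior = net_buy.shift(1)               # exclude the current day from its own window
    avg = prior.rolling(window).mean()
    std = prior.rolling(window).std()      # sample std (ddof=1), like Series.std()
    upper = avg + k * std
    lower = avg - k * std
    # Comparisons against NaN bands are False, so early rows stay 'No Signal'
    labels = np.select([net_buy >= upper, net_buy <= lower],
                       ['Bullish', 'Bearish'], default='No Signal')
    signal = pd.Series(labels, index=net_buy.index)
    # Carry the last triggered signal forward
    state = signal.where(signal != 'No Signal').ffill().fillna('No Signal')
    return pd.DataFrame({'net_buy': net_buy, 'upper': upper, 'lower': lower,
                         'signal': signal, 'state': state})


# Synthetic net-buy series (random numbers, not market data) for a quick comparison
rng = np.random.default_rng(0)
fake_net_buy = pd.Series(rng.normal(0, 30, 600))
for k in (0.8, 1.5):
    result = band_signals(fake_net_buy, k=k)
    triggered = (result['signal'] != 'No Signal').sum()
    print(f"k={k}: {triggered} signal days out of {len(result)}")
```

With the real data, the call would be `band_signals(df_data2['Northbound'], k=0.8)`. A narrower band always triggers at least as often as a wider one, so the multiplier directly controls how frequently contributions would be adjusted.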