How a Rolling Random Forest Strategy Predicts Bitcoin’s Weekly Direction

This article explains a Python‑based rolling random‑forest classifier that uses a 30‑day training window and selected technical indicators to forecast whether Bitcoin’s price will rise or fall over the next seven days, detailing the methodology, code, back‑test results, and limitations.

Data STUDIO

Core Concepts

Random Forest Classifier

A random forest builds many decision trees independently. Each tree is trained on a bootstrap sample of the training rows (bagging), and at each split it considers only a random subset of the features, which reduces the correlation between trees. The final prediction is obtained by majority voting across the trees.
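As a rough illustration of the aggregation step, the sketch below fits a small forest on synthetic data (not market data) and checks how scikit-learn combines the trees. The dataset, the tree count and the other settings are assumptions chosen for the example, not the article's hyper-parameters. In scikit-learn the vote is implemented by averaging each tree's class probabilities; with fully grown trees those probabilities are typically 0 or 1, so the average is effectively the share of trees voting for each class.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary classification data (illustrative only).
X, y = make_classification(n_samples=300, n_features=10, n_informative=4,
                           random_state=0)

forest = RandomForestClassifier(
    n_estimators=50,       # number of trees (assumed value)
    max_features="sqrt",   # random subset of features considered at each split
    bootstrap=True,        # each tree is fit on a bootstrap sample of the rows
    random_state=0,
)
forest.fit(X, y)

sample = X[:5]

# Collect every tree's class probabilities, then average them by hand.
per_tree_proba = np.stack(
    [tree.predict_proba(sample) for tree in forest.estimators_]
)
mean_proba = per_tree_proba.mean(axis=0)

# The forest's own output for the same rows.
forest_proba = forest.predict_proba(sample)
forest_pred = forest.predict(sample)

print(np.allclose(mean_proba, forest_proba))
```

The hand-computed average should agree with `forest.predict_proba`, and the predicted class is the one with the highest averaged probability.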

Rolling Prediction Evaluation

Rolling prediction simulates performance in a dynamic market by repeatedly retraining the model on a fixed‑size recent window and forecasting a future horizon. After each iteration the window slides forward one day.

Specific Steps

Define a training window (e.g., past 30 days).

Train the model on data within the window.

Predict the price direction for the next 7 days.

Slide the window forward by one day.

Repeat and aggregate predictions to compute overall metrics.
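The steps above can be sketched as a small index generator. `rolling_windows` is a hypothetical helper, not part of the original script. It also makes one assumption explicit: a row's label depends on the price `horizon` days later, so the training window here ends `horizon` days before the row being predicted, which keeps labels that are not yet known out of training. `n_rows` counts only rows that have a label.

```python
from typing import Iterator, Tuple


def rolling_windows(n_rows: int, window: int, horizon: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (train_start, train_end, predict_idx); train_end is exclusive.

    Row j's label uses the price at row j + horizon, so when predicting
    row i only rows j <= i - horizon have labels that are already known.
    """
    first_predict_idx = window + horizon - 1
    for predict_idx in range(first_predict_idx, n_rows):
        train_end = predict_idx - horizon + 1
        train_start = train_end - window
        yield train_start, train_end, predict_idx


# 40 labelled rows, a 30-day training window and a 7-day horizon.
splits = list(rolling_windows(n_rows=40, window=30, horizon=7))
print(splits[0])    # (0, 30, 36)
print(splits[-1])   # (3, 33, 39)
print(len(splits))  # 4
```

Each tuple describes one iteration: train on rows `train_start` to `train_end - 1`, predict row `predict_idx`, then move everything forward by one day.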

Python Implementation

Setup and Configuration

import pandas as pd
import numpy as np
import yfinance as yf
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)
import warnings

TICKER = 'BTC-USD'
START_DATE = '2021-01-01'
PREDICTION_HORIZON = 7
TRAINING_WINDOW_DAYS = 30
TOP_FEATURES = [
    'ROC_10', 'STOCHRSI_d', 'ADX_14', 'STOCHRSI_k', 'RSI_14',
    'STOCH_k', 'ATR_14', 'EMA_20', 'STOCH_d', 'MACD',
    'ULTOSC', 'BB_upper', 'SAR', 'Open_Close', 'MACD_hist'
]
# Random forest hyper‑parameters omitted for brevity

Data Loading and Target Definition

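Historical price data are downloaded with yfinance, and technical indicators are computed for each row. The target variable is 1 if the price PREDICTION_HORIZON days later exceeds the current price, and 0 otherwise. The final PREDICTION_HORIZON rows have no future price, so they have no defined target and cannot be used for training or evaluation.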
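A minimal sketch of the target construction follows. It uses a synthetic price series in place of the yfinance download so that it runs offline. The two indicator formulas are plain pandas approximations written for this example; the article does not show how its indicators were computed, so treat the formulas and the column handling as assumptions.

```python
import numpy as np
import pandas as pd

PREDICTION_HORIZON = 7

# Synthetic daily closes standing in for the downloaded BTC-USD data.
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=60, freq="D")
log_returns = rng.normal(0.0, 0.03, len(dates))
close = pd.Series(30000 * np.exp(np.cumsum(log_returns)), index=dates, name="Close")
df = close.to_frame()

# Two simple indicators in the spirit of the feature list (assumed formulas).
df["ROC_10"] = df["Close"].pct_change(10) * 100                # 10-day rate of change, %
df["EMA_20"] = df["Close"].ewm(span=20, adjust=False).mean()   # 20-day exponential MA

# Target: 1 if the close PREDICTION_HORIZON days ahead is higher than today's.
future_close = df["Close"].shift(-PREDICTION_HORIZON)
df["Target"] = (future_close > df["Close"]).astype(int)

# Drop rows without a future price (the tail) and rows where an indicator
# is still warming up (the head).
df = df[future_close.notna()].dropna()

print(df[["Close", "ROC_10", "EMA_20", "Target"]].head())
```

The comparison against a missing future price evaluates to False, so the tail rows must be removed explicitly; otherwise they would be silently labelled 0.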

Rolling Prediction Loop

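# --- Rolling prediction loop ---
# Assumes X_all_features and Y_all are row-aligned and contain only rows
# with a known target.
all_predictions = []
all_actuals = []
all_probabilities = []

# The index arithmetic is not shown in the source; the definitions below are
# one leakage-free choice (an assumption), not necessarily the original's.
start_index = TRAINING_WINDOW_DAYS + PREDICTION_HORIZON - 1
end_index = len(X_all_features)

for i in range(start_index, end_index):
    # 1. Extract current training and prediction windows.
    # A row's target needs the price PREDICTION_HORIZON days later, so the
    # training window stops at the last row whose target is known on day i.
    predict_feature_idx = actual_target_idx = i
    train_end_idx = i - PREDICTION_HORIZON + 1  # exclusive
    train_start_idx = train_end_idx - TRAINING_WINDOW_DAYS
    X_train_window = X_all_features.iloc[train_start_idx:train_end_idx]
    Y_train_window = Y_all.iloc[train_start_idx:train_end_idx]
    X_predict_point = X_all_features.iloc[[predict_feature_idx]]

    # 2. Scale features (fit on the training window only to avoid look-ahead)
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train_window)
    X_predict_scaled = scaler.transform(X_predict_point)

    # 3. Train and predict
    rf_model = RandomForestClassifier(
        # hyper‑parameters here
    )
    rf_model.fit(X_train_scaled, Y_train_window)

    # 4. Store results
    # A short window can contain only one class, leaving predict_proba with a
    # single column, so look up the column for class 1 explicitly.
    proba = rf_model.predict_proba(X_predict_scaled)[0]
    classes = list(rf_model.classes_)
    all_predictions.append(rf_model.predict(X_predict_scaled)[0])
    all_actuals.append(Y_all.iloc[actual_target_idx])
    all_probabilities.append(proba[classes.index(1)] if 1 in classes else 0.0)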

Comprehensive Evaluation

After the loop, the aggregated predictions and actuals are used to compute accuracy, precision, recall, F1 and the confusion matrix; ROC‑AUC is computed from the stored class‑1 probabilities.
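The evaluation step might look like the sketch below. The three lists are tiny hand-made stand-ins for the loop's outputs, not real results, and the majority-class baseline is an assumption about how a reference accuracy could be defined.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)

# Toy stand-ins for the lists filled by the rolling loop (illustrative only).
all_actuals = [1, 0, 1, 1, 0, 0, 1, 0]
all_predictions = [1, 0, 0, 1, 0, 1, 1, 0]
all_probabilities = [0.9, 0.2, 0.4, 0.8, 0.3, 0.6, 0.7, 0.1]

y_true = np.asarray(all_actuals)
y_pred = np.asarray(all_predictions)
y_prob = np.asarray(all_probabilities)

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred, zero_division=0),
    "recall": recall_score(y_true, y_pred, zero_division=0),
    "f1": f1_score(y_true, y_pred, zero_division=0),
    # ROC-AUC needs the predicted probabilities, not the hard labels.
    "roc_auc": roc_auc_score(y_true, y_prob),
    # Accuracy of always predicting the more frequent class.
    "baseline": max(y_true.mean(), 1 - y_true.mean()),
}

# Rows are actual classes, columns are predicted classes, ordered [0, 1].
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(cm)
```

Passing `labels=[0, 1]` fixes the matrix layout even if one class never appears in the predictions.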

Results

Accuracy ≈ 68 % (baseline ≈ 51 %).

Precision ≈ 69 %.

Recall ≈ 68 %.

ROC‑AUC ≈ 0.75.

Confusion matrix (rows = actual, columns = predicted, classes ordered down then up): [[122, 57], [61, 128]].

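These metrics show a clear improvement over the ≈ 51 % baseline: across the 368 evaluated predictions (the sum of the confusion‑matrix entries), roughly two‑thirds of the 7‑day direction calls were correct, with balanced precision and recall. The article reports no formal significance test, and because consecutive 7‑day horizons overlap, the individual predictions are not independent, so the effective sample size is smaller than 368.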
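The reported figures can be cross-checked directly from the confusion matrix. The short calculation below assumes the scikit-learn layout (rows are actual classes, columns are predicted classes, with "down" first), which is the reading under which the reported precision and recall both come out as stated.

```python
# Confusion matrix entries as reported: [[122, 57], [61, 128]].
tn, fp = 122, 57   # actual down: predicted down / predicted up
fn, tp = 61, 128   # actual up:   predicted down / predicted up

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
up_share = (tp + fn) / total   # fraction of evaluated weeks that went up

print(total)                # 368
print(round(accuracy, 3))   # 0.679
print(round(precision, 3))  # 0.692
print(round(recall, 3))     # 0.677
print(round(f1, 3))         # 0.684
print(round(up_share, 3))   # 0.514
```

The share of up-moves (about 51 %) matches the quoted baseline, which suggests the baseline is the accuracy of always predicting "up".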

Limitations

The script evaluates predictive ability only; it does not implement entry/exit signals, stop‑losses, or risk management.

Market non‑stationarity may cause degradation of performance on future data.

Practical deployment requires extensive testing, hyper‑parameter tuning, and feature analysis.
