
Building a Simple Open‑Source Self‑Service BI Platform with Flask & React

This article introduces dataplay2, an open‑source self‑service BI platform built with Flask, pandas, and scikit‑learn on the backend and React, ECharts, D3, and other JavaScript libraries on the frontend. It covers the architecture, installation steps, core features (data upload, visualization, classification, clustering, and regression), and ideas for future improvement.

21CTO

Introduction

Recently many data‑analysis platforms have appeared in China, all aiming at self‑service BI with visual data exploration, machine‑learning and prediction features. Inspired by Tableau and SAP Lumira, the author built a simple open‑source platform called dataplay2 to experiment with these ideas.

Source code: https://github.com/gangtao/dataplay2

Architecture

Architecture diagram

Server‑side components

Flask – lightweight Python web framework

Pandas – data‑wrangling library for CSV handling

Scikit‑learn – popular machine‑learning library (depends on NumPy and SciPy; Matplotlib is only needed for plotting)

Client‑side components

jQuery – basic DOM utilities

ReactJS – component‑based UI framework

D3.js – data‑driven DOM manipulation for custom visualizations

ECharts – Baidu’s chart library used for most visualizations

Bootstrap – responsive UI framework

jQuery DataTables – table widget

Bootstrap fileinput – HTML5 file upload control

PapaParse – CSV parsing in the browser

RequireJS – module loader

Select2 – enhanced select control

Build tools

Node.js – JavaScript runtime used to run the build tooling

Babel – transpiles ES6/JSX to browser‑compatible JavaScript

Typical build commands:

# install Node.js first, then the Babel CLI used to compile React JSX
npm install -g babel-cli
# install the Babel presets for ES2015 and React
npm install --save-dev babel-preset-es2015 babel-preset-react
# install Bower and fetch front‑end dependencies
npm install -g bower
bower install

Running the application

After cloning the repository, install Anaconda (which provides the Python dependencies), then start the server with python main.py. Open a browser and navigate to http://localhost:5000 to load the client.

Data upload and management

The data menu lets users browse existing datasets or upload a CSV file. Upload uses a file‑input control; the backend stores the file and processes it with pandas, while the front end reads the CSV via a REST API and parses it with PapaParse. The CSV must contain a header row and no trailing empty lines.
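The two CSV constraints can be checked before a file is handed to pandas. The following is a minimal sketch using only the Python standard library; check_csv is a hypothetical helper, not a function from dataplay2. It also cannot tell whether the first row really is a header, so it only verifies that the row exists, that its column names are non‑empty and unique, and that the file does not end with an empty line.

```python
import csv
import io


def check_csv(text):
    """Return a list of problems found in CSV text (an empty list means none)."""
    lines = text.splitlines()
    if not lines or not lines[0].strip():
        return ["file is empty or has no header row"]
    problems = []
    # A blank last line is the "trailing empty line" case from the text above.
    if not lines[-1].strip():
        problems.append("trailing empty line")
    records = list(csv.reader(io.StringIO(text)))
    header = records[0]
    if any(not name.strip() for name in header):
        problems.append("header row has an empty column name")
    if len(set(header)) != len(header):
        problems.append("header row has duplicate column names")
    # Every non-blank record should have as many fields as the header.
    for number, record in enumerate(records[1:], start=2):
        if record and len(record) != len(header):
            problems.append(
                f"record {number} has {len(record)} fields, expected {len(header)}"
            )
    return problems


print(check_csv("name,value\na,1\nb,2\n"))    # well-formed input
print(check_csv("name,value\na,1\nb,2\n\n"))  # ends with an empty line
```

Returning a list of messages rather than raising on the first problem lets an upload form report everything that is wrong with a file in one pass.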

Data menu

Visualization

After loading data, the Analysis → Visualization page lets users create charts. The platform currently supports Pie, Bar, Line, Treemap, Scatter, and Area charts. Visualization works by converting the CSV table into the data structure required by ECharts; each chart type has its own transformation logic (see package/static/js/visualization).
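To illustrate the kind of transformation involved, here is a rough sketch that turns CSV rows into an ECharts‑style option for a scatter chart, with one series per category. In dataplay2 this logic runs in JavaScript in the browser; the sketch uses Python only to keep the examples in one language. The function name scatter_option and the column names in the sample are assumptions made for the example, not taken from the project.

```python
import csv
import io
import json
from collections import defaultdict


def scatter_option(csv_text, x_col, y_col, group_col):
    """Build an ECharts-style option dict for a scatter chart, one series per group."""
    points = defaultdict(list)
    for record in csv.DictReader(io.StringIO(csv_text)):
        # ECharts scatter series take [x, y] pairs as data items.
        points[record[group_col]].append([float(record[x_col]), float(record[y_col])])
    return {
        "legend": {"data": list(points)},
        "xAxis": {"type": "value", "name": x_col},
        "yAxis": {"type": "value", "name": y_col},
        "series": [
            {"name": group, "type": "scatter", "data": data}
            for group, data in points.items()
        ],
    }


# Hypothetical three-row sample in the shape of the Iris dataset.
sample = """sepal_length,petal_length,species
5.1,1.4,setosa
4.9,1.4,setosa
7.0,4.7,versicolor
"""
option = scatter_option(sample, "sepal_length", "petal_length", "species")
print(json.dumps(option, indent=2))
```

The resulting dictionary serializes to JSON that can be passed to a chart's setOption call on the client. Other chart types would need their own conversion, since a pie chart, for instance, expects name/value pairs rather than coordinate pairs.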

Visualization menu

Example: a scatter plot of the Iris dataset.

Iris scatter plot

Machine learning

dataplay2 also provides simple machine‑learning functions.

Classification

Supported algorithms: K‑Nearest Neighbors, Naïve Bayes, and Support Vector Machine.
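As a rough idea of what these three classifiers look like in scikit‑learn, the sketch below trains each one on the Iris dataset and reports its accuracy on a held‑out split. This is not dataplay2's own code. The choice of GaussianNB as the Naïve Bayes variant, the hyperparameters, and the 70/30 split are all assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
# Hold out 30% of the rows for evaluation; stratify keeps class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Assumed settings: 5 neighbours, Gaussian Naive Bayes, RBF-kernel SVM.
models = {
    "knn": KNeighborsClassifier(n_neighbors=5),
    "naive_bayes": GaussianNB(),
    "svm": SVC(kernel="rbf"),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on the test split
    print(f"{name}: {scores[name]:.2f}")
```

Because all three estimators share the same fit and score interface, supporting another algorithm mostly means adding one more entry to the dictionary.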

Classification results

Clustering

K‑means clustering is implemented.
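A minimal K‑means sketch with scikit‑learn, again on the Iris measurements, might look like the following. The number of clusters (3), n_init, and random_state are assumptions chosen for the example; how dataplay2 exposes these settings is not described here.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)  # the species labels are not used for clustering

# Assumed settings: 3 clusters, 10 random restarts, fixed seed for repeatability.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = model.fit_predict(X)  # one cluster index per input row

print(labels[:10])
print(model.cluster_centers_)  # one centroid per cluster, in feature space
```

The cluster indices are arbitrary: cluster 0 in one run need not correspond to the same group of points in a run with a different seed.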

K‑means clustering

Regression

Linear regression and logistic regression are available.
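The difference between the two can be seen on a small synthetic example: linear regression predicts a continuous value, while logistic regression predicts a class label. The sketch below uses scikit‑learn with made‑up data (points on y = 2x + 1, and a binary label for x above 4.5); none of it comes from dataplay2.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Linear regression: recover slope and intercept from noise-free points on y = 2x + 1.
x = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * x.ravel() + 1.0
linear = LinearRegression().fit(x, y)
print(linear.coef_[0], linear.intercept_)

# Logistic regression: predict a binary label (is x above 4.5?) from the same feature.
labels = (x.ravel() > 4.5).astype(int)
logistic = LogisticRegression().fit(x, labels)
print(logistic.predict([[1.0], [8.0]]))
```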

Linear regression
Logistic regression

Future work

Support additional data sources (databases, data warehouses, REST APIs, streams).

Introduce a richer data model and hierarchical data handling.

Provide a data‑wrangling DSL on top of pandas.

Replace ECharts with a ggplot‑style front‑end library (e.g., Plotly) and add map and hierarchical charts.

Add dashboard functionality (e.g., using pyxley).

Make the machine‑learning workflow more user‑friendly: auto‑select algorithms based on the chosen target and features, and make it simpler to add new algorithms.

Personal notes

React’s component model feels much more comfortable than MVC; it greatly improved development speed.

dataplay2’s core architecture is solid, and while the current feature set is modest, it can be extended by anyone interested.

Source: taogang Link: http://my.oschina.net/taogang/blog/630632