Code DAO
Apr 26, 2022 · Artificial Intelligence

Building an Open-Source ML Pipeline – Part 1: Data Ingestion & Storage

This article walks through building the first stage of an open-source MLOps pipeline: data ingestion and storage. It outlines the requirements, selects tools such as Argo Workflows, MinIO, and Great Expectations, shows how to set up a minikube cluster, and provides Python scripts and an Argo CronWorkflow that extract, transform, and load OpenAQ air-quality data into MinIO.

Argo Workflows · Kubernetes · MLOps
10 min read