2021 InfoWorld BOSSIE Awards: 29 Must‑Know Open‑Source Projects Across AI, Data & Cloud
InfoWorld's 2021 BOSSIE Awards recognize 29 standout open-source projects, ranging from front-end frameworks like Svelte and cloud-native tools such as Minikube to AI platforms like Hugging Face and data engineering tools including Presto and Apache Arrow. Together they offer developers a curated snapshot of the year's most influential open-source software.
Svelte and SvelteKit
Svelte and its full-stack counterpart SvelteKit are ambitious JavaScript frameworks that use a compile-time strategy to deliver outstanding performance and a strong developer experience. SvelteKit, in beta at the time of the awards, also supports serverless deployments.
Address: https://github.com/sveltejs/svelte
Minikube
Minikube provides an easy way to run a single‑node Kubernetes cluster locally in a VM, ideal for trying out Kubernetes or daily development.
Address: https://github.com/kubernetes/minikube
Pixie
Pixie is an observability tool for Kubernetes, offering high‑level views like service maps and detailed insights such as pod status, flame graphs, and request tracing, all while using less than 5% of cluster CPU.
Address: https://github.com/pixie-io/pixie
FastAPI
FastAPI is a high-performance Python web framework for building APIs. The project describes its key features as follows; the development-speed and error-rate figures are the project's own estimates:
Fast: performance on par with Node.js and Go
Rapid coding: features developed roughly 200-300% faster
Fewer errors: about 40% fewer developer-induced errors
Intuitive: strong editor support and auto‑completion
Easy to learn: less time spent reading documentation
Concise: less code duplication
Robust: production‑ready with automatic interactive docs
Standard‑based: fully compatible with OpenAPI and JSON Schema
Address: https://github.com/tiangolo/fastapi
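To make the editor-support and standards points above more concrete, here is a minimal sketch of how Python type hints can drive both input validation and a JSON Schema style description. It uses only the standard library and is not FastAPI's implementation or API; the helpers `describe` and `call_validated` and the handler `read_item` are hypothetical names chosen for illustration.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Illustrative subset: map a few Python types to JSON Schema type names.
JSON_TYPES = {int: "integer", float: "number", str: "string"}


def _param_hints(handler: Callable[..., Any]) -> Dict[str, type]:
    """Return the handler's parameter type hints, without the return hint."""
    hints = get_type_hints(handler)
    hints.pop("return", None)
    return hints


def describe(handler: Callable[..., Any]) -> Dict[str, Any]:
    """Build a small JSON Schema style dict from the handler's type hints."""
    hints = _param_hints(handler)
    params = inspect.signature(handler).parameters
    required = [
        name for name in hints
        if params[name].default is inspect.Parameter.empty
    ]
    return {
        "type": "object",
        "properties": {n: {"type": JSON_TYPES[t]} for n, t in hints.items()},
        "required": required,
    }


def call_validated(handler: Callable[..., Any], raw: Dict[str, str]) -> Any:
    """Convert raw string inputs to the hinted types, then call the handler."""
    kwargs = {}
    for name, tp in _param_hints(handler).items():
        if name in raw:
            try:
                kwargs[name] = tp(raw[name])
            except ValueError as exc:
                raise ValueError(
                    f"parameter {name!r} must be of type {JSON_TYPES[tp]}"
                ) from exc
    return handler(**kwargs)


# Hypothetical handler: the type hints are the single source of truth.
def read_item(item_id: int, q: str = "none") -> dict:
    return {"item_id": item_id, "q": q}


schema = describe(read_item)
response = call_validated(read_item, {"item_id": "42"})
```

Because the function signature is the only place where types are declared, the same hints can feed editor auto-completion, runtime validation, and generated documentation, which is the general idea behind the feature list above.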
Crystal
Crystal combines C‑level speed with Ruby‑like expressiveness, using static typing and LLVM compilation, offering seamless C interop, compile‑time macros, and a stable 1.0 release suitable for general workloads.
Address: https://github.com/crystal-lang/crystal
Windows Terminal
Windows Terminal is a modern, feature‑rich command‑line tool with multi‑tab support, rich text, theming, GPU‑accelerated rendering, and low resource consumption.
Address: https://github.com/Microsoft/Terminal
OBS Studio
OBS Studio is a real‑time streaming and screen‑recording application designed for efficient capture, composition, encoding, and broadcasting to any platform.
High‑performance real‑time video/audio capture and mixing
Unlimited scenes with custom transitions
Intuitive audio mixer with filters (noise gate, suppression, gain)
Powerful yet easy configuration and layout docking
Address: https://github.com/obsproject/obs-studio
Shotcut
Shotcut is a cross‑platform video editor offering standard editing features, a responsive UI, and strong community support on macOS, Linux, BSD, and Windows.
Address: https://github.com/mltframework/shotcut
Weave GitOps Core
Weave GitOps enables effective GitOps workflows for continuous delivery to Kubernetes clusters, built on the CNCF Flux engine.
Address: https://github.com/weaveworks/weave-gitops
Apache Solr
Apache Solr is an enterprise search server built on Lucene. It can be clustered and deployed in the cloud, and it includes learning-to-rank algorithms for fine-tuning how results are weighted.
Address: https://github.com/apache/solr
MLflow
MLflow, created by Databricks and hosted by the Linux Foundation, is an MLOps platform for tracking, managing, and deploying machine‑learning experiments and models.
Address: https://github.com/mlflow/mlflow
Orange
Orange makes data mining productive and fun by allowing users to build visual workflows for machine‑learning and analytics without coding.
Address: https://github.com/biolab/orange3
Flutter
Flutter, built by Google, enables high‑performance, cross‑platform mobile app development with low latency input and high frame rates on Android and iOS.
Address: https://github.com/flutter
Apache Superset
Apache Superset, originally developed at Airbnb, is an open‑source data exploration and visualization platform that provides an intuitive, enterprise‑grade BI web app.
Address: https://github.com/apache/superset
Presto
Presto is an open‑source distributed SQL engine for interactive analytics, capable of querying heterogeneous data sources—including Hive, Cassandra, and relational databases—without moving data.
Address: https://github.com/prestodb/presto
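The idea of one SQL statement spanning several independent data sources can be illustrated with a small analogy. The sketch below uses SQLite's ATTACH DATABASE from the Python standard library as a stand-in; it is not Presto, and the database names `crm` and `sales` and their tables are hypothetical. In Presto the sources would be separate systems exposed through connectors rather than in-memory SQLite databases.

```python
import sqlite3

# One connection acts as the "query engine"; each attached database
# stands in for an independent data source.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS sales")

conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE sales.orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)", [(1, "Ada"), (2, "Lin")]
)
conn.executemany(
    "INSERT INTO sales.orders VALUES (?, ?)", [(1, 30.0), (1, 12.5), (2, 7.0)]
)

# A single query joins tables that live in two different databases,
# addressed by qualified names, without copying either table first.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM crm.customers AS c
    JOIN sales.orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
```

The qualified table names play a role loosely similar to the catalog and schema prefixes a federated engine uses to tell data sources apart.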
Apache Arrow
Apache Arrow defines a language‑agnostic columnar memory format optimized for modern CPUs and GPUs, supporting zero‑copy reads and libraries across many languages.
Address: https://github.com/apache/arrow
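A rough sketch of what "columnar" and "zero-copy" mean, using only Python's standard library rather than Arrow itself. The sample data and variable names are assumptions for illustration; Arrow's actual format adds validity bitmaps, offsets for variable-length data, and a cross-language specification that this sketch does not attempt to model.

```python
from array import array

# Row-oriented layout: each record is a separate Python object.
rows = [(1, 20.5), (2, 21.0), (3, 19.5), (4, 22.0)]

# Column-oriented layout: one contiguous, typed buffer per column.
ids = array("q", (r[0] for r in rows))    # signed 64-bit integers
temps = array("d", (r[1] for r in rows))  # 64-bit floats

# A memoryview is a zero-copy window onto the column's buffer.
view = memoryview(temps)
window = view[1:3]  # slicing a memoryview shares memory, it does not copy

# Writing through the original column is visible through the window,
# which shows that both refer to the same underlying bytes.
temps[1] = 99.0
shared = window.tolist()
```

Storing each column contiguously is what lets analytical code scan one field without touching the others, and sharing buffers instead of copying them is what lets different libraries hand data to each other cheaply.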
InterpretML
InterpretML is an open‑source Explainable AI library offering state‑of‑the‑art techniques, including glass‑box models like the Explainable Boosting Machine and post‑hoc methods such as LIME.
Address: https://github.com/interpretml/interpret
LIME
LIME (Local Interpretable Model‑agnostic Explanations) is a post‑hoc technique that perturbs input features to explain predictions of any classifier, supporting both text and image domains.
Address: https://github.com/marcotcr/lime
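The perturbation idea can be sketched in a few lines of plain Python. This is a simplified illustration, not the lime library's API: real LIME samples perturbations randomly, weights them by proximity to the original input, and fits a sparse linear surrogate model, whereas this sketch enumerates every on/off mask and compares mean predictions. The classifier `toy_classifier` and the helper `explain` are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List


def toy_classifier(words: List[str]) -> float:
    """Hypothetical black-box model: returns P(positive) for a word list."""
    score = 0.5
    if "great" in words:
        score += 0.4
    if "boring" in words:
        score -= 0.3
    return score


def explain(
    predict: Callable[[List[str]], float], words: List[str]
) -> Dict[str, float]:
    """Estimate each word's effect by switching words on and off."""
    samples = []
    # Enumerate every on/off mask; only feasible for short inputs.
    for mask in product([0, 1], repeat=len(words)):
        kept = [w for w, keep in zip(words, mask) if keep]
        samples.append((mask, predict(kept)))

    effects = {}
    for i, word in enumerate(words):
        present = [p for mask, p in samples if mask[i]]
        absent = [p for mask, p in samples if not mask[i]]
        # Effect = mean prediction with the word minus mean without it.
        effects[word] = sum(present) / len(present) - sum(absent) / len(absent)
    return effects


effects = explain(toy_classifier, ["a", "great", "but", "boring", "film"])
ranked = sorted(effects, key=lambda w: abs(effects[w]), reverse=True)
```

The explanation only ever calls `predict`, never inspects the model's internals, which is what "model-agnostic" refers to.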
Dask
Dask is an open‑source parallel computing library that scales Python workloads across multiple machines or GPUs, integrating with RAPIDS, NumPy, Pandas, and Scikit‑learn.
Address: https://github.com/dask/dask
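The underlying pattern, splitting data into chunks, computing partial results in parallel, and combining them, can be sketched with the standard library alone. This is not Dask's API; `chunked` and `parallel_mean` are hypothetical helpers, and a thread pool is used only to show the structure (CPU-bound Python code in threads will not see the speedups that Dask's process-based or distributed schedulers can provide).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple


def chunked(data: Sequence[float], size: int) -> List[Sequence[float]]:
    """Split a sequence into consecutive chunks of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def partial_sum(chunk: Sequence[float]) -> Tuple[float, int]:
    """Per-chunk task: return the chunk's sum and its length."""
    return sum(chunk), len(chunk)


def parallel_mean(
    data: Sequence[float], chunk_size: int = 250, workers: int = 4
) -> float:
    """Compute a mean from independent per-chunk partial results."""
    chunks = chunked(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # Combine step: reduce the partial results into the final answer.
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


values = list(range(1000))
result = parallel_mean(values)
```

Because each chunk is processed independently and only small partial results are combined, the same shape of computation can in principle be spread over many cores or machines.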
BlazingSQL
BlazingSQL is a GPU‑accelerated SQL engine built on the RAPIDS ecosystem, leveraging Apache Arrow and cuDF to provide a SQL interface for large‑scale data science.
Address: https://github.com/BlazingDB/blazingsql
RAPIDS
NVIDIA RAPIDS is an open-source suite of GPU-accelerated libraries (cuDF, cuML, cuGraph) for end-to-end data science pipelines, built on Apache Arrow.
Address: https://github.com/rapidsai/cudf
PostHog
PostHog is an open‑source product analytics platform for developers that automatically captures every event on a website or app without sending data to third parties.
Address: https://github.com/PostHog/posthog
lakeFS
lakeFS adds a Git-like version-control layer on top of object storage, enabling zero-copy data branching, commits, metadata, and validation hooks for data lakes built on object stores such as S3 and Azure Blob.
Address: https://github.com/treeverse/lakeFS
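One way to picture zero-copy branching is a content-addressed object store in which a branch is just a table of pointers. The sketch below is a toy model under that assumption, not lakeFS's implementation or API; the class `TinyRepo` and its methods are hypothetical.

```python
import hashlib
from typing import Dict


class TinyRepo:
    """Toy model: branches map paths to content hashes, not to data."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}  # content hash -> data
        self.branches: Dict[str, Dict[str, str]] = {"main": {}}

    def put(self, branch: str, path: str, data: bytes) -> None:
        """Store data once by its hash and point the branch's path at it."""
        digest = hashlib.sha256(data).hexdigest()
        self.objects.setdefault(digest, data)
        self.branches[branch][path] = digest

    def get(self, branch: str, path: str) -> bytes:
        """Read a path as seen from the given branch."""
        return self.objects[self.branches[branch][path]]

    def create_branch(self, name: str, source: str) -> None:
        """Branching copies only the pointer table, never the objects."""
        self.branches[name] = dict(self.branches[source])


repo = TinyRepo()
repo.put("main", "events.csv", b"id,value\n1,10\n")

repo.create_branch("experiment", "main")
objects_after_branch = len(repo.objects)  # still one stored object

# Changing the file on the branch leaves main untouched.
repo.put("experiment", "events.csv", b"id,value\n1,10\n2,20\n")
```

Creating the branch costs one small dictionary copy regardless of how large the stored files are, which is why an experiment can be started against a full data lake without duplicating it.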
Meltano
Meltano is an open-source DataOps platform positioned as an alternative to traditional ELT toolchains. Its extractors and loaders are Singer-compatible taps and targets, which Meltano orchestrates into data pipelines.
Address: https://github.com/meltano/meltano
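Singer taps and targets communicate through JSON messages of type SCHEMA, RECORD, and STATE, written one per line. The following is a minimal sketch of that message flow, with an in-memory buffer standing in for the pipe between two processes; the `tap` and `target` functions, the `users` stream, and the sample rows are hypothetical, and a real target would also persist data and emit state.

```python
import io
import json
from typing import Dict, Iterable, List, TextIO


def tap(rows: Iterable[Dict], out: TextIO) -> None:
    """Hypothetical tap: emit SCHEMA, RECORD and STATE messages as JSON lines."""
    schema = {
        "type": "object",
        "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
    }
    header = {
        "type": "SCHEMA",
        "stream": "users",
        "schema": schema,
        "key_properties": ["id"],
    }
    out.write(json.dumps(header) + "\n")
    last_id = None
    for row in rows:
        message = {"type": "RECORD", "stream": "users", "record": row}
        out.write(json.dumps(message) + "\n")
        last_id = row["id"]
    # STATE lets a later run resume from where this one stopped.
    out.write(json.dumps({"type": "STATE", "value": {"users": last_id}}) + "\n")


def target(inp: TextIO) -> Dict[str, List[Dict]]:
    """Hypothetical target: collect RECORD messages per stream."""
    loaded: Dict[str, List[Dict]] = {}
    for line in inp:
        message = json.loads(line)
        if message["type"] == "RECORD":
            loaded.setdefault(message["stream"], []).append(message["record"])
    return loaded


pipe = io.StringIO()  # stands in for stdout piped into stdin
tap([{"id": 1, "name": "Ada"}, {"id": 2, "name": "Lin"}], pipe)
pipe.seek(0)
loaded = target(pipe)
```

Because the contract is just a stream of JSON lines, any tap can in principle be paired with any target, which is what makes the extractor and loader ecosystem composable.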
Trino
Trino (formerly PrestoSQL) is a distributed SQL analytics engine that queries large data sources—including data lakes and relational stores—without moving data, integrating smoothly with BI tools.
Address: https://github.com/trinodb/trino
StreamNative
StreamNative is a highly scalable messaging and event‑stream platform that combines Apache Pulsar with Kubernetes, hybrid‑cloud support, and enterprise‑grade tooling for real‑time analytics.
Address: https://github.com/streamnative
Hugging Face
Hugging Face hosts what the awards describe as the most important open-source deep-learning model repository, which now extends beyond text to images, audio, video, and object detection.
Address: https://github.com/huggingface/transformers
EleutherAI
EleutherAI is a distributed research group that released The Pile (an 825 GB dataset) and large language models such as GPT-J (6 billion parameters) and GPT-NeoX, with the aim of democratizing GPT-3-scale models.
Address: https://github.com/EleutherAI/gpt-neo
Colab notebooks for generative art
Community-maintained Colab notebooks pair OpenAI's CLIP model with open-source generators such as BigGAN and VQGAN to create generative art from text prompts. CLIP itself is released under the MIT license.
Address: https://github.com/openai/CLIP
These 29 projects make up the 2021 InfoWorld BOSSIE Awards. Many of them were new to the author and are welcome additions to an open-source toolkit.
