InfoWorld 2021 Best Open Source Software Awards – Overview of 29 Notable Projects
The article presents the 2021 InfoWorld BOSSIE award winners: 29 open‑source projects ranging from front‑end frameworks such as Svelte to cloud‑native tools such as Minikube, data‑science libraries such as Dask and MLflow, and AI explainability packages such as InterpretML and LIME. Each project has a brief functional overview and a GitHub link.
This article is a translated summary of the 2021 InfoWorld "Best Open Source Software" (BOSSIE) awards, highlighting 29 award‑winning projects across a range of technology domains.
1. Svelte and SvelteKit – A forward‑looking JavaScript framework and its full‑stack companion. Both take a compile‑time approach that delivers high performance and a good developer experience, and they now include serverless deployment capabilities.
2. Minikube – A lightweight tool that runs a single‑node Kubernetes cluster inside a VM on a developer’s laptop, simplifying Kubernetes experimentation and daily development.
3. Pixie – An eBPF‑based observability platform for Kubernetes that provides service maps, resource views, flame graphs, and low‑overhead telemetry for monitoring, performance, and debugging.
4. FastAPI – A high‑performance Python web framework for building APIs, offering speed comparable to Node.js and Go, automatic documentation, type safety, and developer‑friendly features.
5. Crystal – A statically‑typed compiled language that combines C‑level performance with Ruby‑like expressiveness, using LLVM and offering seamless C interop and compile‑time macros.
6. Windows Terminal – A modern, GPU‑accelerated terminal supporting multiple tabs, rich text, theming, and extensive configurability while remaining fast and lightweight.
7. OBS Studio – Open‑source software for real‑time video streaming and screen recording, providing high‑performance capture, mixing, and encoding for all major streaming platforms.
8. Shotcut – A cross‑platform video editor with a simple UI, supporting layered editing, effects, and a vibrant community of tutorials.
9. Weave GitOps Core – A GitOps engine built on CNCF Flux that enables continuous delivery of applications to Kubernetes clusters.
10. Apache Solr – A Lucene‑based enterprise search server offering clustering, cloud deployment, and learning‑to‑rank capabilities.
11. MLflow – An MLOps platform from Databricks for tracking experiments, packaging code, and managing model lifecycle across environments.
12. Orange – A visual programming tool for data mining and machine learning that lets users build workflows by dragging widgets onto a canvas.
13. Flutter – Google’s UI toolkit for building high‑performance, cross‑platform mobile applications with a single codebase.
14. Apache Superset – An open‑source data exploration and visualization platform originally developed at Airbnb, offering a web‑based BI experience.
15. Presto – A distributed SQL engine for interactive analytics that can query heterogeneous data sources such as Hive, Cassandra, and relational databases.
16. Apache Arrow – A language‑agnostic columnar memory format designed for fast analytics on CPUs and GPUs, with libraries for many programming languages.
17. InterpretML – An Explainable AI library that provides glass‑box models like Explainable Boosting Machine and post‑hoc techniques such as LIME.
18. LIME – A model‑agnostic explanation method that perturbs input features to interpret predictions of any classifier, supporting text and image data.
19. Dask – A parallel computing library that scales Python workloads across multiple machines and GPUs, integrating with RAPIDS, XGBoost, and scikit‑learn.
20. BlazingSQL – A GPU‑accelerated SQL engine built on RAPIDS and Apache Arrow, providing a SQL interface to cuDF for large‑scale data science.
21. RAPIDS – NVIDIA’s suite of GPU‑accelerated data‑science libraries (cuDF, cuML, cuGraph) that leverage Apache Arrow for fast end‑to‑end pipelines.
22. PostHog – An open‑source product analytics platform that automatically captures events from web and mobile apps without sending data to third parties.
23. lakeFS – A Git‑like version control layer for object storage, enabling zero‑copy data branching, commits, and rollback for data lakes.
24. Meltano – An open‑source DataOps platform that originated at GitLab, offering ELT pipelines, Singer taps/targets, and orchestration dashboards.
25. Trino – A distributed SQL query engine (formerly PrestoSQL) that queries data lakes, relational stores, and other sources without moving data.
26. StreamNative – A managed platform built on Apache Pulsar that simplifies real‑time messaging, stream processing, and observability for enterprise applications.
27. Hugging Face – The leading open‑source hub for transformer models, providing libraries for text, image, audio, and multimodal deep‑learning tasks.
28. EleutherAI – A distributed research group that releases large‑scale language models (GPT‑J, GPT‑NeoX) and the 825 GB "The Pile" dataset for open‑source AI research.
29. Colab notebooks for generative art – Community‑maintained notebooks that combine OpenAI’s CLIP model with open‑source generators like BigGAN and VQGAN to create prompt‑based generative artwork.
Each entry includes a brief description and a link to its GitHub repository for further exploration.
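To make the perturbation idea behind LIME (entry 18) more concrete, here is a minimal sketch in plain Python. It is not the lime package's API, and it simplifies the method: LIME proper fits a locally weighted linear surrogate model, whereas this sketch only compares the mean prediction when a word is kept with the mean when it is dropped. The names explain_by_perturbation and toy_sentiment are hypothetical and exist only for illustration.

```python
import random


def explain_by_perturbation(predict, words, n_samples=1000, seed=0):
    """Score each word by how much keeping it shifts the black-box prediction.

    Returns a list of (word, importance) pairs in the original word order.
    Simplified stand-in for LIME: a difference of means, not a fitted surrogate.
    """
    rng = random.Random(seed)
    n = len(words)
    kept_sum, kept_count = [0.0] * n, [0] * n
    dropped_sum, dropped_count = [0.0] * n, [0] * n

    for _ in range(n_samples):
        # Perturb the input: keep each word independently with probability 0.5.
        mask = [rng.random() < 0.5 for _ in range(n)]
        score = predict([w for w, keep in zip(words, mask) if keep])
        for i, keep in enumerate(mask):
            if keep:
                kept_sum[i] += score
                kept_count[i] += 1
            else:
                dropped_sum[i] += score
                dropped_count[i] += 1

    importances = []
    for i, word in enumerate(words):
        kept_mean = kept_sum[i] / kept_count[i] if kept_count[i] else 0.0
        dropped_mean = dropped_sum[i] / dropped_count[i] if dropped_count[i] else 0.0
        importances.append((word, kept_mean - dropped_mean))
    return importances


def toy_sentiment(words):
    """Hypothetical black-box classifier returning a positivity score."""
    score = 0.5
    if "great" in words:
        score += 0.4
    if "boring" in words:
        score -= 0.4
    return score


if __name__ == "__main__":
    sentence = ["a", "great", "but", "boring", "film"]
    for word, weight in explain_by_perturbation(toy_sentiment, sentence):
        print(f"{word:>8}  {weight:+.3f}")
```

Because the explainer only calls predict on perturbed inputs, it never inspects the model's internals, which is what "model‑agnostic" means in the LIME entry. In this toy setup, "great" should receive a clearly positive weight, "boring" a clearly negative one, and the filler words weights near zero.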
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.