Infra Learning Club
Author

Infra Learning Club shares study notes, cutting-edge technology, and career discussions.

72 articles
Recent Articles

Infra Learning Club
Nov 25, 2024 · Cloud Native

Testing NVIDIA GPU DRA on Kubernetes 1.31

This guide walks through setting up an Ubuntu 22.04 environment; installing Docker, kind, and the NVIDIA Container Toolkit; configuring the NVIDIA runtime as the default; building and deploying the Kubernetes DRA driver; and running three demo scenarios that demonstrate GPU sharing across containers and pods in a Kubernetes 1.31 cluster.

Dynamic Resource Allocation · Docker · Kubernetes
7 min read
Infra Learning Club
Nov 2, 2024 · Artificial Intelligence

Set Up a Local GitHub Copilot‑Like AI Assistant in 5 Minutes

This guide shows how to deploy the open‑source Tabby AI coding assistant with the Qwen2.5‑Coder‑1.5B‑Instruct model using Docker, register an admin account, configure the Tabby VS Code extension, and verify real‑time multi‑line code completion, all in a few minutes.

AI coding assistant · Docker · Qwen2.5-Coder
4 min read
Infra Learning Club
Nov 1, 2024 · Artificial Intelligence

Configuring vLLM swap_space and cpu_offload_gb for Stable Large-Model Inference

The article explains vLLM’s GPU compute capability requirement, describes the swap_space and cpu_offload_gb parameters, outlines their ideal usage scenarios, and provides step‑by‑step code examples that demonstrate how adjusting these settings enables loading and running a 7B‑parameter model on a 16 GB T4 GPU.

GPU memory management · cpu_offload_gb · large language model inference
9 min read
Infra Learning Club
Oct 31, 2024 · Industry Insights

Top AI Startups to Watch in 2024: 10 Leading and 6 Emerging Companies

The article surveys the best-funded and most influential AI startups of 2024, profiling ten large‑scale companies such as OpenAI, Anthropic, and Scale AI and highlighting six promising newcomers, with details on their products, CEOs, valuations, recent milestones, and industry impact.

2024 · AI industry · AI startups
11 min read
Infra Learning Club
Oct 31, 2024 · Artificial Intelligence

What Is a Token in Large Language Models?

The article explains that a token is the unit of text processed by large language models, describes three common tokenization methods (word‑level, character‑level, and subword‑level) with English and Chinese examples, discusses their advantages and limitations, and shows how OpenAI’s tokenizer varies across model versions.

NLP · character-level · jieba
5 min read
Infra Learning Club
Oct 30, 2024 · Artificial Intelligence

How GPT-3 Evolved: From Transformer Roots to Massive Language Models

The article traces the development of the GPT series, from the 2017 Transformer breakthrough, through GPT‑1, GPT‑2, and GPT‑3 with its 175 billion parameters, to later models like Codex and ChatGPT, highlighting key papers, architectural choices, and the surprising role of OpenAI’s decoder‑only approach.

GPT-3 · Google · Language Model
4 min read
Infra Learning Club
Sep 29, 2024 · Cloud Native

Current State of Kubernetes DRA and the New Architecture with ResourceClaimParameters and ResourceSlice

The article examines the scheduling performance and tight-coupling issues of Kubernetes DRA before version 1.30, explains the original workflow involving PodSchedulingContext and the DRA driver, and then details the latest design, which introduces ResourceClaimParameters and ResourceSlice so that the scheduler can handle complex device constraints internally.

Cloud Native · DRA · Kubernetes
4 min read