Tag: GPT-2

IT Services Circle
May 2, 2024 · Artificial Intelligence

llm.c: A 1000‑Line C Implementation for Training GPT‑2

Andrej Karpathy’s llm.c project demonstrates how a compact, pure‑C (and CUDA) codebase of roughly 1,000 lines can train a GPT‑2 model, covering data preparation, memory management, layer implementations, compilation, and practical tips for running and testing the model on CPUs and GPUs.
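To illustrate the kind of layer kernel the article walks through, here is a minimal Python sketch (an assumption for illustration, not llm.c's actual code) of the GELU activation that GPT‑2 uses inside each MLP block — llm.c implements the same formula as a plain C loop over a float array:

```python
import math

def gelu(x: float) -> float:
    """Tanh-approximation GELU, the activation between the two
    linear layers of each GPT-2 MLP block."""
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

# Sample values: GELU is ~0 for large negative inputs, ~x for large positive ones.
for v in (-1.0, 0.0, 1.0):
    print(f"gelu({v}) = {gelu(v):.4f}")
```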

AI · C++ · CUDA
0 likes · 10 min read
Sohu Tech Products
Mar 6, 2024 · Mobile Development

On‑Device Deployment of Large Language Models Using Sohu’s Hybrid AI Engine and GPT‑2

The article outlines how Sohu’s Hybrid AI Engine enables on‑device deployment of a distilled GPT‑2 model by converting it to TensorFlow Lite, detailing the setup, customization with Keras, inference workflow, and core SDK calls, and argues that this approach offers fast, private, and cost‑effective AI for mobile devices despite typical LLM constraints.
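The conversion step at the heart of this workflow follows TensorFlow Lite's standard converter API. A minimal sketch, with a toy Keras model standing in for the distilled GPT‑2 (the model, shapes, and file name here are illustrative assumptions, not Sohu's SDK calls):

```python
import tensorflow as tf

# Toy stand-in model; a real deployment would load the distilled
# GPT-2 Keras model here instead (assumption for illustration).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(16,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(8),
])

# Convert the Keras model to a TFLite flatbuffer for on-device inference.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # post-training quantization
tflite_bytes = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)

# Run the converted model with the TFLite interpreter.
interpreter = tf.lite.Interpreter(model_content=tflite_bytes)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"], tf.random.normal((1, 16)).numpy())
interpreter.invoke()
print(interpreter.get_tensor(out["index"]).shape)
```

On a device, the same flatbuffer is loaded by the mobile TFLite runtime rather than the Python interpreter.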

GPT-2 · Hybrid AI · Keras
0 likes · 9 min read
Rare Earth Juejin Tech Community
Aug 1, 2023 · Artificial Intelligence

Do Language Models Learn Language in the Same Stages as Children? An Analysis of GPT‑2 Developmental Trajectories

This article reviews a study that compares the stage‑wise language acquisition of infants with the learning trajectory of GPT‑2, using linguistic probes and statistical tests to determine whether deep language models follow sequential or parallel learning patterns similar to children.

AI research · GPT-2 · developmental learning
0 likes · 17 min read
WeiLi Technology Team
May 8, 2023 · Artificial Intelligence

How to Run GPT‑2 Locally: Complete Setup and Code Adjustments

This guide explains the GPT‑2 background, required software, environment configuration, code modifications for TensorFlow 2.x, data download, execution commands, and sample test results, providing a full step‑by‑step process for local deployment of the model.

AI · GPT-2 · Local Deployment
0 likes · 7 min read
DataFunTalk
Nov 22, 2022 · Artificial Intelligence

NVIDIA's Advances in Multi‑Role Generative Dialogue Modeling and Synthetic Data‑Driven QA

This article reviews NVIDIA's recent work on multi‑role generative dialogue modeling using GPT‑2‑based architectures and on enhancing question‑answering systems with synthetic data pipelines, covering model design, data preparation from Reddit, extensive experiments, scaling effects, and practical Q&A insights.

GPT-2 · Generative Dialogue · Model Scaling
0 likes · 17 min read
IT Services Circle
Mar 13, 2022 · Artificial Intelligence

PolyCoder: An Open‑Source 2.7B‑Parameter Code Generation Model Excelling in the C Language

Carnegie Mellon researchers introduced PolyCoder, a 2.7‑billion‑parameter open‑source code generation model built on the GPT‑2 architecture, trained on 249 GB of multi‑language code; it outperforms Codex in C while remaining competitive across eleven other programming languages.

AI · C Programming · GPT-2
0 likes · 5 min read
Sohu Tech Products
Nov 25, 2020 · Artificial Intelligence

Illustrated Guide to GPT-2: Detailed Explanation of the Decoder‑Only Transformer Model

This article provides a comprehensive, illustrated walkthrough of OpenAI's GPT‑2 language model, covering its decoder‑only Transformer architecture, self‑attention mechanisms, token processing, training data, differences from BERT, and applications beyond language modeling, enriched with visual diagrams and code snippets for deeper understanding.
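The decoder‑only self‑attention the guide illustrates can be sketched in a few lines of NumPy — a toy single‑head example (an illustrative assumption, not the article's own code): each position attends only to itself and earlier positions via a causal mask.

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product attention with a causal mask.

    x: (T, d) sequence of token vectors; Wq/Wk/Wv: (d, d) projections.
    """
    T, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)                      # (T, T) attention logits
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)   # strictly upper triangle
    scores[mask] = -np.inf                             # block attention to future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ v

rng = np.random.default_rng(0)
T, d = 4, 8
x = rng.normal(size=(T, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = causal_self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Because of the mask, the first token's output is exactly its own value vector; GPT‑2 stacks many such heads per layer and adds learned output projections.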

AI · GPT-2 · Self-Attention
0 likes · 24 min read
Python Programming Learning Circle
Nov 12, 2019 · Artificial Intelligence

Create a Text‑Generating Web App with GPT‑2 in Under 50 Lines of Python

This tutorial walks you through building a lightweight web application that uses OpenAI's GPT‑2 model to generate text, covering environment setup, model loading, a custom prediction function, and an interactive Panel‑based UI with callbacks, all in less than fifty lines of code.

GPT-2 · Panel · Python
0 likes · 11 min read