How Transfer Learning Accelerates Deep Learning Across Vision, NLP, and Reinforcement Learning
The article explains how transfer learning reduces data and time requirements in deep learning by reusing pretrained models across vision, natural language processing, and reinforcement learning. It discusses challenges such as overfitting and catastrophic forgetting, along with techniques including progressive networks, entropy regularization, domain randomization and adaptation, multi-task learning, and model distillation.
Humans learn from past experience and apply that knowledge to new tasks. Deep networks trained from scratch cannot do this: every new task demands large amounts of data and compute. Transfer learning narrows that gap by letting a network reuse what it has already learned.
In convolutional neural networks, early layers capture generic features such as edges and colors. By pre‑training on ImageNet and then fine‑tuning on a target dataset, the required training time and sample size are dramatically reduced—this is a classic example of transfer learning.
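As a minimal sketch of that workflow, assuming PyTorch and a recent torchvision (the ResNet-18 choice and class count are illustrative, not from the article), swapping the classifier head and fine-tuning looks like this:

```python
import torch.nn as nn
from torchvision import models

num_target_classes = 10  # illustrative; set to the target dataset's classes

# Load weights pretrained on ImageNet, then replace the final layer.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, num_target_classes)  # new head
# Fine-tune the whole network on the target data at a small learning rate.
```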
Transfer learning also applies to NLP: a model trained on English can be adapted to other languages, including Chinese, using far fewer labeled examples.
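A hedged sketch of the same idea with the Hugging Face transformers library; the multilingual checkpoint and label count below are assumptions for illustration:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2
)
# Fine-tune on a small labeled Chinese dataset; multilingual pretraining
# supplies most of the linguistic knowledge, so far fewer examples suffice.
```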
A key problem is that fine-tuning on a small new dataset can cause the model to forget the generic features it previously learned, a form of catastrophic forgetting that leads to overfitting and poor generalisation. A common mitigation, sketched below, is to freeze all but the last few layers, though this falls short of full end-to-end optimisation.
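Freezing everything except the last stage and the head might look like the following PyTorch sketch; which layers to unfreeze is a judgment call, not a rule from the article:

```python
import torch
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for param in model.parameters():
    param.requires_grad = False            # keep the generic features intact
for param in model.layer4.parameters():    # last residual stage
    param.requires_grad = True
for param in model.fc.parameters():        # classifier head
    param.requires_grad = True

# Only the unfrozen parameters are handed to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```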
Progressive networks address this by keeping the original parameters fixed and adding a smaller network trained on the new data; the new network receives the original network’s outputs as additional inputs, enriching feature capture without overwriting existing knowledge.
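A toy sketch of the idea, under the assumption of simple fully connected columns (all class names and sizes here are invented for illustration):

```python
import torch
import torch.nn as nn

class OldColumn(nn.Module):                # pretrained on the original task
    def __init__(self, in_dim, hidden, out_dim):
        super().__init__()
        self.h1 = nn.Linear(in_dim, hidden)
        self.out = nn.Linear(hidden, out_dim)

class ProgressiveColumn(nn.Module):        # new, smaller column
    def __init__(self, old, in_dim, hidden, out_dim):
        super().__init__()
        self.old = old
        for p in self.old.parameters():    # original weights stay frozen
            p.requires_grad = False
        self.h1 = nn.Linear(in_dim, hidden)
        # Lateral connection: consumes the old column's hidden activations.
        self.lateral = nn.Linear(old.h1.out_features, hidden)
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, x):
        with torch.no_grad():
            old_h = torch.relu(self.old.h1(x))       # old task's features
        new_h = torch.relu(self.h1(x) + self.lateral(old_h))
        return self.out(new_h)
```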
In reinforcement learning, transfer is harder because extracted features, value functions, and policies are highly task‑specific. Adding an entropy bonus to the objective encourages more diverse actions, improving robustness and generalisation across different scenarios.
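One common form of the entropy bonus, sketched in PyTorch; the coefficient beta is an assumed hyperparameter, not a value from the article:

```python
import torch
from torch.distributions import Categorical

def pg_loss(log_probs, advantages, dist, beta=0.01):
    """Policy-gradient loss with an entropy bonus weighted by beta."""
    policy_term = -(log_probs * advantages).mean()
    entropy_bonus = dist.entropy().mean()   # high entropy = diverse actions
    return policy_term - beta * entropy_bonus

# Toy usage: a batch of 4 states, 6 possible actions.
dist = Categorical(logits=torch.randn(4, 6))
actions = dist.sample()
loss = pg_loss(dist.log_prob(actions), torch.randn(4), dist)
```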
EPOpt highlights another issue: lack of diversity during training. By randomising physical parameters of simulated walkers, the trained policy learns to handle a wide range of conditions rather than memorising a single solution.
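A sketch of the randomization loop in the spirit of EPOpt; the parameter names, ranges, and the `make_walker_env`, `run_episode`, and `policy` helpers are all hypothetical stand-ins:

```python
import random

def sample_env_params():
    # Ranges are illustrative assumptions, not values from the paper.
    return {
        "torso_mass":      random.uniform(3.0, 9.0),
        "ground_friction": random.uniform(0.5, 1.5),
        "joint_damping":   random.uniform(0.5, 2.0),
    }

# `make_walker_env` builds a simulator variant; `run_episode` rolls out
# the policy in it. Resampling each episode forces robust behaviour.
for _ in range(1000):
    env = make_walker_env(**sample_env_params())
    run_episode(env, policy)  # the policy must cope with every variant
```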
Simulated environments are cheap to generate, but reproducing the diversity of real-world physics and appearance is hard. Using GANs, synthetic semantic maps can be turned into realistic scenes, and synthetic graphics can supply training data without any real images, enabling robots to learn navigation in complex indoor spaces.
Multi‑task transfer learning builds on a single pretrained model and fine‑tunes it for several tasks, or trains a small network that leverages predictions from a larger, task‑specific model. This can reveal shared patterns that generalise across tasks.
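A minimal sketch of the shared-backbone variant; the dimensions and the one-head-per-task layout are assumptions:

```python
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, backbone, feat_dim, task_dims):
        super().__init__()
        self.backbone = backbone           # shared pretrained extractor
        self.heads = nn.ModuleList(nn.Linear(feat_dim, d) for d in task_dims)

    def forward(self, x, task_id):
        features = self.backbone(x)        # representation shared by tasks
        return self.heads[task_id](features)  # task-specific prediction
```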
Model‑based reinforcement learning can exploit the fact that tasks share the same underlying physics: by learning dynamics models from data gathered across multiple tasks, a robot arm can predict its motion under new conditions.
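A sketch of such a learned dynamics model in PyTorch; the architecture is an assumption, not the article's:

```python
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """Predicts the next state from the current state and action."""
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, action):
        # Trained on (s, a, s') transitions pooled from multiple tasks.
        return self.net(torch.cat([state, action], dim=-1))
```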
Ensemble methods combine several models to improve robustness, but they are computationally expensive. Distillation trains a single student model to mimic the ensemble’s soft predictions, preserving richer information such as class probability distributions.
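The standard temperature-softened distillation loss, sketched in PyTorch; the temperature T and the implied loss weighting are assumed hyperparameters:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, ensemble_logits, T=2.0):
    # Soften both distributions so the student sees the ensemble's full
    # class-probability structure, not just the argmax label.
    soft_targets = F.softmax(ensemble_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # T^2 rescales gradients to the magnitude of the unsoftened loss.
    return F.kl_div(log_student, soft_targets, reduction="batchmean") * T * T
```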
A policy trained only on Space Invaders, for example, performs poorly in Pong. Multi‑task training yields a more flexible strategy that adapts its actions to context, such as escaping ghosts in Pac‑Man while still pursuing objectives.
Modular networks apply a software‑engineering principle of decomposition: separate policies handle specific robots or tasks, and once trained they can be recombined to accomplish new robot‑task combinations.
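A toy sketch of module recombination; the robot and task names, the dimensions, and the single-layer modules are all invented for illustration:

```python
import torch.nn as nn

# Robot-specific encoders and task-specific policies, trained separately.
robot_encoders = {"arm_a": nn.Linear(12, 64), "arm_b": nn.Linear(9, 64)}
task_policies  = {"reach": nn.Linear(64, 4),  "push":  nn.Linear(64, 4)}

def compose(robot, task):
    # A new robot-task pairing reuses both trained modules unchanged.
    return nn.Sequential(robot_encoders[robot], nn.ReLU(), task_policies[task])

policy = compose("arm_b", "reach")  # a combination never trained end-to-end
```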
Contextual policies introduce an additional context variable ω that captures the relevant circumstances. Mathematically, ω is simply treated as an extra input to the policy, and past experience can be encoded as context to inform decision‑making.
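A minimal sketch of a contextual policy in PyTorch, with ω concatenated to the state; the sizes are assumptions:

```python
import torch
import torch.nn as nn

class ContextualPolicy(nn.Module):
    def __init__(self, state_dim, context_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + context_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim),
        )

    def forward(self, state, omega):
        # The context vector is just an extra input alongside the state.
        return self.net(torch.cat([state, omega], dim=-1))
```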