
Understanding PyTorch’s Dynamic Autograd System: Variable, Function, and Engine

This article explains how PyTorch implements its dynamic autograd graph using the core C++ classes Variable, Function, and Engine, detailing their initialization, inheritance hierarchy, code structures, and the execution flow of backward propagation.

Python Programming Learning Circle

PyTorch’s dynamic graph framework is implemented primarily in the torch/csrc/autograd directory, where three base classes, Variable, Function, and Engine, form the foundation of the autograd system.

The graph is called “dynamic” because it is constructed during each forward pass and discarded after the backward pass; this article uses the source code under torch/csrc/autograd to provide a detailed walkthrough of that system.

Autograd initialization is performed by THPAutograd_initFunctions(), which creates the Python module torch._C._functions and populates the cpp_function_types unordered map linking C++ function types to their Python counterparts.

The map is used, for example, when printing a tensor’s grad_fn in Python; the grad_fn object is an instance of a subclass of Function retrieved via this mapping.

A Variable created with requires_grad=True participates in the graph: its grad_fn points to the backward function of the operation that produced it, while leaf variables, which no operation produced, have no grad_fn and instead accumulate gradients directly.

During a forward pass, each operation creates Variable and Function instances. As a concrete example, a tensor that undergoes an addition and then a multiplication produces a chain of Variable objects and corresponding backward Function objects such as AddBackward0 and MulBackward0, each storing metadata, a sequence number, and edge connections.
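The chain built during that forward pass can be sketched in a few lines of pure Python. This is a toy illustration, not PyTorch source: the Node class and its next_edges attribute merely mimic how each backward object points at the backward object of the operation before it.

```python
# Toy sketch (not PyTorch source) of the grad_fn chain built by a forward
# pass such as: x (leaf) -> y = x + 2 -> z = y * 3.

class Node:
    """Analogue of a backward graph vertex with its outgoing edges."""
    def __init__(self, name, next_edges):
        self.name = name
        self.next_edges = next_edges  # analogue of next_edges_

leaf = Node("AccumulateGrad", [])      # leaf tensors get an AccumulateGrad sink
y_fn = Node("AddBackward0", [leaf])    # created by y = x + 2
z_fn = Node("MulBackward0", [y_fn])    # created by z = y * 3

# Walk the chain from the output's grad_fn back to the leaf.
chain = []
fn = z_fn
while fn is not None:
    chain.append(fn.name)
    fn = fn.next_edges[0] if fn.next_edges else None

print(chain)  # ['MulBackward0', 'AddBackward0', 'AccumulateGrad']
```

Inspecting z.grad_fn and z.grad_fn.next_functions on a real tensor reveals the same shape of chain, with each node created at the moment its forward operation ran, which is exactly what makes the graph dynamic.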

In the autograd graph, Function objects are vertices and the Edge struct (holding a std::shared_ptr<Function> and an input index) represents edges; the next_edges_ member of a Function connects it to downstream functions.

The base Function class is defined as follows:

using edge_list = std::vector<Edge>;
using variable_list = std::vector<Variable>;

struct TORCH_API Function {
  virtual variable_list apply(variable_list&& inputs) = 0;

  const uint64_t sequence_nr_;
  edge_list next_edges_;
  PyObject* pyobj_ = nullptr;
  std::unique_ptr<AnomalyMetadata> anomaly_metadata_ = nullptr;
  std::vector<std::unique_ptr<FunctionPreHook>> pre_hooks_;
  std::vector<std::unique_ptr<FunctionPostHook>> post_hooks_;
  at::SmallVector<InputMetadata, 2> input_metadata_;
};

The call operator of Function simply forwards to apply:

variable_list operator()(variable_list&& inputs) {
  return apply(std::move(inputs));
}

Input metadata is captured by the InputMetadata struct, which records the data type, shape, and device of each input.
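A minimal Python analogue of that record, using illustrative field names rather than the actual C++ member names, would be:

```python
# Toy analogue (illustrative field names) of the InputMetadata record that
# captures what the backward pass needs to know about each input.
from dataclasses import dataclass

@dataclass
class InputMetadata:
    dtype: str     # data type of the input tensor
    shape: tuple   # its sizes
    device: str    # where it lives (e.g. "cpu", "cuda:0")

meta = InputMetadata(dtype="float32", shape=(2, 3), device="cpu")
print(meta)
```

Storing this per input lets the engine validate that incoming gradients match the forward inputs in type, shape, and device.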

Edges are defined as:

struct Edge {
  std::shared_ptr<Function> function;
  uint32_t input_nr;
};
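In Python terms, an Edge is just a (function, input_nr) pair: which downstream node to send a gradient to, and which of that node's inputs it feeds. A hedged sketch, with a made-up Fn class standing in for Function:

```python
# Toy sketch: an Edge pairs a target backward node with the index of the
# input slot on that node that the gradient flows into.
from collections import namedtuple

Edge = namedtuple("Edge", ["function", "input_nr"])

class Fn:
    """Stand-in for Function; holds a name and outgoing edges."""
    def __init__(self, name):
        self.name = name
        self.next_edges_ = []

add = Fn("AddBackward0")
mul = Fn("MulBackward0")
# mul's gradient output feeds input slot 0 of add
mul.next_edges_.append(Edge(function=add, input_nr=0))

edge = mul.next_edges_[0]
print(edge.function.name, edge.input_nr)  # AddBackward0 0
```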

From the base Function, several derived classes are generated, including AccumulateGrad, TraceableFunction, and GraphRoot. AccumulateGrad holds a reference to the variable whose gradient is being accumulated:

struct AccumulateGrad : public Function {
  explicit AccumulateGrad(Variable variable_);
  variable_list apply(variable_list&& grads) override;
  Variable variable;
};
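Behaviorally, AccumulateGrad is a sink: each incoming gradient is added into the held variable's .grad, and no further edges are followed. A toy Python sketch of that behavior (not the actual implementation, which also handles sparse gradients and hooks):

```python
# Toy sketch of AccumulateGrad's behavior: sum incoming gradients into the
# variable it holds; as a leaf node it produces no outgoing gradients.

class Var:
    """Stand-in for a leaf Variable with a .grad slot."""
    def __init__(self):
        self.grad = None

class AccumulateGrad:
    def __init__(self, variable):
        self.variable = variable

    def apply(self, grads):
        g = grads[0]
        if self.variable.grad is None:
            self.variable.grad = g
        else:
            self.variable.grad += g  # repeated backward() calls accumulate
        return []  # leaf: nothing downstream

v = Var()
acc = AccumulateGrad(v)
acc.apply([1.5])
acc.apply([2.5])
print(v.grad)  # 4.0
```

This accumulation is why real PyTorch code zeroes gradients between optimization steps.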

GraphRoot wraps the final forward output as the root of the backward graph:

struct GraphRoot : public Function {
  GraphRoot(edge_list functions, variable_list inputs)
    : Function(std::move(functions)), outputs(std::move(inputs)) {}
  variable_list apply(variable_list&& inputs) override { return outputs; }
  variable_list outputs;
};
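GraphRoot's apply is the mirror image of AccumulateGrad's: it ignores its inputs and simply emits the stored gradients that seed the backward pass. A minimal sketch:

```python
# Toy sketch of GraphRoot: the root of the backward graph just returns the
# seed gradients (typically 1.0 for a scalar loss), ignoring its inputs.

class GraphRoot:
    def __init__(self, outputs):
        self.outputs = outputs

    def apply(self, inputs):
        return self.outputs  # inputs are ignored

root = GraphRoot(outputs=[1.0])  # d(loss)/d(loss) = 1
print(root.apply([]))  # [1.0]
```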

TraceableFunction adds a flag indicating the function can be traced and serves as the base for over 300 backward-only subclasses such as AddBackward0:

struct AddBackward0 : public TraceableFunction {
  using TraceableFunction::TraceableFunction;
  variable_list apply(variable_list&& grads) override;
  Scalar alpha;
};
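The stored alpha is the gradient rule's only state: for an addition z = a + alpha * b, dz/da = 1 and dz/db = alpha, so the incoming gradient passes through unchanged to a and scaled by alpha to b. A toy sketch of that rule, assuming the two-operand form of add:

```python
# Toy sketch of AddBackward0's gradient rule, assuming z = a + alpha * b:
# the incoming gradient goes to a as-is and to b scaled by alpha.

class AddBackward0:
    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def apply(self, grads):
        g = grads[0]
        return [g, self.alpha * g]  # one gradient per next edge

node = AddBackward0(alpha=2.0)
print(node.apply([1.0]))  # [1.0, 2.0]
```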

The Engine class orchestrates the backward pass: it executes the graph from its root edges, tracks how many gradients each function is still waiting for, and can distribute work across per-device threads. Its core method is execute, shown here in abridged form:

struct Engine {
  using ready_queue_type = std::deque<std::pair<std::shared_ptr<Function>, InputBuffer>>;
  using dependencies_type = std::unordered_map<Function*, int>;
  virtual variable_list execute(const edge_list& roots, const variable_list& inputs, ... const edge_list& outputs = {});
  void queue_callback(std::function<void()> callback);
protected:
  void compute_dependencies(Function* root, GraphTask& task);
  void evaluate_function(FunctionTask& task);
  void start_threads();
  virtual void thread_init(int device);
  virtual void thread_main(GraphTask* graph_task);
  std::vector<std::shared_ptr<ReadyQueue>> ready_queues;
};
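The essence of execute, compute_dependencies, and the ready queue can be captured in a grossly simplified single-threaded sketch. All names here are illustrative; the real engine additionally handles multiple devices, input buffers per slot, hooks, and reentrant calls.

```python
# Grossly simplified single-threaded engine sketch (illustrative names only):
# count each node's incoming edges, then drain a ready queue, enqueueing a
# node only once all of its incoming gradients have been accumulated.
from collections import deque

class Node:
    def __init__(self, name, apply_fn, next_edges=()):
        self.name = name
        self.apply_fn = apply_fn            # grad -> list of output grads
        self.next_edges = list(next_edges)  # (node, input_nr) pairs

def compute_dependencies(root):
    """Count, for every reachable node, how many edges point at it."""
    deps, seen, stack = {}, set(), [root]
    while stack:
        fn = stack.pop()
        for nxt, _ in fn.next_edges:
            deps[nxt] = deps.get(nxt, 0) + 1
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return deps

def execute(root, root_grad):
    deps = compute_dependencies(root)
    buffers = {root: root_grad}  # accumulated incoming gradient per node
    ready = deque([root])
    order = []
    while ready:
        fn = ready.popleft()
        order.append(fn.name)
        out_grads = fn.apply_fn(buffers.pop(fn))
        for (nxt, _input_nr), g in zip(fn.next_edges, out_grads):
            buffers[nxt] = buffers.get(nxt, 0.0) + g
            deps[nxt] -= 1
            if deps[nxt] == 0:   # all inputs arrived: schedule it
                ready.append(nxt)
    return order

# Graph for z = (x + 2) * 3: GraphRoot -> MulBackward0 -> AddBackward0 -> leaf
leaf = Node("AccumulateGrad", lambda g: [])
add = Node("AddBackward0", lambda g: [g], [(leaf, 0)])
mul = Node("MulBackward0", lambda g: [3.0 * g], [(add, 0)])
root = Node("GraphRoot", lambda g: [g], [(mul, 0)])

print(execute(root, 1.0))
# ['GraphRoot', 'MulBackward0', 'AddBackward0', 'AccumulateGrad']
```

The dependency counting is what lets nodes with several consumers (e.g. a tensor used twice) wait until every incoming gradient has been summed before running.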

A derived class PythonEngine overrides execute only to translate C++ exceptions into Python exceptions; the actual computation is performed by the base Engine.

The backward call stack for tensor.backward() proceeds from the Python tensor method through torch.autograd.backward, then to Variable._execution_engine.run_backward, followed by the C++ entry point THPEngine_run_backward, and finally to Engine::execute.

The next article will focus on how the Engine class drives the execution of PyTorch’s dynamic graph during a backward() call.

Tags: Artificial Intelligence, Deep Learning, PyTorch, Dynamic Graph, autograd
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
