Differentiating Scalars with Respect to Vectors Using the Definition Method
This article explains how to compute the derivative of a scalar-valued function with respect to a vector by applying the definition method, illustrating the process with simple and more complex examples, outlining basic differentiation rules, and discussing the limitations of the definition approach for vector‑matrix derivatives.
The derivative of a scalar with respect to a vector is, formally, the derivative of a real‑valued function whose input is an n‑dimensional vector and whose output is a scalar. For a given real‑valued function \(f(x)\), we seek \(\partial f/\partial x\).
The usual approach is to apply the matrix‑calculus definition: differentiate the scalar with respect to each component of the vector, then assemble the results into a vector. In a simple example, differentiating with respect to the i‑th component \(x_i\) is an ordinary scalar‑to‑scalar derivative, and its value becomes the i‑th entry of the result. The n entries together form an n‑dimensional vector, the gradient.
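The component‑by‑component procedure can be sketched numerically. The snippet below is a minimal illustration, not the article's own code: the helper `gradient_by_definition`, the step size `h`, and the example function \(f(x) = a^{T}x\) with its constant vector `a` are all assumptions chosen for the demonstration. Each partial derivative is approximated with a central difference, and the results are collected into a list that plays the role of the gradient.

```python
def gradient_by_definition(f, x, h=1e-6):
    """Approximate the gradient of a scalar function f at x, one component at a time."""
    grad = []
    for i in range(len(x)):
        # Perturb only the i-th component and keep the others fixed.
        x_plus = list(x)
        x_plus[i] += h
        x_minus = list(x)
        x_minus[i] -= h
        # Central-difference estimate of the partial derivative df/dx_i.
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    # Assemble the n partial derivatives into one n-dimensional vector.
    return grad


# Hypothetical example: f(x) = a^T x with a constant vector a (an assumption).
a = [2.0, -1.0, 0.5]


def linear(x):
    return sum(ai * xi for ai, xi in zip(a, x))


x0 = [1.0, 2.0, 3.0]
grad = gradient_by_definition(linear, x0)
```

For this linear function, each entry of `grad` should agree with the matching entry of `a` up to rounding error, which is what the definition predicts.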
Using the same reasoning we can directly obtain the gradient for other simple cases. A simple test is provided for readers to try deriving by definition.
A slightly more complex example is then presented, where the derivative of a more involved function is computed component‑wise. For each component, the first part of the derivative is the transpose of a matrix column multiplied by the vector, and the second part is the corresponding matrix row multiplied by the vector. Assembling the components gives a compact matrix‑vector expression.
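As an illustration of this pattern, assume the function is the quadratic form \(f(x) = x^{T}Ax\) with a constant \(n \times n\) matrix A. This choice is an assumption made here because it matches the column and row structure described above. Writing out the sums and differentiating with respect to the k‑th component gives:

\[
f(x) = x^{T}Ax = \sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}\,x_i\,x_j
\]

\[
\frac{\partial f}{\partial x_k} = \sum_{i=1}^{n} A_{ik}\,x_i + \sum_{j=1}^{n} A_{kj}\,x_j
\]

The first sum is the k‑th column of A, transposed, times x. The second is the k‑th row of A times x. Stacking the n components yields

\[
\frac{\partial f}{\partial x} = A^{T}x + Ax .
\]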
The definition method works well for simple functions, but for complex expressions the component‑wise computation and subsequent assembly become cumbersome, motivating the search for more convenient overall differentiation techniques.
Basic Rules for Scalar‑to‑Vector Differentiation
Before seeking shortcuts, we review basic rules analogous to scalar‑to‑scalar differentiation:
The derivative of a constant with respect to a vector is the zero vector.
Linear rule: if f and g are real‑valued functions and a and b are constants, then \(\partial (af+bg)/\partial x = a\,\partial f/\partial x + b\,\partial g/\partial x\).
Product rule: if f and g are real‑valued functions, then \(\partial (fg)/\partial x = f\,\partial g/\partial x + g\,\partial f/\partial x\) (valid only for real‑valued functions).
Quotient rule: if f and g are real‑valued functions and \(g \neq 0\), then \(\partial (f/g)/\partial x = (g\,\partial f/\partial x - f\,\partial g/\partial x)/g^{2}\).
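The product and quotient rules can be checked numerically. The sketch below is an assumption‑laden example, not part of the original article: the helper `num_grad`, the test point `x0`, and the functions \(f(x) = a^{T}x\) (gradient a) and \(g(x) = x^{T}x\) (gradient 2x) are chosen for illustration only. It compares the vectors given by the rules with finite‑difference gradients of \(fg\) and \(f/g\).

```python
def num_grad(func, x, h=1e-6):
    """Central-difference gradient of a scalar function, component by component."""
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        xm = list(x)
        xm[i] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad


# Hypothetical example functions (assumptions for this sketch).
a = [2.0, -1.0, 0.5]


def f(x):  # f(x) = a^T x, gradient a
    return sum(ai * xi for ai, xi in zip(a, x))


def g(x):  # g(x) = x^T x, gradient 2x
    return sum(xi * xi for xi in x)


x0 = [1.0, 2.0, 3.0]
grad_f = list(a)
grad_g = [2.0 * xi for xi in x0]
fv, gv = f(x0), g(x0)

# Product rule: d(fg)/dx = f * dg/dx + g * df/dx
product_rule = [fv * dg + gv * df for df, dg in zip(grad_f, grad_g)]
# Quotient rule: d(f/g)/dx = (g * df/dx - f * dg/dx) / g^2
quotient_rule = [(gv * df - fv * dg) / gv**2 for df, dg in zip(grad_f, grad_g)]

product_numeric = num_grad(lambda x: f(x) * g(x), x0)
quotient_numeric = num_grad(lambda x: f(x) / g(x), x0)

product_gap = max(abs(p - q) for p, q in zip(product_rule, product_numeric))
quotient_gap = max(abs(p - q) for p, q in zip(quotient_rule, quotient_numeric))
```

Both `product_gap` and `quotient_gap` should be close to zero, limited only by the finite‑difference approximation.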
Differentiating Vectors with Respect to Vectors Using the Definition Method
The article also provides a concrete example of vector‑to‑vector differentiation. Given a matrix A, a vector x, and the vector \(y = Ax\), we need the derivative of y with respect to x. The i‑th component of y is the inner product of the i‑th row of A with x, and by definition its derivative with respect to the j‑th component of x is the (i, j) entry \(A_{ij}\). Assembling all entries with i as the row index and j as the column index (numerator layout) produces the matrix A, not its transpose.
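The same entry‑by‑entry assembly can be sketched in code. The example below is illustrative only: the helper `jacobian_by_definition`, the 2×3 matrix `A`, and the test point are assumptions, and the result is arranged in numerator layout, with entry (i, j) holding \(\partial y_i/\partial x_j\). A non‑square matrix is used so that A and its transpose cannot be confused.

```python
def jacobian_by_definition(F, x, h=1e-6):
    """Approximate J[i][j] = d y_i / d x_j for a vector-valued function y = F(x)."""
    m, n = len(F(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # Perturb only the j-th input component.
        xp = list(x)
        xp[j] += h
        xm = list(x)
        xm[j] -= h
        yp, ym = F(xp), F(xm)
        for i in range(m):
            # Central-difference estimate of d y_i / d x_j.
            J[i][j] = (yp[i] - ym[i]) / (2 * h)
    return J


# Hypothetical 2x3 matrix (an assumption for this sketch).
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]


def matvec(x):  # y = A x
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


J = jacobian_by_definition(matvec, [0.5, -1.0, 2.0])
```

Here `J` has the same 2×3 shape as `A` and matches it entry by entry up to rounding error. Under the opposite (denominator) layout convention, the same numbers would be arranged as the transpose.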
Limitations of the Definition Method for Matrix‑Vector Differentiation
While the definition method can handle simple cases, it quickly becomes algebraically intensive for complex expressions, and arranging the resulting components into the final matrix is error‑prone. The next article will discuss matrix differential and trace‑function techniques as more efficient alternatives.
Source: https://www.cnblogs.com/pinard/p/10773942.html (author Liu Jianping Pinard)
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".