Hessian matrix
The Hessian matrix is a square matrix of second-order partial derivatives of a scalar-valued function. It describes the local curvature of a function of multiple variables.
Definition
Given a function f : ℝⁿ → ℝ that is twice differentiable, the Hessian matrix H(f) is the n × n matrix of all second-order partial derivatives:

H(f) = ⎡ ∂²f/∂x₁²     ∂²f/∂x₁∂x₂  ⋯  ∂²f/∂x₁∂xₙ ⎤
       ⎢ ∂²f/∂x₂∂x₁   ∂²f/∂x₂²    ⋯  ∂²f/∂x₂∂xₙ ⎥
       ⎢     ⋮            ⋮        ⋱      ⋮      ⎥
       ⎣ ∂²f/∂xₙ∂x₁   ∂²f/∂xₙ∂x₂  ⋯  ∂²f/∂xₙ²   ⎦

In index notation, the (i, j)-entry is H(f)ᵢⱼ = ∂²f/∂xᵢ∂xⱼ.
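As a concrete illustration, each entry can be approximated numerically by a central finite difference; the helper below is a sketch written for this post, not part of any standard library:

```python
import numpy as np

def numerical_hessian(f, x, h=1e-5):
    """Approximate the Hessian of f at x with central finite differences."""
    n = x.size
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n); e_i[i] = h
            e_j = np.zeros(n); e_j[j] = h
            # central-difference estimate of the (i, j)-entry d²f/dx_i dx_j
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * h * h)
    return H

# f(x, y) = x² y has Hessian [[2y, 2x], [2x, 0]]; at (1, 2) that is [[4, 2], [2, 0]]
f = lambda v: v[0] ** 2 * v[1]
H = numerical_hessian(f, np.array([1.0, 2.0]))
```

The estimate agrees with the hand-computed Hessian up to the finite-difference error.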
Properties
Symmetry: If all second partial derivatives are continuous (as is the case for smooth functions), the Hessian is symmetric: by Clairaut's theorem the mixed partials are equal, ∂²f/∂xᵢ∂xⱼ = ∂²f/∂xⱼ∂xᵢ.
Interpretation: The Hessian represents the quadratic part of the Taylor expansion of f around a point:
f(x + Δx) ≈ f(x) + ∇f(x)ᵀ Δx + ½ Δxᵀ H(f)(x) Δx.
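A quick numerical check of this expansion; the particular function and evaluation point below are arbitrary choices for illustration, with gradient and Hessian written out by hand:

```python
import numpy as np

# f(x, y) = x² + 3xy, a quadratic, so the second-order expansion is exact
f = lambda v: v[0] ** 2 + 3 * v[0] * v[1]
grad = lambda v: np.array([2 * v[0] + 3 * v[1], 3 * v[0]])
H = np.array([[2.0, 3.0], [3.0, 0.0]])  # constant Hessian of a quadratic

x = np.array([1.0, 2.0])
dx = np.array([0.05, -0.03])

# second-order Taylor approximation around x
approx = f(x) + grad(x) @ dx + 0.5 * dx @ H @ dx
exact = f(x + dx)
```

Because f is quadratic here, the approximation matches f(x + Δx) to machine precision; for a general smooth f the error is O(‖Δx‖³).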
Applications
Optimization: At a critical point (where the gradient is zero), the Hessian's eigenvalues determine the nature of the point:
All eigenvalues positive → local minimum.
All eigenvalues negative → local maximum.
Mixed signs → saddle point.
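The eigenvalue test above can be sketched with NumPy's symmetric eigensolver; the function name and tolerance are illustrative choices:

```python
import numpy as np

def classify_critical_point(H, tol=1e-10):
    """Classify a critical point from the eigenvalues of a symmetric Hessian."""
    eig = np.linalg.eigvalsh(H)  # eigvalsh is specialized for symmetric matrices
    if np.all(eig > tol):
        return "local minimum"
    if np.all(eig < -tol):
        return "local maximum"
    if np.any(eig > tol) and np.any(eig < -tol):
        return "saddle point"
    return "inconclusive"  # some eigenvalue is (numerically) zero

# f(x, y) = x² − y² has a saddle at the origin, where H = diag(2, −2)
print(classify_critical_point(np.diag([2.0, -2.0])))
```

Note the "inconclusive" branch: when an eigenvalue is zero the second-order test fails and higher-order terms decide the behavior.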
Newton's method: Used in optimization to find roots of the gradient; each update step xₖ₊₁ = xₖ − H(f)(xₖ)⁻¹ ∇f(xₖ) involves the inverse of the Hessian.
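A minimal sketch of this iteration, assuming the gradient and Hessian are available in closed form (the test function here is a made-up quadratic):

```python
import numpy as np

def newton_minimize(grad, hess, x0, steps=20):
    """Iterate Newton's method on the gradient: x <- x - H(x)^(-1) grad(x)."""
    x = x0.astype(float)
    for _ in range(steps):
        # solve H dx = grad(x) rather than explicitly inverting the Hessian
        x -= np.linalg.solve(hess(x), grad(x))
    return x

# f(x, y) = (x − 1)² + 2(y + 3)², minimized at (1, −3)
grad = lambda v: np.array([2 * (v[0] - 1), 4 * (v[1] + 3)])
hess = lambda v: np.array([[2.0, 0.0], [0.0, 4.0]])
x_star = newton_minimize(grad, hess, np.array([10.0, 10.0]))
```

Solving the linear system instead of forming H⁻¹ is the standard numerical practice; on a quadratic like this, Newton's method converges in a single step.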
Curvature: The Hessian provides information about the function's curvature in different directions (e.g., via the second directional derivative).
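Concretely, the second directional derivative of f along a unit vector v is the quadratic form vᵀ H v; a small sketch using the saddle example from above:

```python
import numpy as np

# Hessian of f(x, y) = x² − y² (constant everywhere)
H = np.array([[2.0, 0.0], [0.0, -2.0]])

def second_directional_derivative(H, v):
    """Curvature of f along direction v: vᵀ H v for the normalized v."""
    v = v / np.linalg.norm(v)
    return v @ H @ v

up = second_directional_derivative(H, np.array([1.0, 0.0]))    # curves upward
down = second_directional_derivative(H, np.array([0.0, 1.0]))  # curves downward
```

The sign flip between the two directions is exactly what makes the origin a saddle point of x² − y².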
In summary, the Hessian matrix is a fundamental tool in multivariable calculus, particularly for analyzing the behavior of functions near critical points and for designing optimization algorithms.