You don't always need to solve a differential equation exactly. Pick a handful of "reasonable-looking" functions, demand that the equation's leftover error be orthogonal to each one, and a small linear system hands you a surprisingly good answer.
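Consider the boundary value problem

$$
\frac{d}{dt}\left(t\,\frac{dx}{dt}\right) + t\,x = t, \qquad 0 < t < 1, \qquad x(1) = 0, \quad x(0)\ \text{finite}.
$$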
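This is a disguised Bessel equation, and it happens to have a known closed-form solution, which is useful because it lets us grade an approximate method against the truth:

$$
x(t) = 1 - \frac{J_0(t)}{J_0(1)},
$$

where $J_0$ is the Bessel function of the first kind of order zero.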
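A short check of the closed form $x(t) = 1 - J_0(t)/J_0(1)$, assuming the problem is $\frac{d}{dt}(t\,x') + t\,x = t$ with $x(1)=0$ and $x$ bounded at $t=0$ (the form implied by the operator and right-hand side used in the rest of this page). The constant $x_p = 1$ is a particular solution:

$$
\frac{d}{dt}\big(t\cdot 0\big) + t\cdot 1 = t .
$$

The homogeneous part is Bessel's equation of order zero after multiplying through by $t$:

$$
\frac{d}{dt}\big(t\,x_h'\big) + t\,x_h = 0
\;\;\Longleftrightarrow\;\;
t^2 x_h'' + t\,x_h' + t^2 x_h = 0
\;\;\Longrightarrow\;\;
x_h(t) = a\,J_0(t) + b\,Y_0(t).
$$

Boundedness at $t=0$ forces $b=0$ because $Y_0$ blows up there, and $x(1)=0$ gives $a = -1/J_0(1)$. With $J_0(1)\approx 0.7652$, the exact solution starts at $x(0) = 1 - 1/J_0(1) \approx -0.307$.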
Most boundary value problems you'll meet don't have a closed form at all. The Ritz-Galerkin method gives you a way forward anyway.
Pick a small set of trial functions $\varphi_i(t)$ and look for an approximate solution as their weighted sum:

$$
x(t) \approx \sum_{i=1}^{N} c_i\,\varphi_i(t).
$$
Here, $\varphi_n(t) = t^{n-1}(1-t)$. This isn't an arbitrary guess — the factor $(1-t)$ makes every trial function vanish at $t=1$ automatically, matching $x(1)=0$ for any choice of coefficients, and $t^{n-1}$ keeps each one finite at $t=0$. The boundary conditions are satisfied by construction, before a single coefficient is solved for — only the differential equation itself is left to approximately satisfy.
Substitute the approximation into the left side of the ODE. Write $L[\varphi] = \frac{d}{dt}(t\varphi') + t\varphi$ for the operator. Because the trial functions can't satisfy the equation exactly, you're left with a nonzero residual:

$$
R(t) = \sum_{i=1}^{N} c_i\,L[\varphi_i](t) - t .
$$
You can't force $R(t)=0$ at every point; in general that would take infinitely many trial functions. Galerkin's condition instead asks that $R(t)$ be orthogonal to every trial function used to build the approximation:

$$
\int_0^1 R(t)\,\varphi_j(t)\,dt = 0, \qquad j = 1,\dots,N.
$$
That's exactly $N$ linear equations in the $N$ unknown coefficients $c_i$ — an ordinary matrix problem, solvable directly for small $N$ or by an iterative method like Gauss-Seidel for larger systems.
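The assembly step can be sketched in a few lines of plain Python. This is a minimal illustration, not the lab's actual code: it assumes the problem $L[x] = t$ on $[0,1]$ with $\varphi_n(t) = t^{n-1}(1-t)$, stores polynomials as coefficient lists (lowest degree first), and uses exact fractions so the integrals carry no rounding error. All helper names (`poly_mul`, `apply_L`, `assemble`, `galerkin`, and so on) are made up for this example.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Sum of two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_diff(p):
    """Derivative of a polynomial."""
    return [k * p[k] for k in range(1, len(p))] or [Fraction(0)]

def poly_int01(p):
    """Exact integral of a polynomial over [0, 1]."""
    return sum(c / (k + 1) for k, c in enumerate(p))

T = [Fraction(0), Fraction(1)]  # the polynomial t

def trial(n):
    """Trial function phi_n(t) = t^(n-1) * (1 - t)."""
    return [Fraction(0)] * (n - 1) + [Fraction(1), Fraction(-1)]

def apply_L(p):
    """L[phi] = d/dt(t * phi') + t * phi."""
    return poly_add(poly_diff(poly_mul(T, poly_diff(p))), poly_mul(T, p))

def assemble(N):
    """Galerkin system: A[i][j] = int L[phi_j] phi_i dt, b[i] = int t phi_i dt."""
    phis = [trial(n) for n in range(1, N + 1)]
    A = [[poly_int01(poly_mul(apply_L(pj), pi)) for pj in phis] for pi in phis]
    b = [poly_int01(poly_mul(T, pi)) for pi in phis]
    return A, b

def solve(A, b):
    """Gauss-Jordan elimination in exact arithmetic."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def galerkin(N):
    """Coefficients c_1..c_N of the N-term Galerkin approximation."""
    A, b = assemble(N)
    return solve(A, b)
```

Under those assumptions, `galerkin(1)` returns `[Fraction(-2, 5)]` and `galerkin(2)` returns `[Fraction(-50, 161), Fraction(-45, 161)]`. Exact fractions are a convenience for small $N$; a floating-point solver would be the usual choice for larger systems.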
With $N=1$, $\varphi_1(t)=1-t$. The operator gives $L[\varphi_1](t) = -1+t-t^2$, and the single Galerkin equation $\int_0^1\big(c_1 L[\varphi_1]-t\big)\varphi_1\,dt=0$ reduces to

$$
-\frac{5}{12}\,c_1 - \frac{1}{6} = 0 \quad\Longrightarrow\quad c_1 = -\frac{2}{5},
$$
so the one-term approximation is $x(t)\approx -0.4(1-t)$ — crude, but it already has the right boundary behavior and the right sign.
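The arithmetic behind that coefficient, written out with $\varphi_1 = 1-t$ and $L[\varphi_1] = -1+t-t^2$:

$$
\int_0^1 L[\varphi_1]\,\varphi_1\,dt
= \int_0^1 \big(-1 + 2t - 2t^2 + t^3\big)\,dt
= -1 + 1 - \tfrac{2}{3} + \tfrac{1}{4}
= -\tfrac{5}{12},
$$

$$
\int_0^1 t\,\varphi_1\,dt
= \int_0^1 \big(t - t^2\big)\,dt
= \tfrac{1}{2} - \tfrac{1}{3}
= \tfrac{1}{6},
$$

so $-\tfrac{5}{12}\,c_1 = \tfrac{1}{6}$ and $c_1 = -\tfrac{2}{5} = -0.4$.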
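Adding a second trial function $\varphi_2(t)=t(1-t)$ turns the single equation into a $2\times 2$ system, with entries $\int_0^1 L[\varphi_j]\,\varphi_i\,dt$ on the left and $\int_0^1 t\,\varphi_i\,dt$ on the right:

$$
\begin{pmatrix} -\tfrac{5}{12} & -\tfrac{2}{15} \\[2pt] -\tfrac{2}{15} & -\tfrac{3}{20} \end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}
=
\begin{pmatrix} \tfrac{1}{6} \\[2pt] \tfrac{1}{12} \end{pmatrix}
$$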
Notice the matrix is symmetric, and that's not a coincidence. The operator $L$ here is self-adjoint (it comes from a Sturm-Liouville problem), and a self-adjoint operator produces a symmetric Galerkin matrix whenever the trial functions satisfy the boundary conditions, as they do here.
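One way to see the symmetry directly is to integrate the derivative term by parts. Writing $A_{ij} = \int_0^1 L[\varphi_j]\,\varphi_i\,dt$ for the matrix entries (a label used only in this sketch):

$$
\int_0^1 \frac{d}{dt}\big(t\,\varphi_j'\big)\,\varphi_i\,dt
= \Big[\,t\,\varphi_j'\,\varphi_i\,\Big]_0^1 - \int_0^1 t\,\varphi_i'\,\varphi_j'\,dt .
$$

The boundary term vanishes at $t=1$ because $\varphi_i(1)=0$, and at $t=0$ because of the factor $t$. That leaves

$$
A_{ij} = \int_0^1 \big(t\,\varphi_i\,\varphi_j - t\,\varphi_i'\,\varphi_j'\big)\,dt ,
$$

which is unchanged when $i$ and $j$ are swapped, so $A_{ij} = A_{ji}$.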
In the companion lab, sliding $N$ from 1 to 6 drops the maximum error against the exact $J_0$-based solution from about $9\times10^{-2}$ at one term to under $2\times10^{-8}$ at six, with the Galerkin equations assembled live, term by term, as trial functions are added.
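The error trend can be reproduced offline with a rough sketch like the one below. It rests on several assumptions that are not spelled out on this page: the exact solution is taken to be $x(t) = 1 - J_0(t)/J_0(1)$, $J_0$ is evaluated from its power series, the error is sampled on a uniform grid rather than maximized exactly, and the shortcut $L[t^m] = m^2 t^{m-1} + t^{m+1}$ is used to apply the operator to each monomial. All function names are made up for the example.

```python
from fractions import Fraction

def phi(n):
    """phi_n(t) = t^(n-1) - t^n, stored as {power: coefficient}."""
    return {n - 1: Fraction(1), n: Fraction(-1)}

def L_phi(n):
    """L[phi_n], built from L[t^m] = m^2 t^(m-1) + t^(m+1); zero terms dropped."""
    terms = {n - 2: Fraction((n - 1) ** 2), n - 1: Fraction(-n * n),
             n: Fraction(1), n + 1: Fraction(-1)}
    return {p: c for p, c in terms.items() if c != 0}

def inner(p, q):
    """Exact integral over [0, 1] of the product of two polynomials."""
    return sum(a * b / (i + j + 1) for i, a in p.items() for j, b in q.items())

def solve(A, b):
    """Gauss-Jordan elimination in exact arithmetic."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def galerkin(N):
    """Solve sum_j c_j * int L[phi_j] phi_i dt = int t phi_i dt for c."""
    idx = range(1, N + 1)
    A = [[inner(L_phi(j), phi(i)) for j in idx] for i in idx]
    b = [inner({1: Fraction(1)}, phi(i)) for i in idx]
    return solve(A, b)

def j0(t, terms=30):
    """Bessel J_0 from its power series (ample for 0 <= t <= 1)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -(t / 2) ** 2 / ((k + 1) ** 2)
    return total

def exact(t):
    """Assumed closed-form solution 1 - J0(t)/J0(1)."""
    return 1.0 - j0(t) / j0(1.0)

def approx(t, c):
    """Evaluate sum_k c_k * t^(k-1) * (1 - t)."""
    return sum(float(ck) * t ** k * (1 - t) for k, ck in enumerate(c))

def max_error(N, samples=201):
    """Largest sampled |approx - exact| on a uniform grid over [0, 1]."""
    c = galerkin(N)
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(approx(t, c) - exact(t)) for t in grid)

for N in range(1, 7):
    print(N, f"{max_error(N):.2e}")
```

For $N=1$ the largest error sits at $t=0$, where the approximation gives $-0.4$ against an exact value near $-0.307$. Because the grid is finite, the printed values are a lower bound on the true maximum error.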
Ritz's original method starts somewhere completely different: minimize a variational functional (an energy-like integral whose minimum corresponds to the true solution) directly over the trial coefficients. Galerkin's method, derived above, starts from the differential equation itself and forces a weighted residual to vanish. For a self-adjoint operator, both approaches generate the exact same linear system — which is why the combined name persists even though the two methods don't obviously start from the same place. For operators that aren't self-adjoint, only Galerkin's weighted-residual route still applies.
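For this page's problem the equivalence can be checked by hand. The functional below is not given in the text above; it is one candidate, chosen so that its Euler-Lagrange equation reproduces $\frac{d}{dt}(t\,x') + t\,x = t$:

$$
J[x] = \int_0^1 \Big[\tfrac{1}{2}\,t\,(x')^2 - \tfrac{1}{2}\,t\,x^2 + t\,x\Big]\,dt ,
\qquad
\frac{\partial F}{\partial x} - \frac{d}{dt}\frac{\partial F}{\partial x'}
= -t\,x + t - \frac{d}{dt}\big(t\,x'\big) = 0 .
$$

Substituting $x = \sum_j c_j\varphi_j$ and setting $\partial J/\partial c_i = 0$ gives

$$
\sum_{j=1}^{N} c_j \int_0^1 \big(t\,\varphi_i'\,\varphi_j' - t\,\varphi_i\,\varphi_j\big)\,dt + \int_0^1 t\,\varphi_i\,dt = 0 ,
\qquad i = 1,\dots,N .
$$

Integrating the first term by parts (the boundary term vanishes because $\varphi_i(1)=0$ and because of the factor $t$ at $t=0$) turns this into $-\int_0^1 R(t)\,\varphi_i(t)\,dt = 0$, which is the Galerkin condition with its sign flipped: the same linear system.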
Everything above uses global trial functions — each $\varphi_i$ is nonzero across the whole domain, which is why a handful of well-chosen polynomials can already get within $10^{-8}$. The Finite Element Method applies the identical weighted-residual idea to local, piecewise trial functions — each one nonzero only over a small element of the domain. The bookkeeping scales to far more complex geometries that way, but the governing condition, $\int(\text{residual})\cdot(\text{trial function})\,dt=0$, never changes. The same idea also extends directly to PDEs in two or more dimensions — the trial functions just become functions of $x$ and $y$ together, and the integral becomes a double integral over the domain.
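To make the comparison concrete, here is a rough finite element sketch for the same 1D problem. It is an illustration under stated assumptions, not a production solver: it uses piecewise-linear "hat" functions on a uniform mesh, the weak form $\int_0^1 (t\,x'v' - t\,x\,v)\,dt = -\int_0^1 t\,v\,dt$ obtained by integrating the Galerkin condition by parts, element integrals worked out by hand for linear shape functions, and a dense solver for simplicity. The names `fem_hat` and `solve` are made up for the example.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting (dense, fine for small systems)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fem_hat(num_elements):
    """Nodal values of the hat-function Galerkin solution on [0, 1]."""
    n = num_elements
    h = 1.0 / n
    S = [[0.0] * (n + 1) for _ in range(n + 1)]  # global matrix
    f = [0.0] * (n + 1)                          # global right-hand side
    for e in range(n):
        a = e * h  # left end of element [a, a + h]
        # integral of t * N_p' * N_q' over the element
        k = (a + h / 2) / h
        stiff = [[k, -k], [-k, k]]
        # integral of t * N_p * N_q over the element
        mass = [[h * (a / 3 + h / 12), h * (a / 6 + h / 12)],
                [h * (a / 6 + h / 12), h * (a / 3 + h / 4)]]
        # integral of t * N_p over the element
        load = [h * (a / 2 + h / 6), h * (a / 2 + h / 3)]
        for p in range(2):
            f[e + p] -= load[p]
            for q in range(2):
                S[e + p][e + q] += stiff[p][q] - mass[p][q]
    # Impose x(1) = 0 by dropping the last node's row and column.
    u = solve([row[:n] for row in S[:n]], f[:n])
    return u + [0.0]

if __name__ == "__main__":
    u = fem_hat(40)
    print("x(0)   ~", u[0])
    print("x(0.5) ~", u[20])
```

With a single element the only hat function is $1-t$, so `fem_hat(1)` reproduces the one-term Galerkin value of $-0.4$ at $t=0$. No condition is imposed at $t=0$: the factor $t$ in the weak form removes the boundary term there.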
Worked from the site author's own teaching notes on the Ritz-Galerkin method. References cited there: Optimization by Variational Methods, Morton M. Denn, McGraw-Hill Book Company, 1969; Conduction Heat Transfer, V. Arpaci, Addison-Wesley Publishing Company, 1966.
Everything here is one spatial dimension. The 2D companion applies the identical method to a heated square plate — same trial-function idea, one more integral, and a genuinely interesting catch about trial functions and symmetry.
EngineeringCandy · Learn · the theory behind the playground