Ritz-Galerkin Method — EngineeringCandy

Worked from the site author's own teaching notes on the Ritz-Galerkin method. References cited there: Optimization by Variational Methods, Morton M. Denn, McGraw-Hill Book Company, 1969; Conduction Heat Transfer, V. Arpaci, Addison-Wesley Publishing Company, 1966.

Why this particular trial function

The trial functions here are φ(n,t) = tⁿ⁻¹(1−t), n = 1, …, N. That's not an arbitrary choice: each one is built to satisfy the boundary conditions automatically, so the approximation never has to fight them. The factor (1−t) makes every trial function vanish at t=1, matching x(1)=0 exactly. The factor tⁿ⁻¹ has a non-negative exponent (n ≥ 1), so each function stays finite at t=0, matching the "finite at the origin" condition. Whatever coefficients come out of the linear solve, the boundary conditions are satisfied for free.
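A minimal sketch of that claim in Python. The helper names `phi` and `approx` and the coefficient values are made up for illustration; only the formula tⁿ⁻¹(1−t) comes from the text.

```python
def phi(n, t):
    """Trial function t**(n-1) * (1 - t) for n = 1, 2, ..."""
    return t ** (n - 1) * (1.0 - t)


def approx(coeffs, t):
    """x(t) = sum over n of c_n * phi(n, t), with coeffs = [c_1, c_2, ...]."""
    return sum(c * phi(n, t) for n, c in enumerate(coeffs, start=1))


# Arbitrary, made-up coefficients: the boundary behaviour does not depend on them.
coeffs = [0.7, -1.3, 2.5]
print(approx(coeffs, 1.0))  # every term carries the factor (1 - t), so this vanishes
print(approx(coeffs, 0.0))  # finite: only the n = 1 term survives at t = 0
```

Changing the coefficients changes the curve in between, but never the value at t=1 or the finiteness at t=0.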

Forcing the residual to "disappear" — on average

Plug an approximate x(t) = Σ cᵢφᵢ(t) into the ODE and you don't get zero. You get a leftover residual R(t), because a finite combination of trial functions can't satisfy the differential equation pointwise. Galerkin's idea: instead of demanding R(t)=0 everywhere (impossible with only a few terms), demand that R(t) be orthogonal to every trial function over the domain, ∫₀¹ R(t)·φᵢ(t) dt = 0 for each i = 1, …, N. That's N equations for N unknown coefficients. Because the ODE is linear, R is linear in the coefficients, so the system is ordinary linear algebra, the same kind of system as the Gauss-Seidel lab solves (just small enough here to solve directly).
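The orthogonality conditions can be sketched in a few lines of Python. This section does not restate the ODE, so the sketch assumes the model problem (t x′)′ + t x = t on 0 ≤ t ≤ 1, which fits the boundary conditions and the J₀-based exact solution mentioned on this page; swap in the real operator if it differs. The function names are illustrative, not the lab's own code.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

t = P([0.0, 1.0])  # the polynomial "t"


def integral01(p):
    """Exact integral of a polynomial over [0, 1]."""
    q = p.integ()
    return q(1.0) - q(0.0)


def operator(p):
    """ASSUMED left-hand side of the ODE: (t x')' + t x."""
    return (t * p.deriv()).deriv() + t * p


def galerkin_coeffs(N):
    """Solve  sum_j c_j * int(phi_i * L[phi_j]) = int(phi_i * t)  for c."""
    basis = [t ** (n - 1) * (1 - t) for n in range(1, N + 1)]
    A = np.array([[integral01(bi * operator(bj)) for bj in basis] for bi in basis])
    b = np.array([integral01(bi * t) for bi in basis])  # assumed right-hand side: t
    return np.linalg.solve(A, b)


print(galerkin_coeffs(1))
print(galerkin_coeffs(2))
```

Under the assumed ODE, the N=1 case can be checked by hand: the single equation is −(5/12)c₁ = 1/6, so c₁ = −2/5. The trial functions are polynomials, so every integral is computed exactly rather than by numerical quadrature.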

Why "Ritz" and "Galerkin" share a name

Ritz's method starts from a different place: minimize a variational functional (an energy-like integral) directly over the trial coefficients. Galerkin's method starts from the differential equation and forces a weighted residual to vanish. For a self-adjoint operator — and the Bessel/Sturm-Liouville operator here is one — both routes produce the exact same linear system. That coincidence is why the combined name "Ritz-Galerkin" is used even though the two methods start from opposite ends.
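A hedged numerical check of that coincidence, again assuming the model problem (t x′)′ + t x = t, which this section does not state explicitly. For that problem the functional J[x] = ∫₀¹ (½ t x′² − ½ t x² + t x) dt is the natural candidate. Integrating by parts, with φᵢ(1)=0 and the factor t vanishing at t=0, gives ∂J/∂cᵢ = −∫₀¹ R φᵢ dt, so the Ritz equations are the Galerkin equations multiplied by −1 and have the same solution. All names below are illustrative.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

t = P([0.0, 1.0])


def integral01(p):
    q = p.integ()
    return q(1.0) - q(0.0)


def basis(N):
    return [t ** (n - 1) * (1 - t) for n in range(1, N + 1)]


def galerkin_system(N):
    """Weighted residual route: A[i, j] = int phi_i * L[phi_j], b[i] = int phi_i * t."""
    B = basis(N)
    L = lambda p: (t * p.deriv()).deriv() + t * p  # assumed operator
    A = np.array([[integral01(bi * L(bj)) for bj in B] for bi in B])
    b = np.array([integral01(bi * t) for bi in B])
    return A, b


def ritz_system(N):
    """Variational route: set dJ/dc_i = 0 for the assumed functional J."""
    B = basis(N)
    K = np.array([[integral01(t * bi.deriv() * bj.deriv() - t * bi * bj)
                   for bj in B] for bi in B])
    f = np.array([-integral01(t * bi) for bi in B])
    return K, f


A, b = galerkin_system(3)
K, f = ritz_system(3)
print(np.allclose(K, -A), np.allclose(f, -b))                  # same system up to sign
print(np.allclose(np.linalg.solve(A, b), np.linalg.solve(K, f)))  # same coefficients
```

The Ritz matrix only needs first derivatives of the trial functions, while the Galerkin matrix as written needs second derivatives. The integration by parts that links them is also the step that turns the strong form into the weak form.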

More terms, better fit — verified against the real Bessel function

With N=1 the approximation is a single straight line, c₁(1−t), and it's noticeably off: about 30% error at t=0. By N=4 the curve is visually indistinguishable from the exact J₀-based solution. This is the general promise of the method: a small, well-chosen trial space can get you most of the way to an exact answer without ever solving the differential equation in closed form. It's the same spirit (if not the same machinery) as the Euler/RK4 numerical ODE solver, just trading time-stepping for a single linear solve.
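The convergence claim can be reproduced with a short script. It rests on two assumptions that this section does not state outright: the ODE is (t x′)′ + t x = t, and the exact solution is therefore x(t) = 1 − J₀(t)/J₀(1). J₀ is evaluated from its power series so that the sketch needs nothing beyond NumPy; the function names are illustrative.

```python
from math import factorial

import numpy as np
from numpy.polynomial import Polynomial as P

t = P([0.0, 1.0])


def integral01(p):
    q = p.integ()
    return q(1.0) - q(0.0)


def galerkin_coeffs(N):
    """Galerkin solve for the ASSUMED problem (t x')' + t x = t, x(1) = 0."""
    basis = [t ** (n - 1) * (1 - t) for n in range(1, N + 1)]
    L = lambda p: (t * p.deriv()).deriv() + t * p
    A = np.array([[integral01(bi * L(bj)) for bj in basis] for bi in basis])
    b = np.array([integral01(bi * t) for bi in basis])
    return np.linalg.solve(A, b)


def j0(x, terms=20):
    """Power series for the Bessel function J0 (ample accuracy on [0, 1])."""
    return sum((-1) ** k * (x * x / 4.0) ** k / factorial(k) ** 2 for k in range(terms))


def exact(x):
    """ASSUMED exact solution: 1 - J0(t) / J0(1)."""
    return 1.0 - j0(x) / j0(1.0)


def rel_error_at_origin(N):
    c = galerkin_coeffs(N)
    # At t = 0 only phi_1 is nonzero (phi_1(0) = 1), so x_approx(0) = c_1.
    return abs(c[0] - exact(0.0)) / abs(exact(0.0))


for N in range(1, 5):
    print(N, rel_error_at_origin(N))
```

Under these assumptions the N=1 error at t=0 comes out at roughly 30%, matching the figure quoted above, and the error drops quickly as terms are added.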

Where this leads: finite elements

Everything here uses global trial functions — each φᵢ is nonzero across the whole domain. The Finite Element Method is the same weighted-residual idea applied to local, piecewise trial functions that are nonzero only on a small chunk of the domain — trading a few high-order global polynomials for many simple local ones. The bookkeeping looks different, but the core equation — ∫(residual)·(trial function) = 0 — is identical.
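To make the comparison concrete, here is a rough finite-element sketch for the same assumed model problem, (t x′)′ + t x = t with x(1)=0 (the ODE is an assumption, as is everything about the mesh and the function name). It uses piecewise-linear "hat" functions on a uniform mesh and the weak form ∫₀¹ (t x′ v′ − t x v) dt = −∫₀¹ t v dt, which is the weighted-residual equation after one integration by parts.

```python
import numpy as np


def fem_solve(M):
    """Linear finite elements on M equal elements of [0, 1] (assumed model problem)."""
    nodes = np.linspace(0.0, 1.0, M + 1)
    K = np.zeros((M + 1, M + 1))
    f = np.zeros(M + 1)
    gp, gw = np.polynomial.legendre.leggauss(2)  # 2-point Gauss: exact for cubics

    for e in range(M):
        a, b = nodes[e], nodes[e + 1]
        h = b - a
        tq = 0.5 * (a + b) + 0.5 * h * gp   # quadrature points mapped into the element
        wq = 0.5 * h * gw                   # matching weights
        shape = np.array([(b - tq) / h, (tq - a) / h])  # the two local hat functions
        dshape = np.array([-1.0 / h, 1.0 / h])          # their constant slopes
        for i in range(2):
            for j in range(2):
                K[e + i, e + j] += np.sum(
                    wq * tq * (dshape[i] * dshape[j] - shape[i] * shape[j]))
            f[e + i] += -np.sum(wq * tq * shape[i])

    # Essential condition x(1) = 0: drop the last node from the solve.
    # Nothing is imposed at t = 0; the factor t removes the boundary term there.
    x = np.zeros(M + 1)
    x[:-1] = np.linalg.solve(K[:-1, :-1], f[:-1])
    return nodes, x


nodes, x = fem_solve(50)
print(x[0], x[-1])
```

Each hat function overlaps only its neighbours, so K is tridiagonal. That sparsity is what the local trial functions buy in exchange for needing many more unknowns than the four-term global expansion.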


