- script: website
- lecture notes: Moodle
- exercises: website, CodeExpert
- hand-in: https://app.discuna.com
- solutions + source code: gitlab
- old exercises: website
- community solutions: https://exams.vis.ethz.ch/category/numericalmethodsforcse
- exam
- mid+endterm: 10min reading + 30min writing; closed-book; optional
- final exam: 30min reading + 180min writing; open-book; CodeExpert + paper
exercises
- practice classes
- final exam: [course script, Eigen documentation, C++ documentation]
- mid-/endterm (optional for bonus): closed book
The course content is quite interesting, but the material is really chaotic. Subsubsubsubsections, missing links, wrongly named files, incoherent structure, an overly detailed script - and on top of all that, the course is taught in a flipped-classroom mode, which does not help.
The course is far too much effort for only 9 credits, but it is probably the most important course for CSE, so I would recommend paying attention and trying to do as much as possible.
-> 20250918_NumCSE_Eigen_cheatsheet
-> HS2025_tasks
Exercises
week8
-
- look at solution again for b) https://people.math.ethz.ch/~grsam/NumMeth/HOMEWORK/6-10-2-1:.pdf
week9
week 10
week 11
problem 12
Videos
week 1
week 2
week 3
week 4
week 5
week 6
week 7 (no review questions solved)
week 8 (no review questions solved)
week 9 (no review questions solved)
week 10
week 11
week 12
Midterm
- [ ]
Q&A
#timestamp 2025-11-24
Q&A Week 11
When choosing a quadrature rule, always first consider Gauss (global)
- benefits from smoothness / analyticity of the integrand
- if the integrand is not globally smooth, but piecewise smooth on a mesh M (smooth inside the mesh cells, only non-smooth at the mesh nodes) -> composite Q.R. will cvg. alg. with max rate -> use composite Q.R.
- equidistant trapezoidal rule is exact for trigonometric polynomials (periodic functions); use it if
  - the integrand has an analytic extension (beyond the integration interval) and
  - the integration is over a full period
Lecture
#timestamp 20250918
Video 1.4: Complexity introduction, tricks to improve complexity
tricks to improve complexity
- exploit associativity
- hidden summation
but
- reuse of intermediate results
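A quick sketch of the associativity trick (my own illustration, names hypothetical): for y = a·bᵀ·x, forming the rank-1 matrix a·bᵀ first costs O(n²), computing the scalar bᵀx first only O(n).
#include <Eigen/Dense>
// exploit associativity: y = (a*b^T)*x  vs.  y = a*(b^T*x)
Eigen::VectorXd rankOneApply(const Eigen::VectorXd& a, const Eigen::VectorXd& b,
                             const Eigen::VectorXd& x) {
  // slow O(n^2) variant (forms the dense rank-1 matrix first):
  //   Eigen::VectorXd y = (a * b.transpose()) * x;
  return a * b.dot(x);  // fast O(n) variant: compute the scalar b^T x first
}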
Video 1.5: Machine numbers, relative errors, gram-schmidt orthogonalisation
Video 1.5.4 cancellation + how to avoid
cancellation in difference quotients
-> cancellation in numerator
-> divided by tiny h -> error "blows up"
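A minimal sketch (my own, assuming f = exp as test function) that shows the blow-up: the one-sided difference quotient for exp'(0) = 1 first improves as h shrinks, then deteriorates once cancellation in the numerator dominates.
#include <cmath>
#include <cstdio>
int main() {
  for (double h = 1e-1; h > 1e-16; h /= 10.0) {
    const double dq = (std::exp(0.0 + h) - std::exp(0.0)) / h;  // cancellation in numerator
    std::printf("h = %.0e, error = %.2e\n", h, std::abs(dq - 1.0));
  }
}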
#todo look at Vieta's formula
#todo where to find review questions?
Video 1.5.5 Numerical Stability
Problem

stable algorithm
Video 2.1
square LSE -> unique solution, if the system matrix is regular (invertible)
(do not use matrix inversion function to solve LSE with numerical libraries)
Video 2.1.0.3

Video 2.2.2 Sensitivity/Conditioning of LSE
quantifies how small (relative) perturbation of data lead to changes of the output

condition of a matrix
if the condition number of a matrix is very large, its columns/rows are almost linearly dependent
Video 2.3 Gaussian Elimination (GE), LU-decomposition
- Gaussian elimination -> cost O(n^3)
- LU-decomposition -> A = LU
-> three-stage splitting of GE: LU-decomposition (O(n^3)), forward substitution, backward substitution (both O(n^2))
X = A.lu().solve(B)
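A minimal sketch of the recommended pattern (helper name mine): decompose once, then solve, instead of ever forming A.inverse().
#include <Eigen/Dense>
Eigen::VectorXd solveLSE(const Eigen::MatrixXd& A, const Eigen::VectorXd& b) {
  Eigen::PartialPivLU<Eigen::MatrixXd> lu(A);  // O(n^3) decomposition, done once
  return lu.solve(b);                          // O(n^2) forward/backward substitution
}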
Video 2.6 Exploiting structure when solving linear systems

Video 2.7.1 sparse matrix: how to store
Video 2.7.2 sparse matrices in
Eigen
standard CRS/CCS format
#include <Eigen/Sparse>
Eigen::SparseMatrix<double, Eigen::RowMajor> Bsp(rows, cols);
// initialise from a list of (row, col, value) triplets
std::vector<Eigen::Triplet<double>> triplets;
Bsp.setFromTriplets(triplets.begin(), triplets.end());
alternative: allocate enough space at the start (see 2.7.2.1) / squeeze out zeros afterwards -> slightly more efficient than triplets, but the number of non-zeros is not always known in advance
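A hedged sketch of that alternative (the example entry and the per-column bound nnzPerCol are placeholders):
#include <Eigen/Sparse>
void buildSparse(Eigen::SparseMatrix<double>& A, int nnzPerCol) {
  A.reserve(Eigen::VectorXi::Constant(A.cols(), nnzPerCol));  // allocate space up front
  A.insert(0, 0) = 1.0;   // example entry; insert(i,j) only once per position
  A.makeCompressed();     // squeeze out the unused reserved space
}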
Video 2.7.3 direct solutions of sparse LSE (LGS)
Eigen::SparseLU<Eigen::SparseMatrix<double>> solver;
solver.compute(A);                    // factorise the (column-major) sparse matrix A
Eigen::VectorXd x = solver.solve(b);  // solve A*x = b
-> sparse solution is stable
(sparse elimination for combinatorial graph laplacian: asymptotic runtime
=> in practice: Cost(sparse solution of
Video 3.0.1 Overdetermined linear systems
Video 3.1.1 Least squares solution
idea: find a vector x that minimizes the residual norm ||Ax - b||
notation: lsq(A,b)
- least squares solutions always exist
Video 3.1.2 solving least square problems
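A minimal sketch (helper name mine) of two standard ways to compute a least-squares solution with Eigen, assuming A has full column rank:
#include <Eigen/Dense>
Eigen::VectorXd lsqSolve(const Eigen::MatrixXd& A, const Eigen::VectorXd& b,
                         bool useQR = true) {
  if (useQR)  // (i) orthogonal transformation (numerically preferable)
    return A.colPivHouseholderQr().solve(b);
  // (ii) normal equations A^T A x = A^T b (cheaper, but squares the condition number)
  return (A.transpose() * A).llt().solve(A.transpose() * b);
}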


Video 3.1.3 Moore-Penrose-Pseudoinverse
LSQ solution not unique, if:
- FRC (full-rank condition) is violated
-> additional selection criterion: among all least-squares solutions, take the one with minimal norm (generalized solution)


4.1 Filters and Convolutions
Video 4.1.1 Filters and Convolutions
discrete finite linear time-invariant causal channels/filters
- finite: the output stops after some time
- time-invariant: the result does not change if the input arrives now or 1 min later
- linear: compatible with addition + multiplication by a scalar
- causal: output only after the onset of the input
Video 4.1.2 LT-FIR linear Mappings + convolutions
- convolutions are commutative
Video 4.1.3 Discrete Convolutions

Video 4.1.4 Periodic convolutions
Video 4.2.1 Diagonalizing circulant matrices
n-th root of unity ω_n (ω_n^n = 1)
- the scaled Fourier matrix F_n / sqrt(n) is unitary
- the Fourier matrix diagonalizes circulant matrices
Video 4.2.2 Discrete Convolution via DFT

#include <unsupported/Eigen/FFT>
// periodic discrete convolution of u and x: invert the componentwise product of the spectra
Eigen::FFT<double> fft;
Eigen::VectorXcd res = fft.inv(((fft.fwd(u)).cwiseProduct(fft.fwd(x))).eval());
explanation:

Video 4.2.3 Frequency filtering via DFT

Video 4.2.5 Two dimensional DFT

Video 4.2 FFT
Video 4.5 Toeplitz Matrix Techniques
Toeplitz matrices:
- data sparse (not sparse)

Video 5.1 AI (Abstract Interpolation)
get a unique solution, if the interpolation (collocation) matrix is invertible
-> the interpolant can then be recovered by forming a matrix-vector product
#todo write down end of video
Video 5.2 Uni-Variate Polynomials
Polynomials of degree ≤ n form a vector space of dimension n+1
convention for storing polynomials: as the vector of monomial coefficients
Video 5.2.2 Polynomial Interpolation Theory

B1 - Lagrangian basis
B2 - Newton basis
B3 - monomial basis
Video 5.2.3 Polynomial Interpolation Algorithms


-> adding one more data point (update) is handled efficiently
Video 5.2.3.3 Extrapolation to Zero
Video 5.2.3.4 Newton Basis
Horner-like scheme
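A sketch of that Horner-like evaluation of the Newton form (my own helper; the nodes t and the divided-difference coefficients a are assumed precomputed):
#include <Eigen/Dense>
// p(x) = a_0 + (x-t_0)(a_1 + (x-t_1)(... + (x-t_{n-1}) a_n))
double evalNewtonForm(const Eigen::VectorXd& t, const Eigen::VectorXd& a, double x) {
  const int n = static_cast<int>(a.size()) - 1;
  double y = a(n);
  for (int k = n - 1; k >= 0; --k) y = a(k) + (x - t(k)) * y;  // nested multiplication
  return y;
}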

Video 5.2.4 Polynomial Interpolation: sensitivity
sensitivity: amplification of perturbation of data (of a problem)

-> tiny errors will have huge impact on the interpolation result
-> not suitable for data interpolation
6. Approximations of functions in 1D
Video 6.1 Introduction
in 5: create a function that combines given data points (interpolation)
in 6: find an easier function, because the one from 5 is too costly
-> find a simple (easy to evaluate) function with small approximation error (in some norm)
Video 6.2.1 Polynomial approximation: theory
Bernstein theorem: polynomial approximation of a continuous function converges
- but not how close, how fast -> disappointing
Jackson theorem: quantitative bound on the max-norm of the best-approximation error

how to move from a general interval to the reference interval: affine transformation
- max norm stays the same
- preserves the degree of polynomials
- derivatives: change uniformly (by a constant factor)
Video 6.2.2 Convergence of interpolation errors


what can you tell from algebraic convergence?
-> how to increase polynomial degree to achieve certain reduction of error
Video 6.2.2.2 Interpolands of finite smoothness
Runge's counterexample: f(t) = 1/(1 + t^2) on [-5,5]
-> polynomial interpolation in equidistant nodes does not converge; strong oscillations near the interval endpoints
Video 6.2.2.3
real analytic function
-> f is real analytic, if it has a convergent Taylor series at every point of the interval



Video 5.2.3.1 Chebyshev Interpolation
The Chebyshev nodes are the zeros of the Chebyshev polynomials; they cluster towards the endpoints of the interval
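A small sketch (my own helper) for the Chebyshev nodes t_j = cos((2j+1)/(2(n+1))·π), mapped affinely to an interval [a,b]:
#include <Eigen/Dense>
#include <cmath>
Eigen::VectorXd chebNodes(int n, double a, double b) {
  const double pi = std::acos(-1.0);
  Eigen::VectorXd t(n + 1);
  for (int j = 0; j <= n; ++j) {
    const double ref = std::cos((2.0 * j + 1.0) / (2.0 * (n + 1.0)) * pi);  // node on [-1,1]
    t(j) = a + 0.5 * (b - a) * (ref + 1.0);                                 // map to [a,b]
  }
  return t;
}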


Video 5.2.3.2 Chebyshev Interpolation Error Estimates
equidistant nodes are affected by oscillations close to the endpoints
the Lebesgue constant links the best approximation error and the interpolation error
Lebesgue constant for Chebyshev nodes: grows only logarithmically in the number of nodes
-> increases very slowly
(compare with equidistant nodes: grows exponentially)
trick: plug the Lebesgue constant in between the interpolation error and the best approximation error

Suitable integration path: ellipse around [-1,1]
-> cos parametrization, to cancel out with the Chebyshev polynomials
-> exponential convergence of Chebyshev interpolation for analytic functions
slower for smaller domain of analyticity
Video 6.5 Approximation by Trigonometric Polynomials
space of 1-periodic trigonometric polynomials of degree 2n
Video 6.5.2 Trigonometric Interpolation Error Estimates (Fourier series)
#todo didn't quite understand the video

->

Video 6.6.1 Piecewise Polynomial Lagrange Interpolation
approximation by piecewise polynomials
piece-wise polynomials; benefit:
- locality
- cheap algorithm
- shape preservation

piecewise Lagrange interpolation of a sufficiently smooth function
-> error decreases algebraically in the mesh width h
7. Numerical Quadrature
Video 7.1 Introduction
Numerical Quadrature (Integration)
Video 7.2 Quadrature Formulas Rules
double f(double)

transformation:
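A minimal sketch of this transformation (my own helper, assuming the rule is given by nodes chat and weights what on the reference interval [-1,1]): the nodes are mapped affinely to [a,b] and the weights are scaled by (b-a)/2.
#include <Eigen/Dense>
template <class Function>
double applyQuadRule(Function&& f, const Eigen::VectorXd& chat,
                     const Eigen::VectorXd& what, double a, double b) {
  double I = 0.0;
  for (int j = 0; j < chat.size(); ++j)
    I += 0.5 * (b - a) * what(j) * f(a + 0.5 * (b - a) * (chat(j) + 1.0));
  return I;
}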


interpolation schemes (5) -> approximation schemes (6) (simple functions) -> quadrature schemes (7)
quadrature errors can often be computed easily, especially for quadrature rules arising from interpolation schemes
(bounded by the max norm of the interpolation error)
Video 7.3 Polynomial Quadrature Formulas
Use Lagrangian polynomial interpolation

- midpoint rule
- trapezoidal rule
Dangerous for high polynomial degree with equidistant nodes: weights of different sign -> cancellation
solution:
- Chebyshev nodes -> no cancellation, since the weights are all positive
error estimates for polynomial quadrature:

Video 7.4.1 Gauss Quadrature / Order of a Quadrature Rule
order of a polynomial n-pt. quadrature rule ≥ n: polynomials of degree n-1 are reproduced exactly by interpolation in n nodes, and exactness for degree n-1 means order n


7.4.1.7: how to compute weights to achieve order n by solving a suitable linear system:

-> solve for the weights to get a quadrature rule of at least order n (sketch below)
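A sketch of 7.4.1.7 (my own helper, reference interval [-1,1]): the order conditions sum_j w_j t_j^k = ∫_{-1}^{1} t^k dt for k = 0,...,n-1 form a linear system with a Vandermonde-type matrix.
#include <Eigen/Dense>
#include <cmath>
Eigen::VectorXd quadWeights(const Eigen::VectorXd& t) {
  const int n = static_cast<int>(t.size());
  Eigen::MatrixXd V(n, n);
  Eigen::VectorXd rhs(n);
  for (int k = 0; k < n; ++k) {
    for (int j = 0; j < n; ++j) V(k, j) = std::pow(t(j), k);
    rhs(k) = (k % 2 == 0) ? 2.0 / (k + 1) : 0.0;  // integral of t^k over [-1,1]
  }
  return V.fullPivLu().solve(rhs);  // weights of a rule of order >= n
}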
Video 7.4.2: Maximal-Order Quadrature Rules

how to find maximum order quadrature rule?

how to find n-point quadrature rule of order 2n?



Note that Hiptmair mostly proves lemmas indirectly.

Video 7.4.3 Quadrature Error Estimates

very important property: the quadrature error is bounded by the best-approximation error (estimates from chapter 6)
- limited smoothness of f (only finitely many derivatives, not analytic): algebraic convergence with rate r
- analytic integrand (analytic extension exists): exponential convergence
restoring smoothness (removing the singularity by transformation):
- integration by substitution (e.g. a substitution that removes a root singularity -> the transformed integrand becomes analytic) (more art than science, but can often be used)
The message of the asymptotic estimates:
- they tell us how much additional work ~ n is needed to reduce the error by a given factor
- algebraic convergence: n must grow by a fixed factor
- exponential convergence: n must grow by a fixed number of points
-> advantage of exponential convergence: a fixed number of extra points reduces the error by a given factor
Eigen::ArrayXd::LinSpaced(n,a,b)
builds a sequence of n equally spaced values from a to b
Video 7.5 Composite Quadrature
split the interval into a mesh, integrate over every mesh cell individually and sum up
-> cost ~ number of mesh cells (× cost of one local quadrature); sketch below
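A sketch of the idea for the composite trapezoidal rule (my own helper, arbitrary mesh):
#include <Eigen/Dense>
template <class Function>
double compTrapezoidal(Function&& f, const Eigen::VectorXd& mesh) {
  double I = 0.0;
  for (int j = 0; j + 1 < mesh.size(); ++j) {
    const double h = mesh(j + 1) - mesh(j);
    I += 0.5 * h * (f(mesh(j)) + f(mesh(j + 1)));  // local trapezoidal rule per cell
  }
  return I;
}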
good approximation for local error:

"h-covergence": convergence by making mesh finer

-> for equidistant meshes, use Gauss-Legendre Q.R. instead of composite quadrature
-> for trigonometric polynomials (periodic functions): use the equidistant trapezoidal rule
best approximation error estimated by trig. polynomials:

Video 7.6 Adaptive Quadrature (13 min)
goal: choose the mesh such that the local errors are equidistributed
-> choose the mesh a priori (based on prior knowledge about f) (often not available)
-> choose the mesh a posteriori (automatically, based on information gained during the computation)

with an initial mesh:
- compute an error estimate for each mesh cell: (Simpson - trapezoidal Q.R.)
- sum up to get an estimate for the total error
- if the error is too large, find the intervals that contribute above average to the total error
- add their midpoints to the new, refined mesh; start anew (sketch below)
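A rough sketch of this loop (my own, not the script's code), using |Simpson - trapezoidal| per cell as local error estimate; the refinement threshold err/m is just one possible choice.
#include <Eigen/Dense>
#include <cmath>
#include <vector>
template <class Function>
double adaptquad(Function&& f, std::vector<double> mesh, double rtol) {
  double I = 0.0;
  for (int it = 0; it < 100; ++it) {              // safeguard against endless refinement
    const int m = static_cast<int>(mesh.size()) - 1;
    Eigen::VectorXd est(m);
    I = 0.0;
    for (int j = 0; j < m; ++j) {
      const double a = mesh[j], b = mesh[j + 1], h = b - a;
      const double trp  = 0.5 * h * (f(a) + f(b));
      const double simp = h / 6.0 * (f(a) + 4.0 * f(0.5 * (a + b)) + f(b));
      est(j) = std::abs(simp - trp);              // local error estimate
      I += simp;
    }
    const double err = est.sum();                 // estimate for the total error
    if (err <= rtol * std::abs(I)) break;
    std::vector<double> refined;                  // refine cells contributing above average
    for (int j = 0; j < m; ++j) {
      refined.push_back(mesh[j]);
      if (est(j) > err / m) refined.push_back(0.5 * (mesh[j] + mesh[j + 1]));
    }
    refined.push_back(mesh.back());
    mesh.swap(refined);
  }
  return I;
}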
8. Iterative Methods for Non-linear Systems of Equations
Video 8.1 Iterative Methods for Non-Linear Systems of Equations: Introduction (5 min)
form: F(x) = 0 with F: D ⊂ R^n -> R^n
- n = number of unknowns/equations
- no general theory
- F often only given implicitly
8.2 Iterative Methods
Video 8.2.1 Fundamental Concepts (6 min)
solve by "getting closer and closer" iteratively
-> will this solution converge?
m-point method: the next iterate is computed from the last m iterates
- start with m initial guesses

consistent: vector at which iteration becomes stationary -> vector must be solution

-> consequence: assume
- the limit of the sequence exists (iteration convergent)
- the iteration function is consistent
- the iteration function is continuous in all its arguments
=> then the limit is a solution
local convergence:
- cvg. will heavily depend on the initial guess
- how fast does the sequence converge to the solution? -> 8.2.2
- what is the region of local convergence (admissible initial guesses)? -> partially answered by 8.3
Video 8.2.2 Speed of Convergence (15 min)
How fast will the norm converge?

smallest L: rate of cvg.
Detect linear convergence:


- p=1 -> linear convergence
- C usually unknown
Detect p
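A tiny sketch (my own) for estimating p from a sequence of known errors e_k via p ≈ log(e_{k+1}/e_k) / log(e_k/e_{k-1}):
#include <Eigen/Dense>
#include <cmath>
#include <iostream>
void estimateOrder(const Eigen::VectorXd& e) {
  for (int k = 1; k + 1 < e.size(); ++k)
    std::cout << "p ~ " << std::log(e(k + 1) / e(k)) / std::log(e(k) / e(k - 1)) << '\n';
}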

quadratic convergence (p=2): doubling of correct digits in each step
Video 8.2.3 Termination Criteria/Stopping Rules (14 min)
Ideal stopping rule:
stop the iteration once a prescribed absolute or relative tolerance on the error is reached
-> not usable in practice, since the exact solution is not known
practical stopping rules
- A priori termination -> STOP after a fixed no. of steps
- Residual-based termination -> STOP, when the residual norm is below a prescribed tolerance
  - tells us little about the actual error
- Correction-based termination -> stop a convergent iteration when the correction (difference of consecutive iterates) stops changing much

error estimate:

-> reliable error estimate, even if we only know an upper bound for the contraction factor
8.3 Fixed-Point Iterations
Video 8.3 Fixed-Point Iterations (12 min)
we do not always know




Video 8.4.1 Finding Zeros of Scalar Functions: Bisection (6 min)
find a zero of F
if F changes sign on the interval [a,b] (F(a)·F(b) < 0):
Idea: geometrically decrease the interval (halve it, keep the half containing the sign change)
-> linear convergence, factor 1/2
+ guaranteed convergence -> robust
+ only F-evaluations needed -> procedural F ok
+ derivative-free
+ works for merely continuous F
- not extremely fast
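A minimal sketch of the bisection idea (my own helper, assuming F(a)·F(b) < 0):
#include <cassert>
template <class Function>
double bisect(Function&& F, double a, double b, double tol) {
  double fa = F(a);
  assert(fa * F(b) < 0);                  // sign change required
  while (b - a > tol) {
    const double m = 0.5 * (a + b), fm = F(m);
    if (fa * fm <= 0) { b = m; }          // keep the half containing the sign change
    else              { a = m; fa = fm; }
  }
  return 0.5 * (a + b);
}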
Video 8.4.2.1 Newton Method in the Scalar Case (20 min)
Idea: replace the function locally around the current iterate by a simpler function whose zero is easy to compute
Newton: use the tangent at the current iterate
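A minimal sketch of the resulting scalar iteration x_{k+1} = x_k - F(x_k)/F'(x_k) (my own helper, F and dF assumed given as callables):
#include <cmath>
template <class Function, class Derivative>
double newton1d(Function&& F, Derivative&& dF, double x, double rtol, int maxit) {
  for (int k = 0; k < maxit; ++k) {
    const double s = F(x) / dF(x);   // Newton correction from the tangent line
    x -= s;
    if (std::abs(s) <= rtol * std::abs(x)) break;   // correction-based termination
  }
  return x;
}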



Ex. 8.4.2.4: functions with same
Find

If (F) is hard to solve but close to an easily invertible
Then
Newton on (G) converges faster and from farther away because the preconditioning makes the problem closer to the identity map.

Video 8.4.2.3 Multi-Point Methods (12 min)

secant method:

- replaces the derivative in Newton's method by a difference quotient
- 2-point method, requires 2 initial guesses
- derivative-free
- one F-evaluation per step
- if function flat, breaks down
- fractional order of convergence p ≈ 1.62

-> secant method more efficient (see 8.4.3) than Newton method!
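A minimal sketch of the secant iteration (my own helper; note the single new F-evaluation per step):
#include <cmath>
template <class Function>
double secant(Function&& F, double x0, double x1, double rtol, int maxit) {
  double f0 = F(x0);
  for (int k = 0; k < maxit; ++k) {
    const double f1 = F(x1);
    const double s = f1 * (x1 - x0) / (f1 - f0);   // difference quotient replaces F'
    x0 = x1; f0 = f1; x1 -= s;
    if (std::abs(s) <= rtol * std::abs(x1)) break;
  }
  return x1;
}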
8.4 Finding Zeros of Scalar Functions
Video 8.4.3 Asymptotic Efficiency of Iterative Methods for Zero Finding (9 min)


lower bound for number of required steps:


8.5 Newton's Method in R^n
Video 8.5.1 The Newton Iteration in Rn (I) (10 min)
The standard method for solving (generic) non-linear systems of equations.

the Newton method is affine invariant (invariant under multiplication of F with a regular matrix from the left)
-> this should also be a desirable property of stopping rules
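A minimal sketch of one Newton step in R^n with Eigen (my own helper; F and its Jacobian DF are assumed given as callables, and the linear system is solved instead of inverting DF):
#include <Eigen/Dense>
template <class Function, class Jacobian>
Eigen::VectorXd newton(Function&& F, Jacobian&& DF, Eigen::VectorXd x,
                       double rtol, int maxit) {
  for (int k = 0; k < maxit; ++k) {
    const Eigen::VectorXd s = DF(x).lu().solve(F(x));  // Newton correction
    x -= s;
    if (s.norm() <= rtol * x.norm()) break;            // correction-based termination
  }
  return x;
}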
Video 8.5.1.15 Multi-dimensional Differentiation (20 min)
a matrix-vector product is the general way to write a linear mapping


bilinear



Video 8.5.1 The Newton Iteration in Rn (II) (15 min)



Simplified Newton method: reuse the same Jacobian (and its LU decomposition) for all steps
- Drawback: usually sacrifices the asymptotic quadratic convergence of the Newton method: merely linear convergence can be expected.
#todo look video again to understand better
Video 8.5.2 Convergence of Newton’s Method (9 min)
Convergence of Newton's method:

Video 8.5.3 Termination of Newton Iteration (7 min)
terminate, when the estimated error drops below the prescribed tolerance
=> in practice: use the size of the (simplified) Newton correction as error estimate
-> affine invariant: the criterion does not change when F is multiplied by a regular matrix from the left
Video 8.5.4 Damped Newton Method (11 min)

=> goal: enlarge the domain of cvg.
- Newton correction in the wrong direction -> no remedy
- "Overshooting" Newton correction -> apply damping


if quadratic cvg. =>
if the NMT (natural monotonicity test) fails: reduce the damping factor & retry
if passed: accept the step and increase the damping factor again in the next step (8.5.4.2)
-> works because of the affine invariance of the Newton method
8.6 Quasi-Newton Method
Video 8.6 Quasi-Newton Method (15 min)
Drawback of Newton method: the Jacobian DF is required
- Goal: derivative-free iteration for NLSE
- for n = 1: secant method, 8.4.2.3

- replaces DF (the Jacobian) with a difference-quotient-like approximation


- No asymptotic quadratic cvg. for Broyden's quasi-Newton method (but still faster than e.g. the simplified Newton method)
- rank-1 modification of the approximate Jacobian in each step
- solve only one LSE (at the start); from then on only vector arithmetic (very cheap)

can have stability problems! (because of SMW-formula)
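A hedged sketch of a simple (dense, non-SMW) variant of Broyden's quasi-Newton method, just to show the rank-1 update; the efficient version in the script reuses one LU factorization plus the SMW formula instead of re-solving:
#include <Eigen/Dense>
template <class Function>
Eigen::VectorXd broyden(Function&& F, Eigen::VectorXd x, Eigen::MatrixXd J,
                        double rtol, int maxit) {
  Eigen::VectorXd Fx = F(x);
  for (int k = 0; k < maxit; ++k) {
    const Eigen::VectorXd s = J.lu().solve(-Fx);         // quasi-Newton correction
    x += s;
    const Eigen::VectorXd Fnew = F(x);
    J += (Fnew - Fx - J * s) * s.transpose() / s.squaredNorm();  // Broyden rank-1 update
    Fx = Fnew;
    if (s.norm() <= rtol * x.norm()) break;
  }
  return x;
}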
8.7 Non-linear Least Squares
Video 8.7 Non-linear Least Squares (7 min)
minimize Euclidean norm of residual (same as linear case)

- prefactor 1/2 is only a convention
- more general version of the linear least squares problem
linear case:

linear combination of basis functions
non-linear case:

Video 8.7.1 Non-linear Least Squares: (Damped) Newton Method (13 min)
with



then the LSE for the Newton correction

-> reduces to the normal equations in the linear case
Video 8.7.2 (Trust-region) Gauss-Newton Method (13 min)



Advantages over Newton-method:
+ second-derivative-free
+ Larger domain of cvg.
- no local quadratic cvg.
-> no global cvg., but we can also use damping (then called trust-region method)
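A minimal sketch of the (undamped) Gauss-Newton iteration for argmin_x ||F(x)|| (my own helper; F and its Jacobian DF assumed given):
#include <Eigen/Dense>
template <class Function, class Jacobian>
Eigen::VectorXd gaussNewton(Function&& F, Jacobian&& DF, Eigen::VectorXd x,
                            double rtol, int maxit) {
  for (int k = 0; k < maxit; ++k) {
    // correction solves the *linear* least-squares problem min_s ||DF(x) s + F(x)||
    const Eigen::VectorXd s = DF(x).colPivHouseholderQr().solve(-F(x));
    x += s;
    if (s.norm() <= rtol * x.norm()) break;
  }
  return x;
}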


11. Numerical Integration - Single Step Methods
Video 11.1 Initial-Value Problems (IVPs) for Ordinary Differential Equations (35 min)


- for autonomous ODEs, choosing the initial time t0 = 0 is natural
- time-invariant -> we can always fix the initial time t0 = 0 without losing generality


time-dependent ODEs can be converted to autonomous ODEs

conversion: from higher-order ODEs to first-order ODEs

- for the initial-value problem, we have to specify initial values for both the unknown function and its derivative(s), i.e. for all components of the equivalent first-order ODE
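A small worked example (my own): the second-order ODE y'' = f(y, y') becomes a first-order system for z := (y, y')^T, namely z' = (z_2, f(z_1, z_2))^T, with initial values z(0) = (y(0), y'(0))^T.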



left:
- trajectory of single initial point moving through space
- fix starting point, vary time
right:
- a set of points moves through space under the evolution operator
- time fixed, look how all spatial points move
=> as time progresses, the evolution operator takes a "part of the space" and "flows" it forward
Recovering the ODE from the Flow: the vector field
- for autonomous ODEs, the evolution operator satisfies the group property: Φ^(s+t) = Φ^s ∘ Φ^t