插值问题

物理问题 \rightarrow 离散化 \rightarrow 计算机处理

离散的物理数据,但不知道实际函数\rightarrow 构造近似函数作为实际函数f(x)f(x)的逼近


插值问题的数学表述: f(x)f(x)是定义在区间[a,b][a, b]的函数,(x0,y0), (x1,y1), (x2,y2)..., (xn,yn)(x_0, y_0),\ (x_1, y_1),\ (x_2, y_2)...,\ (x_n, y_n)是该区间上n+1n+1个不同的点。我们需要构造一个便于计算的函数P(x)P(x),s.t.

P(xi)=yi(1.1)P(x_i) = y_i \tag{1.1}

则称P(x)P(x)f(x)f(x)的插值函数,[a,b][a, b]是插值区间,(xi,yi)(x_i, y_i)是插值节点。同时在其他xxix \neq x_i点上,P(x)P(x)可以作为f(x)f(x)的近似。

多项式插值

Pn(x)=a0+a1x+a2x2+...+anxn,    a0,a1,...,anRP_n(x) = a_0 + a_1x + a_2x^2 + ... + a_nx^n,\;\; a_0, a_1, ..., a_n \in \mathbb{R},且满足插值条件1.1{1.1}

将插值条件写成线性方程组的形式:

{a0+a1x0+a2x02++anx0n=y0a0+a1x1+a2x12++anx1n=y1a0+a1xn+a2xn2++anxnn=yn\begin{cases} a_0 + a_1 x_0 + a_2 x_0^2 + \cdots + a_n x_0^n = y_0 \\ a_0 + a_1 x_1 + a_2 x_1^2 + \cdots + a_n x_1^n = y_1 \\ \vdots \\ a_0 + a_1 x_n + a_2 x_n^2 + \cdots + a_n x_n^n = y_n \end{cases}

这样可用矩阵语言表述:

(y0y1yn)=(1x0x0n1x1x1n1x2x2n1xnxnn)(a0a1an)\begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} 1 & x_0 & \cdots & x_0^n \\ 1 & x_1 & \cdots & x_1^n \\ 1 & x_2 & \cdots & x_2^n \\ 1 & x_n & \cdots & x_n^n \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix}

记作(1.2)(1.2)。上述系数矩阵是方阵(这很显然),记作XX。由线性代数的知识可知,当系数行列式X0|X| \neq 0时,齐次方程组Xt=0Xt = 0只有零解,对应的非齐次方程组Xt=bXt = b只存在一组特解。

求解:

Y=Xaa=X1YY = Xa\\ \Rightarrow a = X^{-1}Y

其中YYaa分别表示yiy_iaia_i对应的的列向量,X1X^{-1}表示XX的逆矩阵。

注意到系数矩阵恰巧时范德蒙德(Vandermonde)矩阵。由线性代数知识可知,Vandermonde行列式的值为(将对应矩阵记作V):

det(V)=1x0x0n1x1x1n1xnxnn=0j<in(xixj)0\det(V) = \begin{vmatrix} 1 & x_0 & \cdots & x_0^n \\ 1 & x_1 & \cdots & x_1^n \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & \cdots & x_n^n \end{vmatrix} = \prod_{0 \leq j < i \leq n}(x_i - x_j) \neq 0

Vandermonde行列式求解步骤:

已知det(V)=1x0x0n1x1x1n1xnxnn\det(V) = \begin{vmatrix} 1 & x_0 & \cdots & x_0^n \\ 1 & x_1 & \cdots & x_1^n \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & \cdots & x_n^n \end{vmatrix}

由行列式的性质:转置以及某行的kk倍加到另一行,行列式的值不变。

因此做变换:先转置,再从最后一行开始rix0ri1r_i - x_0r_{i-1}rir_i表示xix_i所在的行)得到

det(VT)=111x0x1xnx02x12xn2x0nx1nxnn=1110x1x0xnx00x12x1x0xn2xnx00x1nx1n1x0xnnxnn1x0=x1x0x2x0xnx0x1(x1x0)x2(x2x0)xn(xnx0)x1n1(x1x0)x2n1(x2x0)xnn1(xnx0)=i=1n(xix0)111x1x2xnx12x22xn2x1n1x2n1xnn1\begin{align*} \det(V^T) &= \begin{vmatrix} 1 & 1 & \cdots & 1 \\ x_0 & x_1 & \cdots & x_n\\ x_0^2 & x_1^2 & \cdots & x_n^2\\ \vdots & \vdots & \ddots & \vdots\\ x_0^n & x_1^n & \cdots & x_n^n \end{vmatrix}\\ &= \begin{vmatrix} 1 & 1 & \cdots & 1 \\ 0 & x_1-x_0 & \cdots & x_n-x_0\\ 0 & x_1^2-x_1x_0 & \cdots & x_n^2-x_nx_0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & x_1^n-x_1^{n-1}x_0 & \cdots & x_n^n-x_n^{n-1}x_0 \end{vmatrix}\\ &= \begin{vmatrix} x_1-x_0 & x_2-x_0 & \cdots & x_n-x_0\\ x_1(x_1-x_0) & x_2(x_2-x_0) & \cdots & x_n(x_n-x_0)\\ \vdots & \vdots &\ddots & \vdots\\ x_1^{n-1}(x_1-x_0) & x_2^{n-1}(x_2-x_0) & \cdots & x_n^{n-1}(x_n-x_0) \end{vmatrix}\\ & = \prod_{i=1}^n(x_i-x_0) \begin{vmatrix} 1 & 1 & \cdots & 1\\ x_1 & x_2 & \cdots & x_n\\ x_1^2 & x_2^2 & \cdots & x_n^2\\ \vdots & \vdots & \ddots & \vdots\\ x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \end{vmatrix} \end{align*}

可以看到又出现了Vandermonde行列式,不过此时阶数1-1。再进行一轮上述操作,又得到因子j=2n(xjx1)\prod_{j=2}^n (x_j-x_1)。以此类推,最终可得:

det(VT)=det(V)=i=1n(xix0)j=2n(xjx1)(xnxn1)=0j<in(xixj)\begin{align*} \det(V^T) &= \det(V) = \prod_{i=1}^n(x_i-x_0)\prod_{j=2}^n (x_j-x_1)\prod \cdots (x_n-x_{n-1})\\ &=\prod_{0 \leq j < i \leq n}(x_i - x_j) \end{align*}

定理:对于n+1n+1个互不相同的插值节点,满足插值条件的nn次多项式插值函数存在且唯一。

多项式插值方法一 —— 拉格朗日插值法

引入:线性插值

给定两个点(x0,y0)(x_0, y_0)(x1,y1)(x_1, y_1)求解

{a0+a1x0=y0a0+a1x1=y1\begin{cases} a_0+ a_1x_0 = y_0\\ a_0+ a_1x_1 = y_1 \end{cases}

解出:

{a0=y0x1y1x0x1x0a1=y0y1x0x1\begin{cases} a_0 = \frac{y_0x_1-y_1x_0}{x_1-x_0}\\ a_1 = \frac{y_0 - y_1}{x_0 - x_1} \end{cases}

可得

P1(x)=a0+a1x=y1x0y0x1x0x1+y0y1x0x1x=xx1x0x1y0xx0x0x1y1P_1(x) = a_0+ a_1x = \frac{y_1x_0-y_0x_1}{x_0-x_1} + \frac{y_0 - y_1}{x_0 - x_1}x =\frac{x-x_1}{x_0 - x_1}y_0 - \frac{x-x_0}{x_0-x_1}y_1

l0=xx1x0x1,  l1=xx0x1x0l_0 = \frac{x-x_1}{x_0-x_1}, \; l_1 = \frac{x-x_0}{x_1-x_0},那么 P1=l0(x)y0+l1(x)y1    l0(x),l1(x)P_1 = l_0(x) y_0 + l_1(x) y_1 \; \Rightarrow \; l_0(x), l_1(x)为插值基函数。

推广:拉格朗日插值

首先需要明白插值基函数的性质。它们满足在插值节点xkx_k上等于1,其余地方为0。即:

lk(xi)={1i=k0ikl_k(x_i) = \begin{cases} 1 \quad i = k\\ 0 \quad i\neq k \end{cases}

构造:

Pn(x)=k=0nlk(x)yk,lk(x)=jkxxjxkxj P_n(x) = \sum_{k=0}^n l_k(x) \cdot y_k, \quad l_k(x) = \prod_{j \neq k} \frac{x - x_j}{x_k - x_j}

其中lk(x)l_k(x)是基函数。

易验证拉格朗日插值基函数满足以下性质:

  • 归一性:lk(xi)={1i=k0o/wl_k(x_i) = \begin{cases} 1 \quad i = k\\ 0 \quad o/w \end{cases}
  • 多项式次数:每个lkl_knn次多项式。

举例理解

上面的线性插值已经是一个例子。现在看n=2n = 2

对于(x0,x1,x2)(x_0,x_1,x_2)

l0(x)=(xx1)(xx2)(x0x1)(x0x2)l1(x)=(xx0)(xx2)(x1x0)(x1x2)l2(x)=(xx0)(xx1)(x2x0)(x2x1)\begin{aligned} l_0(x) &= \frac{(x-x_1)(x-x_2)}{(x_0-x_1)(x_0-x_2)} \\ l_1(x) &= \frac{(x-x_0)(x-x_2)}{(x_1-x_0)(x_1-x_2)} \\ l_2(x) &= \frac{(x-x_0)(x-x_1)}{(x_2-x_0)(x_2-x_1)} \end{aligned}

更高阶同理。

线性插值误差

定理:设 f(x)f(x) 一阶连续可导,且 f(x)f''(x) 在区间 (a,b)(a, b) 上存在,则对任意 x[a,b]x \in [a, b],存在 ξ[a,b]\xi \in [a, b] 使得:

R(x)=f(x)P1(x)=f(ξ)2(xx0)(xx1)R(x) = f(x) - P_1(x) = \frac{f''(\xi)}{2}(x-x_0)(x-x_1)

这里的 P1(x)P_1(x) 是线性插值的基函数。

证明:

f(x0)=P1(x0);f(x1)=P1(x1)\because f(x_0) = P_1(x_0); \quad f(x_1) = P_1(x_1)

R(x)=f(x)P1(x)\therefore R(x) = f(x)-P_1(x)存在两个零点x0,x1x_0, x_1

\therefore 可以设 R(x)=k(x)(xx0)(xx1)R(x) = k(x)(x-x_0)(x-x_1),且:

f(x)P1(x)=k(x)(xx0)(xx1)(1.2)f(x)-P_1(x) = k(x)(x-x_0)(x-x_1) \tag{1.2}

构造辅助函数:

ϕ(t)=f(t)P1(t)k(x)(tx0)(tx1)\phi(t) = f(t) - P_1(t) - k(x)(t-x_0)(t-x_1)

则可以将1.2{1.2}看成ϕ(t)\phi(t)t=xt=x时的取值,且ϕ(x)=0\phi(x) = 0

则由以上讨论,ϕ(t)\phi(t)x0,x,x1x_0,x,x_1 处有零点。由Rolle定理,ϕ(t)\phi'(t)(x0,x1),(x1,x2)(x_0, x_1),\, (x_1, x_2) 上存在零点,分别记作 (ξ1,ξ2)(\xi_1,\xi_2)。再由罗尔定理,ϕ(t)\phi''(t)(ξ1,ξ2)(\xi_1,\xi_2) 有零点,记作 ξ\xi

ϕ(ξ)=0\therefore \phi''(\xi) = 0ϕ(t)\phi(t) 两边求二阶导得:

ϕ(t)=f(t)P1(t)2k(x)=0\phi''(t) = f''(t) - P_1''(t) - 2k(x) = 0

P1(t)\because P_1(t)是线性插值基函数,仅有一次项 P1(x)=0\therefore P_1''(x) = 0 \quad \Rightarrow f(ξ)=2k(x)f''(\xi) = 2k(x)

再代回原式即可得证。

一般形式的误差估计: 对于 nn 次插值多项式 Pn(x)P_n(x)

Rn(x)=f(x)Pn(x)=f(n+1)(ξ)(n+1)!ωn+1(x)R_n(x) = f(x) - P_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}\omega_{n+1}(x)

其中:

ωn+1(x)=i=0n(xxi)\omega_{n+1}(x) = \prod_{i=0}^n (x-x_i)

可以看出:节点越密集 ((xxi)(x-x_i) 越小),误差越小。

很多情况下,f(x)f(x) 未知,f(n+1)f^{(n+1)}也难以得知,因此可采用以下方法估计误差:

使用 n+2n+2 个数据点进行两次插值运算:前 n+1n+1 点构造 Pn(x)P_n(x);后 n+1n+1 点构造 P^n(x)\hat{P}_n(x)

两次的误差分别是:

Rn(x)=f(x)Pn(x)=f(n+1)(ξ1)(n+1)!(xx0)(xx1)(xxn)R^n(x)=f(x)P^n(x)=f(n+1)(ξ2)(n+1)!(xx1)(xx2)(xxn+1)R_n(x) = f(x) - P_n(x) = \frac{f^{(n+1)}(\xi_1)}{(n+1)!}(x-x_0)(x-x_1)\cdots(x-x_n)\\ \hat{R}_n(x) = f(x) - \hat{P}_n(x) = \frac{f^{(n+1)}(\xi_2)}{(n+1)!}(x-x_1)(x-x_2)\cdots(x-x_{n+1})

整理可得误差关系式:

f(x)Pn(x)xx0x0xn+1(Pn(x)P^n(x))f(x) - P_n(x) \approx \frac{x-x_0}{x_0-x_{n+1}}(P_n(x)-\hat{P}_n(x))

这里说说得到这个误差关系式的具体步骤:

已知

Rn(x)=f(x)Pn(x)=f(n+1)(ξ1)(n+1)!(xx0)(xx1)(xxn)R^n(x)=f(x)P^n(x)=f(n+1)(ξ2)(n+1)!(xx1)(xx2)(xxn+1)R_n(x) = f(x) - P_n(x) = \frac{f^{(n+1)}(\xi_1)}{(n+1)!}(x-x_0)(x-x_1)\cdots(x-x_n)\\ \hat{R}_n(x) = f(x) - \hat{P}_n(x) = \frac{f^{(n+1)}(\xi_2)}{(n+1)!}(x-x_1)(x-x_2)\cdots(x-x_{n+1})

我们假设 f(n+1)(ξ1)f(n+1)(ξ2)f^{(n+1)}(\xi_1) \approx f^{(n+1)}(\xi_2),即高阶导数在 ξ1\xi_1ξ2\xi_2 处的值相近。于是可以将两个误差表达式中的高阶导数项消去:

Rn(x)R^n(x)(xx0)(xx1)(xxn)(xx1)(xx2)(xxn+1)=xx0xxn+1\frac{R_n(x)}{\hat{R}_n(x)} \approx \frac{(x - x_0)(x - x_1) \cdots (x - x_n)}{(x - x_1)(x - x_2) \cdots (x - x_{n+1})} = \frac{x - x_0}{x - x_{n+1}}

Rn(x)xx0xxn+1R^n(x)\Rightarrow R_n(x) \approx \frac{x - x_0}{x - x_{n+1}} \hat{R}_n(x)

由于R^n(x)=f(x)P^n(x)\hat{R}_n(x) = f(x) - \hat{P}_n(x),则代入可得:

f(x)Pn(x)xx0xxn+1(f(x)P^n(x))f(x) - P_n(x) \approx \frac{x - x_0}{x - x_{n+1}} (f(x) - \hat{P}_n(x))

f(x)f(x) 的项移到左边:

f(x)(1xx0xxn+1)Pn(x)xx0xxn+1P^n(x)f(x) \left(1 - \frac{x - x_0}{x - x_{n+1}}\right) \approx P_n(x) - \frac{x - x_0}{x - x_{n+1}} \hat{P}_n(x)

f(x)x0xn+1xxn+1Pn(x)xx0xxn+1P^n(x)\Rightarrow f(x) \frac{x_0-x_{n+1}}{x-x_{n+1}} \approx P_n(x)-\frac{x - x_0}{x - x_{n+1}} \hat{P}_n(x)

f(x)xxn+1x0xn+1[Pn(x)xx0xxn+1P^n(x)]=xxn+1x0xn+1Pn(x)xx0x0xn+1P^n(x)=x0xn+1+xx0x0xn+1Pn(x)xx0x0xn+1P^n(x)=Pn(x)+xx0x0xn+1[Pn(x)P^n(x)]\begin{aligned} \Rightarrow f(x) &\approx \frac{x-x_{n+1}}{x_0-x_{n+1}}[P_n(x)-\frac{x - x_0}{x - x_{n+1}} \hat{P}_n(x)]\\ &=\frac{x-x_{n+1}}{x_0-x_{n+1}}P_n(x) - \frac{x-x_0}{x_0-x_{n+1}}\hat{P}_n(x)\\ &=\frac{x_0-x_{n+1}+x-x_0}{x_0-x_{n+1}}P_n(x) - \frac{x-x_0}{x_0-x_{n+1}} \hat{P}_n(x)\\ &=P_n(x) + \frac{x-x_0}{x_0-x_{n+1}}[P_n(x) - \hat{P}_n(x)] \end{aligned}

移项整理即可得到误差关系式。

拉格朗日插值法的优点/缺点:

  • 优点:
    • 显示公式:无需解线性方程组,直接构造多项式。
    • 理论直观
  • 缺点:
    • 计算效率低:新增节点需重新计算所有基函数。

代码实现

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
def Pn_x(x_variables, x_data, y_data):
"""
计算拉格朗日插值多项式在 x_variables 处的值

参数:
x_variables: float 或 array-like, 插值点的自变量值
x_data: list 或 array-like, 已知数据点的自变量值
y_data: list 或 array-like, 已知数据点的因变量值

返回:
float 或 array-like, 插值多项式在 x_variables 处的值
"""
n = len(x_data) # 数据点数量
result = 0.0 # 初始化结果

for i in range(n):
term = y_data[i] # 初始化当前项
# 计算拉格朗日基函数
for j in range(n):
if j != i:
term *= (x_variables - x_data[j]) / (x_data[i] - x_data[j])
result += term
return result

多项式插值方法二——牛顿插值法

引入

目的:避开拉格朗日插值法“新增节点需重新计算所有基函数”这一问题。

由数学知识可知任何一个多项式都可以表示为:

Nn(x)=a0+k=1naki=0k1(xxi)N_n(x) = a_0 + \sum_{k=1}^n a_k \prod_{i=0}^{k-1} (x - x_i)

在牛顿形式下,我们可以认为此时 aka_k 对应的基函数是 kk 的多项式i=0k1(xxi)\prod_{i=0}^{k-1} (x - x_i),它们相互独立。因此有更高阶数加入的时候只需计算相应新的多项式系数即可。

计算步骤

  1. 将插值条件P(xi)=yiP(x_i) = y_i写成线性方程组的形式:

{y0=a0=f(x0)y1=y0+a1(x1x0)=f(x1)y2=y0+a1(x2x0)+a2(x2x0)(x2x1)=f(x2)...yn=y0+a1(xnx0)+a2(xnx0)(xnx1)+...+an(xnx0)(xnx1)...(xnxn1)=f(xn)\begin{cases} y_0 = a_0 = f(x_0)\\ y_1 = y_0 +a_1 (x_1-x_0) = f(x_1)\\ y_2 = y_0 + a_1 (x_2-x_0) + a_2(x_2-x_0)(x_2-x_1) = f(x_2)\\ ...\\ y_n = y_0 + a_1 (x_n-x_0) + a_2(x_n-x_0)(x_n-x_1) + ... + a_n(x_n-x_0)(x_n-x_1)...(x_n-x_{n-1}) = f(x_n) \end{cases}

  1. 引入差商的概念:
    (y0,y1)(y_0, y_1)我们引入一阶差商

F(x0,x1)=f(x1)f(x0)x1x0=a1F(x_0, x_1) = \frac{f(x_1) - f(x_0)}{x_1 - x_0} = a_1

类似可得更高阶的差商公式:

F(x0,x1,x2)=F(x2,x1)F(x0,x1)x2x0F(x0,...,xn)=F(x1,...,xn)F(x0,...,xn1)xnx0F(x_0, x_1,x_2) = \frac{F(x_2, x_1)-F(x_0, x_1)}{x_2-x_0}\\ \cdots \\ F(x_0,...,x_n) = \frac{F(x_1,...,x_n)-F(x_0,...,x_{n-1})}{x_n-x_0}

根据以上公式我们可以改写差商形式的牛顿插值多项式:

Pn(x)=a0+k=1naki=0k1(xxi)=F(x0)+k=1nF(x0,...,xk)i=0k1(xxi)\begin{aligned} P_n(x) &= a_0 + \sum_{k=1}^n a_k \cdot \prod_{i=0}^{k-1} (x - x_i) \\ &= F(x_0) + \sum_{k=1}^n F(x_0,...,x_k) \cdot \prod_{i=0}^{k-1}(x-x_i) \end{aligned}

对应关系:ak=F(x0,...,xk)a_k = F(x_0, ..., x_k)

这里以 a2a_2 为例,详细展开系数的计算:

易知

a2=f(x2)f(x0)F(x0,x1)(x2x0)(x2x0)(x2x1)=f(x2)f(x0)f(x1)f(x0)x1x0(x2x0)(x2x0)(x2x1)=(f(x2)f(x0))(x1x0)(f(x1)f(x0))(x2x0)(x2x0)(x2x1)(x1x0)=f(x2)f(x0)(x2x0)(x2x1)f(x1)f(x0)(x1x0)(x2x1)(1.3)\begin{aligned} a_2 &= \frac{f(x_2)-f(x_0)-F(x_0, x_1)(x_2-x_0)}{(x_2-x_0)(x_2-x_1)}\\ &= \frac{f(x_2)-f(x_0)-\frac{f(x_1) - f(x_0)}{x_1 - x_0}(x_2-x_0)}{(x_2-x_0)(x_2-x_1)}\\ &= \frac{(f(x_2)-f(x_0))(x_1-x_0)-(f(x_1)-f(x_0))(x_2-x_0)}{(x_2-x_0)(x_2-x_1)(x_1-x_0)} \\ &= \frac{f(x_2)-f(x_0)}{(x_2-x_0)(x_2-x_1)} - \frac{f(x_1) - f(x_0)}{(x_1-x_0)(x_2-x_1)}\\ \end{aligned} \tag{1.3}

而更常见的形式为:

a2=f(x2)f(x1)x2x1f(x1)f(x0)x1x0x2x0(1.4)a_2 = \frac{\frac{f(x_2) - f(x_1)}{x_2 - x_1}-\frac{f(x_1) - f(x_0)}{x_1 - x_0}}{x_2-x_0} \tag{1.4}

下证(1.3)(1.3)(1.4)(1.4)等价性:

(1.3)=(f(x2)f(x0))(x1x0)(f(x1)f(x0))(x2x0)(x2x0)(x2x1)(x1x0)=(x1f(x2)x0f(x2)x1f(x0)+x0f(x0))(x2f(x1)x0f(x1)x2f(x0)+x0f(x0))(x2x1)(x2x0)(x1x0)=x1f(x2)x0f(x2)x1f(x0)x2f(x1)+x0f(x1)+x2f(x0)(x2x1)(x2x0)(x1x0)=f(x2)(x1x0)+f(x1)(x0x2)+f(x0)(x2x1)(x2x1)(x2x0)(x1x0)\begin{aligned} (1.3) &= \frac{(f(x_2)-f(x_0))(x_1-x_0)-(f(x_1)-f(x_0))(x_2-x_0)}{(x_2-x_0)(x_2-x_1)(x_1-x_0)}\\ &= \frac{(x_1f(x_2)-x_0f(x_2)-x_1f(x_0)+x_0f(x_0))-(x_2f(x_1)-x_0f(x_1)-x_2f(x_0)+x_0f(x_0))}{(x_2-x_1)(x_2-x_0)(x_1-x_0)}\\ &= \frac{x_1f(x_2)-x_0f(x_2)-x_1f(x_0)-x_2f(x_1)+x_0f(x_1)+x_2f(x_0)}{(x_2-x_1)(x_2-x_0)(x_1-x_0)}\\ &= \frac{f(x_2)(x_1-x_0)+f(x_1)(x_0-x_2)+f(x_0)(x_2-x_1)}{(x_2-x_1)(x_2-x_0)(x_1-x_0)} \end{aligned}

(1.4)=f(x2)f(x1)(x2x1)(x2x0)f(x1)f(x0)(x1x0)(x2x0)=(x1x0)(f(x2)f(x1))(x2x1)(f(x1)f(x0))(x2x1)(x2x0)(x1x0)=x1f(x2)x1f(x1)x0f(x2)+x0f(x1)(x2f(x1)x2f(x0)x1f(x1)+x1f(x0))(x2x1)(x2x0)(x1x0)=x1f(x2)x0f(x2)+x0f(x1)x2f(x1)+x2f(x0)x1f(x0)(x2x1)(x2x0)(x1x0)=f(x2)(x1x0)+f(x1)(x0x2)+f(x0)(x2x1)(x2x1)(x2x0)(x1x0)\begin{aligned} (1.4) &= \frac{f(x_2)-f(x_1)}{(x_2-x_1)(x_2-x_0)} - \frac{f(x_1)-f(x_0)}{(x_1-x_0)(x_2-x_0)}\\ &= \frac{(x_1-x_0)(f(x_2)-f(x_1))-(x_2-x_1)(f(x_1)-f(x_0))}{(x_2-x_1)(x_2-x_0)(x_1-x_0)}\\ &= \frac{x_1f(x_2)-x_1f(x_1)-x_0f(x_2)+x_0f(x_1)-(x_2f(x_1)-x_2f(x_0)-x_1f(x_1)+x_1f(x_0))}{(x_2-x_1)(x_2-x_0)(x_1-x_0)}\\ &= \frac{x_1f(x_2)-x_0f(x_2)+x_0f(x_1)-x_2f(x_1)+x_2f(x_0)-x_1f(x_0)}{(x_2-x_1)(x_2-x_0)(x_1-x_0)}\\ &= \frac{f(x_2)(x_1-x_0) + f(x_1)(x_0-x_2) + f(x_0)(x_2-x_1)}{(x_2-x_1)(x_2-x_0)(x_1-x_0)} \end{aligned}

可见 (1.3)    (1.4)(1.3) \iff (1.4)

而由定义可得

F(x2,x1)=f(x2)f(x1)x2x1F(x_2, x_1) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}

所以根据(1.4)(1.4),我们可以把 a2a_2 写成

a2F(x0,x1,x2)=F(x1,x2)F(x0,x1)x2x0a_2 \triangleq F(x_0, x_1, x_2) = \frac{F(x_1, x_2) - F(x_0, x_1)}{x_2-x_0}

这就是上面给出的形式。
(所以怎么直接由(1.3)(1.3)推出(1.4)(1.4)……求大神评论区指点。我自己也会想想办法的。)

牛顿差商的性质

  1. 线性叠加:
    利用数学归纳法可以证明:

    F(x0,x1,...,xm)=j=0mf(xj)(xjx0)(xjxj1)(xjxj+1)(xjxm)F(x_0, x_1, ..., x_m) = \sum^m_{j=0}\frac{f(x_j)}{(x_j-x_0)\cdots(x_j-x_{j-1})(x_j-x_{j+1})\cdots(x_j-x_m)}

  2. 线性性:
    h(x)=αf(x)+βg(x)h(x) = \alpha f(x) + \beta g(x),则:

H(x0,...,xn)=αF(x0,...,xn)+βG(x0,...,xn)H(x_0,...,x_n) = \alpha F(x_0,...,x_n) + \beta G(x_0,...,x_n)

  1. 对称性
    差商值与节点排列顺序无关:

F(x0,x1,...,xn)=F(xσ(0),xσ(1),...,xσ(n))F(x_0,x_1,...,x_n) = F(x_{\sigma(0)},x_{\sigma(1)},...,x_{\sigma(n)})

其中σ\sigma是任意排列

  1. 差商和导数关系:
    f(x)Cn[a,b]f(x)\in C^n[a,b],则存在 ξ(a,b)\xi\in(a,b) 使得:

F(x0,...,xn)=f(n)(ξ)n!F(x_0,...,x_n) = \frac{f^{(n)}(\xi)}{n!}

  1. 差商表可加性:
    新增节点xn+1x_{n+1}时:

F(x0,...,xn+1)=F(x1,...,xn+1)F(x0,...,xn)xn+1x0F(x_0,...,x_{n+1}) = \frac{F(x_1,...,x_{n+1}) - F(x_0,...,x_n)}{x_{n+1}-x_0}

其中F(x1,...,xn+1)F(x_1,...,x_{n+1})可由性质1算出。

代码实现

1
def P_n(x_variables, x_data, y_data)

龙格(Runge)现象

指在高次多项式插值中,随着插值节点数量的增加,插值结果在区间端点附近出现剧烈振荡的现象。这种现象由德国数学家Carl Runge在研究多项式插值时首次发现。

  • 这说明使用高次多项式插值并不总能提高准确性。

解决方法:多次线性插值
将插值区间划分为若干子区间,在每个子区间上用线性多项式进行插值的方法。给定节点(x0,y0),(x1,y1),...,(xn,yn)(x_0,y_0),(x_1,y_1),...,(x_n,y_n),我们构造的插值函数应该满足:

  • S(xi)=yiS(x_i) = y_i (通过所有节点)
  • S(x)S(x)在每个[xi,xi+1][x_i,x_{i+1}]上是线性函数
  • 整体函数连续(C0C^0连续)

具体实现步骤:

  1. 找到待插值点xx及其所在的区间[xk,xk+1][x_k,x_{k+1}]
  2. 计算:P1(x)=xxk+1xkxk+1yk+xxkxk+1xkyk+1P_1(x) = \frac{x-x_{k+1}}{x_k - x_{k+1}}y_k + \frac{x-x_k}{x_{k+1} - x_k}y_{k+1}(参照拉格朗日插值的形式)

缺陷:C1C^1不连续,无法保证插值节点的光滑性。

样条插值

样条插值(Spline Interpolation)是一种分段低次多项式插值方法,通过强制拼接点处的光滑条件来保证整体曲线的平滑性。(相当于多次线性插值的一种解决方案)

原理

插值函数需要满足:

  • Si(xi)=yiS_i(x_i)=y_i, Si(xi+1)=yi+1S_i(x_{i+1})=y_{i+1}(插值条件)
  • 连续性:
    • C0C^0: Si1(xi)=Si(xi)S_{i-1}(x_i) = S_i(x_i)
    • C1C^1: Si1(xi)=Si(xi)S'_{i-1}(x_i) = S'_i(x_i)
    • C2C^2: Si1(xi)=Si(xi)S''_{i-1}(x_i) = S''_i(x_i)

最常用:三次样条插值函数,即在每段[xi,xi+1][x_i,x_{i+1}]上构造:

Si(x)=ai+bi(xxi)+ci(xxi)2+di(xxi)3S_i(x) = a_i + b_i(x-x_i) + c_i(x-x_i)^2 + d_i(x-x_i)^3

n+1n+1个插值点,总共有nn个区间,那么总参数有4n4n个。

约束条件(加起来一共4n4n个):

  • 插值条件:提供 n+1n+1
  • 连续性:提供 3(n1)3(n-1)
  • 边界条件:提供2个

边界条件还能分为以下三种情况:

  • 第一类边界条件:一阶导数值指定,即S(x0)=y0S'(x_0)=y'_0S(xn)=ynS'(x_n)=y'_n
  • 第二类边界条件:二阶导数值指定,即S(x0)=y0S''(x_0)=y''_0S(xn)=ynS''(x_n)=y''_n
    • 自然边界条件:S(x0)=S(xn)=0S''(x_0)=S''(x_n)=0
  • 周期性边界条件:S(x0)=S(xn)S'(x_0)=S'(x_n)S(x0)=S(xn)S''(x_0)=S''(x_n)

求解步骤

以上面提到的三次样条插值为例,对于x[xk,xk+1)x \in [x_k, x_{k+1})

Sk(x)=ak+bkh+ckh2+dkh3(h=xxk)Sk(x)=bk+2ckh+3dkh2Sk(x)=2ck+6dkh\begin{aligned} S_k(x) &= a_k + b_k h + c_k h^2 + d_k h^3 \quad (h = x-x_k) \\ S'_k(x) &= b_k + 2c_k h + 3d_k h^2 \\ S''_k(x) &= 2c_k + 6d_k h \end{aligned}

应满足条件:

  • Sk(xk)=ykak=ykS_k(x_k) = y_k \Rightarrow a_k = y_k(插值条件)
  • Sk(xk+1)=yk+1ak+hkbk+hk2ck+hk3dk=yk+1S_k(x_{k+1}) = y_{k+1} \Rightarrow a_k + h_k b_k + h_k^2 c_k + h_k^3 d_k = y_{k+1} (其中 hk=xk+1xkh_k = x_{k+1}-x_k)( C0C^0 连续性条件)
  • bk+2hkck+3hk2dk=bk+1b_k + 2h_k c_k + 3h_k^2 d_k = b_{k+1}C1C^1 连续性条件)
  • 2ck+6hkdk=2ck+12c_k + 6h_k d_k = 2c_{k+1}C2C^2 连续性条件)

求解该方程组:
mk=2ckm_k = 2c_k,则

  1. 由二阶导数连续条件:

    dk=mk+1mk6hkd_k = \frac{m_{k+1}-m_k}{6h_k}

  2. 代入C0C^0连续性条件得:

    bk=yk+1ykhkhk2mkhk6(mk+1mk)b_k = \frac{y_{k+1}-y_k}{h_k} - \frac{h_k}{2}m_k - \frac{h_k}{6}(m_{k+1}-m_k)

  3. 将上式代入C1C^1连续条件,整理得到:

hkmk+2(hk+hk+1)mk+1+hk+1mk+2=6(yk+2yk+1hk+1yk+1ykhk)h_k m_k + 2(h_k+h_{k+1})m_{k+1} + h_{k+1}m_{k+2} = 6\left(\frac{y_{k+2}-y_{k+1}}{h_{k+1}} - \frac{y_{k+1}-y_k}{h_k}\right)

  1. 加上两个边界条件的约束 \Rightarrow 建立线性方程组求解mkm_kmk+1m_{k+1}mk+2m_{k+2}以及对应的系数 aka_kbkb_kckc_kdkd_k

举例

例如,在 S(x0)=S(xn)=0S''(x_0) = S''(x_n) = 0 的自然边界条件下,(等会再给个详细的整理过程)

(10000h02(h0+h1)h1000h12(h1+h2)h2000hn22(hn2+hn1)hn100001)(m0m1m2mn1mn)=6(y2y1h1y1y0h0y3y2h2y2y1h1ynyn1hn1yn1yn2hn20)\begin{pmatrix} 1 & 0 & 0 & 0 & \cdots & 0 \\ h_0 & 2(h_0 + h_1) & h_1 & 0 & \cdots & 0 \\ 0 & h_1 & 2(h_1 + h_2) & h_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & h_{n-2} & 2(h_{n-2} + h_{n-1}) & h_{n-1} \\ 0 & 0 & \cdots & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} m_0 \\ m_1 \\ m_2 \\ \vdots \\ m_{n-1} \\ m_n \end{pmatrix} = 6 \begin{pmatrix} \frac{y_2 - y_1}{h_1} - \frac{y_1 - y_0}{h_0} \\ \frac{y_3 - y_2}{h_2} - \frac{y_2 - y_1}{h_1} \\ \vdots \\ \frac{y_n - y_{n-1}}{h_{n-1}} - \frac{y_{n-1} - y_{n-2}}{h_{n-2}} \\ 0 \end{pmatrix}

可以利用Gauss消元法 / LU分解法 / Gauss-Seidel迭代法求解上述矩阵。

代码实现

1
def 

数据拟合

数据拟合是通过数学模型近似描述离散数据点的方法,与插值的区别在于:

  • 插值:强制通过所有数据点
  • 拟合:允许存在偏差,追求整体趋势匹配

基本思想:偏差最小化。

设逼近函数ϕ(x)\phi(x)和真实数据yiy_i (i=1,2,...,ni=1, 2, ..., n)。则最小化偏差可以由以下方式定义:

L1范数(绝对偏差和):mini=1nϕ(xi)yiL2范数(平方偏差和):mini=1n(ϕ(xi)yi)2L范数(最大偏差):minmax1inϕ(xi)yiL_1\text{范数(绝对偏差和)}:\min \sum_{i=1}^n |\phi(x_i) - y_i|\\ L_2\text{范数(平方偏差和)}:\min \sum_{i=1}^n (\phi(x_i) - y_i)^2\\ L_{\infty}\text{范数(最大偏差)}:\min {\max_{1 \leq i \leq n}|\phi(x_i) - y_i|}

最小二乘法拟合

数学原理:最小化残差平方和

mini=1n(yif(xi))2\min \sum_{i=1}^n (y_i - f(x_i))^2

线性拟合

考虑f(x)=a+bxf(x) = a + bx

\therefore 残差平方和:Q(a,b)=i=1n(f(xi)yi)2=i=1n(a+bxiyi)2Q(a,b) = \sum_{i=1}^n (f(x_i) - y_i)^2 = \sum_{i=1}^n (a + b x_i - y_i)^2

为使 Q(a,b)Q(a, b) 最小,需满足:

{Qa=2i=1n(a+bxiyi)=0Qb=2i=1n(a+bxiyi)xi=0\begin{cases} \frac{\partial Q}{\partial a} = 2\sum_{i=1}^n (a + b x_i - y_i) = 0\\ \frac{\partial Q}{\partial b} = 2\sum_{i=1}^n (a + b x_i - y_i)x_i = 0 \end{cases}

将方程组整理为矩阵形式:

(nxixixi2)(ab)=(yixiyi)\begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum y_i \\ \sum x_i y_i \end{pmatrix}

求解(a,b)(a, b)即可。(同前,可以用Gauss消元法 / LU分解法 / Gauss-Seidel迭代法)

多项式拟合

k次多项式模型:

f(x)=j=0kajxj=a0+a1x+a2x2+...+akxkf(x) = \sum_{j=0}^k a_j x^j = a_0 + a_1 x + a_2 x^2 + ... + a_k x^k

残差表达式

Q(a0,a1,...,ak)=i=1n(a0+a1xi+a2xi2+...+akxikyi)2Q(a_0, a_1, ..., a_k) = \sum_{i=1}^n(a_0 + a_1 x_i + a_2 x_i^2 + ... + a_k x_i^k - y_i)^2

对系数求导:

{Qa0=2i=1n(a0+a1xi+a2xi2+...+akxikyi)=0Qa1=2i=1n(a0+a1xi+a2xi2+...+akxikyi)xi=0Qa2=2i=1n(a0+a1xi+a2xi2+...+akxikyi)xi2=0Qak=2i=1n(a0+a1xi+a2xi2+...+akxikyi)xik=0\begin{cases} \frac{\partial Q}{\partial a_0} = 2 \sum_{i=1}^n(a_0 + a_1 x_i + a_2 x_i^2 + ... + a_k x_i^k - y_i) = 0\\ \frac{\partial Q}{\partial a_1} = 2 \sum_{i=1}^n (a_0 + a_1 x_i + a_2 x_i^2 + ... + a_k x_i^k - y_i) \cdot x_i = 0\\ \frac{\partial Q}{\partial a_2} = 2 \sum_{i=1}^n (a_0 + a_1 x_i + a_2 x_i^2 + ... + a_k x_i^k - y_i) \cdot x_i^2 = 0\\ \vdots\\ \frac{\partial Q}{\partial a_k} = 2 \sum_{i=1}^n (a_0 + a_1 x_i + a_2 x_i^2 + ... + a_k x_i^k - y_i) \cdot x_i^k = 0 \end{cases}

整理:

{a01+a1xi++akxik=yia0xi+a1xi2++akxik+1=xiyia0xik+a1xik+1++akxi2k=xikyi\begin{cases} a_0 \sum 1 + a_1 \sum x_i + \cdots + a_k \sum x_i^k = \sum y_i \\ a_0 \sum x_i + a_1 \sum x_i^2 + \cdots + a_k \sum x_i^{k+1} = \sum x_i y_i \\ \vdots \\ a_0 \sum x_i^k + a_1 \sum x_i^{k+1} + \cdots + a_k \sum x_i^{2k} = \sum x_i^k y_i \end{cases}

整理成矩阵形式:

(nxixikxixi2xik+1xikxik+1xi2k)(a0a1ak)=(yixiyixikyi)\begin{pmatrix} n & \sum x_i & \cdots & \sum x_i^k\\ \sum x_i & \sum x_i^2 & \cdots & \sum x_i^{k+1} \\ \vdots & \vdots & \ddots & \vdots \\ \sum x_i^k & \sum x_i^{k+1} & \cdots & \sum x_i^{2k} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_k \end{pmatrix} = \begin{pmatrix} \sum y_i \\ \sum x_i y_i \\ \vdots \\ \sum x_i^k y_i \end{pmatrix}

求解即可。

非线性拟合

模型类型 变换方法 线性形式
指数模型 y=aebxy=ae^{bx} 取对数 lny=lna+bx\ln y = \ln a + bx
幂函数 y=axby=ax^b 取对数 lny=lna+blnx\ln y = \ln a + b\ln x
?对数模型 y=alnx+by=a\ln x + b - 直接拟合
倒数模型 y=xa+bxy = \frac{x}{a+bx} 取倒数 1y=ax+b\frac{1}{y} = \frac{a}{x} + b
  • 核心思想:参数线性化

容易出现的问题:欠拟合&过拟合。(有机器学习基础的应该都了解……不解释了。)

数值微分

利用离散的点进行数值微分。

Case 1:非边界点微分值的计算

假设数据点均匀分布,xi=x0+ihx_i = x_0 + i \cdot h,在xi1x_{i-1}xix_ixi+1x_{i+1}附近Taylor展开:

fi+1=fi+hfi+h22fi+h36fi+O(h4)f_{i+1} = f_i + h f'_i + \frac{h^2}{2} f''_i + \frac{h^3}{6} f'''_i + O(h^4)

fi1=fihfi+h22fih36fi+O(h4)f_{i-1} = f_i - h f'_i + \frac{h^2}{2} f''_i - \frac{h^3}{6} f'''_i + O(h^4)

就可以得到以下公式:

类型 公式 误差阶
前向差分 f(x)fi+1fihf'(x) \approx \frac{f_{i+1}-f_i}{h} O(h)O(h)
后向差分 f(x)fifi1hf'(x) \approx \frac{f_i-f_{i-1}}{h} O(h)O(h)
中心差分 f(x)fi+1fi12hf'(x) \approx \frac{f_{i+1}-f_{i-1}}{2h} O(h2)O(h^2)

再在xi2x_{i-2}xi+2x_{i+2}处展开:

fi+2=fi+2hfi+4h22!fi+8h33!fi+...f_{i+2} = f_i + 2h \cdot f_i' + \frac{4h^2}{2!} f_i'' + \frac{8h^3}{3!} f_i'''+...

fi2=fi2hfi+4h22!fi8h33!fi+...f_{i-2} = f_i - 2h \cdot f_i' + \frac{4h^2}{2!} f_i'' - \frac{8h^3}{3!} f_i'''+...

fi2+8fi+1fi+28fi1f_{i-2}+8f_{i+1}-f_{i+2}-8f_{i-1} 消去 fif_i'''fif_i''fif_i,可得:

fi=fi28fi1+8fi+1fi+212h+O(h4)f'_i = \frac{f_{i-2}-8f_{i-1}+8f_{i+1}-f_{i+2}}{12h} + O(h^4)

类似地,可以得到三点或五点的二阶微分:

fi=fi+12fi+fi1h2+O(h2)f_i'' = \frac{f_{i+1} - 2f_i + f_{i-1}}{h^2} + O(h^2)

fi=fi2+16fi130fi+16fi+1fi+212h2+O(h4)f_i'' = \frac{-f_{i-2} + 16f_{i-1} - 30f_i + 16f_{i+1} - f_{i+2}}{12h^2} + O(h^4)

Case 2:边界点微分值的计算

x0x_0为例。

  • 利用两点计算一阶微分:

    f(x0)=f1f0h+O(h)f'(x_0) = \frac{f_1-f_0}{h} + O(h)

    缺陷:精度只有O(h)O(h)

  • 利用三点计算一阶微分:

    {f0=f0f1=f0+hf0+h22!f0+O(h3)f2=f0+2hf0+4h22!f0+O(h3)\begin{cases} f_0 = f_0\\ f_1 = f_0 + h \cdot f_0' + \frac{h^2}{2!} \cdot f_0'' + O(h^3)\\ f_2 = f_0 + 2h \cdot f_0' + \frac{4h^2}{2!} \cdot f_0'' + O(h^3) \end{cases}

    通过他们的线性组合将 f0f_0'' 消去,我们可以有:

    f0=4f13f0f22h+O(h2)f_0' = \frac{4f_1 - 3f_0 - f_2}{2h} + O(h^2)

同理,xnx_n处利用三点计算一阶微分:

fn=3fn+fn24fn12h+O(h2)f_n' = \frac{3f_n + f_{n-2} - 4f_{n-1}}{2h} + O(h^2)

数值积分

实际中常见情况:f(x)f(x) 表达式未知,仅以离散点列 (xi,f(xi))(x_i,f(x_i)) 给出 or 很难求出 f(x)f(x) 的原函数。

\Rightarrow 可以用分割小区间再数值求和的方法求解积分。

由前面知识可知,拉格朗日插值Ln(x)=k=0nlk(x)f(xk)L_n(x) = \sum_{k=0}^n l_k(x) \cdot f(x_k),其中 lk(x)l_k(x)为插值基函数。用 Ln(x)L_n(x) 代替 f(x)f(x) 可得:

I(f)=abf(x)dx=f(x)换成Ln(x)abLn(x)dx=abk=0nlk(x)f(xk)dxI(f) = \int_a^b f(x) \text{d}x \overset{f(x)换成L_n(x)}{=} \int_{a}^{b} L_n(x) \text{d}x = \int_{a}^{b} \sum_{k=0}^{n} l_k(x) f(x_k) \text{d}x

这里的积分与求和可以交换顺序(别问我为什么……我不是学数学的):

abk=0nlk(x)f(xk)dx=i=0n[abli(x)dx]f(xi)=i=0nαif(xi)\int_{a}^{b} \sum_{k=0}^{n} l_k(x) f(x_k) \text{d}x = \sum_{i=0}^{n} \left[ \int_{a}^{b} l_i(x) dx \right] f(x_i) = \sum_{i=0}^{n} \alpha_i f(x_i)

此时 xix_i 是求积分节点,αi\alpha_i 是求积分系数。

Newton-Cotes积分方法: 对积分区间 [a,b][a,b]nn 等分,步长 h=banh = \frac{b-a}{n} 进行积分节点构造。

非复化数值积分

下面就以Newton-Cotes积分方法为例,查看一阶、二阶和多阶插值下的计算情况。

  1. 一阶梯形积分:我们知道两点插值 (n=1n=1)

    L1(x)=l0(x)y0+l1(x)y1=xx1x0x1y0+xx0x1x0y1L_1(x) = l_0(x)y_0 + l_1(x)y_1 = \frac{x-x_1}{x_0-x_1} \cdot y_0 + \frac{x-x_0}{x_1-x_0} \cdot y_1

    代回积分表达式:

    abf(x)dx=abL1(x)dx=abi=01li(x)f(xi)dx=abl0(x)dxf(a)+abl1(x)dxf(b)=ba2[f(a)+f(b)]\begin{aligned} \int_a^b f(x) \text{d}x = \int_a^b L_1(x) \text{d}x &= \int_a^b \sum_{i=0}^1 l_i(x) f(x_i) \text{d}x\\ &= \int_a^b l_0(x) \text{d}x \cdot f(a) + \int_a^b l_1(x) \text{d}x \cdot f(b)\\ &= \frac{b - a}{2} \left[ f(a) + f(b) \right] \end{aligned}

    【 这里计算一下abl0(x)dx\int_a^b l_0(x) \text{d}x

    abl0(x)dx=abxbabdx=1ab(x22bx)ab=12(ab)(ba)(ab)\int_a^b l_0(x) \text{d}x = \int_a^b \frac{x-b}{a-b} \text{d}x = \frac{1}{a-b} (\frac{x^2}{2}-bx)\mid^b_a = \frac{1}{2(a-b)}(b-a)(a-b)

    abl1(x)dx\int_a^b l_1(x) \text{d}x同理 】

    误差公式前面也已得出:

    R(x)=f(x)P1(x)=f(ξ)2(xx0)(xx1)R(x) = f(x) - P_1(x) = \frac{f''(\xi)}{2}(x-x_0)(x-x_1)

    代入积分:

    E1(x)=abf(ξ)2(xa)(xb)dx=f(ξ)2ab(xa)(xb)dx=f(ξ)12(ba)3\begin{aligned} E_1(x) &= \int_a^b \frac{f''(\xi)}{2}(x - a)(x - b) \text{d}x\\ &= \frac{f''(\xi)}{2} \int_a^b (x - a)(x - b) \text{d}x = -\frac{f''(\xi)}{12}(b - a)^3 \end{aligned}

  2. 二阶辛普森(Simpson)积分:n=2n=2,三点插值

    L2(x)=l0(x)y0+l1(x)y1+l2(x)y2=(xx1)(xx2)(x0x1)(x0x2)y0+(xx0)(xx2)(x1x0)(x1x2)y1+(xx0)(xx1)(x2x0)(x2x1)y2\begin{aligned} L_2(x) &= l_0(x) \cdot y_0 + l_1(x) \cdot y_1 + l_2(x) \cdot y_2\\ &= \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)}y_0 + \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)}y_1 + \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)}y_2 \end{aligned}

    带入积分表达式:

    abf(x)dx=abL2(x)dx=abi=02li(x)f(xi)dx=abl0(x)dxf(a)+abl1(x)dxf(a+b2)+abl2(x)dxf(b)=ba6[f(a)+4f(a+b2)+f(b)]\begin{aligned} \int_a^b f(x) \text{d}x = \int_a^b L_2(x) \text{d}x &= \int_a^b \sum_{i=0}^2 l_i(x) f(x_i) \text{d}x\\ &= \int_a^b l_0(x) \text{d}x \cdot f(a) + \int_a^b l_1(x) \text{d}x \cdot f\left(\frac{a+b}{2}\right) + \int_a^b l_2(x) \text{d}x \cdot f(b)\\ &= \frac{b-a}{6} \left[ f(a) + 4f\left(\frac{a+b}{2}\right) + f(b) \right] \end{aligned}

    (这里的误差项我不确定课件上的是否正确……[ac01])

    以此类推,可总结如下:(我懒得一一去算了……有空来填这个坑)

    Step size ( h ) Common name Formula Error term
    bab - a Trapezoidal rule h2(f0+f1)\frac{h}{2}(f_0 + f_1) 112h3f(ξ)-\frac{1}{12}h^3f''(\xi)
    ba2\frac{b - a}{2} Simpson’s rule h3(f0+4f1+f2)\frac{h}{3}(f_0 + 4f_1 + f_2) 190h5f(4)(ξ)-\frac{1}{90}h^5f^{(4)}(\xi)
    ba3\frac{b - a}{3} Simpson’s 3/8 rule 3h8(f0+3f1+3f2+f3)\frac{3h}{8}(f_0 + 3f_1 + 3f_2 + f_3) 380h5f(4)(ξ)-\frac{3}{80}h^5f^{(4)}(\xi)
    ba4\frac{b - a}{4} Boole’s rule 2h45(7f0+32f1+12f2+32f3+7f4)\frac{2h}{45}(7f_0 + 32f_1 + 12f_2 + 32f_3 + 7f_4) 8945h7f(6)(ξ)-\frac{8}{945}h^7f^{(6)}(\xi)

复化数值积分

Runge现象:在高次多项式插值中,随着插值节点数量的增加,插值结果在区间端点附近出现剧烈振荡。(前面已经提过)

\therefore 实际计算过程(复化数值积分思想):把积分区间分割成多个子区间,然后在每个子区间采用一阶梯形积分或二阶辛普森积分,再将所有子区间积分值加起来。

  1. 复化梯形积分:
    对被积函数 f(x)f(x) 在积分区间 [a,b][a, b] 内作等距分割,h=banh = \frac{b-a}{n},并且在每个区间内做线性插值,那么

    abf(x)dx=i=0nxixi+1f(x)dx=i=0nxixi+1L1(x)+Rn(x)dx=i=0n(h2[f(xi)+f(xi+1)]f(ηi)h312)=i=0n(h2[f(xi)+f(xi+1)]f(ηi)h312)h[12f(a)+i=1n1f(xi)+12f(b)]\begin{aligned} \int_{a}^{b} f(x) \text{d}x = \sum_{i=0}^{n} \int_{x_i}^{x_{i+1}} f(x) \text{d}x &= \sum_{i=0}^{n} \int_{x_i}^{x_{i+1}} L_1(x) + R_n(x) \text{d}x\\ &= \sum_{i=0}^{n} \left( \frac{h}{2} [f(x_i) + f(x_{i+1})] - f''(\eta_i) \frac{h^3}{12} \right)\\ &= \sum_{i=0}^{n} \left( \frac{h}{2} [f(x_i) + f(x_{i+1})] - f''(\eta_i) \frac{h^3}{12} \right)\\ &\approx h \left[ \frac{1}{2} f(a) + \sum_{i=1}^{n-1} f(x_i) + \frac{1}{2} f(b) \right] \end{aligned}

    同时我们可以估计其误差为(第二个等号到第三个等号,以及最后两个等号之间均用了积分中值定理):

    En(f)=i=0n(f(ηi)h312)=h212i=0n(f(ηi)h)=h212i=0nxixi+1f(x)dx=h212abf(x)dx=h212(ba)f(η)\begin{aligned} E_n(f) = \sum_{i=0}^{n} \left( -f''(\eta_i) \frac{h^3}{12} \right) &= \frac{h^2}{12}\sum_{i=0}^{n} (-f''(\eta_i) \cdot h) \\ &= -\frac{h^2}{12}\sum_{i=0}^{n}\int_{x_i}^{x_{i+1}} f''(x) \text{d}x \\ &= -\frac{h^2}{12} \int_{a}^{b} f''(x) \text{d}x\\ &= -\frac{h^2}{12} (b-a) f''(\eta)\\ \end{aligned}

    \because h=ban  h = \frac{b-a}{n} \; 代入上式:

    L.H.S.=(ba)312n2f(η),η(a,b)\text{L.H.S.}= -\frac{(b-a)^3}{12n^2} f''(\eta), \quad \eta \in (a, b)

  2. 复化 Simpson 积分:
    对各子区间做二阶多项式插值。

    对积分区间 [a,b][a,b] 内作偶数等距分割,n=2mn=2m,此时我们有 2m+12m+1 个积分节点和 mm 个积分区间,积分步长 h=banh = \frac{b-a}{n}。每次积分在 (x2i,x2i+1,x2i+2)(x_{2i}, x_{2i+1}, x_{2i+2})三点上进行:

    abf(x)dx=i=0m1x2ix2i+2f(x)dx=i=0m1{2h6[f(x2i)+4f(x2i+1)+f(x2i+2)](2h)52880f(4)(ηi)}(这里步长为2h)=h3[f(a)+4i=0m1f(x2i+1)+2i=1m1f(x2i)+f(b)]+En(f)\begin{aligned} \int_a^b f(x) \text{d}x &= \sum_{i=0}^{m-1} \int_{x_{2i}}^{x_{2i+2}} f(x) \text{d}x\\ &= \sum_{i=0}^{m-1} \{ \frac{2h}{6} \left[ f(x_{2i}) + 4f(x_{2i+1}) + f(x_{2i+2}) \right] - \frac{(2h)^5}{2880} f^{(4)}(\eta_i) \} (\text{这里步长为}2h)\\ &= \frac{h}{3} \left[ f(a) + 4 \sum_{i=0}^{m-1} f(x_{2i+1}) + 2 \sum_{i=1}^{m-1} f(x_{2i}) + f(b) \right] + E_n(f) \end{aligned}

    其中误差项

    En(f)=i=0m1(2h)52880f(4)(ηi)h4180(ba)f(4)(η)=(ba)5180n4f(4)(η)η(a,b)\begin{aligned} E_n(f)=\sum_{i=0}^{m-1}-\frac{(2h)^5}{2880} f^{(4)}(\eta_i) &\approx -\frac{h^4}{180}(b-a)f^{(4)}(\eta)\\ &= -\frac{(b-a)^5}{180n^4}f^{(4)}(\eta) \quad \eta \in (a,b) \end{aligned}

    (约化技巧与复化梯形积分误差项类似,注意这里的步长为2h2h。)

数据滤波

实际观测中永远免不了随机噪声。噪声分类:

  • 过程噪声:物理系统演化过程中系统本身的随机性
  • 测量噪声:做测量时测量行为带来的随机性

滤波方法一:平滑滤波

滤波方法二:卡尔曼滤波