Paper Notes: "The Emerging Field of Signal Processing on Graphs"


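Reflections

  After reading the first- and second-generation graph convolution papers, I was struck by how powerful graph convolution is. When I first encountered it, I did not understand at all why the Laplacian matrix ($L=D-W$) is used, and in particular the physical meaning behind it. By drawing on earlier papers, blog posts, and comments, I gradually built up some understanding. As a PhD student who has just started graduate school, I feel I need to study graph neural networks systematically.

  For these notes, thanks go to the paper's author David I Shuman and to the blogger 纯牛奶爱酸牛奶.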


Paper Information

  Authors: D. Shuman, S. Narang, P. Frossard, Antonio Ortega, P. Vandergheynst
  Sources: 2012, IEEE Signal Processing Magazine
  Paper: https://arxiv.org/pdf/1211.0053.pdf
  2528 Citations, 75 References


Abstract

  The emerging field of signal processing on graphs merges algebraic and spectral graph theoretic concepts with computational harmonic analysis to process such signals on graphs.


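  The paper outlines the main challenges of the field and discusses ways to define graph spectral domains, which are the analogues of the classical frequency domain. It also highlights the importance of incorporating the irregular structure of the graph data domain when processing graph signals (paper after paper mentions "exploiting structural properties", but nobody seems to explain it completely). It then reviews how to generalize common operators such as filtering, translation, modulation, dilation, and downsampling to the graph setting, and surveys the localized, multiscale transforms that have been proposed to extract high-dimensional features from graph data.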


1 Introduction 

    

1.1 The Main Challenges of Signal Processing on Graphs

   Challenges on Graph:

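  Besides handling the irregularity of the data domain, the graph structures in the applications mentioned earlier can carry features for a very large number of vertices. To scale well with the size of the data, graph signal processing techniques should rely on localized operations that, for each vertex, only use information from its neighbors or from vertices close to it.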

     The overarching challenges of processing signals on graphs:


2 The graph spectral domain

2.1 Weighted Graphs and Graph Signals

  Definitions of graph:

   Depending on the needs of the application, a common way to construct a weight matrix is to connect vertices with a thresholded Gaussian kernel, which yields a symmetric matrix. For more detail, see my blog post 《谱聚类原理总结》.

    $W_{i, j}=\left\{\begin{array}{ll}\exp \left(-\frac{[\operatorname{dist}(i, j)]^{2}}{2 \theta^{2}}\right) & \text { if } \operatorname{dist}(i, j) \leq \kappa \\0 & \text { otherwise }\end{array}\right.$

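  where $\operatorname{dist}(i, j)$ can be the Euclidean distance between the feature vectors of vertices $i$ and $j$, and $\theta$ and $\kappa$ are two parameters.

  Another way to generate a weight matrix is to connect each vertex to its $k$ nearest neighbors (KNN).

  A graph signal $f: \mathcal{V} \rightarrow \mathbb{R}$ assigns a value to each vertex of the graph. It can be written as a vector $\mathbf{f} \in \mathbb{R}^{N}$ whose $i^{th}$ component $f_i$ is the value at the $i^{th}$ vertex.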
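  As a minimal sketch of the thresholded Gaussian kernel above, the following NumPy snippet builds $W$ from a small set of feature vectors. The function name `gaussian_weights`, the example points, and the choice to zero the diagonal (no self-loops) are assumptions made for illustration, not something prescribed by the paper.

```python
import numpy as np

def gaussian_weights(points, theta, kappa):
    """Thresholded Gaussian kernel weights; `points` holds one feature vector per row."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))      # pairwise Euclidean distances
    W = np.exp(-dist ** 2 / (2.0 * theta ** 2))   # Gaussian kernel
    W[dist > kappa] = 0.0                         # drop edges longer than kappa
    np.fill_diagonal(W, 0.0)                      # assumption: no self-loops
    return W

# Hypothetical 2-D features for four vertices; the last one is far from the rest.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]])
W = gaussian_weights(points, theta=1.0, kappa=1.5)
print(W.round(3))
```

  With these values the first three vertices are mutually connected, while the fourth vertex is farther than $\kappa$ from all others and ends up isolated.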

2.2  The Non-Normalized Graph Laplacian

  The non-normalized graph Laplacian, also called the combinatorial graph Laplacian, is defined as

    $L:=\mathbf{D}-\mathbf{W}$

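  where $\mathbf{W}$ is the weight matrix and $\mathbf{D}$ is the diagonal degree matrix with entries $d_{i}=\sum_{j} W_{i, j}$.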

   The graph Laplacian is a difference operator. For any signal $\mathbf{f}$ it satisfies:

    $(L f)(i)=\sum \limits _{j \in \mathcal{N}_{i}} W_{i, j}[f(i)-f(j)]$

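  where $\mathcal{N}_{i}$ is the set of vertices connected to vertex $i$ by an edge.

  For the physical meaning behind this equation, see my other blog post 《图神经网络基础二:谱图理论》.

   Because the graph Laplacian $L$ is a real symmetric matrix, it has a complete set of orthonormal eigenvectors $\mathbf{u}_{k}$ with real eigenvalues $\lambda_{k}$:

    $L \mathbf{u}_{k}=\lambda_{k} \mathbf{u}_{k}$

   Collecting the eigenvectors as the columns of an orthogonal matrix $U$ and the eigenvalues on the diagonal of $\Lambda$, this can be written as: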

    $L=U \Lambda U^{-1}=U \Lambda U^{T}$
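  To make the decomposition concrete, here is a small NumPy sketch on a hypothetical 4-vertex path graph with unit weights (the graph is an assumption chosen for illustration). `np.linalg.eigh` is used because $L$ is symmetric; it returns the eigenvalues in ascending order and orthonormal eigenvectors as columns.

```python
import numpy as np

# Hypothetical example: path graph 0-1-2-3 with unit edge weights.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(W.sum(axis=1))      # degree matrix
L = D - W                       # non-normalized graph Laplacian

lam, U = np.linalg.eigh(L)      # eigenvalues ascending, eigenvectors in columns

print(lam.round(4))                               # smallest eigenvalue is 0
print(np.allclose(U @ np.diag(lam) @ U.T, L))     # L = U Lambda U^T
print(np.allclose(U.T @ U, np.eye(4)))            # U is orthogonal
```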

   Some notes here:

2.3  A Graph Fourier Transform and Notion of Frequency

   The classical Fourier transform is:

    $\hat{f}(\xi):=\left\langle f, e^{2 \pi i \xi t}\right\rangle=\int_{\mathbb{R}} f(t) e^{-2 \pi i \xi t} d t$

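  That is, the function $f$ is expanded in terms of eigenfunctions, much like a point in the plane is expressed by its coordinates.

  The complex exponentials are the eigenfunctions of the one-dimensional Laplace operator: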

    $-\Delta\left(e^{2 \pi i \xi t}\right)=-\frac{\partial^{2}}{\partial t^{2}} e^{2 \pi i \xi t}=(2 \pi \xi)^{2} e^{2 \pi i \xi t}\quad \quad \quad \quad(2)$

   That is, the negative divergence of the gradient of the eigenfunction. See 《图神经网络基础二:谱图理论》.

  Analogously, we can define the graph Fourier transform $\hat{f}$ of any function $\mathbf{f} \in \mathbb{R}^{N}$ on the vertices of $G$ as the expansion of $\mathbf{f}$ in terms of the eigenvectors of the graph Laplacian:

    $\hat{f}\left(\lambda_{l}\right):=\left\langle\mathbf{f}, \mathbf{u}_{l}\right\rangle=\sum\limits _{i=1}^{N} f(i) u_{l}^{*}(i)\quad \quad \quad \quad(3)$

  The inverse graph Fourier transform is:

    $f(i)=\sum \limits _{l=0}^{N-1} \hat{f}\left(\lambda_{l}\right) u_{l}(i)\quad \quad \quad \quad(4)$
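  Eqs. (3) and (4) amount to multiplying by $U^{T}$ and $U$. The sketch below assumes a real symmetric Laplacian (so the conjugate in Eq. (3) has no effect) and reuses a hypothetical 4-vertex path graph; the helper names `gft` and `igft` are illustrative only.

```python
import numpy as np

# Hypothetical example graph: path 0-1-2-3 with unit weights.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W
lam, U = np.linalg.eigh(L)

def gft(f, U):
    """Graph Fourier transform, Eq. (3): f_hat(lambda_l) = <f, u_l>."""
    return U.T @ f

def igft(f_hat, U):
    """Inverse graph Fourier transform, Eq. (4)."""
    return U @ f_hat

f = np.array([1.0, 2.0, 0.5, -1.0])
f_hat = gft(f, U)
f_rec = igft(f_hat, U)

print(np.allclose(f_rec, f))                            # perfect reconstruction
print(np.isclose(np.sum(f ** 2), np.sum(f_hat ** 2)))   # energy is preserved
```

  Because $U$ is orthogonal, the transform is invertible and preserves the energy of the signal.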

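   More on the classical Fourier transform: the eigenvalues $(2 \pi \xi)^{2}$ in Eq. (2) carry a notion of frequency. For $\xi$ close to zero the complex exponentials oscillate slowly, and for $\xi$ far from zero they oscillate rapidly.

  For the graph Laplacian, the eigenvalues and eigenvectors provide a similar notion of frequency. 

   This is demonstrated in Figure 2.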

    

  Analysis

    Figure 3 shows the number of zero crossings of each graph Laplacian eigenvector. The set of zero crossings of a signal $\mathbf{f}$ on a graph $\mathcal{G}$ is defined as:

    $\mathcal{Z}_{\mathcal{G}}(\mathbf{f}):=\{e=(i, j) \in \mathcal{E}: f(i) f(j)<0\}$

     

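   In other words, the larger the eigenvalue $\lambda$, the more zero crossings the associated eigenvector tends to have, i.e. the more rapidly it oscillates across the graph.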
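  The zero-crossing count can be checked numerically. The sketch below uses a hypothetical 8-vertex path graph with unit weights (an assumption for illustration; the paper's Figure 3 uses other graphs) and counts, for each Laplacian eigenvector, the edges whose endpoints have opposite signs.

```python
import numpy as np

def zero_crossings(f, W):
    """Number of edges (i, j) with f(i) f(j) < 0, counting each undirected edge once."""
    i, j = np.nonzero(np.triu(W, k=1))
    return int(np.sum(f[i] * f[j] < 0))

# Hypothetical example: path graph 0-1-...-7 with unit weights.
N = 8
W = np.zeros((N, N))
idx = np.arange(N - 1)
W[idx, idx + 1] = W[idx + 1, idx] = 1.0
L = np.diag(W.sum(axis=1)) - W

lam, U = np.linalg.eigh(L)   # eigenvalues in ascending order
counts = [zero_crossings(U[:, l], W) for l in range(N)]
print(counts)
```

  On this path graph the eigenvectors are sampled cosines, so the count grows by one with each eigenvalue: 0, 1, ..., 7.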

2.4  Graph Signal Representations in Two Domains

   The graph Fourier transform (3) and its inverse (4) give us a way to equivalently represent a signal in two different domains: the vertex domain and the graph spectral domain.

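   Figure 4 shows the graph Fourier coefficients at the different $\lambda_{l}$.

    

   Figure 4 shows two equivalent representations of the same graph signal, one in each domain. The graph Fourier coefficients of the smooth signal shown in Figure 4 decay rapidly. Such signals are compressible, because they can be closely approximated by just a few graph Fourier coefficients.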

2.5 Discrete Calculus and Signal Smoothness with Respect to the Intrinsic Structure of the Graph

  When we analyze signals, it is important to emphasize that properties such as smoothness are with respect to the intrinsic structure of the data domain.

  Next we introduce measures of smoothness. For the full derivation, see 《图神经网络基础二:谱图理论》.

  The edge derivative of a signal  $\mathbf{f}$  with respect to edge  $e=(i, j)$  at vertex  $i$  is defined as

    $\left.\frac{\partial \mathbf{f}}{\partial e}\right|_{i}:=\sqrt{W_{i, j}}[f(j)-f(i)]$

   The graph gradient of $f$ at vertex $i$ is the vector

    $\nabla_{i} \mathbf{f}:=\left[\left\{\left.\frac{\partial \mathbf{f}}{\partial e}\right|_{i}\right\}_{e \in \mathcal{E} \text { s.t. } e=(i, j) \text { for some } j \in \mathcal{V}}\right]$

   Then the local variation at vertex $i$ is:

    $\begin{aligned}\left\|\nabla_{i} \mathbf{f}\right\|_{2} &:=\left[\sum \limits _{e \in \mathcal{E} \text { s.t. } e=(i, j) \text { for some } j \in \mathcal{V}}\left(\left.\frac{\partial \mathbf{f}}{\partial e}\right|_{i}\right)^{2}\right]^{\frac{1}{2}} \\&=\left[\sum \limits_{j \in \mathcal{N}_{i}} W_{i, j}[f(j)-f(i)]^{2}\right]^{\frac{1}{2}}\end{aligned}$

  This provides a measure of local smoothness of $f$ around vertex $i$,  as it is small when the function $f$ has similar values at $i$ and all neighboring vertices of $i$.

  To measure the global smoothness over all vertices of the graph, we can define the discrete $p$-Dirichlet form of $f$:

    $S_{p}(\mathbf{f}):=\frac{1}{p} \sum \limits _{i \in V}\left\|\nabla_{i} \mathbf{f}\right\|_{2}^{p}=\frac{1}{p} \sum\limits_{i \in V}\left[\sum_{j \in \mathcal{N}_{i}} W_{i, j}[f(j)-f(i)]^{2}\right]^{\frac{p}{2}}\quad \quad \quad \quad(5)$

  When  $p=1$, $S_{1}(\mathbf{f})$  is the total variation of the signal with respect to the graph. When  $p=2$ , we have

    $\begin{aligned}S_{2}(\mathbf{f}) &=\frac{1}{2} \sum\limits _{i \in V} \sum\limits_{j \in \mathcal{N}_{i}} W_{i, j}[f(j)-f(i)]^{2} \\&=\sum\limits_{(i, j) \in \mathcal{E}} W_{i, j}[f(j)-f(i)]^{2}\\&=\mathbf{f}^{\mathrm{T}} L \mathbf{f}\end{aligned}\quad \quad \quad \quad(6)$

 The Laplace operator

    $\begin{aligned}\Delta f(x) &=\frac{\partial^{2} f}{\partial x^{2}} \\&=f^{\prime \prime}(x) \\& \approx f^{\prime}(x)-f^{\prime}(x-1) \\& \approx[f(x+1)-f(x)]-[f(x)-f(x-1)] \\&=f(x+1)+f(x-1)-2 f(x)\end{aligned}$

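The graph Laplacian

  On a graph, the second difference becomes a sum of differences over the neighbors of vertex $i$. With the sign convention of $L=\mathbf{D}-\mathbf{W}$, which corresponds to $-\Delta$ as in Eq. (2):

    $\Delta f_{i}=\sum \limits _{j \in N_{i}}\left(f_{i}-f_{j}\right)$

  If edge $E_{i j}$ carries a weight $W_{i j}$, this becomes:

    $\Delta f_{i}=\sum\limits_{j \in N_{i}} W_{i j}\left(f_{i}-f_{j}\right)$

  For any vector $f$, we have: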

    $\begin{aligned}f^{T} L f &=f^{T} D f-f^{T} W f \\&=\sum\limits_{i=1}^{N} d_{i} f_{i}^{2}-\sum\limits_{i=1}^{N} \sum\limits_{j=1}^{N} w_{i j} f_{i} f_{j} \\&=\frac{1}{2}\left(\sum\limits_{i=1}^{N} d_{i} f_{i}^{2}-2 \sum\limits_{i=1}^{N} \sum\limits_{j=1}^{N} w_{i j} f_{i} f_{j}+\sum\limits_{j=1}^{N} d_{j} f_{j}^{2}\right) \\&=\frac{1}{2}\left(\sum\limits_{i=1}^{N} \sum\limits_{j=1}^{N} w_{i j} f_{i}^{2}-2 \sum\limits_{i=1}^{N} \sum\limits_{j=1}^{N} w_{i j} f_{i} f_{j}+\sum\limits_{i=1}^{N} \sum\limits_{j=1}^{N} w_{i j} f_{j}^{2}\right) \\&=\frac{1}{2} \sum\limits_{i=1}^{N} \sum\limits_{j=1}^{N} w_{i j}\left(f_{i}-f_{j}\right)^{2}\end{aligned}$

   $S_{2}(\mathbf{f})$ is known as the graph Laplacian quadratic form, and the semi-norm $\|\mathbf{f}\|_{L}$ is defined as

    $\|\mathbf{f}\|_{L}:=\left\|L^{\frac{1}{2}} \mathbf{f}\right\|_{2}=\sqrt{\mathbf{f}^{\mathrm{T}} L \mathbf{f}}=\sqrt{S_{2}(\mathbf{f})}$

   $S_{2}(f)$ is small when the signal $f$ has similar values at neighboring vertices connected by an edge with a large weight.
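  The identities in Eq. (6) and in the derivation above are easy to check numerically. The following sketch uses a random symmetric weight matrix and a random signal (both hypothetical) and compares $\mathbf{f}^{\mathrm{T}} L \mathbf{f}$ with the double sum over ordered pairs and with the sum over edges.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical random symmetric weights on 5 vertices, no self-loops.
A = rng.random((5, 5))
W = np.triu(A, k=1)
W = W + W.T
L = np.diag(W.sum(axis=1)) - W
f = rng.standard_normal(5)

quad_form = f @ L @ f                                          # f^T L f
pairwise = 0.5 * np.sum(W * (f[:, None] - f[None, :]) ** 2)    # 1/2 sum_i sum_j
i, j = np.nonzero(np.triu(W, k=1))
edge_sum = np.sum(W[i, j] * (f[j] - f[i]) ** 2)                # sum over edges
semi_norm = np.sqrt(quad_form)                                 # ||f||_L

print(quad_form, pairwise, edge_sum, semi_norm)
```

  All three expressions agree up to rounding, and a constant signal gives $S_{2}=0$, which is why $\|\cdot\|_{L}$ is only a semi-norm.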

   Returning to the graph Laplacian eigenvalues and eigenvectors, the Courant-Fischer theorem (see 《极大极小定理》) shows that they can also be defined iteratively via the Rayleigh quotient:

    $\lambda_{0}=\min \limits_{\mathbf{f} \in \mathbb{R}^{N},\ \|\mathbf{f}\|_{2}=1}\left\{\mathbf{f}^{\mathrm{T}} L \mathbf{f}\right\}\quad \quad \quad \quad(7)$

  and 

    $\lambda_{l}=\min \limits_{\substack{\mathbf{f} \in \mathbb{R}^{N},\ \|\mathbf{f}\|_{2}=1 \\ \mathbf{f} \perp \operatorname{span}\left\{\mathbf{u}_{0}, \ldots, \mathbf{u}_{l-1}\right\}}}\left\{\mathbf{f}^{\mathrm{T}} L \mathbf{f}\right\}, \quad l=1,2, \ldots, N-1\quad \quad \quad \quad(8)$

  where the eigenvector $\mathbf{u}_{l}$ is the minimizer of the $l^{th}$ problem.

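  The graph Laplacian can thus be used to define both a graph Fourier transform (through its eigenvectors) and different notions of smoothness. The connectivity of the graph is also encoded in the Laplacian. Example 1 shows how the smoothness and the spectral content of a graph signal depend on the underlying graph.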

    

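  Discussion: the three graphs in the top row have the same vertices but different edges and show the signal in the vertex domain. The three plots in the bottom row show the corresponding graph spectral domains. The smoothness and graph spectral content of the signal depend on the structure of the graph: the signal $f$ is smoothest with respect to $G_1$ and least smooth with respect to $G_3$.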

2.6 Other Graph Matrices

Method 1: normalized graph Laplacian

  A second popular option is to normalize each weight $W_{i, j}$ by a factor of $\frac{1}{\sqrt{d_{i} d_{j}}}$ .
  Doing so leads to the normalized graph Laplacian, which is defined as

     $\tilde{L}:=\mathbf{D}^{-\frac{1}{2}} L \mathbf{D}^{-\frac{1}{2}}$

  equivalently

    $(\tilde{L}f)(i) = \frac{1}{\sqrt{d_i}} \sum \limits_{j \in \mathcal{N}_i} W_{i,j} \left[ \frac{f(i)}{\sqrt{d_i}} - \frac{f(j)}{\sqrt{d_j}} \right]$

   For a connected graph $G$, the eigenvalues of $\tilde{L}$ satisfy 

    $0 = \tilde{\lambda}_0 < \tilde{\lambda}_1 \leq ... \leq \tilde{\lambda}_{\text{max}} \leq 2$

   with $\tilde{\lambda}_{\text{max}} = 2$ if and only if $G$ is bipartite.
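  A quick numerical check of these spectral bounds, as a sketch: the 4-cycle (bipartite) and the triangle (not bipartite) below are hypothetical examples, and the helper `normalized_laplacian` assumes every vertex has positive degree.

```python
import numpy as np

def normalized_laplacian(W):
    """D^{-1/2} (D - W) D^{-1/2}; assumes all degrees are positive."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.diag(d) - W
    return d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]

# Hypothetical examples: a 4-cycle (bipartite) and a triangle (not bipartite).
W_cycle4 = np.array([[0, 1, 0, 1],
                     [1, 0, 1, 0],
                     [0, 1, 0, 1],
                     [1, 0, 1, 0]], dtype=float)
W_triangle = np.ones((3, 3)) - np.eye(3)

lam_cycle = np.linalg.eigvalsh(normalized_laplacian(W_cycle4))
lam_triangle = np.linalg.eigvalsh(normalized_laplacian(W_triangle))
print(lam_cycle.round(4))      # largest eigenvalue reaches 2
print(lam_triangle.round(4))   # largest eigenvalue stays below 2
```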

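   The normalized and non-normalized graph Laplacians are both examples of generalized graph Laplacians.

  A generalized graph Laplacian of a graph $G$ is any symmetric matrix whose $(i, j)$-th entry is negative if there is an edge connecting vertices $i$ and $j$, equal to zero if $i \neq j$ and $i$ is not connected to $j$, and arbitrary if $i=j$.

Method 2: random walk matrix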

    $P:=D^{-1} W$

  $P_{ij}$  describes the probability of going from vertex $i$ to vertex $j$ in one step of a Markov random walk on the graph $G$.

  For connected, aperiodic graphs, each row of $P^{t}$ converges to the stationary distribution of the random walk as $t$ goes to infinity.

   A matrix closely related to the random walk matrix is the asymmetric graph Laplacian:

    $L_{a}:=\mathbf{I}_{N}-\mathbf{P}$

   where $\mathbf{I}_{N}$ is the $N \times N$ identity matrix.

  Note that  $L_{a}$  has the same set of eigenvalues as  $\tilde{L}$ , and if  $\tilde{\mathbf{u}}_{l}$  is an eigenvector of  $\tilde{L}$  associated with  $\tilde{\lambda}_{l}$ , then  $\mathbf{D}^{-\frac{1}{2}} \tilde{\mathbf{u}}_{l}$  is an eigenvector of  $L_{a}$  associated with the eigenvalue  $\tilde{\lambda}_{l}$ .
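  These statements about $P$ and $L_{a}$ can be verified on a small example. The weighted graph below (a triangle with one pendant vertex) is a hypothetical choice that is connected and aperiodic; the stationary distribution $d_{i} / \sum_{j} d_{j}$ used in the last check is the standard result for random walks on undirected graphs rather than something stated in these notes.

```python
import numpy as np

# Hypothetical weighted graph: triangle 0-1-2 plus vertex 3 attached to vertex 2.
W = np.array([[0, 2, 1, 0],
              [2, 0, 1, 0],
              [1, 1, 0, 3],
              [0, 0, 3, 0]], dtype=float)
d = W.sum(axis=1)

P = W / d[:, None]                              # random walk matrix D^{-1} W
L_a = np.eye(4) - P                             # asymmetric graph Laplacian
L_norm = np.eye(4) - W / np.sqrt(np.outer(d, d))  # normalized graph Laplacian

lam_norm, U_norm = np.linalg.eigh(L_norm)
lam_a = np.sort(np.linalg.eigvals(L_a).real)    # L_a is not symmetric

v = U_norm[:, 1] / np.sqrt(d)                   # D^{-1/2} u~_1
P_inf = np.linalg.matrix_power(P, 200)          # many steps of the walk

print(np.allclose(lam_a, lam_norm))             # same eigenvalues
print(np.allclose(L_a @ v, lam_norm[1] * v))    # mapped eigenvector
print(np.allclose(P_inf, np.tile(d / d.sum(), (4, 1))))  # rows -> stationary dist.
```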

   The normalized graph Laplacian has the nice properties that its spectrum is always contained in the interval $[0, 2]$ and that, for bipartite graphs, the upper bound $\tilde{\lambda}_{\text{max}}=2$ is attained.


3 Generalized Operators For Signals on Graphs

  In this section, we review how fundamental operations such as filtering, translation, modulation, dilation, and downsampling generalize to the graph setting.

3.1  Filtering

3.1.1 Frequency Filtering

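  In classical signal processing, frequency filtering multiplies the Fourier transform of the input signal by that of the filter:

    $\hat{f}_{\text {out }}(\xi)=\hat{f}_{\text {in }}(\xi) \hat{h}(\xi)$

  where $\hat{h}(\cdot)$ is the transfer function of the filter.

  Frequency filtering represents the input signal as a linear combination of complex exponentials and amplifies or attenuates the contribution of each of them. In other words, the input signal is expressed as a linear combination of basis functions.

  Taking the inverse Fourier transform gives: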

    $f_{\text {out }}(t)={\mathcal{F}}^{-1}\left\{\hat{f}_{\text {in }}(\xi) \hat{h}(\xi)\right\}$

  equivalently

    $\begin{aligned}f_{\text {out }}(t)&=\int_{\mathbb{R}} \hat{f}_{\text {in }}(\xi) \hat{h}(\xi) e^{2 \pi i \xi t} d \xi \quad\quad\quad\quad(10) \\&=\int_{\mathbb{R}} f_{\text {in }}(\tau) h(t-\tau) d \tau=:\left(f_{\text {in }} * h\right)(t)\quad\quad\quad\quad(11)\end{aligned}$

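See my blog post 《图神经网络基础一:傅里叶级数与傅里叶变换》.

  $F(W)$ is the Fourier transform of $f(t)$ (here $W$ denotes angular frequency, not the weight matrix):

    $F(W)=\int_{-\infty}^{+\infty} f(t) e^{-i W t} \mathrm{~d} t$

  and $f(t)$ is recovered by the inverse Fourier transform: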

    $f(t)=\frac{1}{2 \pi} \int_{-\infty}^{+\infty} F(W) e^{i W t} \mathrm{~d} W$

     By analogy, frequency filtering on graphs (graph spectral filtering) is defined as:

    $\hat{f}_{\text {out }}\left(\lambda_{l}\right)=\hat{f}_{\text {in }}\left(\lambda_{l}\right) \hat{h}\left(\lambda_{l}\right)\quad\quad\quad\quad(12)$

     or, after taking the inverse graph Fourier transform:

     $f_{\text {out }}(i)=\sum\limits _{l=0}^{N-1} \hat{f}_{i n}\left(\lambda_{l}\right) \hat{h}\left(\lambda_{l}\right) u_{l}(i)\quad\quad\quad\quad(13)$

Recall the eigendecomposition of the graph Laplacian:

$ L=U\left[\begin{array}{cccc}\lambda_{0} & 0 & \cdots & 0 \\0 & \lambda_{1} & \cdots & 0 \\\vdots & \vdots & \ddots & \vdots \\0 & 0 & \cdots & \lambda_{N-1}\end{array}\right] U^{-1}=U\left[\begin{array}{cccc}\lambda_{0} & 0 & \cdots & 0 \\0 & \lambda_{1} & \cdots & 0 \\\vdots & \vdots & \ddots & \vdots \\0 & 0 & \cdots & \lambda_{N-1}\end{array}\right] U^{T}$

   Applying the transfer function to the eigenvalues gives the filter in matrix form:

    $\hat{h}(\mathbf{L}):=\mathbf{U}\left[\begin{array}{ccc}\hat{h}\left(\lambda_{0}\right) & & \mathbf{0} \\& \ddots & \\\mathbf{0} & & \hat{h}\left(\lambda_{N-1}\right)\end{array}\right] \mathbf{U}^{\mathrm{T}} .$

    $\mathbf{f}_{\text {out }}=\hat{h}(L) \mathbf{f}_{i n}$
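  The matrix form $\mathbf{f}_{\text {out }}=\hat{h}(L) \mathbf{f}_{\text {in }}$ translates directly into code. The sketch below is illustrative only: the 6-vertex path graph, the input signal, and the low-pass heat kernel $\hat{h}(\lambda)=e^{-2 \lambda}$ are assumptions, and `spectral_filter` is a hypothetical helper name.

```python
import numpy as np

def spectral_filter(L, f_in, h):
    """Apply f_out = U h(Lambda) U^T f_in for a transfer function h(lambda)."""
    lam, U = np.linalg.eigh(L)
    f_hat = U.T @ f_in              # graph Fourier transform, Eq. (3)
    return U @ (h(lam) * f_hat)     # filter, then inverse transform, Eq. (13)

# Hypothetical example: path graph on 6 vertices with unit weights.
N = 6
W = np.zeros((N, N))
idx = np.arange(N - 1)
W[idx, idx + 1] = W[idx + 1, idx] = 1.0
L = np.diag(W.sum(axis=1)) - W

f_in = np.array([1.0, -1.0, 2.0, 0.0, 3.0, -2.0])
f_out = spectral_filter(L, f_in, lambda lam: np.exp(-2.0 * lam))

print(f_out.round(3))
print(f_in @ L @ f_in, f_out @ L @ f_out)   # S_2 before and after filtering
```

  Because this kernel attenuates large eigenvalues, the output has a smaller $S_{2}$ than the input, while the component at $\lambda_{0}=0$ (and hence the sum of the signal) is left unchanged.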

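  The basic graph spectral filtering of Eq. (12) can be used to implement discrete versions of well-known continuous filtering techniques such as Gaussian smoothing, bilateral filtering, total variation filtering, anisotropic diffusion, and non-local means filtering.

  Example 2 compares the classical Gaussian filter with graph spectral filtering. The former also smooths across the edges of the image, whereas the latter does not, because the graph Laplacian encodes the geometric structure of the image.

  Such filters commonly arise as solutions of a minimization problem (the discrete regularization framework):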

    $\min _{\mathbf{f}}\left\{\|\mathbf{f}-\mathbf{y}\|_{2}^{2}+\gamma S_{p}(\mathbf{f})\right\}$

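  where $\mathbf{y}$ is the observed (e.g. noisy) signal, $\gamma>0$ controls the strength of the regularization, and $S_{p}(\mathbf{f})$ is the $p$-Dirichlet form of Eq. (5).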
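  For $p=2$, Eq. (6) gives $S_{2}(\mathbf{f})=\mathbf{f}^{\mathrm{T}} L \mathbf{f}$, and setting the gradient of the objective to zero yields $(\mathbf{I}+\gamma L) \mathbf{f}=\mathbf{y}$, i.e. a spectral filter with $\hat{h}(\lambda)=1 /(1+\gamma \lambda)$. The sketch below checks this on a hypothetical noisy signal on a 6-vertex path graph; the data, the value of $\gamma$, and the variable names are assumptions for illustration.

```python
import numpy as np

# Hypothetical example: path graph on 6 vertices with unit weights.
N = 6
W = np.zeros((N, N))
idx = np.arange(N - 1)
W[idx, idx + 1] = W[idx + 1, idx] = 1.0
L = np.diag(W.sum(axis=1)) - W

rng = np.random.default_rng(1)
y = np.sin(np.linspace(0.0, np.pi, N)) + 0.3 * rng.standard_normal(N)  # noisy signal
gamma = 2.0

# Direct solution of the optimality condition (I + gamma L) f = y.
f_direct = np.linalg.solve(np.eye(N) + gamma * L, y)

# Same result as a graph spectral filter with h(lambda) = 1 / (1 + gamma lambda).
lam, U = np.linalg.eigh(L)
f_spectral = U @ ((U.T @ y) / (1.0 + gamma * lam))

def objective(f):
    return np.sum((f - y) ** 2) + gamma * (f @ L @ f)

print(np.allclose(f_direct, f_spectral))
print(objective(f_direct), objective(y))
```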

3.1.2 Filtering in the Vertex Domain

   To filter in the vertex domain, the output $f_{out}(i)$ at vertex $i$ is written as:

    $f_{\text {out }}(i)=b_{i, i} f_{\text {in }}(i)+\sum \limits _{j \in \mathcal{N}(i, K)} b_{i, j} f_{\text {in }}(j)\quad\quad\quad\quad(18)$

  It is a linear combination of the components of the input signal at vertices within a $K$-hop local neighborhood of vertex $i$, for some constants $\left\{b_{i, j}\right\}$. In other words, it is a localized linear transform.

  We now relate filtering in the spectral domain to filtering in the vertex domain. Suppose the frequency filter in Eq. (12) is an order-$K$ polynomial kernel,

    $\hat{h}\left(\lambda_{l}\right)=\sum\limits _{k=0}^{K} a_{k} \lambda_{l}^{k}$

  for some constants $\left\{a_{k}\right\}_{k=0,1, \ldots, K}$.

  Substituting it into Eq. (13) gives, in the vertex domain:

    $\begin{aligned}f_{\text {out }}(i) &=\sum\limits_{l=0}^{N-1} \hat{f}_{\text {in }}\left(\lambda_{l}\right) \hat{h}\left(\lambda_{l}\right) u_{l}(i) \\&=\sum\limits_{j=1}^{N} f_{\text {in }}(j) \sum\limits_{k=0}^{K} a_{k} \sum\limits_{l=0}^{N-1} \lambda_{l}^{k} u_{l}^{*}(j) u_{l}(i) \\&=\sum\limits_{j=1}^{N} f_{\text {in }}(j) \sum\limits_{k=0}^{K} a_{k}\left(L^{k}\right)_{i, j}\end{aligned}$

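   When the shortest-path (hop) distance $d_{\mathcal{G}}(i, j)$ between vertex $i$ and vertex $j$ is greater than $k$, we have

    $\left(L^{k}\right)_{i, j}=0$

  so only vertices within $K$ hops of vertex $i$ contribute, and the coefficients in Eq. (18) can be written as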

    $b_{i, j}:=\sum \limits _{k=d_{\mathcal{G}}(i, j)}^{K} a_{k}\left(L^{k}\right)_{i, j}$
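  The localization of polynomial filters can be observed directly. In the sketch below, the 6-vertex path graph and the coefficients $a_{k}$ are hypothetical; on a path graph the hop distance between vertices $i$ and $j$ is simply $|i-j|$, which makes the zero pattern easy to check.

```python
import numpy as np

# Hypothetical example: path graph on 6 vertices with unit weights.
N = 6
W = np.zeros((N, N))
idx = np.arange(N - 1)
W[idx, idx + 1] = W[idx + 1, idx] = 1.0
L = np.diag(W.sum(axis=1)) - W

a = [1.0, -0.5, 0.1]            # hypothetical coefficients a_0, a_1, a_2
K = len(a) - 1

# Vertex-domain form: matrix of coefficients b_{i,j} = sum_k a_k (L^k)_{i,j}.
H = sum(a_k * np.linalg.matrix_power(L, k) for k, a_k in enumerate(a))

# Spectral form: U h(Lambda) U^T with h(lambda) = sum_k a_k lambda^k.
lam, U = np.linalg.eigh(L)
h_hat = sum(a_k * lam ** k for k, a_k in enumerate(a))
H_spectral = U @ np.diag(h_hat) @ U.T

hops = np.abs(np.subtract.outer(np.arange(N), np.arange(N)))  # hop distance on a path

print(np.allclose(H, H_spectral))           # both forms agree
print(np.allclose(H[hops > K], 0.0))        # no coupling beyond K hops
```

  The practical benefit is that an order-$K$ polynomial filter can be applied with $K$ sparse matrix-vector products, without computing the eigendecomposition at all.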

3.2 Convolution

  The classical convolution product cannot be generalized directly to the graph setting, because the term $h(t-\tau)$ has no meaning on a graph. Instead, we replace the complex exponentials with the graph Laplacian eigenvectors:

    $(f * h)(i):=\sum\limits _{l=0}^{N-1} \hat{f}\left(\lambda_{l}\right) \hat{h}\left(\lambda_{l}\right) u_{l}(i)\quad\quad\quad\quad(20)$

3.3 Translation

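  The notion of "translation" cannot be defined directly in the vertex domain either, so it is again defined through the spectral domain.
  A translation in the time domain can be viewed as the convolution of the signal with an impulse $\delta$ located at delay $t$. Accordingly, translation to vertex $n$ can be viewed as the convolution of the signal with an impulse $\delta_{n}$ located at vertex $n$, whose graph Fourier transform is given by the values of the eigenvectors at vertex $n$: $\hat{\delta}_{n}\left(\lambda_{l}\right)=u_{l}^{*}(n)$.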

    $\left(T_{n} g\right)(i):=\sqrt{N}\left(g * \delta_{n}\right)(i) \stackrel{20}{=} \sqrt{N} \sum \limits _{l=0}^{N-1} \hat{g}\left(\lambda_{l}\right) u_{l}^{*}(n) u_{l}(i)\quad\quad\quad(21)$

    where

    $\delta_{n}(i)=\left\{\begin{array}{ll}1 & \text { if } i=n \\0 & \text { otherwise }\end{array}\right.\quad\quad\quad(22)$
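  Eqs. (20) and (21) can be sketched in a few lines. The example below assumes a real symmetric Laplacian, a hypothetical 6-vertex path graph, and a heat kernel $\hat{g}(\lambda)=e^{-\lambda}$ defined directly in the spectral domain; the helper names are illustrative only.

```python
import numpy as np

def graph_convolve(f_hat, h_hat, U):
    """Eq. (20): multiply in the spectral domain, then transform back."""
    return U @ (f_hat * h_hat)

def translate(g_hat, U, n):
    """Eq. (21): T_n g = sqrt(N) (g * delta_n), with delta_hat_n(lambda_l) = u_l(n)."""
    N = U.shape[0]
    delta_hat = U[n, :]             # graph Fourier transform of the impulse at n
    return np.sqrt(N) * graph_convolve(g_hat, delta_hat, U)

# Hypothetical example: path graph on 6 vertices with unit weights.
N = 6
W = np.zeros((N, N))
idx = np.arange(N - 1)
W[idx, idx + 1] = W[idx + 1, idx] = 1.0
L = np.diag(W.sum(axis=1)) - W
lam, U = np.linalg.eigh(L)

g_hat = np.exp(-1.0 * lam)          # hypothetical kernel given in the spectral domain
T0 = translate(g_hat, U, 0)
T2 = translate(g_hat, U, 2)

print(T0.sum(), T2.sum(), np.sqrt(N) * g_hat[0])   # the sum is preserved
print(np.linalg.norm(T0), np.linalg.norm(T2))      # the norm is not
```

  The sum of the translated signal is always $\sqrt{N} \hat{g}(0)$, whereas its norm depends on the target vertex, so the operator does not behave like a classical shift.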

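  However, this should generally not be thought of as "translating a signal on the graph", but rather as a kernelized operator acting on a kernel $\hat{g}(\cdot)$ defined in the graph spectral domain (the kernel is the signal mentioned above). To translate the kernel to vertex $n$, multiply its $l^{th}$ component by the eigenvector value $u_{l}^{*}(n)$ and then apply the inverse graph Fourier transform. Second, the normalizing constant $\sqrt{N}$ ensures that the translation operator preserves the mean of the signal. Third, the smoothness of the kernel $\hat{g}(\cdot)$ controls the localization of the translated signal around vertex $n$, i.e. how quickly its magnitude decays as the distance (in shortest-path hops) between vertex $i$ and vertex $n$ grows. Finally, the generalized translation operator is not an isometry, due to the possible localization of the graph Laplacian eigenvectors (can anyone explain this?).

   I can't read any further for now... I'll pick it up again when I'm in a better mood.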

 

Source: https://www.cnblogs.com/BlairGrowing/p/15819835.html