
OpenCV Image Processing Notes [16]

2022-08-03 11:32:43 Author: Internet

Painting Light

1. Preliminaries

Background needed to understand the core code

A detailed explanation of the BMP file format https://blog.csdn.net/zjq_1314520/article/details/53830349
Analysis of BMP bitmaps and palettes https://blog.csdn.net/qsycn/article/details/7801145
Clustering for image segmentation https://juejin.im/post/5e9eb5eef265da47fe1e050f#heading-1
K-means clustering and image segmentation
- https://blog.csdn.net/jteng/article/details/48811881
- https://jingxa.github.io/2018/12/08/CS131-Homework-5/

Imports

import cv2
import rtree
import scipy
import trimesh
import numpy as np
import tensorflow as tf
from scipy.spatial import ConvexHull
from cv2.ximgproc import createGuidedFilter

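trimesh: Trimesh is a pure Python (2.7-3.4+) library for loading and using triangular meshes, with an emphasis on watertight surfaces. The goal of the library is to provide a full-featured and well-tested Trimesh object that allows easy manipulation and analysis, in the style of the Polygon object in the Shapely library.

Source: https://github.com/mikedh/trimesh

rtree: Rtree is a ctypes Python wrapper of libspatialindex that provides a number of advanced spatial indexing features for spatially curious Python users. These features include:
- Nearest neighbor search
- Intersection search
- Multi-dimensional indexes
- Clustered indexes (store Python pickles directly with index entries)
- Bulk loading
- Deletion
- Disk serialization
- Custom storage implementation (for example, to implement spatial indexing in ZODB)

Source: https://www.osgeo.cn/rtree/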

cv2.ximgproc.guidedFilter()

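Guided Filter

The guided filter was proposed by Kaiming He et al. in the ECCV 2010 paper "Guided Image Filtering". An extended journal version followed in 2013, and a faster variant, the fast guided filter, was published afterwards.

The guided filter is an image filtering technique that filters an input image p with the help of a guidance image, so that the output broadly resembles the input p while its texture and edge structure follow the guidance image. Its two typical applications are edge-preserving smoothing and image matting.
Its goal is to keep the strengths of the bilateral filter (effective edge preservation, non-iterative computation) while overcoming its weaknesses: it is a fast filter whose cost per pixel is O(1), independent of the window size, and it does not distort gradients near strong edges.
The guided filter not only provides the edge-preserving smoothing of the bilateral filter but also behaves well near edges, so it is used for image enhancement, HDR compression, image matting, and dehazing.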
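
To make the idea concrete, here is a minimal single-channel sketch of a guided filter in NumPy. It is not the OpenCV implementation: the helper names `box_mean` and `guided_filter`, the naive box filter, and the edge-replicating border are all assumptions chosen for readability.

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1) x (2r+1) window, replicating edge pixels at the border."""
    padded = np.pad(img, r, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out / float((2 * r + 1) ** 2)

def guided_filter(guide, src, r, eps):
    """Output is, in every window, a linear function a * guide + b of the guide."""
    I = guide.astype(np.float64)
    p = src.astype(np.float64)
    mean_I, mean_p = box_mean(I, r), box_mean(p, r)
    var_I = box_mean(I * I, r) - mean_I ** 2
    cov_Ip = box_mean(I * p, r) - mean_I * mean_p
    a = cov_Ip / (var_I + eps)   # slope: near 0 in flat areas, near 1 at strong edges
    b = mean_p - a * mean_I      # offset
    return box_mean(a, r) * I + box_mean(b, r)

# A noisy vertical step edge, filtered with itself as the guide.
rng = np.random.default_rng(0)
step = np.zeros((16, 16))
step[:, 8:] = 1.0
noisy = step + 0.05 * rng.standard_normal(step.shape)
smoothed = guided_filter(noisy, noisy, r=2, eps=0.01)
```

In flat regions the local variance is small compared with `eps`, so the slope `a` shrinks and the noise is averaged away; across the step the variance is large, `a` stays close to 1, and the edge survives.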

Source: https://jinzhangyu.github.io/2018/09/06/2018-09-06-OpenCV-Python%E6%95%99%E7%A8%8B-16-%E5%B9%B3%E6%BB%91%E5%9B%BE%E5%83%8F-3/

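Data preprocessing

For background, see https://zhuanlan.zhihu.com/p/24157634

Calling Keras layers on TensorFlow tensors

Create a TensorFlow session and register it with Keras. Keras will use the registered session to initialize all variables that it creates internally.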

session = tf.Session()
tf.keras.backend.set_session(session)
ip3 = tf.placeholder(dtype=tf.float32, shape=(None, None, None, 3))
srcnn = tf.keras.models.load_model('srcnn.net')
srcnn_op = srcnn(tf.pad(ip3 / 255.0, [[0, 0], [16, 16], [16, 16], [0, 0]], 'REFLECT'))[:, 16:-16, 16:-16, :] * 255.0
session.run(tf.global_variables_initializer())
srcnn.load_weights('srcnn.net')

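**tf.Session()**

A Session is the object TensorFlow 1.x uses to control execution of the graph and to return results. Calling session.run() evaluates the requested operations or tensors and returns their values.

keras

Keras is a high-level neural network API written in Python that can run on top of TensorFlow, CNTK, or Theano.

tf.keras.backend.set_session(session)
Defined in: tensorflow/python/keras/backend.py.

Sets the global TensorFlow session.

Arguments:
- session: a TF session.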
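
tf.placeholder(dtype, shape=None, name=None)

https://blog.csdn.net/william_hehe/article/details/81612640

1. Purpose:

Inserts a placeholder for a tensor that will always be fed at run time (here, with image data). It plays the role of a formal parameter.

Formal parameter: storage is allocated only when the function is called and is released when the call ends.

2. Arguments:

dtype: element type; shape: tensor shape; name: name of the operation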
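
Saving and reloading a trained Keras model
- Train the model in the training environment, then take the trained model (the file that stores the model's information) and deploy it in the production environment
- load_model('srcnn.net') loads the SRCNN network model
- load_weights loads the model weights, i.e. the weights file

srcnn: Super-Resolution Convolutional Neural Network

SRCNN first upscales the low-resolution image to the target size with bicubic interpolation, then fits the non-linear mapping with a three-layer convolutional network, and finally outputs the high-resolution result.

The three convolutional layers correspond to three steps: patch extraction and representation, non-linear mapping, and final reconstruction.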

2. Image resizing

min_resize()

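Resizes the image so that its shorter side equals m while keeping the aspect ratio.

If the image height (shape[0]) < the image width (shape[1]):

set s0, the new height, to the target size m

set s1, the new width, to (m / shape[0]) × the original width shape[1]

Otherwise:

set s0, the new height, to (m / shape[1]) × the original height shape[0]

set s1, the new width, to the target size m

The new size is represented by its shorter side, min(s1, s0), and compared with the shorter side of the original image.

If new size < original size:

the image is being shrunk, so use cv2.INTER_AREA

Otherwise:

the image is being enlarged, so use cv2.INTER_LANCZOS4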

# Some image resizing tricks.
def min_resize(x, m):
    if x.shape[0] < x.shape[1]:
        s0 = m
        s1 = int(float(m) / float(x.shape[0]) * float(x.shape[1]))
    else:
        s0 = int(float(m) / float(x.shape[1]) * float(x.shape[0]))
        s1 = m
    new_max = min(s1, s0)
    raw_max = min(x.shape[0], x.shape[1])
    if new_max < raw_max:
        interpolation = cv2.INTER_AREA
    else:
        interpolation = cv2.INTER_LANCZOS4
    y = cv2.resize(x, (s1, s0), interpolation=interpolation)
    return y
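
The size arithmetic in min_resize can be checked without touching any pixels. The sketch below repeats only the branch logic; the helper name `min_resize_plan` and the string labels it returns are made up for illustration.

```python
def min_resize_plan(height, width, m):
    """Return (s0, s1, mode) the way min_resize computes them."""
    if height < width:
        s0 = m
        s1 = int(float(m) / float(height) * float(width))
    else:
        s0 = int(float(m) / float(width) * float(height))
        s1 = m
    # Compare the shorter side after resizing with the shorter side before.
    shrinking = min(s0, s1) < min(height, width)
    mode = "INTER_AREA" if shrinking else "INTER_LANCZOS4"
    return s0, s1, mode

print(min_resize_plan(256, 512, 512))    # landscape image, enlarged
print(min_resize_plan(2048, 1024, 512))  # portrait image, shrunk
```

In both cases the shorter side ends up equal to m. Note that min_resize passes the result to cv2.resize as (s1, s0), that is (width, height).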

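shape

shape gives the length of each dimension of an array; shape[0] is the length of the first dimension, i.e. the number of rows. It is an attribute of a NumPy array (np.shape(a) does the same for any array-like input) and returns a tuple. For an image, the tuple is (rows, columns, channels), that is (height, width, channels).

cv2.resize()
cv2.resize(src, (dst_w, dst_h), interpolation)

src is the input image, dst is the destination image, w is the width, h is the height, and interpolation is the interpolation method.

Purpose: changes the size of an image, in height, in width, or in both. It can also resize by a scale factor.

y = cv2.resize(x, (s1, s0), interpolation=interpolation)
- x: the source image
- (s1, s0): the required output size
- interpolation: the interpolation method
  - INTER_AREA - resampling using pixel area relation.
    
    It may be the preferred method for image decimation because it gives moiré-free results. When the image is enlarged, however, it behaves similarly to the nearest-neighbor method INTER_NEAREST.
    
    In short, area interpolation avoids ripple (moiré) artifacts when shrinking; some write-ups describe its behavior when enlarging as closer to linear interpolation.
  - INTER_LANCZOS4 - Lanczos interpolation over an 8x8 pixel neighborhood.
    
    It interpolates over the eight nearest samples along each of x and y, computing a weighted sum, so the kernel covers an 8×8 neighborhood.
  - In general, use cv.INTER_AREA for shrinking, and cv.INTER_CUBIC (slower) or cv.INTER_LINEAR (faster, and still good) for enlarging. By default, cv2.resize uses cv.INTER_LINEAR.
- Image arrays are indexed as height × width × channels, but the dsize argument of cv2.resize is given as (x axis, y axis), i.e. width × height. This is the reverse of the usual array order, so take care.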
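
The difference between the shape order and the dsize order is easy to see on a blank array. This small sketch uses only NumPy; the variable names are illustrative.

```python
import numpy as np

# A blank image with 480 rows and 640 columns: shape is (height, width, channels).
img = np.zeros((480, 640, 3), dtype=np.uint8)
height, width, channels = img.shape

# cv2.resize expects dsize as (width, height), the reverse of shape[:2].
half_dsize = (width // 2, height // 2)
print(img.shape, half_dsize)
```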

d_resize()

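Resizes again, this time to a given target shape.

d is the target shape (d[0] rows, d[1] columns), optionally scaled by fac. The shorter side of the target is compared with the shorter side of the input to pick the interpolation (put simply: cv2.INTER_AREA for shrinking, cv2.INTER_LANCZOS4 for enlarging). In generate_lighting_effects it brings an upsampled pyramid level to the exact shape of the next finer level.

If the target is smaller than the input:

shrink the image

Otherwise:

enlarge the image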

# Some image resizing tricks.
def d_resize(x, d, fac=1.0):
    new_min = min(int(d[1] * fac), int(d[0] * fac))
    raw_min = min(x.shape[0], x.shape[1])
    if new_min < raw_min:
        interpolation = cv2.INTER_AREA
    else:
        interpolation = cv2.INTER_LANCZOS4
    y = cv2.resize(x, (int(d[1] * fac), int(d[0] * fac)), interpolation=interpolation)
    return y

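3. Image gradients

get_image_gradient()

The gradient direction is the direction in which the function f(x, y) changes fastest. Where the image contains an edge, the gradient magnitude is large; where the image is smooth, gray values change little and the gradient is correspondingly small. In image processing the gradient magnitude is often simply called the gradient, and the image formed from it is called the gradient image.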

# Some image gradient computing tricks.
def get_image_gradient(dist):
    cols = cv2.filter2D(dist, cv2.CV_32F, np.array([[-1, 0, +1], [-2, 0, +2], [-1, 0, +1]]))
    rows = cv2.filter2D(dist, cv2.CV_32F, np.array([[-1, -2, -1], [0, 0, 0], [+1, +2, +1]]))
    return cols, rows
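
The two kernels in get_image_gradient can be tried on a tiny synthetic image. The sketch below is a NumPy stand-in for cv2.filter2D, not the OpenCV code: `correlate3x3` is a hypothetical helper, and the mirrored border is an assumption meant to resemble OpenCV's default border mode.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
SOBEL_Y = SOBEL_X.T  # [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def correlate3x3(img, kernel):
    """Slide a 3x3 kernel over img (correlation, no kernel flip), mirroring the border."""
    padded = np.pad(img.astype(np.float32), 1, mode="reflect")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float32)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

# A horizontal ramp: brightness grows by 1 per column and is constant along each column.
ramp = np.tile(np.arange(6, dtype=np.float32), (5, 1))
cols = correlate3x3(ramp, SOBEL_X)  # responds to left-right change
rows = correlate3x3(ramp, SOBEL_Y)  # responds to top-bottom change
```

On this ramp the x kernel gives a constant response away from the left and right borders, while the y kernel gives zero everywhere, because nothing changes from row to row.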

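The filter2D() function in Python-OpenCV

Convolves an image with a custom kernel. The function applies an arbitrary linear filter to an image. In-place operation is supported. When the aperture is partially outside the image, the function interpolates outlier pixel values according to the specified border mode.

References: the filter2D function https://www.cnblogs.com/lfri/p/10599420.html

Image gradients with Sobel https://www.cnblogs.com/DJC-BLOG/p/9129034.html

Function prototype:
dst=cv.filter2D(src, ddepth, kernel[, dst[, anchor[, delta[, borderType]]]])
Parameters

src: source image

dst: destination image, with the same size and number of channels as the source

ddepth: desired depth of the destination image

kernel: convolution kernel (strictly, a correlation kernel), a single-channel floating-point matrix; to apply different kernels to different channels, split the image into separate color planes and process them individually

anchor: anchor of the kernel, indicating the relative position of the filtered point within the kernel; the anchor must lie inside the kernel; the default value (-1, -1) means the anchor is at the kernel center

delta: optional value added to the filtered pixels before they are stored in dst, similar to a bias

borderType: pixel extrapolation method, see BorderTypes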

cols = cv2.filter2D(dist, cv2.CV_32F, np.array([[-1, 0, +1], [-2, 0, +2], [-1, 0, +1]]))

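Here ddepth is the desired depth of the destination image, i.e. the data type in which its pixels are stored, such as unsigned char (CV_8U), signed char (CV_8S), or unsigned short (CV_16U). The call above uses CV_32F (32-bit float), which keeps negative gradient values.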

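The Sobel operator

A Sobel kernel shows only the differences between adjacent pixel values in one particular direction. There are left and right Sobel kernels (detecting horizontal changes) and top and bottom Sobel kernels (detecting vertical changes).

For example, the right Sobel (x direction) and bottom Sobel (y direction) kernels:

x direction: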

-1 0 +1

-2 0 +2

-1 0 +1

y direction:

-1 -2 -1

0 0 0

+1 +2 +1

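4. Generating the lighting effects

generate_lighting_effects()

1. Downsampling with pyrDown

Image pyramid operations change the resolution of an image, that is, how many pixels it has.

Downsampling: the image becomes smaller; each side is halved, so the area is a quarter of the original.

It goes from a high-resolution image to a low-resolution one, shrinking the image.

**dst = cv2.pyrDown(src)**
- dst: the downsampled result
- src: the source image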

def generate_lighting_effects(stroke_density, content):

    # Computing the coarse lighting effects
    # In original paper we compute the coarse effects using Gaussian filters.
    # Here we use a Gaussian pyramid to get similar results.
    # This pyramid-based result is a bit better than naive filters.
    h512 = content
    h256 = cv2.pyrDown(h512)
    h128 = cv2.pyrDown(h256)
    h64 = cv2.pyrDown(h128)
    h32 = cv2.pyrDown(h64)
    h16 = cv2.pyrDown(h32)
    c512, r512 = get_image_gradient(h512)
    c256, r256 = get_image_gradient(h256)
    c128, r128 = get_image_gradient(h128)
    c64, r64 = get_image_gradient(h64)
    c32, r32 = get_image_gradient(h32)
    c16, r16 = get_image_gradient(h16)
    c = c16
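
As a rough illustration of what one pyramid step does, the following NumPy sketch blurs with the separable 5-tap kernel [1, 4, 6, 4, 1] / 16 and then keeps every second row and column. The names `blur5` and `pyr_down` are hypothetical, and the mirrored border is an assumption, so results near the border may differ from cv2.pyrDown.

```python
import numpy as np

KERNEL_1D = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0

def blur5(img):
    """Separable 5x5 Gaussian-like blur with mirrored borders."""
    padded = np.pad(img.astype(np.float64), 2, mode="reflect")
    h, w = img.shape
    tmp = sum(KERNEL_1D[k] * padded[:, k:k + w] for k in range(5))  # horizontal pass
    return sum(KERNEL_1D[k] * tmp[k:k + h, :] for k in range(5))    # vertical pass

def pyr_down(img):
    """Blur, then drop every second row and column."""
    return blur5(img)[::2, ::2]

level = np.random.default_rng(0).random((64, 48))
shapes = [level.shape]
for _ in range(2):
    level = pyr_down(level)
    shapes.append(level.shape)
print(shapes)
```

Each level has half the rows and half the columns of the previous one (rounded up for odd sizes), which is why the article names the levels h512, h256, and so on down to h16.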

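2. Upsampling with pyrUp

Upsampling: the image is enlarged to twice its size in each direction, with the new rows and columns filled with zeros.

The same kernel as in downsampling, multiplied by 4, is then convolved with the result to compute the values of the new pixels.
Note: the enlarged image is blurrier than the original.

Upsampling: the image becomes larger; each side is doubled, so the area is four times the original.

**dst = cv2.pyrUp(src)**
- dst: the upsampled result
- src: the source image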

    c = d_resize(cv2.pyrUp(c), c32.shape) * 4.0 + c32
    c = d_resize(cv2.pyrUp(c), c64.shape) * 4.0 + c64
    c = d_resize(cv2.pyrUp(c), c128.shape) * 4.0 + c128
    c = d_resize(cv2.pyrUp(c), c256.shape) * 4.0 + c256
    c = d_resize(cv2.pyrUp(c), c512.shape) * 4.0 + c512
    r = r16
    r = d_resize(cv2.pyrUp(r), r32.shape) * 4.0 + r32
    r = d_resize(cv2.pyrUp(r), r64.shape) * 4.0 + r64
    r = d_resize(cv2.pyrUp(r), r128.shape) * 4.0 + r128
    r = d_resize(cv2.pyrUp(r), r256.shape) * 4.0 + r256
    r = d_resize(cv2.pyrUp(r), r512.shape) * 4.0 + r512
    coarse_effect_cols = c
    coarse_effect_rows = r
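
The zero-insertion step and the factor of 4 described for pyrUp can be sketched in NumPy as follows. The function name `pyr_up` is hypothetical and the mirrored border is an assumption, so this is an approximation of cv2.pyrUp rather than a reimplementation.

```python
import numpy as np

KERNEL_1D = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0

def pyr_up(img):
    """Insert zeros between samples, then blur with the pyrDown kernel times 4."""
    h, w = img.shape
    up = np.zeros((2 * h, 2 * w), dtype=np.float64)
    up[::2, ::2] = img                         # originals sit on even rows and columns
    padded = np.pad(up, 2, mode="reflect")
    tmp = sum(KERNEL_1D[k] * padded[:, k:k + 2 * w] for k in range(5))
    out = sum(KERNEL_1D[k] * tmp[k:k + 2 * h, :] for k in range(5))
    return 4.0 * out                           # three of every four samples were zero

small = np.full((4, 3), 7.0)
big = pyr_up(small)
print(big.shape)
```

Without the factor of 4 the result would be four times too dark, because only a quarter of the upsampled pixels carry a value before blurring.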

3. Normalization

    # Normalization: scale both gradient components by the largest gradient magnitude.
    EPS = 1e-10
    max_effect = np.max((coarse_effect_cols**2 + coarse_effect_rows**2)**0.5)
    coarse_effect_cols = (coarse_effect_cols + EPS) / (max_effect + EPS)
    coarse_effect_rows = (coarse_effect_rows + EPS) / (max_effect + EPS)

    # Refinement: attenuate the gradients where the stroke density is high.
    stroke_density_scaled = (stroke_density.astype(np.float32) / 255.0).clip(0, 1)
    coarse_effect_cols *= (1.0 - stroke_density_scaled ** 2.0 + 1e-10) ** 0.5
    coarse_effect_rows *= (1.0 - stroke_density_scaled ** 2.0 + 1e-10) ** 0.5
    refined_result = np.stack([stroke_density_scaled, coarse_effect_rows, coarse_effect_cols], axis=2)

    return refined_result
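
One way to read the normalization and refinement steps is that they keep each per-pixel vector (density, rows, cols) at length 1 or less. The sketch below checks this on random stand-in data; the array sizes and random inputs are made up, and only the arithmetic follows the code above.

```python
import numpy as np

EPS = 1e-10
rng = np.random.default_rng(0)

# Stand-ins for the accumulated pyramid gradients and the stroke density map.
coarse_cols = rng.standard_normal((4, 5)).astype(np.float32)
coarse_rows = rng.standard_normal((4, 5)).astype(np.float32)
stroke_density = rng.integers(0, 256, size=(4, 5)).astype(np.uint8)

# Normalization: divide by the largest gradient magnitude in the image.
max_effect = np.max((coarse_cols ** 2 + coarse_rows ** 2) ** 0.5)
coarse_cols = (coarse_cols + EPS) / (max_effect + EPS)
coarse_rows = (coarse_rows + EPS) / (max_effect + EPS)

# Refinement: scale the in-plane part by sqrt(1 - d^2), where d is the density.
d = (stroke_density.astype(np.float32) / 255.0).clip(0, 1)
scale = (1.0 - d ** 2.0 + EPS) ** 0.5
refined = np.stack([d, coarse_rows * scale, coarse_cols * scale], axis=2)

lengths = np.sqrt(np.sum(refined ** 2, axis=2))
print(refined.shape, float(lengths.max()))
```

Since the normalized gradient has magnitude at most 1, the squared length is at most d^2 + (1 - d^2) = 1, up to rounding.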

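Normalization / standardization:
Different metrics often have different dimensions and units, which makes results hard to analyze and compare.
To remove the effect of these units, the data must be rescaled so that the metrics become comparable.
(Normalizing data means mapping all of it onto the same scale.)
Standardization (z-score): subtract the mean from the variable, then divide by the standard deviation.
Benefits: it speeds up gradient descent's search for the optimum, and it may improve accuracy.
Why normalize images?
1. To convert them to a standard form, guarding against the effect of affine transformations.
2. To reduce the effect of geometric transformations.
3. To speed up gradient descent's search for the optimum.
The method this project relates to is (0, 1) normalization, the simplest and most obvious approach: scan every value in the feature vector, record Max and Min, and use Max - Min as the base (so that Min maps to 0 and Max maps to 1), i.e. $x' = \frac{x - Min}{Max - Min}$. The code above uses a variant of this idea: it divides both gradient components by the largest gradient magnitude, so that every gradient vector has length at most 1.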
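
The two rescalings mentioned above are easy to confuse, so here is a small side-by-side sketch on made-up numbers.

```python
import numpy as np

values = np.array([2.0, 4.0, 6.0, 10.0])

# Min-max (0, 1) normalization: Min maps to 0 and Max maps to 1.
min_max = (values - values.min()) / (values.max() - values.min())

# Z-score standardization: zero mean and unit standard deviation.
z_score = (values - values.mean()) / values.std()

print(min_max)
print(z_score)
```

Min-max scaling fixes the range of the data, while z-score standardization fixes its mean and spread; the result of the latter is not confined to [0, 1].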

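5. The run function

run

1. Parameters:

Parameter	Meaning
image	source image
mask	mask image
ambient_intensity	ambient light intensity
light_intensity	light intensity
light_source_height	height of the light source
gamma_correction	$\gamma$ correction
stroke_density_clipping	stroke density clipping
light_color_red / light_color_green / light_color_blue	light color
enabling_multiple_channel_effects	enable multiple channel effects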

def run(image, mask, ambient_intensity, light_intensity, light_source_height, gamma_correction, stroke_density_clipping, light_color_red, light_color_green, light_color_blue, enabling_multiple_channel_effects):

2. Image preprocessing

# Some pre-processing to resize images and remove input JPEG artifacts.
raw_image = min_resize(image, 512)
raw_image = run_srcnn(raw_image)
raw_image = min_resize(raw_image, 512)
raw_image = raw_image.astype(np.float32)
unmasked_image = raw_image.copy()

3. Mask handling

Set the $\alpha$ value

if mask is not None:
    alpha = np.mean(d_resize(mask, raw_image.shape).astype(np.float32) / 255.0, axis=2, keepdims=True)
    raw_image = unmasked_image * alpha

4. Computing the convex-hull-like palette

# Compute the convex-hull-like palette.
h, w, c = raw_image.shape
flattened_raw_image = raw_image.reshape((h * w, c))
raw_image_center = np.mean(flattened_raw_image, axis=0)
hull = ConvexHull(flattened_raw_image)

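raw_image.reshape((h * w, c)): flattens the image into a list of h * w points, using the color channels as features.

raw_image_center is the mean color of the image, denoted g.

Note that the code uses scipy.spatial.ConvexHull, which takes the point set as its argument and exposes the result through attributes such as points, vertices, and simplices. For comparison, OpenCV's C++ convexHull takes the point set as its first argument, the output hull as its second, and a bool as its third that selects clockwise (true) or counter-clockwise orientation. Its second argument can be a vector<int>, in which case it receives the indices of the hull points in the original point set, or a vector<Point>, in which case it receives the hull points themselves.

A convex hull is a concept from computational geometry (graphics). In a real vector space V, for a given set X, the intersection S of all convex sets containing X is called the convex hull of X.
The convex hull of X can be constructed from the convex combinations of all points (x1, x2, ..., xn) in X. In the two-dimensional Euclidean plane, the convex hull can be pictured as a rubber band stretched tightly around all the points; informally, given a set of points in the plane, the convex hull is the convex polygon formed by connecting the outermost points, and it contains every point in the set. Common algorithms include Graham's scan and the Jarvis march.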
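
For intuition, here is a small pure-Python sketch of a 2D convex hull using Andrew's monotone chain, a close relative of Graham's scan. It is only an illustration: the article's code works on 3D color points with scipy.spatial.ConvexHull, and the function names below are made up.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull_2d(points):
    """Return the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:                   # build the lower chain, left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):         # build the upper chain, right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # the chains share their endpoints

points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3), (3, 1)]
hull = convex_hull_2d(points)
print(hull)
```

The three interior points are discarded and only the four corners of the square remain, which matches the rubber-band picture.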

5. Estimating the stroke density map

# Estimate the stroke density map.
intersector = trimesh.Trimesh(faces=hull.simplices, vertices=hull.points).ray

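Creates a mesh object from existing face (faces) and vertex (vertices) data. Here the faces are hull.simplices, an array with one row of three vertex indices per triangle, and the vertices are hull.points.
A TriMesh is a triangle mesh, that is, a collection of triangles. When the indices are stored as one flat array, its length must be a multiple of 3, because a triangle always has three vertices. For example, a flat index array of length 6 such as {0, 1, 2, 1, 2, 3} describes two triangles: the first is drawn through vertexes[0] → vertexes[1] → vertexes[2], and the second through vertexes[1] → vertexes[2] → vertexes[3]. In mesh formats that carry per-vertex attributes, vertexes[0] has normal normals[0], color colors[0], and texture coordinate texCoords[0].
trimesh.ray

Performs ray queries, accelerated by the pyembree package when it is installed.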

start = np.tile(raw_image_center[None, :], [h * w, 1])

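Function prototype: numpy.tile(A, reps)

Description:

A: the input array (array-like).
reps: the number of repetitions along each dimension.

Purpose: repeats A along each dimension. The number of dimensions of the result is max(d, A.ndim), where d is the length of reps, and each entry of the result's shape is the product of the corresponding entries of A's (expanded) shape and reps.

ndarray.ndim

The number of dimensions of the array, that is, the number of axes (sometimes called its rank).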
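
Applied to the line above, np.tile makes one copy of the image center per pixel. A small sketch with made-up sizes:

```python
import numpy as np

# A stand-in for raw_image_center: one mean color with 3 channels.
center = np.array([10.0, 20.0, 30.0])
h, w = 2, 3

# center[None, :] has shape (1, 3); repeating it h * w times along axis 0
# and once along axis 1 yields one row per pixel.
start = np.tile(center[None, :], [h * w, 1])
print(start.shape, start.ndim)
```

Every row of start is the same point g, so all rays cast later share one origin.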

direction = flattened_raw_image - start
print('Begin ray intersecting ...')

direction: the vector from the center g to each pixel's color, used as the ray direction.

index_tri, index_ray, locations = intersector.intersects_id(start, direction, return_locations=True, multiple_hits=True)
print('Intersecting finished.')

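intersects_id: finds the triangles hit by a list of rays, optionally including multiple hits along a single ray.
- Parameters
  - ray_origins: origins of the rays
  - ray_directions: directions of the rays
  - multiple_hits: if True, return every hit along each ray; if False, return only the first hit
  - return_locations: whether to return the hit locations
- Returns
  - index_tri: indices into mesh.faces
  - index_ray: indices of the rays
  - locations: intersection points, returned only when return_locations is set
Reference: trimesh official documentation https://trimsh.org/trimesh.ray.ray_pyembree.html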

intersections = np.zeros(shape=(h * w, c), dtype=np.float32)
intersection_count = np.zeros(shape=(h * w, 1), dtype=np.float32)
CI = index_ray.shape[0]
for c in range(CI):
    i = index_ray[c]
    intersection_count[i] += 1
    intersections[i] += locations[c]
intersections = (intersections + 1e-10) / (intersection_count + 1e-10)
intersections = intersections.reshape((h, w, 3))
intersection_count = intersection_count.reshape((h, w))
intersections[intersection_count < 1] = raw_image[intersection_count < 1]
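
The per-ray averaging loop above can also be written without a Python loop. This is a possible vectorized sketch using np.add.at, shown on a made-up ray-query result; it is an alternative formulation, not the article's code.

```python
import numpy as np

num_rays, channels = 4, 3
# Hypothetical ray-query output: ray 0 was hit twice, ray 2 once, rays 1 and 3 never.
index_ray = np.array([0, 2, 0])
locations = np.array([[1.0, 1.0, 1.0], [5.0, 5.0, 5.0], [3.0, 3.0, 3.0]])

sums = np.zeros((num_rays, channels), dtype=np.float64)
counts = np.zeros((num_rays, 1), dtype=np.float64)
np.add.at(sums, index_ray, locations)  # unbuffered add, so repeated indices accumulate
np.add.at(counts, index_ray, 1.0)

mean_hit = (sums + 1e-10) / (counts + 1e-10)
print(counts[:, 0], mean_hit[0], mean_hit[2])
```

A plain `sums[index_ray] += locations` would count a repeated index only once, which is why np.add.at is used here. Rays with no hit still need the fallback to the pixel's own color, as in the original code.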

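intersections: the intersection (hit) points, averaged per ray; pixels whose ray hit nothing fall back to their own color.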

intersection_distance = np.sqrt(np.sum(np.square(intersections - raw_image_center[None, None, :]), axis=2, keepdims=True))

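Intersection (hit point) distance: $\sqrt{\Sigma (h_p - g)^2}$, where $h_p$ is the intersection point for pixel $p$ and $g$ is the image center.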

pixel_distance = np.sqrt(np.sum(np.square(raw_image - raw_image_center[None, None, :]), axis=2, keepdims=True))

Corresponding formula: pixel distance = $\sqrt{\Sigma (c_p - g)^2}$, where $c_p$ is the color of pixel $p$ and $g$ is the image center.

stroke_density = ((1.0 - np.abs(1.0 - pixel_distance / intersection_distance)) * stroke_density_clipping).clip(0, 1) * 255

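Corresponding formula: $\left(1 - \left| 1 - \frac{\sqrt{\Sigma (c_p - g)^2}}{\sqrt{\Sigma (h_p - g)^2}} \right|\right) \times \text{stroke density clipping}$, clipped to $[0, 1]$ and then scaled by 255. Here $c_p$ is the pixel color, $h_p$ the intersection point on the hull, and $g$ the image center.

np.clip: limits the values of an array to a given minimum and maximum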
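
The stroke density formula can be tried on a few hand-picked distances. The wrapper function `stroke_density_map` and the sample values are made up; the arithmetic is the same as in the line of code above.

```python
import numpy as np

def stroke_density_map(pixel_distance, intersection_distance, clipping):
    """Density is highest (255) where a pixel's color lies on the hull surface."""
    ratio = pixel_distance / intersection_distance
    density = (1.0 - np.abs(1.0 - ratio)) * clipping
    return density.clip(0, 1) * 255

# Three pixels whose rays all hit the hull at distance 100 from the center g.
pixel_distance = np.array([0.0, 50.0, 100.0])
intersection_distance = np.array([100.0, 100.0, 100.0])
print(stroke_density_map(pixel_distance, intersection_distance, clipping=1.0))
print(stroke_density_map(pixel_distance, intersection_distance, clipping=2.0))
```

A pixel at the center gets density 0 and a pixel on the hull gets 255. Raising stroke_density_clipping above 1 saturates more of the in-between pixels at 255.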

6. Improving the quality of the stroke density map

# A trick to improve the quality of the stroke density map.
# It uses guided filter to remove some possible artifacts.
# You can remove these codes if you like sharper effects.
guided_filter = createGuidedFilter(pixel_distance.clip(0, 255).astype(np.uint8), 1, 0.01)
for _ in range(4):
    stroke_density = guided_filter.filter(stroke_density)

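Using the guided filter
createGuidedFilter(guide, radius, eps)
- Parameters
  - guide: the guidance image, with up to three channels
  - radius: radius of the guided filter
  - eps: regularization term of the guided filter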

7. Producing the stroke density estimate and the lighting effects

# Visualize the estimated stroke density.
cv2.imwrite('stroke_density.png', stroke_density.clip(0, 255).astype(np.uint8))

# Then generate the lighting effects
raw_image = unmasked_image.copy()
lighting_effect = np.stack([
   generate_lighting_effects(stroke_density, raw_image[:, :, 0]),
   generate_lighting_effects(stroke_density, raw_image[:, :, 1]),
   generate_lighting_effects(stroke_density, raw_image[:, :, 2])
], axis=2)

update_mouse

    def update_mouse(event, x, y, flags, param):
        global gx
        global gy
        gx = - float(x % w) / float(w) * 2.0 + 1.0
        gy = - float(y % h) / float(h) * 2.0 + 1.0
        return

    light_source_color = np.array([light_color_blue, light_color_green, light_color_red])

    global gx
    global gy

    while True:
        light_source_location = np.array([[[light_source_height, gy, gx]]], dtype=np.float32)
        light_source_direction = light_source_location / np.sqrt(np.sum(np.square(light_source_location)))
        final_effect = np.sum(lighting_effect * light_source_direction, axis=3).clip(0, 1)
        if not enabling_multiple_channel_effects:
            final_effect = np.mean(final_effect, axis=2, keepdims=True)
        rendered_image = (ambient_intensity + final_effect * light_intensity) * light_source_color * raw_image
        rendered_image = ((rendered_image / 255.0) ** gamma_correction) * 255.0
        canvas = np.concatenate([raw_image, rendered_image], axis=1).clip(0, 255).astype(np.uint8)
        cv2.imshow('Move your mouse on the canvas to play!', canvas)
        cv2.setMouseCallback('Move your mouse on the canvas to play!', update_mouse)
        cv2.waitKey(10)

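$\text{light direction} = \frac{\text{light location}}{\sqrt{\Sigma\, \text{light location}^2}}$
$\text{final effect} = \mathrm{clip}\left(\Sigma(\text{lighting effect} \cdot \text{light direction}),\ 0,\ 1\right)$

If multiple channel effects are not enabled:

final effect = mean of the final effect over the color channels

$\text{rendered image} = (\text{ambient intensity} + \text{final effect} \cdot \text{light intensity}) \cdot \text{light color} \cdot \text{source image}$

The rendered image is then $\gamma$-corrected.

canvas = the source image and the rendered image placed side by side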
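
The formulas above can be traced for a single frame without opening a window. In this sketch the image, the lighting_effect array, the mouse position (gx, gy), and all parameter values are made-up stand-ins; only the arithmetic follows the loop in run.

```python
import numpy as np

h, w = 2, 2
rng = np.random.default_rng(0)
raw_image = rng.uniform(0, 255, size=(h, w, 3)).astype(np.float32)
# Stand-in for lighting_effect: per pixel and color channel, a (density, rows, cols) vector.
lighting_effect = rng.uniform(-1, 1, size=(h, w, 3, 3)).astype(np.float32)

# Hypothetical settings; in the article these are arguments of run() or come from the mouse.
light_source_height, gy, gx = 1.0, 0.5, -0.5
ambient_intensity, light_intensity, gamma_correction = 0.45, 0.85, 1.0
light_source_color = np.array([1.0, 1.0, 1.0])

location = np.array([[[light_source_height, gy, gx]]], dtype=np.float32)
direction = location / np.sqrt(np.sum(np.square(location)))   # unit vector
final_effect = np.sum(lighting_effect * direction, axis=3).clip(0, 1)
final_effect = np.mean(final_effect, axis=2, keepdims=True)   # single-channel mode
rendered = (ambient_intensity + final_effect * light_intensity) * light_source_color * raw_image
rendered = ((rendered / 255.0) ** gamma_correction) * 255.0
canvas = np.concatenate([raw_image, rendered], axis=1).clip(0, 255).astype(np.uint8)
print(canvas.shape)
```

The canvas is twice as wide as the input because the source and the relit image sit next to each other along axis 1.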

Source: https://www.cnblogs.com/tow1/p/16546451.html