【Bias 05】Representation Learning with Statistical Independence to Mitigate Bias

2020-11-26 11:01:33 Author: Internet

Abstract

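Bias is one of the main problems facing machine learning today. It ranges from spurious associations between variables in medical studies to racial bias in gender or face recognition systems. Controlling for every bias at the data preprocessing stage is cumbersome, if not impossible, so we need models that learn fair features from the data that already exists.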

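The paper builds its model on adversarial training and learns features with two competing objectives: (1) maximize discriminative power for the target task; (2) minimize the statistical mean dependence on the bias.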

Specifically, it introduces a new adversarial loss function that encourages removing the correlation between the bias and the learned features.
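The two competing objectives can be written schematically as follows. This is a sketch under assumed notation, not the paper's exact formulation: FE (feature extractor), C (task classifier), BP (bias predictor), the task loss L_task, and the trade-off weight \lambda are names introduced here for illustration, and the squared Pearson correlation stands in for the adversarial loss the notes describe.

```latex
% Assumed notation (not taken from the notes):
%   x = input, y = task label, b = protected (bias) variable,
%   FE = feature extractor, C = task classifier, BP = bias predictor,
%   \lambda > 0 = trade-off weight.
\begin{aligned}
\hat{y} &= \mathrm{C}(\mathrm{FE}(x)), \qquad \hat{b} = \mathrm{BP}(\mathrm{FE}(x)) \\
\text{bias predictor:} \quad & \min_{\mathrm{BP}} \; -\operatorname{corr}^2(b, \hat{b}) \\
\text{features and classifier:} \quad & \min_{\mathrm{FE},\,\mathrm{C}} \; L_{\text{task}}(y, \hat{y}) + \lambda \, \operatorname{corr}^2(b, \hat{b})
\end{aligned}
```

The bias predictor tries to make its prediction correlate with the true bias, while the feature extractor is penalized whenever it succeeds. This is what makes the two objectives compete.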

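The authors run experiments on synthetic data, medical images (task bias), and gender classification (dataset bias). The results show that features learned with this method perform better while removing the bias.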

Introduction

Bias: one or a set of extraneous protected variables that distort the relationship between the input (independent) and output (dependent) variables.
Protected variables: variables that define the bias.
Statistical mean independence: adversarial minimization of the linear correlation can remove non-linear association between the learned representations and protected variables, thus achieving statistical mean independence.
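To see why mean independence is a stronger requirement than zero linear correlation, recall the standard definition: f is mean-independent of b when E[f | b] = E[f]. The sketch below uses hypothetical toy data (a symmetric b and f = b squared, neither taken from the paper) where the Pearson correlation is zero, yet the conditional mean of f clearly changes with b.

```python
from statistics import mean

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def conditional_means(b, f):
    """Mean of f within each distinct value of b, i.e. an estimate of E[f | b]."""
    groups = {}
    for b_i, f_i in zip(b, f):
        groups.setdefault(b_i, []).append(f_i)
    return {value: mean(items) for value, items in sorted(groups.items())}

# Hypothetical toy data: b is symmetric around 0 and f depends on b only through b**2.
b = [-2, -1, 0, 1, 2] * 20
f = [b_i ** 2 for b_i in b]

r = pearson(b, f)                 # linear correlation between bias and feature
cond = conditional_means(b, f)    # estimate of E[f | b] for each value of b

print("Pearson correlation:", r)
print("E[f | b]:", cond)
print("E[f]:", mean(f))
```

In this toy case the linear correlation vanishes because the positive and negative halves of b cancel, while E[f | b] still varies with b, so f is uncorrelated with b but not mean-independent of it.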

1. The paper divides bias into two kinds: dataset bias and task bias.

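Dataset bias usually shows up as a lack of sufficient data. For example, a model that predicts gender from faces may perform differently across ethnic groups because the amount of training data differs between those groups.
Task bias shows up, for example, in neuroimaging applications, where demographic variables such as sex and age affect both the model's input (the neuroimages) and its output (the diagnosis).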

2. CNNs are commonly used to extract image features. Like other machine learning methods, CNNs tend to capture bias when left uncontrolled.

3. Recent work focuses on:

causal effects of bias on databases
learning fair models with de-biased representations based on developments in invariant feature learning
learning fair models with de-biased representations based on developments in domain adversarial learning

4. In this paper, we propose a representation learning scheme that learns features with minimal bias. The method is inspired by domain-adversarial training approaches [20] with controllable invariance [55] within the context of GANs [22].

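We build an adversarial loss function based on the Pearson correlation between the true and predicted values of the bias.
We prove theoretically that adversarial minimization of the linear correlation can remove non-linear association between the features and the bias, achieving statistical mean independence.
Our framework is similar to adversarial invariant feature learning works.
We evaluate on Magnetic Resonance Images (MRIs) and on the Gender Shades Pilot Parliaments Benchmark (GS-PPB) dataset.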
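A correlation-based adversarial loss of the kind described above could look roughly like this. It is a minimal sketch, not the authors' implementation: the function names, the `eps` guard, the weight `lam`, and the sign convention (the bias predictor minimizes the negative squared correlation, the feature extractor is penalized by the positive one) are all assumptions made for illustration.

```python
def squared_pearson(b_true, b_pred, eps=1e-12):
    """Squared Pearson correlation between true and predicted bias values.

    eps is an assumed guard so that a constant prediction yields 0 instead of 0/0.
    """
    n = len(b_true)
    mean_t = sum(b_true) / n
    mean_p = sum(b_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(b_true, b_pred))
    var_t = sum((t - mean_t) ** 2 for t in b_true)
    var_p = sum((p - mean_p) ** 2 for p in b_pred)
    return cov * cov / (var_t * var_p + eps)

def bias_predictor_loss(b_true, b_pred):
    """Adversary's loss (assumed form): lower when predictions correlate with the bias."""
    return -squared_pearson(b_true, b_pred)

def encoder_penalty(b_true, b_pred, lam=1.0):
    """Penalty added to the task loss (assumed form): lower when the bias is unpredictable."""
    return lam * squared_pearson(b_true, b_pred)

# Hypothetical continuous protected variable (e.g. age) and two sets of predictions.
ages = [20.0, 30.0, 40.0, 50.0]
informative = [2.1, 2.9, 4.2, 5.0]      # tracks age closely
uninformative = [1.0, -1.0, -1.0, 1.0]  # carries no linear information about age

print("adversary loss, informative:  ", bias_predictor_loss(ages, informative))
print("encoder penalty, informative: ", encoder_penalty(ages, informative))
print("encoder penalty, uninformative:", encoder_penalty(ages, uninformative))
```

Because correlation is invariant to the scale and offset of the prediction, a loss of this shape applies directly to continuous protected variables such as age, without first binning them into classes.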

Related Work

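1. Bias in machine learning. Recent approaches to this problem either (1) build fairer datasets, or (2) learn fair features from existing data by verifying whether the features predict the true outcome. However, these methods cannot be applied to continuous variables.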

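2. Domain-Adversarial Training: [20] uses adversarial training for domain adaptation, using the learned features to predict the domain label (a binary variable: source or target). Other methods modify the loss function, the domain discriminator settings, or self-consistency. These approaches aim to close the domain gap (encoded as a binary variable).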
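One common way to realize this kind of training is a gradient reversal step: the domain discriminator descends on its loss, while the feature extractor receives the same gradient with its sign flipped. The notes do not say how [20] implements it, so the scalar toy below is only an illustrative sketch; the one-weight encoder `w`, one-weight discriminator `v`, learning rate, and reversal strength `lam` are all hypothetical, and the task loss is left out to keep the example short.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def domain_loss(w, v, x, d):
    """Binary cross-entropy of the domain prediction sigmoid(v * (w * x)) against label d."""
    p = sigmoid(v * (w * x))
    return -(d * math.log(p) + (1 - d) * math.log(1 - p))

def adversarial_step(w, v, x, d, lr=0.1, lam=1.0):
    """One update of encoder weight w and discriminator weight v with gradient reversal."""
    feature = w * x
    p = sigmoid(v * feature)
    grad_logit = p - d                  # d(loss)/d(logit) for sigmoid + cross-entropy
    grad_v = grad_logit * feature       # discriminator gradient (used as-is)
    grad_feature = grad_logit * v       # gradient arriving at the feature
    grad_w = (-lam * grad_feature) * x  # reversal: flip the sign, scale by lam
    return w - lr * grad_w, v - lr * grad_v

# Hypothetical single example from the "target" domain (d = 1).
w0, v0, x, d = 0.5, 1.0, 1.0, 1
w1, v1 = adversarial_step(w0, v0, x, d)

loss_start = domain_loss(w0, v0, x, d)
loss_disc = domain_loss(w0, v1, x, d)  # only the discriminator updated
loss_enc = domain_loss(w1, v0, x, d)   # only the encoder updated

print("start:", loss_start)
print("after discriminator step:", loss_disc)
print("after encoder step:", loss_enc)
```

The discriminator update lowers the domain loss and the encoder update raises it, which is the tug-of-war that pushes the features toward being uninformative about the domain.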

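3. Invariant Representation Learning: these methods aim to learn a representation that is invariant to specific factors of the data (for example, Bias 04 obtains a relatively stable representation by disentangling place features from appearance features). For example, [58] uses an information obfuscation approach that obfuscates the associations of biased data during training; [6, 40] introduce a regularization method; [55] proposes a domain-adversarial training strategy for learning invariant features. [43, 52] implement the adversarial technique with loss functions similar to those used in domain adaptation, predicting the exact bias variables: [52] uses binary cross-entropy to remove the effect of gender, and [43] uses linear and kernelized least-squares predictors as the adversarial component.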

Source: https://blog.csdn.net/qq_40731332/article/details/110132677