
[GCN-RS] Hard Negative Sampling: MixGCF (KDD '21), Simplify and Robustify Negative Sampling (NIPS '20)


Hard Negatives and False Negatives in RS

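Negative sampling should favor informative hard negatives. In RS scenarios, however, an unlabelled sample is not equivalent to a negative sample: it may simply be an item that has not yet been exposed to the user. A hard negative may therefore actually be a false negative, so oversampling hard negatives risks introducing false negatives.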

Simplify and Robustify Negative Sampling for Implicit Collaborative Filtering (NIPS 2020)

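"Simplify and Robustify Negative Sampling" observes that although false negatives and hard negatives both receive large scores, false negatives have lower prediction variance. It therefore proposes the Simplify and Robustify Negative Sampling method: at training epoch $t$, using the training records of the previous 5 epochs, the sample with a high predicted score and high variance is chosen as the hard negative: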
$$
\begin{aligned}
j &= \arg\max_{k \in \mathcal{M}_{u}} P_{\mathrm{pos}}(k \mid u, i) + \alpha_{t} \cdot \operatorname{std}\left[P_{\mathrm{pos}}(k \mid u, i)\right] \\
P(j \mid u, i) &= \operatorname{sigmoid}\left(r_{ui} - r_{uj}\right) \\
\operatorname{std}\left[P_{\mathrm{pos}}(k \mid u, i)\right] &= \sqrt{\sum_{s=t-5}^{t-1} \left(\left[P_{\mathrm{pos}}(k \mid u, i)\right]_{s} - \operatorname{Mean}\left[P_{\mathrm{pos}}(k \mid u, i)\right]\right)^{2} / 5} \\
\operatorname{Mean}\left[P_{\mathrm{pos}}(k \mid u, i)\right] &= \sum_{s=t-5}^{t-1} \left[P_{\mathrm{pos}}(k \mid u, i)\right]_{s} / 5
\end{aligned}
$$
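The selection rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's implementation: the function name `select_hard_negative`, the dictionary layout, and the toy numbers are all assumptions made for the example, and the $P_{\mathrm{pos}}$ values are taken as given inputs rather than computed from a model.

```python
import math


def select_hard_negative(current, history, alpha):
    """Return the candidate k maximizing P_pos(k) + alpha * std[P_pos(k)].

    current: dict, candidate id -> P_pos at the current epoch t (hypothetical input).
    history: dict, candidate id -> list of P_pos values from the previous epochs
             (the post uses the last 5).
    alpha:   weight alpha_t on the standard-deviation term.
    """
    best_item, best_value = None, -math.inf
    for item, past in history.items():
        mean = sum(past) / len(past)
        # Standard deviation over the recorded epochs, dividing by their count
        # (5 in the formula above).
        std = math.sqrt(sum((p - mean) ** 2 for p in past) / len(past))
        value = current[item] + alpha * std
        if value > best_value:
            best_item, best_value = item, value
    return best_item


# Toy candidate memory M_u for one (u, i) pair; all numbers are made up.
history = {
    "a": [0.90, 0.91, 0.90, 0.89, 0.90],  # high score, stable: possible false negative
    "b": [0.50, 0.90, 0.40, 0.95, 0.60],  # high score, high variance
    "c": [0.10, 0.12, 0.11, 0.10, 0.12],  # easy negative
}
current = {"a": 0.90, "b": 0.80, "c": 0.11}

print(select_hard_negative(current, history, alpha=0.0))
print(select_hard_negative(current, history, alpha=1.0))
```

With `alpha=0.0` the rule reduces to picking the highest-scored candidate, which here is the stable item `"a"`. With `alpha=1.0` the variance term moves the choice to `"b"`, whose score fluctuates across epochs.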

Some Hard Negative Sampling Methods



MixGCF: An Improved Training Method for Graph Neural Network-based Recommender Systems (KDD 2021)

MixGCF borrows the idea of mixup and, based on the characteristics of GCNs, introduces hop mixing.


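Here $e_{v_x}^{l}$ denotes the layer-$l$ representations of several hard negatives obtained by DNS. Positive mixing blends part of the positive item's representation into each negative representation, thereby artificially constructing hard negatives; this yields $(l+1) \times m$ representations. In hop mixing, the hardest of the $m$ representations is selected at each layer: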
$$
\mathrm{e}_{v_{x}}^{\prime(l)} = \underset{\mathrm{e}_{v_{m}}^{\prime(l)} \in \mathcal{E}^{(l)}}{\arg\max}\; f_{\mathrm{Q}}(u, l) \cdot \mathrm{e}_{v_{m}}^{\prime(l)}
$$
$f_{\mathrm{Q}}(u, l)$ is the representation of user $u$ at layer $l$. Finally, the negative representations selected at each layer are pooled together.
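The two steps described above, positive mixing followed by per-layer selection and pooling, can be sketched as follows. This is a rough sketch under stated assumptions, not the authors' code: the helper names (`dot`, `positive_mixing`, `hop_mixing`) are hypothetical, the mixing coefficient is assumed to be drawn uniformly from $(0, 1)$ per layer in the spirit of mixup, and sum pooling is assumed because the text does not say which pooling is used.

```python
import random


def dot(x, y):
    """Inner product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))


def positive_mixing(pos_layers, neg_layers, rng):
    """Blend the positive item's embedding into one candidate negative, layer by layer.

    pos_layers, neg_layers: lists of per-layer embeddings (each a list of floats).
    Assumption: e' = a * e_pos + (1 - a) * e_neg with a ~ Uniform(0, 1) per layer.
    """
    mixed = []
    for e_pos, e_neg in zip(pos_layers, neg_layers):
        a = rng.random()
        mixed.append([a * p + (1 - a) * n for p, n in zip(e_pos, e_neg)])
    return mixed


def hop_mixing(user_layers, mixed_candidates):
    """At each layer pick the candidate with the largest inner product with the
    user's layer-l representation, then pool the picks (sum pooling assumed)."""
    chosen = []
    for l, query in enumerate(user_layers):
        hardest = max(mixed_candidates, key=lambda cand: dot(query, cand[l]))
        chosen.append(hardest[l])
    dim = len(chosen[0])
    return [sum(e[d] for e in chosen) for d in range(dim)]


# Toy example: 2 layers, 2-dimensional embeddings, 3 candidate negatives.
rng = random.Random(0)
user = [[1.0, 0.0], [0.0, 1.0]]
positive = [[0.9, 0.1], [0.2, 0.8]]
negatives = [
    [[0.1, 0.5], [0.4, 0.1]],
    [[0.3, 0.3], [0.1, 0.6]],
    [[0.7, 0.0], [0.5, 0.2]],
]
mixed = [positive_mixing(positive, neg, rng) for neg in negatives]
synthetic_negative = hop_mixing(user, mixed)
print(synthetic_negative)
```

Because a different candidate may win at each layer, the pooled vector generally does not correspond to any single real item; it is a synthetic negative assembled from the hardest per-layer pieces.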


Source: https://blog.csdn.net/yanguang1470/article/details/121412232