
Knowledge Tracing - Knowledge Tracing Machines: Factorization Machines for Knowledge Tracing


Notes on papers read in the knowledge tracing field.

What This Paper Studies

The paper applies factorization machines (FMs), which exploit pairwise feature interactions, to predict whether an exercise will be answered correctly. The core question is how to construct the features. Users, items, and skills are one-hot encodings of ID-type features (skills is a multi-hot vector), representing the user ID, the item ID, and the IDs of the skills the item involves. The Wins and Fails variables are the heart of the design: they count the correct and incorrect past attempts on each skill of the current item, accumulated only over the skills that the current item shares with the historical attempts (see the encoding sketch in the Features section below).

Another paper by the authors, still to read:
Deep Factorization Machines for Knowledge Tracing

Related Work

Limitations of BKT

It cannot model questions that involve multiple knowledge components.
A recent BKT variant, feature-aware student tracing (FAST), addresses this by modeling multiple knowledge components simultaneously.

DKT

Wilson, K. H.; Karklin, Y.; Han, B.; and Ekanadham, C. 2016a. Back to the basics: Bayesian extensions of IRT outperform neural networks for proficiency estimation. In Proceedings of the 9th International Conference on Educational Data Mining (EDM), 539–544.

This study shows that some factor analysis models can match the performance of DKT.

Factor Analysis

Factor analysis starts from an assumption: all the observed variables x arise because there is a latent variable f behind them, the factor, and it is under the influence of this factor that x can be observed.
For example, suppose a student scores full marks in math, chemistry, and physics. We would then say this student has strong analytical reasoning; analytical reasoning is a factor, and it is under its influence that the science scores are so high. That is factor analysis.

Item Response Theory

Representative model: the Rasch model

$$\operatorname{logit}\, p_{ij} = \theta_i - d_j$$

$\theta_i$: the ability of student i (the student bias)
$d_j$: the difficulty of question j (the question bias)
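
As a quick, hypothetical illustration (not from the paper), the Rasch probability can be computed directly; the example values for $\theta$ and $d$ are made up:

```python
import numpy as np

def rasch_prob(theta_i, d_j):
    """Rasch / 1-PL IRT: logit p = theta_i - d_j."""
    return 1.0 / (1.0 + np.exp(-(theta_i - d_j)))

# Made-up values: a fairly able student on a fairly easy question
print(rasch_prob(1.2, -0.5))  # ~0.85
```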

Wilson, K. H.; Xiong, X.; Khajah, M.; Lindsey, R. V.; Zhao, S.; Karklin, Y.; Van Inwegen, E. G.; Han, B.; Ekanadham, C.; Beck, J. E.; et al. 2016b. Estimating student proficiency: Deep learning is not the panacea. Presented at the Workshop on Machine Learning for Education, Neural Information Processing Systems.

This study shows that even without temporal features, IRT can outperform DKT, possibly because DKT has too many parameters and overfits easily.

Multidimensional Item Response Theory (MIRT)

$$\operatorname{logit}\, p_{ij} = \langle \theta_i, d_j \rangle + \delta_j$$

$\theta_i$: the multidimensional ability of student i
$d_j$: the multidimensional discrimination of item j
$\delta_j$: the easiness of item j (the item bias)
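
Similarly, a minimal MIRT sketch; the 2-dimensional ability and discrimination vectors are made-up example values:

```python
import numpy as np

def mirt_prob(theta_i, d_j, delta_j):
    """MIRT: logit p = <theta_i, d_j> + delta_j."""
    return 1.0 / (1.0 + np.exp(-(np.dot(theta_i, d_j) + delta_j)))

# Made-up 2-dimensional ability and discrimination vectors
print(mirt_prob(np.array([0.8, -0.2]), np.array([1.0, 0.5]), 0.3))  # ~0.73
```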

Additive factor model (AFM)

AFM takes into account the number of attempts a learner has made on an item:

$$\operatorname{logit}\, p_{ij} = \sum_k q_{jk}\,(\beta_k + \gamma_k N_{ik})$$

$\beta_k$: the bias for skill k
$\gamma_k$: the bias for each opportunity to learn skill k
$N_{ik}$: the number of times student i attempted a question that requires skill k

AFM papers to read:

Cen, H.; Koedinger, K.; and Junker, B. 2006. Learning factors analysis: a general method for cognitive model evaluation and improvement. In International Conference on Intelligent Tutoring Systems, 164–175. Springer.

Cen, H.; Koedinger, K.; and Junker, B. 2008. Comparing two IRT models for conjunctive skills. In International Conference on Intelligent Tutoring Systems, 796–798. Springer.

Performance factor analysis model (PFA)

PFA counts positive and negative attempts separately:

$$\operatorname{logit}\, p_{ij} = \sum_k q_{jk}\,(\beta_k + \gamma_k W_{ik} + \delta_k F_{ik})$$

$\beta_k$: the bias for skill k
$\gamma_k$ ($\delta_k$): the bias for each opportunity to learn skill k after a successful (unsuccessful) attempt
$W_{ik}$ ($F_{ik}$): the number of successes (failures) of student i on questions that require skill k
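
A minimal sketch of the PFA predictor under the definitions above (my reconstruction, not the authors' code); the q-matrix row, counters, and all parameter values are hypothetical. Setting $\gamma_k = \delta_k$ recovers AFM:

```python
import numpy as np

def pfa_prob(q_j, beta, gamma, delta, W_i, F_i):
    """PFA: logit p = sum_k q_jk * (beta_k + gamma_k * W_ik + delta_k * F_ik).
    With gamma == delta this reduces to AFM, since N_ik = W_ik + F_ik."""
    logit = np.sum(q_j * (beta + gamma * W_i + delta * F_i))
    return 1.0 / (1.0 + np.exp(-logit))

# Made-up example with s = 3 skills; the question requires skills 0 and 2
q_j   = np.array([1., 0., 1.])      # q-matrix row of question j
beta  = np.array([0.2, -0.1, 0.4])  # skill biases
gamma = np.array([0.3, 0.3, 0.2])   # learning gain after successes
delta = np.array([0.1, 0.1, 0.05])  # learning gain after failures
W_i   = np.array([2., 0., 1.])      # past successes of student i per skill
F_i   = np.array([1., 0., 0.])      # past failures of student i per skill
print(pfa_prob(q_j, beta, gamma, delta, W_i, F_i))
```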

PFA paper to read:

Pavlik, P. I.; Cen, H.; and Koedinger, K. R. 2009. Performance factors analysis: a new alternative to knowledge tracing. In Proceedings of the 2009 Conference on Artificial Intelligence in Education: Building Learning Systems that Care: From Knowledge Representation to Affective Modelling, 531–538. IOS Press.

In fact, AFM is a special case of PFA, namely the case $\gamma_k = \delta_k$.

Wilson, K. H.; Xiong, X.; Khajah, M.; Lindsey, R. V.; Zhao, S.; Karklin, Y.; Van Inwegen, E. G.; Han, B.; Ekanadham, C.; Beck, J. E.; et al. 2016b. Estimating student proficiency: Deep learning is not the panacea. Presented at the Workshop on Machine Learning for Education, Neural Information Processing Systems.

Xiong, X.; Zhao, S.; Inwegen, E. V.; and Beck, J. 2016. Going deeper with deep knowledge tracing. In Proceedings of the 9th International Conference on Educational Data Mining (EDM), 545–550.

The studies above show that DKT and PFA perform comparably.

Factorization Machines

Thai-Nghe, N.; Drumond, L.; Horváth, T.; and Schmidt-Thieme, L. 2012. Using factorization machines for student modeling. In Proceedings of FactMod 2012 at the 20th Conference on User Modeling, Adaptation, and Personalization (UMAP 2012).

Sweeney, M.; Lester, J.; Rangwala, H.; and Johri, A. 2016. Next-term student performance prediction: A recommender systems approach. Journal of Educational Data Mining 8(1):22–51.

The two papers above use FMs for regression problems in student modeling.

This paper uses FMs for the classification problem in student modeling.

Knowledge Tracing Machines

KTMs model the binary outcome of an event (correct or incorrect) from a sparse set of weights over all the features involved in that event. The features involved in an event are encoded in a sparse vector $x$ of length N such that $x_i > 0$ only if feature $1 \le i \le N$ is involved in the event. For each event involving $x$, the probability $p(x)$ of observing a positive outcome verifies:

$$\psi(p(x)) = \mu + \sum_{i=1}^{N} w_i x_i + \sum_{1 \le i < j \le N} x_i x_j \langle v_i, v_j \rangle$$

$\mu$: a global bias
$w$: the vector of biases $(w_1, \ldots, w_N)$
$V$: the matrix of embeddings $v_i$, $i = 1, \ldots, N$

Each feature i is modeled by a bias $w_i \in \mathbb{R}$ and an embedding $v_i \in \mathbb{R}^d$ for some dimension d.
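
A minimal sketch of this prediction rule (assumed notation, not the authors' code), using the standard FM identity that rewrites the pairwise sum in O(Nd) time:

```python
import numpy as np

def ktm_predict(x, mu, w, V):
    """psi(p(x)) = mu + <w, x> + sum_{i<j} x_i x_j <v_i, v_j>, with psi = logit.
    Uses sum_{i<j} x_i x_j <v_i, v_j>
      = 0.5 * sum_f [ (sum_i V_if x_i)^2 - sum_i V_if^2 x_i^2 ]."""
    linear = mu + w @ x
    Vx = V.T @ x                                      # shape (d,)
    pair = 0.5 * (Vx @ Vx - ((V**2).T @ (x**2)).sum())
    return 1.0 / (1.0 + np.exp(-(linear + pair)))

# Made-up toy sizes: N = 6 features, d = 2 factors
rng = np.random.default_rng(0)
x  = np.array([1., 0., 1., 0., 2., 0.])  # sparse event encoding
mu = 0.1
w  = rng.normal(size=6)                  # feature biases
V  = rng.normal(scale=0.1, size=(6, 2))  # feature embeddings
print(ktm_predict(x, mu, w, V))
```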

Features

The data are encoded with the features above (feature length N = m + n + 3s):
[Image: example encoding of user, item, skill, win, and fail features]
An example: in round 1, User2 answers item2 with outcome 1, earning credit on skills 1 and 2.
In round 2, User2 answers item2 with outcome 0; the previous attempt makes Wins1 and Wins2 equal 1 in this round.
In round 3, User2 answers item2 with outcome 1; the previous attempts make Wins1 and Wins2 equal 1 and Fails1 and Fails2 equal 1 in this round.
The encoding continues in this way, as sketched below.
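
A minimal sketch of how such an encoding could be produced (my reconstruction, not the authors' code); the toy q-matrix and event log mirror the three rounds above:

```python
import numpy as np

n_users, n_items, n_skills = 2, 2, 2   # toy sizes matching the example
q = {2: [1, 2]}                        # q-matrix: item2 requires skills 1 and 2

def encode(user, item, wins, fails):
    """One row of the N = m + n + 3s encoding: user one-hot | item one-hot |
    skill multi-hot | per-skill win counters | per-skill fail counters."""
    x = np.zeros(n_users + n_items + 3 * n_skills)
    x[user - 1] = 1
    x[n_users + item - 1] = 1
    for k in q[item]:
        x[n_users + n_items + k - 1] = 1                         # skill
        x[n_users + n_items + n_skills + k - 1] = wins[k]        # Wins
        x[n_users + n_items + 2 * n_skills + k - 1] = fails[k]   # Fails
    return x

wins, fails = {1: 0, 2: 0}, {1: 0, 2: 0}
for outcome in [1, 0, 1]:              # the three rounds of User2 on item2
    print(encode(2, 2, wins, fails), "-> outcome", outcome)
    for k in q[2]:                     # update counters only on the item's skills
        wins[k] += outcome
        fails[k] += 1 - outcome
```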

Relation to Existing Models

With $\psi$ = logit and d = 0, only biases are learned for the features, no embeddings.

If the n student features have bias $w_i = \theta_i - \mu$ and the m question features have bias $-d_j$, the KTM becomes the 1-PL IRT model, i.e. the Rasch model. In that case $w = (\theta_1 - \mu, \ldots, \theta_n - \mu, -d_1, \ldots, -d_m)$.

If $w = (\beta_1, \ldots, \beta_s, \gamma_1, \ldots, \gamma_s, \delta_1, \ldots, \delta_s)$ and the encoding of "student i attempted question j" is given by $x = (q_{j1}, \ldots, q_{js}, q_{j1}W_{i1}, \ldots, q_{js}W_{is}, q_{j1}F_{i1}, \ldots, q_{js}F_{is})$, where $W_{ik}$ and $F_{ik}$ are the counters of successful and failed attempts at the skill level, then the KTM becomes the PFA model.
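
To make the reduction concrete, here is a small standalone check with hypothetical values that a d = 0 KTM over this encoding reproduces the PFA logit exactly:

```python
import numpy as np

# Made-up parameters for 3 skills
beta  = np.array([0.2, -0.1, 0.4])
gamma = np.array([0.3, 0.3, 0.2])
delta = np.array([0.1, 0.1, 0.05])
q_j = np.array([1., 0., 1.])             # skills required by question j
W_i = np.array([2., 0., 1.])             # successes of student i per skill
F_i = np.array([1., 0., 0.])             # failures of student i per skill

# KTM with d = 0: logit p = mu + <w, x>, taking mu = 0 here
w = np.concatenate([beta, gamma, delta])
x = np.concatenate([q_j, q_j * W_i, q_j * F_i])
ktm_logit = w @ x

# PFA directly: sum_k q_jk * (beta_k + gamma_k W_ik + delta_k F_ik)
pfa_logit = np.sum(q_j * (beta + gamma * W_i + delta * F_i))

print(np.isclose(ktm_logit, pfa_logit))  # True
```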

Training

KTMs are trained by minimizing the negative log-likelihood (NLL) over all S observed samples:

$$NLL(X, y) = -\sum_{i=1}^{S} \left[ y_i \log p(x_i) + (1 - y_i) \log \left(1 - p(x_i)\right) \right]$$

$X = (x_i)_{1 \le i \le S}$: the sample features
$y = (y_i)_{1 \le i \le S} \in \{0, 1\}^S$: the observed outcomes
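
Concretely, this NLL is the usual binary cross-entropy; a short sketch with made-up probabilities:

```python
import numpy as np

def nll(p, y):
    """NLL = -sum_i [ y_i * log p(x_i) + (1 - y_i) * log(1 - p(x_i)) ]."""
    p = np.clip(p, 1e-12, 1 - 1e-12)   # guard against log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Made-up predictions vs. outcomes: confident and correct -> small NLL
print(nll(np.array([0.9, 0.2]), np.array([1, 0])))  # ~0.33
```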
To guide training and avoid overfitting, priors are assumed on the model parameters:
bias $w_k$: $w_k \sim \mathcal{N}(\mu, 1/\lambda)$
embedding component $v_{kf}$, $f = 1, \ldots, d$: $v_{kf} \sim \mathcal{N}(\mu, 1/\lambda)$
$\mu$ and $\lambda$: regularization parameters, following the hyperpriors $\mu \sim \mathcal{N}(0, 1)$ and $\lambda \sim \Gamma(1, 1)$
Thanks to these hyperpriors, the regularization parameters do not need to be tuned by hand. When $\psi$ = probit, i.e. the inverse CDF of the standard normal distribution, the model can be fitted with Gibbs sampling.

The model is learned using the MCMC Gibbs sampler implementation of libFM in C++, through the pywFM Python wrapper.
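
A minimal usage sketch along the lines of pywFM's README; the toy data and hyperparameter values are hypothetical, and libFM must be installed with the LIBFM_PATH environment variable pointing at its binaries:

```python
import numpy as np
import pywFM  # wrapper around libFM; requires LIBFM_PATH to point at libFM's bin/

# Made-up toy data: rows are sparse event encodings, y the binary outcomes
X_train = np.array([[1., 0., 1., 0.], [0., 1., 1., 1.]])
y_train = np.array([1, 0])
X_test  = np.array([[1., 0., 1., 1.]])
y_test  = np.array([1])

# MCMC (Gibbs sampling) is pywFM's default learning method, as in the paper
fm = pywFM.FM(task='classification', num_iter=100, k2=8)
model = fm.run(X_train, y_train, X_test, y_test)
print(model.predictions)  # predicted probabilities for X_test
```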

KTMs also make it possible to visualize the learned embeddings.
[Image: visualization of the learned embeddings]

Datasets

[Image: datasets used in the experiments]

Experimental Results

[Image: experimental results]

Summary

This paper introduces KTMs, which subsume several classic EDM models and address knowledge tracing as a classification problem. Even when the observed data are sparse, KTMs can estimate user and item parameters and give better predictions than existing models.

Future Directions

Improve the encoding of the features in KTMs according to how the data were collected.

Code Implementation

Source: https://blog.csdn.net/CZYruobing/article/details/114188091