SNPsnap | Selecting the best-matched SNPs | Enrichment analysis
Author: Internet
One problem:
Enrichment analysis run on GWAS-derived SNPs is usually strongly biased.
The bias comes from co-localization of GWAS signals to gene-dense and high linkage disequilibrium (LD) regions, and from correlations of gene size, location and function.
SNPsnap: a Web-based tool for identification and annotation of matched SNPs
SNPsnap addresses this by providing matched sets of SNPs that can be used to calibrate background expectations.
Matching is based on: allele frequency, number of SNPs in LD, distance to nearest gene, and gene density.
SNPs similar to the input SNPs are selected according to these criteria:
- Minor allele frequency : we partitioned SNPs into minor allele frequency bins (using 1–2, 2–3, … , 49–50% strata).
- LD buddies: for each SNP, we counted the number of ‘buddy’ SNPs in LD at various thresholds (r² > 0.1, 0.2, …, 0.9) [using PLINK v.1.07 (Purcell et al., 2007) to compute LD].
- Distance to nearest gene: we computed the distance to the nearest 5′ start site using Ensembl gene coordinates (Flicek et al., 2014). If the SNP was within a gene, we used the distance to that gene’s start site.
- Gene density: we counted the number of genes in loci around the SNP, using LD (r² > 0.1, 0.2, …, 0.9) and physical distance (100, 200, …, 1000 kb) to define loci.
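The four criteria above can be sketched as a simple filter over a pool of annotated SNPs. This is only a minimal illustration, not SNPsnap's actual implementation: the `Snp` record, the `match_snps` function, the `max_dev` relative tolerance, the handling of bin boundaries, and all SNP IDs and annotation values are hypothetical. It assumes the annotations (MAF, LD buddy count, distance to nearest gene, gene density) have already been computed at one chosen threshold.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    snp_id: str
    maf: float          # minor allele frequency, 0 < maf <= 0.5
    ld_buddies: int     # SNPs in LD at one chosen r² threshold
    dist_to_gene: int   # bp to the nearest gene start site
    gene_density: int   # genes in the locus around the SNP


def maf_bin(maf):
    """Index of the 1-percentage-point MAF stratum (1 for 1-2%, 2 for 2-3%, ...).

    Boundary handling is an assumption; rounding guards against
    floating-point noise such as 0.29 * 100 == 28.999...
    """
    return min(int(round(maf * 100, 6)), 49)


def within(value, target, max_dev):
    """True if value lies within a relative deviation max_dev of target."""
    return abs(value - target) <= max_dev * max(target, 1)


def match_snps(query, pool, n, max_dev=0.5, seed=0):
    """Draw up to n SNPs from pool that resemble query on all four properties."""
    candidates = [
        s for s in pool
        if s.snp_id != query.snp_id
        and maf_bin(s.maf) == maf_bin(query.maf)
        and within(s.ld_buddies, query.ld_buddies, max_dev)
        and within(s.dist_to_gene, query.dist_to_gene, max_dev)
        and within(s.gene_density, query.gene_density, max_dev)
    ]
    rng = random.Random(seed)
    return rng.sample(candidates, min(n, len(candidates)))


# Toy data with made-up IDs and values, for illustration only.
query = Snp("rs_query", 0.125, 10, 20000, 4)
pool = [
    Snp("rs_a", 0.128, 12, 25000, 5),   # similar on every property
    Snp("rs_b", 0.30, 10, 20000, 4),    # different MAF stratum
    Snp("rs_c", 0.121, 40, 20000, 4),   # far too many LD buddies
    Snp("rs_d", 0.129, 9, 18000, 3),    # similar on every property
]
matched = match_snps(query, pool, n=5)
matched_ids = sorted(s.snp_id for s in matched)
print(matched_ids)
```

With a real SNP pool, a tighter `max_dev` gives closer matches but fewer candidates per input SNP, so the tolerance is a trade-off rather than a fixed value.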
Here we use this tool to select the T0 SNPs.
a) the number of T0 loci was set to be the same as that of the T1 loci (associated with a single trait);
b) the length distribution of T0 loci was set to be the same as that of the T1 loci;
c) the T0 loci should not include the ENCODE blacklist regions and human leukocyte antigen (HLA) regions; and
d) they should be randomly selected from autosomal regions.
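Conditions a) to d) can be sketched as rejection sampling: for each T1 locus, draw a random autosomal interval of the same length and retry if it overlaps an excluded region. This is a rough sketch under stated assumptions, not the procedure of any particular paper or tool. The function names, the toy chromosome sizes, and the excluded-region coordinates are all hypothetical placeholders; in practice they would come from a genome build's chromosome-size file, the ENCODE blacklist, and the HLA coordinates for that build.

```python
import random

AUTOSOMES = [f"chr{i}" for i in range(1, 23)]


def overlaps(chrom, start, end, regions):
    """True if the half-open interval [start, end) on chrom hits any region."""
    return any(c == chrom and start < e and s < end for c, s, e in regions)


def sample_t0_loci(t1_loci, chrom_sizes, excluded, seed=0, max_tries=10000):
    """Draw one random autosomal locus per T1 locus, with the same length."""
    rng = random.Random(seed)
    weights = [chrom_sizes[c] for c in AUTOSOMES]  # longer chromosomes drawn more often
    t0 = []
    for _, t1_start, t1_end in t1_loci:
        length = t1_end - t1_start
        for _ in range(max_tries):
            chrom = rng.choices(AUTOSOMES, weights=weights)[0]
            if chrom_sizes[chrom] < length:
                continue
            start = rng.randint(0, chrom_sizes[chrom] - length)
            if not overlaps(chrom, start, start + length, excluded):
                t0.append((chrom, start, start + length))
                break
        else:
            raise RuntimeError("no valid locus found; relax the constraints")
    return t0


# Toy inputs: placeholder sizes and coordinates, not real genome values.
chrom_sizes = {c: 1_000_000 for c in AUTOSOMES}
excluded = [
    ("chr6", 200_000, 400_000),   # stand-in for the HLA region
    ("chr1", 0, 10_000),          # stand-in for a blacklist region
]
t1_loci = [("chr1", 1_000, 6_000), ("chr6", 50_000, 52_000), ("chr12", 300, 1_300)]

t0_loci = sample_t0_loci(t1_loci, chrom_sizes, excluded)
print(t0_loci)
```

Copying each T1 length one-to-one guarantees conditions a) and b) exactly. Overlap between the sampled T0 loci themselves is not checked here, which may or may not matter for the analysis.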
To be continued.
Tags: enrichment, LD, SNPs, SNPsnap, loci, SNP, using, gene. Source: https://www.cnblogs.com/leezx/p/11822163.html