Converting C# to idiomatic R

2019-07-18 04:07:11 Author: Internet

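Originally, I was using a short C# program I wrote to average some numbers. Now I want to do more extensive analysis, so I converted my C# code to R. However, I don't think I'm doing it properly in R or taking advantage of the language: I wrote the R in exactly the same way as the C#.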

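I have a CSV with two columns. The first column identifies the row's type (one of three values: C, E, or P), and the second column holds a number. I want to average the numbers grouped by type (C, E, or P).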

My question is: what is the idiomatic way of doing this in R?

C# code:

        string path = "data.csv";
        string[] lines = File.ReadAllLines(path);

        int cntC = 0; int cntE = 0; int cntP = 0; //counts
        double totC = 0; double totE = 0; double totP = 0; //totals
        foreach (string line in lines)
        {
            String[] cells = line.Split(',');
            if (cells[1] == "NA") continue; //skip missing data

            if (cells[0] == "C") 
            {
                totC += Convert.ToDouble(cells[1]);
                cntC++;
            }
            else if (cells[0] == "E")
            {
                totE += Convert.ToDouble(cells[1]);
                cntE++;
            }
            else if (cells[0] == "P")
            {
                totP += Convert.ToDouble(cells[1]);
                cntP++;
            }
        }
        Console.WriteLine("C found " + cntC + " times with a total of " + totC + " and an average of " + totC / cntC);
        Console.WriteLine("E found " + cntE + " times with a total of " + totE + " and an average of " + totE / cntE);
        Console.WriteLine("P found " + cntP + " times with a total of " + totP + " and an average of " + totP / cntP);

R code:

dat = read.csv("data.csv", header = TRUE)

cntC = 0; cntE = 0; cntP = 0  # counts
totC = 0; totE = 0; totP = 0  # totals
for(i in 1:nrow(dat))
{
    if(is.na(dat[i,2])) # missing data
        next

    if(dat[i,1] == "C"){
        totC = totC + dat[i,2]
        cntC = cntC + 1
    }
    if(dat[i,1] == "E"){
        totE = totE + dat[i,2]
        cntE = cntE + 1
    }
    if(dat[i,1] == "P"){
        totP = totP + dat[i,2]
        cntP = cntP + 1
    }
}
sprintf("C found %d times with a total of %f and an average of %f", cntC, totC, (totC / cntC))
sprintf("E found %d times with a total of %f and an average of %f", cntE, totE, (totE / cntE))
sprintf("P found %d times with a total of %f and an average of %f", cntP, totP, (totP / cntP))

Solution:

I would do something like this:

dat = dat[complete.cases(dat),]  ## The R way to remove missing data
dat[,2] <- as.numeric(dat[,2])   ## convert to numeric as you do in c#
by(dat[,2],dat[,1],mean)         ## compute the mean by group
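
To make the three lines above concrete, here is a minimal sketch run on a small made-up data frame. The column names `type` and `value` and the numbers are assumptions for illustration, not the asker's data. It also uses base R's `tapply`, which is not part of the original answer, to recover the counts and totals that the C# program printed.

```r
# Hypothetical stand-in for read.csv("data.csv"); names and values are made up
dat <- data.frame(
  type  = c("C", "E", "P", "C", "E", "P", "C"),
  value = c(1, 2, 3, 3, NA, 5, 5)
)

# Drop rows with missing values, as in the answer
dat <- dat[complete.cases(dat), ]

# Group means as a "by" object
means_by <- by(dat[, 2], dat[, 1], mean)
print(means_by)

# The same grouping with tapply, which returns named vectors
counts <- tapply(dat$value, dat$type, length)
totals <- tapply(dat$value, dat$type, sum)
means  <- tapply(dat$value, dat$type, mean)
print(rbind(counts, totals, means))
```

No explicit loop or per-type counter is needed: the grouping column drives the split, so the same code would keep working if a fourth type appeared in the data.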

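Of course, to gather your result into a data.frame you can use the classic idiom below (assuming `result` holds the output of the `by` call), but I don't think it is necessary here, since it is just a list of 3 values: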

 do.call(rbind,result)

EDIT1

Another option here is to use the elegant ave:

ave(dat[,2],dat[,1])

But the result is different here, in the sense that you get a vector of the same length as your original data.
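
A small sketch of that difference, again on made-up data (the column names `type` and `value` are assumptions). By default `ave` applies `mean` within each group and repeats the group mean for every row; the last step, which collapses the repeats, is an addition for illustration only.

```r
# Hypothetical data; names and values are made up for illustration
dat <- data.frame(
  type  = c("C", "E", "C", "P", "E"),
  value = c(1, 2, 3, 4, 6)
)

# ave() returns one value per input row: the mean of that row's group
group_mean <- ave(dat[, 2], dat[, 1])
print(group_mean)

# One row per group can be recovered by dropping the repeated rows
per_group <- unique(data.frame(type = dat$type, mean = group_mean))
print(per_group)
```

This shape is convenient when the group mean is needed next to each observation, for example to centre every value on its group mean.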

EDIT2 To include more results, you can elaborate your anonymous function:

by(dat[,2],dat[,1],function(x) c(min(x),max(x),mean(x),sd(x)))

Or return a data.frame, which is better suited to an rbind call and carries column names:

by(dat[,2],dat[,1],function(x) 
            data.frame(min=min(x),max=max(x),mean=mean(x),sd=sd(x)))
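
As a sketch of how the data.frame-returning version might be combined with the `do.call(rbind, ...)` idiom mentioned earlier, the following runs on made-up data (the column names and values are assumptions, and the variable names `stats` and `stats_df` are hypothetical).

```r
# Hypothetical data; names and values are made up for illustration
dat <- data.frame(
  type  = c("C", "C", "E", "E", "P", "P"),
  value = c(1, 3, 2, 6, 4, 8)
)

# One single-row data.frame of statistics per group
stats <- by(dat[, 2], dat[, 1], function(x)
  data.frame(min = min(x), max = max(x), mean = mean(x), sd = sd(x)))

# Stack the per-group rows into a single data.frame, one row per group
stats_df <- do.call(rbind, stats)
print(stats_df)
```

Because every group yields a data.frame with the same column names, `rbind` can line the rows up without any further bookkeeping.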

Or use the elegant built-in function summary (you could also define your own):

by(dat[,2],dat[,1],summary)

Tags: c, r, idiomatic
Source: https://codeday.me/bug/20190718/1492600.html