一、 RDD创建
1.从本地文件系统中加载数据创建RDD
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319104905336-357825194.png)
2.从HDFS加载数据创建RDD
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319113905733-1025045055.png)
启动hdfs
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319105536343-1357148367.png)
上传文件
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319115428841-267542470.png)
查看文件
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319115458329-1524037390.png)
加载
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319115540752-557265605.png)
停止hdfs
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319105559931-1967454332.png)
3.通过并行集合(列表)创建RDD
输入列表、字符串、numpy生成数组
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319124745387-814566190.png)
二、 RDD操作
转换操作
1.map(func)
显式定义函数
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319125315754-746686730.png)
lambda函数
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319130255875-1348374931.png)
2.filter(func)
lambda函数
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319125942352-58215292.png)
显式定义函数
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319130719019-570414315.png)
行动操作
foreach(print)
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319131248079-1082291128.png)
collect()
![](https://www.icode9.com/i/l/?n=22&i=blog/2765561/202203/2765561-20220319131344102-757472775.png)
标签:hdfs,函数,创建,RDD,显式,操作,加载
来源: https://www.cnblogs.com/passerby-/p/16012927.html