
PySpark Streaming

Author: the Internet

I. An example

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
# create sc with two working threads
sc = SparkContext("local[2]","test")
# create local StreamingContext with batch interval of 1 second
ssc = StreamingContext(sc,1)
# create DStream that connects to localhost:9999
lines = ssc.socketTextStream("localhost",9999)
words = lines.flatMap(lambda line: line.split(" "))
pairs = words.map(lambda x: (x,1))
wordcount = pairs.reduceByKey(lambda x,y: x+y)
# print the first 10 elements of each RDD in the DStream
wordcount.pprint()
ssc.start()
ssc.awaitTermination()
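
To make the three transformations above concrete, here is a minimal plain-Python sketch of what flatMap, map and reduceByKey do to a single batch of lines. It does not use Spark at all; the function name word_count_batch and the sample input are made up for illustration, and a real DStream applies the same steps to each one-second batch in a distributed way.

```python
from collections import defaultdict


def word_count_batch(lines):
    """Mimic flatMap -> map -> reduceByKey on one batch of text lines."""
    # flatMap: split every line on spaces and flatten into one list of words
    words = [word for line in lines for word in line.split(" ")]
    # map: pair each word with an initial count of 1
    pairs = [(word, 1) for word in words]
    # reduceByKey: add up the counts that share the same key
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


# hypothetical batch: two lines typed into the socket within one second
print(word_count_batch(["kaka tt", "kaka"]))  # prints {'kaka': 2, 'tt': 1}
```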

How to run:
1. On Linux, first check whether port 9999 is already in use:

netstat -ntpl | grep 9999

2. Start a listener on port 9999:

nc -lk 9999

On Windows 10, use:

nc -l -p 9999

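3. Run the script in a new window, type some strings into the first window (the one running nc), and watch the printed output in the new window: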

-------------------------------------------
Time: 2021-10-21 15:49:17
-------------------------------------------
('kaka', 2)
('tt', 1)

II. Spark Streaming explained

Source: https://www.cnblogs.com/leimu/p/15434664.html