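Enabling Hadoop/YARN log aggregation and configuring the Spark history server to fix the java.lang.Exception: Unknown container error when viewing logs in the web UI
Author: Internet
Solution
After checking the official documentation, I learned that YARN's log aggregation (log monitoring) feature is disabled by default and has to be enabled manually. The steps are as follows:
Note: the configuration files below are located in the etc/hadoop folder under the Hadoop root directory; in older Hadoop versions they are in the conf folder under the Hadoop root directory.
Hadoop configuration directory used in this article: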
/usr/local/src/hadoop-2.6.5/etc/hadoop
Step 1. Enable log aggregation in yarn-site.xml
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
Step 2. Add the history (log) service settings to mapred-site.xml
<property>
<!-- Run jobs submitted to Hadoop on YARN; skip this property if it is already configured -->
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<!-- Address of the job history service, usually the namenode machine -->
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
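Hadoop does not complain about property names it does not recognise, so a misspelled name in these files simply has no effect. The sketch below is a rough spelling check, not part of Hadoop: `has_property` and `check_history_conf` are hypothetical helper names, and the check assumes each name sits on a single line as `<name>...</name>`, exactly as in the snippets above. It does not notice properties that are commented out.

```shell
# Hypothetical helper (not a Hadoop tool): succeeds if the XML file
# contains a <name> element with exactly this property name.
has_property() {
  # $1 = property name, $2 = path to the XML config file
  grep -qF "<name>$1</name>" "$2"
}

# Report which of the properties used in this article are missing.
check_history_conf() {
  # $1 = directory that holds yarn-site.xml and mapred-site.xml
  status=0
  has_property yarn.log-aggregation-enable "$1/yarn-site.xml" \
    || { echo "missing: yarn.log-aggregation-enable"; status=1; }
  for name in mapreduce.framework.name \
              mapreduce.jobhistory.address \
              mapreduce.jobhistory.webapp.address; do
    has_property "$name" "$1/mapred-site.xml" \
      || { echo "missing: $name"; status=1; }
  done
  return $status
}
```

With the directory used in this article it would be called as `check_history_conf /usr/local/src/hadoop-2.6.5/etc/hadoop`; it prints one line per missing property and returns a non-zero status if any are missing.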
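Step 3. Copy the modified configuration files to the other machines in the cluster (skip this step on a single-node Hadoop; you can also postpone distributing them to the nodes)
A quick way is to use scp to copy the files over the existing ones on the other machines: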
scp yarn-site.xml root@slave1:/usr/local/src/hadoop-2.6.5/etc/hadoop/
scp mapred-site.xml root@slave1:/usr/local/src/hadoop-2.6.5/etc/hadoop/
… repeat for the other datanode machines
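When there are several datanodes, the two scp commands above can be wrapped in a loop. This is only a sketch: `distribute_conf`, `CONF_DIR` and `DRY_RUN` are names made up for illustration, and it assumes every node uses the same Hadoop path and allows root login over SSH, as in the commands above.

```shell
# Config directory used in this article; override CONF_DIR if yours differs.
CONF_DIR=${CONF_DIR:-/usr/local/src/hadoop-2.6.5/etc/hadoop}

# Hypothetical helper: copy both edited files to every host given as argument.
# With DRY_RUN set to a non-empty value the scp commands are printed, not run.
distribute_conf() {
  for host in "$@"; do
    for f in yarn-site.xml mapred-site.xml; do
      ${DRY_RUN:+echo} scp "$CONF_DIR/$f" "root@$host:$CONF_DIR/"
    done
  done
}
```

For example, `DRY_RUN=1 distribute_conf slave1 slave2` prints the four scp commands without running them (slave2 is a placeholder host name); dropping `DRY_RUN=1` performs the copy.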
Step 4. Configure Spark
Edit the spark-defaults.conf file in the Spark conf directory:
/usr/local/src/spark-2.4.4-bin-hadoop2.6/conf
vim spark-defaults.conf
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Default system properties included when running spark-submit.
# This is useful for setting default environmental settings.
# Example:
# spark.master spark://master:7077
# spark.eventLog.enabled true
# spark.eventLog.dir hdfs:/tmp/
# spark.serializer org.apache.spark.serializer.KryoSerializer
# spark.driver.memory 5g
# spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
# Store the event logs on the local file system (a local URI needs three slashes: file:///)
# spark.eventLog.dir=file:///usr/local/hadoop-2.7.3/logs/
# spark.history.fs.logDirectory=file:///usr/local/hadoop-2.7.3/logs/
#
spark.eventLog.enabled=true
spark.eventLog.compress=true
# Store the event logs on HDFS
spark.eventLog.dir=hdfs://master:9000/tmp/spark-yarn-logs
spark.history.fs.logDirectory=hdfs://master:9000/tmp/spark-yarn-logs
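# Address of the Spark history server; the host name must resolve to the node where start-history-server.sh is run
spark.yarn.historyServer.address=spark-master:18080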
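Step 5. Create the log directory on HDFS:
This directory must match the one set for spark.eventLog.dir and spark.history.fs.logDirectory in spark-defaults.conf above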
hdfs dfs -mkdir -p /tmp/spark-yarn-logs
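Since spark.eventLog.dir (where applications write) and spark.history.fs.logDirectory (where the history server reads) should point at the same location, a small check can catch a mismatch before anything is started. The sketch below is not a Spark tool: `get_spark_conf` and `check_log_dirs` are hypothetical helpers, and the parsing assumes simple one-line `key=value` or `key value` entries like the ones shown in this article.

```shell
# Hypothetical helper: print the value of a key from spark-defaults.conf.
# Comment lines are skipped; if the key appears twice, the last entry wins.
get_spark_conf() {
  # $1 = key, $2 = path to spark-defaults.conf
  awk -v key="$1" '
    /^[ \t]*#/ { next }
    {
      line = $0
      sub(/^[ \t]+/, "", line)
      if (index(line, key) == 1) {
        rest = substr(line, length(key) + 1)
        if (rest ~ /^[ \t=]/) {          # key must be followed by "=" or blank
          sub(/^[ \t]*=?[ \t]*/, "", rest)
          sub(/[ \t]+$/, "", rest)
          value = rest
        }
      }
    }
    END { if (value != "") print value }
  ' "$2"
}

# Succeeds only if both settings are present and identical.
check_log_dirs() {
  event_dir=$(get_spark_conf spark.eventLog.dir "$1")
  history_dir=$(get_spark_conf spark.history.fs.logDirectory "$1")
  if [ -n "$event_dir" ] && [ "$event_dir" = "$history_dir" ]; then
    echo "ok: $event_dir"
  else
    echo "mismatch: eventLog.dir='$event_dir' history.fs.logDirectory='$history_dir'"
    return 1
  fi
}
```

With the paths from this article it would be run as `check_log_dirs /usr/local/src/spark-2.4.4-bin-hadoop2.6/conf/spark-defaults.conf`.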
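Step 6. Once the configuration above is complete
1. Restart Hadoop (if it is already running, stop it first with ./sbin/stop-all.sh)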
[root@master hadoop-2.6.5]# ./sbin/start-all.sh
2. Start the Hadoop history server (from the Hadoop directory)
./sbin/mr-jobhistory-daemon.sh start historyserver
3. Start Spark
From the Spark directory:
./sbin/start-all.sh
4. Start the Spark history server
From the Spark directory:
./sbin/start-history-server.sh
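Before submitting a job it can help to confirm that both history servers actually came up. In the output of the JDK's `jps` tool the MapReduce history server shows up as JobHistoryServer and the Spark one as HistoryServer. The helper below is a hypothetical sketch (`missing_daemons` is not a Hadoop or Spark command) that reads `jps` output on standard input and prints whichever of the two is absent.

```shell
# Hypothetical helper: read `jps` output on stdin and print the name of
# each expected history daemon that does not appear in it.
missing_daemons() {
  jps_out=$(cat)
  for daemon in JobHistoryServer HistoryServer; do
    # -w matches whole words, so "HistoryServer" does not match
    # inside "JobHistoryServer".
    printf '%s\n' "$jps_out" | grep -qw "$daemon" || echo "$daemon"
  done
}

# Typical use on the master node:
#   jps | missing_daemons
```

No output means both processes were found on that machine.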
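Verification:
Run the SparkPi example on YARN (in Spark 2.x, --master yarn-cluster is deprecated in favor of --master yarn --deploy-mode cluster):
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster examples/jars/spark-examples_2.11-2.4.4.jar 10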
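In cluster mode the driver runs inside a YARN container, so the job's own output ends up in the container logs rather than on the console; with log aggregation enabled those logs can be fetched with `yarn logs -applicationId`. The sketch below assumes the spark-submit client output was saved to a file (submit.log is a placeholder name) and contains the application ID in the usual application_<timestamp>_<sequence> form; `extract_app_id` is a hypothetical helper.

```shell
# Hypothetical helper: read client output on stdin and print the first
# YARN application ID found (format: application_<timestamp>_<sequence>).
extract_app_id() {
  grep -o 'application_[0-9][0-9]*_[0-9][0-9]*' | head -n 1
}

# Typical use, from the Spark directory (submit.log is a placeholder):
#   ./bin/spark-submit ... 2>&1 | tee submit.log
#   app_id=$(extract_app_id < submit.log)
#   yarn logs -applicationId "$app_id"
```

If the aggregated logs come back instead of the Unknown container error, the configuration from the earlier steps is in effect.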
5. Stop the Hadoop history server (from the Hadoop directory)
./sbin/mr-jobhistory-daemon.sh stop historyserver
6. Stop the Spark history server
./sbin/stop-history-server.sh
Source: https://blog.csdn.net/qq_43665254/article/details/112253340