Unstructured file synchronization with Flume and lftp

Author: Internet

Synchronizing unstructured files to the local file system

lftptest.sh

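#!/bin/bash
# Mirror a remote SFTP directory into a local directory with lftp.
# No credentials are given in the URL, so this presumably relies on
# key-based SSH authentication for 192.168.1.102.
# mirror takes the remote source first and the local target second:
#   --delete      remove local files that no longer exist on the remote side
#   --only-newer  download only files that are newer than the local copy

lftp sftp://192.168.1.102 << EOF
set net:timeout 5;
set net:max-retries 5;
set net:reconnect-interval-base 5;
set net:reconnect-interval-multiplier 1;
mirror --delete --only-newer --verbose /tmp/lftptest/in /tmp/lftptest/out
exit
EOF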
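
The script above hard-codes the directories. As a minimal sketch, assuming the same settings should be reused for several directory pairs, the command list can be generated by a small function and piped into lftp. The function name `build_mirror_commands` and its arguments are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: prints the same lftp command list as lftptest.sh,
# with the remote and local directories taken from the arguments.
build_mirror_commands() {
    local remote_dir="$1"
    local local_dir="$2"
    cat <<EOF
set net:timeout 5
set net:max-retries 5
set net:reconnect-interval-base 5
set net:reconnect-interval-multiplier 1
mirror --delete --only-newer --verbose "$remote_dir" "$local_dir"
exit
EOF
}

# Usage (same host as in lftptest.sh):
#   build_mirror_commands /tmp/lftptest/in /tmp/lftptest/out | lftp sftp://192.168.1.102
```

Because the function only prints text, the generated commands can be inspected before they are handed to lftp.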

vi /etc/crontab

* * * * * root bash /tmp/lftptest/lftptest.sh >> /tmp/lftptest/lftptest.log 2>&1
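
With a one-minute schedule, a mirror run that takes longer than a minute would overlap with the next one. A minimal sketch of one way to avoid that, assuming `flock` from util-linux is installed; the function name `run_exclusive` and the lock file path are hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command only if no other run holds the lock.
# flock -n fails immediately (non-zero exit) instead of waiting for the lock.
run_exclusive() {
    local lock_file="$1"
    shift
    flock -n "$lock_file" "$@"
}

# Usage:
#   run_exclusive /tmp/lftptest/lftptest.lock bash /tmp/lftptest/lftptest.sh
# The same idea written directly as a crontab entry:
#   * * * * * root flock -n /tmp/lftptest/lftptest.lock bash /tmp/lftptest/lftptest.sh >> /tmp/lftptest/lftptest.log 2>&1
```

A skipped run is harmless here, because the next run one minute later picks up whatever is still missing.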

Synchronizing unstructured files to HDFS

test.conf

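# Name the components of agent a1
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: watch a spooling directory and read each file as a binary blob
a1.sources.r1.channels = c1
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /tmp/flumetest/in
a1.sources.r1.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
# Maximum number of bytes read into a single event (about 100 MB)
a1.sources.r1.deserializer.maxBlobLength = 100000000
# Store the original file name in the "fileName" event header
a1.sources.r1.basenameHeader = true
a1.sources.r1.basenameHeaderKey = fileName
# Delay in milliseconds between polls for new files
a1.sources.r1.pollDelay = 1000

# Channel: buffer events in memory
a1.channels.c1.type = memory

# Sink: write events to HDFS, using the original file name as the file prefix
a1.sinks.k1.channel = c1
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://linux01:9000/test/flumetest/out/
a1.sinks.k1.hdfs.filePrefix = %{fileName}
# DataStream writes the raw bytes without a SequenceFile wrapper
a1.sinks.k1.hdfs.fileType = DataStream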

/opt/app/apache-flume-1.9.0-bin/bin/flume-ng agent -n a1 -c conf -f /tmp/flumetest/conf/test.conf -Dflume.root.logger=DEBUG,console

Source directory
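
To check the state of the source directory from the command line, a minimal sketch follows. It assumes the spooling directory source keeps its default `fileSuffix`, so that fully ingested files are renamed with a `.COMPLETED` suffix; the function name `list_pending_files` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: list the files in a spooling directory that Flume has
# not finished yet, i.e. regular files without the default .COMPLETED suffix.
# Subdirectories (such as Flume's tracker directory) are not listed.
list_pending_files() {
    local spool_dir="$1"
    find "$spool_dir" -maxdepth 1 -type f ! -name '*.COMPLETED' | sort
}

# Usage:
#   list_pending_files /tmp/flumetest/in
```

An empty result suggests that every file currently in the directory has been handed to the channel.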

Target directory

Source: https://www.cnblogs.com/lijiong/p/15477070.html