1. Downloading and Installing Flume

This walkthrough uses Ubuntu Linux. Open a terminal and download the Flume 1.9.0 binary release:

wget https://archive.apache.org/dist/flume/1.9.0/apache-flume-1.9.0-bin.tar.gz

When the download finishes, extract the archive from the directory it was saved to:

tar -zxvf apache-flume-1.9.0-bin.tar.gz
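The archive name and the directory to change into can be derived from the download URL. The sketch below uses only shell parameter expansion. The variable names are illustrative, and it assumes the archive unpacks into a directory named after the tarball, which is the usual layout for Flume binary releases.

```shell
# URL from the download step
url=https://archive.apache.org/dist/flume/1.9.0/apache-flume-1.9.0-bin.tar.gz

# Strip everything up to the last "/" to get the archive file name
archive=${url##*/}

# Strip the ".tar.gz" suffix to get the assumed top-level directory
flume_home=${archive%.tar.gz}

# Print the commands instead of running them, so the sketch is safe to try
echo "tar -zxvf $archive"
echo "cd $flume_home"
```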

2. Collecting Logs with netcat

2.1 Configuration file

Go to the conf directory under the Flume root, create a file named example.conf, and enter the following:

# Name the components of this agent
# (several sources can be listed, separated by spaces: r1 r2 r3 …)
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Configure the sink
a1.sinks.k1.type = logger

# Configure the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and the sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
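A common mistake in files like this is a source or sink that points at a channel that was never declared. The sketch below is one possible check written with awk. check_flume_wiring is a hypothetical helper, not part of Flume, and it only understands the simple key = value layout used in this walkthrough.

```shell
# check_flume_wiring CONF AGENT
# Lists every channel that a source or sink of AGENT refers to without it
# being declared in "AGENT.channels". Returns non-zero if any is found.
check_flume_wiring() {
  awk -v agent="$2" '
    /^[ \t]*#/ { next }                  # skip whole-line comments
    {
      eq = index($0, "=")
      if (eq == 0) next                  # not a key = value line
      key = substr($0, 1, eq - 1)
      val = substr($0, eq + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim both sides
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      n = split(val, names, /[ \t]+/)
      if (key == agent ".channels") {
        for (i = 1; i <= n; i++) declared[names[i]] = 1
      } else if (key ~ ("^" agent "\\.(sources|sinks)\\.[^.]+\\.channels?$")) {
        for (i = 1; i <= n; i++) used[names[i]] = key
      }
    }
    END {
      bad = 0
      for (c in used) {
        if (!(c in declared)) {
          print used[c] " refers to undeclared channel " c
          bad = 1
        }
      }
      exit bad
    }' "$1"
}
```

Running `check_flume_wiring ./conf/example.conf a1` from the Flume root prints nothing and returns 0 when every referenced channel is declared.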

2.2 Starting and testing

After saving the file, return to the Flume root directory and start the agent:

./bin/flume-ng agent --conf ./conf --conf-file ./conf/example.conf --name a1 -Dflume.root.logger=INFO,console

On Windows, the command may instead look like this:

.\bin\flume-ng agent --conf .\conf --conf-file .\conf\example.conf --name a1 -property flume.root.logger=INFO,console

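Make sure telnet is available on the host, then connect to the port set in the configuration (44444 in this example):

telnet localhost 44444

Press Enter. If the terminal shows 'OK', the connection to the netcat source is working. Type any text and press Enter; the message then appears in the terminal where the Flume agent is running, which completes the test.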
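For a repeatable test it can be convenient to send lines without an interactive telnet session. The sketch below assumes the nc (netcat) command is installed. send_lines is a hypothetical helper name, and the host and port in the usage note are the ones from example.conf.

```shell
# send_lines HOST PORT LINE...
# Sends each LINE, newline-terminated, to HOST:PORT and prints whatever the
# server replies. -w 2 makes nc give up after 2 seconds without activity.
send_lines() {
  host=$1
  port=$2
  shift 2
  printf '%s\n' "$@" | nc -w 2 "$host" "$port"
}
```

With the agent running, `send_lines localhost 44444 "hello flume" "second line"` should produce one event per line in the agent's console. The function returns non-zero when the connection cannot be made.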

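3. Collecting Data to a Specified Location

3.1 Configuration file

Back in the conf directory, create example1.conf with the following settings. Remember to change sink.directory to a path that exists on your machine.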

# Name the components of this agent
# (several sources can be listed, separated by spaces: r1 r2 r3 …)
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Configure the sink: write events to files in a local directory
a1.sinks.k1.type = file_roll
a1.sinks.k1.sink.directory = /home/ldyer/
# 0 disables time-based rolling, so all events go to a single file
a1.sinks.k1.sink.rollInterval = 0
a1.sinks.k1.sink.file.name.timeFormat = yyyyMMddHH

# Configure the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and the sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

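3.2 Starting and testing

Start Flume with the same command as before, changing the configuration file name to example1.conf. Then telnet to the port, type a few messages, and exit. A new file appears in the directory set in sink.directory; open it and it contains the messages you just typed, which confirms the test succeeded.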
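To find the file the sink has just written, one option is to pick the most recently modified entry in the output directory. This is a minimal sketch; latest_file is a hypothetical helper, and it assumes file names without newlines.

```shell
# latest_file DIR
# Prints the path of the most recently modified entry in DIR.
# Returns non-zero when DIR is empty.
latest_file() {
  dir=${1%/}                          # drop one trailing slash, if any
  newest=$(ls -t "$dir" | head -n 1)  # ls -t sorts newest first
  [ -n "$newest" ] && printf '%s\n' "$dir/$newest"
}
```

For the configuration above, `cat "$(latest_file /home/ldyer/)"` would print the collected messages, provided the newest entry in that directory is the file written by Flume.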

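4. Collecting File Data from a Specified Location (Static)

4.1 Configuration file

As before, create example2.conf in the conf directory and enter the following configuration. Note that spoolDir must be a directory, not a file.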

# Name the components of this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /home/ldyer/log
# a1.sources.r1.fileHeader = true
# a1.sources.r1.interceptors = i1
# a1.sources.r1.interceptors.i1.type = timestamp

# Configure the sink
a1.sinks.k1.type = logger

# Configure the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and the sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

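4.2 Starting and testing

Start Flume with example2.conf, then make changes in the directory set in spoolDir, for example by creating a file there. The contents of each new file appear as events in the Flume terminal, which confirms the test succeeded. Once a file has been fully read, the spooldir source renames it with a .COMPLETED suffix by default.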
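The spooldir source expects files to be complete and uniquely named when they arrive, and it reports an error if a file is modified after being placed in the directory. One way to respect that is to write the file somewhere else first and then move it in. The sketch below assumes the staging directory is on the same filesystem as the spool directory, so that mv is a plain rename; spool_drop and the staging directory are illustrative names.

```shell
# spool_drop SRC STAGING_DIR SPOOL_DIR
# Copies SRC into STAGING_DIR under a unique name, then moves it into
# SPOOL_DIR, so the spool directory only ever sees a finished file.
spool_drop() {
  # Suffix with the epoch time and shell PID to avoid reusing a file name
  name=$(basename "$1").$(date +%s).$$
  cp "$1" "$2/$name" && mv "$2/$name" "$3/$name"
}
```

For example, `spool_drop ./app.log /home/ldyer/staging /home/ldyer/log` would deliver a copy of a hypothetical app.log to the directory watched by example2.conf.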

5. Collecting File Data from a Specified Location (Dynamic)

5.1 Configuration file

Create example3.conf with the following content:

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/ldyer/log/ldy.txt

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

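5.2 Starting and testing

The test steps are the same as above: start Flume with example3.conf, then append lines to /home/ldyer/log/ldy.txt. Each appended line appears in the Flume terminal.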
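To generate input for the tail -F source, lines can be appended to the watched file from a second terminal. A minimal sketch follows; append_lines is a hypothetical helper and the line format is arbitrary.

```shell
# append_lines FILE COUNT
# Appends COUNT numbered, timestamped lines to FILE, creating it if needed.
append_lines() {
  i=1
  while [ "$i" -le "$2" ]; do
    printf 'test line %d at %s\n' "$i" "$(date +%T)" >> "$1"
    i=$((i + 1))
  done
}
```

With the agent from example3.conf running, `append_lines /home/ldyer/log/ldy.txt 5` should show up as five events in the Flume console.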

6. Distributed Log Collection with Multiple Agents over Avro

6.1 Agent1

Create Agent1.conf and enter the following:

# Agent1 collects the data typed into telnet
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Configure the sink
a1.sinks.k1.type = avro
# In a real deployment this is the address of the target machine and must not be localhost or 127.0.0.1
a1.sinks.k1.hostname = localhost
# Port on the target machine
a1.sinks.k1.port = 55555
a1.sinks.k1.batch-size = 1

# Configure the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and the sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Start Agent1 with:

./bin/flume-ng agent --conf ./conf --conf-file ./conf/Agent1.conf --name a1

6.2 Agent2

Create Agent2.conf and enter the following:

# Name the components of Agent2
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Configure the source
# The avro source acts as a receiving service
a1.sources.r1.type = avro
# Address of this machine in the cluster; in a real deployment use its IP address
a1.sources.r1.bind = localhost
a1.sources.r1.port = 55555

# Configure the sink
a1.sinks.k1.type = logger

# Configure the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and the sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Then start Agent2. The command differs from Agent1's because Agent2 prints the events forwarded by Agent1 to its console:

./bin/flume-ng agent --conf ./conf --conf-file ./conf/Agent2.conf --name a1 -Dflume.root.logger=INFO,console
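Because Agent1's avro sink connects to Agent2's avro source, it helps to have Agent2 listening before Agent1 starts sending. The sketch below waits for a port to accept connections. It assumes nc is installed and supports -z (connect without sending data); wait_for_port is a hypothetical helper.

```shell
# wait_for_port HOST PORT [TRIES]
# Polls HOST:PORT once per second, up to TRIES times (default 10).
# Returns 0 as soon as a connection succeeds, non-zero otherwise.
wait_for_port() {
  tries=${3:-10}
  while [ "$tries" -gt 0 ]; do
    nc -z "$1" "$2" 2>/dev/null && return 0
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

In a second terminal, `wait_for_port localhost 55555 30 && ./bin/flume-ng agent --conf ./conf --conf-file ./conf/Agent1.conf --name a1` would start Agent1 only once Agent2's avro source is reachable. Text typed into telnet on port 44444 should then appear in Agent2's console.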