
Processing and analysis of telecommunications customer service data based on the Hadoop platform ③ Project development: Building a Flume big data development environment --- Task 14: Flume installation and deployment

2024-07-08


Task Details

The task is to install and configure Flume and test the collection of streaming data.

Task Guidance

Flume is often used as a tool for real-time data collection. It can store the collected data in HDFS or message queues such as Kafka.

The specific installation steps are as follows:

1. Extract the Flume archive

2. Configure the Flume environment variables

3. Modify the Flume configuration files, which are stored in the conf directory under the Flume installation directory

4. Verify the installation by sending data to Flume over a telnet connection

5. Store the data collected by Flume in HDFS

Task Implementation

1. Install Flume

The installation package is in the /opt/software/ directory. Extract it directly into the /opt/app directory.

Execute on master1:

[root@master1 ~]# cd /opt/software/
[root@master1 software]# tar -xzf apache-flume-1.9.0-bin.tar.gz -C /opt/app/
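
Because tar -xzf prints nothing on success, it can help to confirm that the archive unpacked where expected. The following is a minimal sketch, not part of Flume: check_flume_layout is a hypothetical helper that only looks for the bin/flume-ng launcher and the conf and lib directories of the binary distribution.

```shell
# Hypothetical helper: succeeds if DIR looks like an extracted Flume binary distribution.
check_flume_layout() {
    dir="$1"
    [ -f "$dir/bin/flume-ng" ] && [ -d "$dir/conf" ] && [ -d "$dir/lib" ]
}

# Path taken from the tar command above.
if check_flume_layout /opt/app/apache-flume-1.9.0-bin; then
    echo "Flume layout looks complete"
else
    echo "Flume layout is incomplete; re-check the tar command"
fi
```

If the second message appears, the usual causes are a wrong -C target or an interrupted extraction.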

2. Set Flume environment variables

Edit the /etc/profile file to declare Flume's home directory and append its bin directory to PATH:

export FLUME_HOME=/opt/app/apache-flume-1.9.0-bin
export PATH=$PATH:$FLUME_HOME/bin

Reload /etc/profile and confirm that the change takes effect:

[root@master1 ~]# source /etc/profile
[root@master1 ~]# echo $FLUME_HOME
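
Echoing the variable shows that FLUME_HOME is set, but not that the bin directory actually reached PATH. A small sketch of that second check follows. Here path_contains is a hypothetical helper written for this example, while flume-ng version is the launcher's own subcommand for printing the installed version.

```shell
# Hypothetical helper: succeeds if directory $1 is an entry of the colon-separated list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Falls back to the install path used in this article if FLUME_HOME is unset.
flume_bin="${FLUME_HOME:-/opt/app/apache-flume-1.9.0-bin}/bin"

if path_contains "$flume_bin" "$PATH" && command -v flume-ng >/dev/null 2>&1; then
    flume-ng version   # prints the installed Flume version
else
    echo "$flume_bin is not on PATH; re-run: source /etc/profile"
fi
```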

3. Set the flume-env.sh configuration file

In $FLUME_HOME/conf, copy flume-env.sh.template to flume-env.sh, then edit the new file:

[root@master1 ~]# cd $FLUME_HOME/conf
[root@master1 conf]# cp flume-env.sh.template flume-env.sh
[root@master1 conf]# vi flume-env.sh

Append the following to the end of the configuration file:

JAVA_HOME=/opt/app/jdk1.8.0_181
JAVA_OPTS="-Xms100m -Xmx200m -Dcom.sun.management.jmxremote"

4. Verify installation (telnet)

Create the agent configuration file

In the $FLUME_HOME/conf directory, copy flume-conf.properties.template to flume-conf.properties, then edit the copy:

[root@master1 ~]# cd $FLUME_HOME/conf
[root@master1 conf]# cp flume-conf.properties.template flume-conf.properties
[root@master1 conf]# vi flume-conf.properties

Change the contents of flume-conf.properties to the following:

# The configuration file needs to define the sources, the channels and the sinks.
# Sources, channels and sinks are defined per agent, in this case called 'a1'
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# For each one of the sources, the type is defined
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Bind the source to the channel it writes to
a1.sources.r1.channels = c1
# Each sink's type must be defined
a1.sinks.k1.type = logger
#Specify the channel the sink should use
a1.sinks.k1.channel = c1
# Each channel's type is defined.
a1.channels.c1.type = memory
# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
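
One detail in this file is easy to get wrong: a source lists its channels under the plural key channels, while a sink names its single channel under the singular key channel. The following is a minimal shell sketch, not part of Flume, that reads both keys back from the file so the wiring can be checked before the agent starts. The helper name get_prop and the fallback path are assumptions made for illustration.

```shell
# Hypothetical helper: print the value of KEY from a Java-style properties FILE.
# Comment lines and lines without "=" are skipped.
get_prop() {
    awk -v key="$2" '
        /^[ \t]*#/ { next }
        index($0, "=") == 0 { next }
        {
            k = substr($0, 1, index($0, "=") - 1)
            v = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) print v
        }' "$1"
}

# Assumed location; falls back to the install path used in this article.
conf="${FLUME_HOME:-/opt/app/apache-flume-1.9.0-bin}/conf/flume-conf.properties"

if [ -f "$conf" ]; then
    src_ch=$(get_prop "$conf" a1.sources.r1.channels)
    sink_ch=$(get_prop "$conf" a1.sinks.k1.channel)
    echo "source r1 -> channel(s): $src_ch"
    echo "sink k1   -> channel:    $sink_ch"
    [ "$src_ch" = "$sink_ch" ] || echo "warning: source and sink use different channels"
fi
```

With the configuration above, both lines should report c1. An empty value usually means the key was misspelled or still carries the agent name from the template.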

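Start the agent from the Flume installation directory. The -c option names the configuration directory, -f the agent configuration file, and -n the agent name, which must match the a1 prefix used in the configuration file: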

[root@master1 conf]# cd $FLUME_HOME
[root@master1 apache-flume-1.9.0-bin]# flume-ng agent -c ./conf/ -f ./conf/flume-conf.properties -n a1 -Dflume.root.logger=INFO,console

Open another terminal and enter the following command:

[root@master1 ~]# telnet localhost 44444
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Type the following in the telnet session and press Enter:

Hello
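
The same check can be scripted instead of typed into telnet. The sketch below is one possible approach under stated assumptions: port_open is a hypothetical helper that relies on bash's /dev/tcp feature, and nc (netcat) is assumed to be installed on master1. By default the netcat source acknowledges each received line with OK, and the logger sink writes the event to the agent's console.

```shell
# Hypothetical helper: succeeds if something accepts TCP connections on HOST:PORT.
# Relies on bash's /dev/tcp pseudo-device; in other shells it simply reports failure.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if port_open localhost 44444; then
    # nc is assumed to be installed; -w 2 gives up after 2 idle seconds.
    printf 'Hello\n' | nc -w 2 localhost 44444
else
    echo "Nothing is listening on localhost:44444; is the agent running?"
fi
```

Probing the port first separates "the agent is not running" from "the agent is running but the event was not logged", which are otherwise easy to confuse.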