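Because Cloudera is mostly used in private clouds, this article describes how to deploy it in an offline environment using parcels.
Hardware environment
Machine type | Configuration | Count | IP addresses |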
---|---|---|---|
Master | 24 cores, 192GB, SSD * 4(Non-Raid) | 5 | 192.168.1.[2-6] |
Slave | 24 cores, 128GB, SAS * 6(Non-Raid) | 11 | 192.168.1.[7-17] |
Front-end machine (VM) | 6 cores, 16GB, SSD * 1(Raid 5) | 1 | 192.168.1.1 |
This article does not discuss how Hadoop roles are distributed; it only describes how to deploy Cloudera itself.
Cloudera role distribution:
Role | Deployment node | Machine type |
---|---|---|
Cloudera Server | 192.168.1.1 | VM |
Cloudera Agent | 192.168.1.1, 192.168.1.[3-17] | VM, Master, Slave |
Cloudera Management Services | 192.168.1.3 | Master |
MySQL | 192.168.1.2 | Master |
Software versions
OS: CentOS 7.2
JDK: 1.8.0_73
Cloudera: 5.8.2
Dependencies
- CDH-5.8.2-1.cdh5.8.2.p0.3-el7.parcel
- CDH-5.8.2-1.cdh5.8.2.p0.3-el7.parcel.sha1
- cloudera-manager-centos7-cm5.8.2_x86_64.tar.gz
- cm5.8.2-centos7.tar.gz
- manifest.json
- mysql-connector-java-5.1.34-bin.jar
- jdk-8u73-linux-x64.rpm
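Assume all dependency packages have been uploaded to /home/admin/soft. After downloading CDH-5.8.2-1.cdh5.8.2.p0.3-el7.parcel.sha1, remember to rename it to CDH-5.8.2-1.cdh5.8.2.p0.3-el7.parcel.sha. The later parcel version check only reads the *.sha files in the repo directory and compares their contents with the parcel's SHA value, which confirms that the parcel file is intact and usable. The commands below refer to the connector as mysql-connector-java-5.1.34.jar, so rename the downloaded -bin.jar file to match.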
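The same comparison can be done by hand before the parcel goes into the repo. The following is a minimal sketch, assuming the .sha file holds the hex SHA-1 digest as its first field and that sha1sum is available; verify_parcel is a hypothetical helper name, not part of Cloudera's tooling.

```shell
#!/bin/bash
# verify_parcel PARCEL: compare the parcel's SHA-1 with the hash stored in PARCEL.sha.
# Returns 0 when they match, 1 when they differ, 2 when either file is missing.
# (Hypothetical helper; assumes the .sha file starts with the hex digest.)
verify_parcel() {
    local parcel="$1"
    local sha_file="${parcel}.sha"
    [ -f "$parcel" ] && [ -f "$sha_file" ] || return 2
    local expected actual
    # Take the first whitespace-separated field of the first line of the .sha file.
    expected=$(awk 'NR==1 {print $1}' "$sha_file")
    # sha1sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    [ "$expected" = "$actual" ]
}

# Example (path taken from the dependency list above):
# verify_parcel /home/admin/soft/CDH-5.8.2-1.cdh5.8.2.p0.3-el7.parcel && echo "parcel OK"
```

A non-zero return before deployment is cheaper to diagnose than a parcel that fails to activate later in the wizard.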
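Deployment steps
Step1. Set up passwordless SSH
All steps are assumed to run under the admin account, and the admin password is the same on every server. Passwordless SSH login must be set up from the Cloudera Server to all agents. Assume the file list_agents already contains the full list of agent machines: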
192.168.1.1
192.168.1.3
192.168.1.4
192.168.1.5
192.168.1.6
192.168.1.7
192.168.1.8
192.168.1.9
192.168.1.10
192.168.1.11
192.168.1.12
192.168.1.13
192.168.1.14
192.168.1.15
192.168.1.16
192.168.1.17
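The list does not have to be typed by hand. As a small sketch, the function below prints the same sixteen addresses, following the agent range in the role table (192.168.1.1 plus 192.168.1.[3-17]); gen_agents is a hypothetical helper name.

```shell
#!/bin/bash
# gen_agents: print the agent addresses from the role table,
# i.e. 192.168.1.1 plus 192.168.1.3 through 192.168.1.17.
# 192.168.1.2 (the MySQL host) is intentionally left out.
gen_agents() {
    echo "192.168.1.1"
    local i
    for i in $(seq 3 17); do
        echo "192.168.1.${i}"
    done
}

# Usage: gen_agents > list_agents
```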
Run the following on 192.168.1.1:
ssh 192.168.1.1
ssh-keygen
for agent in `cat list_agents`;do ssh-copy-id -i /home/admin/.ssh/id_rsa.pub admin@${agent};done;
Of course, expect can be used to automate this; the details are omitted here.
Step2. Install the JDK
cd /home/admin/soft
pscp -h list_agents jdk-8u73-linux-x64.rpm ~/
pssh -h list_agents -P "sudo rpm -ivh /home/admin/jdk-8u73-linux-x64.rpm"
Update the environment variables:
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
JAVA_HOME=/usr/java/jdk1.8.0_73
PATH=$JAVA_HOME/bin:$PATH
export PATH JAVA_HOME
Distribute the environment configuration:
pscp -h list_agents ~/.bash_profile /tmp
pssh -h list_agents "sudo cp /tmp/.bash_profile ~/"
Step3. Install MySQL and initialize the databases (the default root user has no password)
Install MySQL; we use mariadb:
ssh 192.168.1.2
sudo yum install mariadb-server
Edit /etc/my.cnf to support UTF-8:
[mysql]
default-character-set=utf8
[mysqld]
character_set_server=utf8
init_connect='SET NAMES utf8'
Start mysql:
sudo systemctl start mariadb
sudo systemctl enable mariadb
Initialize the Cloudera Manager databases:
mysql -uroot
MariaDB > grant all privileges on *.* to 'root'@'%' identified by 'root';
MariaDB > grant all privileges on *.* to 'root'@'localhost' identified by 'root';
MariaDB > use mysql;
MariaDB > update user set password=password('root') where user='root';
MariaDB > flush privileges;
MariaDB > create database hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
MariaDB > create database reports DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
MariaDB > create database navigator DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
MariaDB > create database navigatormeta DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
MariaDB > create database oozie DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
MariaDB > create database hue DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
MariaDB > create database sentry DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
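The seven statements differ only in the database name, so they can be generated in a loop. The sketch below prints the same SQL as above; gen_create_db_sql is a hypothetical helper name, and piping its output to the mysql client assumes the root password set earlier.

```shell
#!/bin/bash
# gen_create_db_sql: print one CREATE DATABASE statement per service database,
# using the same charset and collation as the statements above.
gen_create_db_sql() {
    local db
    for db in hive reports navigator navigatormeta oozie hue sentry; do
        echo "create database ${db} DEFAULT CHARSET utf8 COLLATE utf8_general_ci;"
    done
}

# Usage on the MySQL host (assumes the root password set above):
# gen_create_db_sql | mysql -uroot -proot
```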
Step4. Prepare the offline resources
On 192.168.1.1, extract the offline packages and put them in place:
sudo tar zxvf /home/admin/soft/cloudera-manager-centos7-cm5.8.2_x86_64.tar.gz -C /opt/
sudo cp /home/admin/soft/mysql-connector-java-5.1.34.jar /opt/cm-5.8.2/share/cmf/lib/
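Create the initial Cloudera Manager data. The MySQL root password was set to root earlier. MySQL runs on 192.168.1.2 and Cloudera Server on 192.168.1.1, so the script is pointed at those hosts:
sudo /opt/cm-5.8.2/share/cmf/schema/scm_prepare_database.sh mysql cm -h192.168.1.2 -uroot -p --port 3306 --scm-host 192.168.1.1 scm scm scm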
mysql -uroot -p
MariaDB > use cm;
MariaDB > grant all PRIVILEGES on cm.* to scm;
Edit /opt/cm-5.8.2/etc/cloudera-scm-agent/config.ini on 192.168.1.1 and set server_host to the master node's hostname (v001001.idc.domain.com). Then copy the CDH5 parcel files from the offline package into /opt/cloudera/parcel-repo/ on the master node 192.168.1.1:
sudo mkdir -p /opt/cloudera/parcel-repo/
sudo cp /home/admin/soft/cdh5.8.2/CDH-5.8.2-1.cdh5.8.2.p0.3-el7.parcel /opt/cloudera/parcel-repo/
sudo cp /home/admin/soft/cdh5.8.2/CDH-5.8.2-1.cdh5.8.2.p0.3-el7.parcel.sha /opt/cloudera/parcel-repo/
sudo cp /home/admin/soft/cdh5.8.2/manifest.json /opt/cloudera/parcel-repo/
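The server_host change to config.ini mentioned above can also be scripted. This is a sketch only, assuming the file contains a line of the form server_host=...; set_server_host is a hypothetical helper, and it has to run as root because the file lives under /opt/cm-5.8.2.

```shell
#!/bin/bash
# set_server_host FILE HOST: rewrite the server_host line of an agent config.ini.
# Writes to a temporary file first so a failed sed never truncates the original.
# (Hypothetical helper; assumes a "server_host=..." line already exists in FILE.)
set_server_host() {
    local file="$1" host="$2"
    local tmp="${file}.tmp"
    sed "s/^server_host=.*/server_host=${host}/" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example (run as root, hostname taken from the text above):
# set_server_host /opt/cm-5.8.2/etc/cloudera-scm-agent/config.ini v001001.idc.domain.com
```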
Create the storage directory on the master node 192.168.1.1:
sudo mkdir -p /var/lib/cloudera-scm-server
Step5. Start the Cloudera Server service
Start Cloudera Server on 192.168.1.1:
sudo /opt/cm-5.8.2/etc/init.d/cloudera-scm-server start
On the agents, copy mysql-connector-java-5.1.34.jar into the Cloudera Manager lib directory:
pscp -h list_agents /home/admin/soft/mysql-connector-java-5.1.34.jar /tmp
pssh -h list_agents -P "sudo mkdir -p /usr/share/cmf/lib"
pssh -h list_agents -P "sudo mkdir -p /usr/share/java"
pssh -h list_agents -P "sudo cp /tmp/mysql-connector-java-5.1.34.jar /usr/share/cmf/lib"
pssh -h list_agents -P "sudo cp /tmp/mysql-connector-java-5.1.34.jar /usr/share/java/mysql-connector-java.jar"
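Step6. Set up a temporary httpd server
The httpd server provides a local repo so that the installation can run offline. Extract cm5.8.2-centos7.tar.gz into the httpd document root; the repo URL is assumed to be http://192.168.1.1/cm/5.8.2/.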
sudo yum install -y httpd
sudo tar zxvf /home/admin/soft/cm5.8.2-centos7.tar.gz -C /var/www/html/
sudo systemctl start httpd
sudo rm -rf /var/run/yum.pid
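Step7. Deploy through the web UI
The Cloudera console is at http://192.168.1.1:7180 by default.
The first login asks for the default administrator username and password; be sure to remember them. Once in the web UI, follow these steps to discover the agents and initialize the cluster:
- Select "Yes, I accept the End User License Terms and Conditions.", then click Continue through the screens that follow;
- On the "Specify hosts for your CDH cluster installation." screen, enter the following machines and click Search: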
192.168.1.1
192.168.1.3
192.168.1.4
192.168.1.5
192.168.1.6
192.168.1.7
192.168.1.8
192.168.1.9
192.168.1.10
192.168.1.11
192.168.1.12
192.168.1.13
192.168.1.14
192.168.1.15
192.168.1.16
192.168.1.17
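- Select all the machines and click Continue;
- Choose "Use Parcels (Recommended)";
- Under More Options, remove all remote repo settings;
- For "Select the version of CDH", choose CDH-5.8.2-1.cdh5.8.2.p0.3;
- For "Select the specific release of the Cloudera Manager Agent you want to install on your hosts.", choose Custom Repository and use the LAN httpd server at http://192.168.1.1/cm/5.8.2/;
- Leave "Install Oracle Java SE Development Kit (JDK)" unchecked;
- Leave "Single User Mode" unchecked;
- For "Login To All Hosts As:", choose admin and enter the CentOS admin password.
Normally the Cloudera packages are then distributed and the agent processes started automatically.
Q: What if the installation reports the following?
/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/meta/parcel.json cannot be found
A: Stop the agent, clear its uuid, and run through the process again: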
sudo /bin/systemctl stop cloudera-scm-agent
sudo rm /var/lib/cloudera-scm-agent/*
sudo /bin/systemctl start cloudera-scm-agent
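If that does not help, specify the parcel repo directory again under More Options in the installation wizard. A repo directory that is short on space also makes the parcel.json installation fail. To change the directory, go to Hosts -> Parcels -> Configuration -> Local Parcel Repository Path and set a different path. Also check that Remote Parcel Repository URLs points to the corresponding directory on the local httpd server.
Q: What if you hit ProtocolError: <ProtocolError for 127.0.0.1/RPC2: 401 Unauthorized>?
A: Kill supervisord and retry: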
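Free space in the repo directory can be checked before retrying. The sketch below compares the free space of a directory's filesystem with the size of a file; free_kb and has_room are hypothetical helper names, and the parcel size is only a lower bound, since the unpacked parcel needs additional room.

```shell
#!/bin/bash
# free_kb DIR: print the free space of the filesystem holding DIR, in KiB.
# -P forces the POSIX one-line-per-filesystem format so field 4 is "Available".
free_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# has_room DIR FILE: succeed when DIR has more free space than FILE occupies.
# (Hypothetical helper; treat the result as a minimum requirement only.)
has_room() {
    local need_kb
    need_kb=$(du -k "$2" | awk '{print $1}')
    [ "$(free_kb "$1")" -gt "$need_kb" ]
}

# Example (paths taken from the steps above):
# has_room /opt/cloudera/parcel-repo /home/admin/soft/cdh5.8.2/CDH-5.8.2-1.cdh5.8.2.p0.3-el7.parcel \
#     || echo "parcel repo is short on space"
```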
pid=`ps aux | grep "/usr/lib64/cmf/agent/build/env/bin/supervisord" | grep -v grep | awk '{print$2}'`
sudo kill -9 ${pid}
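Step8. Deploy the Hadoop services
This part is not covered here; deploy whatever you need. One thing to watch: before deploying the services, create the corresponding lib directories and copy mysql-connector into each service's lib directory: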
pscp -h list_agents /home/admin/soft/mysql-connector-java-5.1.34.jar /tmp
pssh -h list_agents "sudo mkdir -p /opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/hive/lib/ /opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/oozie/lib/ /var/lib/oozie"
pssh -h list_agents "sudo cp /tmp/mysql-connector-java-5.1.34.jar /opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/hive/lib/"
pssh -h list_agents "sudo cp /tmp/mysql-connector-java-5.1.34.jar /opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/oozie/lib/"
pssh -h list_agents "sudo mkdir -p /opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/sentry/lib"
pssh -h list_agents "sudo cp /tmp/mysql-connector-java-5.1.34.jar /opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/sentry/lib/"
pssh -h list_agents "sudo cp /tmp/mysql-connector-java-5.1.34.jar /var/lib/oozie"
pssh -h list_agents "sudo rm -f /tmp/mysql-connector-java-5.1.34.jar"
Q: What if deploying Oozie fails with the following error?
mkdir: cannot create directory '/var/lib/oozie/tomcat-deployment': No such file or directory
A: Recreate the oozie database in MySQL, then run:
sudo chown oozie:oozie /var/lib/oozie
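Step9. Change the HUE and Oozie time zones
The time-zone change is aimed at users in China. Without the HUE change, sqoop cannot be used from HUE and fails with "Sqoop error: Could not get connectors."; without the Oozie change, the timestamps shown in the Oozie logs will be wrong.
The HUE change is simple: in the Cloudera console, edit the following property in the HUE Service configuration and restart the service: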
time_zone = Asia/Shanghai
For Oozie, go to the Oozie Service configuration in the Cloudera console, find the oozie-site.xml setting, and add the following property:
<property>
<name>oozie.processing.timezone</name>
<value>GMT+0800</value>
</property>
Then run the following on every server that runs Oozie, and restart the Oozie service when done:
cd /opt/cloudera/parcels/CDH/lib/oozie/libext
unzip ext-2.2.zip
chown oozie:oozie -R ext-2.2
Finally, in the Oozie Web Console, under Settings, change Timezone to CST (Asia/Shanghai).
Appendix: full uninstall script
#!/bin/bash
RELEASE_VERSION=5.8.2
RPM_CMA_VERSION=5.8.2-1.cm582.p0.17.el7.x86_64
RPM_CMD_VERSION=5.8.2-1.cm582.p0.17.el7.x86_64
pssh -h list -P "sudo /bin/systemctl stop cloudera-scm-agent"
sudo /opt/cm-${RELEASE_VERSION}/etc/init.d/cloudera-scm-server stop
pssh -h list -P "sudo umount /run/cloudera-scm-agent/process"
pssh -h list -P "sudo rm -rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/x86_64/7/cloudera* /var/log/cloudera* /var/run/cloudera* /etc/cloudera* /usr/lib64/cmf /etc/init.d/cloudera-scm-agent /etc/rc3.d/S90cloudera-scm-agent /etc/cloudera-scm-agent"
pssh -h list -P "sudo rpm -e --noscripts --nodeps cloudera-manager-agent-${RPM_CMA_VERSION}"
pssh -h list -P "sudo rpm -e --noscripts --nodeps cloudera-manager-daemons-${RPM_CMD_VERSION}"
pssh -h list -P "sudo rm -rf /var/lib/hadoop-* /var/lib/impala /var/lib/solr /var/lib/zookeeper /var/lib/hue /var/lib/oozie /var/lib/pgsql /var/lib/sqoop2 /data/dfs/ /data/impala/ /data/yarn/ /dfs/ /impala/ /yarn/ /var/run/hadoop-*/ /var/run/hdfs-*/ /usr/bin/hadoop* /usr/bin/zookeeper* /usr/bin/hbase* /usr/bin/hive* /usr/bin/hdfs /usr/bin/mapred /usr/bin/yarn /usr/bin/sqoop* /usr/bin/oozie /etc/hadoop* /etc/zookeeper* /etc/hive* /etc/hue /etc/impala /etc/sqoop* /etc/oozie /etc/hbase* /etc/hcatalog /var/lib/flume-ng /var/lib/hadoop* /var/lib/hue /var/lib/navigator /var/lib/oozie /var/lib/solr /var/lib/sqoop* /var/lib/zookeeper /var/lib/hbase /var/lib/hive /var/lib/impala /var/lib/spark"
pssh -h list -P "sudo rm -rf /opt/cloudera"
pssh -h list -P "sudo userdel -r cloudera-scm"
pssh -h list -P "sudo rm -rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/x86_64/7/cloudera* /var/log/cloudera* /var/run/cloudera* /etc/cloudera* /usr/lib64/cmf"
pssh -h list -P "sudo rm -rf /opt/cloudera /opt/cm-${RELEASE_VERSION}/"
Appendix: agent uninstall script
/sbin/service cloudera-scm-agent stop
/bin/sed -e 's/\(server_host=\).*/\1localhost/' -i /etc/cloudera-scm-agent/config.ini
/bin/yum -y erase cloudera-manager-agent
/bin/rm -rf /var/log/cloudera-scm-agent/
/bin/rm -rf /etc/cloudera-scm-agent/