Hadoop 2.7.1 Deployment
Published: 2019-06-27



Environment preparation:

We will use one host for the primary NameNode, one for the SecondaryNameNode (which also carries a DataNode), and two dedicated DataNodes. OS: Red Hat Enterprise Linux Server release 6.6 (Santiago)

Configure /etc/hosts:

192.168.83.11 hd1
192.168.83.22 hd2
192.168.83.33 hd3
192.168.83.44 hd4

Change the hostname:

vi /etc/sysconfig/network

NETWORKING=yes
HOSTNAME=hd1
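The change in /etc/sysconfig/network only takes effect after a reboot. As a small added sketch (not in the original), you can also apply it to the running system immediately:

# Apply the hostname to the running session on RHEL 6 (repeat on each node with its own name)
hostname hd1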

Create the hadoop runtime user

groupadd -g 10010 hadoop
useradd -u 1001 -g 10010 -d /home/hadoop hadoop
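The new account presumably also needs a password before you can log in to it on each node (the original omits this step); a sketch:

# Give the hadoop user a password, then switch to it for the remaining steps
passwd hadoop
su - hadoop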

SSH passwordless (key-based) trust setup:

This mainly covers setting up passwordless SSH from hd1 to every node. Generate a key pair and authorized_keys file on each node, copy them all to hd1 and merge them, then distribute the merged authorized_keys from hd1 back to each child node (see the sketch after the per-host commands below).

hd1:
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys

hd2:
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys hd1:~/.ssh/authorized_keys2

hd3:
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys hd1:~/.ssh/authorized_keys3

hd4:
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys hd1:~/.ssh/authorized_keys4
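The merge-and-distribute step on hd1 described above is left implicit; a minimal sketch, run as the hadoop user on hd1 and assuming the copies arrived as authorized_keys2/3/4:

# Merge the keys collected from hd2, hd3 and hd4 into hd1's authorized_keys
cat ~/.ssh/authorized_keys2 ~/.ssh/authorized_keys3 ~/.ssh/authorized_keys4 >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
# Push the merged file back out to every node
for h in hd2 hd3 hd4; do scp ~/.ssh/authorized_keys $h:~/.ssh/authorized_keys; done
# Verify passwordless login from hd1
ssh hd2 hostname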

Configure the hadoop user's environment variables:

vi ~/.bash_profile

export JAVA_HOME=/usr/java/jdk1.8.0_11
export JRE_HOME=/usr/java/jdk1.8.0_11/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export HADOOP_INSTALL=/usr/hadoop/hadoop-2.7.1
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin
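Then reload the profile so the variables take effect in the current shell:

source ~/.bash_profile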

Install the JDK

[hadoop@hd1 ~]$ java -version
java version "1.8.0_11"
Java(TM) SE Runtime Environment (build 1.8.0_11-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.11-b03, mixed mode)

Configure Hadoop's basic environment settings, such as the JDK location and the paths for configuration, data files, and logs. These live in /usr/hadoop/hadoop-2.7.1/etc/hadoop/hadoop-env.sh; modify the following line:

export JAVA_HOME=/usr/java/jdk1.8.0_11

Install Hadoop

Download hadoop-2.7.1.tar.gz from the official site http://hadoop.apache.org and extract it on the master only; my extraction path is /usr/hadoop/hadoop-2.7.1/.
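As a sketch of the download and extraction (the archive URL below is an assumption; any Apache mirror works):

# Fetch and unpack Hadoop 2.7.1 under /usr/hadoop
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
mkdir -p /usr/hadoop
tar -xzf hadoop-2.7.1.tar.gz -C /usr/hadoop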

Configure Hadoop:

Hadoop can run in three modes: standalone, pseudo-distributed, and fully distributed (cluster); the sections below configure each in turn. You can pass the --config option at startup to point at a specific configuration directory, which lets several modes coexist; we normally also set HADOOP_INSTALL in the environment to the Hadoop root directory.

Standalone mode: runs on a single node; Hadoop's default configuration is standalone, so it can be started directly.
Pseudo-distributed mode: simulates a cluster on a single host.

Hadoop is divided into core, HDFS, and MapReduce parts, and the configuration is split across four files accordingly: core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml.
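The pseudo-distributed files edited below are assumed to live in their own directory, etc/hadoop_pseudo (the startup commands later pass this path to --config); a sketch of creating it from the defaults:

# Keep each mode in its own config directory, cloned from the stock configuration
cp -p -r $HADOOP_INSTALL/etc/hadoop $HADOOP_INSTALL/etc/hadoop_pseudo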

Now configure pseudo-distributed mode:

Edit the Hadoop core configuration file, core-site.xml. This sets the address (and optionally port) of the HDFS master, i.e. the NameNode.

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost/</value>
    </property>
</configuration>

Configure hdfs-site.xml

Modify the HDFS replication setting. The default replication factor is 3; for a single-node pseudo cluster we set it to 1.

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

Configure mapred-site.xml

Modify the MapReduce configuration. In Hadoop 1.x this file held the JobTracker address and port; in Hadoop 2.x it instead selects YARN as the framework that runs MapReduce jobs.

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Configure yarn-site.xml

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>localhost</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

Format the HDFS filesystem:

hadoop namenode -format

STARTUP_MSG:   java = 1.8.0_11
************************************************************/
18/07/23 17:04:32 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
18/07/23 17:04:32 INFO namenode.NameNode: createNameNode [-format]
18/07/23 17:04:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: CID-7ebdb3d2-19c1-4c1a-a64f-c3c149d1c07f
18/07/23 17:04:34 INFO namenode.FSNamesystem: No KeyProvider found.
18/07/23 17:04:34 INFO namenode.FSNamesystem: fsLock is fair:true
18/07/23 17:04:34 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
18/07/23 17:04:34 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
18/07/23 17:04:34 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
18/07/23 17:04:34 INFO blockmanagement.BlockManager: The block deletion will start around 2018 Jul 23 17:04:34
18/07/23 17:04:34 INFO util.GSet: Computing capacity for map BlocksMap
18/07/23 17:04:34 INFO util.GSet: VM type       = 64-bit
18/07/23 17:04:34 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
18/07/23 17:04:34 INFO util.GSet: capacity      = 2^21 = 2097152 entries
18/07/23 17:04:34 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
18/07/23 17:04:34 INFO blockmanagement.BlockManager: defaultReplication         = 3
18/07/23 17:04:34 INFO blockmanagement.BlockManager: maxReplication             = 512
18/07/23 17:04:34 INFO blockmanagement.BlockManager: minReplication             = 1
18/07/23 17:04:34 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
18/07/23 17:04:34 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
18/07/23 17:04:34 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
18/07/23 17:04:34 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
18/07/23 17:04:34 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
18/07/23 17:04:34 INFO namenode.FSNamesystem: fsOwner             = hadoop (auth:SIMPLE)
18/07/23 17:04:34 INFO namenode.FSNamesystem: supergroup          = supergroup
18/07/23 17:04:34 INFO namenode.FSNamesystem: isPermissionEnabled = true
18/07/23 17:04:34 INFO namenode.FSNamesystem: HA Enabled: false
18/07/23 17:04:34 INFO namenode.FSNamesystem: Append Enabled: true
18/07/23 17:04:35 INFO util.GSet: Computing capacity for map INodeMap
18/07/23 17:04:35 INFO util.GSet: VM type       = 64-bit
18/07/23 17:04:35 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
18/07/23 17:04:35 INFO util.GSet: capacity      = 2^20 = 1048576 entries
18/07/23 17:04:35 INFO namenode.FSDirectory: ACLs enabled? false
18/07/23 17:04:35 INFO namenode.FSDirectory: XAttrs enabled? true
18/07/23 17:04:35 INFO namenode.FSDirectory: Maximum size of an xattr: 16384
18/07/23 17:04:35 INFO namenode.NameNode: Caching file names occuring more than 10 times
18/07/23 17:04:35 INFO util.GSet: Computing capacity for map cachedBlocks
18/07/23 17:04:35 INFO util.GSet: VM type       = 64-bit
18/07/23 17:04:35 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
18/07/23 17:04:35 INFO util.GSet: capacity      = 2^18 = 262144 entries
18/07/23 17:04:35 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
18/07/23 17:04:35 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
18/07/23 17:04:35 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
18/07/23 17:04:35 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
18/07/23 17:04:35 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
18/07/23 17:04:35 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
18/07/23 17:04:35 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
18/07/23 17:04:35 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
18/07/23 17:04:35 INFO util.GSet: Computing capacity for map NameNodeRetryCache
18/07/23 17:04:35 INFO util.GSet: VM type       = 64-bit
18/07/23 17:04:35 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
18/07/23 17:04:35 INFO util.GSet: capacity      = 2^15 = 32768 entries
Re-format filesystem in Storage Directory /tmp/hadoop-hadoop/dfs/name ? (Y or N) Y
18/07/23 17:07:57 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1239596151-192.168.83.11-1532336877181
18/07/23 17:07:57 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
18/07/23 17:07:57 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
18/07/23 17:07:57 INFO util.ExitUtil: Exiting with status 0
18/07/23 17:07:57 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hd1/192.168.83.11
************************************************************/

Start HDFS:

[hadoop@hd1 hadoop_pseudo]$ start-dfs.sh --config /usr/hadoop/hadoop-2.7.1/etc/hadoop_pseudo/
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/hadoop/hadoop-2.7.1/logs/hadoop-hadoop-namenode-hd1.out
localhost: starting datanode, logging to /usr/hadoop/hadoop-2.7.1/logs/hadoop-hadoop-datanode-hd1.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/hadoop/hadoop-2.7.1/logs/hadoop-hadoop-secondarynamenode-hd1.out
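At this point the NameNode web UI should answer on port 50070, the Hadoop 2.x default; a quick check (this curl probe is an illustration added here, not from the original):

# Confirm the NameNode HTTP UI is up (default port 50070 in Hadoop 2.x)
curl -s http://localhost:50070/ | head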

Start YARN:

[hadoop@hd1 hadoop_pseudo]$ start-yarn.sh --config /usr/hadoop/hadoop-2.7.1/etc/hadoop_pseudo/
starting yarn daemons
starting resourcemanager, logging to /usr/hadoop/hadoop-2.7.1/logs/yarn-hadoop-resourcemanager-hd1.out
localhost: starting nodemanager, logging to /usr/hadoop/hadoop-2.7.1/logs/yarn-hadoop-nodemanager-hd1.out

Start the MapReduce JobHistory server:

mr-jobhistory-daemon.sh start historyserver

Alternatively, the following command starts all of the pseudo-distributed components at once:

[hadoop@hd1 ~]$ start-all.sh --config /usr/hadoop/hadoop-2.7.1/etc/hadoop_pseudo

Of course, you can also use HADOOP_CONF_DIR to specify the Hadoop configuration directory, as follows:

export HADOOP_CONF_DIR=$HADOOP_INSTALL/etc/hadoop_pseudo

[hadoop@hd1 ~]$ hadoop fs -ls /
18/07/24 07:29:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 24 items
-rw-r--r--   1 root root          0 2018-07-23 16:37 /.autofsck
dr-xr-xr-x   - root root       4096 2018-07-19 20:35 /bin
dr-xr-xr-x   - root root       1024 2018-07-19 19:06 /boot
drwxr-xr-x   - root root       4096 2014-08-07 13:29 /cgroup
drwxr-xr-x   - root root       3800 2018-07-23 16:37 /dev
drwxr-xr-x   - root root      12288 2018-07-23 16:37 /etc
drwxr-xr-x   - root root       4096 2018-07-20 16:37 /home
dr-xr-xr-x   - root root       4096 2018-07-19 19:05 /lib
dr-xr-xr-x   - root root      12288 2018-07-19 20:35 /lib64
drwx------   - root root      16384 2018-07-19 19:00 /lost+found
drwxr-xr-x   - root root       4096 2011-06-28 22:13 /media
drwxr-xr-x   - root root          0 2018-07-23 16:37 /misc
drwxr-xr-x   - root root       4096 2011-06-28 22:13 /mnt
drwxr-xr-x   - root root          0 2018-07-23 16:37 /net
drwxr-xr-x   - root root       4096 2018-07-19 19:05 /opt
dr-xr-xr-x   - root root          0 2018-07-23 16:37 /proc
dr-xr-x---   - root root       4096 2018-07-19 22:53 /root
dr-xr-xr-x   - root root      12288 2018-07-19 20:35 /sbin
drwxr-xr-x   - root root          0 2018-07-23 16:37 /selinux
drwxr-xr-x   - root root       4096 2011-06-28 22:13 /srv
drwxr-xr-x   - root root          0 2018-07-23 16:37 /sys
drwxrwxrwt   - root root       4096 2018-07-24 07:27 /tmp
drwxr-xr-x   - root root       4096 2018-07-20 18:29 /usr
drwxr-xr-x   - root root       4096 2018-07-19 19:05 /var
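To exercise HDFS and YARN end to end, you can run one of the bundled examples; a sketch, assuming the standard 2.7.1 jar layout:

# Estimate pi with 2 map tasks and 5 samples each, using the bundled examples jar
hadoop jar $HADOOP_INSTALL/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 2 5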

Postscript: if you install Hadoop 2.7.1 on Windows, a few extra files are needed (link: https://pan.baidu.com/s/1w1-cmTDTLWC_sFNWpxrOQA password: ozzw), otherwise startup fails. Copy those files into the bin directory under the Hadoop installation directory.

Cluster-mode deployment:

cp -p -r $HADOOP_INSTALL/etc/hadoop $HADOOP_INSTALL/etc/hadoop_cluster

# Move the stock config aside first (added step; otherwise ln -s would create the link inside the existing directory)
mv /usr/hadoop/hadoop-2.7.1/etc/hadoop /usr/hadoop/hadoop-2.7.1/etc/hadoop.orig
ln -s /usr/hadoop/hadoop-2.7.1/etc/hadoop_cluster /usr/hadoop/hadoop-2.7.1/etc/hadoop

Five files need to be modified: slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml.

vi core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hd1:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/hadoop/hadoop-2.7.1/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
</configuration>

hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hd2:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/hadoop/hadoop-2.7.1/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/hadoop/hadoop-2.7.1/tmp/dfs/data</value>
    </property>
</configuration>

mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hd1:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hd1:19888</value>
    </property>
</configuration>

yarn-site.xml

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hd1</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

slaves

vi slaves

hd2
hd3
hd4

 

Copy the hadoop_cluster directory to hd2, hd3, and hd4:

[hadoop etc]$ scp -p -r hadoop_cluster/ hadoop@hd2:/usr/hadoop/hadoop-2.7.1/etc/

[hadoop etc]$ scp -p -r hadoop_cluster/ hadoop@hd3:/usr/hadoop/hadoop-2.7.1/etc/

[hadoop etc]$ scp -p -r hadoop_cluster/ hadoop@hd4:/usr/hadoop/hadoop-2.7.1/etc/

Format:

hadoop --config /usr/hadoop/hadoop-2.7.1/etc/hadoop_cluster/ namenode -format

Start:

[hadoop@hd1 ~]$ start-dfs.sh  

[hadoop@hd1 ~]$ start-yarn.sh 

[hadoop@hd1 ~]$ mr-jobhistory-daemon.sh start historyserver
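To confirm that the DataNodes registered with the NameNode, ask for a cluster report (a verification step added here as a sketch):

# Should list hd2, hd3 and hd4 as live datanodes if slaves and SSH are correct
hdfs dfsadmin -report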

hd1 hosts the NameNode instance; hd2 hosts the SecondaryNameNode and a DataNode instance;

hd3 and hd4 host DataNode instances.

      NameNode   SecondaryNameNode   DataNode
hd1   Y
hd2              Y                   Y
hd3                                  Y
hd4                                  Y

hd1:

[hadoop@hd1 sbin]$ jps
7764 Jps
7017 ResourceManager
6734 NameNode

 

hd2:

[root@hd2 ~]# jps
3222 NodeManager
3142 SecondaryNameNode
3962 Jps
3035 DataNode

 hd3:

[root@hd3 ~]# jps
3600 DataNode
3714 NodeManager
4086 Jps

hd4:

[root@hd4 ~]# jps
3024 NodeManager
3373 Jps
2909 DataNode

YARN is the resource-management framework; it runs NodeManager and ResourceManager processes. The NMs run on the DataNode hosts, while the RM runs on the NameNode host.
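A quick way to see which NodeManagers registered with the ResourceManager (an added check, not in the original):

# Lists the NodeManager hosts known to the RM
yarn node -list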


Reposted from: https://my.oschina.net/u/3862440/blog/1862524
