Hadoop: Installation and Basic Usage
Published: 2019-07-30 16:41:02  Editor: 雪饮
Environment:
redhat6.4-i386
Hadoop installation
[root@localhost src]# tar -zxvf hadoop-0.20.2-cdh3u5.tar.gz
[root@localhost src]# ln -sv /usr/local/src/hadoop-0.20.2-cdh3u5 /usr/local/hadoop
[root@localhost src]# rpm -ivh jdk-7u5-linux-i586.rpm
Environment variables
[root@localhost src]# cat /etc/profile.d/java.sh
JAVA_HOME=/usr/java/latest
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME PATH
[root@localhost src]# cat /etc/profile.d/hadoop.sh
HADOOP_HOME=/usr/local/hadoop
PATH=$HADOOP_HOME/bin:$PATH
export HADOOP_HOME PATH
[root@localhost src]# source /etc/profile.d/java.sh
[root@localhost src]# source /etc/profile.d/hadoop.sh
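As a quick sanity check, both commands below should resolve and report the versions of the packages installed above (JDK 7u5 and hadoop-0.20.2-cdh3u5):
[root@localhost src]# java -version
[root@localhost src]# hadoop version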
Configuration
[root@localhost src]# useradd hduser
[root@localhost src]# chown -R hduser.hduser /usr/local/hadoop/
[root@localhost src]# passwd hduser
[root@localhost src]# mkdir -p /hadoop/temp
[root@localhost src]# chown -R hduser.hduser /hadoop/temp/
[root@localhost src]# su - hduser
[hduser@localhost ~]$ cd /usr/local/hadoop/
[hduser@localhost hadoop]$ cat conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/temp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
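Here hadoop.tmp.dir is the base directory for Hadoop's working data (the /hadoop/temp directory created above), and fs.default.name is the NameNode's RPC address. The relative HDFS paths used later in this walkthrough can equivalently be written as full URIs against that address, e.g.:
[hduser@localhost hadoop]$ hadoop fs -ls hdfs://localhost:8020/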
[hduser@localhost hadoop]$ cat conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>
[hduser@localhost hadoop]$ cat conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
SSH key
Open a new session for this step
[root@localhost src]# su - hduser
[hduser@localhost hadoop]$ ssh-keygen -t rsa -P ''
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
01:dc:6f:2d:a4:1a:ca:2a:a5:db:c2:5a:96:a6:34:2d hduser@localhost.localdomain
The key's randomart image is:
+--[ RSA 2048]----+
| ... |
| ... . |
| .+ . |
| . ..+ . |
| . . oS. . |
| o.o . |
|.E=o |
|=B+ |
|=+. |
+-----------------+
[hduser@localhost ~]$ ssh-copy-id -i .ssh/id_rsa.pub hduser@localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is 9c:f1:5f:4f:87:f5:58:e4:6e:b8:88:8d:e2:4c:de:10.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
hduser@localhost's password:
Now try logging into the machine, with "ssh 'hduser@localhost'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
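Passwordless login can then be verified; the command below should run without prompting for a password (the hostname matches the one shown in the key fingerprint above):
[hduser@localhost ~]$ ssh hduser@localhost 'hostname'
localhost.localdomain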
Startup
[hduser@localhost ~]$ hadoop namenode -format
The following command applies to the single-node case only; on a cluster the individual start scripts are run separately instead (see the note after the command).
[hduser@localhost ~]$ start-all.sh
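In this Hadoop 0.20 release, start-all.sh simply chains the two per-subsystem scripts, which is what you would run individually on a cluster:
[hduser@localhost ~]$ start-dfs.sh     # NameNode, DataNode(s), SecondaryNameNode
[hduser@localhost ~]$ start-mapred.sh  # JobTracker, TaskTracker(s)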
Make sure that, besides jps itself, five daemon processes are running:
[hduser@localhost ~]$ jps
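On a healthy pseudo-distributed node the output should look roughly like the following (PIDs will differ):
2482 NameNode
2573 DataNode
2676 SecondaryNameNode
2757 JobTracker
2869 TaskTracker
2934 Jps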
Usage
Creating a directory in HDFS, uploading a file, and viewing it
[hduser@localhost ~]$ hadoop fs -mkdir test
Note: this step may fail with an error like the following:
[hduser@localhost hadoop]$ hadoop fs -mkdir test
19/07/30 01:43:04 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s).
19/07/30 01:43:05 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 1 time(s).
19/07/30 01:43:06 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 2 time(s).
19/07/30 01:43:07 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 3 time(s).
19/07/30 01:43:08 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 4 time(s).
19/07/30 01:43:09 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 5 time(s).
19/07/30 01:43:10 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 6 time(s).
19/07/30 01:43:11 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 7 time(s).
19/07/30 01:43:12 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 8 time(s).
19/07/30 01:43:13 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 9 time(s).
Bad connection to FS. command aborted. exception: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused
If this happens, reformat with "hadoop namenode -format" and restart Hadoop's five daemons.
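A minimal recovery sequence (suitable only for a fresh test setup like this one, since reformatting destroys all data in HDFS):
[hduser@localhost ~]$ stop-all.sh
[hduser@localhost ~]$ hadoop namenode -format
[hduser@localhost ~]$ start-all.sh
[hduser@localhost ~]$ jps   # confirm the five daemons are back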
Create a test file
[hduser@localhost ~]$ cat test.txt
how are you?
how old are you?
what are you doing?
Upload
[hduser@localhost ~]$ hadoop fs -put test.txt test/
View the file
[hduser@localhost hadoop]$ hadoop fs -ls test/
Found 1 items
-rw-r--r-- 1 hduser supergroup 51 2019-07-30 01:47 /user/hduser/test/test.txt
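The uploaded contents can be checked as well; the output is simply the file created above:
[hduser@localhost ~]$ hadoop fs -cat test/test.txt
how are you?
how old are you?
what are you doing?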
Hadoop data-analysis example:
The first argument, test, is the input directory; the second, wordcount-out, is the output directory, which must not exist yet.
[hduser@localhost ~]$ hadoop jar /usr/local/hadoop/hadoop-examples-0.20.2-cdh3u5.jar wordcount test wordcount-out
List all jobs
[hduser@localhost hadoop]$ hadoop job -list all
1 jobs submitted
States are:
Running : 1 Succeded : 2 Failed : 3 Prep : 4
JobId State StartTime UserName Priority SchedulingInfo
job_201907300144_0001 2 1564422551234 hduser NORMAL NA
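A single job can also be queried by its JobId (here using the job from the listing above):
[hduser@localhost ~]$ hadoop job -status job_201907300144_0001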
View the analysis results
[hduser@localhost hadoop]$ hadoop fs -ls wordcount-out
Found 3 items
-rw-r--r-- 1 hduser supergroup 0 2019-07-30 01:49 /user/hduser/wordcount-out/_SUCCESS
drwxr-xr-x - hduser supergroup 0 2019-07-30 01:49 /user/hduser/wordcount-out/_logs
-rw-r--r-- 1 hduser supergroup 47 2019-07-30 01:49 /user/hduser/wordcount-out/part-r-00000
View the output of a specific job
[hduser@localhost ~]$ hadoop fs -cat wordcount-out/part-r-00000
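For the three-line test.txt above, WordCount splits on whitespace, so trailing punctuation makes "you" and "you?" distinct keys. The expected output (which matches the 47 bytes reported for part-r-00000 in the listing above) is:
are	3
doing?	1
how	2
old	1
what	1
you	1
you?	2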
JobTracker console: http://localhost:50030/ (default port)
SecondaryNameNode console: http://localhost:50090/ (default port; the NameNode console is at http://localhost:50070/)
Get HDFS status information
Note that dfsadmin -report requires HDFS superuser privileges, i.e. the user who started the NameNode (hduser here); run as root it fails:
[root@localhost ~]# hadoop dfsadmin -report
report: org.apache.hadoop.security.AccessControlException: Access denied for user root. Superuser privilege is required
Run as hduser, it prints the cluster summary:
[hduser@localhost ~]$ hadoop dfsadmin -report
Configured Capacity: 37560725504 (34.98 GB)
Present Capacity: 33305432064 (31.02 GB)
DFS Remaining: 33305067520 (31.02 GB)
DFS Used: 364544 (356 KB)
DFS Used%: 0%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Inspect file-system details
-openforwrite  print files currently opened for writing
-blocks        print the block report (must be used with -files)
-locations     print the location of every block (must be used with -files)
-racks         print the network topology of the block locations (must be used with -files)
[hduser@localhost ~]$ hadoop fsck / -openforwrite -files -blocks -locations -racks
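fsck also accepts a narrower path; for example, to see the block placement of just the test file uploaded earlier:
[hduser@localhost ~]$ hadoop fsck /user/hduser/test/test.txt -files -blocks -locations -racks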