Step 1: Create the user table.
hive> CREATE TABLE user (
> id INT,
> name STRING
> );
OK
Time taken: 0.252 seconds
hive> DESC user;
OK
id int
name string
Time taken: 0.12 seconds, Fetched: 2 row(s)
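Since the CREATE TABLE statement above specifies no ROW FORMAT, Hive falls back to its default text format, which separates fields with the non-printable Ctrl-A character ('\001'). If you prefer a human-readable delimiter, you can declare one explicitly. The sketch below is a hypothetical variant (the table name user_csv is illustrative; the rest of this post uses the default-format table above):

```sql
-- Sketch: same schema, but with an explicit comma delimiter
-- instead of Hive's default Ctrl-A ('\001').
CREATE TABLE user_csv (
  id INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
```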
Step 2: Execute an INSERT query to add a new record to the user table.
hive> INSERT INTO user VALUES(1, 'Krishna');
Query ID = cloudera_20220414020505_7dcebdb2-e470-423a-a09e-583c4733033b
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1649172504056_0014, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1649172504056_0014/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1649172504056_0014
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2022-04-14 02:05:20,856 Stage-1 map = 0%, reduce = 0%
2022-04-14 02:05:28,724 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.59 sec
MapReduce Total cumulative CPU time: 1 seconds 590 msec
Ended Job = job_1649172504056_0014
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://quickstart.cloudera:8020/user/hive/warehouse/user/.hive-staging_hive_2022-04-14_02-05-12_954_681757281437170910-1/-ext-10000
Loading data to table default.user
Table default.user stats: [numFiles=1, numRows=1, totalSize=10, rawDataSize=9]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 1.59 sec HDFS Read: 3793 HDFS Write: 78 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 590 msec
OK
Time taken: 17.151 seconds
Each INSERT query internally triggers a MapReduce job.
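Because every INSERT statement launches its own MapReduce job, inserting rows one at a time is slow. A single multi-row INSERT (supported since Hive 0.14) writes all the rows through one job. The values below are illustrative:

```sql
-- One MapReduce job for all three rows, instead of three jobs:
INSERT INTO user VALUES (3, 'Sita'), (4, 'Lakshman'), (5, 'Hanuman');
```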
Let’s query all the records from the user table.
hive> SELECT * FROM user;
OK
1	Krishna
Time taken: 0.059 seconds, Fetched: 1 row(s)
Let’s list the HDFS folder /user/hive/warehouse/user (this is the location where Hive stores the records of the user table; you can find it by executing the command ‘DESCRIBE EXTENDED user;’ from the Hive shell).
[cloudera@quickstart conf]$ hadoop fs -ls /user/hive/warehouse/user
Found 1 items
-rwxrwxrwx 1 cloudera supergroup 10 2022-04-14 02:05 /user/hive/warehouse/user/000000_0
Every insert creates a new file in the directory /user/hive/warehouse/user.
Let’s print the content of file ‘/user/hive/warehouse/user/000000_0’.
[cloudera@quickstart conf]$ hadoop fs -cat /user/hive/warehouse/user/000000_0
1Krishna
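The id and name appear glued together because Hive’s default text format separates fields with the non-printable Ctrl-A character ('\001'), which ‘hadoop fs -cat’ does not render. A minimal Python sketch of how such a line splits back into fields (the raw line is hard-coded here as an assumption of what 000000_0 contains):

```python
# Hive's default text SerDe delimits fields with Ctrl-A ('\x01'),
# which is invisible in `hadoop fs -cat` output.
raw_line = "1\x01Krishna"   # assumed content of 000000_0

fields = raw_line.split("\x01")
print(fields)  # ['1', 'Krishna']
```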
Let’s add one more record to the user table and confirm that one more file gets created in the folder /user/hive/warehouse/user.
hive> INSERT INTO user VALUES(2, 'Ram');
Query ID = cloudera_20220414021010_f28696f4-52d8-41b4-8a2d-305e545f9a0a
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1649172504056_0015, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1649172504056_0015/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1649172504056_0015
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2022-04-14 02:10:26,718 Stage-1 map = 0%, reduce = 0%
2022-04-14 02:10:33,080 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.2 sec
MapReduce Total cumulative CPU time: 1 seconds 200 msec
Ended Job = job_1649172504056_0015
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://quickstart.cloudera:8020/user/hive/warehouse/user/.hive-staging_hive_2022-04-14_02-10-20_492_2704729077173014892-1/-ext-10000
Loading data to table default.user
Table default.user stats: [numFiles=2, numRows=2, totalSize=16, rawDataSize=14]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 1.2 sec HDFS Read: 3887 HDFS Write: 74 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 200 msec
OK
Time taken: 13.826 seconds
Let’s list the folder ‘/user/hive/warehouse/user’ again.
[cloudera@quickstart conf]$ hadoop fs -ls /user/hive/warehouse/user
Found 2 items
-rwxrwxrwx 1 cloudera supergroup 10 2022-04-14 02:05 /user/hive/warehouse/user/000000_0
-rwxrwxrwx 1 cloudera supergroup 6 2022-04-14 02:10 /user/hive/warehouse/user/000000_0_copy_1
Let’s print the contents of both files.
[cloudera@quickstart conf]$ hadoop fs -cat /user/hive/warehouse/user/000000_0
1Krishna
[cloudera@quickstart conf]$
[cloudera@quickstart conf]$ hadoop fs -cat /user/hive/warehouse/user/000000_0_copy_1
2Ram
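Since each insert adds another file, a table loaded this way gradually runs into the HDFS “small files” problem. One hedged way to compact a text table is to rewrite it over itself, which funnels all rows through a single job and merges them into fewer output files:

```sql
-- Sketch: rewrite the table's own data to merge many small
-- files into fewer, larger ones.
INSERT OVERWRITE TABLE user SELECT * FROM user;
```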