Sunday, 24 July 2022

Hive: Copy data from one table to other table

In this post, I am going to explain how to copy data from one table ‘t1’ to other table ‘t2’

 

Syntax

INSERT INTO TABLE t2 
SELECT * FROM t1;

 

Above statement copy the records from table t1 to the table t2.

 

Step 1: Create tables t1 and t2 by executing below statements.

 

CREATE TABLE t1 (id INT, name STRING);
create table t1 like t1;

hive> CREATE TABLE t1 (id INT, name STRING);
OK
Time taken: 0.683 seconds
hive> ;
hive> ;
hive> create table t2 like t1;
OK
Time taken: 0.106 seconds
hive> ;
hive> ;
hive> DESCRIBE t1;
OK
id                  	int                 	                    
name                	string              	                    
Time taken: 0.163 seconds, Fetched: 2 row(s)
hive> ;
hive> ;
hive> DESCRIBE t2;
OK
id                  	int                 	                    
name                	string              	                    
Time taken: 0.07 seconds, Fetched: 2 row(s)

Step 2: Insert some records into t1 table by executing below command.

INSERT INTO t1 VALUES
(1, 'Ravi'), 
(2, 'Ram'), 
(3, 'Siva');

hive> INSERT INTO t1 VALUES
    > (1, 'Ravi'), 
    > (2, 'Ram'), 
    > (3, 'Siva');
Query ID = cloudera_20220414213030_7582dd8f-703f-4e97-91f4-f07de73e2aba
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1649172504056_0019, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1649172504056_0019/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1649172504056_0019
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2022-04-14 21:31:04,220 Stage-1 map = 0%,  reduce = 0%
2022-04-14 21:31:11,885 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.57 sec
MapReduce Total cumulative CPU time: 1 seconds 570 msec
Ended Job = job_1649172504056_0019
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://quickstart.cloudera:8020/user/hive/warehouse/t1/.hive-staging_hive_2022-04-14_21-30-53_833_1051604796610857441-1/-ext-10000
Loading data to table default.t1
Table default.t1 stats: [numFiles=1, numRows=3, totalSize=20, rawDataSize=17]
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 1.57 sec   HDFS Read: 3792 HDFS Write: 86 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 570 msec
OK
Time taken: 19.451 seconds

Print the content of t1 table.

hive> SELECT * FROM t1;
OK
1	Ravi
2	Ram
3	Siva
Time taken: 0.074 seconds, Fetched: 3 row(s)

Step 3: Copy the data from table t2 to t1 by executing below command.

INSERT INTO TABLE t2 
SELECT * FROM t1;

hive> INSERT INTO TABLE t2 
    > SELECT * FROM t1;
Query ID = cloudera_20220414213333_37d27021-6730-4302-bdff-13dc967677b9
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1649172504056_0020, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1649172504056_0020/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1649172504056_0020
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2022-04-14 21:33:33,623 Stage-1 map = 0%,  reduce = 0%
2022-04-14 21:33:42,213 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.97 sec
MapReduce Total cumulative CPU time: 970 msec
Ended Job = job_1649172504056_0020
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://quickstart.cloudera:8020/user/hive/warehouse/t2/.hive-staging_hive_2022-04-14_21-33-26_362_4959830092155427021-1/-ext-10000
Loading data to table default.t2
Table default.t2 stats: [numFiles=1, numRows=3, totalSize=20, rawDataSize=17]
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 0.97 sec   HDFS Read: 3490 HDFS Write: 86 SUCCESS
Total MapReduce CPU Time Spent: 970 msec
OK
Time taken: 17.09 seconds
hive>

Let’s print the content of tables t1 and t2.

hive> SELECT * FROM t1;
OK
1	Ravi
2	Ram
3	Siva
Time taken: 0.044 seconds, Fetched: 3 row(s)
hive> ;
hive> ;
hive> SELECT * FROM t2;
OK
1	Ravi
2	Ram
3	Siva
Time taken: 0.043 seconds, Fetched: 3 row(s)


  

Previous                                                    Next                                                    Home

No comments:

Post a Comment