Thursday, 28 July 2022

Hive: Insert array elements using INSERT statement

In this post, I explain how to insert data of type array into a table using the INSERT statement.

 

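Hive's INSERT ... VALUES syntax does not accept complex types such as arrays, so the row is inserted with INSERT ... SELECT instead, building the array with the array() function. Hive versions before 0.13 require a FROM clause in every SELECT, so on those versions the SELECT has to read from a dummy table.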
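
For Hive versions that insist on a FROM clause, the dummy-table approach could look roughly like the sketch below. The table name dummy is hypothetical: it stands for any existing table that holds at least one row, and it is not part of the original example.

```sql
-- "dummy" is a hypothetical name for any existing table with at least one row.
-- The SELECT list contains only constants, so the contents of dummy are never read;
-- LIMIT 1 keeps the SELECT from emitting one copy of the row per row in dummy.
INSERT INTO TABLE employee
SELECT 1, 'Krishna', array('playing cricket', 'cooking')
FROM dummy
LIMIT 1;
```

On newer Hive versions the FROM clause can simply be dropped, which is the form used in the rest of this post.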

 

Example

insert into employee select "1","Krishna",array('playing cricket', 'cooking');

Step 1: Create employee table.

 
CREATE TABLE employee (
    id INT,
    name STRING,
    hobbies ARRAY<STRING>
)
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY ','
STORED AS TEXTFILE;

hive> CREATE TABLE employee (
    >     id INT,
    >     name STRING,
    >     hobbies ARRAY<STRING>
    >  )
    >  ROW FORMAT DELIMITED
    >  COLLECTION ITEMS TERMINATED BY ','
    >  STORED AS TEXTFILE;
OK
Time taken: 0.109 seconds
hive> DESCRIBE employee;
OK
id                      int                                         
name                    string                                      
hobbies                 array<string>                               
Time taken: 0.038 seconds, Fetched: 3 row(s)

Step 2: Insert a record into the employee table.

insert into employee select "1","Krishna",array('playing cricket', 'cooking');

hive> insert into employee select "1","Krishna",array('playing cricket', 'cooking');
Query ID = cloudera_20220418080808_136c810b-a426-4008-903b-aea5f5b46450
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1649172504056_0031, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1649172504056_0031/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1649172504056_0031
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2022-04-18 08:08:11,564 Stage-1 map = 0%,  reduce = 0%
2022-04-18 08:08:19,268 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.76 sec
MapReduce Total cumulative CPU time: 1 seconds 760 msec
Ended Job = job_1649172504056_0031
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://quickstart.cloudera:8020/user/hive/warehouse/employee/.hive-staging_hive_2022-04-18_08-08-02_830_1093859265942969682-1/-ext-10000
Loading data to table default.employee
Table default.employee stats: [numFiles=1, numRows=1, totalSize=34, rawDataSize=33]
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 1.76 sec   HDFS Read: 4186 HDFS Write: 106 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 760 msec
OK
Time taken: 17.742 seconds
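
When the hobbies arrive as a single delimited string rather than as separate literals, Hive's built-in split() function may be a convenient alternative to array(). The snippet below is only a sketch that shows the array split() produces; it does not insert anything, and the input string is the same sample data as above.

```sql
-- split(str, pattern) returns ARRAY<STRING>; the second argument is a regular expression.
-- The result can be used in place of array(...) in the INSERT ... SELECT statement above.
SELECT split('playing cricket,cooking', ',') AS hobbies;
```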

Step 3: Query the employee table to confirm that the record was inserted.

hive> SELECT * FROM employee;
OK
1   Krishna ["playing cricket","cooking"]
Time taken: 0.051 seconds, Fetched: 1 row(s)
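
To check the individual elements rather than the whole array, queries along the following lines can be used. This is a minimal sketch; the column aliases first_hobby, hobby_count and hobby, and the lateral view alias h, are names chosen here for illustration.

```sql
-- Array indexes are zero-based; size() returns the number of elements in the array.
SELECT name, hobbies[0] AS first_hobby, size(hobbies) AS hobby_count
FROM employee;

-- LATERAL VIEW explode() emits one output row per array element.
SELECT name, hobby
FROM employee
LATERAL VIEW explode(hobbies) h AS hobby;
```

For the single record inserted above, the first query should return one row and the second should return two rows, one per hobby.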

