Saturday 24 December 2022

Hive: Working with lateral view

In this post, I am going to explain the use of lateral view.

 

Lateral view is used to expand the collection of elements into rows.

 

Suppose, you have data like below.

 

Name

hobbies

Hari

Football,Cricket

Chamu

Trekking,Watching movies

Sailu

Chess,Listening to music

Gopi

Cricket

 

Using lateral view + explode function, we can transform above data like below.

 

Name

Hobby

Hari

Football

Hari

Cricket

Chamu

Trekking

Chamu

Watching movies

Sailu

Chess

Sailu

Listening to music

Gopi

Cricket

 

Find the below working application.

 

Step 1: Create an employee table.

CREATE TABLE employee (
    id INT,
    name STRING,
    hobbies ARRAY<STRING>,
    technology_experience MAP<STRING,STRING>,
    gender_age STRUCT<gender:STRING,age:INT>
 )
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '|'
 COLLECTION ITEMS TERMINATED BY ','
 MAP KEYS TERMINATED BY ':'
 STORED AS TEXTFILE;

hive> CREATE TABLE employee (
    >     id INT,
    >     name STRING,
    >     hobbies ARRAY<STRING>,
    >     technology_experience MAP<STRING,STRING>,
    >     gender_age STRUCT<gender:STRING,age:INT>
    >  )
    >  ROW FORMAT DELIMITED
    >  FIELDS TERMINATED BY '|'
    >  COLLECTION ITEMS TERMINATED BY ','
    >  MAP KEYS TERMINATED BY ':'
    >  STORED AS TEXTFILE;
OK
Time taken: 0.155 seconds
hive> ;
hive> ;
hive> DESC employee;
OK
id                  	int                 	                    
name                	string              	                    
hobbies             	array<string>       	                    
technology_experience	map<string,string>  	                    
gender_age          	struct<gender:string,age:int>	                    
Time taken: 0.041 seconds, Fetched: 5 row(s)

 

Step 2: Create empInfo.txt file and move it to HDFS location /user/hive/warehouse/employee (this is location where Hive stores the data).

 

empInfo.txt

1|Hari|Football,Cricket|Java:3.4Yrs,C:4.5Yrs|Male,30
2|Chamu|Trekking,Watching movies|Selenium:5.6Yrs|Feale,38
3|Sailu|Chess,Listening to music|EmbeddedC:9Yrs|Femle,32
4|Gopi|Cricket|Datastage:11Yrs|Male,32

 

Step 3: Add empInfo.txt file to hdfs location /user/hive/warehouse/employee. Any content you place in the directory /user/hive/warehouse/employee will be the content of the table employee.

[cloudera@quickstart hive]$ vi empInfo.txt
[cloudera@quickstart hive]$ hdfs dfs -put empInfo.txt /user/hive/warehouse/employee
[cloudera@quickstart hive]$ 
[cloudera@quickstart hive]$ hadoop fs -ls /user/hive/warehouse/employee
Found 1 items
-rw-r--r--   1 cloudera supergroup        207 2022-04-18 07:46 /user/hive/warehouse/employee/empInfo.txt

 

Query employee table confirm the data.

hive> SELECT * FROM employee;
OK
1   Hari    ["Football","Cricket"]  {"Java":"3.4Yrs","C":"4.5Yrs"}  {"gender":"Male","age":30}
2   Chamu   ["Trekking","Watching movies"]  {"Selenium":"5.6Yrs"}   {"gender":"Feale","age":38}
3   Sailu   ["Chess","Listening to music"]  {"EmbeddedC":"9Yrs"}    {"gender":"Femle","age":32}
4   Gopi    ["Cricket"] {"Datastage":"11Yrs"}   {"gender":"Male","age":32}
Time taken: 0.409 seconds, Fetched: 4 row(s)

 

Step 4: Explode employee hobbies.

hive> SELECT explode(hobbies) FROM employee;
OK
Football
Cricket
Trekking
Watching movies
Chess
Listening to music
Cricket
Time taken: 0.079 seconds, Fetched: 7 row(s)

 

Explode employee hobby and name together by executing below statement in the terminal.

 

SELECT name, hobby FROM employee lateral view explode(hobbies) employee as hobby;

 

hive> SELECT name, hobby FROM employee lateral view explode(hobbies) employee as hobby;
OK
Hari    Football
Hari    Cricket
Chamu   Trekking
Chamu   Watching movies
Sailu   Chess
Sailu   Listening to music
Gopi    Cricket
Time taken: 0.098 seconds, Fetched: 7 row(s)

 

 

Previous                                                    Next                                                    Home

No comments:

Post a Comment