Tuesday 19 July 2022

Hive: LOAD DATA INPATH: Load data from an HDFS file into a Hive table

The ‘LOAD DATA INPATH’ statement moves data from an HDFS file into a Hive table. It is a move, not a copy: after the load, the source file no longer exists at its original path.

 

Syntax

LOAD DATA INPATH '{HDFS_FILE_PATH}' INTO TABLE {TABLE_NAME}

 

Let’s see it with an example.

 

Step 1: Create the empInfo.txt file in the HDFS location /user/krishna.

 

empInfo.txt

 

1|Hari|Football,Cricket|Java:3.4Yrs,C:4.5Yrs|Male,30
2|Chamu|Trekking,Watching movies|Selenium:5.6Yrs|Feale,38
3|Sailu|Chess,Listening to music|EmbeddedC:9Yrs|Femle,32
4|Gopi|Cricket|Datastage:11Yrs|Male,32

$ hdfs dfs -mkdir -p /user/krishna
$
$ hdfs dfs -put empInfo.txt /user/krishna
$
$ hdfs dfs -ls /user/krishna
Found 1 items
-rw-r--r--   1 krishna supergroup        206 2021-01-16 11:28 /user/krishna/empInfo.txt
$
$ hdfs dfs -cat /user/krishna/empInfo.txt
1|Hari|Football,Cricket|Java:3.4Yrs,C:4.5Yrs|Male,30
2|Chamu|Trekking,Watching movies|Selenium:5.6Yrs|Feale,38
3|Sailu|Chess,Listening to music|EmbeddedC:9Yrs|Femle,32
4|Gopi|Cricket|Datastage:11Yrs|Male,32

 

Step 2: Create the employee table.

hive> CREATE TABLE employee (
    > id INT,
    > name STRING,
    > hobbies ARRAY<STRING>,
    > technology_experience MAP<STRING,STRING>,
    > gender_age STRUCT<gender:STRING,age:INT>
    > )
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY '|'
    > COLLECTION ITEMS TERMINATED BY ','
    > MAP KEYS TERMINATED BY ':'
    > STORED AS TEXTFILE;
OK
Time taken: 0.045 seconds
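
To see how the three delimiters in the table definition carve up a record, here is a minimal sketch in plain shell and awk. It does not call Hive at all: split_record is a hypothetical helper name and the output labels are only for illustration. It mimics the split on '|' (fields), ',' (collection items) and ':' (map keys) for one line of empInfo.txt.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): split one empInfo.txt record
# using the same delimiters as the employee table definition.
split_record() {
  printf '%s\n' "$1" | awk -F'|' '{
    # FIELDS TERMINATED BY "|"
    printf "id=%s\n", $1
    printf "name=%s\n", $2

    # COLLECTION ITEMS TERMINATED BY "," (ARRAY elements)
    n = split($3, hobbies, ",")
    for (i = 1; i <= n; i++) printf "hobbies[%d]=%s\n", i - 1, hobbies[i]

    # MAP entries are split on ",", then key and value on ":"
    n = split($4, pairs, ",")
    for (i = 1; i <= n; i++) {
      split(pairs[i], kv, ":")
      printf "technology_experience[%s]=%s\n", kv[1], kv[2]
    }

    # STRUCT members are also split on the collection delimiter
    split($5, ga, ",")
    printf "gender_age.gender=%s\n", ga[1]
    printf "gender_age.age=%s\n", ga[2]
  }'
}

split_record '1|Hari|Football,Cricket|Java:3.4Yrs,C:4.5Yrs|Male,30'
```

The struct column reuses the collection delimiter, which is why 'Male,30' in the file ends up as separate gender and age members.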

 

Query the employee table. It returns no rows because nothing has been loaded yet.

hive> select * from employee;
OK
Time taken: 0.09 seconds

 

Step 3: Load the data from /user/krishna/empInfo.txt into the employee table.

hive> LOAD DATA INPATH '/user/krishna/empInfo.txt' INTO TABLE employee;
Loading data to table default.employee
OK
Time taken: 0.182 seconds

 

Step 4: Query the employee table.

hive> select * from employee;
OK
1   Hari    ["Football","Cricket"]  {"Java":"3.4Yrs","C":"4.5Yrs"}  {"gender":"Male","age":30}
2   Chamu   ["Trekking","Watching movies"]  {"Selenium":"5.6Yrs"}   {"gender":"Feale","age":38}
3   Sailu   ["Chess","Listening to music"]  {"EmbeddedC":"9Yrs"}    {"gender":"Femle","age":32}
4   Gopi    ["Cricket"] {"Datastage":"11Yrs"}   {"gender":"Male","age":32}
Time taken: 0.078 seconds, Fetched: 4 row(s)

Since loading from an HDFS path is a move (cut and paste) rather than a copy, empInfo.txt no longer exists in /user/krishna/ after the load; Hive has moved it into the table's storage directory. Let’s confirm this by trying to read ‘/user/krishna/empInfo.txt’.

$ hdfs dfs -cat /user/krishna/empInfo.txt
cat: `/user/krishna/empInfo.txt': No such file or directory
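
If the original file has to stay where it is, one possible workaround is to load a copy instead. The sketch below is only an illustration under assumptions: the staging directory /user/krishna/staging and the run and load_from_copy helpers are hypothetical names, not something from the steps above. By default it only prints the commands; set DRY_RUN=0 to run them against a real cluster.

```shell
#!/bin/sh
# Sketch: keep the original HDFS file by loading a staged copy instead.
# STAGING_DIR is a hypothetical path chosen for this example.
SRC=/user/krishna/empInfo.txt
STAGING_DIR=/user/krishna/staging

# Print each command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

load_from_copy() {
  run hdfs dfs -mkdir -p "$STAGING_DIR"
  run hdfs dfs -cp "$SRC" "$STAGING_DIR/"
  # LOAD DATA INPATH moves the staged copy; the file at $SRC stays in place.
  run hive -e "LOAD DATA INPATH '$STAGING_DIR/empInfo.txt' INTO TABLE employee;"
}

load_from_copy
```

The copy costs extra HDFS I/O and temporary space, so it is only worth doing when something else still reads the source file.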


 
