The ‘LOAD DATA INPATH’ statement moves data from an HDFS file into a Hive table. It is a move, not a copy: after the load, the file no longer exists at its original HDFS path.
Syntax
LOAD DATA INPATH '{HDFS FILE PATH}' INTO TABLE {TABLE_NAME};
Let’s see it with an example.
Step 1: Create the file empInfo.txt and copy it to the HDFS location /user/krishna.
empInfo.txt
1|Hari|Football,Cricket|Java:3.4Yrs,C:4.5Yrs|Male,30
2|Chamu|Trekking,Watching movies|Selenium:5.6Yrs|Feale,38
3|Sailu|Chess,Listening to music|EmbeddedC:9Yrs|Femle,32
4|Gopi|Cricket|Datastage:11Yrs|Male,32
$hdfs dfs -mkdir -p /user/krishna
$
$hdfs dfs -put empInfo.txt /user/krishna
$
$hdfs dfs -ls /user/krishna
Found 1 items
-rw-r--r-- 1 krishna supergroup 206 2021-01-16 11:28 /user/krishna/empInfo.txt
$
$hdfs dfs -cat /user/krishna/empInfo.txt
1|Hari|Football,Cricket|Java:3.4Yrs,C:4.5Yrs|Male,30
2|Chamu|Trekking,Watching movies|Selenium:5.6Yrs|Feale,38
3|Sailu|Chess,Listening to music|EmbeddedC:9Yrs|Femle,32
4|Gopi|Cricket|Datastage:11Yrs|Male,32
Step 2: Create table employee.
hive> CREATE TABLE employee (
    > id INT,
    > name STRING,
    > hobbies ARRAY<STRING>,
    > technology_experience MAP<STRING,STRING>,
    > gender_age STRUCT<gender:STRING,age:INT>
    > )
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY '|'
    > COLLECTION ITEMS TERMINATED BY ','
    > MAP KEYS TERMINATED BY ':'
    > STORED AS TEXTFILE;
OK
Time taken: 0.045 seconds
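To see how the three delimiters in the CREATE TABLE statement carve up a line of empInfo.txt, here is a rough sketch in plain shell and awk. It does not use Hive at all: split_row is a hypothetical helper written only for this illustration, and it simply mimics the splitting rules ('|' between fields, ',' between collection items and struct members, ':' between a map key and its value).

```shell
#!/bin/sh
# Sample row copied from empInfo.txt.
row='1|Hari|Football,Cricket|Java:3.4Yrs,C:4.5Yrs|Male,30'

# split_row is a hypothetical helper (not part of Hive). It imitates the
# delimiters declared in the CREATE TABLE statement for one input line.
split_row() {
  printf '%s\n' "$1" | awk -F'|' '{
    # FIELDS TERMINATED BY "|": the five top-level columns.
    print "id=" $1
    print "name=" $2

    # COLLECTION ITEMS TERMINATED BY ",": elements of the ARRAY column.
    n = split($3, hobbies, ",")
    for (i = 1; i <= n; i++) print "hobbies[" (i - 1) "]=" hobbies[i]

    # MAP column: entries split on ",", then key and value split on ":".
    m = split($4, entries, ",")
    for (i = 1; i <= m; i++) {
      split(entries[i], kv, ":")
      print "technology_experience[" kv[1] "]=" kv[2]
    }

    # STRUCT column: members are also separated by the collection delimiter.
    split($5, members, ",")
    print "gender_age.gender=" members[1]
    print "gender_age.age=" members[2]
  }'
}

split_row "$row"
```

Running the script prints one line per parsed value, for example hobbies[0]=Football and technology_experience[Java]=3.4Yrs, which matches the shape of the columns that Hive returns once the file is loaded.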
Query the table employee. It returns no rows because nothing has been loaded yet.
hive> select * from employee;
OK
Time taken: 0.09 seconds
Step 3: Load the data from /user/krishna/empInfo.txt into the employee table.
hive> LOAD DATA INPATH '/user/krishna/empInfo.txt' INTO TABLE employee;
Loading data to table default.employee
OK
Time taken: 0.182 seconds
Step 4: Query the table employee.
hive> select * from employee;
OK
1 Hari ["Football","Cricket"] {"Java":"3.4Yrs","C":"4.5Yrs"} {"gender":"Male","age":30}
2 Chamu ["Trekking","Watching movies"] {"Selenium":"5.6Yrs"} {"gender":"Feale","age":38}
3 Sailu ["Chess","Listening to music"] {"EmbeddedC":"9Yrs"} {"gender":"Femle","age":32}
4 Gopi ["Cricket"] {"Datastage":"11Yrs"} {"gender":"Male","age":32}
Time taken: 0.078 seconds, Fetched: 4 row(s)
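Because loading from an HDFS path is a move (cut and paste) operation, empInfo.txt is no longer present in /user/krishna/. Hive has moved it into the storage directory of the employee table, which for a managed table like this one lives under the warehouse directory configured by hive.metastore.warehouse.dir. Let’s confirm this by trying to read ‘/user/krishna/empInfo.txt’.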
$hdfs dfs -cat /user/krishna/empInfo.txt
cat: `/user/krishna/empInfo.txt': No such file or directory