In the previous post, I explained how to serialize and deserialize data using Apache Avro. There, I generated Employee.java with the “avro-tools” library to perform the serialization and deserialization. In this post, I explain how to serialize and deserialize data without code generation.
It is a three-step process.
1. Create an Employee object from the schema file.
2. Serialize the employee object.
3. Deserialize the employee object.
employee.avsc
{
  "namespace": "tutorial.model",
  "type": "record",
  "name": "Employee",
  "fields": [
    {"name": "firstName", "type": "string"},
    {"name": "lastName", "type": "string"},
    {"name": "age", "type": "int"},
    {"name": "id", "type": "string"},
    {"name": "company", "type": "string"}
  ]
}
The following code creates an Employee object from the schema file.
/* Read schema definition and create schema object */
Schema schema = new Schema.Parser().parse(new File(schemaFileName));

/* Use the schema and create an employee object */
GenericRecord employee = new GenericData.Record(schema);

/* Define employee */
employee.put("firstName", "Hari krishna");
employee.put("lastName", "Gurram");
employee.put("age", 27);
employee.put("id", "E432123");
employee.put("company", "ABCD");
Follow these steps to implement the complete application.
Step 1: Create an Eclipse Maven project “avro_tutorial”.
File -> New -> Other
Select Maven Project and press Next.
Select the check box “Create a simple project (skip archetype selection)” and press Next.
Give the Group Id and Artifact Id as “avro_tutorial” and press Finish.
Step 2: Open “pom.xml” and add the Avro dependency.
pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>avro_tutorial</groupId>
  <artifactId>avro_tutorial</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <properties>
    <avro_version>1.7.7</avro_version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro</artifactId>
      <version>${avro_version}</version>
    </dependency>
  </dependencies>
</project>
Step 3: Create “EmployeeUtil.java” under the package “tutorial.main”.
package tutorial.main;

import java.io.File;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;

public class EmployeeUtil {

    public static void serializeEmployee(GenericRecord emp,
            String serializedFileName, String schemaFileName) throws IOException {
        /* Read schema definition and create schema object */
        Schema schema = new Schema.Parser().parse(new File(schemaFileName));

        DatumWriter<GenericRecord> datumWriter =
                new GenericDatumWriter<GenericRecord>(schema);
        DataFileWriter<GenericRecord> dataFileWriter =
                new DataFileWriter<GenericRecord>(datumWriter);
        try {
            /* Write the file header (including the schema), then the record */
            dataFileWriter.create(schema, new File(serializedFileName));
            dataFileWriter.append(emp);
        } finally {
            /* Close the writer even if create() or append() fails */
            dataFileWriter.close();
        }
    }

    public static GenericRecord deSerializeEmployee(String serializedFileName,
            String schemaFileName) throws IOException {
        /* Read schema definition and create schema object */
        Schema schema = new Schema.Parser().parse(new File(schemaFileName));

        /* Deserialize the employee from disk */
        File file = new File(serializedFileName);
        DatumReader<GenericRecord> datumReader =
                new GenericDatumReader<GenericRecord>(schema);
        DataFileReader<GenericRecord> dataFileReader =
                new DataFileReader<GenericRecord>(file, datumReader);

        GenericRecord employee = null;
        try {
            while (dataFileReader.hasNext()) {
                /*
                 * Reuse employee object by passing it to next(). This saves us
                 * from allocating and garbage collecting many objects for files
                 * with many items. If the file holds several records, the last
                 * one read is returned.
                 */
                employee = dataFileReader.next(employee);
            }
        } finally {
            /* Close the reader even if reading fails */
            dataFileReader.close();
        }
        return employee;
    }

    public static GenericRecord createEmployee(String schemaFileName)
            throws IOException {
        /* Read schema definition and create schema object */
        Schema schema = new Schema.Parser().parse(new File(schemaFileName));

        /* Use the schema and create an employee object */
        GenericRecord employee = new GenericData.Record(schema);

        /* Define employee */
        employee.put("firstName", "Hari krishna");
        employee.put("lastName", "Gurram");
        employee.put("age", 27);
        employee.put("id", "E432123");
        employee.put("company", "ABCD");

        return employee;
    }
}
Step 4: Create “Main.java”.
package tutorial.main;

import java.io.IOException;

import org.apache.avro.generic.GenericRecord;

public class Main {

    public static void main(String[] args) throws IOException {
        /* Build the record from the schema, write it to ser.out, then read it back */
        GenericRecord employee = EmployeeUtil.createEmployee("employee.avsc");
        EmployeeUtil.serializeEmployee(employee, "ser.out", "employee.avsc");

        GenericRecord employee1 = EmployeeUtil.deSerializeEmployee("ser.out", "employee.avsc");
        System.out.println(employee1);
    }
}
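To get a feel for what serializeEmployee writes for each record, here is a rough, library-free sketch of Avro's binary encoding applied to the employee above. It assumes the rules described in the Avro specification: an int or long is written as a zig-zag variable-length integer, a string as its UTF-8 length followed by the bytes, and a record as its fields concatenated in schema order. The class and method names are hypothetical and are not part of the Avro API, and the sketch covers only the record body, not the file header, block framing or sync markers that DataFileWriter adds around it.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/* Hypothetical helper, not part of Avro: hand-encodes one Employee record body. */
public class AvroRecordBodySketch {

    /* Write an int/long the way Avro's binary encoding does:
     * zig-zag first, then 7 bits per byte, least significant group first. */
    public static void writeLong(long value, ByteArrayOutputStream out) {
        long n = (value << 1) ^ (value >> 63);      // zig-zag: 0, -1, 1, -2 -> 0, 1, 2, 3
        while ((n & ~0x7FL) != 0) {
            out.write((int) ((n & 0x7F) | 0x80));   // high bit set: more bytes follow
            n >>>= 7;
        }
        out.write((int) n);
    }

    /* Write a string as its UTF-8 byte length (as a long) followed by the bytes. */
    public static void writeString(String value, ByteArrayOutputStream out) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        writeLong(utf8.length, out);
        out.write(utf8, 0, utf8.length);
    }

    /* Encode the fields in the order employee.avsc declares them. */
    public static byte[] encodeEmployee(String firstName, String lastName,
            int age, String id, String company) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(firstName, out);
        writeString(lastName, out);
        writeLong(age, out);
        writeString(id, out);
        writeString(company, out);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] body = encodeEmployee("Hari krishna", "Gurram", 27, "E432123", "ABCD");
        StringBuilder hex = new StringBuilder();
        for (byte b : body) {
            hex.append(String.format("%02x ", b & 0xFF));
        }
        System.out.println(body.length + " bytes: " + hex.toString().trim());
    }
}
```

Note that no field names appear in the encoded bytes. This is why both the writer and the reader in EmployeeUtil need a schema: the bytes can only be interpreted by walking the schema's fields in order.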