Monday 4 July 2022

Atlas client: update Atlas type definition

This post continues my previous one, in which I defined a new type ‘tablet’.

 

The ‘tablet’ definition looks like this:

{
  "category": "ENTITY",
  "guid": "e7201b2c-a87d-49ef-aac0-c228b92f9763",
  "createdBy": "admin",
  "updatedBy": "admin",
  "createTime": 1644561370044,
  "updateTime": 1644561370044,
  "version": 1,
  "name": "tablet",
  "description": "Represent a laptoo specification",
  "typeVersion": "1.0",
  "attributeDefs": [
    {
      "name": "screen_size",
      "typeName": "string",
      "isOptional": false,
      "cardinality": "SINGLE",
      "valuesMinCount": 1,
      "valuesMaxCount": 1,
      "isUnique": false,
      "isIndexable": true,
      "includeInNotification": false,
      "searchWeight": -1
    },
    {
      "name": "operating_system",
      "typeName": "string",
      "isOptional": false,
      "cardinality": "SINGLE",
      "valuesMinCount": 1,
      "valuesMaxCount": 1,
      "isUnique": false,
      "isIndexable": true,
      "includeInNotification": false,
      "searchWeight": -1
    }
  ],
  "superTypes": [
    "DataSet"
  ],
  "subTypes": [],
  "relationshipAttributeDefs": [
    {
      "name": "inputToProcesses",
      "typeName": "array<Process>",
      "isOptional": true,
      "cardinality": "SET",
      "valuesMinCount": -1,
      "valuesMaxCount": -1,
      "isUnique": false,
      "isIndexable": false,
      "includeInNotification": false,
      "searchWeight": -1,
      "relationshipTypeName": "dataset_process_inputs",
      "isLegacyAttribute": false
    },
    {
      "name": "pipeline",
      "typeName": "spark_ml_pipeline",
      "isOptional": true,
      "cardinality": "SINGLE",
      "valuesMinCount": -1,
      "valuesMaxCount": -1,
      "isUnique": false,
      "isIndexable": false,
      "includeInNotification": false,
      "searchWeight": -1,
      "relationshipTypeName": "spark_ml_pipeline_dataset",
      "isLegacyAttribute": false
    },
    {
      "name": "schema",
      "typeName": "array<avro_schema>",
      "isOptional": true,
      "cardinality": "SET",
      "valuesMinCount": -1,
      "valuesMaxCount": -1,
      "isUnique": false,
      "isIndexable": false,
      "includeInNotification": false,
      "searchWeight": -1,
      "relationshipTypeName": "avro_schema_associatedEntities",
      "isLegacyAttribute": false
    },
    {
      "name": "model",
      "typeName": "spark_ml_model",
      "isOptional": true,
      "cardinality": "SINGLE",
      "valuesMinCount": -1,
      "valuesMaxCount": -1,
      "isUnique": false,
      "isIndexable": false,
      "includeInNotification": false,
      "searchWeight": -1,
      "relationshipTypeName": "spark_ml_model_dataset",
      "isLegacyAttribute": false
    },
    {
      "name": "meanings",
      "typeName": "array<AtlasGlossaryTerm>",
      "isOptional": true,
      "cardinality": "SET",
      "valuesMinCount": -1,
      "valuesMaxCount": -1,
      "isUnique": false,
      "isIndexable": false,
      "includeInNotification": false,
      "searchWeight": -1,
      "relationshipTypeName": "AtlasGlossarySemanticAssignment",
      "isLegacyAttribute": false
    },
    {
      "name": "outputFromProcesses",
      "typeName": "array<Process>",
      "isOptional": true,
      "cardinality": "SET",
      "valuesMinCount": -1,
      "valuesMaxCount": -1,
      "isUnique": false,
      "isIndexable": false,
      "includeInNotification": false,
      "searchWeight": -1,
      "relationshipTypeName": "process_dataset_outputs",
      "isLegacyAttribute": false
    }
  ],
  "businessAttributeDefs": {}
}

 

As the definition shows, the type has two attributes of its own:

a.   screen_size

b.   operating_system

 

Now, I want to add one more attribute, ‘configuration’, to the type ‘tablet’.

 

Step 1: Define the new attribute.

AtlasAttributeDef attributeDef1 = new AtlasAttributeDef();
attributeDef1.setName("configuration");
attributeDef1.setTypeName("string");
attributeDef1.setCardinality(AtlasAttributeDef.Cardinality.SINGLE);
attributeDef1.setIsIndexable(true);
attributeDef1.setIsUnique(false);
attributeDef1.setIsOptional(true);

 

Step 2: Get the ‘tablet’ entity definition and attach the new attribute to it.

AtlasEntityDef atlasEntityDef = atlasClient.getEntityDefByName("tablet");
atlasEntityDef.getAttributeDefs().add(attributeDef1);
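
If the program may run more than once, it can help to check whether the attribute is already present before adding it, so the list does not end up with two entries of the same name. The following is a minimal sketch of that check in plain Java. AttrDef and addIfAbsent are hypothetical stand-ins invented for illustration and are not part of the Atlas client API; with the real client, the same loop would presumably run over atlasEntityDef.getAttributeDefs() and compare each getName().

```java
import java.util.ArrayList;
import java.util.List;

public class AttributeMergeSketch {
  // Hypothetical stand-in for an attribute definition (not an Atlas class).
  static final class AttrDef {
    final String name;
    final String typeName;

    AttrDef(String name, String typeName) {
      this.name = name;
      this.typeName = typeName;
    }
  }

  // Adds the candidate only when no attribute with the same name exists.
  // Returns true if the list was changed.
  static boolean addIfAbsent(List<AttrDef> attrs, AttrDef candidate) {
    for (AttrDef existing : attrs) {
      if (existing.name.equals(candidate.name)) {
        return false;
      }
    }
    attrs.add(candidate);
    return true;
  }

  public static void main(String[] args) {
    List<AttrDef> attrs = new ArrayList<>();
    attrs.add(new AttrDef("screen_size", "string"));
    attrs.add(new AttrDef("operating_system", "string"));

    // First call adds the attribute, second call leaves the list unchanged.
    System.out.println(addIfAbsent(attrs, new AttrDef("configuration", "string"))); // true
    System.out.println(addIfAbsent(attrs, new AttrDef("configuration", "string"))); // false
    System.out.println(attrs.size()); // 3
  }
}
```

The check is by name only, which matches how the attributes are identified in the JSON definition above.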

 

Step 3: Persist the changes to Atlas.

AtlasTypesDef atlasTypesDef = new AtlasTypesDef();
atlasTypesDef.getEntityDefs().add(atlasEntityDef);
AtlasTypesDef atlasTypesDefResponse = atlasClient.updateAtlasTypeDefs(atlasTypesDef);

Find the complete working application below.

 

Step 1: Create the atlas-application.properties file under the src/main/resources folder.

 

atlas-application.properties

atlas.client.readTimeoutMSecs=30000
atlas.client.connectTimeoutMSecs=30000

Step 2: Define JsonUtil and UpdateType classes.

 

JsonUtil.java

package com.sample.app.util;

import java.io.IOException;

import com.fasterxml.jackson.core.JsonParseException;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.JsonMappingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;

public class JsonUtil {
  // Serializes an object to a compact JSON string.
  public static String marshal(Object obj) throws JsonProcessingException {
    ObjectMapper mapper = new ObjectMapper();
    return mapper.writeValueAsString(obj);
  }

  // Deserializes a JSON string into an instance of the given class.
  public static <T> T unmarshal(Class<T> clazz, String json)
      throws JsonParseException, JsonMappingException, IOException {
    ObjectMapper mapper = new ObjectMapper();
    return mapper.readValue(json, clazz);
  }

  // Serializes an object to an indented, human-readable JSON string.
  public static String prettyPrintJson(Object obj) throws JsonProcessingException {
    ObjectMapper mapper = new ObjectMapper();
    mapper.enable(SerializationFeature.INDENT_OUTPUT);
    return mapper.writeValueAsString(obj);
  }
}

UpdateType.java

package com.sample.app.types;

import org.apache.atlas.AtlasClientV2;
import org.apache.atlas.AtlasServiceException;
import org.apache.atlas.model.typedef.AtlasEntityDef;
import org.apache.atlas.model.typedef.AtlasStructDef.AtlasAttributeDef;
import org.apache.atlas.model.typedef.AtlasTypesDef;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.sample.app.util.JsonUtil;

public class UpdateType {
  public static void main(String[] args) throws AtlasServiceException, JsonProcessingException {
    // Connect to the local Atlas server with basic authentication (user name, password).
    AtlasClientV2 atlasClient = new AtlasClientV2(new String[] { "http://localhost:21000" },
        new String[] { "admin", "admin" });

    // Fetch the current definition of the 'tablet' type.
    AtlasEntityDef atlasEntityDef = atlasClient.getEntityDefByName("tablet");

    // Define the new attribute; it is optional because the type already exists.
    AtlasAttributeDef attributeDef1 = new AtlasAttributeDef();
    attributeDef1.setName("configuration");
    attributeDef1.setTypeName("string");
    attributeDef1.setCardinality(AtlasAttributeDef.Cardinality.SINGLE);
    attributeDef1.setIsIndexable(true);
    attributeDef1.setIsUnique(false);
    attributeDef1.setIsOptional(true);

    // Attach the new attribute to the fetched definition.
    atlasEntityDef.getAttributeDefs().add(attributeDef1);

    // Wrap the modified definition and send the update to Atlas.
    AtlasTypesDef atlasTypesDef = new AtlasTypesDef();
    atlasTypesDef.getEntityDefs().add(atlasEntityDef);
    AtlasTypesDef atlasTypesDefResponse = atlasClient.updateAtlasTypeDefs(atlasTypesDef);

    // Print the updated type definitions returned by the server.
    String json = JsonUtil.prettyPrintJson(atlasTypesDefResponse);
    System.out.println(json);
  }
}

Output

{
  "enumDefs" : [ ],
  "structDefs" : [ ],
  "classificationDefs" : [ ],
  "entityDefs" : [ {
    "category" : "ENTITY",
    "guid" : "e7201b2c-a87d-49ef-aac0-c228b92f9763",
    "createdBy" : "admin",
    "updatedBy" : "admin",
    "createTime" : 1644561370044,
    "updateTime" : 1644563715634,
    "version" : 2,
    "name" : "tablet",
    "description" : "Represent a laptoo specification",
    "typeVersion" : "1.0",
    "attributeDefs" : [ {
      "name" : "screen_size",
      "typeName" : "string",
      "isOptional" : false,
      "cardinality" : "SINGLE",
      "valuesMinCount" : 1,
      "valuesMaxCount" : 1,
      "isUnique" : false,
      "isIndexable" : true,
      "includeInNotification" : false,
      "searchWeight" : -1
    }, {
      "name" : "operating_system",
      "typeName" : "string",
      "isOptional" : false,
      "cardinality" : "SINGLE",
      "valuesMinCount" : 1,
      "valuesMaxCount" : 1,
      "isUnique" : false,
      "isIndexable" : true,
      "includeInNotification" : false,
      "searchWeight" : -1
    }, {
      "name" : "configuration",
      "typeName" : "string",
      "isOptional" : true,
      "cardinality" : "SINGLE",
      "valuesMinCount" : 0,
      "valuesMaxCount" : 1,
      "isUnique" : false,
      "isIndexable" : true,
      "includeInNotification" : false,
      "searchWeight" : -1
    } ],
    "superTypes" : [ "DataSet" ],
    "subTypes" : [ ],
    "relationshipAttributeDefs" : [ {
      "name" : "inputToProcesses",
      "typeName" : "array<Process>",
      "isOptional" : true,
      "cardinality" : "SET",
      "valuesMinCount" : -1,
      "valuesMaxCount" : -1,
      "isUnique" : false,
      "isIndexable" : false,
      "includeInNotification" : false,
      "searchWeight" : -1,
      "relationshipTypeName" : "dataset_process_inputs",
      "isLegacyAttribute" : false
    }, {
      "name" : "pipeline",
      "typeName" : "spark_ml_pipeline",
      "isOptional" : true,
      "cardinality" : "SINGLE",
      "valuesMinCount" : -1,
      "valuesMaxCount" : -1,
      "isUnique" : false,
      "isIndexable" : false,
      "includeInNotification" : false,
      "searchWeight" : -1,
      "relationshipTypeName" : "spark_ml_pipeline_dataset",
      "isLegacyAttribute" : false
    }, {
      "name" : "schema",
      "typeName" : "array<avro_schema>",
      "isOptional" : true,
      "cardinality" : "SET",
      "valuesMinCount" : -1,
      "valuesMaxCount" : -1,
      "isUnique" : false,
      "isIndexable" : false,
      "includeInNotification" : false,
      "searchWeight" : -1,
      "relationshipTypeName" : "avro_schema_associatedEntities",
      "isLegacyAttribute" : false
    }, {
      "name" : "model",
      "typeName" : "spark_ml_model",
      "isOptional" : true,
      "cardinality" : "SINGLE",
      "valuesMinCount" : -1,
      "valuesMaxCount" : -1,
      "isUnique" : false,
      "isIndexable" : false,
      "includeInNotification" : false,
      "searchWeight" : -1,
      "relationshipTypeName" : "spark_ml_model_dataset",
      "isLegacyAttribute" : false
    }, {
      "name" : "meanings",
      "typeName" : "array<AtlasGlossaryTerm>",
      "isOptional" : true,
      "cardinality" : "SET",
      "valuesMinCount" : -1,
      "valuesMaxCount" : -1,
      "isUnique" : false,
      "isIndexable" : false,
      "includeInNotification" : false,
      "searchWeight" : -1,
      "relationshipTypeName" : "AtlasGlossarySemanticAssignment",
      "isLegacyAttribute" : false
    }, {
      "name" : "outputFromProcesses",
      "typeName" : "array<Process>",
      "isOptional" : true,
      "cardinality" : "SET",
      "valuesMinCount" : -1,
      "valuesMaxCount" : -1,
      "isUnique" : false,
      "isIndexable" : false,
      "includeInNotification" : false,
      "searchWeight" : -1,
      "relationshipTypeName" : "process_dataset_outputs",
      "isLegacyAttribute" : false
    } ],
    "businessAttributeDefs" : { }
  } ],
  "relationshipDefs" : [ ],
  "businessMetadataDefs" : [ ]
}

The output confirms that the new attribute ‘configuration’ has been added. The type's version has also moved from 1 to 2, and updateTime has changed.

 

Note

a.   You can’t add a mandatory attribute to an existing type, so every attribute added through an update must be optional (isOptional set to true, as in Step 1).
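
A client-side check before calling the update can make this rule explicit. The sketch below compares the attributes of a type before and after a change and reports any newly added attribute that is not optional. The class name, the method name and the ‘serial_number’ attribute are hypothetical and exist only for illustration; this is plain Java, not Atlas client API, and the actual validation still happens on the Atlas server.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MandatoryAttributeCheck {
  // Both maps go from attribute name to its isOptional flag.
  // Returns the names of attributes that are new in 'updated' and mandatory.
  static List<String> newMandatoryAttributes(Map<String, Boolean> existing,
      Map<String, Boolean> updated) {
    List<String> offenders = new ArrayList<>();
    for (Map.Entry<String, Boolean> entry : updated.entrySet()) {
      boolean isNew = !existing.containsKey(entry.getKey());
      boolean isOptional = entry.getValue();
      if (isNew && !isOptional) {
        offenders.add(entry.getKey());
      }
    }
    return offenders;
  }

  public static void main(String[] args) {
    // Attributes of 'tablet' before the update; both are mandatory.
    Map<String, Boolean> existing = new LinkedHashMap<>();
    existing.put("screen_size", false);
    existing.put("operating_system", false);

    // Adding the optional 'configuration' attribute is fine.
    Map<String, Boolean> updated = new LinkedHashMap<>(existing);
    updated.put("configuration", true);
    System.out.println(newMandatoryAttributes(existing, updated)); // []

    // A hypothetical mandatory attribute would be flagged.
    updated.put("serial_number", false);
    System.out.println(newMandatoryAttributes(existing, updated)); // [serial_number]
  }
}
```

Existing mandatory attributes such as screen_size are not flagged, because the rule applies only to attributes added after the type was created.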

 


 
