Wednesday, 1 June 2022

Set up Apache Atlas in embedded mode

Step 1: Download the latest release from the link below.

https://github.com/apache/atlas/tags

 

Step 2: Open a terminal, navigate to the directory that contains the Atlas pom.xml file, and execute the commands below.

export MAVEN_OPTS="-Xmx1500m"

mvn clean -DskipTests package -Pdist,embedded-hbase-solr
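The build can run for a long time, so it may be worth checking first that the tools it needs are on the PATH. The helper below is only a sketch: the name require_cmds is made up for this post, and it checks that the commands exist, not which versions they are.

```shell
# Report every command that is not on PATH.
# Returns non-zero if at least one command is missing.
require_cmds() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example, run from the directory that contains pom.xml:
#   require_cmds java mvn && [ -f pom.xml ] && echo "ready to build"
```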

When the mvn command completes successfully, the following artifacts appear in the distro/target folder.

distro/target/apache-atlas-{project.version}-bin.tar.gz   
distro/target/apache-atlas-{project.version}-impala-hook.tar.gz 
distro/target/apache-atlas-{project.version}-sqoop-hook.tar.gz
distro/target/apache-atlas-{project.version}-falcon-hook.tar.gz 
distro/target/apache-atlas-{project.version}-kafka-hook.tar.gz  
distro/target/apache-atlas-{project.version}-storm-hook.tar.gz
distro/target/apache-atlas-{project.version}-hbase-hook.tar.gz  
distro/target/apache-atlas-{project.version}-server.tar.gz
distro/target/apache-atlas-{project.version}-hive-hook.tar.gz 
distro/target/apache-atlas-{project.version}-sources.tar.gz
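To confirm that the build really produced everything in the list above, a small check like the following can help. It is a sketch, not part of Atlas: the function names are made up, and the version you pass in (for example 2.2.0) stands in for {project.version}.

```shell
# Print the artifact file names listed above for a given version.
expected_artifacts() {
  version=$1
  for kind in bin impala-hook sqoop-hook falcon-hook kafka-hook \
              storm-hook hbase-hook server hive-hook sources; do
    echo "apache-atlas-${version}-${kind}.tar.gz"
  done
}

# Report which expected artifacts are missing from a directory.
# Returns non-zero if at least one artifact is missing.
check_artifacts() {
  dir=$1
  version=$2
  status=0
  for f in $(expected_artifacts "$version"); do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      status=1
    fi
  done
  return "$status"
}

# Example:
#   check_artifacts distro/target 2.2.0 && echo "all artifacts present"
```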

Step 3: Copy the file 'apache-atlas-{project.version}-server.tar.gz' to the folder where you would like to install Apache Atlas, and run the following commands.

tar -xzvf apache-atlas-{project.version}-server.tar.gz
cd apache-atlas-{project.version}

Step 4: Start the Apache Atlas server.

To run Apache Atlas with local Apache HBase and Apache Solr instances that are started and stopped along with Atlas, run the following commands:

export MANAGE_LOCAL_HBASE=true
export MANAGE_LOCAL_SOLR=true
bin/atlas_start.py

bash-3.2$ bin/atlas_start.py

Configured for local HBase.
Starting local HBase...
Local HBase started!

Configured for local Solr.
Starting local Solr...
solr.xml doesn't exist in /Users/krishna/Documents/softwares/atlas-exec/apache-atlas-2.2.0/data/solr, copying from /Users/krishna/Documents/softwares/atlas-exec/apache-atlas-2.2.0/solr/server/solr/solr.xml
Local Solr started!

Creating Solr collections for Atlas using config: /Users/krishna/Documents/softwares/atlas-exec/apache-atlas-2.2.0/conf/solr

Starting Atlas server on host: localhost
Starting Atlas server on port: 21000
...........................
Apache Atlas Server started!!!

 

Confirm whether the Apache Atlas instance is running

Execute the command below.

curl -u admin:admin http://localhost:21000/api/atlas/admin/version

bash-3.2$ curl -u admin:admin http://localhost:21000/api/atlas/admin/version
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 503 Service Unavailable</title>
</head>
<body><h2>HTTP ERROR 503 Service Unavailable</h2>
<table>
<tr><th>URI:</th><td>/api/atlas/admin/version</td></tr>
<tr><th>STATUS:</th><td>503</td></tr>
<tr><th>MESSAGE:</th><td>Service Unavailable</td></tr>
<tr><th>SERVLET:</th><td>-</td></tr>
</table>

</body>
</html>
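A 503 right after startup is not always fatal, since the server can answer this way while it is still initializing. Before digging into the logs, it can help to poll the endpoint for a while. The sketch below uses the same URL and admin:admin credentials as the curl command above; the function name, the attempt count and the delay are my own assumptions.

```shell
# Poll the Atlas version endpoint until it answers with HTTP 200.
# Arguments: max attempts (default 30), seconds between attempts (default 10).
wait_for_atlas() {
  attempts=${1:-30}
  delay=${2:-10}
  url="http://localhost:21000/api/atlas/admin/version"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code
    code=$(curl -s -o /dev/null -w '%{http_code}' -u admin:admin "$url") || code="000"
    if [ "$code" = "200" ]; then
      echo "Atlas is up after $i attempt(s)"
      return 0
    fi
    echo "attempt $i: HTTP $code, retrying in ${delay}s" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Atlas did not return 200 after $attempts attempts" >&2
  return 1
}

# Example: wait up to 5 minutes (30 attempts, 10 seconds apart)
#   wait_for_atlas 30 10
```

If the loop runs out of attempts, the 503 is probably a real problem and the logs are the next place to look.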


Oops, I got a Service Unavailable error. When I checked the log messages, I saw the error below.

 

org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://192.168.1.100:9838/solr: Can not find the specified config set: vertex_index
	at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:681)
	at 
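Rather than scrolling through the whole stack trace, the name of the missing config set can be pulled out of the log with grep. This is a rough sketch: the function name is made up, and the log path in the example (logs/application.log under the Atlas folder) is an assumption, so adjust it to wherever your log file lives.

```shell
# Print the unique config set names that Solr reported as missing in a log file.
missing_configsets() {
  logfile=$1
  grep -o 'Can not find the specified config set: [A-Za-z0-9_]*' "$logfile" \
    | sed 's/.*config set: //' \
    | sort -u
}

# Example (the log path is an assumption):
#   missing_configsets logs/application.log
```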

I stopped the server manually by executing the command ‘bin/atlas_stop.py’.

bash-3.2$ bin/atlas_stop.py 
stopping atlas.
Apache Atlas Server stopped!!!

Sending stop command to Solr running on port 9838 ... waiting up to 180 seconds to allow Jetty process 78818 to stop gracefully.
running master, logging to /Users/krishna/Documents/softwares/atlas-exec/apache-atlas-2.2.0/hbase/bin/../logs/hbase-krishna-master-m-c02g415mmd6n.out
stopping master.

 

Start the Solr service manually

./solr/bin/solr start -c -z localhost:2181 -p 8984 -force

bash-3.2$ ./solr/bin/solr start -c -z localhost:2181 -p 8984 -force
*** [WARN] *** Your open file limit is currently 2560.  
 It should be set to 65000 to avoid operational disruption. 
 If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
*** [WARN] ***  Your Max Processes Limit is currently 2784. 
 It should be set to 65000 to avoid operational disruption. 
 If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
Waiting up to 180 seconds to see Solr running on port 8984 [/]  
Started Solr server on port 8984 (pid=93588). Happy searching!

Create the vertex_index collection in Solr by executing the command below.

./solr/bin/solr create -c vertex_index -shards 1 -replicationFactor 1 -force

bash-3.2$ ./solr/bin/solr create -c vertex_index -shards 1 -replicationFactor 1 -force
WARNING: Using _default configset with data driven schema functionality. NOT RECOMMENDED for production use.
         To turn off: bin/solr config -c vertex_index -p 8984 -action set-user-property -property update.autoCreateFields -value false
Created collection 'vertex_index' with 1 shard(s), 1 replica(s) with config-set 'vertex_index'

Stop the Solr server.

bash-3.2$ ./solr/bin/solr stop
Sending stop command to Solr running on port 8984 ... waiting up to 180 seconds to allow Jetty process 93588 to stop gracefully.
bash-3.2$

Start the Apache Atlas server again

bash-3.2$ bin/atlas_start.py

Configured for local HBase.
Starting local HBase...
Local HBase started!

Configured for local Solr.
Starting local Solr...
Local Solr started!

Creating Solr collections for Atlas using config: /Users/krishna/Documents/softwares/atlas-exec/apache-atlas-2.2.0/conf/solr

Starting Atlas server on host: localhost
Starting Atlas server on port: 21000
........
Apache Atlas Server started!!!

Execute the command ‘curl -u admin:admin http://localhost:21000/api/atlas/admin/version’ to confirm whether Apache Atlas is running successfully.

bash-3.2$ curl -u admin:admin http://localhost:21000/api/atlas/admin/version
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 503 Service Unavailable</title>
</head>
<body><h2>HTTP ERROR 503 Service Unavailable</h2>
<table>
<tr><th>URI:</th><td>/api/atlas/admin/version</td></tr>
<tr><th>STATUS:</th><td>503</td></tr>
<tr><th>MESSAGE:</th><td>Service Unavailable</td></tr>
<tr><th>SERVLET:</th><td>-</td></tr>
</table>

</body>
</html>

Come on... not again.

 

When I checked the application.log file, I saw the error below.

org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://192.168.1.100:9838/solr: Can not find the specified config set: edge_index

Stop the Atlas server

bin/atlas_stop.py

 

Start the Solr server

./solr/bin/solr start -c -z localhost:2181 -p 8984 -force

 

Create the edge_index collection

./solr/bin/solr create -c edge_index -shards 1 -replicationFactor 1 -force

 

Now stop the Solr server and start Apache Atlas again

./solr/bin/solr stop

bin/atlas_start.py
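I went through the same stop, create, restart cycle once per missing collection. It could be folded into one helper that creates several collections in a single Solr session. The sketch below reuses exactly the solr commands shown in this post; the SOLR_BIN variable and the function name are my own additions.

```shell
# Path to the Solr launcher bundled with Atlas; override it if yours lives elsewhere.
SOLR_BIN=${SOLR_BIN:-./solr/bin/solr}

# Start Solr, create each named collection, then stop Solr again.
# Returns non-zero if Solr fails to start or any collection cannot be created.
create_atlas_collections() {
  "$SOLR_BIN" start -c -z localhost:2181 -p 8984 -force || return 1
  status=0
  for name in "$@"; do
    "$SOLR_BIN" create -c "$name" -shards 1 -replicationFactor 1 -force || status=1
  done
  # Always stop Solr so that bin/atlas_start.py can manage it afterwards.
  "$SOLR_BIN" stop
  return "$status"
}

# Example, run from the Atlas folder:
#   bin/atlas_stop.py
#   create_atlas_collections vertex_index edge_index
#   bin/atlas_start.py
```

If the log reports yet another missing config set after the restart, its name can be appended to the argument list.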

 

Re-execute the curl command to confirm

$ curl -u admin:admin http://localhost:21000/api/atlas/admin/version
{"Description":"Metadata Management and Data Governance Platform over Hadoop","Revision":"release","Version":"2.2.0","Name":"apache-atlas"}
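If you only want the version number out of that response (for a script, say), sed is enough for the compact JSON shown above. This is a quick sketch with a made-up function name, and it assumes the response has no spaces around the colon; a real JSON tool would be more robust.

```shell
# Extract the "Version" field from the compact JSON returned by the version endpoint.
# Reads the JSON from standard input.
atlas_version_from_json() {
  sed -n 's/.*"Version":"\([^"]*\)".*/\1/p'
}

# Example:
#   curl -s -u admin:admin http://localhost:21000/api/atlas/admin/version | atlas_version_from_json
```

For the response above, this prints 2.2.0.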

Get all the types

curl -u admin:admin http://localhost:21000/api/atlas/v2/types/typedefs/headers
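The response to this call can be long. To list just the type names, something like the sketch below may work. Note the assumptions: I am assuming that the endpoint returns compact JSON in which each header carries a "name" field, and the function name is made up. Check the actual response on your instance first.

```shell
# Print the value of every "name" field found in compact JSON read from standard input.
typedef_names() {
  grep -o '"name":"[^"]*"' | sed 's/^"name":"//; s/"$//'
}

# Example (assumes the response format described above):
#   curl -s -u admin:admin http://localhost:21000/api/atlas/v2/types/typedefs/headers | typedef_names
```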

 

 

That’s it, you are done.

