Wednesday, 30 October 2019

Install and set up Kafka on Mac


Download Kafka binaries
Go to https://kafka.apache.org/downloads and download the latest Kafka binaries.

Extract the downloaded archive (Kafka binaries are distributed as a .tgz file).
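
As a minimal sketch of this step, assuming the download is a gzip-compressed tar archive, a small helper can unpack it into a directory of your choice. The function name extract_kafka is hypothetical, and the archive name in the usage note is inferred from the directory name used in this post, so it may differ for your version.

```shell
# extract_kafka ARCHIVE [DEST]: unpack a Kafka .tgz archive into DEST
# (defaults to the current directory). Hypothetical helper name.
extract_kafka() {
  archive=$1
  dest=${2:-.}
  mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}
```

For example, extract_kafka ~/Downloads/kafka_2.12-2.3.0.tgz ~/kafka would unpack the distribution under ~/kafka.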

The extracted Kafka binary distribution has the following directory structure.
$ls kafka_2.12-2.3.0/
LICENSE  NOTICE  bin  config  libs  site-docs

Add the Kafka bin directory to the system path
Open .bash_profile (located in your home directory at ~/.bash_profile) and append the Kafka bin directory to the PATH variable.

export PATH=$PATH:{Kafka bin directory path}

Reload .bash_profile by executing the command below.
source ~/.bash_profile
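
The export line can also be written so that it is safe to source more than once. The sketch below assumes Kafka was extracted to $HOME/kafka_2.12-2.3.0 (a hypothetical location; substitute your own) and appends its bin directory only if it is not already on PATH.

```shell
# Hypothetical install location; change it to where you extracted Kafka.
KAFKA_HOME="${KAFKA_HOME:-$HOME/kafka_2.12-2.3.0}"

# Append the Kafka bin directory to PATH only if it is not already there,
# so sourcing ~/.bash_profile repeatedly does not add duplicates.
case ":$PATH:" in
  *":$KAFKA_HOME/bin:"*) ;;
  *) export PATH="$PATH:$KAFKA_HOME/bin" ;;
esac
```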

Test the Kafka installation

Execute the command ‘kafka-topics.sh’. If you see output like the following, the installation is successful.
$kafka-topics.sh 
Create, delete, describe, or change a topic.
Option                                   Description                            
------                                   -----------                            
--alter                                  Alter the number of partitions,        
                                           replica assignment, and/or           
                                           configuration for the topic.         
--at-min-isr-partitions                  if set when describing topics, only    
                                           show partitions whose isr count is   
                                           equal to the configured minimum. Not 
                                           supported with the --zookeeper       
                                           option.                              
--bootstrap-server <String: server to    REQUIRED: The Kafka server to connect  
  connect to>                              to. In case of providing this, a     
                                           direct Zookeeper connection won't be 
                                           required.                            
--command-config <String: command        Property file containing configs to be 
  config property file>                    passed to Admin Client. This is used 
                                           only with --bootstrap-server option  
                                           for describing and altering broker   
                                           configs.                             
--config <String: name=value>            A topic configuration override for the 
                                           topic being created or altered.The   
                                           following is a list of valid         
                                           configurations:                      
                                          cleanup.policy                        
                                          compression.type                      
                                          delete.retention.ms                   
                                          file.delete.delay.ms                  
                                          flush.messages                        
                                          flush.ms                              
                                          follower.replication.throttled.       
                                           replicas                             
                                          index.interval.bytes                  
                                          leader.replication.throttled.replicas 
                                          max.compaction.lag.ms                 
                                          max.message.bytes                     
                                          message.downconversion.enable         
                                          message.format.version                
                                          message.timestamp.difference.max.ms   
                                          message.timestamp.type                
                                          min.cleanable.dirty.ratio             
                                          min.compaction.lag.ms                 
                                          min.insync.replicas                   
                                          preallocate                           
                                          retention.bytes                       
                                          retention.ms                          
                                          segment.bytes                         
                                          segment.index.bytes                   
                                          segment.jitter.ms                     
                                          segment.ms                            
                                          unclean.leader.election.enable        
                                         See the Kafka documentation for full   
                                           details on the topic configs.It is   
                                           supported only in combination with --
                                           create if --bootstrap-server option  
                                           is used.                             
--create                                 Create a new topic.                    
--delete                                 Delete a topic                         
--delete-config <String: name>           A topic configuration override to be   
                                           removed for an existing topic (see   
                                           the list of configurations under the 
                                           --config option). Not supported with 
                                           the --bootstrap-server option.       
--describe                               List details for the given topics.     
--disable-rack-aware                     Disable rack aware replica assignment  
--exclude-internal                       exclude internal topics when running   
                                           list or describe command. The        
                                           internal topics will be listed by    
                                           default                              
--force                                  Suppress console prompts               
--help                                   Print usage information.               
--if-exists                              if set when altering or deleting or    
                                           describing topics, the action will   
                                           only execute if the topic exists.    
                                           Not supported with the --bootstrap-  
                                           server option.                       
--if-not-exists                          if set when creating topics, the       
                                           action will only execute if the      
                                           topic does not already exist. Not    
                                           supported with the --bootstrap-      
                                           server option.                       
--list                                   List all available topics.             
--partitions <Integer: # of partitions>  The number of partitions for the topic 
                                           being created or altered (WARNING:   
                                           If partitions are increased for a    
                                           topic that has a key, the partition  
                                           logic or ordering of the messages    
                                           will be affected                     
--replica-assignment <String:            A list of manual partition-to-broker   
  broker_id_for_part1_replica1 :           assignments for the topic being      
  broker_id_for_part1_replica2 ,           created or altered.                  
  broker_id_for_part2_replica1 :                                                
  broker_id_for_part2_replica2 , ...>                                           
--replication-factor <Integer:           The replication factor for each        
  replication factor>                      partition in the topic being created.
--topic <String: topic>                  The topic to create, alter, describe   
                                           or delete. It also accepts a regular 
                                           expression, except for --create      
                                           option. Put topic name in double     
                                           quotes and use the '\' prefix to     
                                           escape regular expression symbols; e.
                                           g. "test\.topic".                    
--topics-with-overrides                  if set when describing topics, only    
                                           show topics that have overridden     
                                           configs                              
--unavailable-partitions                 if set when describing topics, only    
                                           show partitions whose leader is not  
                                           available                            
--under-min-isr-partitions               if set when describing topics, only    
                                           show partitions whose isr count is   
                                           less than the configured minimum.    
                                           Not supported with the --zookeeper   
                                           option.                              
--under-replicated-partitions            if set when describing topics, only    
                                           show under replicated partitions     
--version                                Display Kafka version.                 
--zookeeper <String: hosts>              DEPRECATED, The connection string for  
                                           the zookeeper connection in the form 
                                           host:port. Multiple hosts can be     
                                           given to allow fail-over.            
$
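
If the shell reports that the command is not found instead, the bin directory is probably not on PATH. A quick way to check, sketched here with a hypothetical helper name, is to ask the shell where it resolves the script.

```shell
# kafka_on_path: succeed and print the resolved location if kafka-topics.sh
# is reachable through PATH; otherwise print a hint and fail.
kafka_on_path() {
  if command -v kafka-topics.sh >/dev/null 2>&1; then
    echo "kafka-topics.sh found at $(command -v kafka-topics.sh)"
  else
    echo "kafka-topics.sh not found; check the PATH entry in ~/.bash_profile" >&2
    return 1
  fi
}
```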

Start ZooKeeper
Before starting the Kafka service, we need to start ZooKeeper. ZooKeeper reads its configuration, such as the port it listens on and the directory where it stores data, from a properties file.


The ‘config’ directory of the Kafka installation contains a ‘zookeeper.properties’ file. We can use it to configure our ZooKeeper instance.
$ls
LICENSE  NOTICE  bin  config  libs  site-docs
$
$ls config
connect-console-sink.properties  connect-log4j.properties  server.properties
connect-console-source.properties connect-standalone.properties  tools-log4j.properties
connect-distributed.properties  consumer.properties   trogdor.conf
connect-file-sink.properties  log4j.properties   zookeeper.properties
connect-file-source.properties  producer.properties


Let’s open the zookeeper.properties file and look at its contents.
$cat config/zookeeper.properties 
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
# 
#    http://www.apache.org/licenses/LICENSE-2.0
# 
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# the directory where the snapshot is stored.
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0

As the properties file shows, the ZooKeeper data directory is set to /tmp/zookeeper and the client port is set to 2181.

Since data under /tmp is temporary and can be cleaned up by the system, I changed dataDir to a directory outside /tmp.
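
One way to make that change without opening an editor is sketched below. The helper name set_property is hypothetical. The sketch assumes the key already exists in the file as a key=value line and that the value contains no | or & characters, which sed would treat specially.

```shell
# set_property FILE KEY VALUE: rewrite the existing "KEY=..." line in a
# properties file. Writes through a temporary file, so it works with both
# the BSD sed shipped with macOS and GNU sed.
set_property() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  sed "s|^$key=.*|$key=$value|" "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

For example, after creating a directory such as $HOME/zookeeper_data (a hypothetical location), running set_property config/zookeeper.properties dataDir "$HOME/zookeeper_data" from the Kafka installation directory would point ZooKeeper at it.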

Start the ZooKeeper instance by executing the command below.

zookeeper-server-start.sh config/zookeeper.properties


You should see messages like the following in the console.
[2019-10-04 19:44:01,592] INFO Server environment:java.compiler=<NA> (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,593] INFO Server environment:os.name=Mac OS X (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,593] INFO Server environment:os.arch=x86_64 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,593] INFO Server environment:os.version=10.14.5 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,593] INFO Server environment:user.name=krishna (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,593] INFO Server environment:user.home=/Users/krishna (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,593] INFO Server environment:user.dir=/Users/krishna/Documents/TechnicalDocuments/kafka/software/kafka_2.12-2.3.0 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,598] INFO tickTime set to 3000 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,598] INFO minSessionTimeout set to -1 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,598] INFO maxSessionTimeout set to -1 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,611] INFO Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory (org.apache.zookeeper.server.ServerCnxnFactory)
[2019-10-04 19:44:01,625] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)

As the console messages show, the ZooKeeper instance started on port 2181.
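
To confirm from another terminal that ZooKeeper is accepting connections, one option is to poll the port with nc, which ships with macOS. This is only a sketch: the helper name wait_for_port is hypothetical, and localhost assumes ZooKeeper runs on the same machine.

```shell
# wait_for_port HOST PORT [TRIES]: probe the port up to TRIES times
# (default 30), one second apart; succeed as soon as it accepts connections.
wait_for_port() {
  host=$1
  port=$2
  tries=${3:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if nc -z "$host" "$port" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, wait_for_port localhost 2181 && echo "ZooKeeper is up" prints the message once the port is reachable.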

Start Kafka
We can pass configuration properties to the Kafka instance at startup.


The default configuration file ‘server.properties’ is located in the config folder of the Kafka installation directory.
$ls
LICENSE  NOTICE  bin  config  libs  logs  site-docs
$
$ls config/
connect-console-sink.properties  connect-log4j.properties  server.properties
connect-console-source.properties connect-standalone.properties  tools-log4j.properties
connect-distributed.properties  consumer.properties   trogdor.conf
connect-file-sink.properties  log4j.properties   zookeeper.properties
connect-file-source.properties  producer.properties


Open server.properties and search for the property log.dirs.
# A comma separated list of directories under which to store log files
log.dirs=/tmp/kafka-logs

By default, log.dirs points to the /tmp/kafka-logs directory. Since data under /tmp is temporary and can be cleaned up by the system, I am going to change the value of the log.dirs property.

For example, I created a directory ‘kafka_data’ and set the log.dirs property to its path.

Execute the command below to start the Kafka instance.

kafka-server-start.sh config/server.properties
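
As an alternative to editing server.properties, kafka-server-start.sh accepts --override property=value arguments after the properties file, so a setting can be passed at startup. The sketch below assumes it is run from the Kafka installation directory and uses $HOME/kafka_data as a hypothetical data directory; the function name start_kafka is also hypothetical.

```shell
# Hypothetical data directory; any location outside /tmp works.
KAFKA_DATA_DIR="${KAFKA_DATA_DIR:-$HOME/kafka_data}"

# start_kafka: create the data directory if needed, then start the broker
# with log.dirs overridden on the command line instead of in the file.
start_kafka() {
  mkdir -p "$KAFKA_DATA_DIR" || return 1
  kafka-server-start.sh config/server.properties \
    --override "log.dirs=$KAFKA_DATA_DIR"
}
```

An override applies only to that run, so the value in server.properties stays unchanged.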


You should see messages like the following in the console.
[2019-10-04 19:52:16,236] INFO [/config/changes-event-process-thread]: Starting (kafka.common.ZkNodeChangeNotificationListener$ChangeEventProcessThread)
[2019-10-04 19:52:16,242] INFO [SocketServer brokerId=0] Started data-plane processors for 1 acceptors (kafka.network.SocketServer)
[2019-10-04 19:52:16,247] INFO Kafka version: 2.3.0 (org.apache.kafka.common.utils.AppInfoParser)
[2019-10-04 19:52:16,247] INFO Kafka commitId: fc1aaa116b661c8a (org.apache.kafka.common.utils.AppInfoParser)
[2019-10-04 19:52:16,247] INFO Kafka startTimeMs: 1570198936243 (org.apache.kafka.common.utils.AppInfoParser)
[2019-10-04 19:52:16,249] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)


You can now see that some files have been created in the kafka_data directory (the location log.dirs points to).
$ls kafka_data/
cleaner-offset-checkpoint  meta.properties    replication-offset-checkpoint
log-start-offset-checkpoint  recovery-point-offset-checkpoint
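
With the broker running, a quick end-to-end check is to create and list a topic using the kafka-topics.sh options from the help output shown earlier. This sketch assumes the broker listens on localhost:9092, the default in the stock server.properties; the topic name demo-topic and the helper names are hypothetical.

```shell
# Broker address; localhost:9092 is assumed to be the default listener.
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# create_topic NAME: create a single-partition, unreplicated topic,
# which is enough for a one-broker local setup.
create_topic() {
  kafka-topics.sh --bootstrap-server "$BOOTSTRAP" --create \
    --topic "$1" --partitions 1 --replication-factor 1
}

# list_topics: print the topics known to the broker.
list_topics() {
  kafka-topics.sh --bootstrap-server "$BOOTSTRAP" --list
}
```

Running create_topic demo-topic followed by list_topics should show demo-topic in the list if the broker is reachable.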

