Download Kafka binaries
Go to ‘https://kafka.apache.org/downloads’ and download the latest binaries of Kafka.
Extract the downloaded archive (a .tgz file).
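For example, the download and extraction can be done from the terminal. The URL below points to the 2.3.0 release used in this post (served from the Apache archive); newer releases will have a different version in the file name, so adjust accordingly.
$wget https://archive.apache.org/dist/kafka/2.3.0/kafka_2.12-2.3.0.tgz
$tar -xzf kafka_2.12-2.3.0.tgz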
The extracted Kafka binary distribution contains the below directory structure.
$ls kafka_2.12-2.3.0/
LICENSE NOTICE bin config libs site-docs
Add Kafka binary path to system path
Open .bash_profile (it is located in your home directory, ~/.bash_profile) and append the Kafka bin directory path to the system path.
export PATH=$PATH:{Kafka bin directory path}
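For example, with the installation location used later in this post (visible in the Zookeeper startup log), the line would look like the below; replace the path with wherever you extracted Kafka.
export PATH=$PATH:/Users/krishna/Documents/TechnicalDocuments/kafka/software/kafka_2.12-2.3.0/bin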
Reload .bash_profile by executing the below command.
source ~/.bash_profile
Test Kafka installation
Execute the command ‘kafka-topics.sh’. If you see output like the below, your installation is successful.
$kafka-topics.sh
Create, delete, describe, or change a topic.
Option Description
------ -----------
--alter Alter the number of partitions,
replica assignment, and/or
configuration for the topic.
--at-min-isr-partitions if set when describing topics, only
show partitions whose isr count is
equal to the configured minimum. Not
supported with the --zookeeper
option.
--bootstrap-server <String: server to REQUIRED: The Kafka server to connect
connect to> to. In case of providing this, a
direct Zookeeper connection won't be
required.
--command-config <String: command Property file containing configs to be
config property file> passed to Admin Client. This is used
only with --bootstrap-server option
for describing and altering broker
configs.
--config <String: name=value> A topic configuration override for the
topic being created or altered.The
following is a list of valid
configurations:
cleanup.policy
compression.type
delete.retention.ms
file.delete.delay.ms
flush.messages
flush.ms
follower.replication.throttled.
replicas
index.interval.bytes
leader.replication.throttled.replicas
max.compaction.lag.ms
max.message.bytes
message.downconversion.enable
message.format.version
message.timestamp.difference.max.ms
message.timestamp.type
min.cleanable.dirty.ratio
min.compaction.lag.ms
min.insync.replicas
preallocate
retention.bytes
retention.ms
segment.bytes
segment.index.bytes
segment.jitter.ms
segment.ms
unclean.leader.election.enable
See the Kafka documentation for full
details on the topic configs.It is
supported only in combination with --
create if --bootstrap-server option
is used.
--create Create a new topic.
--delete Delete a topic
--delete-config <String: name> A topic configuration override to be
removed for an existing topic (see
the list of configurations under the
--config option). Not supported with
the --bootstrap-server option.
--describe List details for the given topics.
--disable-rack-aware Disable rack aware replica assignment
--exclude-internal exclude internal topics when running
list or describe command. The
internal topics will be listed by
default
--force Suppress console prompts
--help Print usage information.
--if-exists if set when altering or deleting or
describing topics, the action will
only execute if the topic exists.
Not supported with the --bootstrap-
server option.
--if-not-exists if set when creating topics, the
action will only execute if the
topic does not already exist. Not
supported with the --bootstrap-
server option.
--list List all available topics.
--partitions <Integer: # of partitions> The number of partitions for the topic
being created or altered (WARNING:
If partitions are increased for a
topic that has a key, the partition
logic or ordering of the messages
will be affected
--replica-assignment <String: A list of manual partition-to-broker
broker_id_for_part1_replica1 : assignments for the topic being
broker_id_for_part1_replica2 , created or altered.
broker_id_for_part2_replica1 :
broker_id_for_part2_replica2 , ...>
--replication-factor <Integer: The replication factor for each
replication factor> partition in the topic being created.
--topic <String: topic> The topic to create, alter, describe
or delete. It also accepts a regular
expression, except for --create
option. Put topic name in double
quotes and use the '\' prefix to
escape regular expression symbols; e.
g. "test\.topic".
--topics-with-overrides if set when describing topics, only
show topics that have overridden
configs
--unavailable-partitions if set when describing topics, only
show partitions whose leader is not
available
--under-min-isr-partitions if set when describing topics, only
show partitions whose isr count is
less than the configured minimum.
Not supported with the --zookeeper
option.
--under-replicated-partitions if set when describing topics, only
show under replicated partitions
--version Display Kafka version.
--zookeeper <String: hosts> DEPRECATED, The connection string for
the zookeeper connection in the form
host:port. Multiple hosts can be
given to allow fail-over.
$
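You can also check which Kafka version the scripts belong to using the --version option listed in the help output above; for the release used in this post it reports 2.3.0.
$kafka-topics.sh --version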
Start Zookeeper
Before starting the Kafka service, we need to start Zookeeper. Zookeeper reads its configuration, such as the port it runs on and the directory where it stores its data, from a properties file.
The ‘config’ directory of the Kafka installation has a ‘zookeeper.properties’ file. We can use this for our Zookeeper instance.
$ls
LICENSE NOTICE bin config libs site-docs
$
$ls config
connect-console-sink.properties connect-log4j.properties server.properties
connect-console-source.properties connect-standalone.properties tools-log4j.properties
connect-distributed.properties consumer.properties trogdor.conf
connect-file-sink.properties log4j.properties zookeeper.properties
connect-file-source.properties producer.properties
Let’s open the zookeeper.properties file and see its contents.
$cat config/zookeeper.properties
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# the directory where the snapshot is stored.
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
As you can see in the property file, the Zookeeper data directory is set to /tmp/zookeeper and the port Zookeeper listens on is set to 2181.
Since data in the /tmp directory does not survive for long (it may be cleaned up on reboot), I changed the dataDir path to a directory outside /tmp.
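As a sketch of that change, you could create a data directory next to the installation and point dataDir at it; the directory name zookeeper_data and the absolute path below are only examples, so use any durable location you like.
$mkdir zookeeper_data
# in config/zookeeper.properties
dataDir=/Users/krishna/Documents/TechnicalDocuments/kafka/software/kafka_2.12-2.3.0/zookeeper_data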
Start the Zookeeper instance by executing the below command.
zookeeper-server-start.sh config/zookeeper.properties
You should see messages like the below in the console.
[2019-10-04 19:44:01,592] INFO Server environment:java.compiler=<NA> (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,593] INFO Server environment:os.name=Mac OS X (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,593] INFO Server environment:os.arch=x86_64 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,593] INFO Server environment:os.version=10.14.5 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,593] INFO Server environment:user.name=krishna (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,593] INFO Server environment:user.home=/Users/krishna (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,593] INFO Server environment:user.dir=/Users/krishna/Documents/TechnicalDocuments/kafka/software/kafka_2.12-2.3.0 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,598] INFO tickTime set to 3000 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,598] INFO minSessionTimeout set to -1 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,598] INFO maxSessionTimeout set to -1 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-04 19:44:01,611] INFO Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory (org.apache.zookeeper.server.ServerCnxnFactory)
[2019-10-04 19:44:01,625] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)
As you can see from the console messages, the Zookeeper instance started on port 2181.
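If you want to double-check from another terminal that Zookeeper is accepting connections, a simple port probe works; this assumes the nc utility is available on your machine (it is not part of Kafka).
$nc -z localhost 2181 && echo "Zookeeper is reachable on 2181"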
Start Kafka
We can pass configuration properties to the Kafka instance while starting it. The default configuration file, ‘server.properties’, is located in the config folder of the Kafka installation directory.
$ls
LICENSE NOTICE bin config libs logs site-docs
$
$ls config/
connect-console-sink.properties connect-log4j.properties server.properties
connect-console-source.properties connect-standalone.properties tools-log4j.properties
connect-distributed.properties consumer.properties trogdor.conf
connect-file-sink.properties log4j.properties zookeeper.properties
connect-file-source.properties producer.properties
Open server.properties and search for the property log.dirs.
# A comma separated list of directories under which to store log files
log.dirs=/tmp/kafka-logs
By default, log.dirs points to the /tmp/kafka-logs directory. Since data in /tmp does not survive for long, I am going to change the value of the log.dirs property.
For example, I created a directory ‘kafka_data’ and set the log.dirs property to this directory's path.
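A sketch of that change, assuming kafka_data is created inside the Kafka installation directory (adjust the absolute path for your machine):
$mkdir kafka_data
# in config/server.properties
log.dirs=/Users/krishna/Documents/TechnicalDocuments/kafka/software/kafka_2.12-2.3.0/kafka_data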
Execute the below command to start the Kafka instance.
kafka-server-start.sh config/server.properties
You should see messages like the below in the console.
[2019-10-04 19:52:16,236] INFO [/config/changes-event-process-thread]: Starting (kafka.common.ZkNodeChangeNotificationListener$ChangeEventProcessThread)
[2019-10-04 19:52:16,242] INFO [SocketServer brokerId=0] Started data-plane processors for 1 acceptors (kafka.network.SocketServer)
[2019-10-04 19:52:16,247] INFO Kafka version: 2.3.0 (org.apache.kafka.common.utils.AppInfoParser)
[2019-10-04 19:52:16,247] INFO Kafka commitId: fc1aaa116b661c8a (org.apache.kafka.common.utils.AppInfoParser)
[2019-10-04 19:52:16,247] INFO Kafka startTimeMs: 1570198936243 (org.apache.kafka.common.utils.AppInfoParser)
[2019-10-04 19:52:16,249] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)
Now you can see that some files have been created in the kafka_data directory (the one log.dirs points to).
$ls kafka_data/
cleaner-offset-checkpoint meta.properties replication-offset-checkpoint
log-start-offset-checkpoint recovery-point-offset-checkpoint
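As a final sanity check, you can ask the running broker for its topic list using the kafka-topics.sh script we tested earlier. The broker listens on port 9092 by default (server.properties was left with the default listeners setting here); a fresh broker returns an empty list because no topics have been created yet.
$kafka-topics.sh --bootstrap-server localhost:9092 --list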