Programming for beginners: How to Ingest Data from an HTTPS Endpoint into Apache Druid?

Are you working with real-time or batch data that's accessible over HTTPS and wondering how to bring it into Apache Druid for fast analytics? You're in the right place!

Apache Druid is a high-performance analytics database designed for rapid, ad-hoc queries on large datasets. While it's commonly used with Kafka, local files, or cloud storage systems like S3 or GCS, you can also pull data from an HTTPS endpoint such as a public API or your internal service and ingest it into Druid.

In this blog post, I’ll walk through how to ingest data from an HTTPS source using Druid’s HTTP(s) input type.

Open ‘http://localhost:8888’ in your browser. You will be taken to the Druid console.

Click on the Load data -> Batch – SQL (multi-stage-query) option.

On the “Load Data / Select Input Type” page in Druid, choose HTTP(s) as your input type. This option allows you to provide one or more HTTPS URLs, separated by commas. For example, I used the following URL to ingest data into the Druid ecosystem:

https://raw.githubusercontent.com/MainakRepositor/Datasets/refs/heads/master/Weather%20Data/pressure.csv
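Behind the form, a comma-separated list of URLs like this maps naturally onto Druid's `http` input source, which takes a list of URIs. The snippet below is a minimal sketch of that mapping; the helper name `build_http_input_source` is made up for illustration, and the CSV input format with header detection is an assumption about this particular file.

```python
import json


def build_http_input_source(url_field: str) -> dict:
    """Turn a comma-separated URL string into a Druid `http` input source."""
    uris = [u.strip() for u in url_field.split(",") if u.strip()]
    if not uris:
        raise ValueError("at least one URL is required")
    return {"type": "http", "uris": uris}


PRESSURE_CSV = (
    "https://raw.githubusercontent.com/MainakRepositor/Datasets/"
    "refs/heads/master/Weather%20Data/pressure.csv"
)

input_source = build_http_input_source(PRESSURE_CSV)
# Assumption: the file is a CSV whose first row holds the column names.
input_format = {"type": "csv", "findColumnsFromHeader": True}

print(json.dumps(input_source, indent=2))
print(json.dumps(input_format))
```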

Click on the Connect data button.

You will be taken to the ‘Load data / Parse’ page.

Click on the Next button.

You will be taken to the ‘Load data / Configure schema’ page. Notice that the __time column is generated from the datetime values of the dataset.

Click on the ‘Start loading data’ button.
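For reference, the wizard steps above boil down to a multi-stage query statement that reads the URL through `EXTERN` and derives `__time` from the `datetime` column. The sketch below only assembles such a statement as a string; it is not the exact SQL the console generates. The helper names are invented for illustration, `example_city` is a hypothetical column name (replace it with the real headers from your file), and `PARTITIONED BY DAY` is an assumed granularity.

```python
import json


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_ingest_sql(datasource, uris, time_column, value_columns, time_pattern=None):
    """Build a REPLACE ... EXTERN statement for CSV files served over HTTP(S)."""
    input_source = json.dumps({"type": "http", "uris": list(uris)})
    input_format = json.dumps({"type": "csv", "findColumnsFromHeader": True})
    # Row signature: the timestamp is read as a string, the rest as doubles.
    signature = json.dumps(
        [{"name": time_column, "type": "string"}]
        + [{"name": col, "type": "double"} for col in value_columns]
    )

    # TIME_PARSE expects ISO 8601 unless a Joda pattern is supplied.
    if time_pattern is None:
        time_expr = f'TIME_PARSE("{time_column}")'
    else:
        time_expr = f'TIME_PARSE("{time_column}", {sql_literal(time_pattern)})'

    select_items = [f'{time_expr} AS "__time"'] + [f'"{col}"' for col in value_columns]
    lines = [
        f'REPLACE INTO "{datasource}" OVERWRITE ALL',
        "SELECT",
        "  " + ",\n  ".join(select_items),
        "FROM TABLE(",
        "  EXTERN(",
        f"    {sql_literal(input_source)},",
        f"    {sql_literal(input_format)},",
        f"    {sql_literal(signature)}",
        "  )",
        ")",
        "PARTITIONED BY DAY",  # assumed granularity
    ]
    return "\n".join(lines)


ingest_sql = build_ingest_sql(
    datasource="pressure",
    uris=[
        "https://raw.githubusercontent.com/MainakRepositor/Datasets/"
        "refs/heads/master/Weather%20Data/pressure.csv"
    ],
    time_column="datetime",
    value_columns=["example_city"],  # hypothetical column name
)
print(ingest_sql)
```

If the datetime values are not ISO 8601, passing a Joda-style pattern through `time_pattern` is one way to handle it. A statement like this can be pasted into the Query tab, or sent as `{"query": ...}` to the `/druid/v2/sql/task` endpoint.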

Once the ingestion completes successfully, navigate to the Query tab and execute the following SQL statement.

SELECT * FROM "pressure"
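The same query can also be run outside the console through Druid's SQL HTTP endpoint. Here is a minimal sketch using only the Python standard library. It assumes the same `http://localhost:8888` address used for the console, and the function names are illustrative rather than part of any Druid client.

```python
import json
import urllib.request

DRUID_ROUTER = "http://localhost:8888"  # same address as the web console


def build_sql_request(query: str, base_url: str = DRUID_ROUTER) -> urllib.request.Request:
    """Prepare a POST to Druid's SQL endpoint without sending it."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/druid/v2/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query: str, base_url: str = DRUID_ROUTER) -> list:
    """Send the query and return the parsed JSON response."""
    with urllib.request.urlopen(build_sql_request(query, base_url)) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_sql_request('SELECT * FROM "pressure"')
print(req.get_method(), req.full_url)

# Requires a running Druid instance with the "pressure" datasource loaded:
# rows = run_query('SELECT * FROM "pressure"')
```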

That’s it, you are done.


Friday, 12 September 2025
