carlos-medina/go-data-platform

[WIP] go-data-platform

This data platform has two applications:

Ingestor

A worker that consumes inputs from a Kafka topic, decodes them, and looks up the previous entry for each one in the database. If there is no previous entry, it persists the record. If a previous entry exists with a version greater than the input's, it discards the input. If a previous entry exists with a version less than the input's, it updates the record. The Ingestor currently works, but needs some refactoring. Remaining work:

  • Read all config from environment variables;
  • Give the database a static IP address in our Docker network. It is currently assigned dynamically, so we have to change both the ingestor's and the retriever's configs every time the database IP changes;
  • Implement endpoint;
  • Implement logging;
  • Create tests for the adapter;
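
The version rule described above can be sketched as a small pure function, which also makes it easy to test in isolation. This is a minimal sketch, not the repository's implementation: the Record fields and the Action names are hypothetical, and discarding an input whose version equals the stored one is an assumption, since the rule above only covers "greater" and "less".

```go
package main

import "fmt"

// Record is a hypothetical shape for a decoded input; the real fields
// live in the repository and may differ.
type Record struct {
	ID      int64
	Version int64
	Payload string
}

// Action is what the ingestor should do with an incoming record.
type Action int

const (
	Insert  Action = iota // no previous entry exists
	Update                // previous entry is older than the input
	Discard               // previous entry is at least as new as the input
)

// String makes actions readable when printed or logged.
func (a Action) String() string {
	return [...]string{"insert", "update", "discard"}[a]
}

// Decide compares the stored record (nil when absent) with the input.
// Equal versions are discarded here; that tie-break is an assumption.
func Decide(prev *Record, in Record) Action {
	switch {
	case prev == nil:
		return Insert
	case prev.Version < in.Version:
		return Update
	default:
		return Discard
	}
}

func main() {
	stored := &Record{ID: 1, Version: 2}
	fmt.Println(Decide(nil, Record{ID: 1, Version: 1}))    // insert
	fmt.Println(Decide(stored, Record{ID: 1, Version: 3})) // update
	fmt.Println(Decide(stored, Record{ID: 1, Version: 1})) // discard
}
```

Keeping the decision separate from the Kafka and MySQL code means the adapter tests only need to cover the reads and writes themselves.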

Retriever

An HTTP API that receives a GET request, applies the filter given in the query parameter, queries the database, and returns the data as JSON. The Retriever currently works, but needs some refactoring. Remaining work:

  • Remove the panics from main and return an appropriate error message in the response instead;
  • Read all config from environment variables;
  • Give the database a static IP address in our Docker network. It is currently assigned dynamically, so we have to change both the ingestor's and the retriever's configs every time the database IP changes;
  • Implement endpoint;
  • Implement service;
  • Implement logging;
  • Create tests for the adapter;
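
One possible shape for the "remove the panics" item is a handler that turns every failure into a JSON error response. The sketch below is illustrative only: errorBody, ParseDataID, writeJSON, recordsHandler, and the lookup callback are hypothetical names, and treating data_id as an integer is an assumption based on the example request data_id=1. The main function exercises the handler in memory with net/http/httptest instead of starting a server.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strconv"
)

// errorBody is a hypothetical JSON error envelope.
type errorBody struct {
	Error string `json:"error"`
}

// ParseDataID extracts the data_id filter from the query string.
// Treating it as an integer is an assumption.
func ParseDataID(q url.Values) (int64, error) {
	raw := q.Get("data_id")
	if raw == "" {
		return 0, errors.New("missing data_id query parameter")
	}
	id, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, errors.New("data_id must be an integer")
	}
	return id, nil
}

// writeJSON encodes v as the response body with the given status code.
func writeJSON(w http.ResponseWriter, status int, v any) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	if err := json.NewEncoder(w).Encode(v); err != nil {
		log.Printf("encoding response: %v", err)
	}
}

// recordsHandler answers GET /records. lookup stands in for the
// database adapter; its errors become 500 responses, not panics.
func recordsHandler(lookup func(id int64) (any, error)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		id, err := ParseDataID(r.URL.Query())
		if err != nil {
			writeJSON(w, http.StatusBadRequest, errorBody{Error: err.Error()})
			return
		}
		rec, err := lookup(id)
		if err != nil {
			writeJSON(w, http.StatusInternalServerError, errorBody{Error: err.Error()})
			return
		}
		writeJSON(w, http.StatusOK, rec)
	}
}

func main() {
	// Placeholder lookup; the real retriever would query MySQL here.
	lookup := func(id int64) (any, error) {
		return map[string]int64{"data_id": id}, nil
	}
	h := recordsHandler(lookup)

	// Exercise the handler in memory: one valid and one invalid request.
	for _, target := range []string{"/records?data_id=1", "/records"} {
		rec := httptest.NewRecorder()
		h(rec, httptest.NewRequest(http.MethodGet, target, nil))
		fmt.Printf("%d %s", rec.Code, rec.Body.String())
	}
}
```

In the real service the handler would be registered with http.Handle("/records", ...) and served with http.ListenAndServe, with main logging startup errors rather than panicking.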

How to run the system

Running a Kafka broker

Run a Kafka broker on localhost:9092 using Docker. The broker configuration in this repo was taken from Confluent Platform's Kafka.

docker compose up -d kafka-broker

Producing input data

Open a shell in the Kafka broker's container:

docker exec -it kafka-broker bash

Create a new topic called input-data:

kafka-topics --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic input-data

You can check if the topic was created using the command:

kafka-topics --list --bootstrap-server localhost:9092

Each line of resources/input-data.txt in this repo contains a different input. To produce an input event, start a console producer, then copy one line from the file and paste it into the console:

kafka-console-producer --bootstrap-server localhost:9092 --topic input-data

You can check that the event was produced by starting a console consumer:

kafka-console-consumer --bootstrap-server localhost:9092 --topic input-data --from-beginning

Set up the database

Running the database:

docker compose up -d db

Running a client for us to connect to the database:

docker run -it --network go-data-platform-network --rm --name mysql-client mysql:8.0 mysql -hmysql -uroot -p

In the container's terminal, we type the password assigned to MYSQL_ROOT_PASSWORD in our .env file, which is admin.

In another terminal, we copy the file create-table.sql into the mysql-client container:

docker cp ./resources/create-table.sql mysql-client:/create-table.sql

In the mysql-client container's terminal, we run the following commands to (1) create the database, (2) select it, and (3) execute the SQL statements from the copied file:

create database go_data_platform;
use go_data_platform;
source /create-table.sql;

Running Ingestor

Before running the ingestor, we must change the Addr value in cmd/ingestor/resources.go > MustNewMySQLAdapter() > cfg. To find the correct value, we inspect the database container with docker inspect:

docker inspect mysql | grep IPAddress

If the container's IP address is, for instance, 172.28.0.2, we set Addr to:

Addr: "172.28.0.2:3306",

To run the Docker image, the key "bootstrap.servers" in kafka.ConfigMap must have the value "broker:29092" if you are running Confluent Platform's Kafka. For background, see the documentation on KAFKA_LISTENERS and KAFKA_ADVERTISED_LISTENERS.

Running the ingestor:

docker compose up ingestor

Running Retriever

As with the Ingestor, we must change the Addr value in cmd/retriever/resources.go > MustNewMySQLAdapter() > cfg. To find the correct value, we inspect the database container with docker inspect:

docker inspect mysql | grep IPAddress

If the container's IP address is, for instance, 172.28.0.2, we set Addr to:

Addr: "172.28.0.2:3306",

Running the retriever:

docker compose up retriever

Sending a request to the retriever:

curl "http://localhost:8080/records?data_id=1"
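
The same request can be sent from Go, which is handy for scripted checks. This is a minimal sketch that assumes the retriever is listening on localhost:8080 as above. RecordsURL is a hypothetical helper, and the response body is printed as-is because its JSON shape is not described here.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
	"time"
)

// RecordsURL builds the request URL, escaping the filter value.
func RecordsURL(base, dataID string) string {
	q := url.Values{}
	q.Set("data_id", dataID)
	return base + "/records?" + q.Encode()
}

func main() {
	// A timeout keeps the check from hanging if the retriever is down.
	client := &http.Client{Timeout: 5 * time.Second}

	resp, err := client.Get(RecordsURL("http://localhost:8080", "1"))
	if err != nil {
		log.Printf("request failed: %v", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Printf("reading response: %v", err)
		return
	}
	fmt.Printf("%s\n%s\n", resp.Status, body)
}
```

Building the query with url.Values avoids hand-escaping the filter value if it ever contains characters such as spaces or ampersands.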

About

[WIP] A data platform written in Golang
