ZDM-295 Enhance docker-compose components #5

Open · wants to merge 1 commit into main

Conversation


@anthony-grasso-datastax commented Sep 29, 2022

  • Created Dockerfiles for the jumphost, proxy, and client
  • Migrated package installation and user setup from docker-entrypoint into the Dockerfile for the jumphost, proxy, and client
  • Increased the retry sleep time to 20s for the jumphost's SSH connection test and key scan
  • Increased the retry sleep time to 20s for the proxy's SSH key check
  • Added functionality to the jumphost to detect the container names and IP addresses of the proxy, origin (Cassandra node), and target (Cassandra node) containers and record that information in a shared hosts file
  • Added functionality to the client to read the IP addresses of the proxies and Cassandra nodes from the shared hosts file
  • Updated the test carried out by the client to initially insert data into the Cassandra origin node, then every minute insert data via the ZDM proxy, read it back via the ZDM proxy, and then read it back directly from each Cassandra node
  • Fixed an issue where Ansible would fail to provision the proxy hosts if fewer than three existed, by making the jumphost dynamically generate the zdm_ansible_inventory file
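As a rough illustration of the jumphost changes above (container discovery, the shared hosts file, and the dynamically generated inventory), a minimal bash sketch might look like the following; the container-name patterns, file paths, and inventory group name are assumptions, not the exact values used in this branch.

#!/bin/bash
# Hypothetical sketch only: names and paths below are placeholders.
SHARED_HOSTS_FILE="/shared/hosts"
INVENTORY_FILE="/shared/zdm_ansible_inventory"
PROJECT="${COMPOSE_PROJECT_NAME:-zdm-proxy-automation}"

: > "${SHARED_HOSTS_FILE}"

# Resolve each candidate container name on the compose network and record
# "<ip> <name>" pairs for the client to read later.
for name in "${PROJECT}"_proxy_{1..10} "${PROJECT}"_origin_1 "${PROJECT}"_target_1; do
  ip="$(getent hosts "${name}" | awk '{print $1}')"
  [ -n "${ip}" ] && echo "${ip} ${name}" >> "${SHARED_HOSTS_FILE}"
done

# Generate the Ansible inventory from however many proxies were actually found,
# so provisioning no longer assumes exactly three proxy hosts.
{
  echo "[proxies]"
  grep "_proxy_" "${SHARED_HOSTS_FILE}" | awk '{print $1}'
} > "${INVENTORY_FILE}"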

Collaborator

@grighetto left a comment


Thanks for putting this together @anthony-grasso-datastax, great stuff.
I left a few comments for your consideration, and I'm still looking into why things are not quite working for me locally on OSX; the client_1 container keeps spinning with:

client_1    | cqlsh not ready on zdm-proxy-automation_proxy_1
client_1    | /usr/local/bin/cqlsh:460: DeprecationWarning: Legacy execution parameters will be removed in 4.0. Consider using execution profiles.
client_1    | Connection error: ('Unable to connect to any servers', {'192.168.100.6:9042': ConnectionRefusedError(111, "Tried connecting to [('192.168.100.6', 9042)]. Last error: Connection refused")})
client_1    | cqlsh not ready on zdm-proxy-automation_proxy_1
client_1    | /usr/local/bin/cqlsh:460: DeprecationWarning: Legacy execution parameters will be removed in 4.0. Consider using execution profiles.
client_1    | Connection error: ('Unable to connect to any servers', {'192.168.100.6:9042': ConnectionRefusedError(111, "Tried connecting to [('192.168.100.6', 9042)]. Last error: Connection refused")})

@@ -1,35 +0,0 @@
#!/bin/bash
Collaborator


In the zdm-proxy repo we're also using the short name compose for the directory; for consistency we should probably stick with that name here too.

Author


No worries. Reverted the change so that the directory is called compose.

networks:
  proxy:
deploy:
  mode: replicated
  replicas: 3
Collaborator


The main goal for the docker-compose infrastructure was testing the automation in an environment as similar as possible to the real production environment, meaning distributed machines/VMs running multiple proxy instances/replicas.
I understand that for demo purposes we may only need one instance, so I'd suggest moving the number of replicas into that new env file and making the default 3, so we can easily tweak it when needed.


No worries. Reverted the change and moved the number of replicas to the .env file as suggested.
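For reference, a minimal sketch of what that can look like, assuming a variable named PROXY_REPLICAS (the actual name in this branch may differ) and a compose version that interpolates .env values into the replicas field:

# .env
PROXY_REPLICAS=3

# docker-compose.yml (proxy service excerpt)
proxy:
  deploy:
    mode: replicated
    replicas: ${PROXY_REPLICAS:-3}

docker-compose reads .env from the project directory, so lowering the count for a demo only requires changing PROXY_REPLICAS, not editing docker-compose.yml.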

networks:
  proxy:

proxy:
  image: thesoul/ubuntu-dind:docker-20.10.12
  build: ./docker-compose/services/proxy
Collaborator


Neat! I liked this approach to build the image on the fly and then re-use it.
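For readers unfamiliar with the pattern: when a service declares both build: and image:, Compose builds from the build: context and tags the result with the image: name, so the replicas and later runs reuse the locally built image instead of rebuilding (or pulling) it. Annotated excerpt; the comments are explanatory additions, not part of the PR:

proxy:
  # Name/tag applied to the image built below; replicas and later runs reuse it.
  image: thesoul/ubuntu-dind:docker-20.10.12
  # Build context used when the image is first built (or when --build is passed).
  build: ./docker-compose/services/proxy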


Thank you 😄

@@ -0,0 +1,144 @@
#!/bin/bash

INSERT_DML="INSERT INTO test_keyspace.test_table (id, window_day, read_minute, value)"
Collaborator


In the zdm-proxy repo we're using NoSQLBench to validate the data written through the proxy. You can see it here: https://github.com/datastax/zdm-proxy/blob/main/compose/nosqlbench-entrypoint.sh
The intent is to do the same thing for the automation, since that's more comprehensive for testing purposes.

We can probably keep these CQL statements under some demo folder in the project root, but probably outside the integration test, so anyone interested in demo'ing can first start docker-compose and then separately run the demo script.

Author


In the zdm-proxy repo we're using NoSQLBench to validate the data written through the proxy. ...
The intent is to do the same thing for the automation, since that's more comprehensive for testing purposes.

Makes sense.

We can probably keep these CQL statements under some demo folder in the project root, but probably outside the integration test, so anyone interested in demo'ing can first start docker-compose and then separately run the demo script.

Did we want to use NoSQLBench for testing, similar to what we are doing in zdm-proxy?

Further to the above question, what if I updated the client to have three operation modes:

  • TEST_CONNECTION - the same cqlsh connection test that the client currently does.
  • TEST_DATA - a data verification test using NoSQLBench, similar to what is done in the zdm-proxy repository.
  • DEMO - a looped data verification test using NoSQLBench, similar to what my proposed changes were doing with cqlsh.

The default mode could be set to TEST_CONNECTION for now. We could always change it to something else later.
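A rough sketch of how that dispatch could look in the client entrypoint; the CLIENT_MODE variable, the PROXY_HOST variable, and the helper bodies are placeholders rather than anything in this PR:

#!/bin/bash
set -euo pipefail

test_connection() {
  # Same style of cqlsh connectivity check the client performs today.
  cqlsh "${PROXY_HOST}" 9042 -e "DESCRIBE KEYSPACES"
}

test_data() {
  # One-shot data verification, e.g. driven by NoSQLBench as in the zdm-proxy repo.
  echo "running data verification..."
}

demo() {
  # Looped verification, repeating the data check once a minute.
  while true; do
    test_data
    sleep 60
  done
}

case "${CLIENT_MODE:-TEST_CONNECTION}" in
  TEST_CONNECTION) test_connection ;;
  TEST_DATA)       test_data ;;
  DEMO)            demo ;;
  *) echo "Unknown CLIENT_MODE: ${CLIENT_MODE}" >&2; exit 1 ;;
esac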

Let me know what you think.

Collaborator

@grighetto Oct 18, 2022


@anthony-grasso-datastax

Did we want to use NoSQLBench for testing, similar to what we are doing in zdm-proxy?

Yes, that should give us more confidence than just connecting through CQLSH.

Further to the above question, what if I updated the client to have three operation modes:

I like that idea. It would be good, though, if we could run both TEST_CONNECTION and TEST_DATA without having to re-run the Ansible automation. In other words, we run the automation once, but the client can be run multiple times in different ways at the end for testing.
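Assuming the mode is selected via an environment variable (CLIENT_MODE here is a placeholder), re-running only the client in different modes, without re-running the automation, could look like:

docker-compose run --rm -e CLIENT_MODE=TEST_CONNECTION client
docker-compose run --rm -e CLIENT_MODE=TEST_DATA client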

ENV PROXY_IP_1 ""
ENV PROXY_IP_2 ""
ENV PROXY_IP_3 ""
ENV JUMPHOST_IP ""
Collaborator


These vars were only used to populate the inventory file. Since that file is constructed through bash now, we can probably get rid of them.


Good catch! I forgot to remove these.
