Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates to Docker execution #439

Merged
merged 2 commits into from
Aug 11, 2022
Merged

Updates to Docker execution #439

merged 2 commits into from
Aug 11, 2022

Conversation

emmacasolin
Copy link
Contributor

@emmacasolin emmacasolin commented Aug 9, 2022

Description

#432 is making changes to the way we do process execution, which impacts our docker tests. This requires updating the affected tests, redeploying in the CI/CD, and sanity checks for the expected behaviour.

Related to https://gitlab.com/MatrixAI/Engineering/Polykey/Polykey-Infrastructure/-/merge_requests/1

Issues Fixed

Tasks

Final checklist

  • Domain specific tests
  • Full tests
  • Updated inline-comment documentation
  • Lint fixed
  • Squash and rebased
  • Sanity check the final build

@ghost
Copy link

ghost commented Aug 9, 2022

👇 Click on the image for a new way to code review
  • Make big changes easier — review code in small groups of related files

  • Know where to start — see the whole change at a glance

  • Take a code tour — explore the change with an interactive tour

  • Make comments and review — all fully sync’ed with github

    Try it now!

Review these changes using an interactive CodeSee Map

Legend

CodeSee Map Legend

@emmacasolin
Copy link
Contributor Author

This is the PR in Polykey-Infrastructure for updating the task definition: https://gitlab.com/MatrixAI/Engineering/Polykey/Polykey-Infrastructure/-/merge_requests/1

@emmacasolin
Copy link
Contributor Author

One of the biggest changes coming in from #436 is that pkStdio is no longer compatible with docker integration tests. This is because the original pkStdio was not compatible with these tests since it mocked stdout and stderr. It was changed to essentially just call pkExec underneath when used with docker, however, this functionality is being removed. Any tests that need to run on docker need to switch from using pkStdio to using pkExec, a change which is being made to existing tests in #436 so this just needs to be kept in mind for any future tests we want to add to our suite of docker integration tests.

@CMCDragonkai
Copy link
Member

One of the biggest changes coming in from #436 is that pkStdio is no longer compatible with docker integration tests. This is because the original pkStdio was not compatible with these tests since it mocked stdout and stderr. It was changed to essentially just call pkExec underneath when used with docker, however, this functionality is being removed. Any tests that need to run on docker need to switch from using pkStdio to using pkExec, a change which is being made to existing tests in #436 so this just needs to be kept in mind for any future tests we want to add to our suite of docker integration tests.

This is correct. The pkStdio should not be used unless the test is expected to never be an integration test. So any test using pkStdio should be convert to pkExec in this scenario. However with the loss of the ability to mock STDIO we need pkExpect to be shell capable so we can do proper stdio expectations. Please link that issue and that will be a subsequent work to do, which may involve forking nexpect.

@CMCDragonkai
Copy link
Member

However we are not replacing the pkStdio tests for now. That's assigned to MatrixAI/Polykey-CLI#7.

This should only be affecting existing docker tests by removing the /bin/polykey.

@CMCDragonkai
Copy link
Member

@tegefaulkes can you point out where we need to remove /bin/polykey when docker integration tests are run?

@tegefaulkes
Copy link
Contributor

/bin/polykey needs to be removed from the PK_TEST_COMMAND in the gitlab ci file. at .gitlab-ci.yml:342 PK_TEST_COMMAND="docker run \$DOCKER_OPTIONS $image_and_tag /bin/polykey" npm run test -- tests/bin;

@emmacasolin
Copy link
Contributor Author

PK_TEST_COMMAND has been updated. This should now be ready for CI/CD redeployment.

@CMCDragonkai
Copy link
Member

Are the recent job failures the same timeouts as before? Not introduced by this PR?

@CMCDragonkai
Copy link
Member

Also have you see that the Nat tests are timing out. Can this be fixed up in your testnet tests PR?

@emmacasolin
Copy link
Contributor Author

The failures are all timeouts on tests that usually take very close to the set timeout time to complete, so if they take slightly too long they timeout. They can't be due to this PR since this PR only touches the integration:docker job and release.nix, neither of which would affect tests during the check stage.

There's only one NAT test timing out, and it usually takes close to 40s to run. It doesn't always timeout on the CI/CD but when it does it's most likely just that test taking slightly too long. The only fix would be increasing the timeout or waiting to see if #419 causes the test to run any faster.

@CMCDragonkai CMCDragonkai marked this pull request as ready for review August 11, 2022 03:05
@CMCDragonkai CMCDragonkai merged commit e80624e into staging Aug 11, 2022
@CMCDragonkai
Copy link
Member

Merged on Polykey-Infrastructure. This requires an image build and upload. Image is uploaded here, so until CI/CD is green we have to do this manually.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Aug 11, 2022

Manual deployment done and instructions updated: 7b1edbd

taskDefinition: 'arn:aws:ecs:ap-southeast-2:015248367786:task-definition/polykey-testnet:76',

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Aug 11, 2022

I've updated the task definition on AWS. And I noticed that the image itself doesn't have a HOME which impacts the NODE_PATH automatic derivation.

If a NODE_PATH and HOME doesn't exist, it ends up defaulting to /.local/share/polykey (because os.homeDir() becomes just /).

This is something to be aware of. I also fixed up the path construction to be using path.join, so the help page now shows:

  -np, --node-path <path>      Path to Node State (default: "/.local/share/polykey", env: PK_NODE_PATH)

This should also help MatrixAI/Polykey-CLI#14 and MatrixAI/Polykey-CLI#11

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Aug 11, 2022

Service redeployed.

  {
      "serviceName": "polykey-testnet",
      "serviceArn": "arn:aws:ecs:ap-southeast-2:015248367786:service/polykey-testnet/polykey-testnet",
      "status": "ACTIVE",
      "deployments": [
          {
              "id": "ecs-svc/9149761156517349517",
              "status": "PRIMARY",
              "rolloutState": "IN_PROGRESS",
              "rolloutStateReason": "ECS deployment ecs-svc/9149761156517349517 in progress."
          },
          {
              "id": "ecs-svc/6135588724958369979",
              "status": "ACTIVE",
              "rolloutState": "IN_PROGRESS",
              "rolloutStateReason": "ECS deployment ecs-svc/6135588724958369979 in progress."
          },
          {
              "id": "ecs-svc/1428782724658529208",
              "status": "ACTIVE",
              "rolloutState": "COMPLETED",
              "rolloutStateReason": "ECS deployment ecs-svc/1428782724658529208 completed."
          }
      ]
  }

Now we wait if the new task definition continues to work. @emmacasolin please check in with AWS dashboard and logs tomorrow as you're writing testnet tests to see if it is deployed correctly. The command is now set to just set to:

            "command": [
                "agent",
                "start",
                "--verbose",
                "--format=json"
            ],

And it should be appending to the entrypoint being /bin/polykey.

Furthermore commands are changed to:

docker run -it polykey-1.0.0:rkl4ma7wbfvsz1wl95gn9rpz865gk2ai --help

@tegefaulkes remember to check in to see if docker integration tests are still working as well when you rebase. Specifically in reference to 6..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants