@surajkota surajkota commented Oct 23, 2024

Description of changes

SageMaker HyperPod recently launched EKS integration. This commit adds SageMaker instance types and a toleration for running DeepHealthChecks, so customers can install the EFA Helm chart on HyperPod nodes without modifying it unless they need further customization.
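
As a rough illustration of what such a change means for scheduling, the fragment below sketches a DaemonSet pod spec that matches both EC2 and SageMaker instance types and tolerates a health-check taint. It is not copied from this PR's diff. The instance types are examples only, and the taint key, value, and effect are assumptions about how HyperPod marks nodes during a deep health check; the chart's actual values layout may differ.

```yaml
# Illustrative DaemonSet pod-spec fragment (standard Kubernetes fields only).
# Not the chart's real template; the values below are assumptions for the sketch.
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node.kubernetes.io/instance-type
                    operator: In
                    values:
                      - p5.48xlarge      # EC2 instance type (example)
                      - ml.p5.48xlarge   # SageMaker counterpart with the "ml." prefix (example)
      tolerations:
        # Assumed taint present while a deep health check is running (hypothetical key/value).
        - key: sagemaker.amazonaws.com/node-health-status
          operator: Equal
          value: Unschedulable
          effect: NoSchedule
```

Without a matching toleration, the device plugin pod would stay off a tainted node, which is why the toleration matters for health checks that exercise EFA.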

Checklist

  • Added/modified documentation as required (such as the README.md for modified charts)
  • Incremented the chart version in Chart.yaml for the modified chart(s)
  • Manually tested. Describe what testing was done in the testing section below
  • Make sure the title of the PR is a good description that can go into the release notes

Testing

Installed the EFA driver from my GitHub branch and verified that pods are scheduled on g5.8x instances for both EC2 and HyperPod without modifying the chart. I didn't run workloads, as I didn't change the DaemonSet itself.

helm install aws-efa-k8s-device-plugin .
NAME: aws-efa-k8s-device-plugin
LAST DEPLOYED: Wed Oct 23 09:23:44 2024
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
EFA device plugin is installed, it can be requested as `vpc.amazonaws.com/efa` resource.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@surajkota surajkota requested a review from dims as a code owner October 23, 2024 15:20
@bryantbiggs
Contributor

partial dupe of #1129

@surajkota
Author

Ack, I can work with Nathan to close the other PR

@surajkota
Author

surajkota commented Oct 24, 2024

@bryantbiggs is there also a Kustomize version of the EFA device plugin maintained by EKS/AWS? Or are there any other high-traffic download sources where a similar change should also go?

I checked the aws-samples repo, and it's deprecated in favor of this chart.

@bryantbiggs
Contributor

no - this is the source of truth for deploying the EFA device plugin into an EKS cluster
