-
Notifications
You must be signed in to change notification settings - Fork 15
AKS Improvements #36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
AKS Improvements #36
Conversation
| # - Pin operator/system components to the smaller cpu8 pool. | ||
| # - Expose two CPU-focused instance types aligned with the cpu8 and cpu16 node pools. | ||
|
|
||
| enableAnyscaleRayHeadNodePDB: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove this yaml file?
We should just refer to the Anyscale Operator helm chart link and ask users to customize the values file themselves. We do not want to fork / maintain another helm values file per example.
| #checkov:skip=CKV_AZURE_227: "Ensure that the AKS cluster encrypt temp disks, caches, and data flows between Compute and Storage resources" | ||
|
|
||
| name = var.aks_cluster_name | ||
| name = local.cluster_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you keep the original name unchanged?
| auto_scaling_enabled = true | ||
| min_count = 1 | ||
| max_count = 3 | ||
| min_count = 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change the min_count to 3? We should keep it as minimal as possible.
| min_count = 1 | ||
| max_count = 3 | ||
| min_count = 3 | ||
| max_count = 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change it to 5?
| min_count = 3 | ||
| max_count = 5 | ||
|
|
||
| upgrade_settings { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove this upgrade_settings change?
We prefer not to maintain any optional settings here. Users can customize on their own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with this one, if we don't provide it, I think the AKS rejects it - as in it has to have upgrade_settings..
| vm_size = "Standard_NC16as_T4_v3" | ||
| product_name = "NVIDIA-T4" | ||
| gpu_count = "1" | ||
| user_node_pools = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where are those GPU node pools? They are necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add them back on once I can test them, currently don't have GPU quota so couldn't test (WIP)
| @@ -1,13 +1,25 @@ | |||
| resource "random_string" "storage_suffix" { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you revert this change? We prefer not to have any randomness here. Otherwise debugging issues would be harder.
| default = "anyscale-operator" | ||
| } | ||
|
|
||
| variable "node_group_gpu_types" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are necessary and available across all examples.
| source = "hashicorp/azurerm" | ||
| version = "4.26.0" | ||
| } | ||
| random = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please revert this randomness change.
Pull request checklist
Please check if your PR fulfills the following requirements:
Pull Request Type
Does this introduce a breaking change?
Other information