-
Notifications
You must be signed in to change notification settings - Fork 38
feat: add calculate-memory command to actor #980
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| writeFileSync(joinPath(LOCAL_CONFIG_PATH), JSON.stringify(actorJson, null, '\t'), { flag: 'w' }); | ||
| }; | ||
|
|
||
| describe('apify actor calculate-memory', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vladfrangu when I start it one by one it works perfectly, but when I start it as as a whole set it fails, can you help me to fix that, please? 🙃
It seems there's something wrong with path or existence of the file. I was a bit confused with that useTempPath mocks...
tobice
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 👏 Left some suggestions though.
| import { error, info, success } from '../../lib/outputs.js'; | ||
| import { getJsonFileContent } from '../../lib/utils.js'; | ||
|
|
||
| export class ActorCalculateMemoryCommand extends ApifyCommand<typeof ActorCalculateMemoryCommand> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this magic? Why is it basically extending itself? 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used for types of args and flags, since they're different per command 😄
maybe @vladfrangu would add more info here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not extending itself but rather referencing itself in the extension class -> allows us to infer the types of the flags and args for this.args/this.flags
| private readonly DEFAULT_INPUT_PATH = 'storage/key_value_stores/default/INPUT.json'; | ||
|
|
||
| static override description = | ||
| `Calculates the actor’s dynamic memory usage based on a memory expression from actor.json, input data, and run options.`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| `Calculates the actor’s dynamic memory usage based on a memory expression from actor.json, input data, and run options.`; | |
| `Calculates the Actor’s dynamic memory usage based on a memory expression from actor.json, input data, and run options.`; |
| description: 'Actor build version or build tag to evaluate the expression with.', | ||
| required: false, | ||
| }), | ||
| timeoutSecs: Flags.integer({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can imagine this being slightly confusing (because it is confusing to me 😅 ).
Intuitively, why should restartOnError affect memory?
The only slightly related fields from run options are maxTotalChargeUsd or memoryMbytes but that latter one is not even here. And I forgot what it was supposed to do. Can you remind me? And is it intentional that it is not here?
The bottom line is that I find it really weird to have these top level params that have zero effect and are there simply because they are part of Actor run options.
I thought about dropping the options altogether from the memory calculation (that'd be a bigger change) but I guess maxTotalChargeUsd is useful. And we don't want to get into the business of deciding what is useful and what isn't.
I just wonder, is it possible to provide these also as a JSON file / object? Which would be probably inconsistent with other CLI commands but would make more intuitive sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I was also thinking about that as well. I'd not rather provide options as JSON, probably not many devs would need it.
The idea behind providing path to input.json is because new Actors already have default input and by providing defaultMemoryMbytes in actor.json devs can call apify actor calculate-memory without any flags.
Regarding memoryMbytes and diskMbytes I thought it would be very confusing and just removed them from flags, we can do the same for restartOnError.
IMHO it's cleaner for me to provide options as flags from DX point of view, but if you feel like it would be better to provide them in additional file I can do that as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea behind providing path to input.json is because new Actors already have default input and by providing defaultMemoryMbytes in actor.json devs can call apify actor calculate-memory without any flags.
Totally agree here 👍
Regarding memoryMbytes
Remind me, what is memoryMbytes doing?
IMHO it's cleaner for me to provide options as flags from DX point of view, but if you feel like it would be better to provide them in additional file I can do that as well.
From DX, I'd argue that providing a param that doesn't do anything is at least a little weird 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remind me, what is memoryMbytes doing?
It's defaultRunOptions.memoryMbytes - fixed memory provided by the developer in settings.
From DX, I'd argue that providing a param that doesn't do anything is at least a little weird 😄
Makes sense, I removed restartOnError from flags and left a comment explaining this
| /** | ||
| * Helper to load the `defaultMemoryMbytes` expression from actor.json. | ||
| */ | ||
| private async getExpressionFromConfig(): Promise<string | undefined> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) There seems to be a lot of boilerplate and error handling for what seems to be a rather generic functionality: get Actor config 🤔 There is no existing utility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well we use one functionality useActorConfig, that already exists 😄 I'd wait for review from @vladfrangu here, I just couldn't find it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the existing utility, and it handles multiple cases (invalid config file, non-existing file or existing file).
I do think you could probably skip the exists == false check tho... It will just return undefined -> bubble up top
| } | ||
|
|
||
| if (localConfigResult.isOkAnd((cfg) => cfg.exists === false)) { | ||
| throw new Error( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why throwing here but using process.exitCode above? 🤔
I assume it's a CLI thing but I don't follow the reasoning 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, again let's wait for best practices from tooling team. I was not sure about this one as well 🙃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we set process.exitCode in certain cases for those who use shells and $? to differentiate exit reasons...
RE: why throw -> I honestly want to move away from throws and more towards handling errors in the cmd level (building one generic catch handler for all errors is rough, unless we build custom error classes [but at that point just implement the nicer error msgs where they're thrown])
| expect(lastErrorMessage()).toMatch(/Calculated memory: 512 MB/); | ||
| }); | ||
|
|
||
| it('should calculate memory using expression from .actor.json', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| it('should calculate memory using expression from .actor.json', async () => { | |
| it('should calculate memory using expression from actor.json', async () => { |
| it('should calculate memory using defaultMemoryMbytes flag', async () => { | ||
| await testRunCommand(ActorCalculateMemoryCommand, { | ||
| flags_input: 'INPUT.json', | ||
| flags_defaultMemoryMbytes: '512', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's confusing that when providing the expression via a flag, you use just a constant value, but when providing from actor.json, you use an actual expression. That creates the impression that it works differently but as far as I can tell, it doesn't?
I'd just use the same thing in both (actually all, dtto below) cases.
What you can do is something like:
const START_URLS_LENGTH_BASED_MEMORY_EXPRESSION = 'get(input, 'startUrls.length') * 1024';And then reuse it across all cases. That will make it super clear that the expression itself doesn't really matter and remains constant across cases.
|
@danpoletaev Can you also include another person from our team for the sake of our bus factor on the feature? 🙏 |
vladfrangu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can help out with the tests if they're still causing you lots of issues 🫡, just lmk!
| export class ActorCalculateMemoryCommand extends ApifyCommand<typeof ActorCalculateMemoryCommand> { | ||
| static override name = 'calculate-memory' as const; | ||
|
|
||
| private readonly DEFAULT_INPUT_PATH = 'storage/key_value_stores/default/INPUT.json'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use path.join + getLocalKeyValueStorePath
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this can be moved to a constant outside the class
| input: Flags.string({ | ||
| description: 'Path to the input JSON file used for the calculation.', | ||
| required: false, | ||
| default: 'storage/key_value_stores/default/INPUT.json', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either reference the path via this. (/ const var name if moved), or don't provide a default at all and set it in code when destructuring the flag (const { input = DEFAULT_INPUT_PATH } = this.flags)
| required: false, | ||
| }), | ||
| restartOnError: Flags.boolean({ | ||
| description: 'Whether to restart the actor on error.', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| description: 'Whether to restart the actor on error.', | |
| description: 'Whether the Actor will be restarted on error.', |
| /** | ||
| * Helper to load the `defaultMemoryMbytes` expression from actor.json. | ||
| */ | ||
| private async getExpressionFromConfig(): Promise<string | undefined> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the existing utility, and it handles multiple cases (invalid config file, non-existing file or existing file).
I do think you could probably skip the exists == false check tho... It will just return undefined -> bubble up top
| } | ||
|
|
||
| if (localConfigResult.isOkAnd((cfg) => cfg.exists === false)) { | ||
| throw new Error( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we set process.exitCode in certain cases for those who use shells and $? to differentiate exit reasons...
RE: why throw -> I honestly want to move away from throws and more towards handling errors in the cmd level (building one generic catch handler for all errors is rough, unless we build custom error classes [but at that point just implement the nicer error msgs where they're thrown])
| beforeEach(async () => { | ||
| await beforeAllCalls(); | ||
| toggleCwdBetweenFullAndParentPath(); | ||
| }); | ||
|
|
||
| afterEach(async () => { | ||
| await afterAllCalls(); | ||
| }); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you should call toggle every time, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point 👍
| input: inputJson, | ||
| runOptions, | ||
| }); | ||
| success({ message: `Calculated memory: ${result} MB` }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
provide the stdout: true option (and adjust the test accordingly) 🙏
| }); | ||
|
|
||
| it('should fallback to default input path if input flag is not provided', async () => { | ||
| const defaultInputPath = joinPath('storage/key_value_stores/default'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use multiple args passed to this function, not a full path, for OS compat
| it('should fail when default memory is not provided and actor.json is missing', async () => { | ||
| await testRunCommand(ActorCalculateMemoryCommand, {}); | ||
|
|
||
| expect(lastErrorMessage()).toMatch(/actor.json not found at/); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| expect(lastErrorMessage()).toMatch(/actor.json not found at/); | |
| expect(lastErrorMessage()).toMatch(/actor\.json not found at/); |
| }); | ||
|
|
||
| // INPUT.json does not exist, so input.startUrls.length will be undefined, defaulting to 1 | ||
| expect(lastErrorMessage()).toMatch(/Calculated memory: 1024 MB/); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would assert only the memory requested, not the whole msg
tobice
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
| ```sh | ||
| DESCRIPTION | ||
| Creates an Actor project from a template in a new directory. | ||
| Creates an Actor project from a template in a new directory. The command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why these changes? They seem unrelated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was caused by automatic doc generation, so I think it is fine
This PR adds new command to cli to allow developers to test their
defaultMemoryMbytesexpressions. Those expressions are used to dynamically calculate memory of the run based on theinputandrunOptions.E.g. of usage
Full specification in Notion 👇
https://www.notion.so/apify/Dynamic-default-Actor-memory-WIP-293f39950a2280b9aa34fa9cd07f52a4?source=copy_link