Skip to content

Conversation

@danpoletaev
Copy link
Contributor

@danpoletaev danpoletaev commented Dec 4, 2025

This PR adds new command to cli to allow developers to test their defaultMemoryMbytes expressions. Those expressions are used to dynamically calculate memory of the run based on the input and runOptions.

E.g. of usage

apify actor calculate-memory --input ./input.json --maxTotalChargeUsd=25 --timeoutSecs=360

Full specification in Notion 👇
https://www.notion.so/apify/Dynamic-default-Actor-memory-WIP-293f39950a2280b9aa34fa9cd07f52a4?source=copy_link

@danpoletaev danpoletaev self-assigned this Dec 4, 2025
@github-actions github-actions bot added t-core-services Issues with this label are in the ownership of the core services team. tested Temporary label used only programatically for some analytics. labels Dec 4, 2025
writeFileSync(joinPath(LOCAL_CONFIG_PATH), JSON.stringify(actorJson, null, '\t'), { flag: 'w' });
};

describe('apify actor calculate-memory', () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vladfrangu when I start it one by one it works perfectly, but when I start it as as a whole set it fails, can you help me to fix that, please? 🙃

It seems there's something wrong with path or existence of the file. I was a bit confused with that useTempPath mocks...

@danpoletaev danpoletaev requested a review from tobice December 4, 2025 15:26
Copy link
Contributor

@tobice tobice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👏 Left some suggestions though.

import { error, info, success } from '../../lib/outputs.js';
import { getJsonFileContent } from '../../lib/utils.js';

export class ActorCalculateMemoryCommand extends ApifyCommand<typeof ActorCalculateMemoryCommand> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this magic? Why is it basically extending itself? 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used for types of args and flags, since they're different per command 😄

maybe @vladfrangu would add more info here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not extending itself but rather referencing itself in the extension class -> allows us to infer the types of the flags and args for this.args/this.flags

private readonly DEFAULT_INPUT_PATH = 'storage/key_value_stores/default/INPUT.json';

static override description =
`Calculates the actor’s dynamic memory usage based on a memory expression from actor.json, input data, and run options.`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`Calculates the actor’s dynamic memory usage based on a memory expression from actor.json, input data, and run options.`;
`Calculates the Actor’s dynamic memory usage based on a memory expression from actor.json, input data, and run options.`;

description: 'Actor build version or build tag to evaluate the expression with.',
required: false,
}),
timeoutSecs: Flags.integer({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can imagine this being slightly confusing (because it is confusing to me 😅 ).

Intuitively, why should restartOnError affect memory?

The only slightly related fields from run options are maxTotalChargeUsd or memoryMbytes but that latter one is not even here. And I forgot what it was supposed to do. Can you remind me? And is it intentional that it is not here?

The bottom line is that I find it really weird to have these top level params that have zero effect and are there simply because they are part of Actor run options.

I thought about dropping the options altogether from the memory calculation (that'd be a bigger change) but I guess maxTotalChargeUsd is useful. And we don't want to get into the business of deciding what is useful and what isn't.

I just wonder, is it possible to provide these also as a JSON file / object? Which would be probably inconsistent with other CLI commands but would make more intuitive sense.

Copy link
Contributor Author

@danpoletaev danpoletaev Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I was also thinking about that as well. I'd not rather provide options as JSON, probably not many devs would need it.

The idea behind providing path to input.json is because new Actors already have default input and by providing defaultMemoryMbytes in actor.json devs can call apify actor calculate-memory without any flags.

Regarding memoryMbytes and diskMbytes I thought it would be very confusing and just removed them from flags, we can do the same for restartOnError.

IMHO it's cleaner for me to provide options as flags from DX point of view, but if you feel like it would be better to provide them in additional file I can do that as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind providing path to input.json is because new Actors already have default input and by providing defaultMemoryMbytes in actor.json devs can call apify actor calculate-memory without any flags.

Totally agree here 👍

Regarding memoryMbytes

Remind me, what is memoryMbytes doing?

IMHO it's cleaner for me to provide options as flags from DX point of view, but if you feel like it would be better to provide them in additional file I can do that as well.

From DX, I'd argue that providing a param that doesn't do anything is at least a little weird 😄

Copy link
Contributor Author

@danpoletaev danpoletaev Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remind me, what is memoryMbytes doing?

It's defaultRunOptions.memoryMbytes - fixed memory provided by the developer in settings.

From DX, I'd argue that providing a param that doesn't do anything is at least a little weird 😄

Makes sense, I removed restartOnError from flags and left a comment explaining this

/**
* Helper to load the `defaultMemoryMbytes` expression from actor.json.
*/
private async getExpressionFromConfig(): Promise<string | undefined> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) There seems to be a lot of boilerplate and error handling for what seems to be a rather generic functionality: get Actor config 🤔 There is no existing utility?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well we use one functionality useActorConfig, that already exists 😄 I'd wait for review from @vladfrangu here, I just couldn't find it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the existing utility, and it handles multiple cases (invalid config file, non-existing file or existing file).

I do think you could probably skip the exists == false check tho... It will just return undefined -> bubble up top

}

if (localConfigResult.isOkAnd((cfg) => cfg.exists === false)) {
throw new Error(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why throwing here but using process.exitCode above? 🤔

I assume it's a CLI thing but I don't follow the reasoning 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, again let's wait for best practices from tooling team. I was not sure about this one as well 🙃

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we set process.exitCode in certain cases for those who use shells and $? to differentiate exit reasons...

RE: why throw -> I honestly want to move away from throws and more towards handling errors in the cmd level (building one generic catch handler for all errors is rough, unless we build custom error classes [but at that point just implement the nicer error msgs where they're thrown])

expect(lastErrorMessage()).toMatch(/Calculated memory: 512 MB/);
});

it('should calculate memory using expression from .actor.json', async () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
it('should calculate memory using expression from .actor.json', async () => {
it('should calculate memory using expression from actor.json', async () => {

it('should calculate memory using defaultMemoryMbytes flag', async () => {
await testRunCommand(ActorCalculateMemoryCommand, {
flags_input: 'INPUT.json',
flags_defaultMemoryMbytes: '512',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's confusing that when providing the expression via a flag, you use just a constant value, but when providing from actor.json, you use an actual expression. That creates the impression that it works differently but as far as I can tell, it doesn't?

I'd just use the same thing in both (actually all, dtto below) cases.

What you can do is something like:

const START_URLS_LENGTH_BASED_MEMORY_EXPRESSION = 'get(input, 'startUrls.length') * 1024';

And then reuse it across all cases. That will make it super clear that the expression itself doesn't really matter and remains constant across cases.

@tobice
Copy link
Contributor

tobice commented Dec 5, 2025

@danpoletaev Can you also include another person from our team for the sake of our bus factor on the feature? 🙏

Copy link
Member

@vladfrangu vladfrangu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can help out with the tests if they're still causing you lots of issues 🫡, just lmk!

export class ActorCalculateMemoryCommand extends ApifyCommand<typeof ActorCalculateMemoryCommand> {
static override name = 'calculate-memory' as const;

private readonly DEFAULT_INPUT_PATH = 'storage/key_value_stores/default/INPUT.json';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use path.join + getLocalKeyValueStorePath

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this can be moved to a constant outside the class

input: Flags.string({
description: 'Path to the input JSON file used for the calculation.',
required: false,
default: 'storage/key_value_stores/default/INPUT.json',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either reference the path via this. (/ const var name if moved), or don't provide a default at all and set it in code when destructuring the flag (const { input = DEFAULT_INPUT_PATH } = this.flags)

required: false,
}),
restartOnError: Flags.boolean({
description: 'Whether to restart the actor on error.',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: 'Whether to restart the actor on error.',
description: 'Whether the Actor will be restarted on error.',

/**
* Helper to load the `defaultMemoryMbytes` expression from actor.json.
*/
private async getExpressionFromConfig(): Promise<string | undefined> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the existing utility, and it handles multiple cases (invalid config file, non-existing file or existing file).

I do think you could probably skip the exists == false check tho... It will just return undefined -> bubble up top

}

if (localConfigResult.isOkAnd((cfg) => cfg.exists === false)) {
throw new Error(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we set process.exitCode in certain cases for those who use shells and $? to differentiate exit reasons...

RE: why throw -> I honestly want to move away from throws and more towards handling errors in the cmd level (building one generic catch handler for all errors is rough, unless we build custom error classes [but at that point just implement the nicer error msgs where they're thrown])

Comment on lines 30 to 38
beforeEach(async () => {
await beforeAllCalls();
toggleCwdBetweenFullAndParentPath();
});

afterEach(async () => {
await afterAllCalls();
});

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you should call toggle every time, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point 👍

input: inputJson,
runOptions,
});
success({ message: `Calculated memory: ${result} MB` });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provide the stdout: true option (and adjust the test accordingly) 🙏

});

it('should fallback to default input path if input flag is not provided', async () => {
const defaultInputPath = joinPath('storage/key_value_stores/default');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use multiple args passed to this function, not a full path, for OS compat

it('should fail when default memory is not provided and actor.json is missing', async () => {
await testRunCommand(ActorCalculateMemoryCommand, {});

expect(lastErrorMessage()).toMatch(/actor.json not found at/);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
expect(lastErrorMessage()).toMatch(/actor.json not found at/);
expect(lastErrorMessage()).toMatch(/actor\.json not found at/);

});

// INPUT.json does not exist, so input.startUrls.length will be undefined, defaulting to 1
expect(lastErrorMessage()).toMatch(/Calculated memory: 1024 MB/);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would assert only the memory requested, not the whole msg

@danpoletaev danpoletaev requested a review from TC-MO as a code owner December 11, 2025 09:40
Copy link
Contributor

@tobice tobice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

```sh
DESCRIPTION
Creates an Actor project from a template in a new directory.
Creates an Actor project from a template in a new directory. The command
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these changes? They seem unrelated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was caused by automatic doc generation, so I think it is fine

@danpoletaev danpoletaev merged commit 78eeee7 into master Dec 16, 2025
21 of 22 checks passed
@danpoletaev danpoletaev deleted the feat/calculate-run-dynamic-memory branch December 16, 2025 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

t-core-services Issues with this label are in the ownership of the core services team. tested Temporary label used only programatically for some analytics.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants