diff --git a/CHANGELOG.md b/CHANGELOG.md
index ad8173b99..de82b693a 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,6 +3,7 @@
## [v3.6.5.dev0]
### Added
+- [GCP Compute Engine] Added new GCP Compute Engine standalone backend
- [Core] Added support for variable-length parameters in functions passed to the executor.
### Changed
@@ -15,6 +16,7 @@
### Fixed
- [K8s] Fixed default runtime builds impacted by Debian Buster end-of-life.
- [GCP Cloud Run] Added Artifact Registry (`pkg.dev`) runtime deployment support
+- [K8s] Run default runtime image as non-root user (uid 1000) (#1469)
## [v3.6.4]
diff --git a/config/README.md b/config/README.md
index dbf4fd1d7..958a3faae 100644
--- a/config/README.md
+++ b/config/README.md
@@ -69,6 +69,7 @@ Storage Backends
- [IBM Virtual Private Cloud](../docs/source/compute_config/ibm_vpc.md)
- [AWS Elastic Compute Cloud (EC2)](../docs/source/compute_config/aws_ec2.md)
- [Azure Virtual Machines](../docs/source/compute_config/azure_vms.md)
+- [Google Compute Engine](../docs/source/compute_config/gcp_compute_engine.md)
diff --git a/config/config_template.yaml b/config/config_template.yaml
index e5b1de723..c960891af 100644
--- a/config/config_template.yaml
+++ b/config/config_template.yaml
@@ -137,6 +137,7 @@
# Google Cloud Platform – credentials shared by all gcp_* sections
# =============================================================================
#gcp:
+ #project_name:
#region: # e.g. us-east1
#credentials_path:
@@ -172,6 +173,26 @@
#region: # Falls back to gcp.region
#storage_bucket: # Auto-created if not provided
+# Google Compute Engine – standalone compute
+#gcp_compute_engine:
+ #project_name: # Falls back to gcp.project_name
+ #zone: # e.g. us-east1-b. Mandatory
+ #region: # Falls back to gcp.region; derived from zone if omitted
+ #service_account:
+ #network_name: # Lithops creates a VPC if not provided
+ #subnet_name:
+ #source_image: projects/ubuntu-os-cloud/global/images/family/ubuntu-2404-lts-amd64
+ #instance_name: # Mandatory in consume mode
+ #master_instance_type: e2-small
+ #worker_instance_type: e2-standard-2
+ #ssh_username: ubuntu
+ #ssh_password:
+ #ssh_key_filename: ~/.ssh/id_rsa
+ #request_spot_instances: False
+ #delete_on_dismantle: True
+ #max_workers: 100
+ #worker_processes: AUTO # Default: number of CPUs of the worker VM
+
# =============================================================================
# Microsoft Azure – credentials shared by all azure_* sections
@@ -460,7 +481,7 @@
# =============================================================================
-# Standalone shared settings (applies to ibm_vpc, aws_ec2, azure_vms, vm)
+# Standalone shared settings (applies to ibm_vpc, aws_ec2, azure_vms, gcp_compute_engine, vm)
# =============================================================================
#standalone:
#runtime: python3
diff --git a/docs/source/compute_backends.rst b/docs/source/compute_backends.rst
index 0148a6615..70d38caef 100644
--- a/docs/source/compute_backends.rst
+++ b/docs/source/compute_backends.rst
@@ -44,4 +44,4 @@ Compute Backends
compute_config/ibm_vpc.md
compute_config/aws_ec2.md
compute_config/azure_vms.md
- compute_config/gcp_compute_engie.md
+ compute_config/gcp_compute_engine.md
diff --git a/docs/source/compute_config/gcp_compute_engie.md b/docs/source/compute_config/gcp_compute_engine.md
similarity index 52%
rename from docs/source/compute_config/gcp_compute_engie.md
rename to docs/source/compute_config/gcp_compute_engine.md
index 676b8fb7d..04c7fc09e 100644
--- a/docs/source/compute_config/gcp_compute_engie.md
+++ b/docs/source/compute_config/gcp_compute_engine.md
@@ -2,19 +2,27 @@
The GCP Compute Engine backend of Lithops can provide a serverless user experience on top of GCE where Lithops creates new Virtual Machines (VMs) dynamically at runtime and scales Lithops jobs against them (create and reuse modes). Alternatively Lithops can start and stop an existing VM instance (consume mode).
-The backend key is `gcp_compute_engie` (matches the Lithops module name).
+The backend key is `gcp_compute_engine` (matches the Lithops module name).
## Choose an operating system image for the VM
-Any VM needs an operating system image. By default Lithops uses Ubuntu 24.04 (`ubuntu-2404-lts-amd64`). Lithops installs required dependencies on the VM on first use (this can take a few minutes).
+Any VM needs an operating system image. By default Lithops uses Ubuntu 24.04 (`projects/ubuntu-os-cloud/global/images/family/ubuntu-2404-lts-amd64`). Lithops installs required dependencies on the VM on first use (this can take a few minutes).
+
+For faster startups, build a pre-configured custom image (see [runtime/gcp_compute_engine](https://github.com/lithops-cloud/lithops/tree/master/runtime/gcp_compute_engine)):
+
+```bash
+lithops image build -b gcp_compute_engine
+```
+
+This creates `lithops-ubuntu-2404-lts-amd64-server` in your project; Lithops uses it automatically when present.
To list available images:
```bash
-lithops image list -b gcp_compute_engie
+lithops image list -b gcp_compute_engine
```
-Use the **Image ID** column as `source_image` in your config.
+Use the **Image ID** column as `source_image` in your config when using a custom image name.
## Installation
@@ -54,12 +62,12 @@ gcloud projects add-iam-policy-binding \
```yaml
lithops:
- backend: gcp_compute_engie
+ backend: gcp_compute_engine
gcp:
credentials_path:
-gcp_compute_engie:
+gcp_compute_engine:
project_name:
zone:
exec_mode: reuse
@@ -78,29 +86,29 @@ Lithops attaches the service account from `credentials_path` to master and worke
|Group|Key|Default|Mandatory|Additional info|
|---|---|---|---|---|
-|gcp_compute_engie | project_name | |yes | GCP project ID |
-|gcp_compute_engie | zone | |yes | Compute Engine zone, for example `us-east1-b` |
-|gcp_compute_engie | region | derived from zone |no | Region used for subnet and NAT |
-|gcp_compute_engie | service_account | |no | Service account email attached to VMs. Default: `client_email` from `credentials_path` |
-|gcp_compute_engie | network_name | |no | Existing VPC name. If not provided, Lithops creates a new network |
-|gcp_compute_engie | subnet_name | |no | Existing subnet name when using a custom VPC |
-|gcp_compute_engie | source_image | ubuntu-2404-lts-amd64 |no | Boot image reference |
-|gcp_compute_engie | master_instance_type | e2-small |no | Master VM machine type |
-|gcp_compute_engie | worker_instance_type | e2-standard-2 |no | Worker VM machine type |
-|gcp_compute_engie | ssh_username | ubuntu |no | Username to access the VM |
-|gcp_compute_engie | ssh_password | |no | Password for worker VMs. If not provided, it is created randomly |
-|gcp_compute_engie | ssh_key_filename | ~/.ssh/id_rsa |no | SSH private key for the master VM. If not provided, Lithops creates one |
-|gcp_compute_engie | request_spot_instances | False |no | Use Spot VMs for workers |
-|gcp_compute_engie | delete_on_dismantle | True |no | Delete worker VMs when stopped. Master VM is never deleted when stopped |
-|gcp_compute_engie | max_workers | 100 |no | Max number of workers per `FunctionExecutor()` |
-|gcp_compute_engie | worker_processes | AUTO |no | Parallel Lithops processes per worker. Default: CPUs of `worker_instance_type` |
-|gcp_compute_engie | runtime | python3 |no | Runtime name. Default: python3 on the VM |
-|gcp_compute_engie | auto_dismantle | True |no | If False, VMs are not stopped automatically |
-|gcp_compute_engie | soft_dismantle_timeout | 300 |no | Seconds to stop the VM after a job **completed** |
-|gcp_compute_engie | hard_dismantle_timeout | 3600 |no | Seconds to stop the VM after a job **started** |
-|gcp_compute_engie | exec_mode | reuse |no | One of: **consume**, **create** or **reuse** |
-|gcp_compute_engie | extra_apt_packages | [] |no | Extra apt packages on master/worker VMs during setup |
-|gcp_compute_engie | extra_python_packages | [] |no | Extra pip packages on master/worker VMs after Lithops |
+|gcp_compute_engine | project_name | |yes | GCP project ID |
+|gcp_compute_engine | zone | |yes | Compute Engine zone, for example `us-east1-b` |
+|gcp_compute_engine | region | derived from zone |no | Region used for subnet and NAT |
+|gcp_compute_engine | service_account | |no | Service account email attached to VMs. Default: `client_email` from `credentials_path` |
+|gcp_compute_engine | network_name | |no | Existing VPC name. If not provided, Lithops creates a new network |
+|gcp_compute_engine | subnet_name | |no | Existing subnet name when using a custom VPC |
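+|gcp_compute_engine | source_image | projects/ubuntu-os-cloud/global/images/family/ubuntu-2404-lts-amd64 |no | Boot image reference. With the default, Lithops uses `lithops-ubuntu-2404-lts-amd64-server` instead when that image exists in the project |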
+|gcp_compute_engine | master_instance_type | e2-small |no | Master VM machine type |
+|gcp_compute_engine | worker_instance_type | e2-standard-2 |no | Worker VM machine type |
+|gcp_compute_engine | ssh_username | ubuntu |no | Username to access the VM |
+|gcp_compute_engine | ssh_password | |no | Password for worker VMs. If not provided, it is created randomly |
+|gcp_compute_engine | ssh_key_filename | ~/.ssh/id_rsa |no | SSH private key for the master VM. If not provided, Lithops creates one |
+|gcp_compute_engine | request_spot_instances | False |no | Use Spot VMs for workers |
+|gcp_compute_engine | delete_on_dismantle | True |no | Delete worker VMs when stopped. Master VM is never deleted when stopped |
+|gcp_compute_engine | max_workers | 100 |no | Max number of workers per `FunctionExecutor()` |
+|gcp_compute_engine | worker_processes | AUTO |no | Parallel Lithops processes per worker. Default: CPUs of `worker_instance_type` |
+|gcp_compute_engine | runtime | python3 |no | Runtime name. Default: python3 on the VM |
+|gcp_compute_engine | auto_dismantle | True |no | If False, VMs are not stopped automatically |
+|gcp_compute_engine | soft_dismantle_timeout | 300 |no | Seconds to stop the VM after a job **completed** |
+|gcp_compute_engine | hard_dismantle_timeout | 3600 |no | Seconds to stop the VM after a job **started** |
+|gcp_compute_engine | exec_mode | reuse |no | One of: **consume**, **create** or **reuse** |
+|gcp_compute_engine | extra_apt_packages | [] |no | Extra apt packages on master/worker VMs during setup |
+|gcp_compute_engine | extra_python_packages | [] |no | Extra pip packages on master/worker VMs after Lithops |
## Consume mode
@@ -110,12 +118,12 @@ In this mode, Lithops uses an existing VM. The VM must be reachable by SSH and h
```yaml
lithops:
- backend: gcp_compute_engie
+ backend: gcp_compute_engine
gcp:
credentials_path:
-gcp_compute_engie:
+gcp_compute_engine:
exec_mode: consume
project_name:
zone:
@@ -126,19 +134,19 @@ gcp_compute_engie:
|Group|Key|Default|Mandatory|Additional info|
|---|---|---|---|---|
-|gcp_compute_engie | instance_name | |yes | Existing VM instance name |
-|gcp_compute_engie | project_name | |yes | GCP project ID |
-|gcp_compute_engie | zone | |yes | Compute Engine zone |
-|gcp_compute_engie | ssh_username | ubuntu |no | Username to access the VM |
-|gcp_compute_engie | ssh_key_filename | ~/.ssh/id_rsa |no | Path to the SSH private key |
-|gcp_compute_engie | worker_processes | AUTO |no | Parallel Lithops processes per worker |
+|gcp_compute_engine | instance_name | |yes | Existing VM instance name |
+|gcp_compute_engine | project_name | |yes | GCP project ID |
+|gcp_compute_engine | zone | |yes | Compute Engine zone |
+|gcp_compute_engine | ssh_username | ubuntu |no | Username to access the VM |
+|gcp_compute_engine | ssh_key_filename | ~/.ssh/id_rsa |no | Path to the SSH private key |
+|gcp_compute_engine | worker_processes | AUTO |no | Parallel Lithops processes per worker |
## Test Lithops
Once you have your compute and storage backends configured, you can run a hello world function with:
```bash
-lithops hello -b gcp_compute_engie -s gcp_storage
+lithops hello -b gcp_compute_engine -s gcp_storage
```
## Viewing the execution logs
@@ -158,7 +166,7 @@ All VMs, including the master, are automatically stopped after a configurable ti
You can open an SSH session to the master VM with:
```bash
-lithops attach -b gcp_compute_engie
+lithops attach -b gcp_compute_engine
```
The master and worker VMs store Lithops service logs in `/tmp/lithops-root/*-service.log`.
@@ -166,25 +174,25 @@ The master and worker VMs store Lithops service logs in `/tmp/lithops-root/*-ser
To list available workers:
```bash
-lithops worker list -b gcp_compute_engie
+lithops worker list -b gcp_compute_engine
```
To list submitted jobs:
```bash
-lithops job list -b gcp_compute_engie
+lithops job list -b gcp_compute_engine
```
To delete workers only:
```bash
-lithops clean -b gcp_compute_engie -s gcp_storage
+lithops clean -b gcp_compute_engine -s gcp_storage
```
To delete workers, the master VM, and Lithops-created network resources:
```bash
-lithops clean -b gcp_compute_engie -s gcp_storage --all
+lithops clean -b gcp_compute_engine -s gcp_storage --all
```
## Architecture diagram
diff --git a/docs/source/configuration.rst b/docs/source/configuration.rst
index 4b6be8239..a0bbca4ac 100644
--- a/docs/source/configuration.rst
+++ b/docs/source/configuration.rst
@@ -37,7 +37,7 @@ Choose your compute and storage engines from the table below:
|| `IBM Virtual Private Cloud `_ || |
|| `AWS Elastic Compute Cloud (EC2) `_ || |
|| `Azure Virtual Machines `_ || |
-|| `Google Compute Engine `_ || |
+|| `Google Compute Engine `_ || |
+--------------------------------------------------------------------+--------------------------------------------------------------------+
Configuration File
diff --git a/docs/source/execution_modes.rst b/docs/source/execution_modes.rst
index 737300a41..8ea31a0a8 100644
--- a/docs/source/execution_modes.rst
+++ b/docs/source/execution_modes.rst
@@ -83,4 +83,4 @@ underlying infrastructure.
fexec = lithops.StandaloneExecutor()
-- Available backends: `IBM Virtual Private Cloud `_, `AWS Elastic Compute Cloud (EC2) `_, `Azure Virtual Machines `_, `Google Compute Engine `_, `Virtual Machine `_
+- Available backends: `IBM Virtual Private Cloud `_, `AWS Elastic Compute Cloud (EC2) `_, `Azure Virtual Machines `_, `Google Compute Engine `_, `Virtual Machine `_
diff --git a/lithops/constants.py b/lithops/constants.py
index e1b29cad4..515899975 100644
--- a/lithops/constants.py
+++ b/lithops/constants.py
@@ -120,6 +120,6 @@
'ibm_vpc',
'aws_ec2',
'azure_vms',
- 'gcp_compute_engie',
+ 'gcp_compute_engine',
'vm'
]
diff --git a/lithops/standalone/backends/gcp_compute_engie/__init__.py b/lithops/standalone/backends/gcp_compute_engie/__init__.py
deleted file mode 100644
index d7d7fb0f1..000000000
--- a/lithops/standalone/backends/gcp_compute_engie/__init__.py
+++ /dev/null
@@ -1,3 +0,0 @@
-from .gcp_compute_engie import GCPComputeEngieBackend as StandaloneBackend
-
-__all__ = ['StandaloneBackend']
diff --git a/lithops/standalone/backends/gcp_compute_engine/__init__.py b/lithops/standalone/backends/gcp_compute_engine/__init__.py
new file mode 100644
index 000000000..3a749babe
--- /dev/null
+++ b/lithops/standalone/backends/gcp_compute_engine/__init__.py
@@ -0,0 +1,3 @@
+from .gcp_compute_engine import GCPComputeEngineBackend as StandaloneBackend
+
+__all__ = ['StandaloneBackend']
diff --git a/lithops/standalone/backends/gcp_compute_engie/config.py b/lithops/standalone/backends/gcp_compute_engine/config.py
similarity index 66%
rename from lithops/standalone/backends/gcp_compute_engie/config.py
rename to lithops/standalone/backends/gcp_compute_engine/config.py
index 5e61af1aa..f5b593491 100644
--- a/lithops/standalone/backends/gcp_compute_engie/config.py
+++ b/lithops/standalone/backends/gcp_compute_engine/config.py
@@ -43,48 +43,48 @@
def load_config(config_data):
- if 'gcp_compute_engie' not in config_data or not config_data['gcp_compute_engie']:
- raise Exception("'gcp_compute_engie' section is mandatory in the configuration")
+ if not config_data.get('gcp_compute_engine'):
+ raise Exception("'gcp_compute_engine' section is mandatory in the configuration")
if 'gcp' not in config_data:
config_data['gcp'] = {}
- temp = copy.deepcopy(config_data['gcp_compute_engie'])
- config_data['gcp_compute_engie'].update(config_data['gcp'])
- config_data['gcp_compute_engie'].update(temp)
+ temp = copy.deepcopy(config_data['gcp_compute_engine'])
+ config_data['gcp_compute_engine'].update(config_data['gcp'])
+ config_data['gcp_compute_engine'].update(temp)
- if 'credentials_path' in config_data['gcp_compute_engie']:
- config_data['gcp_compute_engie']['credentials_path'] = os.path.expanduser(
- config_data['gcp_compute_engie']['credentials_path']
+ if 'credentials_path' in config_data['gcp_compute_engine']:
+ config_data['gcp_compute_engine']['credentials_path'] = os.path.expanduser(
+ config_data['gcp_compute_engine']['credentials_path']
)
for key in DEFAULT_CONFIG_KEYS:
- if key not in config_data['gcp_compute_engie']:
- config_data['gcp_compute_engie'][key] = DEFAULT_CONFIG_KEYS[key]
+ if key not in config_data['gcp_compute_engine']:
+ config_data['gcp_compute_engine'][key] = DEFAULT_CONFIG_KEYS[key]
if 'standalone' not in config_data or config_data['standalone'] is None:
config_data['standalone'] = {}
for key in SA_DEFAULT_CONFIG_KEYS:
- if key in config_data['gcp_compute_engie']:
- config_data['standalone'][key] = config_data['gcp_compute_engie'].pop(key)
+ if key in config_data['gcp_compute_engine']:
+ config_data['standalone'][key] = config_data['gcp_compute_engine'].pop(key)
elif key not in config_data['standalone']:
config_data['standalone'][key] = SA_DEFAULT_CONFIG_KEYS[key]
if config_data['standalone']['exec_mode'] == 'consume':
params_to_check = MANDATORY_PARAMETERS_1
- config_data['gcp_compute_engie']['max_workers'] = 1
+ config_data['gcp_compute_engine']['max_workers'] = 1
else:
params_to_check = MANDATORY_PARAMETERS_2
for param in params_to_check:
- if param not in config_data['gcp_compute_engie']:
- msg = f"'{param}' is mandatory in the 'gcp_compute_engie' section of the configuration"
+ if param not in config_data['gcp_compute_engine']:
+ msg = f"'{param}' is mandatory in the 'gcp_compute_engine' or 'gcp' section of the configuration"
raise Exception(msg)
- if 'region' not in config_data['gcp_compute_engie']:
- zone = config_data['gcp_compute_engie']['zone']
- config_data['gcp_compute_engie']['region'] = '-'.join(zone.split('-')[:-1])
+ if 'region' not in config_data['gcp_compute_engine']:
+ zone = config_data['gcp_compute_engine']['zone']
+ config_data['gcp_compute_engine']['region'] = '-'.join(zone.split('-')[:-1])
- if 'region' not in config_data['gcp'] and 'region' in config_data['gcp_compute_engie']:
- config_data['gcp']['region'] = config_data['gcp_compute_engie']['region']
+ if 'region' not in config_data['gcp'] and 'region' in config_data['gcp_compute_engine']:
+ config_data['gcp']['region'] = config_data['gcp_compute_engine']['region']
diff --git a/lithops/standalone/backends/gcp_compute_engie/gcp_compute_engie.py b/lithops/standalone/backends/gcp_compute_engine/gcp_compute_engine.py
similarity index 73%
rename from lithops/standalone/backends/gcp_compute_engie/gcp_compute_engie.py
rename to lithops/standalone/backends/gcp_compute_engine/gcp_compute_engine.py
index 94dfa6d75..7b8e2d015 100644
--- a/lithops/standalone/backends/gcp_compute_engie/gcp_compute_engie.py
+++ b/lithops/standalone/backends/gcp_compute_engine/gcp_compute_engine.py
@@ -37,6 +37,7 @@
StandaloneMode,
CLOUD_CONFIG_WORKER,
CLOUD_CONFIG_WORKER_PK,
+ get_host_setup_script,
)
from lithops.standalone import LithopsValidationError
@@ -49,16 +50,20 @@
'ubuntu-2404-lts-amd64',
'ubuntu-2204-lts',
)
+DEFAULT_UBUNTU_SOURCE_IMAGE = (
+ 'projects/ubuntu-os-cloud/global/images/family/ubuntu-2404-lts-amd64'
+)
+DEFAULT_LITHOPS_IMAGE_NAME = 'lithops-ubuntu-2404-lts-amd64-server'
# Scopes for the SA attached to master/worker VMs (GCS, etc. via metadata credentials).
GCE_INSTANCE_SCOPES = ['https://www.googleapis.com/auth/cloud-platform']
-class GCPComputeEngieBackend:
+class GCPComputeEngineBackend:
def __init__(self, config, mode):
logger.debug("Creating GCP Compute Engine client")
- self.name = 'gcp_compute_engie'
+ self.name = 'gcp_compute_engine'
self.config = config
self.mode = mode
self.project_name = self.config['project_name']
@@ -122,7 +127,7 @@ def _resolve_service_account_email(self):
logger.warning(
'No service account resolved for GCE VMs. Workers/master cannot access GCS '
- 'via metadata unless you set gcp_compute_engie.service_account or '
+ 'via metadata unless you set gcp_compute_engine.service_account or '
'gcp.credentials_path.'
)
@@ -393,21 +398,15 @@ def _instance_exists(self, instance_name):
)
def _create_master_instance(self):
+ """
+ Creates the master VM instance
+ """
name = self.config.get('instance_name') or f'lithops-master-{self.network_key}'
- self.master = GCPComputeEngieInstance(self.config, self.compute_client, public=True)
- self.master.name = name
+ self.master = GCPComputeEngineInstance(
+ name, self.config, self.compute_client, public=True
+ )
self.master.instance_type = self.config['master_instance_type']
self.master.delete_on_dismantle = False
-
- if self._instance_exists(name):
- logger.debug(f'Using existing master VM {name}')
- self.master.get_instance_data()
- if self.master.get_status() == 'TERMINATED':
- logger.debug(f'Master VM {name} is stopped, starting')
- self.master.start()
- elif self.mode != StandaloneMode.CONSUME.value:
- logger.debug(f'Creating new VM instance {name}')
- self.master.create(public=True)
self.master.get_instance_data()
def _create_compute_client(self):
@@ -434,44 +433,232 @@ def init(self):
self._load_gce_data()
if self.mode == StandaloneMode.CONSUME.value:
+ instance_name = self.config['instance_name']
+ if not self.gce_data or instance_name != self.gce_data.get('master_name'):
+ self.gce_data = {
+ 'mode': self.mode,
+ 'vpc_data_type': 'provided',
+ 'ssh_data_type': 'provided',
+ 'master_name': instance_name,
+ 'master_id': instance_name,
+ }
+
+ self.config['instance_name'] = self.gce_data['master_name']
self._create_master_instance()
+ self._dump_gce_data()
+ return
+
+ elif self.mode in [StandaloneMode.CREATE.value, StandaloneMode.REUSE.value]:
+ self._create_network()
+ self._create_ssh_key()
+ self._request_source_image()
+ if 'instance_name' not in self.config:
+ self.config['instance_name'] = f'lithops-master-{self.network_key}'
+ self._create_master_instance()
+ self._load_instance_types()
self.gce_data = {
'mode': self.mode,
+ 'vpc_data_type': self.vpc_data_type,
+ 'ssh_data_type': self.ssh_data_type,
'master_name': self.master.name,
- 'master_id': self.master.get_instance_id()
+ 'master_id': self.network_key,
+ 'network_name': self.config['network_name'],
+ 'network_key': self.network_key,
+ 'subnet_name': self.config['subnet_name'],
+ 'firewall_name': self.config['firewall_name'],
+ 'internal_firewall_name': self.config['internal_firewall_name'],
+ 'router_name': self.config.get('router_name'),
+ 'nat_name': self.config.get('nat_name'),
+ 'ssh_key_filename': self.config['ssh_key_filename'],
+ 'source_image': self.config['source_image'],
+ 'instance_types': self.instance_types,
}
self._dump_gce_data()
+
+ @staticmethod
+ def _is_default_ubuntu_source_image(source_image):
+ if not source_image:
+ return True
+ return (
+ source_image == DEFAULT_UBUNTU_SOURCE_IMAGE
+ or source_image.endswith('/family/ubuntu-2404-lts-amd64')
+ or source_image.endswith('/family/ubuntu-2204-lts')
+ )
+
+ def _project_image_ref(self, image_name):
+ return f'projects/{self.project_name}/global/images/{image_name}'
+
+ def _get_project_image(self, image_name):
+ try:
+ return self.compute_client.images().get(
+ project=self.project_name, image=image_name
+ ).execute()
+ except HttpError as err:
+ if getattr(err.resp, 'status', None) == 404:
+ return None
+ raise
+
+ def _request_source_image(self):
+ """
+ Requests the default image if not provided
+ """
+ if not self._is_default_ubuntu_source_image(self.config.get('source_image')):
return
- self._create_network()
- self._create_ssh_key()
- if 'instance_name' not in self.config:
- self.config['instance_name'] = f'lithops-master-{self.network_key}'
- self._create_master_instance()
- self._load_instance_types()
- self.gce_data = {
- 'mode': self.mode,
- 'vpc_data_type': self.vpc_data_type,
- 'ssh_data_type': self.ssh_data_type,
- 'master_name': self.master.name,
- 'master_id': self.master.get_instance_id(),
- 'network_name': self.config['network_name'],
- 'network_key': self.network_key,
- 'subnet_name': self.config['subnet_name'],
- 'firewall_name': self.config['firewall_name'],
- 'internal_firewall_name': self.config['internal_firewall_name'],
- 'router_name': self.config.get('router_name'),
- 'nat_name': self.config.get('nat_name'),
- 'ssh_key_filename': self.config['ssh_key_filename'],
- 'instance_types': self.instance_types,
- }
- self._dump_gce_data()
+ if 'source_image' in self.gce_data:
+ self.config['source_image'] = self.gce_data['source_image']
+ return
+
+ for image in self._iter_project_images(self.project_name):
+ if image.get('name') == DEFAULT_LITHOPS_IMAGE_NAME:
+ image_ref = self._project_image_ref(DEFAULT_LITHOPS_IMAGE_NAME)
+ logger.debug(f'Found default VM image: {DEFAULT_LITHOPS_IMAGE_NAME}')
+ self.config['source_image'] = image_ref
+ return
- def build_image(self, **kwargs):
- raise NotImplementedError(f'{self.name}.build_image() is not implemented yet')
+ if 'source_image' not in self.config:
+ self.config['source_image'] = DEFAULT_UBUNTU_SOURCE_IMAGE
+
+ def _get_boot_disk_source(self, instance_data):
+ for disk in instance_data.get('disks', []):
+ if disk.get('boot'):
+ disk_url = disk.get('source', '')
+ if '/disks/' in disk_url:
+ return disk_url.split('/disks/')[-1]
+ raise Exception(f'Boot disk not found for instance {instance_data.get("name")}')
+
+ def _wait_image_ready(self, image_name, timeout=600):
+ start = time.time()
+ while time.time() - start < timeout:
+ image = self._get_project_image(image_name)
+ if image:
+ status = image.get('status', 'UNKNOWN')
+ logger.debug(f'VM Image is being created. Current status: {status}')
+ if status == 'READY':
+ return image
+ if status == 'FAILED':
+ raise Exception(
+ f"VM image '{image_name}' creation failed: {image}"
+ )
+ time.sleep(20)
+ raise TimeoutError(
+ f"VM image '{image_name}' was not ready after {timeout}s"
+ )
+
+ def build_image(self, image_name, script_file, overwrite, include, extra_args=None):
+ """
+ Builds a new VM Image
+ """
+ image_name = image_name or DEFAULT_LITHOPS_IMAGE_NAME
+
+ if self._get_project_image(image_name):
+ if overwrite:
+ self.delete_image(image_name)
+ else:
+ image_ref = self._project_image_ref(image_name)
+ raise Exception(
+ f"The image with name '{image_name}' already exists with ID: "
+ f"'{image_ref}'. Use '--overwrite' or '-o' if you want to overwrite it"
+ )
+
+ is_initialized = self.is_initialized()
+ self.init()
+
+ try:
+ del self.config['source_image']
+ except Exception:
+ pass
+ try:
+ del self.gce_data['source_image']
+ except Exception:
+ pass
+
+ self._request_source_image()
+
+ build_vm = GCPComputeEngineInstance(
+ 'building-image-' + image_name, self.config, self.compute_client, public=True
+ )
+ build_vm.instance_type = self.config['master_instance_type']
+ build_vm.delete_on_dismantle = False
+ build_vm.create(public=True)
+ build_vm.wait_ready()
+
+ logger.debug(f"Uploading installation script to {build_vm}")
+ remote_script = "/tmp/install_lithops.sh"
+ script = get_host_setup_script(lithops_pip_spec='lithops[gcp,redis]')
+ build_vm.get_ssh_client().upload_data_to_file(script, remote_script)
+ logger.debug("Executing Lithops installation script. Be patient, this process can take up to 3 minutes")
+ build_vm.get_ssh_client().run_remote_command(
+ f"chmod 777 {remote_script}; sudo {remote_script}; rm {remote_script};"
+ )
+ logger.debug("Lithops installation script finished")
+
+ for src_dst_file in include:
+ src_file, dst_file = src_dst_file.split(':')
+ if os.path.isfile(src_file):
+ logger.debug(f"Uploading local file '{src_file}' to VM image in '{dst_file}'")
+ build_vm.get_ssh_client().upload_local_file(src_file, dst_file)
+
+ if script_file:
+ script = os.path.expanduser(script_file)
+ logger.debug(f"Uploading user script '{script_file}' to {build_vm}")
+ remote_script = "/tmp/install_user_lithops.sh"
+ build_vm.get_ssh_client().upload_local_file(script, remote_script)
+ logger.debug(f"Executing user script '{script_file}'")
+ build_vm.get_ssh_client().run_remote_command(
+ f"chmod 777 {remote_script}; sudo {remote_script}; rm {remote_script};"
+ )
+ logger.debug(f"User script '{script_file}' finished")
+
+ logger.debug(f'Stopping {build_vm} before creating VM image')
+ build_vm.stop()
+ build_vm.wait_stopped()
+
+ instance_data = build_vm.get_instance_data()
+ disk_name = self._get_boot_disk_source(instance_data)
+ source_disk = f'zones/{self.zone}/disks/{disk_name}'
+
+ op = self.compute_client.images().insert(
+ project=self.project_name,
+ body={
+ 'name': image_name,
+ 'description': 'Lithops Image',
+ 'sourceDisk': source_disk,
+ 'labels': {'type': 'lithops-runtime'},
+ },
+ ).execute()
+ self._wait_operation(op['name'], scope='global')
- def delete_image(self, **kwargs):
- raise NotImplementedError(f'{self.name}.delete_image() is not implemented yet')
+ logger.debug("Starting VM image creation")
+ self._wait_image_ready(image_name)
+
+ if not is_initialized:
+ while not self.clean(all=True):
+ time.sleep(5)
+ else:
+ build_vm.delete()
+
+ image_ref = self._project_image_ref(image_name)
+ logger.info(f"VM Image created. Image ID: {image_ref}")
+
+ def delete_image(self, image_name):
+ """
+ Deletes a custom GCE image from the project.
+ """
+ image = self._get_project_image(image_name)
+ if not image:
+ logger.debug(f"VM Image '{image_name}' does not exist")
+ return
+
+ logger.debug(f"Deleting VM Image '{image_name}'")
+ op = self.compute_client.images().delete(
+ project=self.project_name, image=image_name
+ ).execute()
+ self._wait_operation(op['name'], scope='global')
+
+ while self._get_project_image(image_name):
+ time.sleep(2)
+ logger.debug(f"VM Image '{image_name}' successfully deleted")
@staticmethod
def _format_image_timestamp(timestamp):
@@ -524,42 +711,73 @@ def list_images(self, **kwargs):
return sorted(result, key=lambda x: x[2], reverse=True)
def clean(self, **kwargs):
+ """
+ Clean all the backend resources.
+ Returns True when cleanup completed, False if resources are still in use.
+ """
all_clean = kwargs.get('all', False)
+ logger.info('Cleaning GCP Compute Engine resources')
+
if not self.gce_data:
self._load_gce_data()
if self.mode == StandaloneMode.CONSUME.value:
self._delete_vpc_data()
- return
-
- self._delete_vm_instances(all=all_clean)
+ return True
- master_name = self.gce_data.get('master_name') or (self.master.name if self.master else None)
- if master_name:
- master_pk = os.path.join(self.cache_dir, f'{master_name}-id_rsa.pub')
- if all_clean and os.path.isfile(master_pk):
- os.remove(master_pk)
+ try:
+ self._delete_vm_instances(all=all_clean)
- if all_clean:
- self._delete_network_resources()
- self._delete_ssh_key()
- self._delete_vpc_data()
+ master_name = self.gce_data.get('master_name') or (
+ self.master.name if self.master else None
+ )
+ if master_name:
+ master_pk = os.path.join(self.cache_dir, f'{master_name}-id_rsa.pub')
+ if all_clean and os.path.isfile(master_pk):
+ os.remove(master_pk)
+
+ if all_clean:
+ self._delete_network_resources()
+ self._delete_ssh_key()
+ self._delete_vpc_data()
+ return True
+ except HttpError:
+ return False
def _delete_vm_instances(self, all=False):
- prefixes = ('lithops-worker-', 'lithops-master-') if all else ('lithops-worker-',)
- instances = self.compute_client.instances().list(
- project=self.project_name, zone=self.zone
- ).execute().get('items', [])
+ """
+ Deletes all worker VM instances
+ """
+ msg = (
+ f'Deleting all Lithops worker VMs from {self.network_name}'
+ if self.network_name else 'Deleting all Lithops worker VMs'
+ )
+ logger.info(msg)
- for ins in instances:
- name = ins.get('name', '')
- if not name.startswith(prefixes):
- continue
- logger.debug(f"Deleting VM instance {name}")
- op = self.compute_client.instances().delete(
- project=self.project_name, zone=self.zone, instance=name
- ).execute()
- self._wait_operation(op['name'], scope='zone')
+ prefixes = (
+ ('lithops-worker-', 'lithops-master-', 'building-image-')
+ if all else ('lithops-worker-',)
+ )
+
+ def get_instance_names():
+ instances = self.compute_client.instances().list(
+ project=self.project_name, zone=self.zone
+ ).execute().get('items', []) or []
+ return [
+ ins['name'] for ins in instances
+ if ins.get('name', '').startswith(prefixes)
+ ]
+
+ while True:
+ names = get_instance_names()
+ if not names:
+ break
+ for name in names:
+ logger.debug(f"Deleting VM instance {name}")
+ op = self.compute_client.instances().delete(
+ project=self.project_name, zone=self.zone, instance=name
+ ).execute()
+ self._wait_operation(op['name'], scope='zone')
def _delete_network_resources(self):
"""
@@ -660,8 +878,7 @@ def dismantle(self, include_master=True):
self.master.stop()
def get_instance(self, name, **kwargs):
- instance = GCPComputeEngieInstance(self.config, self.compute_client)
- instance.name = name
+ instance = GCPComputeEngineInstance(name, self.config, self.compute_client)
for key in kwargs:
if hasattr(instance, key) and kwargs[key] is not None:
setattr(instance, key, kwargs[key])
@@ -675,13 +892,14 @@ def get_worker_cpu_count(self):
def create_worker(self, name):
"""
- Creates a new worker VM instance (same flow as AWS EC2 / IBM VPC).
+ Creates a new worker VM instance
"""
if self.mode == StandaloneMode.CONSUME.value:
raise NotImplementedError(f'{self.name}.create_worker() not available in consume mode')
- worker = GCPComputeEngieInstance(self.config, self.compute_client, public=False)
- worker.name = name
+ worker = GCPComputeEngineInstance(
+ name, self.config, self.compute_client, public=False
+ )
worker.instance_type = self.config['worker_instance_type']
user = worker.ssh_credentials['username']
@@ -711,13 +929,17 @@ def get_runtime_key(self, runtime_name, version=__version__):
return os.path.join(self.name, version, master_id, runtime)
-class GCPComputeEngieInstance:
+class GCPComputeEngineInstance:
- def __init__(self, config, compute_client, public=False):
+ def __init__(self, name, config, compute_client, public=False):
+ """
+ Initialize a GCPComputeEngineInstance.
+ VMs with public=True get an external IP (e.g. master or image build VM).
+ """
+ self.name = name.lower()
self.config = config
self.compute_client = compute_client
self.public = public
- self.name = self.config['instance_name']
self.project_name = self.config['project_name']
self.zone = self.config['zone']
@@ -744,9 +966,13 @@ def __str__(self):
def get_ssh_client(self):
self.get_instance_data()
+ if not self.instance_data:
+ raise Exception(f'VM instance {self.name} does not exist')
+
if self.public:
if not self.public_ip:
- if self.get_status() == 'TERMINATED':
+ status = self.get_status()
+ if status == 'TERMINATED':
self.start()
else:
self._wait_public_ip(timeout=60)
@@ -800,11 +1026,18 @@ def wait_ready(self, timeout=INSTANCE_START_TIMEOUT):
raise TimeoutError(f'Readiness probe expired on {self}')
def get_instance_data(self):
- res = self.compute_client.instances().get(
- project=self.project_name,
- zone=self.zone,
- instance=self.name
- ).execute()
+ try:
+ res = self.compute_client.instances().get(
+ project=self.project_name,
+ zone=self.zone,
+ instance=self.name
+ ).execute()
+ except HttpError as err:
+ if getattr(err.resp, 'status', None) == 404:
+ self.instance_data = None
+ return None
+ raise
+
self.instance_data = res
self.instance_id = str(res.get('id'))
@@ -832,16 +1065,27 @@ def get_public_ip(self):
return self.public_ip
def get_status(self):
- try:
- self.get_instance_data()
- except HttpError:
- return None
+ self.get_instance_data()
return self.instance_data.get('status') if self.instance_data else None
+ def is_stopped(self):
+ return self.get_status() == 'TERMINATED'
+
+ def wait_stopped(self, timeout=INSTANCE_START_TIMEOUT):
+ logger.debug(f'Waiting for {self} to stop')
+ start = time.time()
+ while time.time() - start < timeout:
+ if self.is_stopped():
+ return True
+ time.sleep(3)
+ raise TimeoutError(f'Stop probe expired on {self}')
+
def _wait_public_ip(self, timeout=INSTANCE_START_TIMEOUT):
start = time.time()
while time.time() - start < timeout:
self.get_instance_data()
+ if not self.instance_data:
+ raise Exception(f'VM instance {self.name} does not exist')
if self.public_ip:
return self.public_ip
time.sleep(2)
@@ -883,7 +1127,9 @@ def create(self, public=False, ssh_public_key=None, user_data=None,
}
# Master: external IP for SSH from the Lithops client.
# Workers: no external IP; outbound internet uses Cloud NAT on the subnet.
- if public or self.config.get('worker_public_ip', False):
+ # Use self.public (set in __init__) when create() is called without public=...
+ use_public_ip = public or self.public or self.config.get('worker_public_ip', False)
+ if use_public_ip:
network_iface['accessConfigs'] = [{'name': 'External NAT', 'type': 'ONE_TO_ONE_NAT'}]
body = {
@@ -927,7 +1173,7 @@ def create(self, public=False, ssh_public_key=None, user_data=None,
f'GCS access from the VM will fail'
)
- if self.config.get('request_spot_instances', False) and not public:
+ if self.config.get('request_spot_instances', False) and not use_public_ip:
body['scheduling'] = {
'provisioningModel': 'SPOT',
'instanceTerminationAction': 'STOP'
diff --git a/lithops/standalone/standalone.py b/lithops/standalone/standalone.py
index 44e3e14fb..b969ab067 100644
--- a/lithops/standalone/standalone.py
+++ b/lithops/standalone/standalone.py
@@ -343,7 +343,8 @@ def clean(self, **kwargs):
"""
Clean all the backend resources
"""
- if self.is_initialized():
+ all_clean = kwargs.get('all', False)
+ if self.is_initialized() and not all_clean:
try:
self.init()
self._make_request('POST', 'clean')
diff --git a/lithops/utils.py b/lithops/utils.py
index e8a018a6a..2fe26eefc 100644
--- a/lithops/utils.py
+++ b/lithops/utils.py
@@ -164,7 +164,7 @@ def get_mode(backend):
if backend is None:
return constants.MODE_DEFAULT
- elif backend == constants.LOCALHOST:
+ if backend == constants.LOCALHOST:
return constants.LOCALHOST
elif backend in constants.SERVERLESS_BACKENDS:
return constants.SERVERLESS
diff --git a/runtime/README.md b/runtime/README.md
index a7e77b6b5..c7a05a071 100644
--- a/runtime/README.md
+++ b/runtime/README.md
@@ -8,10 +8,11 @@ Choose your compute backend:
4. [AWS EC2](aws_ec2/)
5. [Google Cloud Functions](gcp_functions/)
6. [Google Cloud Run](gcp_cloudrun/)
-7. [Azure Functions](azure_functions/)
-8. [Azure Container Apps](azure_containers/)
-9. [Aliyun Functions Compute](aliyun_fc/)
-10. [Kubernetes](kubernetes/)
-11. [OpenWhisk](openwhisk/)
-12. [Knative](knative/)
-13. [Singularity](singularity/)
+7. [Google Compute Engine](gcp_compute_engine/)
+8. [Azure Functions](azure_functions/)
+9. [Azure Container Apps](azure_containers/)
+10. [Aliyun Functions Compute](aliyun_fc/)
+11. [Kubernetes](kubernetes/)
+12. [OpenWhisk](openwhisk/)
+13. [Knative](knative/)
+14. [Singularity](singularity/)
diff --git a/runtime/gcp_compute_engine/README.md b/runtime/gcp_compute_engine/README.md
new file mode 100644
index 000000000..f79ff4750
--- /dev/null
+++ b/runtime/gcp_compute_engine/README.md
@@ -0,0 +1,69 @@
+# Lithops runtime for Google Compute Engine
+
+In Google Compute Engine (GCE), Lithops runs functions as parallel processes inside VMs. On first use, Lithops can install all dependencies on each VM automatically, but that adds several minutes to every cold start. A pre-built custom image avoids that cost.
+
+The Lithops backend key is `gcp_compute_engine` (matches the module name).
+
+Custom images are based on the default Ubuntu 24.04 LTS image: `projects/ubuntu-os-cloud/global/images/family/ubuntu-2404-lts-amd64`.
+
+## Option 1: Build the default Lithops image
+
+Build the default VM image with all Lithops dependencies:
+
+```bash
+lithops image build -b gcp_compute_engine
+```
+
+This creates an image named `lithops-ubuntu-2404-lts-amd64-server` in your GCP project.
+
+To rebuild when the image already exists:
+
+```bash
+lithops image build -b gcp_compute_engine --overwrite
+```
+
+If you use this default image name, you do not need to set `source_image` in the config; Lithops discovers it automatically.
+
+List available Ubuntu and Lithops images:
+
+```bash
+lithops image list -b gcp_compute_engine
+```
+
+Use the **Image ID** column as `source_image` when you use a custom image name.
+
+### Custom image name and extra setup
+
+Provide an install script and, optionally, a custom image name:
+
+```bash
+lithops image build -b gcp_compute_engine -f myscript.sh custom-lithops-runtime
+```
+
+Upload local files into the image with `--include` / `-i` (`src:dst`):
+
+```bash
+lithops image build -b gcp_compute_engine -f myscript.sh \
+ -i /home/user/test.bin:/home/ubuntu/test.bin custom-lithops-runtime
+```
+
+When using a custom name, set `source_image` in your Lithops config to the value printed at the end of the build (for example `projects//global/images/custom-lithops-runtime`):
+
+```yaml
+gcp_compute_engine:
+ project_name:
+ zone: us-central1-a
+ source_image: projects//global/images/custom-lithops-runtime
+```
+
+Delete a custom image:
+
+```bash
+lithops image delete -b gcp_compute_engine
+```
+
+## Option 2: Manual image
+
+Create a VM from Ubuntu 24.04, install dependencies (apt, pip, Lithops, Redis, Docker as needed), stop the VM, then create a custom image in the GCP console or with `gcloud compute images create`. Set `source_image` in your config to the full image resource path.
+
+If you name the image `lithops-ubuntu-2404-lts-amd64-server`, Lithops picks it up without an explicit `source_image` entry.
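
The README above asks for `source_image` as a full image resource path, and the config template notes that `region` can be derived from `zone`. A minimal sketch of both derivations follows. It assumes the standard GCP naming scheme (a zone is its region plus a `-<letter>` suffix). The helper names and the `my-project` project ID are hypothetical and are not part of Lithops.

```python
def region_from_zone(zone: str) -> str:
    """Derive the region from a zone name, e.g. 'us-central1-a' -> 'us-central1'."""
    region, sep, suffix = zone.rpartition('-')
    if not sep or not region or not suffix:
        raise ValueError(f'Not a valid zone name: {zone!r}')
    return region


def image_resource_path(project: str, image_name: str) -> str:
    """Build the full resource path of a custom image owned by a project."""
    return f'projects/{project}/global/images/{image_name}'


def build_gce_section(project: str, zone: str, image_name: str) -> dict:
    """Assemble only the gcp_compute_engine section of a Lithops config.

    Hypothetical helper: key names follow the config template in this change.
    """
    return {
        'gcp_compute_engine': {
            'project_name': project,
            'zone': zone,
            # Optional in Lithops; shown here only to illustrate the derivation.
            'region': region_from_zone(zone),
            'source_image': image_resource_path(project, image_name),
        }
    }


# 'my-project' is a placeholder project ID, not a value from this repository.
section = build_gce_section('my-project', 'us-central1-a', 'custom-lithops-runtime')
print(section['gcp_compute_engine']['source_image'])
```

The resulting dictionary covers only the `gcp_compute_engine` section. Credentials and storage settings still have to come from the rest of your Lithops config.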