diff --git a/docs/access/index.md b/docs/access/index.md index 7b3e23a16..339763b69 100644 --- a/docs/access/index.md +++ b/docs/access/index.md @@ -18,10 +18,16 @@ how to apply for a new project [VDI access to virtual machines](./virtualmachines-vdi.md): how to connect to the virtual desktop interface. +[VDI access to Confidential Data Workspace VMs](../services/confidentialdataworkspace/quickstart.md): how to connect to the Confidential Data Workspace service. + ## SSH Access to Virtual Machines Users with the appropriate permissions can also [use `ssh` to login to Virtual Desktop VMs](./ssh.md) +## SSH Access to Confidential Data Workspace Service + +Remote SSH access to Confidential Data Workspace VMs is **disabled** to protect the security of the service; access is provided through the Guacamole web interface only. Access is described in the [Confidential Data Workspace Quickstart](../services/confidentialdataworkspace/quickstart.md) guide. + ## SSH Access to Computing Services Includes access to the following services: diff --git a/docs/access/virtualmachines-vdi.md b/docs/access/virtualmachines-vdi.md index 517ed9eb8..4d20cc13d 100644 --- a/docs/access/virtualmachines-vdi.md +++ b/docs/access/virtualmachines-vdi.md @@ -14,7 +14,7 @@ Authentication to the VDI is provided by SAFE, so if you do not have an active w you will be redirected to the [SAFE log on page](https://safe.epcc.ed.ac.uk). If you do not have a SAFE account follow the instructions in the [SAFE documentation](https://epcced.github.io/safe-docs/safe-for-users/) -how to register and receive your password. +on how to register and receive your password. 
## Navigating the EIDF VDI diff --git a/docs/services/confidentialdataworkspace/docs.md b/docs/services/confidentialdataworkspace/docs.md new file mode 100644 index 000000000..1f8afc7c6 --- /dev/null +++ b/docs/services/confidentialdataworkspace/docs.md @@ -0,0 +1,216 @@ +# Service Documentation + +## Project Management Guide + +### User Roles and Their Permissions + +!!! info + All names are separate from those in SAFE and are for the purpose of the EIDF Confidential Data Workspace service only + +**Data Users** - _Typical users - have access to read only on some data directories_ + +**Data managers** - _Have read and write access to some data directories_ + +**VM Admin** - _Access to the data manager group (project wide) and sudo access (per VM)_ + +#### Detailed Roles, Groups and Their File Permissions + +A description of the roles defined within the EIDF Confidential Data Workspace service and some access they have is given in the below table. +For those familiar with Trusted Research Environments (TREs) the following table includes a mapping of roles in the EIDF Confidential Data Workspace service which are conceptually similar to the roles in a TRE. + +| Role | TRE Equivalent User | Router Access | Should be given sudo permissions on VMs & router | +| ------------ | -------------------- | ------------- | ------------------------------------------------ | +| Data User | Researcher | No | No | +| Data Manager | Research Coordinator | No | No | +| VM Admin | N/A | Yes | Yes | + +There are a few different roles associated with Confidential Data Workspace projects that determine what actions a user can perform in the EIDF Portal. 
+ +| Group | Files | Permission | Note | +| ------------------------------- | ------------------------------------------- | ---------- | ---------------------------------------------------------------------- | +| `sudo` group on router machine | `/etc/squid/allowlist_buckets.txt` | Owner W+R | (implicit via machine access) | +| | `/etc/squid/allowlist_domains.txt` | Owner W+R | (implicit via machine access) | +| | `/etc/squid/squid.conf` | Owner W+R | (implicit via machine access; Squid proxy configuration) | +| `sudo` group on a CDW VM | Permissions of `-datamanager` | Owner W+R | | +| `-datamanager` | `/data/datamanager` | W+R | Project shared area for data managers to stage and manage data for import into and export from Confidential Data Workspace VMs via approved transfer methods. | + +### Updating the Allowed Access Configuration for Confidential Data Workspace VMs + +See [router-docs.md](router-docs.md) for documentation on updating the allowed access configuration for Confidential Data Workspace VMs. + +### Create a VM + +To create a new Confidential Data Workspace VM: + +1. Select the project from the list of your projects, e.g. `eidfxxx` +1. Click on the 'New Private Machine' button +1. Complete the 'Create Machine' form as follows: + + 1. Select the 'Confidential Data Workspace' router to use, typically this will be the default router for your project e.g. `eidfxxx-router` + 1. Provide an appropriate name, e.g. `dev-01`. The project code will be prepended automatically to your VM name, in this case your VM would be named `eidfxxx-dev-01`. + 1. Select a suitable operating system + 1. Select a machine specification that is suitable + 1. Choose the required disk size (in GB) or leave blank for the default + 1. Tick the checkbox "Configure RDP access" if you would like to install RDP + and configure VDI connections via RDP for your VM. + +1. Click on 'Create' +1. 
You should see the new VM listed under the 'Machines' table on the project page and the status as 'Creating' +1. Wait while the job to launch the VM completes. + This may take up to 10 minutes, depending on the configuration you requested. + You have to reload the page to see updates. +1. Once the job has completed successfully the status shows as 'Active' in the list of machines. + +You may wish to ensure that the machine size selected (number of CPUs and RAM) does not +exceed your remaining quota before you press Create, otherwise the request will fail. + +In the list of 'Machines' in the project page in the portal, +click on the name of new VM to see the configuration and properties, +including the machine specification, its `10.24.*.*` IP address and any configured VDI connections. + +### Quota and Usage + +Quotas and Usage are the same as those for the [standard EIDF VM service](../virtualmachines/docs.md#quota-and-usage). + +Each project has a quota for the number of instances, total number of vCPUs, total RAM and storage. +You will not be able to create a VM if it exceeds the quota. + +You can view and refresh the project usage compared to the quota in a table near the bottom of the project page. +This table will be updated automatically when VMs are created or removed, and +you can refresh it manually by pressing the "Refresh" button at the top of the table. + +Please contact the helpdesk if your quota requirements have changed. + +### Add a User Account + +User accounts allow project members to log in to the VMs in a project. +The Project PI and project managers manage user accounts for each member of the project. +Users usually use one account (username and password) to log in to all the VMs in the same project that they can access, +however a user may have multiple accounts in a project, for example for different roles. + +1. From the project page in the portal click on the 'Create account' button under the 'Project Accounts' table at the bottom +1. 
Complete the 'Create User Account' form as follows: + + 1. Choose 'Account user name': this could be something sensible like the first and last names + concatenated (or initials) together with the project name. + The username is unique across all EPCC systems so the user will not be able to reuse this name + in another project once it has been assigned. + 1. Select the project member from the 'Account owner' drop-down field + 1. Click 'Create' + +The user can now set the password for their new account on the account details page. + +### Setting Up the Correct Groups and Permissions for User Accounts + +Portal management of VMs and user accounts can only be done by project members with **Cloud Admin** permissions. This includes the principal investigator (PI) of the project and all project managers (PM). The PI and PMs can grant a project member the **Cloud Admin** role through the SAFE. There is more information in the virtual machine documentation under the section [Required Member Permissions](../virtualmachines/docs.md#required-member-permissions). + +User accounts should be placed in the correct groups on the VM to ensure they have the correct file permissions and data access. When a user account is created in the portal it will be added to the default group ``. If the user account requires Data Manager access they must be added to the `-datamanager` group. The VM Admin must be added to this group also. + +The VM Admin must be given sudo permissions on each VM to have the necessary permissions for managing restricted VMs and router configuration. Unlike addition to the `datamanager` group, sudo permissions are set on a per-VM basis via the portal. 
+ +The following sections give instructions for setting up the required groups for the different roles in a Confidential Data Workspace project: + +- For creating and adding users to the `datamanager` group under the SAFE see [Creating a group](https://epcced.github.io/safe-docs/safe-for-managers/#how-can-i-set-up-project-groups-within-my-project) and then [adding users to the group](https://epcced.github.io/safe-docs/safe-for-managers/#how-can-i-add-users-to-an-existing-project-group) +- See the Virtual Desktop Interface documentation for [Sudo Permissions](../virtualmachines/docs.md#sudo-permissions) for guidance on giving sudo permissions to the VM Admin role for each VM. + +## Adding Access to the VM for a User + +User accounts can be granted or denied access to existing VMs. + +1. Click 'Manage' next to an existing user account in the 'Project Accounts' table on the project page, or click on the account name and then 'Manage' on the account details page +1. Select the checkboxes in the column "Access" for the VMs to which this account should have access or uncheck the ones without access +1. Click the 'Update' button +1. After a few minutes, the job to give them access to the selected VMs will complete and the account status will show as "Active". + +If a user is logged in already to the VDI at [https://eidf-vdi.epcc.ed.ac.uk/vdi](https://eidf-vdi.epcc.ed.ac.uk/vdi) +newly added connections may not appear in their connections list immediately. +They must log out and log in again to refresh the connection information, or wait until the login token expires and is refreshed automatically - this might take a while. + +If a user only has one connection available in the VDI they will be automatically directed to the VM with the default connection. 
+ +### Sudo Permissions + +Sudo permissions should only be granted to users in the VM Admin role to restrict as much as possible the data management capabilities of users in the Confidential Data Workspace environment and remove the opportunity for data ingress/egress outwith the proper channels. + +## First Login + +A new user account must reset the password before they can log in for the first time. +To do this: + +1. The user can log into the [Portal](https://portal.eidf.ac.uk) and select their project from the 'Projects' drop-down. +1. From the project page, they can select their account from the 'Your Accounts' table +1. Finally, click the 'Set Password' button from the 'User Account Info' table. + +Users will then be able to log in using the VDI as described in the [VDI documentation](../../access/virtualmachines-vdi.md). + +!!! Warning + Access to the Confidential Data Workspace VMs is only possible through the VDI or the project router. + You cannot directly SSH onto the VMs, and you cannot access the VMs through the router until you have access to the router itself. Please see the documentation section on [SSH Access to the Confidential Data Workspace Router](./router-docs.md#ssh-access-to-the-confidential-data-workspace-router) for more information on how to access the router and then the VMs via SSH. + +## Updating an Existing Machine + +### Adding RDP Access + +The instructions in this section match the instructions given in the [EIDF Virtual Machine Service Documentation](../virtualmachines/docs.md#adding-rdp-access). + +If you did not select RDP access when you created the VM you can add it later: + +1. Open the VM details page by selecting the name on the project page +1. Click on 'Configure RDP' +1. The configuration job runs for a few minutes. + +Once the RDP job is completed, all users that are allowed to access the VM +will also be permitted to use the RDP connection. 
+ +### Software Catalogue + +The instructions in this section match the instructions given in the [EIDF Virtual Machine Service Documentation](../virtualmachines/docs.md#software-catalogue). + +You can install packages from the software catalogue at a later time, +even if you didn't select a package when first creating the machine. + +1. Open the VM details page by selecting the name on the project page +1. Click on 'Software Catalogue' +1. Select the configuration you wish to install and press 'Submit' +1. The configuration job runs for a few minutes. + +### Patching and Updating + +It is the responsibility of project PIs to keep the VMs in their projects up to date as stated in the [policy](policies.md#patching-of-user-vms). + +!!! important "Snap packages" + + Snap packages are not supported by default on the Confidential Data Workspace VMs. Snap requires data egress to install software packages. Allowing this could potentially be used to bypass the security of the Confidential Data Workspace. If you require snap packages then the VM Admin can enable usage by following the instructions in the [FAQs](./faq.md#unable-to-download-packages-from-snap-on-the-confidential-data-workspace-vms). + +Since updates and patches require access to the internet, users should ensure that their Confidential Data Workspace VMs have access to the sources that the operating system gets updates from. This will usually be done out of the box, but repository sources can change and may need to be updated in the squid proxy configuration. Updating of allowed sources is detailed in the documentation section [Updating the allowed access for Confidential Data Workspace VMs](router-docs.md#updating-the-allowed-access-for-confidential-data-workspace-vms). + +#### Ubuntu + +To patch and update packages on Ubuntu run the following commands (requires sudo permissions): + +```bash +sudo apt update +sudo apt upgrade +``` + +Your system might require a restart after installing updates. 
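As a rough sketch of how to check whether a restart is pending on Ubuntu: the package tooling normally creates the flag file `/var/run/reboot-required` when an installed update needs a reboot. The function name `reboot_required` and its optional path argument are illustrative assumptions, not part of the service.

```bash
# Report whether this VM has a pending reboot after package updates.
# Assumption: Ubuntu creates /var/run/reboot-required when a restart is needed.
reboot_required() {
    # The optional first argument overrides the flag file path (useful for testing).
    flag_file="${1:-/var/run/reboot-required}"
    [ -f "$flag_file" ]
}

if reboot_required; then
    echo "Reboot required"
else
    echo "No reboot required"
fi
```

The check only reads a file, so it does not need sudo permissions; the reboot itself still does.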
+ +#### Rocky + +To patch and update packages on Rocky run the following command (requires sudo permissions): + +```bash +sudo dnf update +``` + +Your system might require a restart after installing updates. + +### Reboot + +When logged in you can reboot a VM with this command (requires sudo permissions): + +```bash +sudo reboot now +``` + +or use the reboot button in the EIDF Portal (requires project manager permissions). diff --git a/docs/services/confidentialdataworkspace/faq.md b/docs/services/confidentialdataworkspace/faq.md new file mode 100644 index 000000000..e358e90d4 --- /dev/null +++ b/docs/services/confidentialdataworkspace/faq.md @@ -0,0 +1,132 @@ +# FAQs for the Confidential Data Workspace Service + +## Unable to download packages from snap on the Confidential Data Workspace VMs + +Snap packages are not by default supported on the Confidential Data Workspace VMs. Snap requires outbound network access to install software packages which, if allowed, could potentially be used to bypass the security of the Confidential Data Workspace. + +If you require snap packages, the VM Admin can enable them by modifying the router squid configuration to allow the outgoing network traffic required for snap. This is done by adding the following lines near the top of the router config file `/etc/squid/squid.conf`, before any existing `http_access deny` rules: + + ```txt + acl snap_refresh_method method POST + acl snap_refresh_domain url_regex api\.snapcraft\.io/v2/snaps/refresh + http_access allow snap_refresh_method snap_refresh_domain + ``` +After modifying the configuration file, restart Squid with `sudo squid -k reconfigure` for the changes to take effect. + +## Using the Proxy for Some Common Software on the Confidential Data Workspace VMs + +Below we outline a number of common software that you may require and how they can be used with the proxy server on the Confidential Data Workspace Router from the Confidential Data Workspace VMs. 
You may find a number of these work out of the box. These make use of the proxy address given in the EIDF portal for your project under the **"Confidential Data Workspace (CDW)"** Proxy address section and the proxy ports 3128 for HTTP and 3129 for HTTPS. + +Some more general information on using the proxy server on the Confidential Data Workspace Router can be found in the documentation section on [Details of the Confidential Data Workspace Router](./router-docs.md#details-of-the-confidential-data-workspace-router). + +### http_proxy and https_proxy Environment Variables + +The `http_proxy` and `https_proxy` environment variables are used by many command line tools to route traffic via a proxy server. By default they should point to the Squid proxy server in the Confidential Data Workspace, as follows. Uppercase versions are also required by some tools. + + ```bash + export http_proxy=http://:3128 + export https_proxy=http://:3129 + export HTTP_PROXY=http://:3128 + export HTTPS_PROXY=http://:3129 + ``` + +Where the `` can be found in the EIDF portal under the **Confidential Data Workspace (CDW)** heading. + +These variables are also set by default in the `/etc/environment` file. + +It is always worth checking these are set if you are having trouble with internet access from within a Confidential Data Workspace VM. + +### AWS CLI + +If using S3, you must define the proxy certificate in addition to the usual credentials file, +by adding the following line to `~/.aws/credentials`: +`ca_bundle=/usr/local/share/ca-certificates/extra/squid_proxyCA.crt` + +Without this setting, commands fail with an SSL validation error such as: + + ```bash + $ aws s3 ls + SSL validation failed for https://s3.eidf.ac.uk/ [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1032) + ``` + +### Docker Service via HTTP Proxy + +!!! note + Docker is not installed by default on the Confidential Data Workspace VMs, so these instructions are only relevant if you have installed Docker yourself.
+ +Typically docker will use the system proxy settings, however to ensure that docker service uses the HTTP proxy you need to create a systemd drop-in file for the docker service. + +Add service at startup + + ```bash + sudo systemctl enable docker.service + sudo systemctl enable containerd.service + ``` + +This is taken directly from the [Docker documentation](https://docs.docker.com/engine/daemon/proxy/#systemd-unit-file) + + ```bash + sudo mkdir -p /etc/systemd/system/docker.service.d + ``` + +Add to the newly created directory a file called http-proxy.conf + + ```http-proxy.conf + [Service] + Environment="HTTP_PROXY=http://:3128" + Environment="HTTPS_PROXY=http://:3129" + ``` + +Reload Docker + + ```bash + sudo systemctl daemon-reload + sudo systemctl restart docker + ``` + +Check this has been added with + + ```bash + sudo systemctl show --property=Environment docker + ``` + +Which should show the HTTP_PROXY and HTTPS_PROXY variables with the correct proxy address. + +### Git + +Git can be configured to work through a proxy by use of the following configuration items in `/etc/gitconfig`: + + ```gitconfig + [http] + proxy = http://:3128 + + [https] + proxy = https://:3129 + sslCAPath = /etc/pki/ca-trust/source/anchors/squid_proxyCA.crt + ``` + +While we recommend setting up the HTTP/HTTPS proxy above, only Git remotes using HTTPS URLs (for example, `https://github.com/...`) are permitted through the proxy server. SSH-based Git remotes (for example, `git@github.com:...` or `ssh://...`) are blocked by the SSH firewall restrictions. + +!!! note + GitHub and GitLab are by default not in the proxy allow list for the Confidential Data Workspace service. If you want to use Git with these services, you should ask your VM Admin to add them to the allow list. ECDF and EIDF GitLab instances are by default in the allow list. + +The following instructions are intended for VM Admins (router administrators) who have access to configure the Confidential Data Workspace Router. 
+ +To add full GitLab and GitHub access on the router, add the following lines to the allow list file `/etc/squid/allowlist_buckets.txt`: + + ```txt + .github.com + .gitlab.com + ``` + +More information on how to edit the allow list can be found in the documentation section on [Details of the Confidential Data Workspace Router](./router-docs.md#updating-the-allowed-access-for-confidential-data-workspace-vms). + +### Rocky Package Manager YUM + +To force proxy usage we need to put http proxy into the `/etc/yum.conf` e.g. + + ```conf + proxy=http://:3128 + ``` + +### Adding the Proxy Certificate to Firefox + +/etc/ssl/ca-certificates/squidProxyCA.crt diff --git a/docs/services/confidentialdataworkspace/index.md b/docs/services/confidentialdataworkspace/index.md new file mode 100644 index 000000000..5fc812d90 --- /dev/null +++ b/docs/services/confidentialdataworkspace/index.md @@ -0,0 +1,27 @@ +# Overview + +The Confidential Data Workspace service is a specialised offering within EIDF, designed to provide enhanced security and controlled access for sensitive data projects. The Confidential Data Workspace service is built on the same underlying EIDF Virtual Machine (VM) Service infrastructure with additional access controls, network isolation, and policy requirements to meet the needs of secure data handling. Development of this service makes use of work done on the Scottish Safe Haven Trusted Research Environment. This section describes the service features specific to the Confidential Data Workspace, and highlights differences from the general VM Service where relevant. + +The service currently has a mixture of hardware node types which host VMs of various flavours. [These match the virtual machine service flavours](../virtualmachines/flavours.md). + +## Data Processing Requirements for Confidential Data Workspace Service + +EIDF supports processing many kinds of data including sensitive and personal data. 
We have policies in place about the kinds of data that are permitted in the EIDF, and requirements that must be met to ensure data complies with EIDF and regulatory processing requirements before it is processed. + +For processing some types of data, the Confidential Data Workspace service may be appropriate; for others, a Safe Haven Trusted Research Environment may be required. For a full list of data types and their restrictions or requirements please see the [EIDF Third Party Data Policy](https://edinburgh-international-data-facility.ed.ac.uk/about/policies/third-party-data). + +## Service Access + +Users should have an EIDF account - [EIDF Accounts](../../access/project.md). + +Project Leads can have access to the DSC added to their project during the project application process or through a request to the EIDF helpdesk. + +## Additional Service Policy Information + +Additional information on service policies can be found [in the policies page](policies.md). + +## Using the Confidential Data Workspace + +An introduction to using the Confidential Data Workspace service can be found in the [Quickstart guide](quickstart.md). + +Management of Confidential Data Workspace VMs is described in the [Managing VMs documentation](docs.md). diff --git a/docs/services/confidentialdataworkspace/policies.md b/docs/services/confidentialdataworkspace/policies.md new file mode 100644 index 000000000..337351216 --- /dev/null +++ b/docs/services/confidentialdataworkspace/policies.md @@ -0,0 +1,55 @@ +# EIDF Confidential Data Workspace Policies + +## Networking Policies + +The EIDF Confidential Data Workspace service provides VMs that prevent all host-initiated network connections (egress) to the internet, except for those destinations explicitly allowed in the web proxy allow lists by the project PI.
The Confidential Data Workspace VMs also prevent any incoming network connections (ingress) from the internet, as well as any lateral movements from VMs in the same project or from the EIDF Infrastructure to these VMs. This includes incoming connections from other VMs in the same project and from machines outwith the project. As a result, the EIDF Confidential Data Workspace VMs cannot be accessed via SSH or RDP from the internet or the EIDF Gateway, and can only be accessed via the project router or VDI. + +Confidential Data Workspace VMs are placed within an isolated network. This design leverages understanding and pentesting from development of the infrastructure for the Scottish National Safe Haven, which is also run by the EIDF. The Confidential Data Workspace subnet is isolated from other EIDF services and the internet by a Squid web proxy server that filters network traffic from the VMs. This service differs from TREs used in the Safe Haven in that project users have administrative access to their VMs to install software and manage networking access for the VMs as required. + +## Audit Policies + +The EIDF Confidential Data Workspace service provides logging of user activity within the Confidential Data Workspace service VMs. This includes: + +- Network requests +- Login Activity + +### Network Access Logging + +The Squid web proxy logs all network traffic like web requests made by Confidential Data Workspace VMs. These logs are stored in the EIDF Squid router and are configured to be retained for a default period of 30 days before being automatically deleted. + +The access logs are available in /var/log/squid/access.log.x where x is the log rotation number, with access.log.0 being the most recent log file. + +The log retention period can be adjusted by changing the `logfile_rotate` parameter in the squid configuration file located at /etc/squid/squid.conf and the cronjob that runs the log rotation command. 
By default the log rotation happens every day at midnight. This is the [frequency recommended by Squid](https://wiki.squid-cache.org/SquidFaq/SquidLogs#which-log-files-can-i-delete-safely) to ensure log files do not grow too large. + +### Login Activity + +Login activity is automatically logged by the access tools (XRDP and SSH). The relevant log files follow these naming patterns: `xrdp-sesman.log*` for XRDP logins and `auth.log*` for SSH authentication. + +Login activity to the Confidential Data Workspace VMs is copied periodically from each VM to the project router VM `-router`, where it is stored in the `ubuntu` user's home directory `/home/ubuntu/log-replications/`. + +Logs are retained for a default period of 30 days before being automatically deleted. Each machine has its own log file and type of log file. + +## Machine Management Policies + +The Confidential Data Workspace VMs are managed by the VM Admin users within the project. The EIDF team will manage ONLY the underlying infrastructure, hypervisors and cloud management software as part of the EIDF Maintenance sessions. + +EIDF Confidential Data Workspace VMs are provided in a state that allows easy customisation by VM Admin users to meet the requirements of their project. [Documentation is provided](docs.md) to guide VM Admin users on how to customise and manage their VMs for their own needs. + +## End of Life Policy for User Accounts and Projects + +### What Happens When an Account or Project Is No Longer Required, or a User Leaves a Project + +These will match those of the main Virtual Machine service. Please see [EIDF VM Service Policies](../virtualmachines/policies.md#end-of-life-policy-for-user-accounts-and-projects) for details. 
+ +## Backup Policies + +The current policy, matching that of the main Virtual Machine service (see [EIDF VM Service Policies](../virtualmachines/policies.md#backup-policies)), is: + +- The content of VM disk images is not backed up +- The VM disk images are not backed up + +We strongly advise that you keep copies of any critical data on one of our fully backed-up systems such as the S3 storage. + +## Patching of User VMs + +The EIDF team updates and patches the hypervisors and the cloud management software as part of the EIDF Maintenance sessions. It is the responsibility of project PIs to keep the VMs in their projects up to date. VMs running the Ubuntu and Rocky operating systems automatically install security patches and alert users at log-on (via SSH) to reboot as necessary for the changes to take effect. They also encourage users to update packages. diff --git a/docs/services/confidentialdataworkspace/quickstart.md b/docs/services/confidentialdataworkspace/quickstart.md new file mode 100644 index 000000000..58cb9093b --- /dev/null +++ b/docs/services/confidentialdataworkspace/quickstart.md @@ -0,0 +1,70 @@ +# Quickstart + +Projects using the Confidential Data Workspace cloud service are accessed via the +[EIDF Portal](https://portal.eidf.ac.uk/). This documentation closely follows that of the +[Virtual Desktop Service Quickstart](../virtualmachines/quickstart.md). + +Authentication is provided by SAFE, so if you do not have an active web browser session in SAFE, +you will be redirected to the [SAFE log on page](https://safe.epcc.ed.ac.uk). +If you do not have a SAFE account follow the instructions in the +[SAFE documentation](https://epcced.github.io/safe-docs/safe-for-users/) +on how to register and receive your password. + +## Accessing Your Projects + +1. Log into the portal at [https://portal.eidf.ac.uk/](https://portal.eidf.ac.uk/). + The login will redirect you to the [SAFE](https://safe.epcc.ed.ac.uk/). + +1. 
View the projects that you have access to + at [https://portal.eidf.ac.uk/project/](https://portal.eidf.ac.uk/project/) + +## Joining a Project + +1. Navigate to [https://portal.eidf.ac.uk/project/](https://portal.eidf.ac.uk/project/) + and click the link to "Request access", or choose "Request Access" in the "Project" menu. + +1. Select the project that you want to join in the "Project" dropdown list - + you can search for the project name or the project code, e.g. "eidf0123". + +Now you have to wait for your PI or project manager to accept your request to join. + +## Accessing a VM + +1. Select a project and view your user accounts on the project page. + +1. Click on an account name to view details of the VMs that you are allowed to access + with this account, and to change the password for this account. + +1. Before you log in for the first time with a new user account, you must change your password as described + in the section [Set or change the password for a user account](../../services/virtualmachines/quickstart.md#set-or-change-the-password-for-a-user-account). + +1. Follow the link to the Guacamole login or + log in directly at [https://eidf-vdi.epcc.ed.ac.uk/vdi/](https://eidf-vdi.epcc.ed.ac.uk/vdi/). + Please see the [VDI](../../access/virtualmachines-vdi.md#navigating-the-eidf-vdi) guide for more information. + +!!! warning + You must set a password for a new account before you log in for the first time. + +## Set or Change the Password for a User Account + +Follow these instructions to set a password for a new account before you log in for the first time. +If you have forgotten your password you may reset the password as described here. + +1. Select a project and click the account name in the project page to view the account details. + +1. In the user account detail page, press the button "Set Password" + and follow the instructions in the form. + +There may be a short delay while the change is implemented before the new password becomes usable. 
+ +## Further Information + +[Managing VMs](./docs.md): Project management guide to creating, configuring and removing Confidential Data Workspaces and managing user accounts in the portal. + +[Frequently Asked Questions](./faq.md): Common questions and solutions for working with the Confidential Data Workspace service. + +[policies](./policies.md): Policies governing the use of the Confidential Data Workspace service. + +[Virtual Desktop Interface](../../access/virtualmachines-vdi.md): Working with the VDI interface. + +[Storage and data transfer options for the Confidential Data Workspace](./storage.md) diff --git a/docs/services/confidentialdataworkspace/router-docs.md b/docs/services/confidentialdataworkspace/router-docs.md new file mode 100644 index 000000000..ce75c4cd4 --- /dev/null +++ b/docs/services/confidentialdataworkspace/router-docs.md @@ -0,0 +1,112 @@ +# Confidential Data Workspace Router Documentation + +## Details of the Confidential Data Workspace Router + +The Confidential Data Workspace Router is a virtual machine that acts as a gateway for the Confidential Data Workspace VMs. It is used to manage network access to the Confidential Data Workspace VMs and to provide a secure connection point for users to access the VMs via SSH. + +Confidential Data Workspace VMs use the Confidential Data Workspace Router as a proxy for network traffic, as such Confidential Data Workspace VMs may need some extra options around network access for software that is not installed by default. Many of these are set up on deployment of the service for a project or documented in the [FAQs](./faq.md). For software that is not installed by default the following details may be useful for users to know when setting up software on the VMs. 
+ +### Proxy Router Address + +When configuring software on the Confidential Data Workspace VMs that requires network access, the proxy address to use is given in the EIDF portal for your project under the **"Confidential Data Workspace (CDW)"** Proxy address section. + +### Proxy Ports + +HTTP and HTTPS traffic from the Confidential Data Workspace VMs is proxied through the Squid proxy server on the Confidential Data Workspace Router. The proxy server listens on the following ports: + +- HTTP: port 3128 +- HTTPS: port 3129 + +### Certificate When Using the Proxy for HTTPS Traffic + +Because the Squid proxy server on the Confidential Data Workspace Router intercepts and filters network traffic from the Confidential Data Workspace VMs, it uses a self-signed certificate to decrypt and inspect HTTPS traffic. As a result, users who access websites over HTTPS from the Confidential Data Workspace VMs may see security warnings in their web browsers. The proxy certificate is stored at `/usr/local/share/ca-certificates/extra/` and must be imported into the browser. + +## SSH Access to the Confidential Data Workspace Router + +VM Admins can access the Confidential Data Workspace Router `-router` machine via SSH using the VM Admin user for the project. This is most easily done using an SSH config file with the appropriate `ProxyCommand` configuration. + +This requires that users have already set up access to the EIDF Gateway, because connections to the router must jump through it. An example SSH config file entry for the router is shown below: + +```bash +Host eidf-gateway + HostName eidf-gateway.epcc.ed.ac.uk + User bc-eidfstaff + IdentityFile + +Host eidfxxx-router + HostName + User ubuntu + ProxyCommand ssh eidf-gateway -W %h:%p + IdentityFile +``` + +Where the `` can be found in the EIDF Portal under the project details page.
Note that the Confidential Data Workspace Proxy Address is not the IP address that you use to SSH to the router; the address listed under `machines` for `eidfxxx-router` is the correct address to use for SSH access to the router. + +SSH credentials for the router can be added in the EIDF Portal under the project details page. More information on how to do this can be found in the documentation section on [SSH Credentials in the EIDF Portal](../../access/ssh.md#generate-a-new-ssh-key). + +You can then connect to the router using the command: + +```bash +ssh eidfxxx-router +``` + +Where `eidfxxx-router` is the name of the host entry in the SSH config file for the router. + +## SSH Access to Confidential Data Workspace VMs via the Router + +VM Admins can access the Confidential Data Workspace VMs via SSH by first connecting to the project router `-router` and then jumping from there to the target VM. This is the only way to access the VMs via SSH, as direct SSH access to the VMs is disabled for security of the service. This is most easily done using an SSH config file with the appropriate `ProxyCommand` configuration, adding to the existing `~/.ssh/config` entry described above for the router. + +```bash +Host eidfxxx-VM + HostName + User + ProxyCommand ssh eidfxxx-router -W %h:%p + IdentityFile +``` + +Where the VM IP Address can be found in the EIDF Portal under the project details page, under the `machines` section for the relevant VM. + +SSH credentials for the router can be added in the EIDF Portal under the project details page. More information on how to do this can be found in the documentation section on [SSH Credentials in the EIDF Portal](../../access/ssh.md#generate-a-new-ssh-key).
+ +You can then connect to the VM using the command: + +```bash +ssh eidfxxx-VM +``` + +## Updating the Allowed Access for Confidential Data Workspace VMs + +By default the allowed list of domains allows Confidential Data Workspace VMs to: + +- Access common package repositories for operating system updates and software installation +- Access popular container registries for downloading (but not uploading) container images + - Docker Hub + - GitHub Container Registry + - EIDF Container Registry +- Access common software packages: + - CRAN + - PyPI + - Bioconductor +- Specific EIDF S3 buckets for data transfer + +Configuring the allowed list of domains for specific projects can be done only by the VM Admin through access to the Squid Router machine. + +The VM Admin with access to the Squid Router `-router` machine can edit the squid access control list. The access control list is available through the file + +```bash +-router$ /etc/squid/allowlist_domains.txt +``` + +This file contains a detailed list of allowed domains. The VM Admin can add or remove domain names following the syntax of [access control lists defined by Squid](https://wiki.squid-cache.org/SquidFaq/SquidAcl) making special note of the section '[Squid Does Not Match My Subdomains](https://wiki.squid-cache.org/SquidFaq/SquidAcl#squid-doesnt-match-my-subdomains)'. + +S3 bucket access is handled in a different location due to some technical details of allowing EIDF S3 bucket access. 
The list of allowed S3 buckets is available through the file: + +```bash +-router$ /etc/squid/allowlist_buckets.txt +``` + +After editing allowlists Squid must be reconfigured using the command: + +```bash +sudo squid -k reconfigure +``` diff --git a/docs/services/confidentialdataworkspace/storage.md b/docs/services/confidentialdataworkspace/storage.md new file mode 100644 index 000000000..15303ba2e --- /dev/null +++ b/docs/services/confidentialdataworkspace/storage.md @@ -0,0 +1,88 @@ +# Storage in the Confidential Data Workspace Service + +The EIDF Confidential Data Workspace Service provides options for privileged users to store and transfer data securely to and from the Confidential Data Workspace VMs. To move data from the EIDF Confidential Data Workspace VMs to other systems, users require an intermediate transfer storage location to reduce the access of the Confidential Data Workspace VMs to the internet and non-authorised users. Any data should be transferred off of the transfer storage location and onto the VM disk. + +## Roles and Their Access to Storage Locations + +We define three roles in the Confidential Data Workspace service as described in the [Service Documentation](./docs.md#user-roles-and-their-permissions). These roles are in place to limit who can initiate data transfers to and from the Confidential Data Workspace VMs. Typically we recommend that only VM Admin users and Data Manager users have access to transfer data to and from the Confidential Data Workspace VMs. As such the below options for storage and data transfer are recommended as only available to these roles. + +## Data Manager Group Directory + +A directory is created for users with the Data Manager role on Confidential Data Workspace VMs, this is located at `/data/datamanager`. 
This directory is owned by root and has group ownership set to a group named `-datamanager`, to which all users with the Data Manager role for the project should be added, following [Setting up the correct groups and permissions for user accounts](./docs.md#setting-up-the-correct-groups-and-permissions-for-user-accounts). + +This directory can be used as a controlled staging area for data that is being transferred to or from Confidential Data Workspace VMs, before that data is made available to additional users or copied off the VM. + +## S3 Storage + +The Confidential Data Workspace service allows users to transfer data using EIDF S3 buckets. By default, all EIDF S3 buckets except for one with the project name are disabled. VM Admins can add buckets to the allowlist to enable data ingress and egress from the machine using them. + +For each S3 bucket you want to use, you must decide whether it should be read-only or read-write from the VM. This is controlled by adding the bucket to the appropriate allowlist file: + +- To only allow pulling data into the VM from a bucket, add the bucket to the `/etc/squid/allowlist_domains.txt` file on the Confidential Data Workspace project's router (`-router`), as with access for any read-only site. Buckets on this allowlist can be accessed with read permissions only. +- To allow both pulling data into the VM and pushing data from the VM to a bucket, add the bucket to the `/etc/squid/allowlist_buckets.txt` file on the Confidential Data Workspace project's router (`-router`). Buckets on this allowlist can be accessed with both read and write permissions. + +To use S3 storage with Confidential Data Workspace VMs, an S3 repository must exist or be created in the EIDF Portal. Project leads can request an object store allocation from the EIDF helpdesk. + +Once the S3 repository is created, a bucket must be created within this repository that will be accessible from the Confidential Data Workspace VMs.
+This bucket must then be added by the VM Admin to the allowed S3 buckets list on the Confidential Data Workspace project's router, `-router`. The file to edit is located at `/etc/squid/allowlist_buckets.txt`. Each allowed bucket is added as an entry using the regex pattern shown below. + + ```txt + ^https:\/\/s3\.eidf\.ac\.uk\/ + ``` + +For example, for a bucket named `eidf-xxx-bucket`, add the bucket name to the end of the URL pattern like so: + + ```txt + ^https:\/\/s3\.eidf\.ac\.uk\/eidf-xxx-bucket + ``` + +After editing the allowlist, Squid must be reconfigured using the command: + + ```bash + sudo squid -k reconfigure + ``` + +## Data Transfer Using SCP (With and Without SSH Config) + +Data transfer to and from the Confidential Data Workspace VMs can be performed using [`scp`](https://linux.die.net/man/1/scp). + +For a user to transfer data to and from the Confidential Data Workspace VMs using `scp`, the following is required: + +- The user has the Data Manager role +- The user has been added to the EIDF gateway, Confidential Data Workspace Router `-router` and the target Confidential Data Workspace VM in the EIDF Portal +- An SSH client is installed on the user's local machine (see [SSH connection to EIDF](../../access/ssh.md) for more details) +- The user has SSH access to the Confidential Data Workspace Router `-router` + +Data transfer using `scp` is performed by jumping through the Confidential Data Workspace project's router, `-router`, which acts as an intermediary for data transfer and access. + +Because all traffic to EIDF services must first go through the gateway, the `scp` commands below perform a double jump via both the EIDF gateway and the Confidential Data Workspace router.
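As a rough illustration of the double jump, the following shell sketch only prints the `scp` command it would run, so the jump chain can be checked before any data is moved. The function name `build_scp_command`, the argument order and the example values (key path, username, `192.0.2.x` addresses) are all hypothetical placeholders; substitute the values shown in the EIDF Portal for your project.

```bash
# Hypothetical helper: prints (does not run) an scp command that jumps through
# the EIDF gateway first and the project router second.
# All arguments are placeholders supplied by the caller.
build_scp_command() {
    key_path="$1"    # private key matching the public key added in the EIDF Portal
    user="$2"        # SSH username for the gateway, router and VM
    router_ip="$3"   # router address from the `machines` section of the portal
    vm_ip="$4"       # IP address of the target VM
    local_file="$5"  # file on the local machine to copy
    printf 'scp -i %s -o ProxyJump=%s@eidf-gateway.epcc.ed.ac.uk,%s@%s %s %s@%s:/data/datamanager/\n' \
        "$key_path" "$user" "$user" "$router_ip" "$local_file" "$user" "$vm_ip"
}

# Example call with made-up values (documentation-only IP range).
build_scp_command ./id_example exampleuser 192.0.2.10 192.0.2.20 ./data.csv
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; once the printed line looks right, it can be pasted into a terminal.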
+ +### Where Users Have Set Up an SSH Config File With Configuration for the Router and VM (Recommended) + +Users should first set up an SSH config file with configuration for the router and VM, as described in the documentation section [SSH Access to the Confidential Data Workspace Router](./router-docs.md#ssh-access-to-the-confidential-data-workspace-router) and [SSH Access to Confidential Data Workspace VMs via the Router](./router-docs.md#ssh-access-to-confidential-data-workspace-vms-via-the-router). After this is configured, they can use a simplified `scp` command. In this case, the `scp` command is: + + ```bash + scp eidfxxx-VM:/data/datamanager/ + ``` + +Where: + +- `eidfxxx-VM` is the Host defined in the user's SSH config file for the target Confidential Data Workspace VM +- `/data/datamanager/` is the destination path on the Confidential Data Workspace VM where the file will be copied to +- `` is the path to the file on the user's local machine + +### Where Users Do Not Have an SSH Config File Set Up With Configuration for the Router and VM + +To transfer data **to** the Confidential Data Workspace VMs from a local machine using `scp`, the following command format should be used: + + ```bash + scp -i -o ProxyJump=@eidf-gateway.epcc.ed.ac.uk,@ @:/data/datamanager/ + ``` + +Where: + +- `` is the path to the user's private SSH key that corresponds to the public SSH key added for the user in the EIDF Portal for access to the router and VM, more information on how to do this can be seen in the documentation section on [SSH Credentials in the EIDF Portal](../../access/ssh.md#generate-a-new-ssh-key). +- `` is the user's SSH username for EIDF gateway, router and the Confidential Data Workspace VM +- `` is the IP address of the Confidential Data Workspace Router for the project. This IP address can be found in the EIDF Portal on the project details page, in the `machines` section, for the router machine named `-router`. 
+- `` is the path to the file on the user's local machine +- `` is the IP address of the Confidential Data Workspace VM +- `/data/datamanager/` is the destination path on the Confidential Data Workspace VM where the file will be copied to diff --git a/docs/services/s3/tutorial.md b/docs/services/s3/tutorial.md index 20298e1bc..76cf3b1f0 100644 --- a/docs/services/s3/tutorial.md +++ b/docs/services/s3/tutorial.md @@ -40,6 +40,9 @@ export AWS_SECRET_ACCESS_KEY= export AWS_ENDPOINT_URL=https://s3.eidf.ac.uk ``` +!!! Important + Confidential Data Workspace users should see the [Confidential Data Workspace FAQs](../confidentialdataworkspace/faq.md#aws-cli) for additional steps on how to set up AWS CLI in the Confidential Data Workspace environment. + ### Commands List the buckets in your account: diff --git a/docs/services/virtualmachines/index.md b/docs/services/virtualmachines/index.md index d1730f16f..739083347 100644 --- a/docs/services/virtualmachines/index.md +++ b/docs/services/virtualmachines/index.md @@ -14,8 +14,6 @@ The shapes and sizes of the flavours are based on subdivisions of this hardware, Users should have an EIDF account - [EIDF Accounts](../../access/project.md). -Project Leads will be able to have access to the DSC added to their project during the project application process or through a request to the EIDF helpdesk. - ## Additional Service Policy Information Additional information on service policies can be found [here](policies.md). diff --git a/docs/services/virtualmachines/quickstart.md b/docs/services/virtualmachines/quickstart.md index 1d815f80d..9a8376a5d 100644 --- a/docs/services/virtualmachines/quickstart.md +++ b/docs/services/virtualmachines/quickstart.md @@ -7,7 +7,7 @@ Authentication is provided by SAFE, so if you do not have an active web browser you will be redirected to the [SAFE log on page](https://safe.epcc.ed.ac.uk). 
If you do not have a SAFE account follow the instructions in the [SAFE documentation](https://epcced.github.io/safe-docs/safe-for-users/) -how to register and receive your password. +on how to register and receive your password. ## Accessing your projects diff --git a/mkdocs.yml b/mkdocs.yml index a962215df..18b47c03b 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -60,6 +60,14 @@ nav: - "VM Flavours": services/virtualmachines/flavours.md - "Policies": services/virtualmachines/policies.md - "RDP tunnelling over SSH": services/virtualmachines/rdp-tunnelling.md + - "Confidential Data Workspace": + - "Overview": services/confidentialdataworkspace/index.md + - "QuickStart": services/confidentialdataworkspace/quickstart.md + - "Managing VMs": services/confidentialdataworkspace/docs.md + - "Router Configuration": services/confidentialdataworkspace/router-docs.md + - "Policies": services/confidentialdataworkspace/policies.md + - "Storage": services/confidentialdataworkspace/storage.md + - "FAQs": services/confidentialdataworkspace/faq.md - "Ultra2": - "Overview": services/ultra2/access.md - "Connect": services/ultra2/connect.md