diff --git a/doc/deployment/deploy/NPort-install.md b/doc/deployment/deploy/NPort-install.md deleted file mode 100644 index 4b7c9eb28..000000000 --- a/doc/deployment/deploy/NPort-install.md +++ /dev/null @@ -1,23 +0,0 @@ -# Moxa NPort - -To install/upgrade NPort driver: - -1. Use remote desktop because you need admin privileges -1. Open NPort admin and note the IP address, Com ports and moxa ports for the current config (NB there may be multiple moxas) -1. Uninstall NPort using windows add/remove programs -1. Install the latest NPort software from `...\installs\Installs\Applications\LabVIEW\Other bits\LabVIEW related\MOXA\MOXA NPort` -1. Open NPort Admin -1. Click Add -1. Search for the NPort (click stop when it times out) -1. Clear All, then select the moxa you are using -1. Click OK, do not activate ports -1. Select all ports -1. Click Settings -1. Under basic settings Click "Auto Enumerating COM Number for Selected Ports" -1. Set Com to correct port (probably port 5) -1. Click OK. -1. Click Apply. -1. Ignore the message about setting moxa to real port. -1. Close NPort administrator - -If, upon clicking Apply, you are told the port is in use, you may need to use NPort Driver Manager instead; it can be found at `\\isis\inst$\Kits$\CompGroup\Utilities\MOXA Nport Software` diff --git a/doc/systems/External.md b/doc/systems/External.md deleted file mode 100644 index 126c228af..000000000 --- a/doc/systems/External.md +++ /dev/null @@ -1,10 +0,0 @@ -# External systems/services - -This information documents the computers on which parts of IBEX may depend, but which are not directly within the team's control. 
- -```{toctree} -:glob: -:titlesonly: - -external/* -``` \ No newline at end of file diff --git a/doc/systems/Nagios.md b/doc/systems/Nagios.md deleted file mode 100644 index f87f6b327..000000000 --- a/doc/systems/Nagios.md +++ /dev/null @@ -1,11 +0,0 @@ -# Nagios - -For information about the setup of the nagios server see [here](https://stfc365.sharepoint.com/sites/ISISExperimentControls/_layouts/15/Doc.aspx?sourcedoc=%7BA6F9D67A-4FC8-493C-9834-995CF4044F27%7D&file=nagios_server.docx&action=default&mobileredirect=true&DefaultItemOpen=1). - - -```{toctree} -:glob: -:titlesonly: - -nagios/* -``` diff --git a/doc/systems/Shadow.md b/doc/systems/Shadow.md deleted file mode 100644 index 8e706b106..000000000 --- a/doc/systems/Shadow.md +++ /dev/null @@ -1,22 +0,0 @@ -# Shadow - -Shadow is a Linux machine which hosts various services. Credentials are in Keeper. - -## Apache - -Apache is running as the main http server, it proxies requests running to other services ie. Jenkins on shadow. - -Configuration is located at `/etc/httpd/conf.d/`. - -## Jenkins - -Shadow hosts the [Jenkins instance.](https://epics-jenkins.isis.rl.ac.uk/) and is the built-in node. - -This runs as a `systemd` service called `jenkins`, its configuration is stored in `/isis2/jenkins/instances/epics/`. - -## Site-mirrored Wikis - -These are currently being built by Github actions, Shadow executes periodic cron jobs (run `sudo su - isisupdate` then `crontab -e` to see these) which pull the `gh-pages` branches of the Github wikis. - -The cron jobs run `/isis/scripts/update_x` where `x` is the name of the wiki. - diff --git a/doc/systems/Sparrowhawk.md b/doc/systems/Sparrowhawk.md deleted file mode 100644 index f108d4b7e..000000000 --- a/doc/systems/Sparrowhawk.md +++ /dev/null @@ -1,17 +0,0 @@ -# Sparrowhawk - -`sparrowhawk` is a server that hosts instrument journals, and a bug submission page. The credentials to access the -system are in Keeper. 
- -For operating system help - for example the operating system not booting - contact "Technology IT Support" in outlook. - -## Instrument Journals - -See [Journal Viewer Data](/system_components/journal_viewer/Journal-Viewer-Data) for details. - -## Instrument bug submission page - -### Updating scientist email addresses - -The scientist emails are defined in `/isis/www/html/sutekh/footprints/cgi/inst_globals.py`. The emails on the webpage -are updated from this file by a cron job (execute `crontab -e` as root to see exactly what it runs) diff --git a/doc/systems/Test-Machines.md b/doc/systems/Test-Machines.md deleted file mode 100644 index 2df21184d..000000000 --- a/doc/systems/Test-Machines.md +++ /dev/null @@ -1,24 +0,0 @@ -# Test Machines - -Test machines are currently virtual machines running on ... to access them: - -1. Run Hyper-V Manager using your admin account (you may have to install it) -1. Connect to the server ndhspare70 (use right click, I think I used my admin credentials and Chris had to enable access for me) - -## To reset them - -1. Click on the server (left) -1. Click the Virtual machine you want to reset (middle) -1. Right click on the Clean for Clone check point and click `apply ...` -1. Click Apply yes I do -1. Right click and delete checkpoint sub tree (above clean for clone) - this removes history -1. Bring up the machine console (double click on the machine name) -1. Click power on (green button left on toolbar) -1. Wait for boot -1. Task Sequence: Build Basic W7 -1. Computer Details: Set machine name and next -1. Move Data and Settings: Do not move -1. User Data: Do not restore -1. Administration Password: Leave as is -1. Ready: Begin -1. 
Leave to boot diff --git a/doc/systems/Webserver.md b/doc/systems/Webserver.md deleted file mode 100644 index b85de00c7..000000000 --- a/doc/systems/Webserver.md +++ /dev/null @@ -1,25 +0,0 @@ -# Webserver - -## NDAEXTWEB3 - -NDAEXTWEB3 is a central Windows 2019 server, which is hosted on the FIT Hyper-V cluster, that runs a number of the web services associated with IBEX. The login credentials for this are in sharepoint. The server holds: - -* The [old IBEX web dashboard (JSON_Bourne)](/webdashboard/Web-Dashboard) -* The [Automation application](/processes/git_and_github/Automation-Application) -* [MCR news](https://www.isis.stfc.ac.uk/Pages/MCR-News.aspx) -* The central proxy created [here](https://github.com/ISISComputingGroup/IBEX/issues/5112) - -Most of these services can be started and stopped by the [IIS Manager](https://www.iis.net/). To access the IIS Manager, select IIS in the Server Manager, then click on Manage in the top right hand corner of that screen. - -There should be 2 sites, Dataweb and WAP, which provide the above. - -If a new server is setup, then ciphers and old TLS versions may need to be disabled e.g. tls 1.0 and 1.1 as per https://docs.microsoft.com/en-us/windows-server/identity/ad-fs/operations/manage-ssl-protocols-in-ad-fs - - -## NDAEXTWEB4 - -NDAEXTWEB4 is similar in that it also runs on the Hyper-V cluster. This currently hosts the [PVWS](/webdashboard/PVWS) which is a PVWS instance for the IBEX web dashboard. - -## Shadow - -Some web services run on Shadow - see [here](/systems/Shadow) for more information. diff --git a/doc/systems/control-svcs.md b/doc/systems/control-svcs.md deleted file mode 100644 index d81582761..000000000 --- a/doc/systems/control-svcs.md +++ /dev/null @@ -1,62 +0,0 @@ -# `control-svcs` - -control-svcs.isis.cclrc.ac.uk - -To log on use an SSH client such as Putty and see usual username/password page for credentials. 
- -The machine runs various services, including: -* EPICS gateways for running data between the halls -* ArchiveEngine for storing central data for beam currents etc. -* Git repositories for storing configs etc. (see https://control-svcs.isis.cclrc.ac.uk/git/) -* [Experiment database populator](https://github.com/ISISComputingGroup/ExperimentDatabasePopulator) -* The [Alert Relay](control_svcs/Alert-Relay) -* [Isis info slack channel bots](/webdashboard/ISIS-Info-Slack) - -None of these services are crucial for running instruments to continue taking data. The services log to `\home\var\` and can be restarted by killing the process as they run in procserv. - -These run under the "epics" user id so -``` -sudo su - epics -``` -and give your "isissupport" password when prompted - -Depending on what you need to do, you may need to modify system init scripts to start services at boot time. - -## Troubleshooting - -### Whole VM is down - -The best bet at diagnosing/fixing this is to [download VM Manager console](https://stfc365.sharepoint.com/sites/ISISExperimentControls/_layouts/15/Doc.aspx?sourcedoc=%7B289EA683-832D-4CC5-936B-A409AB379AF7%7D&file=vmm%20console.docx&action=default&mobileredirect=true&DefaultItemOpen=1) - ask Facilities IT if you have problems. The machine has appeared to be down in the past (unable to ping etc.) but has actually been running OK (access via VM console) - this seems to happen if the VM is "live migrated" to a new node on the cluster. Facilities IT are now aware of this and should inform us first, after the migration we may need to reboot to restore network connectivity. 
- -### Webserver is down - -Log onto control services and check if the webserver service is running using -``` -sudo systemctl status httpd -``` - -If it is down, it can be restarted with the following standard commands (these commands are also used to start/auto-restart any other linux service, substituting `httpd` with the name of the appropriate service): -``` -sudo systemctl start httpd -sudo systemctl enable httpd -``` - -If you need to stop the service first use -``` -sudo systemctl stop httpd -``` - -The logs are located in `/var/log/httpd` which may be useful for troubleshooting if a simple restart does not solve the issue. - -### Gateways unavailable / Beam Logger details not updating - -If PVs are visible in R3 then probably the gateway on the control service machine need to be updated. - -## Further Information - -```{toctree} -:glob: -:titlesonly: - -control_svcs/* -``` diff --git a/doc/systems/control_svcs/Alert-Relay.md b/doc/systems/control_svcs/Alert-Relay.md deleted file mode 100644 index 2a93cfca5..000000000 --- a/doc/systems/control_svcs/Alert-Relay.md +++ /dev/null @@ -1,27 +0,0 @@ -# Alert Relay - -This is a python file `/isis/cgi/sendmess.py` living on control-svcs and invoked as `http://control-svcs.isis.cclrc.ac.uk/isiscgi/sendmess.py` - -It is only possible to invoke this file when on RAL site, and you also need to pass a correct passcode (see IBEX access sharepoint) as one of the parameters for it to send a message. The parameters passed are: -``` -mobiles: a semicolon separated mobiles list for sms text -emails: a semicolon separated email list -pw: correct passcode -inst: instrument name for slack/teams alerts -message: the message -``` - -An IOC usually posts to this service from the :AC: (alert control) system loaded by the run control IOC - -To do the alerting it will use `sendAlert.db` from `support/webget`. This combines several PVs with mobiles, emails etc. 
and encodes into the format required for a http POST using the aSub function `webFormURLEncode`. It then sends this to control-svcs using the aSub function `webPOSTRequest` - -To see the debug output of what `sendAlerts.db` is sending set `epicsEnvSet("WEBGET_POST_DEBUG","1")` in the `st.cmd` of the runcontrol IOC. This is suppressed by default as it may contain sensitive user data. - -The PVs in IBEX which hold the URL & password for sending alerts are in the form: - -``` -%MYPVPREFIX%CS:AC:ALERTS:URL:SP -%MYPVPREFIX%CS:AC:ALERTS:PW:SP -``` - -There is further documentation in the [`RUNCTRL` README file](https://github.com/ISISComputingGroup/EPICS-ioc/blob/master/RUNCTRL/README.md). diff --git a/doc/systems/external/Beam-Status,-Shutter,-accelerator-and-moderator-information.md b/doc/systems/external/Beam-Status,-Shutter,-accelerator-and-moderator-information.md deleted file mode 100644 index 2308d14d1..000000000 --- a/doc/systems/external/Beam-Status,-Shutter,-accelerator-and-moderator-information.md +++ /dev/null @@ -1,124 +0,0 @@ -# Merckx (Accelerator Information) - -Information about the beam current and instrument shutter status is stored in the main accelerator control computer system, though a shutter is local to an instrument it is part of a safety system and we do not have direct access to it ourselves. Also the accelerator computer system can only read main shutter status - opening/closing a main shutter can only be performed using a physical button in the cabin. - -This information is fed from an IOC running on a machine on the accelerator network (merckx.isis.rl.ac.uk). This is a [Open VMS](https://en.wikipedia.org/wiki/OpenVMS) machine with the EPICS distribution from [here](https://github.com/ISISComputingGroup/EPICS-VMS/). The IOC is set to run on boot time and is auto-restarted if it is not present, it will also auto-restart if it receives too many errors, but some failures can cause it to hang. 
- -**If the system restarts itself, then there will be a brief loss of PVs for beam current/shutters/moderator temp to instruments; some will go into a WAITING state when this happens if run control is enabled on the block** - -If you need to contact the accelerator controls team, look for "ISIS Controls (Support)" in outlook - -You can see the current error counts in nagios for the merckx system or via -``` -camonitor ICS:IB:ERRCNT ICS:IB:CHANERRCNT -``` -You can log onto this machine using details on usual access page (you will need to use ssh via something like PuTTY, or git bash with `-oHostKeyAlgorithms=+ssh-dss` as an argument) - - -The most likely cause of a problem is that the local database has stopped updating, thus giving an unchanging value. You can check the server log file with: -``` -cd beamlogdir -type isisbeam.log -``` -And see if there are errors about parameters not updating. Probably the easiest thing to do is to kill the service and let it restart, then see if errors continue in the log. First type: -``` -pipe sh sys | sea sys$input isisbeam -``` -you will see a line like -``` -26E12DAA ISISBEAM LEF 6 132040 0 00:00:01.40 284 241 -26F991FC ISISBEAM_1 HIB 4 2416385 0 00:08:24.20 9589 3173 MS -``` -[_ISISBEAM_1_ is a sub-process of ISISBEAM and will die when you kill _ISISBEAM_] -The first number is the process id, in this case type -``` -stop /id=26E12DAA -``` -to kill it, then wait for it to restart (may take up to 30 seconds). Use the above `pipe` command to see when it has restarted, and then check isisbeam.log again. Look for messages after the `Starting iocInit` line in the file - -## more complicated details - -The asyn parameters that are served by the IOC are mapped to VISTA parameters (this is the accelerator control system). 
You can see the mapping with: -``` -type params.txt -``` -a line like -``` -beam_ions float t 0 IDTOR::IRT1:CURRENT -``` -means asyn parameter `beam_ions` (in the IOC Db files) is mapped to VISTA parameter `IDTOR::IRT1:CURRENT`. The other columns are related to data type and how the program tries to check for stale (non updating) values. If the `isisbeam.log` indicates a huge number of errors for a particular parameter, then this could affect reading other parameters - after a certain number of errors the program restarts, but if it starts restarting too frequently this can cause PVs never to reconnect properly. In that case you _may need to temporarily remove a line_, but seek advice first. - -You can read the VISTA parameter directly on MERCKX if you think the issue is with the IOC e.g. -``` -db_access IDTOR::IRT1:CURRENT -db_access t1shut::n1_overview:sta -``` -you can search for errors about a particular parameter by e.g. -``` -sea isisbeam.log IDTOR::IRT1:CURRENT -``` -If you killed the `ISISBEAM` process above and it has restarted, then the `isisbeam.log` file will only contain values since that restart. You can look at the previous log file by adding `;-1` to the file name -``` -sea isisbeam.log;-1 IDTOR::IRT1:CURRENT -``` -To see all log file versions type `dir isisbeam.log` - -If something does appear to have gone wrong with this service (e.g. values are not updating) you should get in touch with the _accelerator controls group_. The easiest way to do this is to call the MCR and find out who is on call. - -## Beam current block shows 0 - -A "Beam current" block may not be showing the accelerator beam current, it may be showing the effective beam current from the DAE. Blocks read from the accelerator will all be referring to global PV names starting AC: or TG: so if the block refers to an IN: it will be something on the local instrument. 
The DAE:BEAMCURRENT value is the effective DAE beam current, but if the dae is not counting (SETUP, WAITING, PAUSED) then this value will be zero. If the DAE is vetoing this value will vary between 0 and something else depending on what % of frames are being vetoed. If the chopper is being run at a lower frequency, the value will be lower too as the DAE is seeing less pulses and hence a lower effective beam current. - -## Value shows zero in IBEX but non-zero with `db_access` on MERCKX - -If the third column in `params.txt` contains a `z` (e.g. `tz`) then this means that the parameter will be monitored for a stale (non updating) state and if this is detected it will send 0 as the value to IBEX. At time of writing this had only been requested for the decoupled methane, sending 0 when the value is uncertain means they will go into a WAITING state as they run control on methane temperature and it is important that they are not collecting data when a methane charge-change happens. In future the value could be EPICS alarmed, but for SECI instruments we needed to send 0. - -You may be able to confirm a value is not updating by running `db_access` on it a few times with a reasonable time delay in-between, but some values are quite stable or fluctuate only a bit so this may be difficult to determine. You can view the typical value and variation in an accelerator parameter by following the links on values at [http://beamlog.nd.rl.ac.uk/status.xml](http://beamlog.nd.rl.ac.uk/status.xml) - -**If this is after a shutdown**, check the `st.cmd` and see if `epicsEnvSet("SIM_ISISBEAM", "1")` has been uncommented to stop out of cycle errors, if so comment it out and then kill the ISISBEAM process so it can restart - -## Checking channel access on MERCKX -if you type -``` -dir [--.db] -``` -you will see all the db files used by the program, you can display one to screen using e.g. 
-``` -type/page [--.db]isisbeam.db -``` -On MERCKX `$(P)` is "" so there is no prefix to PVs you see listed. To use `caget` you first need to set an epics address list as broadcasts do not work (they require privileges). So type -``` -def EPICS_CA_ADDR_LIST merckx -``` -and then you can use `caget` or `camonitor` on values e.g. -``` -camonitor TG:TS2:DMOD:METH:TEMP -``` -You can [browse the Db file source on the web](https://github.com/FreddieAkeroyd/EPICS-VMS/tree/master/ioc/ISISBEAM/isisbeamApp/Db) - -## intermittent dropouts - -The program will restart after too many errors are detected. If you run: -``` -camonitor ICS:IB:ERRCNT ICS:IB:CHANERRCNT -``` -`ERRCNT` is the total number of channel reads errors since last successful read, `CHANERRCNT` is the number of channels currently in error. When `ERRCNT` passes a threshold, the program restarts and you will see these PVs as well as other beam PVs briefly become disconnected. Note that `CHANERRCNT` may be non-zero while `ERRCNT` remains 0, this means that there isn't an actual read error on the channel, but it is considered in error for another reason. You would typically need to look at `isisbeam.log` to tell. This usually means the program thinks the channel is `stale`. Each channel, even if it does not change value, should have its timestamp updated by the accelerator control system. Also things like beam current are flagged as suspicious if their value (when non-zero) is exactly the same value for a long period of time. - -## nothing working - -check `isisbeam.log` but could be a scaled up version of intermittent dropouts leading to extremely frequent restarts and so no time for PVS to get connected. In bad cases you may need to remove lines from the `params.txt` file described above to stop the erroring reads being attempted. 
- -## things not updating - -`db_access LOCAL::BEAM:TARGET2` - -on merckx will show the local database TS2 beam, if this seems unusually stable then the accelerator controls vista system may have frozen. - -## server or services unavailable (nagios) - -Check for `merckx` in nagios, put it in the quick search box on the nagios top page. The `TS2 Beam Current Updating` check should reflect similar to the `db_access LOCAL::BEAM:TARGET2` - -If nagios shows `merckx` is down: -* during office hours email "ISIS Controls (support)" in the outlook address book and tell them that "The VMS MERCKX server is unreachable and instruments cannot access beam and moderator information that is important for running" -* out of hours contact the MCR and ask them to contact the on call ISIS accelerator controls computing person - diff --git a/doc/systems/external/ICAT-Troubleshooting.md b/doc/systems/external/ICAT-Troubleshooting.md deleted file mode 100644 index 5899dcd6d..000000000 --- a/doc/systems/external/ICAT-Troubleshooting.md +++ /dev/null @@ -1,11 +0,0 @@ -# ICAT / TopCAT - -**ICAT** is a metadata catalogue of ISIS (and other facilities' data). **TopCAT** is the web interface to this catalogue. - -For more information, see the [ICAT Project Website](https://icatproject.org/). - -Occasionally, the process which ingests ISIS data into the ICAT catalogue stops and the main symptom of this is that new data isn't available via TopCAT. This is what users notice and may call us about. - -Unfortunately, the server that runs the ingestion process is managed by SCD and we have no control over it (although it is on the ISIS network). There are various NAGIOS checks provided by SCD that we run on our server which notify the relevant people of any problems. 
- -There is an ICAT/TopCAT entry in the category dropdown box on the [error reporting page](http://sparrowhawk.nd.rl.ac.uk/footprints/) that can be used to submit a problem (will be forwarded to Computing Group), or the SCD team can be contacted directly via their [email address](mailto:isisdata@stfc.ac.uk). \ No newline at end of file diff --git a/doc/systems/external/IDAaaS.md b/doc/systems/external/IDAaaS.md deleted file mode 100644 index 0d82f8d38..000000000 --- a/doc/systems/external/IDAaaS.md +++ /dev/null @@ -1,7 +0,0 @@ -# IDAaaS - -To access IDAaaS see details at https://www.isis.stfc.ac.uk/Pages/Connecting-to-isiscomputendrlacuk-using-NoMachine.aspx - -You need to log in with user office credentials, which can be your fed id or user@stfc.ac.uk email address. However you need to make sure you have a user office account id and that this is linked to your fed id. - -If you haven't signed up for a User Office account before, you should be able to request one using your stfc email address at https://users.facilities.rl.ac.uk/ Once you have done that, if you log in and select "My Details" at the top of the page, you should see a button on the left hand side saying "Request Fed ID". This will give you an option to link it to your Federal ID. \ No newline at end of file diff --git a/doc/systems/inst_control/Archive-Watcher.md b/doc/systems/inst_control/Archive-Watcher.md deleted file mode 100644 index feafc3b49..000000000 --- a/doc/systems/inst_control/Archive-Watcher.md +++ /dev/null @@ -1,23 +0,0 @@ -# Archive Watcher - -This program monitors the `data$` share on the NDX instrument specified and copies across new files to the local computer it is running on. It is installed as a windows service. - -Run setup.exe from `\\isis\shares\ISIS_Experiment_Controls_Public\archive_watcher`; it will install to `c:\Program Files (x86)\STFC ISIS Facility\ISIS Archive Watcher` and register as a service. If you get issues with setup.exe, just run msi on its own. 
After install create an admin window and run the `postinstall` bat script in the install directory. You then need to follow `README_POSTINSTALL.rtf` in this directory, you will create an `archive_watcher.properties` with relevant content. For musr this was -``` -watcher.instrument = musr -watcher.fileprefix = musr -watcher.localdir = c:/data -watcher.copylogs = true -watcher.digits = 8 -watcher.muonautosave = true -## if you want files pulled to this machine to be pushed elsewhere too, define watcher.pushto -watcher.pushto = file:////ndavms/musrdata -## if you want to override the default \\ndxinstrument\data$ file source define watcher.sharename -# watcher.sharename = -``` -`watcher.pushto` is not needed in most cases, `watcher.muonautosave` set to false unless on a muon instrument. watcher.digits is 5 or 8, it is the number of digits in your raw file run number. `copylogs` is whether to copy the *.log/*.txt files as well as the raw data file. - - - - - diff --git a/doc/systems/inst_control/Automated-log-rotation.md b/doc/systems/inst_control/Automated-log-rotation.md deleted file mode 100644 index 05cc34be5..000000000 --- a/doc/systems/inst_control/Automated-log-rotation.md +++ /dev/null @@ -1,78 +0,0 @@ -# Automated log rotation - -IBEX logs are rotated by a script called `logrotate.py` in `C:\instrument\apps\epics\utils\logrotate.py`. This script is called nightly by a windows scheduled task. - -Currently, for all files not modified within the last 10 days, it: -- Deletes them if they are console logs -- Moves them to `stage-deleted` otherwise - - -The scheduled task is added/recreated as an install step in `ibex_utils` under `server_tasks`. - -A nagios check will use the same script, but in dry-run mode and with a cutoff of 14 days, to notify us if the log rotation stops working. - - -## Automated periodic database message log truncation -Console logs are written to a `msg_log` database. 
After a while, the number of messages can become huge, especially if the console logging frequency is high. To mitigate this, an automated message log truncation procedure has been created. A scheduled SQL event calls a procedure to truncate the database `message` table to retain just those records spanning a predefined retention period. -The table's primary key is just the ID number, so searching on the `createTime` column for the row at the retention period threshold would be inherently slow. Instead a binary search algorithm has been developed which, knowing the earliest and last record timestamps, will check whether the target time lies above or below the current binary division and iterate until the `createTime` field of the target record is within three rows of the low and high row search limits. The `binary_search_time()` procedure is in the `binary-search.sql` file. -The `binary_search_time()` procedure takes one input parameter and returns two output parameters: - -**Binary Search**: -``` - IN p_retention_period time -- The retention period as a time type (e.g. '336:00:00' == 2 wks) Note that this is limited in SQL to '838:59:59', about 35 days - OUT p_first_row_id -- The id value of the earliest row in the table (zeroth row). - OUT p_row_number INT -- The returned record number in the table which is closest to the target time. - This will be a reference to a variable provided by the calling scope. -``` -The procedure `truncate_message_table()` calls the binary search procedure and uses the returned first row ID and target row ID to determine the rows to be deleted. Potentially the number of rows to be deleted could be very large, in some cases millions, so it is important that the database server is not locked for significant periods. Instead, rows are deleted in blocks of, say 1000 rows at a time, with a sleep period in between. This makes the deletion process cooperative and prevents data loss and timeouts from other database clients. 
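
The search-then-delete-in-blocks approach described above can be sketched in Python. This is only an illustration, not the SQL in `binary-search.sql`: the names `find_cutoff_id`, `deletion_batches` and `time_of` are hypothetical, ids are assumed to be contiguous with timestamps that never decrease as the id increases, and the search here runs to an exact boundary rather than stopping within three rows as the real procedure does.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterator, Tuple


def find_cutoff_id(first_id: int, last_id: int,
                   time_of: Callable[[int], datetime],
                   cutoff: datetime) -> int:
    """Return the smallest id in [first_id, last_id] whose timestamp is >= cutoff.

    Returns last_id + 1 if every row is older than cutoff. Assumes contiguous
    ids and timestamps that never decrease as the id increases (hypothetical).
    """
    low, high = first_id, last_id + 1
    while low < high:
        mid = (low + high) // 2
        if time_of(mid) < cutoff:
            low = mid + 1   # row is too old: the boundary lies above mid
        else:
            high = mid      # row is recent enough: the boundary is mid or below
    return low


def deletion_batches(first_id: int, cutoff_id: int,
                     batch_size: int = 1000) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start_id, end_id) ranges covering all ids below cutoff_id."""
    start = first_id
    while start < cutoff_id:
        end = min(start + batch_size, cutoff_id) - 1
        yield (start, end)
        start = end + 1
```

A caller would delete one yielded range at a time and pause between ranges, so that other database clients are not blocked for long; only about log2(N) timestamp lookups are needed to locate the boundary, even for millions of rows.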
-The `truncate_message_table()` procedure takes one input parameter and returns two output parameters, in the same way as the binary search procedure: - -**Truncation procedure**: -``` - IN p_retention_period time -- The retention period as a time type (e.g. '336:00:00' == 2 wks) Note that this is limited in SQL to '838:59:59', about 35 days - OUT p_first_row_id -- The id value of the earliest row in the table (zeroth row). - OUT p_row_number INT -- The returned record number in the table which is closest to the target time. - This will be a reference to a variable provided by the calling scope. -``` - -The `log_truncation_event()` is triggered periodically, typically once per day at 01:00 and calls the `truncate_message_table()`, supplying the retention period in days. Within the `CREATE EVENT` block, there are two STARTS lines, one of which is commented out and can be instated for testing with a short interval. - -Files specific to automatic truncation are: - -| File | Description | -| -----| ----------- | -| binary_search.sql | Defines the binary_search_time() procedure | -| truncate_message_table.sql | Defines the truncate_message_table() procedure | -| truncate_event.sql | Defines the log_truncation_event() event | -| create_event_logger.sql | Defines the EventsLog table for auto truncation process logging | -| debug_log.sql | Defines the debug_log procedure appending info to the process log | -| test\test_truncate_proc.sql | A SQL script to quickly test the truncate_message_table() procedure | -| test\test_fill_message.sql | A SQL script to populate the message table with records assigned a `createDate` field value at one hour intervals over the given period (period_days) | - -## Testing -There is a fairly comprehensive README.md file in C:\Instrument\Apps\EPICS\ISIS\IocLogServer\master\tests - -For new systems with no pre-exiting database, the described test procedure can be invoked once the msg_log database has been built by running: 
`C:\Instrument\Apps\EPICS\SystemSetup\config_mysql.bat`. - -For systems with an existing msg_log database, there is a SQL script which simply creates the SQL event and procedures necessary for the automatic truncation, without affecting any existing database content: -cd C:\Instrument\Apps\EPICS\ISIS\IocLogServer\master -C:\Instrument\Apps\MySQL\bin\mysql.exe -u root --password= < log-truncation-schema.sql - -**Dummy data**: - -There is a script : SQL file tests\test_fill_message.sql which will insert dummy log records into the message table. Message records will be assigned a `createDate` field value at one hour intervals over the given period (period_days). These parameters are configurable within the script. - -`C:\Instrument\Apps\MySQL\bin\mysql.exe -u root --password= < tests\test_fill_message.sql` - -**Run the test**: - -After filling the message table, it is then possible to test the truncation procedure via: -`C:\Instrument\Apps\MySQL\bin\mysql.exe -u root --password= < tests\test_truncate_proc.sql` -Metrics are output to the screen. It can also be helpful to manually examine the message table to view the remaining data and check the time span. - -Note that the checks that the script is running correctly is by examining a new database table debug_messages, which is emptied before each truncation process. Details of the process are recorded in the table as simple text, which includes information on the progress of the binary search procedure. 
- -If you want to update a procedure, simply edit the SQL file in the `C:\Instrument\Apps\EPICS\ISIS\IocLogServer\master` directory or `tests\` subdirectory, then use the mysql executable to do the change to the database accordingly, e.g.: - -`C:\Instrument\Apps\MySQL\bin\mysql.exe -u root --password= < truncate_event.sql` - diff --git a/doc/systems/inst_control/Change-Windows-Theme.md b/doc/systems/inst_control/Change-Windows-Theme.md deleted file mode 100644 index 1f9b63e7c..000000000 --- a/doc/systems/inst_control/Change-Windows-Theme.md +++ /dev/null @@ -1,16 +0,0 @@ -# Change Windows Theme - -To change from Windows Classic mode to a more modern look on a Windows 7 PC - -1. From the Windows Start menu, select Control Panel -1. If viewing the Control Panel by category - 1. Choose Appearance & Personalisation - 1. Select Change the Theme from the Personalisation section - 1. Select the Windows 7 theme -1. If viewing the Control Panel by icons (large or small) - 1. Choose Personalisation - 1. Select the Windows 7 theme. -1. Now set the background to the standard neutral Instrument background (Grey) - 1. Select Desktop Background (Harmony) - 1. From the Dropdown, select `Solid Colors` (instead of Windows desktop backgrounds). - 1. Select the third cell in from the top (a light grey) and "Save Changes" diff --git a/doc/systems/inst_control/Computer-Troubleshooting.md b/doc/systems/inst_control/Computer-Troubleshooting.md deleted file mode 100644 index 314b9ce6e..000000000 --- a/doc/systems/inst_control/Computer-Troubleshooting.md +++ /dev/null @@ -1,140 +0,0 @@ -# Computer Troubleshooting - -Issue related to the computer that IBEX is running on - - -## Screen Resolution needs to be Set on a small screen to view a larger screen remotely - -The resolution is settable on a remote desktop, even to a resolution bigger than the current screen. To do this there is a menu item on the remote desktop window for “smart sizing” which does just this. 
- -![smart screen](rdp_smart_screen.png) - -It doesn't seem to persist on Server 2012 unless you edit it into the `.rdp` file (and it also requires that you don’t select full screen or it takes a lower resolution). I’ve edited the `.rdp` file appropriately on the NDHSMUONFE desktop so that this works at 1920x1200. The key bits are at the top here, but the resolution can be reduced a little if a different aspect ratio works better. - -``` -screen mode id:i:1 -use multimon:i:0 -desktopwidth:i:1920 -desktopheight:i:1200 -smartsizing:i:1 -... -``` - -## Cannot access the network shares - -This may be solved by adding Windows credentials on the machine. There is a document describing how to do this on ICP Discussions under "Security". - -## Cannot access just the ISIS archive (but can access the instrument shares as above) - -This will be identified by a failure to access archive shares even though access to the normal ISIS document shares **is** possible. -This may be solved by adding a global DNS suffix to the network connection. Open an admin-privileged PowerShell console, then use the get command to check what is in the existing suffix list: - -``` -Get-DnsClientGlobalSetting - -UseSuffixSearchList : True -SuffixSearchList : {domain1.ac.uk, domain2.ac.uk} -UseDevolution : True -DevolutionLevel : 0 -``` - -Now run the following command to prepend the fully qualified domain (in our case the ISIS domain) to the suffix search list returned by the previous command (similar syntax to the example below). - -`PS C:\> Set-DnsClientGlobalSetting -SuffixSearchList newdomain3.ac.uk,domain1.ac.uk,domain2.ac.uk` - -## Cannot VNC into the machine - -Check the network is up (ping `NDX`). - -If it is, check the VNC process is running on the NDX machine (there should be a VNC icon in the task bar). -If not, start it manually by double-clicking its entry in the Windows autostart menu (`c:\ProgramData\Microsoft\Windows\Start Menu\Programs\StartUp`).
VNC does not run as a Windows service on NDX computers - it runs in "user mode" in the background, started on login - -## Data fills up volume too rapidly on an Instrument - (generating Nagios errors or disk full errors) - -This is usually an issue when an instrument changes its mode of data taking. It is particularly common for instruments which have recently changed to event mode data collection or altered the scheme they use (where the amount of data which can be produced may be much larger). -Good questions to ask are: - - 1) Are any monitors being used in event mode? (usually not a good idea, better to histogram) - 2) Have the jaws been opened up or is white beam falling on any detectors? (check setup with scientist) - 3) Any unusually rapid data taking? (e.g. 15s runs with large-ish files) - -## Data Disk available space - -Varies widely per instrument and the space is tailored over time to match the needs of the instrument (with spare space as a buffer against exceptional usage). - -Space for data to reside on the instrument so it can be analysed locally is provided by a cache which is purged on most instruments using a scheduled task with `robocopy` (`robocopy /?` for details). Cache sizes vary widely per instrument. Some instruments with low data rates have caches with more gentle purging strategies. Caching on most high volume instruments will use a `robocopy` task with the MINAGE parameter set to 1 or 2 to remove files that are 1 or 2 days old. Fewer instruments purge on a monthly basis (e.g. MINAGE:30); muons and reflectometers generally have smaller data files. - -Availability in the cache for 1 day minimum is required for local copying programs on all instruments to have data _available for copying_ from the instrument `data` share. The External Export cache may be cleared of recent data files if space is limited, but NOT the instrument Data area (these will be removed only when archived).
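The age-based purging described above can be illustrated with a short Python sketch. It is not the scheduled task itself (that uses `robocopy`); the function name `files_to_purge` and the example file names are hypothetical, and it assumes that a file qualifies once it is at least `min_age_days` old, mirroring the way `MINAGE` skips files newer than the given number of days.

```python
from datetime import datetime, timedelta


def files_to_purge(files, min_age_days, now):
    """Return the names of files at least `min_age_days` old, oldest first.

    `files` maps a file name to its last-modified time. Newer files are left
    in the cache so that local copying programs can still find them.
    """
    cutoff = now - timedelta(days=min_age_days)
    old = [(modified, name) for name, modified in files.items() if modified <= cutoff]
    return [name for modified, name in sorted(old)]


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    cache = {
        "RUN0001.nxs": datetime(2024, 1, 7, 9, 0),   # about 3 days old
        "RUN0002.nxs": datetime(2024, 1, 9, 11, 0),  # just over 1 day old
        "RUN0003.nxs": datetime(2024, 1, 10, 8, 0),  # 4 hours old
    }
    # With a 1-day minimum age the two older runs are selected for purging.
    print(files_to_purge(cache, min_age_days=1, now=now))
```

A larger `min_age_days` (for example 30 on low data rate instruments) simply moves the cutoff further back, so more recent data stays available locally.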
- -The Clean and purging tasks run as privileged tasks in the scheduled tasks library on the guest VMs. Where specially large and controlled caching is needed (on WISH currently) a more generic powershell script `purge.ps1` is run as a task on the host - the difference being that the cache trims to a fill level of over 90% on age and currently will not empty over time. This allows maximum local data (about 2 cycles normally) to be available for local analysis. In both cases, files are first moved to an area for deletion and then deleted by a separate task which runs later. - -## System Disk Getting Full; Finding Space - -Often the system disk gets full because of logging, or windows updates etc. You can free up space by doing the following: - -- VNC to machine, check no-one is using it -- Run `Tree Size` `(\\isis\installs\ISISAPPS\TreesizePro)` and analyse the C drive: - - Flag any large files that you are worried about deleting to Chris - - Check size of `instrument/var/logs` move any large logs to back `\Backups$\stage-deleted\`. Do this by creating a directory on c, moving files in then copying to this because it is write once. -- Uninstall apps which shouldn't be there (if you have admin access then removing mysql installer - community save 600Mb) -- [Truncate the database if it is too large](#database_troubleshooting_reduce_space) -- If you have remote desktop and a little more time then: - - Run `Disk Clean-Up` on C in user mode and remove all default files - - Run it in admin mode by clicking the button - - Remove all the default files it lets you - -## Getting 1920 x 1200 (or other) resolution on Daxten (analogue) connection to a monitor which supports this - -Dual monitors with one replicated by a Daxten or a single remote Daxten monitor may need this as 1920 x 1200 monitors are less common in default resolution lists. Digital connections are normally fine but in this case the connection has to be analogue. 
- -When a monitor is being driven through a remote analogue VGA connection, there is no feedback to the computer to select the correct monitor resolution. In this case the resolution must be forced on the graphics card and in Windows to be correct. The essentials are: - 1) Install the analogue _monitor_ driver in the advanced display settings for the screen - once this is associated with the display - 2) Use the Graphics card Manufacturer utility to set up a custom 1920 x 1200 resolution (if necessary) and refresh rate (go low on refresh rate e.g. 51Hz if the graphics card considers the refresh rate too high). - 3) Ensure this is applied to the correct display. - -For more details see https://github.com/ISISComputingGroup/ControlsWork/issues/720#issuecomment-1413986492 - -## IRIS: Drives are not mapping correctly - -On IRIS, when the machine is rebooted the mapped drives don't automatically reconnect for some reason. This means that data is not copied to the analysis machine. The drives can be reconnected by opening them in the Windows file manager. - - -## Remote desktop client keeps freezing/hanging - -Try the following as administrator on the machine that you are running the remote desktop client on: -``` -* Run gpedit.msc -* Navigate to Computer Configuration > Administrative Templates > Windows Components > Remote Desktop Services > Remote Desktop Connection Client. -* Set the "Turn Off UDP On Client" setting to Enabled. -``` - -## User says they can not see their nexus data files on external machine - -See [DAE-Trouble-Shooting](#dae_troubleshooting_cannot_see_nexus_files) - -## Unable to communicate with MOXA ports - -We encountered this issue in August 2017 on HRPD. Neither SECI nor IBEX could communicate with the MOXA. We solved the problem on the fly by remapping the ports from COM blocks 5-20 to 21-36. NPort Administrator had several ports in the former block marked as in use in spite of no devices being active.
The ultimate fix was to clear out the offending registry value: - -1. Click Start → Run, type `regedit` and click the OK button -1. Navigate to `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\COM Name Arbiter` -1. In the right panel you can see the value `ComDB`. Right-click it and click Modify -1. In the Value data section, select everything and delete it, resetting the value to zero -1. Restart the machine if needed (to do this remotely use the command `shutdown -r -t 0`) - -A similar (but I think unrelated) problem was found in June 2018 on ZOOM. Some ports in a MOXA were found to not communicate with a Julabo. According to both the lights on the MOXA and the MOXA's web interface there was data being transmitted both to and from the device. However, when transmitting to the device (either via hyperterminal or IOC) no actual data was received. Restarting the MOXA had no effect. The problem was ultimately not resolved; instead the Julabos were moved to different ports. - -## Can not rename/delete/move a file/folder - -This happens when Windows locks a file. The lock can be held either by a local process or because the file is shared from another computer. -### Local Process -Close items that might be using the file, especially command line consoles in that directory. If you still can't find it, load "Process Explorer" (from Sysinternals; some of the machines have this in the start menu). Then click `Menu` -> `Find` -> `Find File or Handle ...` and type the path; this will give you the process id that is holding the lock. - -### Share -If the lock is held through a share, it will not appear in Process Explorer. In this case look at the share information, then kill the share. It may reconnect, so do the operation quickly. - - -## Checking windows event log on NDX computer for process crash/resource issues - -The Windows application and system event logs contain useful details to help diagnose system, process and resource issues such as crashes and memory issues.
Accessing this via the computer management (`compmgmt`) application is in a [separate sharepoint document](https://stfc365.sharepoint.com/:w:/r/sites/ISISExperimentControls/Instrument%20Documents/Checking%20event%20log%20on%20a%20computer.docx?d=w40b9227a3095448eb655c156c2a567db&csf=1&web=1&e=VzNF0U) diff --git a/doc/systems/inst_control/Increase-VM-memory.md b/doc/systems/inst_control/Increase-VM-memory.md deleted file mode 100644 index 73c714b9e..000000000 --- a/doc/systems/inst_control/Increase-VM-memory.md +++ /dev/null @@ -1,47 +0,0 @@ -# Increase VM memory - -### Increasing guest VM memory to 14GB (using `NDHDETECT` as an example). -The basic steps are as below: -1) log onto the server (via the local Administrator account). -2) Check that there will still be a minimum of at least 20GB free space on the server C: drive after the increase. -3) Shut down the VM (`NDXDETECT1` in this case). -4) Change the startup/maximum memory setting for the VM in Hyper-V manager to 14GB. -5) Restart the VM and check that the memory has indeed increased (and that all still works). - -### Log onto the server -All administration of the Hyper-V system and system settings should be done from the local "Administrator" account for the server (NDH machine), in this case this would be `NDHDETECT\Administrator`. The password is in the usual place and the remote desktop log on should be done explicitly via the below command to ensure that the log on is not to another account by explicitly asking for a prompt. i.e. - -`mstsc /v:ndhdetect /admin /prompt` - -the `/admin` should ideally be used as it is possible to have more than one concurrent log on on a server and it is always good to be able to get back to the same session (there is only one /admin session). - -### Check there will be sufficient free disk space on the C: Drive of the instrument server -The space taken by the running VM on the NDH machine's C: drive will be increased by the amount of extra memory allocated to the machine. 
So, for example if the machine was 8GB going to 14GB it will take up another 6GB on the system drive. Ensure that the expansion will not take the free space on the C: drive below the minimum of 20GB. - -The running VM memory size can be seen from Hyper-V manager. Start this by clicking on -![](vm-memory-hyper-v-icon.png) on the toolbar. - -### Shut down the VM to be modified (NDXDETECT1) -Normally there will only be one NDX VM running but in this example there are actually two, we will only increase the memory size for the first one. On the Hyper-V manager window, right click on the VM and select `Connect...` as shown (or double click). This will open a console window on NDXDETECT1. - -![](vm-memory-connect-to-vm.png) - -If necessary log on to the instrument as `.\spudulike`, password as normal. Open a console (`Command Prompt`) window and type - -`shutdown /f /s /t 0` - -Wait for the machine to shut down (which it definitely will with no intervention needed), the `/f` forces shutdown regardless of any questions from applications. - -If you normally use `Start->Run...` to run commands, please type "cmd" and create a fresh command window, don't run the forced shutdown command directly, it will be remembered as the last `Run...` command and makes it too easy for a user to accidentally shut down their machine permanently! - -### Change the startup/maximum memory setting for the VM in Hyper-V manager to 14GB -When the machine has shut down, right click again in Hyper-V manager as before, but this time select `Settings...` The dialogue shown below will appear. - -![](vm-memory-change-memory.png) - - Navigate to expand the "Processor" settings and select "NUMA". Click once on the "Use Hardware Topology" -![](vm-memory-use-hardware-topology.png) - button. The numbers above may change a bit (but likely will not), in either case, select the "Maximum amount of Memory" and copy the figure here. On the "Hardware" pane on left hand side, select "Memory." 
Paste the figure you just copied into the box for "Startup RAM:" and click `Apply`. As long as this completes without error the job is done. Close the window, right click as before on the VM but this time choose `Start`. Check that the system boots OK, the console window will show the boot process. Ensure that the system is running as it should. You can check in task manager (on the performance tab) to see that the physical memory has now expanded. - -### Endnote -As the detector testing system is an unusual setup these two virtual machines must share their resources and are both equally important so despite what was shown here, these are now both set to the small IBEX machine minimum of 8GB each with 3.5GB allowed for the server system, i.e. 8GB+8GB+3.5GB which adds up to 19.5GB. If one of the four 8GB DIMMS in the host were to fail, there would be (32-8)GB left i.e. 24GB which would still permit the two machines to continue to function optimally (despite some risk of a crash and reboot if the failing DIMM corrupted memory which was in use). diff --git a/doc/systems/inst_control/MDT-(Microsoft-deployment-toolkit).md b/doc/systems/inst_control/MDT-(Microsoft-deployment-toolkit).md deleted file mode 100644 index 9b8dadc20..000000000 --- a/doc/systems/inst_control/MDT-(Microsoft-deployment-toolkit).md +++ /dev/null @@ -1,48 +0,0 @@ -# MDT (Microsoft deployment toolkit) - -This page assumes you already have access to a machine with MDT installed. If you don't have access to this type of machine, you can build a new one by following the instructions listed [here](mdt/Building-a-windows-10-MDT-build-server) - -MDT is a tool used by some other organisations - as such, there is some documentation about how to use the tool online. Rather than trying to describe the full capabilities of the tool, this page aims to document the ISIS-specific ways in which it is used. 
- -### Overview of concepts - -- **Deployment share** - this is a static location which must be read-accessible from the new machine to be built, and write-accessible from the MDT build server. -- **MDT Build server** - this is a machine with MDT installed on it, that has write access to the deployment share. This can be a virtual machine, even one built using MDT itself... -- **Deployment ISO** - this is a virtual disk (ISO image) which gets built by the MDT process. It contains a minimal (bootstrap) operating system, and points back to the deployment share to read instructions from it. It is the bootable media which starts off the creation of a new virtual machine. - * Note that it is not a standard windows 10 iso (like you might get directly from Microsoft), as it has to contain instructions to point to the ISIS MDT deployment share location. - -The basic process of an MDT update and deployment is: -- Make changes to the MDT configuration on the MDT build server. -- Update the deployment share so that the share contains the latest instructions for building a system -- Copy the deployment ISO to the machine that you want to deploy to -- Boot the brand-new machine using this deployment ISO, this will boot into a minimal MDT deployment environment (note that at this stage windows is not deployed yet!) -- The installation process will start by asking a few basic questions (e.g. hostname of new computer, admin password) -- Once the basic questions are complete, the MDT process will proceed to install the operating system and any specified applications -- Once the above process is complete (which may take a while as it does a lot of installations), you should have a working installation of windows and all relevant applications. The MDT deployment process is now complete and you can remove the MDT deployment ISO and proceed to use/configure the machine as normal. 
- -Once you are into the main MDT interface, a few of the options along the left side of the screen are typically interesting to us: - -### Applications - -The applications folder in MDT houses the definitions of various "applications" that can be installed. Fundamentally this works by calling normal windows installers under the hood, so any setup which can be performed unattended can be done by MDT. - -By clicking on a folder name and then right-clicking on an application name and selecting "properties", you can see and modify the details of the application installation process. You can also add installers in these menus, which will be copied onto the MDT deployment share location. - -This set of applications is not automatically installed however - this is just a list of applications that MDT knows how to install. - -### Task sequences - -The most interesting items here are under "ISIS Instrument Reference" -> "Re-clone system" -> "Build thick W10 image" as this is the full automated install. By right clicking this and selecting "properties" and then "task sequence", it will bring up a menu containing an ordered list of the installation tasks. These can include installing the applications defined above, and various other items such as applying OS updates, arbitrary sleeps etc. - -### Update deployment share - -When you are done making some changes, you should update the ISO image on the deployment share. This is done by right-clicking on the deployment share location in the left-hand pane of MDT and selecting "update deployment share". This will build and deploy a new ISO image containing the new deployment instructions. 
- -## Building clones via MDT - -```{toctree} -:glob: -:titlesonly: - -mdt/* -``` diff --git a/doc/systems/inst_control/PS-Remote.md b/doc/systems/inst_control/PS-Remote.md deleted file mode 100644 index 549212f65..000000000 --- a/doc/systems/inst_control/PS-Remote.md +++ /dev/null @@ -1,33 +0,0 @@ -# Powershell Remote - -Remote PowerShell can be used on the NDX machines to run arbitrary command line scripts remotely. This could be useful, for example, if you want to run something as the database admin without having to interrupt running instruments. - -```{warning} -This is obviously very dangerous - you will have the power to run any commands and could affect the running instrument quite severely. -``` - -### How to set your machine up - -1. Start a PowerShell terminal as administrator (found under `Windows PowerShell` in the Start Menu) -2. Run `Enable-PSRemoting`; you should see output like: -``` -WinRM has been updated to receive requests. -WinRM service type changed successfully. -WinRM service started. - -WinRM has been updated for remote management. -WinRM firewall exception enabled. -``` -3. Run `Set-Item -Path WSMan:\localhost\Client\TrustedHosts -Value * -Force`. This will allow your machine to access others. - -### Using Powershell Remote - -Once the above setup has been done you can use PowerShell Remote as follows: -1. Start a PowerShell terminal (does not need to be admin). -2. Run `$cred=get-credential` and you will see a pop-up box asking you to input some credentials. -3. Input the credentials for an instrument admin account (remember to put in an instrument machine as the domain; this can be any instrument machine, as the credentials will still work for remote access as long as the target machine has the same administrator account). -4.
Run `Invoke-Command -Credential $cred instrument_machine { command }`, where _`command`_ is the PowerShell command or script you'd like to run on the remote _`instrument_machine`_ (standard DOS commands can be executed with `{ cmd /c` _`command`_ `}`). - -### Accessing lots of instruments - -There are some useful PowerShell scripts in the ISIS private share `...private_git_repositories\private_scripts.git`, including one that helps to remote into several instrument machines and run the same command on each. diff --git a/doc/systems/inst_control/Rebuild-Performance-Counters.md b/doc/systems/inst_control/Rebuild-Performance-Counters.md deleted file mode 100644 index 24d7b0bf7..000000000 --- a/doc/systems/inst_control/Rebuild-Performance-Counters.md +++ /dev/null @@ -1,38 +0,0 @@ -# Rebuild Performance Counters - -If Nagios was working OK and then suddenly you get lots of errors similar to: -``` -UNKNOWN - The WMI query had problems. You might have your username/password wrong or -the user's access level is too low. Wmic error text on the next line. -[wmi/wmic.c:196:main()] ERROR: Login to remote object. -NTSTATUS: NT_STATUS_ACCESS_DENIED - Access denied -``` -You may need to rebuild the performance counters on the machine that cannot be queried. - -This is detailed on https://support.microsoft.com/en-us/help/2554336/how-to-manually-rebuild-performance-counters-for-windows-server-2008-6 but to summarise: - -1. Open an administrator CMD window -1. Rebuild the counters: - ``` - cd c:\windows\system32 - lodctr /R - cd c:\windows\sysWOW64 - lodctr /R - ``` -1. Resync the counters with Windows Management Instrumentation (WMI): - ``` - WINMGMT.EXE /RESYNCPERF - ``` -1. Stop and restart the Performance Logs and Alerts service (only if it is running) -1. Stop and restart the Windows Management Instrumentation service (this should always be running) - - -If all this fails, you may have to reboot. - -You can check WMI is working remotely by using WMIC from your own computer e.g.
to check `ndxvesuvio` -``` -wmic /node:"ndxvesuvio" /user:"ndxvesuvio\the_admin_account" os get name -``` -Replacing the_admin_account with our usual admin account - - diff --git a/doc/systems/inst_control/Script-to-connect-to-all-instruments.md b/doc/systems/inst_control/Script-to-connect-to-all-instruments.md deleted file mode 100644 index 7db16f87c..000000000 --- a/doc/systems/inst_control/Script-to-connect-to-all-instruments.md +++ /dev/null @@ -1,11 +0,0 @@ -# Script to connect to all instruments - -This is in the private repository on the shares. To use it clone as normal but use the path to the repo instead. - -This is a script for general use change it as needed. Currently the following functions are implemented by putting the right functions in the main of the script. - -Implement functions: - -- can be used to get versions of all instruments -- can be used to list usage of genie functions -- can be used to update files on drive diff --git a/doc/systems/inst_control/TreeSize-Pro.md b/doc/systems/inst_control/TreeSize-Pro.md deleted file mode 100644 index 3031f338a..000000000 --- a/doc/systems/inst_control/TreeSize-Pro.md +++ /dev/null @@ -1,26 +0,0 @@ -# TreeSize - -## Introduction -Tree Size pro is a tool to allow flexible and informative disk space management on Windows machines. Particularly useful for the instrument machines (ideally used over the network and not on the machines directly). It's also invaluable for managing and reporting on space and statistics for the ISIS archive servers. - -## Distribution -As this is now made available via `SCCM`/`Software Center` it should not normally be necessary for us to provide a distribution service. It is however incumbent on us to provide an up to date version of the software to DI regularly so that they can keep an updated version online. - -Good to do this regularly via a TopDesk ticket (particularly if there are any security issues raised) - but this hasn't been running for long. 
We can obviously download it from the TreeSize site for ourselves. - -## Licence Key -There is a licence key which is needed for installation for any machine it is used on. This is available on the TreeSize site (details in keeper) and in `\\isis\Installs\ISISAPPS\TreesizePro`, the licence is custom to us, so should not be passed on to anyone who is not aware of the licence requirements! Unless a query is specifically directed to us for this "Enterprise" version on this site, _general/home other users will be better being pointed to the "TreeSize free" version_ which can be installed for free (and is extremely helpful for home use) - - -## Licence -Due to the relatively low cost and benefit of the tool, this software is licenced and maintained as a "Site" licence for the UKRI STFC RAL site at Harwell (but by ISIS). The terms of this level of licence do not allow use at other UKRI sites or by non-UKRI organizations. The software is now distributed and licenced via the DI `Software Center` which installs and licenses it for end users. - - -## Licence Renewal -A renewal notice should be sent to the group inbox. It is usually due for renewal by the end of January. Further details can be found in [this ticket](https://github.com/ISISComputingGroup/ControlsWork/issues/356#issuecomment-2595707626). - -1. Logon to JAM Software site and renew maintenance. Use the address and log in details in keeper. After login to the JAM Software site, go to "Customer Area" and then to "Licences". Our licences are shown and maintenance can be renewed here. -1. Upload new Invoice & licence to the ticket linked above. -1. Download a recent version of TreeSize Pro to `\\isis\Installs\ISISAPPS\TreesizePro` and update licence instructions if necessary. -1. Submit a ticket to FIT to update the `Software Center` installation version from this location. -1. update the STFC software licencing list (see ticket for link). 
It is listed on the STFC software licence server [here](https://stfc365.sharepoint.com/sites/StfcITLicensing/Lists/STFCSoftwareLicensingList/Full%20details.aspx). \ No newline at end of file diff --git a/doc/systems/inst_control/VM-VHD-replication.md b/doc/systems/inst_control/VM-VHD-replication.md deleted file mode 100644 index 17bfe1403..000000000 --- a/doc/systems/inst_control/VM-VHD-replication.md +++ /dev/null @@ -1,3 +0,0 @@ -# VM VHD Replication -## Replication errors -Instructions on how to address replication errors (e.g. nagios reporting `ReplicationState>x` or `ReplicationHealth>y`) can be found [here](https://github.com/ISISComputingGroup/ControlsWork/issues/804) \ No newline at end of file diff --git a/doc/systems/inst_control/cron-jobs-and-windows-scheduled-tasks.md b/doc/systems/inst_control/cron-jobs-and-windows-scheduled-tasks.md deleted file mode 100644 index 395e3a6ef..000000000 --- a/doc/systems/inst_control/cron-jobs-and-windows-scheduled-tasks.md +++ /dev/null @@ -1,29 +0,0 @@ -# Cron Jobs & Scheduled Tasks - -In Windows, it is possible to set up a task that is scheduled to run at certain intervals or given certain conditions. - -## Task Scheduler - -To do this in Windows 10 we use the Task Scheduler app provided by the OS. You can find this by searching in the Windows taskbar for Task Scheduler or going to control panel -> System and Security -> Administrative Tools -> Schedule tasks. - -Previously scheduled tasks can be found by clicking on the Task Scheduler Library in the left panel and then viewing them in the middle top panel. - -To create a task click the Create Task button in the Actions panel on the right. Here give it a relevant name and description for the job. - -In the triggers tab, we can set when the task is run by clicking new and customising the trigger in the popped up window. - -To define the action we can create a batch script carrying out the task we wish to do (see batch scripts below). 
This batch script is then linked to the task when in the create task phase in the actions tab by clicking New and linking the program/script. - -You can customize other parts of the task in the conditions and settings tabs. - -## Batch scripts - -Used to script processes and tasks that can be run automatically. - -In a batch script, we can execute commands one after the other as we would in the cmd with the syntax "start ". - -Batch scripts also come with commands they can run such as "timeout 10" which makes the script wait for 10 seconds. - -## DETMON run control - -A cron job is used on [DETMON](https://github.com/ISISComputingGroup/IBEX/wiki/DETMON-Instrument-Details) to start and stop runs at midnight in order to create a new nexus log file for each day. This was done as part of ticket [4182](https://github.com/ISISComputingGroup/IBEX/issues/4182) \ No newline at end of file diff --git a/doc/systems/inst_control/mdt/Building-a-windows-10-MDT-build-server.md b/doc/systems/inst_control/mdt/Building-a-windows-10-MDT-build-server.md deleted file mode 100644 index c55d217a6..000000000 --- a/doc/systems/inst_control/mdt/Building-a-windows-10-MDT-build-server.md +++ /dev/null @@ -1,72 +0,0 @@ -# Building an MDT build server - -This wiki page documents the process for setting up a new MDT (Microsoft deployment toolkit) build server to create new windows 10 clones. - -The central source of truth for MDT configuration files is the MDT deployment share location, which can be found on the usual passwords page. - -## Mental model - -- `NDXINST` - this is the windows 10 virtual machine to be built. This is a usual NDX in the sense that it runs IBEX. -- `NDHINST` - this is the physical host on which the NDX virtual machine executes -- `NDXMDTSERVPROD` - This is an MDT server which contains instructions which the NDX can execute to install standard operating systems and/or software. This server can be either real or virtual as convenient. 
It never hosts a VM itself - it only contains the configuration files and setup for MDT. A new machine will need to be called something different (e.g. `NDXMDTSERVDEV`) - -This wiki page describes the process for setting up a new `NDXMDTSERVPROD` machine (NOT an `NDHINST` or `NDXINST` machine). - -## How to build a new MDT server - -- If you are creating `NDXMDTSERVPROD` as a virtual machine, you need to find a physical host for the MDT server. - * Ideally use same specifications as for an instrument machine (14GB memory, 256GB free disk space) - * If memory or disk space are tight, an MDT server can probably get by with ~6GB of memory and ~100GB of free disk space. - * __NB: if you are considering using your local machine as a host while working from home the VPN can cause issues where NDHBUILD cannot be found. We haven't found an easy fix for this so it's probably best to use an on-site host machine.__ -- If you are creating `NDXMDTSERVPROD` as a virtual machine, go into hyper-v manager on the MDT server host and select new machine. Default settings are mostly ok other than: - * Set the name to the intended hostname of the `NDXMDTSERVPROD` machine - * You'll need to create it on a disk which has enough space (will need ~256GB free) - * Set startup memory to 14GB (or less - see above) - * Set it to connect to ISIS network if you get the option, otherwise it will be ok on the default. - * Set virtual hard disk size to 128GB (or a bit less - see above) - * Install OS later -- Copy the windows 10 ISO file from `\\isis\inst$\mdt$\dev1\MDTDeploymentShare\Boot\LiteTouchPE_x64_Hyper-V.iso` and copy in onto the host server for `NDXMDTSERVPROD`. - * This ISO is not really a windows PE iso, it is instead an ISO which has been built in the past by a different MDT server machine, and this will have configured the menus which are available when booting this ISO. This is **not** substitutable for e.g. 
a version downloaded from `microsoft.com` -- Tell Hyper-V to boot from this ISO by adding it as a disk in the virtual disk drive (right click on the machine in hyper-v and select "settings") -- You might choose to increase number of processors available to the VM -- Boot the machine -- This will boot into a "Microsoft Deployment Wizard", which will then launch a set of menus embedded within the ISO. -- Select "Build thick updated windows 10 image" - * Thin image == Just windows 10 - * Thick image == windows 10 + software such as LabVIEW, NPort, notepad++, 7-zip, IBEX (if you have access to the existing MDT build server you may wish to disable the IBEX installation as it won't be required for this machine) -- Computer name - set it to the hostname (same as name in Hyper-V) -- Join the default ISIS workgroup (the name of this workgroup can be found on the passwords page) -- Don't restore settings or data -- When asked for an administrator password generate a secure random password following STFC password guidance, and then add this to the usual passwords page alongside hostname. -- Don't capture any image -- Set it off, it will now take ~1 hour and will install everything unattended -- After it has finished installing, it is wise to take a hyper-v snapshot so that you can roll back to this point if needed -- Create a new account to use and remove unneeded accounts (e.g. the default ones created for instrument machines). You can use `lusrmgr.msc` to access these settings quickly, or click through from the control panel. 
- * Add the account as `mdtbuilder`, set a password conforming to STFC password policy and add it to the usual passwords page - * Add the ability to remote desktop as this account by adding it to group `Remote Desktop Users` - * Add `mdtbuilder` to `Administrators` group (this is important for later) -- Now log out of the admin account and log back in as `mdtbuilder` -- Copy the following files from `\\isis.cclrc.ac.uk\inst$\kits$\CompGroup\ICP\MDT` into `NDXMDTSERVPROD`(if it asks for credentials use your fed id, **do not save these to the machine**): - * `adksetup.exe` - a utility for measuring performance of machines ("assessment and deployment toolkit") - * `MicrosoftDeploymentToolkit_x64.exe` - this is MDT itself - * `adkwinpsetup.exe` - this may not be necessary? -- Run `adksetup.exe` -- When asked which features to install remove "windows performance toolkit", "user experience virtualisation", "Microsoft application virtualisation", "Media experience analyzer" -- Run `adkwinpsetup.exe`, accept defaults -- Run `MicrosoftDeploymentToolkit_x64.exe` -- Open an **administrator** command prompt and type `net use /USER: MDT -> Deployment workbench and **run it as admin** -- Right click "deployment shares" -> "open" -> MDT deployment share location (found on passwords page) -> next -> finish - * If MDT complains that the directory does not exist, check you did the `net use` above correctly. -- Make changes to MDT process as required -- Right click "MDT Deployment Share" -> Properties -- Set "Network (UNC) path" to the MDT deployment share location (found on passwords page). Note that this **cannot** be a DFS filesystem, it must point to a real server. DFS shares are not supported by MDT. -- If this is a new share, under "Rules" tab you will need to set the following (these may already be set if using an existing share): - * You will need to set paths: `SLShare` to ``, `SLShareDynamicLogging` to `\dynlogs` and `BackupShare` to ``. 
These are directories where logs will be written during the MDT build process, so that you can debug any failures. `` can be found on the passwords page. - * Ensure user details in this file match the MDT account detailed on the passwords page -- Click `Edit bootstrap.ini` - * Set `DeployRoot` to the MDT deployment share location (found on passwords page) - * Ensure user details in this file match the MDT account detailed on the passwords page -- Right click "MDT Deployment Share" -> update deployment share - -Congratulations! You should now have a working MDT build server. See [here](../MDT-(Microsoft-deployment-toolkit)) for details about how to *use* MDT. \ No newline at end of file diff --git a/doc/systems/inst_control/mdt/Building-a-windows-10-instrument-machine-from-MDT.md b/doc/systems/inst_control/mdt/Building-a-windows-10-instrument-machine-from-MDT.md deleted file mode 100644 index 5f0af0259..000000000 --- a/doc/systems/inst_control/mdt/Building-a-windows-10-instrument-machine-from-MDT.md +++ /dev/null @@ -1,75 +0,0 @@ -# Building a new instrument virtual machine from MDT - -Note: this page documents the process of booting and building a windows 10 **system** in an empty virtual machine. This page does not focus on deploying IBEX itself. - - If you do not see hyper-v on your windows desktop, enable it via `turn windows feature on or off`: select `Hyper-V` and its sub-items. - -**Note**: boot and build can take a while: 3 hours on an NDH with SSDs, longer if you have spinning disks. - -### Find a suitable physical host - -- Find a suitable physical host server. The server will need a minimum of 14GB of memory and 256GB of hard disk space free. You can use hyper-v on your own Windows 10 desktop if your machine is powerful enough.
- * If you are building a real instrument machine, the hyper-v host will usually be `NDHINST`, and the virtual machine that you're building will usually be `NDXINST` - -### Copy needed files onto physical host - -Choose a virtual machine name to use later - this name will need to be unique on the network (for an instrument this is the NDX name). For developing, choose a name with an `NDXTEST` prefix followed by a number. Choose a free number and record your choice in the spreadsheet called `w10_test_machines.xslx` in the General channel of Teams. - -- Make a copy of a boot ISO from `\CompGroup\ICP\W10Clone\Boot` on your local computer. You may see several ISOs in here; see the `README.txt` and choose the appropriate one. This iso does an initial boot and then loads the rest from a network share whose name is embedded within it, so the iso itself doesn't need to change often; it just points to the appropriate location to install from. - * *Note: This ISO is not really a windows PE iso, it is instead an ISO which has been built by MDT. You cannot just use a version downloaded from `microsoft.com`* -- Next make a local copy of the VHD disks from `\CompGroup\ICP\W10Clone\VHDS` - choose either an appropriate IBEX release or latest Jenkins build. You will need to copy the `apps`, `var` and `settings` VHDS. If you plan to have several local VMs, you may wish to rename these to `-settings.vhdx` etc. Make a copy of the `var` disk and rename it to `scratch.vhdx` or `-scratch.vhdx` as appropriate. Make sure you are copying them to a disk with enough free space. - * Note: you can check the VHDs are not corrupt by mounting them on your local machine (right click on file). If they fail to mount, they may be corrupt and you should select a different set of VHDs. After mounting, each VHD should contain some files, e.g. the Apps VHD should contain an EPICS installation and a client. - -### Configure the VM - -- Don't use hyper-v quick create, start hyper-v manager as your admin account.
-- in hyper-v settings (right click on your computer name in hyper-v) - adjust hyper-v paths for where to create machines and disks if you want a non-default location. Make sure disk path is somewhere with lots of space. -- in virtual switch manager: there will probably be a default switch there connected to "internal network". You need to create a new virtual switch of type "external" and attach it to the correct ISIS network adapter. If you have several network interfaces, you may need to look at your system network settings to get the right adapter name. - -- Now select new -> virtual machine. Create a generation 2 VM, 12GB memory = 12288mb is ample, less may be fine. Select use dynamic memory. -choose "external network" switch for the VM, create a new virtual hard disk (the default, 128GB), choose install os from iso for booting, select the iso file saved above - - (note: in hyper-v set to NUMA architecture limit which is usually just below 14gb). - -right click on VM -- in firmware/bios settings of VM, put hard disk/network/dvd as boot order, this helps on booting, initially giving you some time (while it is failing to network boot) to get ready to press key to boot from dvd. After initial dvd boot you don't want to boot from dvd iso again. -- increase number of processors say to 4 -- add scsi hard drives and attach vhds, add scratch first and then add rest in any order. You do not need to specify a mount point, just make the disks available. - - These should be `apps.vhdx`, `var.vhdx`, `settings.vhdx` and `scratch.vhdx`. You **MUST** mount `scratch.vhdx` as the first SCSI drive in the VM as it gets formatted and partitioned during the MDT task sequence (into `Data` and `Scratch`).
`scratch.vhdx` is actually a blank VHDX so an empty one called something different will work as well (or a copy of one of the others) - * Note: if you are replacing existing disks, you **still need to eject and re-add them in Hyper-V for them to be recognized!** - -go to checkpoints of VM and disable automatic checkpoints, but create a checkpoint before you start - this lets you revert back (apply) and re-run a boot without having to recreate the VM - -start vm and connect to screen of vm - - If it blue screens try the boot again. I've seen this occasionally on windows 10 hyper v, not on the windows server hyper-v. - -After iso boot it will go into MDT install -- choose reclone full system (thick w10 image) as install type -- change computer name to your unique from above, other defaults should be fine (join ISISWG, Don't restore settings or data) -- When asked for admin password, refer to passwords page and add the new password there if necessary for NDX. If this is your own desktop, change it to whatever you like - this is just the password for the `Administrator` account after boot. -- now leave it installing, it will reboot several times and may look like it is doing nothing at times. Wait until you see the final dialogue box displayed on the screen that says the `installation has completed with 0 warnings and 0 errors` - -### Setting up IBEX before first use - -- Check settings folder name - it should have been renamed during install to correct -- (NDX instrument) Inside the settings folder, do a git checkout to the correct config branch and pull - -### Starting IBEX - -- At this stage you should be able to start IBEX. Make sure you start it as our **standard user**, not `Administrator` that you are probably still logged in as, otherwise all of the log files and directories will be created with the wrong permissions. 
It is probably easiest if you now remote desktop into your new VM rather than use the hyper-v console - * It seems that the Var and Settings VHDs in particular are very sensitive to getting into a state where the files are "owned" by admin but admin can't delete them, and a reboot does not fix this. To fix this, install fresh settings/var vhds by following the "upgrade/change vhd" instructions below. -- Start ibex client, initially you will have no configuration loaded so not everything will start. Go to `configuration -> edit current configuration -> save as` and save it as something like `test` and switch to this configuration. This should now start DAE processes and you should end up in `SETUP` rather than `UNKNOWN` runstate after everything restarts. This seems to take a while for some reason, be patient. -- to be able to start a run with `Begin` you need to set some DAE parameters: - * in `experiment setup -> time channels` set first row of time regime 1 to be 10, 10000, 100, dT=C - * in `data acquisition` select the dropdown next to wiring, detector and spectra tables - choose the only option offered that is an `ibextest` table - * now apply changes - -## Upgrading/changing IBEX VHDs - -If you need to upgrade/change IBEX VHDS, the process is as follows: -- Shutdown the NDX machine (gracefully) -- Go into hyper-V and remove the three IBEX VHDS from the VM (Apps, Settings, Var) -- Replace the VHDS on the filesystem on the NDH with the new versions you wish to install -- Add these back in to the VM via Hyper-V manager -- Boot the VM -- Ensure that the filesystem looks sensible e.g. that `Apps/` contains EPICS and a client, `Settings` contains a settings directory, and `Var/` contains the expected file structure. - -Note: you can not simply replace the VHDs on the NDH by name. This is because Hyper-V sets some attributes on the VHDs when they are explicitly added; if these attributes are not set, you will get an error on attempting to boot the VM. 
\ No newline at end of file diff --git a/doc/systems/inst_control/rdp_smart_screen.png b/doc/systems/inst_control/rdp_smart_screen.png deleted file mode 100644 index 4862956c8..000000000 Binary files a/doc/systems/inst_control/rdp_smart_screen.png and /dev/null differ diff --git a/doc/systems/inst_control/vm-memory-change-memory.png b/doc/systems/inst_control/vm-memory-change-memory.png deleted file mode 100644 index c4393dd48..000000000 Binary files a/doc/systems/inst_control/vm-memory-change-memory.png and /dev/null differ diff --git a/doc/systems/inst_control/vm-memory-connect-to-vm.png b/doc/systems/inst_control/vm-memory-connect-to-vm.png deleted file mode 100644 index d5499b7de..000000000 Binary files a/doc/systems/inst_control/vm-memory-connect-to-vm.png and /dev/null differ diff --git a/doc/systems/inst_control/vm-memory-hyper-v-icon.png b/doc/systems/inst_control/vm-memory-hyper-v-icon.png deleted file mode 100644 index b987e2986..000000000 Binary files a/doc/systems/inst_control/vm-memory-hyper-v-icon.png and /dev/null differ diff --git a/doc/systems/inst_control/vm-memory-use-hardware-topology.png b/doc/systems/inst_control/vm-memory-use-hardware-topology.png deleted file mode 100644 index d9cb40127..000000000 Binary files a/doc/systems/inst_control/vm-memory-use-hardware-topology.png and /dev/null differ diff --git a/doc/systems/nagios/Adding-new-cycle-for-nagios-notifications.md b/doc/systems/nagios/Adding-new-cycle-for-nagios-notifications.md deleted file mode 100644 index 55db4bbed..000000000 --- a/doc/systems/nagios/Adding-new-cycle-for-nagios-notifications.md +++ /dev/null @@ -1,8 +0,0 @@ -# Adding new cycle for Nagios notifications - -Nagios has a file that lists time periods that notifications will be emailed in, many are "all the time" but some are set to be within cycle only, particularly ones that notify scientists e.g. muon pulse width or kicker check. 
The service will always go red in the Nagios web display and say that notifications are enabled, but will not send an email alert unless the current time is within the linked timeperiod (it actually queues up the email and sends it once a timeperiod starts, hence it claims they are enabled). So this file needs to be updated with new ISIS cycle dates as appropriate. To do this: -* log onto the control-mon.isis.cclrc.ac.uk Linux machine -* cd /usr/local/nagios/etc/objects -* edit `timeperiods.cfg` with your favourite Linux editor. At the bottom of the file there is a timeperiod called `isis_cycle`; add a new line to this in the same format as the rest of the define. We use the Friday before the start of the user run for our start time - the web page https://www.isis.stfc.ac.uk/Pages/Beam-Status.aspx lists the start of the user run, which is the Tuesday, so count back 4 days from that -* run `sudo service nagios reload` to load changes - you will be prompted for the password of the account you are logged in as - diff --git a/doc/systems/nagios/ContactsAdministration.md b/doc/systems/nagios/ContactsAdministration.md deleted file mode 100644 index fdc706064..000000000 --- a/doc/systems/nagios/ContactsAdministration.md +++ /dev/null @@ -1,17 +0,0 @@ -# Nagios Contacts -## Viewing Nagios Contacts and contact groups -To view currently configured Nagios contacts: - -1. log into the Nagios web interface as `nagiosadmin`. -2. On the left hand menu, select `Configuration` under `System`. -3. Select either the `Contacts` Object Type to see configured contacts or `Contact Groups` to see the configured groups of contacts. - -## Editing Nagios Contacts -To edit contacts or contact groups: - -1. SSH into the Nagios host using the login details available in Keeper -2. Nagios is in `/usr/local/nagios` with configuration of objects within `/usr/local/nagios/etc/objects/` -3. Within that directory, edit `contacts.cfg` -4. reload the Nagios service: - `sudo service nagios reload` -5.
Check changes in the web interface as above diff --git a/doc/tools/Network-traffic.md b/doc/tools/Network-traffic.md deleted file mode 100644 index 0dab80fd4..000000000 --- a/doc/tools/Network-traffic.md +++ /dev/null @@ -1,22 +0,0 @@ -# Networking tools - -## View Network Traffic using Wireshark - -To see packets to and from the machine, simply install Wireshark and run it. To look at packets on localhost, install Npcap (https://nmap.org/npcap/) with the WinPcap-compatible option, then install Wireshark afterwards; it should recognise Npcap. -To capture network traffic: - -1. Start Wireshark -1. Select capture interface (for localhost use Npcap Loopback Adapter) -1. Click the fin to start, the stop button to stop and the fin with reload to restart. -1. It is often useful to filter your traffic. Filters I have used: - - - `(data.data contains "TE:NDW1407:CS:SB" || data.data contains 00:06:00:08)` look at all packets containing a block on my machine or the EPICS search for channel message - - `udp.dstport == 55691 || udp.srcport == 55691` get all UDP data from and to port 55691 - - `tcp.srcport == 51679` get all TCP data from port 51679 - -## Look at Open Ports - -To see open ports, type the following as an admin: -``` - netstat -abon -``` \ No newline at end of file diff --git a/doc/tools/SSH-keys.md b/doc/tools/SSH-keys.md deleted file mode 100644 index 2987fc09c..000000000 --- a/doc/tools/SSH-keys.md +++ /dev/null @@ -1,144 +0,0 @@ -# SSH key-based auth - -Instruments run an SSH server, which can be used to execute commands remotely. - -It is possible to access this SSH server using key-based authentication. Keys are associated with -an individual, but are used to grant access to the instrument accounts. This means that keys -for individuals no longer on the team can be easily revoked. - -## Key-pair generation - -:::{note} -If you already have a suitable SSH key, which is encrypted using a passphrase, you may -skip this step.
-::: - -Generate a key-pair using a strong algorithm, for example `ed25519`: -``` -ssh-keygen -t ed25519 -``` -**You must encrypt this key with a strong password when prompted.** -Don't use an empty passphrase for these keys. This is not a shared -password, it is a password for your personal key-pair; store it in your password -manager. This will generate two files: `~\.ssh\id_ed25519` and `~\.ssh\id_ed25519.pub`. The file -ending in `.pub` is a public key, the one without the `.pub` extension is a private key. It -would be sensible to store copies of these two files in your password manager too. - -:::{warning} -For the avoidance of doubt, the **public** key (`*.pub`) can be freely shared with everyone (for -example, by being copied onto instruments). Do not share your **private** key. The private key -is additionally encrypted using your selected password. -::: - -{#keeper_ssh} -## Keeper - -To avoid having to copy and paste your passphrase every time, you can use [Keeper](https://ukri.sharepoint.com/sites/thesource/SitePages/Keeper-Password-Manager.aspx) to store your passwords and SSH keys. - -If you want to use Keeper (you'll need the desktop client for this, _not_ the browser plugin) for storing your SSH keys, and not have local plain text copies on your machine, you can do so. - -This is done by following [this guide](https://docs.keeper.io/en/keeperpam/privileged-access-manager/ssh-agent#activating-the-ssh-agent) with your public key, private key and passphrase filled in. - -You may need to [turn the `OpenSSH` agent off](https://docs.keeper.io/en/keeperpam/privileged-access-manager/ssh-agent#windows-note-on-ssh-agent-conflicts) if it's on your machine - see if `ssh-agent` is running in your services in task manager. - -It would also be a good idea to change the vault timeout to something relatively short to minimise scope of access for when the SSH keys are available. 
- -### SSH works and prompts to use passphrase, but git doesn't show the prompt - If `ssh git@github.com` works fine, your SSH key has been added to Github and `ssh` is using it. - -You may need to set the `GIT_SSH` environment variable to wherever your ssh executable is, as git might try and use its own ssh executable which doesn't seem to work with Keeper. `where ssh` will tell you where this is. - -{#manual_ssh_agent} -## Manually Setting up SSH agent - -```{note} -Ignore this section if you followed {ref}`the section on setting up keeper as your ssh agent`. -``` - -In a powershell window, run the following commands: -```powershell -Get-Service ssh-agent | Set-Service -StartupType Automatic -Start-Service ssh-agent -``` - -## Deploying the public key - -- Add your public key to the [keys repository](https://github.com/ISISComputingGroup/keys). -- Ask a developer whose key is *already* deployed to run the [`deploy_keys.py` script](https://github.com/ISISComputingGroup/keys/blob/main/deploy_keys.py), which will -update the `authorized_keys` files on each instrument. - -If the permissions on `administrators_authorized_keys` are wrong, that file won't work. The -permissions can be fixed by running: - -``` -icacls.exe "c:\ProgramData\ssh\administrators_authorized_keys" /inheritance:r /grant "Administrators:F" /grant "SYSTEM:F" -``` - -## One-off usage - -To connect via SSH to an instrument, use: - -``` -ssh spudulike@NDXINST -``` - -(If you aren't [using Keeper](#keeper_ssh)) This will prompt you on each connection for the passphrase to unlock your SSH key, this is the -password you set earlier for your personal SSH key. You will not be prompted for an -account password; your key is sufficient to grant you access. - -## Bulk usage - -:::{caution} -If you intend to run a command across many instruments, it is worth getting that command -reviewed by another developer and running it together. 
This is **especially** true if you intend to -run a command as a privileged user. -::: - -Typing the password to unlock your SSH key for each instrument would be tedious. -To avoid this, we can either [use Keeper](#keeper_ssh), or **temporarily** add the key to the SSH agent: - -``` -ssh-add -``` -This will prompt for the passphrase to unlock your SSH key. You can check that your key is now in -the SSH agent by running: - -``` -ssh-add -l -``` - -Once the key has been added to the agent, you can SSH to an instrument without any further prompts: - -``` -ssh spudulike@NDXINST -``` - -Commands can be executed like: - -``` -ssh spudulike@NDXINST "dir /?" -``` - -Since we no longer have any authentication prompts (having added our key to the SSH-agent), -this command is suitable for automating in a loop over instruments - for example from python -or a `.bat` script. - -Once you have finished with the administration task which needed SSH across multiple instruments, you -should remove your key from the agent (and then verify that it has been removed): - -``` -ssh-add -D -ssh-add -l -``` - -:::{important} -Do not leave these keys permanently added to the SSH agent - having *immediate* SSH access to *every* -instrument is an unnecessary risk most of the time (for example if your developer machine was compromised). -Add the keys to the SSH agent only when needed, and remove them from the agent again when your administration -task is complete. The usual sudo lecture applies: -> We trust you have received the usual lecture from the local System -> Administrator. It usually boils down to these three things: -> 1) Respect the privacy of others. -> 2) Think before you type. -> 3) With great power comes great responsibility. 
-::: diff --git a/doc/webdashboard/ISIS-Info-Slack.md b/doc/webdashboard/ISIS-Info-Slack.md deleted file mode 100644 index ad7f2c629..000000000 --- a/doc/webdashboard/ISIS-Info-Slack.md +++ /dev/null @@ -1,3 +0,0 @@ -# ISIS Info (Slack) - -The ISIS info Slack system is detailed in the "ISIS Info Slack" document on Teams General \ No newline at end of file diff --git a/doc/webdashboard/MCR-News-in-Teams.md b/doc/webdashboard/MCR-News-in-Teams.md deleted file mode 100644 index bb216a96b..000000000 --- a/doc/webdashboard/MCR-News-in-Teams.md +++ /dev/null @@ -1,6 +0,0 @@ -# MCR News (Microsoft Teams) - -MCR news and beam information is available in Microsoft teams for isis staff - -* ISIS Staff Details [word](https://stfc365.sharepoint.com/:w:/r/sites/ISISExperimentControls/ICP%20Discussions/MCR%20News%20and%20Beam%20information%20in%20Microsoft%20Teams.docx?d=w79783f105d7945aabf33501408768a27&csf=1&web=1&e=cjl8XE) [pdf](https://control-svcs.isis.cclrc.ac.uk/msteams.pdf) -* [Developer support details](https://stfc365.sharepoint.com/:w:/r/sites/ISISExperimentControls/ICP%20Discussions/MCR%20News%20Teams%20support.docx?d=w33ffb5d1fccd42fa8a0f9c531af306f2&csf=1&web=1&e=eZTeLE) diff --git a/doc/webdashboard/Web-Dashboard.md b/doc/webdashboard/Web-Dashboard.md deleted file mode 100644 index ff65cecfa..000000000 --- a/doc/webdashboard/Web-Dashboard.md +++ /dev/null @@ -1,123 +0,0 @@ -# Web Dashboard - -```{important} -This page documents the old web dashboard, also known as `JSON_Bourne`. The [new one](https://github.com/ISISComputingGroup/webdashboard) should have its architecture and how it's run explained in the README of its repository. -``` - -The dataweb service allows some information about each instrument to be viewed in a webpage from both inside and outside the ISIS network. 
- -## The Overall Architecture - -![Architecture](dataweb_architecture.png) - -The dataweb system consists of a number of parts running on each instrument: - ---- - -### The Archive Engine - -The archive engine shown in the [high level design](/overview/High-Level-Architectural-Design) produces internal webpages that provides live data on various PVs in HTML format: - -* INST (located at http://localhost:4812/group?name=INST&format=json) gives data on the PVs associated with the DAE etc. -* BLOCKS (located at http://localhost:4813/group?name=BLOCKS&format=json) gives data on the current status of all block PVs -* DATAWEB (located at http://localhost:4813/group?name=DATAWEB&format=json) gives data on hidden blocks - -### The WebServer - -The webserver is run as part of the BlockServer and provides all of the data on the current configuration in JSON format. This is the exact same data that is served on the GET_CURR_CONFIG_DETAILS PV. The webserver is currently serving the data on localhost:8008. Note that the fortinet VPN uses 8008 for internal configuration and so you cannot access this address through the fortinet VPN. - ---- - -### On the Dataweb Server - -There are also parts of the system running on a central [webserver](/systems/Webserver), which provides external access. - -### JSON Bourne - -The program collates all the data from the other sources, on all the EPICS instruments, such as putting the blocks and their values into the relevant groups as given by the configuration. This information is served as JSON to localhost:60000. This runs as a service on the central server and lives in C:\JSON_Bourne. - -### The Website - -Currently a simple JS script takes the JSON created by JSON Bourne and provides a simple webpage for an external client to view. This can be accessed from http://dataweb.isis.rl.ac.uk/. The code for the website, both the html and javascript are located in the central server at `C:\inetpub\wwwroot\DataWeb\IbexDataweb`. 
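The archive engine endpoints listed earlier in this section can be polled with a short script. The following is only a minimal sketch: the ports and group names are the ones quoted above, while the helper names (`archive_group_url`, `fetch_group`) are hypothetical, and it assumes the response body is valid JSON, which this page does not confirm.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Group name -> port, as quoted in the archive engine section of this page.
ARCHIVE_GROUPS = {
    "INST": 4812,
    "BLOCKS": 4813,
    "DATAWEB": 4813,
}


def archive_group_url(group, host="localhost"):
    """Build the archive engine URL for one PV group (hypothetical helper)."""
    port = ARCHIVE_GROUPS[group]
    query = urlencode({"name": group, "format": "json"})
    return f"http://{host}:{port}/group?{query}"


def fetch_group(group, host="localhost", timeout=5.0):
    """Fetch one group and decode it; assumes the body is valid JSON."""
    with urlopen(archive_group_url(group, host), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URLs; call fetch_group() on a machine running an instrument.
    for name in ARCHIVE_GROUPS:
        print(name, archive_group_url(name))
```

Running this on an instrument and calling `fetch_group("BLOCKS")` may be a quick way to check whether the archive engine is serving data before investigating JSON Bourne itself.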
- -### Grafana and Journals Setup - -Docs can be found on the shares at `shares\isis_experiment_controls\web_dashboard_history.docx` - -## Deployment - -To update the production version of the dashboard: -* Remote desktop into external webserver (for username and password see password page) -* Open a git bash terminal in C:\JSON_Bourne and switch to the [release branch](/deployment/Creating-a-release) -* Run the deploy batch script as admin -* Restart JSON_bourne (see [here](#webdashboard_restart_dataweb)) -* Go to (for example) http://dataweb.isis.rl.ac.uk/IbexDataweb/default.html?Instrument=zoom and confirm the webpage is live - -## Development/Testing - -Clone the repository at https://github.com/ISISComputingGroup/JSON_bourne - -To test the blockserver webserver: -* Start your instrument -* Navigate to `localhost:8008` in a browser - -To test JSON_Bourne: -* Run webserver.py -* Navigate in a browser to http://localhost:60000/?callback=parseObject&Instrument=[Instrument-Name]&. - Where [Instrument-Name] is replaced by the desired instrument (e.g. ZOOM&) in all capitals. - -To test the front end on a developer machine: -* Open default.html with the query parameter ?Instrument=instrument-name, e.g. go to `file:///C:/Instrument/Dev/JSON_bourne/front_end/default.html?Instrument=larmor` in a browser to view Larmor's dashboard. Note that the path is dependent on where you have created the local JSON_bourne repository. This will use the JSON bourne instance running on NDAEXTWEB! - -To test the front end and JSON bourne on a developer machine: -* Run `webserver.py` -* Edit display\_blocks.js to look at http://localhost rather than http://dataweb.isis.rl.ac.uk -* Open default.html as above - -To be able to see your instrument as well: -* Add your instrument to the `local_inst_list` dictionary in `webserver.py` e.g.
`local_inst_list = {"LOCALHOST": ("localhost")}` -* Run your instrument -* Start JSON Bourne as above - -If you need to update the archive engine then you will need to: - -1. Run create_icp_binaries.bat -1. `make clean uninstall install` in `..\EPICS\CSS\master` - -{#webdashboard_troubleshooting} -## Troubleshooting - -### General Investigation - -First look at the log to ensure that there are no issues. The log is held in `C:\JSON_Bourne\log`. Issues may be in the front end, in which case error logs are in the web browser: visit the webpage in a browser and open up the web console. -If there are a number of `HTTP Error 503. The service is unavailable` errors, restarting the server completely may be required, but simply restarting the Dataweb should be the first thing to try. - -{#webdashboard_restart_dataweb} -### Restart the Dataweb - -As admin open the "Task Scheduler" (there is a shortcut for this on the desktop) and end and run the "JSON Bourne" task (in task scheduler library). Make sure that ending the task has killed the Python webserver process. - -### New Instrument with No Details - -If the instrument archive has never been restarted then the dataweb will fail to show any information and claim that the server hasn't been started. To fix this, simply restart the instrument archive. - -{#webdashboard_troubleshooting_instrument_page_not_working} -### Instrument Page not Working on Web Dashboard - -There are several possible causes: - -1. Check that the instrument is in the list of Instruments in https://github.com/ISISComputingGroup/JSON_bourne/blob/master/webserver.py and that the version on the web server is up-to-date. - -1. Issues with MySQL at the moment the IBEX server was started (this seems to affect the archiver start up). Check logs of the MySQL service in the `var` area, fix any issues so that MySQL is running correctly again, then restart the IBEX server. - -1.
If it works in your browser but not the user's, they may have an old cached copy (this shouldn't happen but we have seen it in Safari). Clear their browser cache and reload. - -1. Try restarting `ARINST` on the instrument. It can happen that the archiver does not pick up all PVs to be archived on server startup. A symptom of this is that the configuration file under `EPICS\CSS\master\ArchiveEngine\inst_config.xml` is very short compared to other machines. - -## Future Development Ideas - -* We need to improve the unit test coverage of this project. It would be worth looking into the [requests-mock](https://pypi.python.org/pypi/requests-mock) library as this would make it very easy to test server code which makes HTTP requests. - -## Overview page - -http://dataweb.isis.rl.ac.uk/IbexDataweb/Overview/ibexOverview.html \ No newline at end of file diff --git a/doc/webdashboard/dataweb_ architecture.xml b/doc/webdashboard/dataweb_ architecture.xml deleted file mode 100644 index 52732661b..000000000 --- a/doc/webdashboard/dataweb_ architecture.xml +++ /dev/null @@ -1 +0,0 @@
-7Vpdc5s4FP01nmkf7AGJD/Fox+623SbpjDOT9FEG2WaCkSvk2O6vX8kIDAI7eAPJpt08xOiiL849OrpX0INXq91fDK+X1zQgUQ8Ywa4Hxz0ATAsa4kda9qnFtezUsGBhoCodDdPwF1FG1W6xCQOSlCpySiMerstGn8Yx8XnJhhmj23K1OY3Ko67xglQMUx9HVet9GPBlakXAOdo/k3CxzEY2HS+9M8P+44LRTazG6wE4P/ylt1c460s9aLLEAd0WTHDSg1eMUp5erXZXJJLYZrCl7T6duJvPm5GYN2lgpQ2ecLRRj/4lTjjbrGT7dIZ8n6GyXYacTNfYl+Wt8HwPjpZ8FYmSKS5xFC5icR2RuRh79EQYDwWkQ2XmVNaf05grf5tQlNX4oi7ZnXwGM0dGMI7QFeFsL6qoBpbCcl8ubo+OMx1lWxacBrOKWJFlkfd8BExcKMzq8bPP4DcYDN4phNBrimELEDq/JYQWQK8HIapAOGT+MnwiwjiJF2FMLkVRB6gZqmEUXdGIssMQcI584vvCLrxJH0nhzgzZlm20g7tdxt2tYa5VAzswWsDdq1L3ZnrXFta+QICwdlByyiiZhl2FCdTAlO3bL0Ep29QLMI2+3V79PX0PQDk1fOoMKLMC1Hh4N7yfjN4BUsB8TUqBKqUi6j8mhIlneguxI2ZgE7dO7DzHhdjpROwAaqh2XgtiZ1ZjxXsym7aKeIf8hKbzivysxoVjzPGWzIQxg8z4cDMeTh7uxPo2P74KZ1tA1Xo+2s4Fv4QqaIOC1VixghuJg6HMACWbIpwkoX8eOLIL+YMoGANTlX4cSrYqfScsFBOVrBwfQxYSVBJIDUExKbphflbLrUe1AJtdg1pmYyTCXMZzpUS3Bkk1wncayvwtc1rukb1WzrpIp6paFdNFrSMLonJHhlfuiGO2ILzS0cGx+WM383U1uFpyLlg8lL2CTzfjh/Hk+laULSQcKW0aEwTJedn3jCThLzw7VJC+XMt5HmZuj3r2uE6I9IW0CoNAth9FeEaiUZ7uF+PeNOG/IMtQxxhqZr38dKBEKufsujQGlpXFD8o5fdAKefqO1q2pdUHn84S82N3u80s7WeK1vPQ3LNqPmMCeSAc/o5QH95CgmW4yygVWVFr6JmopRQGGVUKwLqiEsLr6zTZyFLOaHL5YM5tq4GlELE1HarLlc3p4gfQZWkAPtO29ufR55ztqT/pANQ84LX3gD5C+s1yCyBwYBjKg6yDXc6GruQkOgOuJKBx6EFoeQv9N/QOw/WWahTYCH7GbFoMbhE6FNwJ0tpeN+oZENbOkzdy0G2loJyxSupgunWLo/FaRkhbd2vrpXFO18LQoGWn9tCgW1bTshFgg6c8/XSyAd0YsoAVKYuFdSp++Fmj3TdiFVlSzy99SK84EVKC8vGqOM9sKH06MdHn44GodOZ0pQodZ8juhByxjbcHO6HFipJcm1pZldUaPukzLieQWEIRPJZo4PzfyRfRI7hB9JflDqcOp6uf3xdVC/R76SdY4zmxfp7c3oslIACFfQ6UVxByLdQrmwxQyq0balx2JaQe2ASZoXvt2yvERmc27OS6rebGKapgInRYyv5rXU106+p7MDp9VAOPD57vrbz25OTt4JZ2U/heWr/gJJz4L1/zjm1AhD1fehAy2+3ZkyALZTjaF6sFpJv35PnC811Tey5lCplrFVCEl+P+pQlPlh9VzhQ458C/9fMKnOw3lM2/A2tnaT450qX9d/fDdbcvBonj8UiytfvwcD07+AQ== \ No newline at end of file diff --git a/doc/webdashboard/dataweb_architecture.png b/doc/webdashboard/dataweb_architecture.png deleted file mode 100644 index 189098d56..000000000 Binary files 
a/doc/webdashboard/dataweb_architecture.png and /dev/null differ