## June 14, 2016

### Electronics Collection, Reuse & Recycle Event

Facility Services will be collecting old electronic equipment ("anything with a power cord") on Thursday, June 16th from 10 - 2.  Items can be dropped off on the lawn across from University Hall or you can arrange for free pick up (call FS at ext. 24740).  There will be a chance to scavenge on June 22nd.

If you have any computers to dispose of, please email us first so that we can remove your hard drive for safe destruction.

## June 13, 2016

### Power outage this morning

There was a widespread power outage for a couple of hours early this morning.  Systems not on battery power went down, of course, but so did systems on UPS units, as the outage outlasted their battery capacity.

We had to bring up / wait for multiple layers of equipment (power, network, storage, physical hosts, virtual hosts) in the ABB and HH server rooms before all services (mail, web, file hosting, network configuration, etc.) were working again.  By 9:45 am, most systems and services were back on line and responding normally.

We are still tending to the wounded, however: we have a power problem in the ABB server room and at least one group server in HH isn't coming back on.

If your workstation or server is not working or behaving properly, please let us know.

## May 25, 2016

### New ssh key on ms & dealing with the alarming warning

When we replaced (the problem-having Mageia version of) ms last week (with a fresh, shiny, CentOS version), it got a new ssh identity key.   If your account + computer has saved the old key from prior ssh/sftp sessions, you will see a warning something like the one shown below.

To get rid of that warning, purge your ssh known-hosts file of the old key like so (from linux, OS X, or MobaXterm):

ssh-keygen -R ms
ssh-keygen -R ms.mcmaster.ca

Update 2016/06/02: to be more thorough, run this command, too:

ssh-keygen -R 130.113.105.93
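
To check which saved entries those three commands target before running them, something like the following sketch may help. It is only an illustration: the names `entries_for` and `OLD_NAMES` are made up for this example, it assumes the default `~/.ssh/known_hosts` location, and it only understands plain (unhashed) host names. Hashed entries are skipped, so `ssh-keygen -R` remains the right tool for the actual removal.

```python
from pathlib import Path

# The three names used in the ssh-keygen -R commands above.
OLD_NAMES = ["ms", "ms.mcmaster.ca", "130.113.105.93"]


def entries_for(known_hosts_text, names):
    """Return the known_hosts lines whose host field lists any of `names`.

    Assumption: entries are unhashed. Hashed entries (host field starting
    with "|1|") cannot be matched by name here and are skipped.
    """
    wanted = set(names)
    hits = []
    for line in known_hosts_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment
        fields = stripped.split()
        if len(fields) < 2:
            continue  # not a well-formed entry
        # An optional marker (@cert-authority, @revoked) precedes the host field.
        host_field = fields[1] if fields[0].startswith("@") else fields[0]
        if host_field.startswith("|1|"):
            continue  # hashed host name, see the assumption above
        if wanted & set(host_field.split(",")):
            hits.append(line)
    return hits


if __name__ == "__main__":
    path = Path.home() / ".ssh" / "known_hosts"
    if path.exists():
        for entry in entries_for(path.read_text(), OLD_NAMES):
            print(entry)
```

If the script prints nothing, either the old key is already gone or your known-hosts file stores hashed names; in both cases running the `ssh-keygen -R` commands is harmless.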

Or there is the nuclear option, which will clear all host records on the current computer + account:

\rm ~/.ssh/known_hosts

You may need to do this on more than one account + host.
An example of the warning appears in the May 20 entry below.

## May 20, 2016

A few other things to note about the emergency replacement of ms:

Most workstations and compute servers had to be rebooted, though - as always - we avoided doing so if we could.  That means that any running jobs were terminated.

When you connect to ms via ssh or sftp, you will see an alarming looking warning something like this:

    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

That is because the new ms has a new ssh identity key (a change in key encryption meant that we could not reuse the old one).  You will need to run these commands (under OS X, linux, or MobaXterm):

ssh-keygen -R ms; ssh-keygen -R ms.mcmaster.ca

### ms is dead; long live ms

Due to some unexplained problems with the previous ms.mcmaster.ca operating system, which began on Wednesday and which we were unable to resolve by other means by mid-day Thursday, we did an emergency installation of a new ms.math.mcmaster.ca.  It runs a newer, more stable and maintainable linux distribution, to which RHPCS has been migrating all RHPCS-managed servers over the past two years.

We have been intending to do this for quite some time but wanted to do it in a scheduled manner to minimize disruption to users.  Unfortunately we were forced into doing this last night as our best way forward from the instability we were experiencing with the old ms installation.

We apologize for the disruption this has caused.

We realize that there may well be some unintentional side effects of the ms server operating system changes, both on ms and other computer systems that are dependent on ms, and we will deal with these as quickly as possible once we discover them or they are brought to our attention.  Please don't hesitate to report such problems to RHPCS.

As of Thursday at 7:15 pm, logins were enabled, server/workstation authentication was working, and most ms.mcmaster.ca web sites were back on line.

As of Friday at 10:30 am, most other services were available, including access to the shared printers.

As of Friday at 1:00 pm all but a few web sites are working.

## May 19, 2016

### Problems with ms fixed

The main departmental server - ms - started having problems yesterday; most notably, web pages were not being served and Windows file shares weren't responding.  The problem was temporarily solved by reboots yesterday, but this morning we took ms off-line completely in order to fix the underlying problem (which had to do with the main file system).

The problem is corrected and ms and its services - web, printing, home directories, file sharing - are all working.  Workstation and server sessions which stopped responding when ms was rebooting or off-line will have re-established themselves automatically in most cases.