April 2011 Archives
April 26, 2011
April 18, 2011
Gosset did not come up properly after Friday's power problem; we are investigating. It may be several days before we get it back up.
http://www.math.mcmaster.ca/blogs/archives/computing_news/2011/04/serverpower-pro.html
April 15, 2011
We are still trying to pinpoint the source of the partial power failure in the server room earlier this afternoon. We know that something went wrong with a UPS unit which has served us faithfully for six years now, but we don't know precisely what.
Depending on what we find, we may need to shut down the storage array and compute servers with very little notice. A similar power failure is also possible.
There may be loss of access to home directories and interruptions to mail and web service with little or (should the power fail) no notice. So: save early and save often. I'll post an update once we know that things are stable again.
We lost power to part of the Hamilton Hall server room on Friday afternoon just before 3:00 pm. The main server wasn't affected, but the main storage array was, which means that mail, web and workstations were unavailable until the problem was corrected. Web sites were back up by half past three, but other services were spotty until about four o'clock.
There was no damage to the files on the storage array, though some mail may have been returned to senders as undeliverable.
Most workstations will need to be rebooted (Alt-Ctrl-F1 then Alt-Ctrl-Del); some will need to be restarted (hold power button for ten seconds to turn off then turn back on).
Any jobs running on bayes, gosset or freesurface will have been lost as those servers were connected to the part of the power system which failed.
April 12, 2011
I've installed Firefox 4.0 on the ms workstations but I have not made it the default since this is a .0 release and so a little bit suspect; the Firefox icon will still bring up version 3.5.
But you can try Firefox 4 with the command firefox4 (at a command prompt or via Alt-F2).
April 8, 2011
Facility Services has announced that air conditioning to Hamilton Hall will be turned off from 6:00 am to 4:30 pm on Saturday, April 30th. In order to prevent damage from overheating, we will be shutting down most systems in the server room on Friday afternoon: this means bayes, gosset, freesurface, etc.
I will leave the main file/web/mail server up, but if the room starts getting too hot I will shut down everything but web services (no email, no workstations, no changes to the web server).
Announcement from FS follows...
But a spate of such spam has hit @math.mcmaster.ca accounts right on the heels of some server upgrades and mail problems, so I'm going to emphasize that RHPCS will never
- ask you for your password
- send a message without signing off with the name of a specific RHPCS staff member
- commit more than two outrageous solecisms per message
April 7, 2011
Email messages addressed directly to @math.mcmaster.ca addresses sent from off-campus sources (e.g. gmail.com, another university or from home without a VPN connection) were not deliverable between 5:00 pm Tuesday and 4:30 pm Wednesday.
Some of the undeliverable messages will have been queued off campus and delivered once @math.mcmaster.ca was accessible to external mail servers again. Other messages will have bounced, in which case the sender will most likely have received a delivery-failure warning.
Messages sent from the following sources were not affected ...
- mathmail.mcmaster.ca web mail
- univmail.mcmaster.ca or muss.mcmaster.ca web mail
- mail clients (Outlook, Thunderbird, Mail) used on campus
- mail clients used from off-campus with a VPN connection
April 6, 2011
Wikis hosted at wiki.math.mcmaster.ca are down today and will be until Thursday morning.
Most mail from off campus addressed to @math.mcmaster.ca addresses is not reaching the server; the problem started when we moved the mail server from the ABB server room to the HH server room yesterday evening.
I'm working with UTS to resolve the problem and expect to have it sorted out this afternoon.
Note that ...
- mail forwarded by @mcmaster.ca to @math.mcmaster.ca is arriving;
- mail forwarded by univmail is arriving;
- all mail originating from univmail, muss, other campus mail servers, or VPN-connected clients is arriving.
Our main server (ms.mcmaster.ca) is now back in HH after a few months in the ABB server room and is using a new, larger disk array, also in HH (we were borrowing space in ABB while ms was there). Thanks for your patience as we completed another part of the migration to new server infrastructure.
A few notes regarding the downtime and recovery ...
- contrary to my plan, the xguest login on the ms workstations did not work
- the downtime extended to 8:45 pm instead of 7:00 pm
- the web server was down from 4:55 pm to 5:20 pm
- the main page and other database-driven pages were down for another hour
- other sites (e.g. course and instructor pages) were OK
- most workstations are working fine as of 9 o'clock Wednesday morning, but a few will need to be rebooted; if your workstation is frozen:
- hold down the power button for ten seconds to turn it off
- wait five seconds
- turn it back on
- note that the boot may take five minutes or so while the disk is checked for errors
April 5, 2011
Because ms.mcmaster.ca has moved between buildings (from ABB to HH), it has been given a different IP number (i.e. network address). You should remove the old entries for the server from your ssh host-key file in order to avoid dire warnings of "Offending keys".
ssh-keygen -R ms
ssh-keygen -R 22.214.171.124
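If you reach the server under several names, a small wrapper can check for a stale entry before removing it. This is only a sketch, assuming OpenSSH's ssh-keygen with its -F (find), -R (remove) and -f (file) options; the function name forget_host and the KNOWN_HOSTS variable are made up for illustration and are not part of any installed tool.

```shell
#!/bin/sh
# Sketch: remove stale host-key entries for each name or address given
# on the command line.
# KNOWN_HOSTS is a hypothetical override; the default is the usual
# per-user OpenSSH file.
KNOWN_HOSTS="${KNOWN_HOSTS:-$HOME/.ssh/known_hosts}"

forget_host() {
    # ssh-keygen -F prints the matching entries, if any; empty output
    # means there is nothing to remove for this name.
    if [ -n "$(ssh-keygen -F "$1" -f "$KNOWN_HOSTS" 2>/dev/null)" ]; then
        # -R rewrites the file without the entry and keeps a .old backup.
        ssh-keygen -R "$1" -f "$KNOWN_HOSTS"
    else
        echo "no entry for $1 in $KNOWN_HOSTS"
    fi
}

# With no arguments this loop does nothing.
for h in "$@"; do
    forget_host "$h"
done
```

Saved as, say, forget-hosts.sh, it would be run as "sh forget-hosts.sh ms ms.mcmaster.ca"; the next ssh connection will then ask you to accept the server's key at its new address.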
Some of you will be having trouble getting to your mail via imap clients or webmail until later on this evening: I neglected to redirect mathmail.mcmaster.ca to the new network location of the server. My apologies.
Note that anyone using the addresses mail.math.mcmaster.ca or ms.mcmaster.ca won't see these problems - though mathmail.mcmaster.ca is the preferred address.
The scheduled downtime from 4:45 pm to 7:00 pm this evening will proceed as planned but with these differences ...
- the www.math.mcmaster.ca web site will not be down for more than a few seconds (though you won't be able to make changes during this period)
- limited-use guest accounts will be available on most workstations
April 4, 2011
We will be moving our main server and storage array from their temporary berth in ABB back to our HH server room on Tuesday. Email, workstation and home-directory access will be down from 4:45 pm to 7:00 pm on Tuesday; web sites will be down from ca. 6:50 pm to 7:00 pm.
If all goes well, I should have the Linux workstations set up so that you can log in and run a browser without logging into the server; you won't be able to read your mail @math.mcmaster.ca or get to your files, though.