Today I upgraded our eight core authentication ("Single Sign On") servers from commercial ssh to OpenSSH (it was a 2-year battle to standardize on OpenSSH, but in the end the right product won).

The tricky things are: a) These machines are critical enough that they're not directly accessible from campus, so everything must be done through an intermediary machine, which complicates every step. b) Upgrading ssh is problematic because it's the remote control tool, so working on sshd interferes with your own access to the machine. Fortunately we have good terminal servers, which make this much less of a problem; when sshd is down, you can get in through the back door to bring it up.

One of the neat things about UNIX is that you can delete a file while it's still in use: the name disappears from the directory, but the file itself is not actually removed until the last process using it lets go (when the last filehandle is closed, the disk space is reclaimed). So on systems without a terminal server, I've actually done the whole upgrade through an sshd binary which was deleted at the beginning of the upgrade, replaced during the upgrade, and still in use until the very end.
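For anyone who hasn't seen the delete-while-open behavior in action, here is a minimal Python sketch of it on an ordinary scratch file. It is only an illustration, not the upgrade procedure itself: the function name, the file name "sshd-standin", and the file contents are all made up for the demo, and it assumes a UNIX-like system (a running binary is held open by the kernel in much the same way as the filehandle here).

```python
import os
import tempfile


def demo_unlink_while_open():
    """Show that an open file stays readable after its name is deleted."""
    # Scratch directory and file name are hypothetical, for illustration only.
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "sshd-standin")
    with open(path, "w") as f:
        f.write("old binary contents\n")

    # Open the file, then remove its directory entry while it is still open.
    handle = open(path, "r")
    os.unlink(path)
    name_gone = not os.path.exists(path)

    # Install a replacement under the same name; this creates a new file,
    # it does not touch the data the open handle still refers to.
    with open(path, "w") as f:
        f.write("new binary contents\n")

    # The handle opened earlier still reads the original data.
    old_data = handle.read()
    # Closing the last handle lets the kernel reclaim the old file's space.
    handle.close()

    with open(path) as f:
        new_data = f.read()

    # Clean up the scratch file and directory.
    os.unlink(path)
    os.rmdir(workdir)
    return name_gone, old_data, new_data


if __name__ == "__main__":
    print(demo_unlink_while_open())
```

The same idea is what keeps an existing ssh session alive during the upgrade: the old sshd keeps running from the deleted file, while new connections get whatever has been installed under that name.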

I (we) have been doing this long enough, and refined the procedure sufficiently, that the whole upgrade took under 90 minutes total for 8 machines, although there was a lot of prep and follow-up, cleaning up accounts, /etc/sudoers, installing public keys, etc.

Overwriting /etc/passwd, /etc/shadow, and /etc/group with copies from the intermediary machine was particularly stressful; I was highly relieved that nothing broke.