Archive for February, 2008

Mac OS X Leopard: Changes and confusion regarding network mounting

Apple put a lot of effort into making network sharing (Mac and Windows networking using the AFP & SMB/CIFS protocols) easier in Leopard. One of the things they did was introduce credential caching at the system level, so once you mount another Mac via AppleShare (for instance), you can then connect to it with Screen Sharing too, without authenticating again. This is neat, but a bit problematic. I have had cases where:

  1. I had to kill NetAuthAgent (the background process that appears to hold username/password pairs on your behalf) to make mounting work
  2. I had to rearrange windows onscreen, because a (stalled) progress window was hiding a username/password window and was never going to get anywhere without some help; other times I dismissed the progress dialog without realizing it was waiting for a concealed window.
  3. I have had to Force Quit and relaunch the Finder before it could (re-)mount some or all network volumes.
  4. I have had to reboot the Leopard server before I could (re-)mount its volumes.
  5. I have had Leopard systems fail to share out volumes, and had to re-share them manually. Part of this appears to be a different issue, where Leopard systems don’t even mount additional drives until a user logs in (obviously unmounted volumes cannot be mounted over the network). That’s not right!

Tonight’s problem was a bit different: I was connecting to a Windows server running Samba, and not getting the right permissions. When I looked in the server’s /var/log/samba/smbd.log (because I cannot find any way to see the account used for a network mount in the Finder), I discovered that the share was mounted as the wrong user. I had never gotten the username/password dialog for this mount, as I had (the wrong) user credentials cached in NetAuthAgent.

The Tiger behavior is to default to the client username (the account mounting the share from the server). Leopard instead uses whichever user it has a cached credential for. I have now changed my scripts to always specify the username when mounting shares, e.g., open smb://pepper@inspectore/inspector.
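A minimal sketch of how a script might always build the mount URL with an explicit username, as described above. The helper name `smb_url` is hypothetical; it only assembles the URL string and leaves the actual mounting to `open`.

```python
from urllib.parse import quote


def smb_url(user: str, host: str, share: str) -> str:
    """Build an smb:// URL that names the account explicitly.

    Hypothetical helper: it only formats the string; it does not mount anything.
    """
    # Percent-encode the user and share so spaces or '@' cannot break the URL.
    return f"smb://{quote(user, safe='')}@{host}/{quote(share, safe='')}"


if __name__ == "__main__":
    # Prints: smb://pepper@inspectore/inspector
    print(smb_url("pepper", "inspectore", "inspector"))
```

The resulting string could then be handed to `open`, so a stale cached credential for a different account is never picked silently.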


Scarce at Union Hall

Joyce, singing

Joyce (Raskin) White is a friend of ours from the neighborhood; Julia and her daughter Sydney are a couple days apart in age and were best friends when they lived in Brooklyn. A few years ago Joyce, Matt, and Sydney moved to Boston, and we were all sad. Before Brooklyn, Joyce was in a fairly successful rock band named Scarce, but they broke up after a brain injury took Chick Graning (lead singer) out of commission.

A couple years ago, Joyce started writing a book about her experiences growing up as a female rocker, called Aching to Be: A Girl’s True Rock and Roll Story. Amy edited the book, and we’ve been waiting to see Scarce perform ever since.

Tonight they played at Union Hall, just down the street, and we finally got to watch Joyce rock out. It was most excellent, and I got a mess of pictures.


Between Jobs

For dessert: 4 bags of chocolate chips

It feels very very strange to be unemployed — it’s been 7 years since the last time, and I was too freaked out at Shooting Gallery laying me off to feel this way. Now that I’m a grown-up (having kid(s) means you’re responsible, even when you’re irresponsible!) it’s a good thing that we’re covered by RU insurance past the start date for GS insurance, but the whole experience is still very odd. I wiped the third computer today at 5:30pm, and am copying data off computer #4 (old reppep.com) right now in preparation for retiring it (it’s falling apart, apparently — optical drive died an hour ago).

Now I just need Apple to update the MBP15s, so I can replace this PowerBook. It’s doing better than I thought, though — doesn’t seem any doubt that it will serve until the next update.

RU IT did right by me today — a grand spread, consisting of John’s pizza, baby back ribs, beef ribs (they looked like something from The Flintstones), and chicken wings. A nice (short) speech by Armand, and well wishes all around. Elaine hung a bunch of colorful signs, which delighted Julia.

I closed out my helpdesk tickets, turned in my keys (forgot to turn in my ID/swipe card, though), and updated the documentation on our load balancers again, as well as re-re-recapping for my co-workers. I had to say “Look, when you feel like you’re an idiot, don’t worry; I felt like that repeatedly for years while working with these. The Big-IPs are absurdly complicated. Two kernels, a super ‘switch card’ that’s doing all kinds of crazy (non-switch) stuff, over 20 IP addresses, 8 networks, plenty of bugs, and delays in getting technical support. It’s not you!”

Maybe I’ll have some time to investigate Linux & Windows text editors.


reppep.com Migrated

On Feb 19, 2008, I shut down the old reppep.com server, which ran Mac OS X 10.4 “Tiger” Server, and replaced it with a new (cheaper and faster) PC running Linux. Unfortunately, the password formats are incompatible, so I apologize to all reppep users for the disruption.

Please call me if you have an account on reppep.com and haven’t received your password already, or find anything not working right.

I switched from Apple’s jabberd to Openfire, which doesn’t use the UNIX system accounts, so let me know if you want a chat account (compatible with iChat & GTalk).


[Done] I forgot SquirrelMail address books — should be able to bring those over too.


  • Firewall problem fixed. SMTP MX issue fixed.
  • Virus filtering problem fixed.
  • Webmail certificate fixed.
  • Quota problem fixed.
  • Virtual domains for email fixed.

As of 5pm, I don’t know anything that doesn’t work (aside from SquirrelMail address books) [fixed Thursday].

Thanks for your patience!


As of 10:30 on the 20th, things seem to be working. Something’s screwy with amavisd-new’s quarantine, but mail is going through. I reinstalled Openfire, and chat seems okay under the correct hostname/certificate name now (will try signing it as ca.reppep.com later).

Good timing — the optical drive on the old server died tonight.

I have distributed all the new temporary passwords, so any users having trouble logging in should let me know.

Markdown.cgi is still broken, but I’m the only person who uses it here, so I’ll get to it.


On Thursday the 21st, I found a problem with amavisd-new: it had quarantined 32,000 messages in a single directory, and was stuck (apparently ext3 doesn’t support more than 32,000 files in a directory). I cleared it out and finally managed to disable quarantine, which wasn’t as easy as it should have been, and the backlog of messages has been delivered as of 9:15pm.
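A quarantine directory filling up like this is easy to watch for. The following is a minimal sketch, assuming the 32,000 figure from above as the threshold (it is taken from this post, not measured); the function names and the 90% warning margin are hypothetical.

```python
import os

# Threshold taken from the post; treat it as an assumption, not a measured limit.
ENTRY_LIMIT = 32_000


def count_entries(path: str) -> int:
    """Count the directory entries directly inside path (not recursive)."""
    with os.scandir(path) as entries:
        return sum(1 for _ in entries)


def near_limit(path: str, limit: int = ENTRY_LIMIT, margin: float = 0.9) -> bool:
    """Return True once the directory holds at least margin * limit entries."""
    return count_entries(path) >= limit * margin
```

Run from cron against the quarantine directory, a check like `near_limit(...)` would have flagged the problem before mail delivery stalled.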

At 11pm, I fixed an issue preventing SMTP AUTH from working properly, which was interfering with sending email to non-reppep addresses.


System Admin Interview Questions

I was quite impressed by Joel’s description of the hiring process, and we’ve been doing a lot of interviewing for System Admins lately. I put together a list of standard questions to ask during interviews, which has been quite helpful in judging a) how much technical knowledge people have, and b) (just as important) how good a match they are for the skills void we were trying to fill at the time. Here they are, for the next person who needs to perform a similar exercise.

  1. How many systems does your team manage (Linux, Solaris, Windows, etc.)?
  2. How large is your team?
  3. Which OS are you most comfortable/familiar with?
  4. Which Linux flavors are you most comfortable/familiar with?
  5. Which Red Hat versions are you familiar with?
  6. Are you familiar with kernel programming or configuration?
  7. Have you done any custom packaging or kickstarting?
  8. Have you used or managed Sun JumpStart?
  9. How much experience do you have with Sendmail?
  10. … NetWorker? Version? Managing backups, or just configuring clients?
  11. … LDAP? Brand & version? LDIF or just querying?
  12. … firewalls (iptables, ipf, etc.)?
  13. … network administration (Cisco, sniffing, etc.)?
  14. … Apache httpd?
  15. … Tomcat & Java?
  16. … EMC (Clariion, PowerPath)?
  17. … shell scripting, and with which shells?
  18. … Perl scripting?
  19. … Veritas VM/FS? Versions?
  20. … Veritas Cluster, or other HA? Versions?
  21. … snapshots? In which products?
  22. … load balancing?
  23. … Oracle (as SA, not DBA)?
  24. … HPC?
  25. Please briefly explain the difference between RAID 1 and 5. What are layered RAID levels, and when are they appropriate?
  26. What sizable projects have you done recently?
  27. Why are you leaving your current employer / did you leave your last employer?
  28. Please give specific examples of some routine tasks you’ve performed recently.
  29. Have you done systems specification and design (servers, multi-server configurations)?
  30. Have you worked with customers directly, or primarily with/for other IT personnel?

It didn’t make sense to publish a list of questions when I was involved in the interviewing process, but now that I’m leaving Rockefeller and no longer interviewing UNIX Admins for them, I can post my sample questions.


Reading in bed, and iPhone trick

The other day I was lying on my side, trying to read a web page on the iPhone. I turned the iPhone 90° clockwise, but it obligingly re-rotated the text 90° counter-clockwise, leaving me again out of sync. I grumbled something about the irritation of being outmaneuvered by a handheld gadget. Amy’s brilliant suggestion: rotate it another 90° CCW. Since the iPhone doesn’t offer 180° rotation, this left the text rotated 90° in alignment with my head.

Thanks, Amy!


Extra Pepperoni Is Now SSL Protected

I’ve been thinking about using SSL to protect logins to this blog for a while, but thought it would be too complicated. This weekend, I took the time, and thanks to Haris’ Admin-SSL plug-in, it was very easy. First I used cert.command to create a certificate for www.extrapepperoni.com, then I configured my DreamHost account to provide SSL (https://www.extrapepperoni.com/) in addition to the existing http scheme; this took a while to go through. Then I installed Admin-SSL, and after a few loading errors, all authentication and authenticated access is now SSL only, while reading anonymously is non-SSL.

Note that I’m using a certificate signed by my private certificate authority, ca.reppep.com, so you’ll get a warning from your browser that it’s not trusted; this is normal. You can continue past the warning and get full 128-bit SSL encryption; you just don’t have the assurance of a public CA that I am who I say I am.
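For scripted clients, the alternative to clicking past the warning is to trust the private CA explicitly. This is a minimal sketch using Python's standard `ssl` module; the function name `make_context` and any CA file path passed to it are hypothetical, and the CA certificate would have to be obtained separately.

```python
import ssl
from typing import Optional


def make_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a client-side TLS context that still verifies certificates.

    ca_file is a hypothetical path to the private CA's certificate in PEM form.
    """
    ctx = ssl.create_default_context()  # verifies certificates and hostnames
    if ca_file is not None:
        # Trust the private CA in addition to the system's public CAs.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

The point of this design is that verification stays on: the client gains one extra trusted signer rather than ignoring certificate errors altogether.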

Thanks to Rich & Sam for encouraging me to do this.


HP c-Class c7000 Chassis & Onboard Administrator Notes

The Onboard Administrators (we got a pair for redundancy) each ship with a unique password. When you connect them, it appears the active OA resets the standby password to match the active. This was a bit confusing, as OA #2 came up active, and the passwords were not as expected; SSL certificates are created and reloaded in terms of “Active” & “Standby”, so I initially loaded new certs onto the wrong OAs.

ssh Implementation Flawed

The OAs support ssh access and ssh keys, but apparently only for the single Administrator account. This is documented incorrectly — the docs say the last word on the key line is the username the key is for, but actually they’re all linked to Administrator. HP Support doesn’t know much about it. It’s bad when security features don’t work as documented — in this case, it would be easy to follow instructions and upload a key for an unprivileged Operator or User account, unintentionally granting full Administrator access — we had this for a while, until I figured out what was really going on.
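Given that behavior, it may be worth auditing a key file before uploading it. The sketch below assumes the ordinary OpenSSH public key layout (key type, base64 data, trailing comment) and lists any trailing names other than Administrator, since per the above those keys would get Administrator access anyway. The function names are hypothetical, and nothing here talks to the OA.

```python
def key_comment(line: str) -> str:
    """Return the trailing comment of a public key line, or '' if there is none."""
    fields = line.split()
    # Assumed layout: key type, base64 key data, then an optional comment.
    return " ".join(fields[2:]) if len(fields) >= 3 else ""


def keys_naming_other_accounts(lines, admin="Administrator"):
    """List trailing names that differ from the admin account name."""
    names = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        comment = key_comment(line)
        if comment and comment != admin:
            names.append(comment)
    return names
```

Any name this returns marks a key whose owner probably expected lesser privileges than they would actually receive.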

The web interface doesn’t allow copy & paste of keys; they must be downloaded by the OA from a web server. Afterwards, though, the public keys (which had to be accessible through a web server, remember) are not visible to other authorized users of the OAs: only Administrator can see or modify keys. Feh.

Additionally, the web interface shows line breaks as ‘^’, so the keys look corrupt. Despite this they work, and display correctly in the command-line interface.

OA doesn’t automatically configure its accounts onto blade iLO. Instead, it creates an account for OA itself on each blade’s iLO. This is a bit odd, as it means authorized users cannot connect directly to iLO — instead they must connect through an OA, and have the OA login, before using iLO. We will presumably use the Compaq iLO configuration language to deploy our accounts to iLO, but this shouldn’t be necessary.

Good News

On the bright side, the chassis is easier to mount than our (smaller) IBM BladeCenter chassis; it’s also better labeled. The Onboard Administrator interface is better laid out, although it doesn’t work in Safari (seems fine in Firefox/Mac). The command line is a bit less bizarre than IBM’s.

HP makes it easy to dump the configuration to a text file, tweak it, and load it into another chassis, although we haven’t tested yet; they call this “Configuration Scripts”.


Goodbye RU, Hello GS

I have accepted a position at Goldman Sachs in Jersey City. Leaving Rockefeller after 7 years as a UNIX admin (and an earlier 3 doing Mac support) was a tough decision. I learned a lot, and worked with a bunch of great people, but it is definitely time for a change. I expect to start February 25th and immediately enter firehose mode, as Goldman is so different than the other places I have worked. I’ll still be a UNIX administrator, but the specifics of the role will of course be totally different. Among other things, I have to start thinking of “security” as something people exchange, rather than the never-ending attempt to fend off bad folks.


Wiring Art

The Pretties

Inspired by When data center cabling becomes art from Andrew T Laurence & Chuck Goolsbee’s pics of Digital Forest, I took some photos of Rockefeller’s new data center. We’ve been planning out various scenarios for 5 years at this point, but we finally moved most of our systems in this month. Note that the network guys (mostly Eric) took care to run cables connecting to ports on the left half of each device in from the left, and come in from the right for ports on the right. This makes more work for them in preparation, since one cannot simply plug a cable into a free port, but makes things look prettier, and also reduces cable snarling.

3 KVMs & baby + LCD

More Connectivity, Please

Since we first started discussing data center plans, I’ve been saying we need more connectivity. The new DC has 48 patches per 42U rack, and some of the new racks are indeed running out of ports before they run out of vertical space. In our racks 2U is used for patch panels and 2 cables control APC managed power strips, so we have 40U and 46 patch ports for servers. Our Linux servers have Ethernet, serial console, & KVM; Suns have Ethernet & console; Windows have Ethernet & KVM. In the worst case, 40 1U Linux servers need 120 connections, but we only have 46 available. If the rack is full of 2U Suns & Windows servers, we’re okay with 6 ‘extra’, available for dual-connected servers or whatever. As we get more dense, we begin to run out of ports.

Cat6 flowing down
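The port budget above is simple enough to sketch in a few lines. The per-server connection counts and the 48/2 figures come from this post; the function name and the 10/10 split of Suns and Windows servers in the example are assumptions for illustration.

```python
PATCH_PORTS = 48   # patches per 42U rack
APC_PORTS = 2      # cables for the APC managed power strips

# Connections each server type needs, as listed in the post.
PORTS_PER_SERVER = {
    "linux": 3,    # Ethernet, serial console, KVM
    "sun": 2,      # Ethernet, console
    "windows": 2,  # Ethernet, KVM
}


def spare_ports(servers: dict) -> int:
    """Ports left after cabling the given server counts; negative means a shortfall."""
    needed = sum(PORTS_PER_SERVER[kind] * count for kind, count in servers.items())
    return (PATCH_PORTS - APC_PORTS) - needed


if __name__ == "__main__":
    print(spare_ports({"linux": 40}))               # 40 1U Linux servers
    print(spare_ports({"sun": 10, "windows": 10}))  # 20 2U servers
```

A negative result is the number of connections that simply have nowhere to land, which is the situation the denser racks are heading toward.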

Blades

Blades are no better: their chassis tend to blow out the power budget because they’re even more dense than 1Us (although they do get more servers per rack), and with all the redundancy they still require a lot of cabling. For a reasonable IBM BladeCenter, we need 4 x 2 for GE switches (FC cables don’t go in these patch panels), then 2 x 2 (Ethernet & KVM) for management modules per chassis = 12 ports for 7U. For our new HP c7000 chassis with basic networking, we have 16 GE ports, 2 GE console ports, 2 OA Ethernet ports, and 2 OA serial ports (again, ignoring the fiber-optic GE ports): 22 ports in 10U. I’m sure somewhere HP has a demo chassis filled with fully connected GE switch modules: (9 x 8 + 4 = 76 patches) & (4 x 8 = 32 fiber-optic ports) = 108 cables total (not counting power connections, 6 in our case). In 10U, 1/4 of a rack. Insane!

c7000: 30 ports
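To put the blade numbers on the same footing as the rack budget (46 patch ports for 40U of servers), here is a small sketch that works out patch ports per rack unit. The port counts are the ones given in this post; the variable and function names are hypothetical.

```python
from fractions import Fraction

# Port counts as given in the post.
bladecenter_ports = 4 * 2 + 2 * 2   # GE switch ports + management module ports
c7000_ports = 16 + 2 + 2 + 2        # GE, GE console, OA Ethernet, OA serial


def ports_per_u(ports: int, rack_units: int) -> Fraction:
    """Patch ports consumed (or available) per rack unit, as an exact fraction."""
    return Fraction(ports, rack_units)


if __name__ == "__main__":
    available = ports_per_u(46, 40)  # what a rack offers per U of server space
    for name, ports, units in [("BladeCenter", bladecenter_ports, 7),
                               ("c7000", c7000_ports, 10)]:
        used = ports_per_u(ports, units)
        print(name, float(used), "ports/U; exceeds rack supply:", used > available)
```

Both chassis draw more ports per U than the rack supplies per U, so a rack filled with either would run out of patches before it ran out of space.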

Update 2008/2/5: Eric pointed out I was wrong about the ports: the Cisco switches have 8 uplink ports, 4 of which are either fiber-optic or copper (you can see they’re 17-20 in the photo); the other 4 copper ports seem intended for cross-linking to the other switch. So the max copper patch count remains, but the fiber connections would be instead of, rather than in addition to, the copper ones, and we may fully connect our 2 switches with only 8 GE uplinks rather than 16 going out of the chassis.
