Wednesday, November 11, 2009

A Waiting Game

Looking for a job is a game of 'wait':
  • You send in a resume; then you wait.
  • They write you back asking about availability for a 'phone screen'; then you wait.
  • They settle on a time early next week; then you wait.
  • The H/R 'Screener' calls, you talk for a bit, the Screener sets up a phone interview (two weeks out); then you wait.
  • The phone interview happens; you talk to a Senior Engineer for a bit; then you wait.
  • And Wait
I passed the Phone Screen three weeks ago; I have the Phone Interview this afternoon, just after lunch.

I treat an interview like I would any business meeting -- I prepare. I re-read all of the information I have gathered -- the original job description, my notes from the previous call, my notes on news items about the company, any new items that appeared in today's headlines. Then I wait.

Sigh, I don't wait well. I am on my third cup of coffee.

Music: Carl Orff -- Carmina Burana

Tuesday, November 3, 2009

An Exercise in Network Analysis or The Case of the Hanging Boot.

A couple of months back, one of my house-mates reported an oddity when she started up her machine. Sometimes the desk-top would just hang in the middle -- the background and folders would appear, but the task-bar didn't. The Task Manager window showed nothing running (System Idle Task at 99+%); and, if she waited 'long enough', the task bar eventually appeared. ('Long Enough' was about twenty minutes.) It didn't happen all the time, just 'two or three times a week'. When she encountered the hang and didn't want to wait, a restart (selected from the Task Manager) would run Just Fine, Thank You: the task bar came up properly, and she was off into the Internet Hinterlands.

To add to the confusion, the phenomenon was day and time sensitive. If she booted the machine after 2000, it never hung; if she booted before 1700, ditto; and it never hung on weekends, regardless of what time we booted it.

She had been living with this for a couple of weeks before she mentioned it to me. We talked a bit and then verified that it only happened on her machine -- if I rebooted my machine at the same time, then when hers hangs, mine doesn't. Both machines are IBM/Lenovo T-61 ThinkPads; both are around two years old; both are running Windows XP SP3, etc.

The facts that a) waiting 'long enough' cleared up the hang, and b) it appeared to be time related made me suspect some kind of Network Dependency.

Now, when I was a Fully Employed Unix Admin, I'd just fire up tcpdump() or ethereal() on a machine on the same LAN segment and watch the traffic for a while. The Home LAN is mostly Windows machines, so this is going to be a Learning Experience.

The first thing I learned is that Ethereal is now called 'Wireshark' ("due to trademark issues"). I downloaded Wireshark from the official web-site (www.wireshark.org), installed it on my machine, and started the GUI, to get a feel for what the Usual State of the Network looked like.

The Home LAN consists of six laptops, one tower, one printer, two SAN devices, a gateway-router, and a couple of switches; oh, and up to three work-laptops that jack-into the Home LAN when one of our pagers goes off or someone is working from home. There is a fair amount of traffic going over the Wire behind my Firewall Router....

I spent a Sunday afternoon looking at the traffic and building a filter that isolated only the traffic that related to the IP address of Her machine. After a couple of hours, I came up with an incantation that I thought would work. At least it pulled all of the traffic to and from Her IP address and it excluded all of the traffic and protocols that I could already explain (SMB file and printer sharing, SNMP probes from the Linux machine with Nagios on it, ARP and other Router related protocols, etc.).
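I didn't write down the exact incantation, but the idea is easy to sketch. What follows is a minimal illustration, not the filter I actually used: a few lines of Python that glue together a Wireshark display filter of the form 'this host, minus the protocols I can already explain'. The address 192.168.0.50 and the protocol list are assumptions made up for the example.

```python
def build_display_filter(host_ip, explained_protocols):
    """Return a Wireshark display filter string: all traffic to or from
    host_ip, minus the protocols that are already accounted for."""
    host_clause = f"ip.addr == {host_ip}"
    if not explained_protocols:
        return host_clause
    excluded = " || ".join(explained_protocols)
    return f"{host_clause} && !({excluded})"


# Hypothetical values, for illustration only.
HER_IP = "192.168.0.50"
# ARP frames carry no ip.addr field, so the host clause already drops
# them; listing arp here just makes the intent explicit.
EXPLAINED = ["smb", "nbns", "snmp", "arp"]

print(build_display_filter(HER_IP, EXPLAINED))
# ip.addr == 192.168.0.50 && !(smb || nbns || snmp || arp)
```

The printed string can be pasted into the Wireshark filter bar. Building it from a list makes it easy to add another protocol each time one more kind of traffic turns out to be innocent.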

Then the following Monday I set up my traps and we booted her machine. I promptly learned that the IP address I had so laboriously isolated on Sunday was no longer in use. Nothing was coming out of my trap. Ah, DHCP! My gateway router is also my DHCP server and it assigned Her the next address in the range when asked....

I reviewed my DHCP config, found the static range I know I had left available (just in case), and assigned Her a permanent IP address. Now we try again tomorrow, since I have to re-boot for the new address to take effect and the second boot doesn't hang (of course).

Tuesday -- I've reset my traps to use the static IP address, and I actually did get Her machine configured to use it. I am primarily a Unix SA; I can manage Windows, with a certain amount of trepidation. I am always pleased and a little bit surprised when I get the Windows GUI Admin interface to do what I want it to do on the first try.

The machine didn't hang. It didn't hang for three days in a row. On the following Friday, however....

Watching the traffic outbound from Her machine, I spotted a couple of IP addresses that didn't look familiar. They weren't 192.168.0.xxx, so they weren't part of the Home LAN; they weren't my ISP's DNS servers; and they weren't Google or Her two favorite Cooking sites. They were an oddness; and they kept getting a 'connection retry' every fifteen or twenty seconds. AHA!
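I did this sorting by eyeball, but the same reasoning can be written down. Here is a rough sketch, assuming the connection attempts have been exported as (seconds, destination) pairs: throw away anything on the Home LAN or on the list of known-good addresses, then report the destinations that keep getting retried. The sample addresses come from the ranges reserved for documentation and are not the real ones; the function name and the threshold of three attempts are likewise my own inventions.

```python
import ipaddress
from collections import defaultdict

HOME_LAN = ipaddress.ip_network("192.168.0.0/24")


def suspicious_destinations(attempts, known_ips, min_attempts=3):
    """attempts: iterable of (seconds, dst_ip) connection attempts.
    Returns {dst_ip: [gaps in seconds between successive attempts]} for
    destinations that are off the Home LAN, not in known_ips, and were
    tried at least min_attempts times."""
    times = defaultdict(list)
    for seconds, dst in attempts:
        if ipaddress.ip_address(dst) in HOME_LAN or dst in known_ips:
            continue  # already explained, ignore it
        times[dst].append(seconds)

    result = {}
    for dst, stamps in times.items():
        if len(stamps) >= min_attempts:
            stamps.sort()
            result[dst] = [b - a for a, b in zip(stamps, stamps[1:])]
    return result


# Made-up capture: one LAN address, one known site, one repeat offender.
sample = [
    (0, "192.168.0.1"),
    (1, "203.0.113.7"),
    (5, "198.51.100.9"),
    (16, "203.0.113.7"),
    (36, "203.0.113.7"),
]
print(suspicious_destinations(sample, known_ips={"198.51.100.9"}))
# {'203.0.113.7': [15, 20]}
```

Gaps that hover around fifteen or twenty seconds, over and over, are the signature of something retrying a connection that never completes.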

I don't have dig() on a Windows box, but nslookup() works. The reverse look-up gives me the names of the two different servers -- one of them is in the Adobe name-space and the other is in HP's. Neither of them connects when asked politely. The HP server gives up after ten minutes; the Adobe after 14.75 minutes.
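For the record, a reverse look-up is nothing magic: the resolver turns the address around, hangs 'in-addr.arpa' on the end, and asks for the PTR record. Here is a small Python sketch of that, as an alternative to nslookup() and not what I ran at the time. The address is again a documentation placeholder, and the two function names are my own.

```python
import ipaddress
import socket


def ptr_name(ip):
    """Return the in-addr.arpa name that a reverse look-up queries."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip):
    """Ask the system resolver for the host name behind ip.
    Returns None when no name can be found."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        return None


# Placeholder address; the real ones are not recorded here.
print(ptr_name("203.0.113.7"))
# 7.113.0.203.in-addr.arpa
```

Calling reverse_lookup() on an address from the capture does the actual query. It needs a working resolver, and what comes back depends entirely on whoever owns the address.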

OK, I have my problem-children identified, now to figure out what's really going on and develop a cure.

The HP one is easy; at one time there was an HP printer directly connected to Her laptop. Now we have a brand new HP printer connected through a (built-in) Jet-Direct card to the LAN. But maybe there is still a start-up task on Her machine that wanders out to HP-land and looks for software upgrades.

I don't see the task in Her . Sigh. But, the msconfig() command does show a 'hpupdate.exe', started automatically at Boot. I remove it. One down; can we make it two?

Moving to the top of the msconfig() start-up listing, I see an Adobe started task entry. I stop it; and then we talk a bit. It seems that one of the Favorite Cooking Sites publishes in PDF. So, we change the Adobe Reader configuration to 'prompt before starting' and 'check every other week'. Then we reboot; I check the msconfig() start-up list again and verify that the two (potential) problem children are indeed gone.

Now we have to wait until the next Monday to test out the hypothesis. On Monday, Her computer came up on the first try. It also came up clean on Tuesday, and on Wednesday, and on Thursday, .... "Once is Accident. Twice is Happenstance. Three times and it's a Universal Law".

That was three weeks ago; Her machine hasn't hung once. I think we can conclude that we have sorted the problem. I know that this is a negative proof, but She is willing to live with it. If the Client is happy, I'm happy.

Now to explain why the restart made things OK. I have a hypothesis, but. The hypothesis is that the two programs each leave bread-crumbs that say 'I looked for Updates at xx:xx:xx O'Clock, don't bother for another 24 hours'. Without access to the code, though, I can't really test it, except to say that I have some twenty observations that this is the Way It Is. I suppose I could do some statistical test (Chi-Square, if I remember correctly) and see how probable it is that this result appears by chance; but right now that means finding a book in the boxes in the back of the garage, and that's a bit more work than I am up for right now.
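Short of digging out the book, a back-of-the-envelope version is possible. This is not the Chi-Square test, just a plain product of probabilities, and it rests on two assumptions of mine: that a restart would hang about half the time if it were no different from a first boot ('two or three times a week' out of five weekday boots), and that each boot is independent of the others.

```python
def chance_of_clean_run(hang_probability, boots):
    """Probability that `boots` independent boots all come up clean
    when each one hangs with probability `hang_probability`."""
    return (1.0 - hang_probability) ** boots


# Assumptions, not measurements: a 50% hang rate and twenty clean restarts.
ASSUMED_HANG_PROBABILITY = 0.5
OBSERVED_CLEAN_BOOTS = 20

p = chance_of_clean_run(ASSUMED_HANG_PROBABILITY, OBSERVED_CLEAN_BOOTS)
print(f"{p:.2e}")
# 9.54e-07
```

Under those assumptions, twenty clean restarts in a row by pure luck is about a one-in-a-million event, which is at least consistent with the bread-crumb hypothesis. It says nothing about how the programs actually decide when to check.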

Music: Jack the Lad -- The Old Straight Track

Friday, October 16, 2009

First Light, and Childe Rolande

Poking around this afternoon, I found out that one of my favorite bands, Childe Rolande (yes, there are 'e's on the end of both words) has not one but TWO CDs out that aren't in my collection. A quick dash off to PayPal(UK) remedied that. Now I go haunt the post-office next week....

I am starting this blog as an experiment of sorts. I have always kept a diary containing events at work. (I learned that from a Professional Engineer back in '70.) I started out on paper in 'composition books' and moved on to first floppy-disk, then crunchy-disk, flash drives, and now a 5Gig Seagate USB drive (one of the flying-saucer drives). The Seagate does double duty as my off-board music library and sneaker-net. I am going to see how a non-business, public log works.


Music: Childe Rolande -- Foreign Lands