Worst Computer Disaster You Have Caused or Experienced

All I can say is, if you want huge pressure, take a programming job on a system that someone's life directly depends on.


I used to be a programmer at the Kennedy Space Center.

You'd be surprised how many of us working there ended up having the same bad dream on occasion -- that the shuttle blew up, and it was somehow OUR FAULT.
 
Accidentally used crle on a Solaris box and wiped out all of the runtime libraries. We couldn't use the workstation at all until I figured out what I'd done.
 
Always love threads like this.

Okay, worst screwup that was directly my fault.
Second day on the job as a computer science major co-oping as a PC tech. My "boss" asks me to swap out a power supply on one of their homebrew PCs. No problem; I've done this kind of thing before.

Minor problem; I didn't unplug the power supply before I started disconnecting the wires from the power switch on the case. Hot wire arcs over and welds itself to the case and I take down a breaker circuit.

Bigger problem; the breaker circuit includes the office of the company's President/Owner/Founder, and no one can find the breaker box for about fifteen minutes because they'd never had to access it before.



Worst screwup that wasn't my fault, but I had to clean up.
My previous employer implemented the Baan ERP system back in '99. We'd been online with it for less than a year and were setting up to copy the production environment over to the test environment (on Baan this is on the same box unless you want to pay mondo dollars for a second license).

I get the backups all done and my departmental director wants to do the restore so he is familiar with the process (it was all internal to Baan). Now, my director is a smart, relatively computer-savvy guy, but he did make the honest mistake of not redirecting the restore to a different environment than the one it was backed up from.

Thirty minutes into the process all hell breaks loose and I realize what has happened. That, coupled with problems uncovered in the backup/restore software for the server itself, netted me a ~100 hour shift w/ about 4 hours of sleep over the weekend (yeah, it happened on Friday morning).

The pisser was, after I saved the day and had them up and running by noon Monday (it was an M-F 8-5 operation), the President of the company couldn't even be bothered to say "Thanks!" Yeah, I'm not there anymore.
 
I've been fortunate to never really cause a major meltdown, but I was witness to a decent snafu back in my college days.

Our school had only two servers at the time, and among the various admins, only one guy was common to both machines. Email addresses were server specific, so there was no common, main address for this admin. His bright idea was to set each of his server accounts to forward mail to the other, so that no matter where he was working, he'd get notice if someone on the other server had a problem. At the time, our system popped up a notification line whenever you received email, and you had to clear the notification before proceeding with anything else.

So, Steve, our admin in question, sets up his forwarding, and sends a test message to his other account. Notification of an email to his present login popped up almost immediately. He clears it, and is frozen by another notice. Then another. After about five, he realizes exactly what he's done, then kills the computer, since he can't get enough time in between notices to log out. Before he can make it to either server room, both machines die, their drives filled by copies of his test message bouncing back and forth. It took him about a day to fix, mostly because he couldn't use either of his logins because of the notices, and because he had to clean both machines, located across campus from each other, at exactly the same time.
 
We have a server with (theoretically) hot-swappable, parity-striped drives in a RAID. One morning, when one of the drives in a set had failed, the boss went to the server room to remove and replace it. Can we guess whether it all went perfectly? Or whether he pulled out the wrong drive, thus destroying the parity and necessitating a rebuild of the set, while users continued to attempt to work on the live data? That made for a fun weekend.

Cheers,
Rat.
 
