As they tell you, in theory, there is a very simple procedure for tracking down and fixing bugs:
– Get a description of the conditions that cause the bug.
– Reproduce the problem on your own test systems.
– Use debugging tools to find the cause in the code.
– Fix the bug and prepare a new package with the fix.
– Give the fixed package to the customer. Everyone smiles!
They LIE.
In my recent experience, it’s more like:
– Get a vague report from the support team that suddenly the customer’s ‘having problems,’ accompanied by logs from an unrelated part of the system.
– Request more detailed information from the customer, except it has to go through the sales office since they’re the primary customer contact, so it takes even longer for the request to be turned around.
– Discover you can’t even recreate the problem yourself because it depends on having massive, expensive hardware that costs more than your company’s total net worth.
– Get the information and find that it doesn’t even make any sense. Functions that are some other company’s code entirely are returning impossible values.
– The customer reveals that this started happening after they fiddled around with the hardware, installing whole new versions and models you’d never even heard of before. But it’s still your fault because, well, the software came from you, right?
– Attend conference calls where you, the sales branch, the vendor of the hardware, and the customer each try to accuse the other guys of screwing something up.
– Discover that, despite documentation for the hardware that says “On platform X don’t run in mode Y,” the customer has been running his platform X machines in mode Y for years now.
– Inadvertently open your big fat mouth and wonder aloud about a possible workaround and have all three other groups instantly jump on it and say “Oooo, get right on it!”
– Spend a day coding the workaround blind, since you don’t have the hardware, can’t reproduce the problem, and so can’t test it on your own machines.
– Send the workaround through the usual slow channels.
– Immediately after you send it, realize that you probably should have added another output message to clarify if the workaround worked or not, but then decide it’s no big deal, it shouldn’t be that hard to tell.
– A week later they finally get around to actually trying it, and then complain that it’s not clear if the workaround really worked or not.
– Verify that no, it did not work, so you’re back to square one.
– More conference calls, wherein the hardware vendor has discovered a perfect solution to the problem and the customer won’t implement it because it requires redoing their whole disk network and would screw up some other program they use.
– Write test programs to try and produce the same error outside of our own software package so we can prove it’s not our problem.
– Send a manager to the customer site to talk to them personally, just to try and calm things down.
How does it end? Sometimes it doesn’t…
Heh. I just read that aloud to Matt, and he was cracking up. I guess it’s all too familiar amongst programmers. :-)
I can relate on the part about slow channels. We screwed some lady out of a deal we offered her. She’d done everything we told her to do, our support team dropped the ball, then I’m told there’s “nothing we can do for her.” It took me constantly getting on someone about it for them to finally be like “Oh. Well. Okay. But just this once.” Poor Cam. It sounds like you need a vacation! Come to Chi-town! :-D
Get a can of RAID, open the whining customer’s case, and empty it completely everywhere inside — take care to spew a liberal amount in the power supply, and any open slots on the MLB.
Bugs dead now. :-)