I was at a customer recently who had not long been live on their brand-new hardware, running their homegrown applications and some COTS stuff. Their expectations were high: they'd done a Proof of Concept that indicated some significant performance improvements might be possible, but they weren't getting what they expected. So I'd gone out to help...
Firstly, it transpired that they hadn't tested the system at all before going live. They'd already been live on the previous hardware, which was similar but less powerful, so why bother?? I asked why they'd removed the bulk of the testing from their "project plan" (a collection of notional activities and optimistic dates) - apparently management demanded that they go live on the declared date, even though the hardware had arrived much later than expected. I related this later to a colleague, and his response was "What would the management have done if the hardware had arrived after the demanded live date? Would they still have gone live?!" A good question, and one I will remember for later use.
Not only had they not tested the production hardware and software, they had gone live with no means of backup in place! Astounding. This was allegedly a critical system, making money for them every minute of the day by processing customer sales and service. It had taken them quite a long time to cut over the data in the first place - how long would it take to recover if anything went wrong? And it could...
Of course, if you're running a new live system, it would be good to have some quality monitoring and management in place, particularly given the complex nature of the system and their declared emphasis on performance and user response time. They had no proper means of monitoring the system, other than a couple of hand-cranked scripts which reported on just one aspect of it. Of course, when I pointed this out, they immediately installed the standard management and monitoring software - why not before they went live? Well, we know, don't we...
Finally, they expected me to test and make changes on the production system. Now, the workload relied on massively parallel processing, which is designed to use all the available resources in the service of an individual query - great on a system with a few well-managed users, but on a system which was basically online for a couple of thousand users, with ad hoc query submission and user-created SQL... I don't think so. It's too easy to make a minor modification and have it bring everything to a virtual stop! Of course, they had a test system... which they hadn't set up yet, and certainly hadn't calibrated against production in order to understand the performance relationship between the full-size production system and the smaller test system. Bah!! Madness!!
Is this client typical? I hope not. But increasingly, the approach of IT management seems to be "I don't care about your problems, my problem is we said it would go live, so it will...". Grim. Mad. And infinitely dense!
Tuesday, 27 July 2010
