Josh, when troubleshooting a desktop or server, the process usually starts the same way. For future reference: What were you doing when it broke/crashed/stalled? What was the last change made to the machine? What else is on the machine, and was integration testing done (do all the apps play nice together)? Do you have a backup you can roll back to and test? Can you recreate the problem? Can WE recreate the problem? Is the software or hardware being used as intended? Is it supposed to do what it is doing, other than crashing of course? What are the minimum hardware requirements of the software, and are those realistic?
There are other things that get asked from there as you work the problem, but those are the basics. Logs help. A server should also have a changelog to refer to, and changes should go through a change control meeting and approval process if your organization is more than a handful of IT people. Discuss how a change may affect users, plan for downtime for big changes and migrations, etc. So if you have a notebook for each server, then great! If not, why not?
So maybe the server or the app was not tested under load. There are programs and scripts that will simulate loads so you can test this on a box in your offline test lab. You have a lab, don't you?
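
To give a rough idea of what such a script can look like, here is a minimal sketch in Python. It is not any particular load-testing tool, just an illustration. The names simulate_load, one_user and fake_request are made up for this example, and fake_request only sleeps for a moment; you would swap in a real call to whatever app is running on the test box.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request():
    """Hypothetical stand-in for one user action against the test box."""
    time.sleep(0.01)  # replace with a real call to the app under test


def one_user(task, requests_per_user):
    """Run the task repeatedly as one simulated user; return (latencies, errors)."""
    latencies, errors = [], 0
    for _ in range(requests_per_user):
        start = time.perf_counter()
        try:
            task()
        except Exception:
            errors += 1  # count failures instead of stopping the run
        else:
            latencies.append(time.perf_counter() - start)
    return latencies, errors


def simulate_load(task, users, requests_per_user):
    """Run `users` simulated users concurrently and summarize the results."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(one_user, task, requests_per_user) for _ in range(users)]
        results = [f.result() for f in futures]
    latencies = [t for lats, _ in results for t in lats]
    errors = sum(errs for _, errs in results)
    return {
        "users": users,
        "completed": len(latencies),
        "errors": errors,
        "mean_s": sum(latencies) / len(latencies) if latencies else None,
        "max_s": max(latencies) if latencies else None,
    }


if __name__ == "__main__":
    # Step the user count up and watch where latency or errors start to climb.
    for users in (1, 5, 25):
        print(simulate_load(fake_request, users, requests_per_user=10))
```

The idea is to raise the number of simulated users in steps and note the point where the mean or max time jumps or errors start showing up. That gives you a ballpark for how many users the box can take before you put real people on it.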
Brian Kelsay
"Charles, Joshua Micah " <> 02/17/05 09:13AM >>>
Thanks to everyone that replied, both on and off the list. I think we've decided that the server just had too many users.