Thursday, March 31, 2005

undead Apache processes

Beware of surviving Apache processes that you thought you had stopped.

I just had a miserable time making some changes to httpd.conf on an Apache (1.3)/Windows installation. After making each change, I'd stop/start Apache and test the page with a browser to see if I'd mucked httpd.conf up fatally. The trouble is that I got a lot of false readings with this test, which confused me horribly. (Make change X; page works. Make change Y; page doesn't work. Undo change Y; page doesn't work. Undo change X; page doesn't work. Scream.)

It turns out that, although I'd executed 'apache stop' properly (I think), there were quite a few apache.exe processes lurking alive in the Task Manager's Processes tab. Apparently they were the source of the trouble, because killing them manually during my stop/starts eliminated my false test results.
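A quick check for leftover processes could be scripted instead of eyeballing the Task Manager. The following is only a minimal sketch, assuming a Windows version that ships the `tasklist` command and that the process image is named `apache.exe`. The function names are made up for illustration.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(output, image_name="apache.exe"):
    """Return the PIDs of rows in `tasklist /FO CSV /NH` output matching image_name."""
    pids = []
    for row in csv.reader(io.StringIO(output)):
        # Expected columns: image name, PID, session name, session number, memory usage.
        # The "INFO: No tasks are running..." message has a single column and is skipped.
        if len(row) >= 2 and row[0].lower() == image_name.lower() and row[1].isdigit():
            pids.append(int(row[1]))
    return pids


def find_lingering(image_name="apache.exe"):
    """Ask Windows for running processes with the given image name (hypothetical helper)."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH", "/FI", f"IMAGENAME eq {image_name}"],
        capture_output=True, text=True, check=True,
    )
    return parse_tasklist_csv(result.stdout, image_name)
```

Calling `find_lingering()` right after a stop should return an empty list. Any PIDs it does return are the survivors, which could then be ended by hand or with `taskkill /F /PID <pid>`.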

Wednesday, March 30, 2005

Oracle 10g upgrade woes

I really ought to initiate a blog with a non-grumpy post! Oh, well. You should have been here a month ago when everything was sweetness and light.

I'm the early-adopter type. I upgraded to 9i pretty early on, and it went well. A couple weeks ago, I upgraded to Oracle 10g. This time, it hasn't gone very well.

At first, performance was absolutely miserable - the CPU was 100% busy and stayed there. But that was my fault. Shame on me, I had accepted the init.ora parameters written by DBCA. Fixing some of them (cursor_space_for_time, sga_max_size, sga_target, fast_start_mttr_target, cursor_sharing, session_cached_cursors) corrected that. I know that's my responsibility, but I'm really surprised that neglecting it caused such an utter CPU jam.
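To review what DBCA wrote before trusting it, one option is to pull just those parameters out of the init.ora text. This is a rough sketch, assuming a plain-text parameter file with `name=value` lines and `#` comments. The function name is invented, and the sketch deliberately suggests no values, since the right settings depend on the system.

```python
PARAMS_OF_INTEREST = (
    "cursor_space_for_time", "sga_max_size", "sga_target",
    "fast_start_mttr_target", "cursor_sharing", "session_cached_cursors",
)


def read_params(text, names=PARAMS_OF_INTEREST):
    """Return {name: value} for the listed parameters found in init.ora text."""
    found = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop an instance prefix such as "*." that some parameter files carry.
        key = key.strip().lower().split(".")[-1]
        if key in names:
            found[key] = value.strip()
    return found
```

Parameters missing from the returned dictionary are not set in the file at all, so they are running at their defaults.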

Then, last night, the database started dumping lots of "ORA-04021: timeout occurred while waiting to lock object" errors into the alert log and refusing to do anything whatsoever. The associated trace files mean nothing: "No current SQL statement being executed." I shouldn't have to be looking at binary stack dumps, dangit. At the time, I was trying to do a full export with EXP; I don't know whether that caused the problem. Supposedly, you can still use EXP with 10g, but I wonder if Oracle has stopped bug-checking it vigorously. Tonight, I'll switch to EXPDP.
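When the alert log fills up like this, a tally of the error codes can show whether ORA-04021 is the whole story or just the loudest part. A small sketch, assuming the alert log has been read into a string. The function name is hypothetical, and the log's location and file name depend on the installation.

```python
import re
from collections import Counter

# Oracle error codes look like ORA- followed by five digits.
ORA_CODE = re.compile(r"\bORA-(\d{5})\b")


def count_ora_errors(log_text):
    """Count each ORA-nnnnn code appearing in alert log text."""
    return Counter(f"ORA-{code}" for code in ORA_CODE.findall(log_text))
```

Calling `count_ora_errors(text).most_common(5)` would list the five most frequent codes with their counts.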

There were also many unhandled PL/SQL exceptions in packages like WKSYS.WK_JOB - packages that Oracle supplied and whose purpose I don't even know. Oracle's ever-increasing sophistication means that it's always doing more and more stuff behind my back; with 10g, that stuff sometimes clogs the performance, and sometimes it contains errors. I am not impressed. PostgreSQL looks more appealing all the time.

The problem might be that I'm one patch behind - I'm at 10.1.0.2, not 10.1.0.3. The patch procedure is pretty involved, though - a lot more involved than it ought to be, for a top-dollar product, IMHO. It will mean a couple hours of downtime, which means it should be done in the middle of the night or on a weekend. I wouldn't mind that, but as a contractor, I'm not allowed to be in the building at those times without dragging my supervisor in to babysit me.