Catherine: pyOraGeek: 2006

Thursday, December 07, 2006

Swivel

Darn. I didn't think of Swivel. Then again, you didn't think of it, either.

Swivel is described as "YouTube for data". Not just for uploading and viewing data... but also for madly cross-tabulating. Wow!

As a database professional, I'm kind of embarrassed that two physicists came up with it. But I'm the spouse of a physicist... so I'm kind of proud, too.

Anyway, I was interested in seeing what sort of data was in there on gender issues... so I searched Swivel on "gender" and got... no hits. Wow. I decided to look for some data worth uploading - say, that frightening "Balancing the Equation" study showing a steady drop of women working in information technology. Well, that study appears to be in send-us-cash-and-we'll-mail-it-to-you form; in fact, I don't see a lot of raw, upload-worthy data out on the web at all. Hmm. Any ideas?

There's got to be some mailing list of Swivel contributors out there somewhere; I guess I should find it and ask them where they find their raw data.

I'm dying to know what RDBMS they run on...

Tuesday, December 05, 2006

PyCon 2007

PyCon is coming! February 23-25, 2007 in Dallas!

I had the unexpected honor this year of serving on the Program Committee - the people who read the submitted abstracts and argue about which talks to include. The decisions were not easy! We had more excellent proposals than we could fit into the schedule... I wonder if we'll discuss scheduling four days for PyCon 2008. I think we've outgrown three days! We do have plenty of time set aside for Open Space talks and Lightning Talks, however, and I hope the presenters who didn't get into the regular schedule will show their stuff there.

Just reading the proposals was a great educational experience - I picked up news and ideas that have helped me quite a bit already. The actual conference is going to be incredible - though it will probably include plenty of those "Noooo! I want to be in all three seminars at once!" moments.

The hotel's conference rate is actually a good deal for the quality of the hotel - I know that's not always the case at conferences - but the number of rooms available at that rate is pretty limited, so consider getting your reservation right away. With so much good stuff going on from morning till night, it's really nice to "commute" via the elevator.

Monday, December 04, 2006

I'm all grown up!

I'm finally a serious open-source participant!

... by which I mean...

I was using a software package (SQLAlchemy),
found a place where it didn't meet my needs,
submitted an enhancement, together with test case,
and had the enhancement taken up into the project trunk.

I've got a second (less trivial) enhancement submitted but not yet incorporated, too. If you're using SQLAlchemy and you need to use reflection on tables that you're reaching through a database link, you'll need to do it via synonyms and grab code from this ticket. I hope it'll get into the trunk eventually, but it's a low-priority specialty need. (Unless, of course, you need it, in which case there's nothing low-priority about it!)

I guess I've been through the process before with SQLPython, via private email with its creator... it just felt so formal this time, with a Bug Tracker and Ticket Numbers and everything.

Anyway, the whole process takes a little puzzling out... but wow, it's FUN!

Tuesday, November 14, 2006

An agile January to you

There's going to be a fantastic week of agile programming activities in this area this coming January!

Begin with The TurboGears Jam in Ann Arbor, MI, Jan. 14 - 16.

On the 17th, drive two hours southeast to Sandusky, OH.

Then, Jan. 18 - 19, attend CodeMash at Kalahari Resort.

I don't know if they planned these events to dovetail so well, but Bruce Eckel is deeply involved in both, so perhaps they did.

Girls in IT

This Friday, there's a conference at Sinclair Community College here in Dayton (and other sites in Ohio) called We Are IT Day, designed to encourage high-school girls' participation in information technology.

We Are IT Day website
They're still accepting "lunch buddies" - technology professional women who can chat with a small group of girls over lunch.

And, if you know a girl here in Dayton who should be there, try to make sure she can go!

(Sorry I didn't publicize this earlier - I hadn't found a website for it.)

Tuesday, November 07, 2006

IronPython and Oracle

In my post on "Oracle-free Oracle access", I speculated that using IronPython with Oracle's .NET tools might be an effective way to access the database, but I'd never actually tried it.

Since then, I've been contacted by two people who've not only done it successfully, they've written up nice descriptions with code.

I recommend both blogs - they're in my feed reader.

If you can't read Bernd's post, here's my translation of the text that precedes his script:

Here follows the code for a simple IronPython program which enables the interactive input of SQL statements and output from an Oracle database. I use the ODP, which apparently is included with Oracle 10g XE, because I have not installed it separately. In addition, I use the well-known HR demo schema.

After startup, an example session might look like this:

Monday, November 06, 2006

Ubuntu: Linux for Cats

Our kitty knows his Linux.

Sunday, November 05, 2006

TurboGears and Oracle

This weekend saw the fulfillment of a lifelong dream - I got TurboGears working against an Oracle database!

For general information, I recommend the ToDo list tutorial and Splee's post on SQLAlchemy/TG. But there are some particulars you'll need to know to work with Oracle... so here's a super-basic example to demonstrate.

After installing TurboGears, run at the command prompt:
```
tg-admin quickstart --sqlalchemy
```
In dev.cfg, replace
```
sqlalchemy.dburi="sqlite:///devdata.sqlite"
```
with
```
sqlalchemy.dburi="oracle://scott:tiger@orcl"
```
[EDIT April 25, 2007:] Unless you've specifically configured your Oracle database to support Unicode (and maybe even if you have - I'm still fuzzy on this part), you'll also need to set
```
sqlalchemy.convert_unicode=True
```
If you decide to leave it out, then start getting
```
SQLError: (NotSupportedError) Variable_TypeByValue(): unhandled data
type unicode
```
you'll know you needed this parameter set.

Add the following to model.py:

from turbogears.database import bind_meta_data
bind_meta_data()
from sqlalchemy.ext.assignmapper import assign_mapper
emp_table = Table("emp", metadata, autoload=True)
class Emp(object):
    pass
assign_mapper(session.context, Emp, emp_table)

To controllers.py, add


import model

then add to the Root class


    @expose(template="myProjectName.templates.emps")
    def emps(self):
        emps_list = model.Emp.select()
        return dict(emps=emps_list)

Copy templates/welcome.kid to templates/emps.kid, and replace the document body with


  <ul>
    <li py:for="emp in emps">
      ${emp.ename} : ${emp.job}
    </li>
  </ul>

From the command prompt, run
```
python start-myProjectName.py
```
Point a browser at http://localhost:8080/emps

There - you're listing data from SCOTT.EMP!

You can, of course, manually define the columns of your tables; using autoload is simply more convenient and error-proof. It'll only work against tables that have a primary key, though. If you don't use autoload, you don't need to call bind_meta_data in model.py.

TurboGears has recently added SQLAlchemy, as an alternative to SQLObject, for its database-access layer. I don't know much about their relative merits, but it seems like SQLAlchemy may be more friendly to a database-centered (as opposed to object-programming-centered) point of view. In any case, SQLAlchemy has Oracle support, whereas SQLObject's Oracle support still hasn't been integrated into the main codebase. Thus, I'm using the SQLAlchemy flavor of TurboGears.

Tuesday, October 31, 2006

Oracle Net unaccountability

I have a complaint. After seven years of Oracle experience, ORA-12154: TNS:could not resolve the connect identifier specified ought to be in my past.

I installed a standard Oracle 10.2 client on a fresh, new machine, only to find that the sqlplus.exe in {ORACLE_HOME}/bin was not looking in {ORACLE_HOME}/network/admin for its TNSNAMES.ORA. I don't know where it was looking, or why. I searched for any of those annoying stray TNSNAMES.ORA files, and there weren't any, but that doesn't mean that Oracle Net wasn't looking for TNSNAMES in all the wrong places. I eventually gave up and set the TNS_ADMIN environment variable, but that was an unsatisfying brute-force solution; I want to know why SQLPLUS wasn't looking in the standard place in its own home for TNSNAMES.ORA, but apparently I'll never know.

The problem is that Oracle Net gives you no feedback about what went wrong when something goes wrong. Did it find a TNSNAMES.ORA, but hit a syntax error in it? Did it find a TNSNAMES.ORA, but not the one you expected it to? Did it not find a TNSNAMES.ORA at all? Well, you'll just have to guess. Yes, you can trace Oracle Net; you need to insert directives like TRACE_LEVEL_CLIENT=user and TRACE_DIRECTORY_CLIENT into SQLNET.ORA. Ah, but which SQLNET.ORA? Well, that's the problem - if Oracle Net isn't looking where you expect it to for TNSNAMES.ORA, it won't see your SQLNET.ORA either.

It was a big improvement when TNSPING was upgraded to report on which SQLNET.ORA it was using. (And, indeed, in my case, TNSPING reported that it was using {ORACLE_HOME}/network/admin/sqlnet.ora, as you would expect, which is probably why TNSPING could resolve my service names just fine.) We really need a similar improvement in Oracle Net in general - some troubleshooting information that's transmitted in the error message every time TNS resolution fails.

Have I simply missed the memo on some good way to troubleshoot these problems? If you know of one, please let me know.

While researching it, I did find a pretty nice resource - ora-code.com. Their ORA-12154 page is a more concise and relevant checklist than anything I know of on MetaLink. Unfortunately, they lack a search box, so the best way to find their pages is simply to Google for "ora-code 12154".

There. Now if Oracle magazine ever asks me "what one improvement I'd like to see in Oracle", I have my answer ready. That and PL/Python, of course.

Sunday, October 22, 2006

resetPwd.py

I just posted a little script at http://sqlwrappy.sourceforge.net/resetPwd.py for password changing; a full description is at http://sqlwrappy.sourceforge.net/resetPwd.html. You may like to use it / borrow from it if

You need to provide for nontechnical users who must field "Can you reset my password?" requests (the designed purpose)
You want a relatively robust command-line way to collect Oracle account login information
You want to crib code for a command-line pick-from-this-list loop

I packaged it into the 0.1.2 release of sqlWrap.py, too.

Friday, October 20, 2006

The destructive power of stereotypes

I hope that this study receives all the attention it deserves.

"Women exposed to bogus scientific theories linking their gender to poor math skills did worse on arithmetic tests than others..."

This is why people who care get so upset about "harmless personal opinions" about women being inherently worse at acience/math. They're not harmless. They push women away from technical excellence. Yes, that bothers me, and no passive-aggressive whining about "political correctness" will make me accept it in demure silence.

It's especially remarkable that the study showed an easily measurable effect of just one claim of male superiority. Now imagine the cumulative effect of hearing such claims, again and again, over an entire lifetime... it's no wonder that the women who do end up in technology are the exceptionally flinty ones.

Sunday, October 08, 2006

meaning of LAMP

Among the great things I learned at Ohio LinuxFest came from Jeff Waugh's talk. I bet you've seen that LAMP acronym around, and wondered what it stood for.

Linux
Apache
Most scripting languages begin with "P"
PostgreSQL

Now, that makes a lot more sense.

Monday, October 02, 2006

Oracle BoF at LinuxFest

Thank God, the babysitter, and a wonderful spouse, I did get to go to Ohio LinuxFest after all. Hooray!

With one talk each on MySQL and PostgreSQL, it was a good day for database enthusiasts. Maybe that's why I saw quite a few Oracle users there.

The organizers also invited attendees to put together impromptu Birds-of-a-Feather (BoF) sessions. I was tempted to throw together one for Oracle, but I felt like my impromptu idea needed more preparation... I wanted to have a list of topics to seed discussion if it drags.

So, what kind of topics would you suggest for an Oracle-Linux BoF? Here are some that occurred to me.

Experiences with Oracle on various Linux distributions
Oracle's ancient Apache 1.3 HTTPServer; is it ever going to get to 2.0? Can you get mod_plsql working on Apache 2.0?
Open-source SQL*Plus alternatives (actually, someday I hope to talk at OOUG on this)
Experiences with RAC on Linux, Oracle Cluster File System, etc.?

Please, add topics to this list! And, if you're going to be at next year's Ohio LinuxFest, come be part of an OraBoF!

Oracle-free Oracle access

I got an interesting question from Guido d'Amico... he wants to use Python scripts to access Oracle databases from machines with no Oracle software installed. Between us, we came up with these options.

cx_Oracle and DCOracle2: These "classic" DB-API2 modules both rely on the OCI (Oracle Call Interface), a piece of software distributed by Oracle. (I believe all comparable means for accessing Oracle from other languages rely on the OCI, too.) There's just no way around that - you need some sort of Oracle client installed on the machine you're using them from.
You don't have to bulk up your machine with a full-blown standard Oracle client, though.
- Oracle Instant Client is lightweight (85 MB on my Windows box), free, and redistributable. For better and worse, it comes as a simple zipped set of files - if you want any environment variables set (ORACLE_HOME, PATH), you need to do that yourself.
- OracleXEClient is likewise lightweight (72 MB) and free, and very easy to install.
Neither of these options comes with a /network/admin/ folder, which might be a little confusing - unless you want to make your connections with Easy Connect, you'll need to set up $ORACLE_HOME/network/admin/tnsnames.ora by yourself.
You can use ODBC. mxODBC has been around for a while, but is not free for commercial use. pyODBC is free, and I hadn't actually heard of it until I researched this question - maybe I'll review it (or at least find a review) sometime soon.
You can use JDBC from Jython. Andy Todd and Przemek Piotrowski have blog posts detailing this.
You can go to IronPython and... um... OK, I've never yet done database access from IronPython, but I assume that using ODT.NET from IronPython is easy enough.

EDIT: Przemek Piotrowski has not just made it work, he's posted a tutorial on ODP under IronPython. Thanks, Przemek!

Tuesday, September 12, 2006

Ohio LinuxFest

Ohio LinuxFest is coming Sep. 30!

I absolutely loved it last year. It looks like I won't be able to make this year... *weeps bitter tears* - unless, that is, one of you would like to babysit a ten-year-old? No? Bleah, Saturday events are not parent-friendly. (Actually, I did meet one attendee last year who brought his ten-year old. I can't imagine pulling that off with ours, though.)

But anyway, you can give me some comfort by going yourself and having a fantastic time. They've promised extra Linux-newbie stuff this year, too. It will be better than Cats; you will want to see it again and again.

Qnxo price weirdness

Let me say, first off, that I deeply appreciate everything Steven Feuerstein does for the Oracle community.

But the marketing of Qnxo has just gotten weird. My boss asked me for my wishlist for the software budget, and I tried to put Qnxo on it. On Qnxo's "buy it" page, I got

Qnxo is, for the time being, available only on a trial basis. The trial version offers the full range of Qnxo functionality and will work for 30 days after install. If you would like to continue using Qnxo after that point, please visit the Support page and fill out the Contact form. We will then provide you with a key that will enable Qnxo for use until the end of 2006. If you have any questions, please also submit your question through the Support page.

I thought refusing to give customers a straight up-front answer on a price was a hallmark of $10K+ software. Last I heard, Qnxo was $175.

If it were open-source, I'd dive in and use it. If it were proprietary with a clear price, we could decide whether to buy it. But this? I'm supposed to make it a part of my development process based on the hope that, when we're ultimately given a price, it'll be acceptable? I don't see how anybody can do that. I know we can't.

[EDIT 09-09-2007: Steven himself (!) just notified me via his comment that Qnxo is now QCGU, and it's free of charge. Thanks, Steven!]

Monday, September 11, 2006

dual-boot Ubuntu/Windows

I have a new laptop from my boss! A Dell XPS M1710; it feels a bit like an SUV of laptops... it weighs more than I do, comes with an on-board fusion reactor, and emits a menacing red glow from around its edges. It's just chock-full of gigabytes and megahertz and stuff. I will never complain that my boss skimps on our hardware. I think I need a mule to carry it, though.

Anyway, since I'm torn between my love of Linux and my workplace's Windows mandate, I decided to make the machine mirror my split. I need a dual-boot Windows/Ubuntu machine. Since this is the first time I've done so, I'm really glad it was on a clean machine... it was not as smooth as vanilla Ubuntu.

Here's what I did. You should skip step 2.

Found good documentation on the Ubuntu wiki
Let standard Ubuntu installer attempt to shrink the main Windows (NTFS) partition, to give me room for Linux partitions. I ended up with partitons of type "unknown". Oops!
Re-installed Windows from its CD, this time setting its NTFS partition up with about 1/3 the total disk space and leaving the rest uncommitted.
Followed the instructions from the "Issues with Windows XP and NTFS" section of WindowsDualBoot: created a System Rescue CD, booted from it, and used run_qtparted to redo the partitioning.
Created a single physical partition - the last one of the four I'm allowed.
Within the final physical partition, created a linux-swap partition (4 GB for my 2 GB of memory). Divided the rest of the disk between an ext3 partition for Linux root and a FAT32 partition for data that Windows and Linux will be able to share.
Started the Ubuntu installer again; chose "manually edit the partition table", and assigned the swap and ext3 partitions to Ubuntu.

Accepting the defaults, my FAT32 partition got labelled "sda5" under Ubuntu and "e:" under Windows. Sure enough, both OS's can access it OK.

Success! One dual boot laptop, hooray! (Yes, I know, if I'd been really bold, I would have used Xen to run both OS's virtually. Maybe next time.)

Wednesday, August 30, 2006

Blogging tools

I've seen so many blogging tools over the past year or so... many of them seem like interesting projects that have generated lots of excitement.

Here's the part I don't get, though. Why? Had people honestly been going around saying, "Wow, entering my blog entries conventionally is such a time drain"?

I blog a few times a month, and the challenge is having something worth saying and choosing words to say it well - not slinging those words into Blogger's standard posting interface. It's hard to imagine how shaving seconds off that could honestly be worth choosing, installing, and understanding a blogging tool, much less writing one. Maybe if I blogged six times a day, but who would read that?

Maybe someone who "gets it" can explain it to me?

I suspect that this is one of these cases that's being driven by the coolness of the solutions, not the actual need for them. Mind you, I've got nothing against that. I have often spent three hours writing code to avoid a one-hour manual job. (Which is not as illogical as it sounds, because when you finish the manual job, somebody's bound to say, "Oh, I'm so sorry, but we need it done again..." Not that a logical consideration of that possibility is what drives me; I do it because it's fun.)

Friday, August 25, 2006

dumpfile diving

The oracle EXP utility generates a dumpfile which, although technically binary, actually contains lots of readable ASCII. Sometime I need to see this information, especially since dumpfiles don't provide a way (that I know of) of summarizing their contents and the conditions of their generation. In a perfect world, we'd never have to work from with dumpfiles burned onto scratched CDs by unknown parties and abandoned in the dusty corner behind the recycle bin, but this is not a perfect world.

Anyway, I just went through several techniques of examining a mysterious.dmp, and thought I'd share the experience. Much of it would apply to delving into any mixed binary/ascii file.

The worst way: more mysterious.dmp

gave me frightening glyphs and angry beeps (xset b off to stop those), like R2-D2 invoking dread Cthulhu. Worse, my session would henceforth speak to me only in proto-Sumerian. I could kill the terminal window and open a new one, of course, but I still spent several hours plastering smooth curves over all the office's sharp corners, just to be safe.
Not quite so bad: less mysterious.dmp

let me look at the file, making harmless marks of the binary characters, and didn't mangle my session's character set. Yay! I do need to make a habit of using less instead of more.
Still pretty painful: grep -a "what I'm looking for" mysterious.dmp

The -a flag makes grep look into a file even though it's binary. It has no proper idea of where lines end in a binary file, though, so your hits can be really long. I had better luck grepping the files that resulted from the operations below.
Good: imp me@mydb show=Y file=mysterious.dmp full=y log=mysterious_contents.txt

This gives you a clean-looking file (well, except for all the gratuitous quotation marks). It's also the only technique I know that you can use on Windows (without Cygwin). You don't get data contents, though, just DDL.
Best: strings mysterious.dmp > mysterious_contents.txt

This GNU strings utility is really great! You get the ASCII, the whole ASCII, and nothing but the ASCII, quick and clean.

What I still don't have, though, is a way to examine a dumpfile and find out exactly what command was used to generate it. That would give all kinds of fantastic information: What instance was it? Which user did the export? Was it full? ... and so forth. If you know of a way to query a dumpfile for this kind of information, please comment!

Wednesday, August 23, 2006

xubuntu

Xubuntu is a variant of Ubuntu that uses the lightweight Xfce desktop environment, making it a good choice for low-powered systems.

Or so they say. So I dusted off (literally) a Gateway Solo laptop (Pentium II, 300 MHz, 64 MB RAM) and replaced its Windows 98 with Xubuntu 6.06. And that's what I'm posting from now! Go, Xubuntu! The only problem is that a naughty kitten tore several keys out of the keyboard several years ago, making typing difficult. And he would have to get the 'e', the little stinker.

I'm using a 2-year-old wireless card with it, though. I failed with my first attempt to make its old ethernet card work, and decided to take Tim Almond's advice and just use a known compatible device rather than go into ethernet-card archaeology.

Hmm, now there's a new working laptop in the house. Oh, the possibilities...

Monday, August 07, 2006

History

Cleanup of obsolete material continues.

I'm holding a boxed set of Oracle 7.3.4 Server software. It's still shrink-wrapped.

Can I really throw this out? 7.3.4 is where I started, after all. Then again, if I keep it, am I like the people who imagine that their comic book "investments" will pay off one day?

I love working in IT... yet it can be horrifying, realizing just how ephemeral our constructions are. If you're a mason, your work may outlive you by millenia. If you're a geek and you want your work to outlive you, you'd better get very sick or take up some dangerous hobbies.

Tuesday, August 01, 2006

obsolete books

Following a recent run of absurdly good luck at user-group drawings, I must face the fact that my cubicle will not tolerate my current inventory of books.

"So throw them out". But... but... books represent knowledge, how can I just throw them out? Especially the ones that I never did get around to devouring... sure, those skills may have proven irrelevant to my work, but it's stuff I never learned! How can I give up on learning it, send Knowledge away unlearned?

Ahem. Anyway, psychological issues aside, does anybody know a good destination for mildly obsolete technical books? The recycle bin seems so brutal, yet how can I find somebody who'd want them? This is mostly Oracle and Java stuff averaging five years old...

Thursday, July 13, 2006

Digital cholesterol

That's my new term for the performance-clogging stuff that big-enterprise IT departments automatically install to user desktops via the enterprise network. Every week, a bit more gets piped in without my foreknowledge or consent, gradually crippling my machine.

I do my serious work on my Ubuntu laptop, which is barred from my workplace's network; I download software from home and transfer my finished products to work by USB drive. At first that seemed like an unfortunate price I had to pay; now it's looking more like a blessing. My Ubuntu laptop sizzles along as fast as the day I first booted it, while my plugged-in Windows machine creaks and groans and is slowly becoming unusable.

Thursday, July 06, 2006

sqlWrap and oraDifference, packaged right

In the past few months, I've written sqlWrap.py, a database connection convenience wrapper, and oraDifference.py (which depends on sqlWrap.py). I didn't have any particularly sane way to distribute them, though, and I apologize to anybody who made the attempt.

Well, it may amount to delusions of grandeur, but I registered them as a SourceForge project. Now they have

A single, sane place for downloads, properly versioned
A regular distutils python installer: unzip it, run
```
python setup.py install
```
, and everything goes where it belongs (sqlWrap.py in your Python library, oraDifference.py in your python Scripts directory)
a Windows executable installer (oooh, aaah)
Homepages with documentation: for sqlWrap.py and oraDifference.py

Because oraDifference.py depends on sqlWrap.py, I put them together in a single install for simplicity.

This has been my first time working with Python's distutils module (so much easier than I expected!) and Sourceforge (not so much). It's been fun!

Saturday, June 03, 2006

sqlpython enhancements

I told you that Luca Canali's sqlpython is wonderfully easy to customize.

I probably should have also told you that it's dangerously addictive to customize. I kind of went out of control, and produced sqlpyPlus.py, a module of enhancements to sqlpython.

- SQL*Plus-style bind variables
- Query result stored in special bind variable ":_" if one row, one item
- SQL buffer with list, run, ed, get, etc.; unlike SQL*Plus, buffer stores session's full history
- @script.sql loads and runs (like SQL*Plus)
- ! runs operating-system command
- SQL*Plus-style describe, spool
- write sends query result directly to file
- comments shows table and column comments
- compare ... to ... graphically compares results of two queries
- commands are case-insensitive
- show and set to control sqlpython parameters

sqlpyPlus.py is not as clean and elegant as sqlpython - that's one reason I put it in a separate module, so that you can keep it separate from the original sqlpython and your own homemade enhancements. But it should cover pretty much everything you usually use SQL*Plus for, plus some goodies I hope you'll like.

[EDIT: Since I wrote this, Luca has wrapped an enhanced and debugged version of sqlpyplus into his distribution of sqlpython itself. Now you should simply go and get or upgrade sqlpython, and you'll have these goodies automatically.]

Wednesday, May 17, 2006

The missing Mercurial manual

I think I've found my ideal solution for version control. I used bzr for a few weeks, and appreciated its distributed nature - no single repository has to be the ultimate authority, so it works well for machines that can't all be connected to the same network. I use a travelling USB drive to sync my machines, so it was perfect... except that bzr can take several minutes for just a few merges. It got annoying when the merge was keeping me from leaving work.

Mercurial is also distributed, but it's very, very fast. It's even (gasp) well-documented! There are, however, a few things I think a newbie should know up-front.

The tutorial assumes you'll start by copying an existing mercurial repository. I wanted my own, though, and it took me several tries to figure out that I needed to do this:
~/existing$ hg init ~/existing$ hg add foo.bar ~/existing$ hg commit ~/existing$ cd .. ~$ hg clone existing newdir
Later on, if I make new files in existing, they won't get to newdir until I
~/existing$ hg add newfoo.bar ~/existing$ hg commit ~/existing$ cd ../newdir ~/newdir$ hg pull ~/newdir$ hg update
pull brings fresh metadata in from the other repository, storing it in .hg, but doesn't actually go on to update the files themselves. That's what update does. It took a while to get it, but now that I do, it seems intuitive and helps me feel in control.
Your EDITOR or HGEDITOR environment variable must be set, or else hg commit will hurl you pitilessly into vi (*nix) or throw up its hands and scoff at you (Windows).
But, if you use gedit and already have another file open in it, you get
~/existing$ hg commit transaction abort! rollback completed
I suspect that may happen with other multi-file editors, too. I'll use EXPORT HGEDITOR=pico.

Tuesday, May 16, 2006

Python in OTN again

Here's a cheer to Przemek Piotrowski for his recent OTN article, Build a Rapid Web Development Environment for Python Server Pages and Oracle. Python Server Pages are just one of the 10²² ways to build a web application with Python, but his methodical instructions would be useful for doing anything remotely related (including installation of Oracle XE and mod_python).

Friday, May 05, 2006

SQLpython - a SQL client of your very own

Luca Canali has written SQLpython, a lovely new SQL command-line tool for Oracle.

Right now, the most popular SQL command-line tools are

SQLPlus, included with Oracle, is sometimes great, sometimes annoying, and impossible to modify (source code not available).
gqlplus is open-source. It's written in C, though, which means (to my mind) that you'll need all of your strength and all of your courage if you want to modify it.

So, download sqlpython.tar, untar it, put sqlpython.py and mysqlpy.py somewhere handy (like your Python library), and then:

$ python
>>> import mysqlpy
SQL.NoConnection> connect hr/hr@xe
SQL.xe> select * from employees;

Now comes the fun part! Open up mysqlpy.py and sqlpython.py and start modifying. They're very basic right now, but very clean, concise, easy to understand, and easy to modify. For instance, I wanted to be able to issue Python commands like this:

SQL.xe> py print 'This is a python command';
This is a python command.

So I added this method to mysqlpy:

    def do_py(self, arg):
        exec(arg)

That's all I did - not one keystroke more - and it works. Now that's extensibility!

If you're not an Oracle person and you're envious, as far as I can tell, it should be easy to modify SQLpython to use any DB-API2 adapter.

Friday, April 28, 2006

Stored procedures from cx_Oracle

A couple of people have asked me about calling Oracle stored procedures from cx_Oracle. It's taken me a while to answer, because... I didn't know! I'd only had experience doing them the 'dumb' way:


>>> ora = cx_Oracle.Connection('scott/tiger@orcl')
>>> curs = ora.cursor()
>>> curs.execute('execute immediate myStoredProc(:a)',{'a':'the letter a'})

... but, of course, that won't do if (for instance) you want OUT variables. So I did a little research. cx_Oracle provides callproc and callfunc, but using them can get squirrely. Say you have PROCEDURE times_two(n IN NUMBER, result OUT NUMBER).

>>> n = 1
>>> curs.callproc('times_two',[2, n])
[2, 4]
>>> n
1

In other words, if you just pass a regular Python variable to callproc, the value won't actually change, OUT mode notwithstanding. If you want the new value, you'll just have to assign it there from callproc's return value.

Alternately, you can prepare the way by setting up your in/out variable as an instance of a special cx_Oracle object type, as follows...

>>> n = curs.var(cx_Oracle.NUMBER)
> curs.callproc('times_two',[5,n])
[5, 10.0]
>>> n
<cx_Oracle.NUMBER object at 0xb7cf2480>
>>> n.getvalue()
10.0

Pre-setting a variable's type? Calling .getvalue() just to see the contents? What an un-Pythonic pain! As far as I know, for the time being, cx_Oracle and PL/SQL procedures with IN-OUT parameters are simply two great tastes that do not taste great together. You can do it, you just won't feel like you're having Pythonic fun.

On the plus side, if the stored function or procedure is within a PL/SQL package, callproc accepts that in the way you'd guess:
> curs.callproc('multiplication_package.times_two',[5,n])
[5, 10.0]

Oh, and it looks like sqlWrap.py wasn't handling .callproc. I've posted a correction.

Saturday, April 22, 2006

IOUG Collaborate! handouts

If you're on your way to Collaborate!, and getting annoyed at the way you need to search manually for each session just to download its session materials, this script may be handy. It lets you grab the session materials from your personal itinerary.

Be gentle, it was written in a huge hurry.


#!/usr/bin/python
"""Creates a version of your Collaborate! personal itinerary with links to
session materials.

To use:
0. Make sure your machine has Python.  www.python.org
1. Login to your personal itinerary at
http://iougew.prod.web.sba.com/displaymod/ITIntro.cfm?conference_id=44
2. Once your personal itinerary is showing, use Save As to save the webpage
to your hard drive.  Name it PersonalIT.cfm.html.  (This should be the 
default name.)
3. Put this script in the same directory with PersonalIT.cfm.html.
4. Run the script by issuing 'python makeLinks.py' at the command prompt.
5. Open the generated file PersonalIT.withLinks.cfm.html with a browser.
6. The (find materials) links for each title will search for session materials.

By Catherine Devlin (catherinedevlin.blogspot.com)"""
import re, urllib
titleRe = re.compile('(Title:</td>\s+<td.*?>(<a href.*?>(.*?)</a>))', re.DOTALL | re.MULTILINE)
f = open('PersonalIT.cfm.html')
contents = f.read()
newContents = contents
f.close()
sessions = titleRe.finditer(contents)
sessionLinks = [s.groups()[1:] for s in sessions]
for (wholeLink, title) in sessionLinks:
    withNewLink = '%s <a href="http://iougew.prod.web.sba.com/proceedingmod/SearchEvents.cfm?conference_id=44&searchType=2&title=%s">(find materials)</a>' % (wholeLink, urllib.quote(title))
    newContents = newContents.replace(wholeLink, withNewLink) 
newFile = open('PersonalIT.withLinks.cfm.html','w')
newFile.write(newContents)
newFile.close()

Monday, April 17, 2006

oraDifference.py

oraDifference.py - a tool for comparing items that differ between two Oracle schemas. The basic idea is to leverage the excellent graphical diff/merge tools available for file comparison and conveniently use them to inspect database object differences.

There are many programs that can compare two database schemas and tell you which objects are defined differently between them. That's really not good enough, though, because you then need to tediously dig into the definition of each (allegedly) differing object by hand, and perform any desired reconciliation by hand.

I wrote oraDifference.py to make comparing and reconciling schemas more convenient. For example, let's say you have the SCOTT schema in production and development instances. Stored function MYFUNC is defined in both, but the definition differs. View MYVIEW is defined only in development. Then running
python oraDifference.py scott@prod scott@dev
will generate the following batch files (Win) or shell scripts (*nix):

oraDifferenceResults/FUNCTION/MYFUNC.bat, which will invoke a graphical diff/merge tool showing you precisely where MYFUNC's definition differs between the two instances
oraDifferenceResults/FUNCTION/MYFUNC-copy-SCOTT-DEV.bat, which will write DEV's definition of FUNC into PROD
oraDifferenceResults/VIEW/missingFrom-SCOTT-PROD/MYVIEW.sql, the definition of MYVIEW
oraDifferenceResults/VIEW/missingFrom-SCOTT-PROD/MYVIEW-copy-SCOTT-DEV.bat, which writes MYVIEW into PROD

For now, you have to do the work of getting oraDifference.py (and sqlWrap.py) manually and putting them someplace appropriate. I do intend to wrap them up in a proper distutils distribution (maybe even with an .egg).

I'm posting this now because I find it really useful already. You may find some of my design decisions quirky - for instance, I mush all one-liner items (grants, synonyms) into big files by category, rather than making separate files for each grant or synonym. It's Python, though, so you should be able to tweak it to meet your tastes. Also, you can tweak the process oraDifference.py uses to decide whether two objects differ. I have always been annoyed that I can't stop TOAD's "Schema Compare" tool from turning up dozens of "differences" that I consider false hits. With oraDifference.py, you can just get in there and change it.

Eventually I hope to release something that will look polished and final, but for now, feel free to use it, and re-code any part that doesn't match your preferences - and let me know about any of your changes that you think should go into everybody's version.

For fairness, I'll mention some other options I found for schema comparison...

LivingLogic's oradiff.py (part of ll.orasql) is the closest to oraDifference.py. It also compares the text for each object, but it outputs in unified diff format (or unidiff). If you can read unidiff comfortably - welcome, advanced extraterrestrial visitor! The oradiff.py source code looks tidy and well-organized, but it's still not obvious to me how to tweak it.
schemaCompare, a Java program, was registered at SourceForge in June 2002, but has not yet released any files. I conceived oraDifference.py about two weeks ago. Not to suggest that this implies anything about the relative productivity of various languages. (jab, jab)

Friday, April 14, 2006

Python Core for Oracle

He put it in a comment, but it bears repeating:

Przemek Piotrowski has written up Python Core for Oracle, a set of instructions to put a top-to-bottom data-driven webserver stack on your machine in about half an hour. The installation is surprisingly straightforward. It's all fully functional, and it's all free.

This is the Golden Age!

Thursday, April 13, 2006

Cheetah templating

Yesterday, Python's str.Template failed me, so it was finally time to learn Cheetah.

I wanted to use templates like
'my list has $len($myList) objects; the first is named $myList[0].name.upper()'
... but, of course, that sort of stuff is impossible with str.Template. I created a bunch of code to populate a dictionary to pass to str.Template, but that was clunky, and defeated the purpose of having a template that clearly describes its own contents. In Cheetah, it's perfectly straightforward.

from Cheetah.Template import Template
tmplt = 'my list has $len($myList) objects; the first is named $myList[0].name.upper()'
print Template(tmplt, [locals(),globals()])

The second argument is the list of dictionaries Cheetah will search for matches to variables in the Template. Using [locals(), globals()] is my way to cheat and say, "Look wherever the interpreter would".

Wednesday, April 12, 2006

Oracle XE and Ubuntu

WOW. I just installed Oracle XE on my Ubuntu machine. I absolutely cannot believe how easy it was. This is - honest to goodness - all I did.

Download oracle-xe_10.2.0.1-1.0_i386.deb
su - root
dpkg -i oracle-xe_10.2.0.1-1.0_i386.deb
It ran for maybe thirty seconds - so short, I was certain there had been an error!
/etc/init.d/oracle-xe configure (it told me to do that)
pointed Firefox at http://127.0.0.1:8080/apex (it requested that, too)
Started using the database (plus its included Application Express).

The entire installation took less than five minutes. Unbelievable! "This is Oracle?"

The only glitches I've gotten so far were when using Python's cx_Oracle against XE, and I've puzzled them out. (I don't know whether other people will get these glitches; they could have resulted from some residue of the full-fledged 10.2 Oracle that was on the machine before.)

import cx_Oracle gave ImportError: libclntsh.so.10.1: cannot open shared object file: No such file or directory until I set LD_LIBRARY_PATH=/usr/lib/oracle/xe/app/oracle/product/10.2.0/server/lib/
conn = cx_Oracle.Connection('scott/tiger@xe') gave RuntimeError: Unable to acquire Oracle environment handle until I set ORACLE_HOME=/usr/lib/oracle/xe/app/oracle/product/10.2.0/server

[EDIT Oct. 9, 2007: Configuring cx_Oracle with Oracle XE has turned out to be harder than expected. See my new blog post.]

[EDIT Mar. 6, 2008: Great instructions for installing straight from Oracle's repository with apt-get here]

Tuesday, March 28, 2006

summary of Oracle/Python discussion

My OTN article on Oracle and Python was kept very brief, to be non-intimidating and to fit within OTN's preferred length. If you've come here, though, you're ready for the rest of the story! I'll use this post to summarize that discussion.

Python+Oracle on other Linux distributions - see Andy Todd's blog entry
alternatives to fetchone(): fetchmany(), fetchall(), and looping directly on the cursor - see my last entry and this comment
Passing an argument to split(), to avoid errors on more complex init.ora parameters - see this comment

Friday, March 24, 2006

OTN article addendum

If you've read my new article at the Oracle Technology Network, Wrapping Your Brain Around Oracle + Python, thank you! I'd like to add a few more details about fetching rows with cx_Oracle that can make your code even cleaner.

Several times, I demonstrate getting rows from a cursor by means of the cursor's .fetchone() method. .fetchone() is used in loops like this:

curs.execute(<some query>, <bind variables>)
aRowOfData = curs.fetchone()
while aRowOfData:
    <commands>
    aRowOfData = curs.fetchone()

Another, more concise, alternative was not mentioned in the article. The cursor object itself can be iterated over, like this:

curs.execute(<some query>, <bind variables>)
for aRowOfData in curs:
    <commands>

The effect is the same, but it works with two fewer lines of code.

Finally, the .fetchall() method, bringing the entire result set into a list at once, was only briefly mentioned in the article, but it would probably be preferable to .fetchone() for the small result sets we'll find in places like v$parameter. Only when a result set is very large (or your computer is very memory-limited) do you need to worry about .fetchall()'s impact on your system's available memory.

Sunday, March 12, 2006

Alice: corrupting the youth

I wish I could remember which PyCon delegate told me about Alice. It's a graphical environment for programming animations in a very kid-friendly fashion, yet full of solid, object-oriented programming goodness. This generation's LOGO, I suppose.

I was hoping it would fire our nine-year-old's interest in computers. I think it's working; Star Wars: Battlefront never produced such delighted shrieks and giggles. Maybe it works too well. He refused dessert to spend the extra couple minutes with Alice. Choosing code over food - I always thought of that as late-stage geekery.

Cheer: Alice works hard to be girl-friendly. Boo: Not available for Linux.

Tuesday, March 07, 2006

new and improved sqlWrap.py

After reading the Python Cookbook and attending PyCon, I have enormously improved sqlWrap.py, my Python module for handling DB-API 2.0 connections conveniently.

Also at PyCon, I learned that I could be accused of having duplicated projects like

Like them, I allow tuple-like, dict-like, and object-like access to fields. I like mine better, though; it requires less preparation, and has really nice reporting methods. For example,

conn = sqlWrap.SqliteConnection('myDb.sqlite')
print conn.select('myTable').xml()

... is all you need to get an XML report on myTable. Similar reporting methods exist for

tables in pp (prettyprint), xhtml, ReStructured Text
transposed tables in pp, xhtml, ReStructuredText
SQL INSERT statements

Hey, if everybody gets to write their own web app platform, why shouldn't I write my own DB-API wrapper?

Thursday, March 02, 2006

PyCon2006

No, Guido doesn't know I have this... I snagged it from the badge reuse box. You keep your smelly old rock star T-shirts, I'm keeping this.

So... PyCon. Wow. It was wonderful, of course. What else could it be? Put 400 people with that much intelligence, creativity, and energy in one place, and it can hardly help but be wonderful.

What surprised me is that I found so many good ideas no matter what I was doing at PyCon. Whether I was in a talk that I expected to benefit hugely from, or a "Probably useless but I guess I'll go anyway" talk, or just chatting between or after sessions, I seemed to learn good stuff constantly just the same. The serendipity was worth more than the "planned" learning, and maybe that's what makes conventions so much more useful than formal books and classes and so forth.

Friday, February 17, 2006

Python Cookbook

This may seem silly, but I really have to sing the praises of the Python Cookbook, 2nd Edition. Everybody else has known about it forever, but I just got my copy a month ago.

I am blown away. I can't believe how good this book is, far beyond any other programming book I've ever known. I'm now completely embarrassed about the quality of the code I wrote without it, and tempted to stay up all night refactoring everything. In fact, I've already largely rewritten sqlWrap.py based on what I've learned.

I'm afraid I dawdled about buying my copy because I never found the recipes at ActiveState all that compelling - occasionally nice, but usually nothing to jump and scream about. I figured the Cookbook would just be a bunch of them, bound together. Not so - the careful selection and excellent discussion make it amazing. It's like having a master programmer at your elbow to guide you.

Friday, February 10, 2006

The Geek Event Aggregator is ready!

The Geek Event Aggregator is more-or-less ready for prime time! It now collects many more events. Better yet, it is very easy to feed more event sources, so it is set for even more growth. In other words,

PLEASE FEED THE AGGREGATOR NEW SITE SUGGESTIONS!

Some design notes:

After fooling around with complex regular expressions, Beautiful Soup, etc., I found a quick-and-dirty way that works better. The Aggregator downloads a page's HTML, replaces all tags with carriage returns, breaks the remaining text into lines, and checks those lines for ones that appear to contain recognizable future dates. The Aggregator assumes that is an upcoming event date. (There are many reasons dates get put on websites, but future dates almost always refer to meetings or events.)

Wow, human beings have many, many, many ways to write dates. Fortunately, python's dateutil module can recognize most of them - I mainly have to modify that to avoid false hits (like interpreting '.' as 'today', or '2006' alone as 'Jan 1, 2006').

Some sites already aggregate events from several groups and places together; for those, the Aggregator uses a slightly different algorithm. It finds the dates as above, then finds the location and event name by their line-number position relative to the date. (A human being (me) needs to provide relative line-number positions for those values in advance; for instance, one site may always list location on the line immediately after the event date, and that fact is recorded in advance.)

For the multi-event sites, the Aggregator has a decent but kludgey algorithm to parse city and region, despite the great variety of ways to write a location. Part of that relies on a list of recognized city names. It can be used for the single-event sites, too; if a recognizable city name is in the site's Title, or in the text in the form "Blah Blah City Blah Blahware Group", the Aggregator can find it. But if your event is in Athens, GA, the Aggregator thinks it's in Greece.

HTML DB makes it very convenient to build a web interface to the data. Oracle is also very gracious to host a free sandbox for HTML DB projects (which is where the Aggregator lives right now.) Unfortunately, as far as I can see, HTML DB doesn't support RESTful interfaces, or serving up pages as XML. That's a pity, because this cries out to be a REST web service. Maybe eventually I'll buy/find a place to host the web app in TurboGears or something.

For now, I've given up on feeding upcoming.org. Upcoming mandates an actual street address, which is just too hard to find automatically. I'd still like to pull from it, although I'll have to check their legal requirements, and - dare I say it? - I don't know if it will really have many relevant events I don't already have.

I'm sorry the events are so U.S./Canada-centric. It only has a handful of events from elsewhere, and it doesn't break down regions within other countries. (What's wrong with going from St. Petersburg to Novosibirsk for a meeting, anyway? Isn't that what the Trans-Siberian Railway is for?) You can help fix this by suggesting new sites to scan, and volunteering to introduce regional granularity for other countries.

Actually, because I've been the only one to feed the Aggregator so far, the events are Ohio-centric. You folks in benighted backwaters like California and New York are just going to have to feed it your own favorite sites if you want to fix that.

Some of the many things that produce misses and false hits:

Dates without years. Somebody puts on the website, "Our next meeting is Nov. 19." The Aggregator can't tell that they haven't updated the site since 2004.
Years must be on the same line as the rest of the date. If you say,
2005 events:
Feb. 14
Aug. 12
... the Aggregator doesn't see the "2005", and believes there are events on Feb. 14 and Aug 12 of this year.
Frames. Well, you can't blame it; frames mess up everybody. But if you can dig into the HTML source and puzzle out the URL of the frame with the data, then that can be read. That's what I did for the OKCOUG webpage, for example.

Thursday, February 02, 2006

Geek Event Aggregator's future

The strangest thing happened to me this morning. The clouds split open, and a beam of light shone down onto me. (Since I was sitting in my cubicle, this in itself was odd enough.) A Voice spoke from Heaven, and said,

"Catherine, remember that somewhat cheesy Geek Event Aggregator you put on HTML DB a while ago?"

"Yes, Lord?" I said. (That seemed like the obvious response. That the Almighty would speak with hyperlinks didn't seem too surprising.)

"Well, I have heard the cries of my people. I want you to rewrite your aggregator to both consume and provide events in iCalendar format. Also, interface it with upcoming.org - both to download and to upload. Behold, Python libraries for upcoming.org have already been written for you. And I'm thinking that Beautiful Soup might help you pluck event descriptions from the God-awful HTML jungles you find them in."

"It sounds wonderful, Lord," I said. "But when am I supposed to do this? I mean, you did just drop custody of our Godson into our lap. Time is not exactly abundant right now."

There was a long pause, then the Voice said, "Let me get back to you on that."

Friday, January 27, 2006

cleanPath executable ready

An executable version of cleanPath is now available here!

You do need .NET Framework 2.0 to run this. It's pretty painless to download and install. You don't need IronPython.

Thanks to the newest IronPython release for making this possible! When I run

ironpythonconsole cleanPath.py

, the code runs, and it also generates cleanPath.exe without even being asked to. Until now, though, that .exe didn't actually work. In IronPython 1.0 Beta 2, however, released just Wednesday, it works great! I do need to put IronPython.dll in my directory before I compile... other than that, it's so simple, even a DBA can do it.

Wednesday, January 25, 2006

cleanPath.py: an IronPython utility

During my CodeCamp presentation, I introduced a very sketchy (and logically flawed) script to examine a PATH environment variable and clean out unwanted elements.

Since then, I created a much-improved version in IronPython which, I think, is actually worth using. Features:

Conveniently view and delete path items
Nonexistent directories and duplicate items are flagged, and can be purged with one keystroke
works on PATH, CLASSPATH, PYTHONPATH, PERL5LIB, etc.

It's Windows-only, but only Windows makes editing your PATH such a pain, anyway.

The code is here.

Right now, you need IronPython, and use "ironpythonconsole cleanPath.py" to run it. I need to figure out how to generate an .exe, since getting set up with IronPython is too much to ask just to clean up your PATH.

[EDIT: An executable is ready now. See my next blog entry.]

Monday, January 23, 2006

Smurfette syndrome

One thing about the CodeCamp, though: severe Smurfette syndrome. Maybe 10% of the attendees were women, but probably closer to 5%.

I don't think it's the CodeCamp's fault; I'm afraid they may have gotten an accurate sample of the programmers in the area, and that's the scary thought. Possibly things would have been a little better on a weekday; men might feel more comfortable in taking a day away from home duties for their professional development. Still.

I know, it's been talked to death, and there are a million theories and stuff, but... yikes. It's just scary that one year goes by after another, and the situation persists.

On the other hand, the GCOUG meeting was about half women, maybe even a majority. At a small meeting, granted, but it's still a bright spot.

Dayton-Cincinnati Code Camp review

If you missed the CodeCamp, don't miss the next one. It was great. I'm not even into .NET, but even the minority of the conference that was non-.NET was well worth my time. Plus, sampling a bit of the .NET take on a few topics was good for my general education.

I almost wish I were involved in .NET, purely for the sake of the Dayton .NET group, which has a vitality that you just don't see that much of in the Dayton area IT community (at least, in my tracks thereof.)

Anyway, a few things I learned...

There really are Python-experienced people in the area; I'd just never found them before.
Ruby on Rails supports Oracle. Uh-oh. It's not that I'm unwilling to learn Ruby; it sounds great, and (of course) simple to learn. It's just that I really want to use TurboGears, but it uses SQLObject, and Oracle support in SQLObject is the sword in the stone. Many have tugged, but none have completed it, and maybe I should stop waiting and just use RoR for now.
Speaking of Ruby, never miss seeing Jim Weirich speak. I already knew that, but maybe you didn't.
GCOUG really needs to work on getting more people involved. There were 15-20 people at Jimmy Brock's talk on Oracle Developer Tools for .NET - none of whom had more than the haziest awareness of GCOUG's existence.
Speaking of Jimmy's talk, Oracle Developer Tools has a great and handy way to bring .NET assemblies from Visual Studio all the way into Oracle, where they can be used like stored procedures.
So, one glorious day, this year I hope, Visual Studio will support IronPython, and I'll be able to (essentially) write my Oracle stored procedures in Python instead of PL/SQL. And choirs of angels will sing.

I'm sure I'll have more to say about the Camp as I think about it.