Catherine: pyOraGeek: June 2009

Wednesday, June 24, 2009

don't need no stinking rules engine

There's a whole class of programs called "rules engines". The idea is to remove the details of a process from the hard-code of the program, store them externally, and view/modify them easily. The engine then converts the rules, stored in some sort of custom format, back into an executable form at runtime.

In my experience, Python itself is an effective rules engine. Thanks to Python's readability, you can store business rules as snippets of Python code - in text files, a database table, or wherever you prefer - and business users should be able to read them comfortably. After that, a very lightweight Python program can load the rules and the relevant data and use exec() or eval() to apply the rules to the data.

One of my main projects is an example of this. It's a program that synchronizes data between two Oracle databases. That sounds easy, but business details complicate it enormously:

- Table and column naming, structure, and normalization differ
- Only some rows are transferred, according to a complex set of business rules
- Only some columns are transferred. Column values are combined, split, truncated, have functions applied, etc. Again, governed by a jungle of business rules
- The business rules change continually
- Rules must be documented. Letting documentation get out of synch with implemented rules is very bad.
- Users may demand explanations for each decision made by the program, down to the row and column level

My first take on the problem was a large hard-coded PL/SQL procedure. What a nightmare!

Later, I rewrote the rules as snippets of Python. Each rule is stored in a database table along with the dates it takes effect and expires, the person authorizing the rule, and a justification. This readable, self-documenting set of rules can also answer questions like, "Why did things change since last month?".

Unfortunately, I didn't know much about object-relational mappers when I wrote it, so the program has clunky data-fetching code. I'm currently working on a third version of the program that uses SQLAlchemy; the resulting program is very short. Broadly, here's what it does:

- Queries the row-level and column-level rules from their respective tables

- Fetches a row from the local database (ours) and the corresponding row from the remote database (theirs)

- The heart of the engine:


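# (excerpt from the function that processes one pair of rows)
data = {'ours': ours, 'theirs': theirs}
# Row rules are expressions; any false result means "skip this row".
for row_rule in row_rules:
    if not eval(row_rule, data):
        return
# Column rules are statements that copy or transform values.
for column_rule in column_rules:
    exec(column_rule, data)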

Actually, the data dict also includes definitions of a few functions that some of the rules invoke. The function names are chosen to be self-explanatory to business users. For example:


def fiscalYear(inDate):
    if inDate.month > 9:
        result = inDate.year + 1
    else:
        result = inDate.year
    return result

data = {'ours': ours, 'theirs': theirs, 'fiscalYear': fiscalYear}
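
Pulling the loop and the helper together, a minimal self-contained sketch might look like the following. It assumes plain SimpleNamespace objects standing in for the mapped database rows; the apply_rules name, the awarded and fy attributes, and the second column rule are made up for illustration and are not taken from the real rules tables.

```python
from datetime import date
from types import SimpleNamespace


def fiscalYear(inDate):
    # Same logic as the helper above: the fiscal year rolls over in October.
    return inDate.year + 1 if inDate.month > 9 else inDate.year


def apply_rules(ours, theirs, row_rules, column_rules):
    """Return False if any row rule rejects the row, else run the column rules."""
    data = {'ours': ours, 'theirs': theirs, 'fiscalYear': fiscalYear}
    for row_rule in row_rules:
        if not eval(row_rule, data):      # row rules are expressions
            return False
    for column_rule in column_rules:
        exec(column_rule, data)           # column rules are statements
    return True


# Hypothetical stand-ins for one local row and its remote counterpart.
ours = SimpleNamespace(funded='Y', value=1500, awarded=date(2008, 11, 3))
theirs = SimpleNamespace(value=None, fy=None)

row_rules = ['ours.funded == "Y"']
column_rules = [
    'if ours.value > 1000:\n    theirs.value = ours.value',
    'theirs.fy = fiscalYear(ours.awarded)',
]

transferred = apply_rules(ours, theirs, row_rules, column_rules)
print(transferred, theirs.value, theirs.fy)
```

Returning a boolean instead of a bare return is just a convenience for the sketch; it lets the caller log which rows were skipped.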

I suppose it wouldn't be too hard to put the function definitions themselves in among the rules, then include locals() in with data, as long as execution order is controlled (easily done by putting execution_order columns in the rules tables). It hasn't been necessary for my project.
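
As a rough sketch of that idea (not something the project actually does), function definitions could sit in the same list as the rules that call them, with an ordering value deciding what runs first. The stored_rules list, the when and result names, and the order numbers are all hypothetical.

```python
from datetime import date

# Hypothetical (execution_order, code) pairs, as they might come back from a
# rules table that has an execution_order column. Deliberately out of order.
stored_rules = [
    (20, 'result = fiscalYear(when)'),
    (10, 'def fiscalYear(inDate):\n'
         '    return inDate.year + 1 if inDate.month > 9 else inDate.year'),
]

data = {'when': date(2009, 10, 1)}

# Sorting by execution_order guarantees the def runs before the rule using it.
for _order, code in sorted(stored_rules):
    exec(code, data)

print(data['result'])
```

Because exec() is handed a single dict here, the def is stored in that same dict, so later rules can see it without a separate locals() merge.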

- Now row_rules has eval()-able entries like::

  id  start   end     code                      authorized   reason
   1  1/1/06  7/1/09  ours.funded == "Y"        Bob          I said so

and column_rules has exec()-able entries like::

  id  start   end     code                           authorized   reason
   1  2/2/08          if (ours.value > 1000):        Steve        to annoy Bob
                          theirs.value = ours.value
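
The start and end columns are what make it possible to ask which rules applied on a given day. A minimal sketch of that filter, assuming the dates are month/day/year, that a blank end means "no expiry yet", and that the end date itself is exclusive; the Rule class, the active_rules name, and the second rule are invented for illustration:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Rule:
    code: str
    start: date
    end: Optional[date]   # None means the rule has not expired (assumption)
    authorized: str
    reason: str


def active_rules(rules, on_date):
    """Rules in effect on on_date: start inclusive, end exclusive (assumed)."""
    return [r for r in rules
            if r.start <= on_date and (r.end is None or on_date < r.end)]


rules = [
    Rule('ours.funded == "Y"', date(2006, 1, 1), date(2009, 7, 1),
         'Bob', 'I said so'),
    # Hypothetical successor rule, only here to show a change over time.
    Rule('ours.value > 0', date(2009, 7, 1), None,
         'Steve', 'hypothetical successor rule'),
]

june = [r.code for r in active_rules(rules, date(2009, 6, 24))]
august = [r.code for r in active_rules(rules, date(2009, 8, 1))]
print(june)
print(august)
```

Comparing the two lists is one way to answer "why did things change since last month?": the rules that differ carry their own authorizer and reason.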

- Add some logging and a "test-run" capacity which reports on the changes without actually performing them. (It uses sqlalchemy.orm.attributes.get_history() for this; make sure to set autoflush=False if you use this, or intermediate flushes might clear the history.)
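
The test-run idea can be illustrated without SQLAlchemy. As a plain-Python stand-in for get_history() (not the same mechanism), the sketch below snapshots an object's attributes, applies the column rules, reports what would change, and then restores the original values. The dry_run name and the SimpleNamespace rows are assumptions for illustration.

```python
from types import SimpleNamespace


def dry_run(ours, theirs, column_rules):
    """Apply column rules, report what would change on theirs, then undo."""
    before = dict(vars(theirs))           # shallow snapshot of current values
    data = {'ours': ours, 'theirs': theirs}
    try:
        for column_rule in column_rules:
            exec(column_rule, data)
        after = dict(vars(theirs))
    finally:
        # Put the snapshot back so the test run leaves theirs untouched.
        vars(theirs).clear()
        vars(theirs).update(before)
    # Map each changed attribute to its (old, new) pair.
    return {name: (before.get(name), new)
            for name, new in after.items()
            if before.get(name) != new}


ours = SimpleNamespace(value=1500, name='Widget')
theirs = SimpleNamespace(value=900, name='Widget')

changes = dry_run(
    ours, theirs,
    ['if ours.value > 1000:\n    theirs.value = ours.value'],
)
print(changes)
print(theirs.value)
```

The snapshot is shallow, so a rule that mutates a list or dict in place would not be undone; that limitation is one reason to lean on the ORM's own change tracking in a real program.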

I suppose I could try writing a sample rules-engine implementation in simple, general terms, that people could crib from for their own "rules engine" applications. I wonder if that would be helpful, or if just the general idea is enough guidance.

Friday, June 19, 2009

Python for Secretaries

As if I need a new ambition, I've got an itch to create and teach a course called "Computer Programming for Secretaries". Since I got into IT via the secretarial pool, I think I'm the perfect one to do it.

To outsiders, programming has this horribly intimidating aura. You've got enterprisey Software Architects trying to sound professional, academic Computer Scientists telling you that you're oversimplifying the problem, fearsome Hacker Gods strutting their skillz. Lots of people want to make sure you know how smart they are, and that nothing could happen without their planet-sized brains.

but programming is not rocket science

If you want to launch satellites into space, you need to invest your life in the field and be part of a large, well-funded institution. Yes, you can have lots of fun with model rocketry, but you're just playing. You're not actually going to get anything into orbit.

programming is more like cooking

My friend James is a professional chef. Everything he makes involves a bunch of French words, ingredients I've never heard of, and turns out eyes-roll-back delicious. I don't have the ambition to invest the time and effort to cook that well... but I can still roast a turkey. That's what programming is like, especially dynamic language programming. With a lot of skill, you can work miracles - but with a little skill, you can work little miracles. You don't need to go in for the whole hog.

That's what I'd like a class to address. There are programming books aimed at kids, but none that I know of aimed at adult business users. There are people who could write themselves small, useful programs, but who will flee in well-justified terror if you start talking about overriding import hooks. There are people spending hours cutting-and-pasting from one file to another because they don't know how to write a six-line script. There are people who could replace some of their daily tedium with just a little dose of Python. There is Resolver One, which is a fantastic way to integrate tiny dashes of Python with everyday spreadsheet work, but it's being used by thousands instead of by millions.

So... yeah. What should such a class include? More importantly, once I'm ready to teach such a class, where do I teach it?

Monday, June 15, 2009

how to tell a geek

Given a choice between spending an hour doing a task manually, or spending three hours writing a program to do it automatically... a geek will write the program, every single time. And, if not given the choice, if explicitly ordered to do the job manually, we'll disobey and write the program anyway. I've heard it said that a good geek is lazy, but I think it's more precise to say that a geek dreads boredom above all else. We'll move mountains to accomplish a task, as long as it's interesting.

This is not nearly as crazy as it sounds, because after we've "finished" a task, without fail, the requestor will return and say, "I know I said that would just be a one-time change, but...", or, "Actually, it turns out we don't need A B C D, we need A B Q D C", or whatever. You will reuse that program, no matter what they say; never throw it away!

Tuesday, June 09, 2009

NCR

Pardon, oh Citizens of the World who read this, while I go regional for a moment and speak as a Dayton-area resident on news almost certainly irrelevant to you.

The big news here last week was NCR's decision to leave Dayton. Basically, three reasons have been given:

- "the high availability of a skilled work force"
- More direct international flights from Atlanta than from Dayton
- $60 to $80 million in incentives from Georgia

The funny thing about the first reason is... NCR doesn't hire people. (Their manufacturing plants may hire, but I'm speaking of their HQ here in Dayton). Since I came to the Dayton area, I've had IT friends in NCR and have tried to keep up with them. The news from them has always been the same: "We just went through another round of downsizing. We keep wondering when our turn will come." For a company in continual contraction, the benefit of a larger pool of people to not hire seems... um, not clear.

That leaves shorter plane flights for those who fly to Europe - seems a strange reason to move 1,300 people - and a large amount of cash. Many people think Ohio should have tried to outbid Georgia, but that would have to come at the expense of companies that don't threaten relocation - and it begins to blur the line between "private company" and "state-funded entity", anyway.

Moving itself, of course, gets rid of those employees who choose not to relocate. I predict that most Ohioans who choose not to move with NCR will not be replaced; the company will use the natural contraction in place of one of its periodic downsizings.

Anyway, it's sad for Dayton, since the company had such a history here, but that's pretty much what NCR has been about here for years - history. For decades now, growth for Dayton - as for most cities - hasn't come from big, stable, traditional companies but from small companies, appearing and disappearing quickly as new opportunities appear and change and dry up. It's a less predictable business world, but that's the century we're in. No amount of sighing will prolong the 20th century.

Friday, June 05, 2009

Python Magazine article

I've got an article in this month's Python Magazine: PyOhio: Planning and Running a Regional Python Miniconference. I try to cover some of the stuff we learned in the course of doing the first PyOhio, for the benefit of people considering staging similar conferences of their own. I feel a little silly impersonating an expert on the topic, since I'm near the beginning of a learning process that never ends - but in the open-source world, it's not being the ultimate guru that's important, it's taking the time to share whatever you can.

Python Magazine is a great publication, by the way - with all the good stuff about Python on the net, you might wonder what's the point of buying a magazine, but their articles are very well-chosen and there's a real advantage to being able to read it away from the computer.