Wednesday, June 24, 2009

don't need no stinking rules engine

There's a whole class of programs called "rules engines". The idea is to remove the details of a process from the hard-code of the program, store them externally, and view/modify them easily. The engine then converts the rules, stored in some sort of custom format, back into an executable form at runtime.

In my experience, Python is an effective rule engine. Thanks to Python's readability, you can store business rules as snippets of Python code - in textfiles, a database table, or wherever you prefer - and business users should be able to read them comfortably. After that, a very lightweight Python program can load the rules and the relevant data and use exec() or eval() to apply the rules to it.

One of my main projects is an example of this. It's a program that synchronizes data between two Oracle databases. That sounds easy, but business details complicate it enormously:

  • Table and column naming, structure, and normalization differ
  • Only some rows are transferred, according to a complex set of business rules
  • Only some columns are transferred. Column values are combined, split, truncated, have functions applied, etc. Again, governed by a jungle of business rules
  • The business rules change continually
  • Rules must be documented. Letting documentation get out of synch with implemented rules is very bad.
  • Users may demand explanations for each decision made by the program, down to the row and column level

My first take on the problem was a large hard-coded PL/SQL procedure. What a nightmare!

Later, I rewrote the rules as snippets of Python. Each rule is stored in a database table along with the dates it takes effect and expires, the person authorizing the rule, and a justification. This readable, self-documenting set of rules can also answer questions like, "Why did things change since last month?".

Unfortunately, I didn't know much about object-relational mappers when I wrote it, so the program has clunky data-fetching code. I'm currently working on a third version of the program that uses SQLAlchemy; the resulting program is very short. Broadly, here's what it does:

- Queries the row-level and column-level rules from their respective tables

- Fetches a row from the local database (ours) and the corresponding row from the remote database (theirs)

- The heart of the engine:

    data = {'ours': ours, 'theirs': theirs}
    for row_rule in row_rules:
        if not eval(row_rule, data):
            return
    for column_rule in column_rules:
        exec(column_rule, data)

Actually, the data dict also includes definitions of a few functions that some of the rules invoke. The function names are chosen to be self-explanatory to business users. For example:

    def fiscalYear(inDate):
        if inDate.month > 9:
            result = inDate.year + 1
        else:
            result = inDate.year
        return result

    data = {'ours': ours, 'theirs': theirs, 'fiscalYear': fiscalYear}

I suppose it wouldn't be too hard to put the function definitions themselves in among the rules, then include locals() in with data, as long as execution order is controlled (easily done by putting execution_order columns in the rules tables). It hasn't been necessary for my project.
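To make the loop above concrete, here is a minimal, self-contained sketch in present-day Python 3. The apply_rules wrapper, the SimpleNamespace stand-ins for database rows, and the sample rules are all assumptions made for illustration; the real program works on rows fetched from Oracle.

```python
import datetime
from types import SimpleNamespace


def fiscalYear(inDate):
    """Fiscal year rolls over in October, as in the helper above."""
    return inDate.year + 1 if inDate.month > 9 else inDate.year


def apply_rules(ours, theirs, row_rules, column_rules):
    """Return True if every row rule passed and the column rules were run."""
    # One namespace per row; eval() and exec() both read and write it.
    data = {'ours': ours, 'theirs': theirs, 'fiscalYear': fiscalYear}
    for row_rule in row_rules:
        if not eval(row_rule, data):
            return False  # row is excluded; no column rules are applied
    for column_rule in column_rules:
        exec(column_rule, data)
    return True


# Hypothetical sample rows and rules, for illustration only.
ours = SimpleNamespace(funded='Y', value=1500,
                       awarded=datetime.date(2008, 11, 3))
theirs = SimpleNamespace(value=0, fy=None)
row_rules = ['ours.funded == "Y"']
column_rules = [
    'if ours.value > 1000:\n    theirs.value = ours.value',
    'theirs.fy = fiscalYear(ours.awarded)',
]
applied = apply_rules(ours, theirs, row_rules, column_rules)
```

One caution that applies to any design like this: eval() and exec() run whatever code they are handed, so the rules tables should be writable only by people who are trusted to run code in the program's environment.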

- Now row_rules has eval()-able entries like::

    id  start   end     code                authorized  reason
    1   1/1/06  7/1/09  ours.funded == "Y"  Bob         I said so

and column_rules has exec()-able entries like::

    id  start   end  code                          authorized  reason
    1   2/2/08       if (ours.value > 1000):       Steve       to annoy Bob
                         theirs.value = ours.value
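A sketch of how the start and end columns might be used to pick the rules in effect on a given day, and to produce the kind of explanation users ask for. The Rule tuple, the function names, the reading of the dates as month/day/year, and the choice to treat the end date as exclusive are all assumptions, not details of the actual program.

```python
import datetime
from collections import namedtuple

# Hypothetical in-memory stand-in for one row of a rules table.
Rule = namedtuple('Rule', 'id start end code authorized reason')

row_rules = [
    Rule(1, datetime.date(2006, 1, 1), datetime.date(2009, 7, 1),
         'ours.funded == "Y"', 'Bob', 'I said so'),
]


def rules_in_effect(rules, on_date):
    """Rules whose window contains on_date; end=None means no expiry."""
    return [r for r in rules
            if r.start <= on_date and (r.end is None or on_date < r.end)]


def explain(rules, on_date):
    """One human-readable line per rule that applied on on_date."""
    return ['rule %d (%s), authorized by %s: %s'
            % (r.id, r.code, r.authorized, r.reason)
            for r in rules_in_effect(rules, on_date)]
```

Running explain() for two different dates and comparing the results is one way to answer "Why did things change since last month?".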


- Add some logging and a "test-run" capacity which reports on the changes without actually performing them. (It uses sqlalchemy.orm.attributes.get_history() for this; make sure to set autoflush=False if you use this, or intermediate flushes might clear the history.)
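The test-run idea can be illustrated without SQLAlchemy at all. The sketch below is a stand-in, not the get_history() approach mentioned above: it applies the column rules to a throwaway copy of the remote row and reports what would have changed. The dry_run name and the sample rows are hypothetical.

```python
from types import SimpleNamespace


def dry_run(ours, theirs, column_rules):
    """Report {attribute: (old, new)} without modifying `theirs`."""
    before = dict(vars(theirs))
    trial = SimpleNamespace(**before)  # shallow copy the rules may modify
    data = {'ours': ours, 'theirs': trial}
    for column_rule in column_rules:
        exec(column_rule, data)
    return {name: (before.get(name), value)
            for name, value in vars(trial).items()
            if before.get(name) != value}


# Hypothetical sample data, for illustration only.
ours = SimpleNamespace(value=1500)
theirs = SimpleNamespace(value=0, label='unchanged')
changes = dry_run(ours, theirs,
                  ['if ours.value > 1000:\n    theirs.value = ours.value'])
```

Because the copy is shallow, a rule that mutates a list or other container in place would still affect the original row; a real implementation would need to guard against that.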

I suppose I could try writing a sample rules-engine implementation in simple, general terms, that people could crib from for their own "rules engine" applications. I wonder if that would be helpful, or if just the general idea is enough guidance.

Friday, June 19, 2009

Python for Secretaries

As if I need a new ambition, I've got an itch to create and teach a course called "Computer Programming for Secretaries". Since I got into IT via the secretarial pool, I think I'm the perfect one to do it.

To outsiders, programming has this horribly intimidating aura. You've got enterprisey Software Architects trying to sound professional, academic Computer Scientists telling you that you're oversimplifying the problem, fearsome Hacker Gods strutting their skillz. Lots of people want to make sure you know how smart they are, and that nothing could happen without their planet-sized brains.

but programming is not rocket science

If you want to launch satellites into space, you need to invest your life in the field and be part of a large, well-funded institution. Yes, you can have lots of fun with model rocketry, but you're just playing. You're not actually going to get anything into orbit.

programming is more like cooking

My friend James is a professional chef. Everything he makes involves a bunch of French words, ingredients I've never heard of, and turns out eyes-roll-back delicious. I don't have the ambition to invest the time and effort to cook that well... but I can still roast a turkey. That's what programming is like, especially dynamic language programming. With a lot of skill, you can work miracles - but with a little skill, you can work little miracles. You don't need to go in for the whole hog.

That's what I'd like a class to address. There are programming books aimed at kids, but none that I know of aimed at adult business users. There are people who could write themselves small, useful programs, but who will flee in well-justified terror if you start talking about overriding import hooks. There are people spending hours cutting-and-pasting from one file to another because they don't know how to write a six-line script. There are people who could replace some of their daily tedium with just a little dose of Python. There is Resolver One, which is a fantastic way to integrate tiny dashes of Python with everyday spreadsheet work, but it's being used by thousands instead of by millions.

So... yeah. What should such a class include? More importantly, once I'm ready to teach such a class, where do I teach it?

Monday, June 15, 2009

how to tell a geek

Given a choice between spending an hour doing a task manually, or spending three hours writing a program to do it automatically... a geek will write the program, every single time. And, if not given the choice, if explicitly ordered to do the job manually, we'll disobey and write the program anyway. I've heard it said that a good geek is lazy, but I think it's more precise to say that a geek dreads boredom above all else. We'll move mountains to accomplish a task, as long as it's interesting.

This is not nearly as crazy as it sounds, because after we've "finished" a task, without fail, the requestor will return and say, "I know I said that would just be a one-time change, but...", or, "Actually, it turns out we don't need A B C D, we need A B Q D C", or whatever. You will reuse that program, no matter what they say; never throw it away!

Tuesday, June 09, 2009

NCR

Pardon, oh Citizens of the World who read this, while I go regional for a moment and speak as a Dayton-area resident on news almost certainly irrelevant to you.

The big news here last week was NCR's decision to leave Dayton. Basically, three reasons have been given:

  • A larger pool of people to hire from
  • Shorter plane flights for those who fly to Europe
  • A large amount of cash

The funny thing about the first reason is... NCR doesn't hire people. (Their manufacturing plants may hire, but I'm speaking of their HQ here in Dayton). Since I came to the Dayton area, I've had IT friends in NCR and have tried to keep up with them. The news from them has always been the same: "We just went through another round of downsizing. We keep wondering when our turn will come." For a company in continual contraction, the benefit of a larger pool of people to not hire seems... um, not clear.

That leaves shorter plane flights for those who fly to Europe - seems a strange reason to move 1,300 people - and a large amount of cash. Many people think Ohio should have tried to outbid Georgia, but that would have to come at the expense of companies that don't threaten relocation - and it begins to blur the line between "private company" and "state-funded entity", anyway.

Moving itself, of course, gets rid of those employees who choose not to relocate. I predict that most Ohioans who choose not to move with NCR will not be replaced; the company will use the natural contraction in place of one of its periodic downsizings.

Anyway, it's sad for Dayton, since the company had such a history here, but that's pretty much what NCR has been about here for years - history. For decades now, growth for Dayton - as for most cities - hasn't come from big, stable, traditional companies but from small companies, appearing and disappearing quickly as new opportunities appear and change and dry up. It's a less predictable business world, but that's the century we're in. No amount of sighing will prolong the 20th century.

Friday, June 05, 2009

Python Magazine article

I've got an article in this month's Python Magazine: PyOhio: Planning and Running a Regional Python Miniconference. I try to cover some of the stuff we learned in the course of doing the first PyOhio, for the benefit of people considering staging similar conferences of their own. I feel a little silly impersonating an expert on the topic, since I'm near the beginning of a learning process that never ends - but in the open-source world, it's not being the ultimate guru that's important, it's taking the time to share whatever you can.

Python Magazine is a great publication, by the way - with all the good stuff about Python on the net, you might wonder what's the point of buying a magazine, but their articles are very well-chosen and there's a real advantage to being able to read it away from the computer.

Friday, May 29, 2009

Where did you hear about... ?

I've served as PyCon's volunteer publicity chair the past two years. This year, at my request, the attendee survey had the question, "How did you hear about PyCon?"

Thanks to everybody who took the survey and answered that annoying question. Of course, most answers were along the lines of, "Duh, I've always known about PyCon!"

Basically, the answers helped to confirm that community buzz is what brings most people. (My favorite answer: "Birds.") Problem: not everybody is plugged into the buzz. I know lots of programmers who never read blogs or attend groups, and there are lots more that I don't know because they don't do blogs or groups, or otherwise plug themselves into the community.

What I most need is for everybody who didn't hear about PyCon to answer this question: "Why didn't you hear about PyCon? Where would you have seen a PyCon announcement?" The logistics of doing that survey are tricky, however.

How do you think word spreads among geeks, in this day and age?

Friday, May 22, 2009

Wanted: pictorial Field Guide to Nerds

My grandfather was an amazing man. He seemed to know every living soul in Duluth, Minnesota. He never forgot a face.

I didn't inherit that gene. Remembering faces and names is a huge challenge for me. "Very nice to meet you, Mrs. - wait, have we met before? Oh, Mom! I'm sorry." I can spend fifteen minutes in conversation with someone, and five minutes later be unable to bring their face into my mind. It's frustrating and humiliating. I need technological help.

I'd love a website full of labelled and indexed photos of the inhabitants of geekland, something I could brush up on before a conference, or study afterward to cement the new acquaintances into my memory.

Does something like this exist? Failing that, does anybody have some good ideas for how it could be made? At present, my best idea is for some sort of Flickr mashup.

Friday, May 15, 2009

documentation rant

Everybody knows that open-source software documentation sucks.

It doesn't, however, suck nearly as much as big-company proprietary software documentation, which Steve Holden characterizes as

Threep Nardling
To nardle threeps, select the Threep tab and check the Nardling checkbox.

... and so forth... a beautifully-typeset waste of electrons.

The questions we go to docs for are: Can I nardle a threep via SSH? I tried it, but my threep still isn't nardled. I got an "ERROR: Threep nardling failure" message. What now? If those questions aren't addressed, there's really no point.

These days, I do sometimes see open-source docs that address these questions. Most often, of course, you find answers to these questions in the user community.

Anyway, I suggest that, when users run into trouble, this is the preferable order of responses:

1. Change the program so that it acts as the users expected in the first place.
2. Use program interaction and informative error messages to guide users through problems without having to look outside the program.
3. Put the answer in the documentation - and make it easy to find or it doesn't count! Expecting users to comb painstakingly through 400 pages is not realistic.
4. Respond to individual questions via some sort of support process.

Small-to-medium FOSS projects are the best at responding in this order, probably because the same people - or at least people who know each other - are responsible for writing the program, documenting it, and answering user questions. At Big Software Corp., on the other hand, Support, Documentation, and Development are likely to be completely separate groups. The professionalization of documentation is deadly, because you get reams of nearly identical manuals that are pleasantly laid-out and nicely proofread, but completely unaware of realistic user problems. Support is painfully aware of user problems, but they have no process to tell Documentation and Development, "Hey, users keep getting confused here. Can we change the program and the docs to help them? Please? Because we'd really like to quit getting these questions?" (Perhaps some companies do have processes; I can only speculate, but I do know that I don't see evidence of it.)

Oracle, incidentally, only about halfway sucks in these matters, which is pretty good for a company so large. Their docs generally have some good, non-obvious substance, and sometimes - but, alas, only sometimes - troubleshooting information.

In short, I think you need users' pain to efficiently and accurately become developers' and documenters' pain, so that the causes will get fixed. Small FOSS projects have this kind of pain transfer built-in (when the authors publish their email addresses and invite questions). Everybody else should think about how to make it happen for their product.

Tuesday, May 12, 2009

managing installation requirements by Python version

I always feel guilty using this blog as my LazyWeb, but it works really well, so here we go again...

I'm trying out Oracle Enterprise Linux 5 - basically a clone of RHEL - and trying to get sqlpython installed on it. OEL5 comes with Python 2.4; sqlpython 1.6.5.1 needs Python 2.5. I could take out the 2.5 dependency in sqlpython, but it uses pyparsing. Pyparsing dropped 2.4 compatibility as of 1.5.2.

[EDIT: It turns out that pyparsing 1.5.2 still works on Python 2.4. It emits a scary error message during easy_install under Python 2.4 -
except ParseException as err:
^
SyntaxError: invalid syntax

but it installs successfully nonetheless. Thanks to Paul McGuire for pointing that out. So my job in this case is simply to release a 2.4-compatible sqlpython 1.6.5.2.]

I could change install_requires=['pyparsing>=1.5.1'] to install_requires=['pyparsing==1.5.1'] in setup.py, but that locks everybody into an obsolete pyparsing whether they need it or not.

I'd like to install_requires=(['pyparsing>=1.5.1'] if python.version >= '2.5' else ['pyparsing==1.5.1']), but that's absolutely imaginary syntax.
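That exact expression is imaginary, but setup.py is ordinary Python, so something close to it can be computed before setup() is called. A minimal sketch, assuming the package is installed from source so that setup.py actually runs on the target machine; the pyparsing_requirement helper is a hypothetical name, and the version cutoff simply mirrors the one described above.

```python
import sys


def pyparsing_requirement(version_info=sys.version_info):
    """Pick the pyparsing requirement for the interpreter running setup.py."""
    # Compare (major, minor) as a tuple rather than as a string.
    if tuple(version_info[:2]) >= (2, 5):
        return ['pyparsing>=1.5.1']
    return ['pyparsing==1.5.1']


# In setup.py the result would be passed along roughly like this:
#     setup(name='sqlpython', install_requires=pyparsing_requirement(), ...)
install_requires = pyparsing_requirement()
```

Comparing version tuples avoids the trap in the string form, where '2.10' would sort before '2.5'.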

Obviously, I could just download a newer Python and install it on my machine, but I'm trying to make sqlpython usable for DBAs who don't have that kind of authority, or who fear to muck around with their environment that way.

I wonder what the smart people do in situations like this?

Thursday, April 30, 2009

SpeakerRate

I've created a page at speakerrate.com for myself:

SpeakerRate profile

... and populated it with my two upcoming talks.

If you want to use SpeakerRate to give feedback to a speaker, you don't actually have to wait for them to create their own profile - you can enter it yourself.

I don't know if SpeakerRate will grow into the ultimate connecting-to-speakers webtool, but it might. I do know that they support microformats, which is a big plus in my book.

Hope to see some of you at Penguicon (this weekend) or IOUG Collaborate (next week)!

Wednesday, April 29, 2009

right to complain

(prompted by discussion of porn use at GoGaRuCo - see here, here)

Quick thought: It's not that the community needs to ensure offensive content never happens, or that the community needs to find a single standard of what is appropriate.

The key is the right to complain safely. When complaints are predictably met with accusations of "overreacting", "political correctness", and "intolerance", the resulting message is: Be like us, be silent, or leave.

If you reject the criticism, then try something like, "I think you're wrong, but I accept your right to complain." Complaint is feedback, it's a legitimate part of a community's communication.

(Let me clarify that I've had mostly really good experiences in the software communities I participate in!)

Thursday, April 23, 2009

the expanding reStructuredTextiverse

It seems I'm always coming across new uses for reStructuredText, the plaintext format that goes everywhere. (Really, more of a "set of plaintext conventions" than a format as such.) I'm beginning to imagine a talk reviewing them all for next Fall's Ohio LinuxFest, or maybe a magazine article.

The places you can go with reStructuredText - am I missing any? (I haven't checked these all for viability)

Not everything has been done yet, however. Here are a couple projects yet undone - so far as I know. Comment with your own ideas... or take this as a challenge and implement something!
  • rst2word - this would really be the holy grail, for communicating to the unwashed masses. (We in the know can use rst2odt and convert within OpenOffice, but rst2word would get my boss on board.)
  • Fuse more templating engines to rst (perhaps not a good idea; would it violate the readability principle?)
  • ReST lexer for Scintilla - this would allow ReST support in WingIDE, too.
[EDIT: Thanks Michael Foord for info about rest2web, rst2pdf.]

Tuesday, April 21, 2009

chocolate

We interrupt this blog with a special message from our sponsor.

My friend James has a new business, Two Bears Chocolates, handmaking organic chocolates. They are crazy-good... I've never been a food snob, but James's chocolates have spoiled me to the point where even "gourmet" mass-produced chocolates seem waxy and bland by comparison. They're a diet aid! I have to eat these chocolates so that I won't be tempted by lesser chocolates!

Anyway, they're expensive, but worth it, and a good gift (see: Mother's Day)... or a good way to maximize your yummy-per-calorie ratio.

He's particularly gung-ho to do custom orders. Go ahead, ask him to make a gooseberry-chocolate truffle in the shape of the state of Michigan. He'll love you for the challenge.

So, as James says, "If you love someone, send Two Bears chocolates. If not, bummer."

Monday, April 20, 2009

calm down

Yes, yes. Oracle is buying Sun, which owns MySQL and Java. No, this is not the end of MySQL. You're being silly.

Oracle is about as open-source friendly as a huge proprietary software company can be, and has been since before it was cool. Oracle adores Linux, and has been pushing it vigorously since about, hm, 2002? Oracle has been Java-crazy since that time, too. Oracle's marketing strategy has long been against lock-in - it wants to plug easily into a thriving open-standards economy, not to enclose and lock a walled garden. It's also been very easygoing about licensing, eager to see casual, non-paying users gaining familiarity with its products, knowing that those are the seeds that later big-money sales will come from. It doesn't try to catch and squeeze little fish, it feeds them fish food and waits for them to grow into whales. In short, you over there, slapping MySQL on your Linux box for your brother's home business? Oracle doesn't want to shut you down. Oracle loves you, has always loved you, and wants your love and trust for when you get big.

In fact, if anything, I'm a little disappointed that Oracle's (superb) marketing power, name recognition, and corporate respect will all benefit MySQL and Java... which is all fine and good, except that I'd rather see that gust of wind behind PostgreSQL and Python. (OTN's PyCon sponsorship warmed my heart, to be sure, but I wish there was a way to make ORACLE + PYTHON stop-the-presses news all around techland.)

Monday, April 13, 2009

#amazonfail

If you haven't heard about #amazonfail, this article will catch you up quickly.

1. It's a good reminder that giving market dominance to one company in a crucial role in steering our culture is probably unwise. Let's not have a market where a single company can relegate books to obscurity, intentionally or not. Patronize a variety of booksellers.

2. A proper apology would look something like, "Mr. Doofus Middle Manager didn't think through what he was doing, and company management failed to supervise it properly. We feel humiliated and commit to new efforts to keep book culture diverse and uncensored." Amazon's lame, mealy-mouthed "glitch" non-apology suggests that it has fallen prey to Big Corporatosis. Unless, of course, Amazon really has discovered a homophobic computer glitch, in which case this is huge news for artificial intelligence researchers.

3. This is not the only case where the label "adult content" has had a strangling effect. Actions taken under the justification of "adult content" should never be blandly accepted, but should be carefully examined for accuracy, necessity, and bias.

[EDIT: A final complaint: Labelling a bad policy, sloppily implemented and poorly supervised, as a "glitch", is blaming the company's technologists for a mistake of its management. It says that Amazon's management does not trust, understand, or respect its technologists. In a technology company, this is a strong signal of decline.]

Friday, April 10, 2009

Penguicon 7.0



If you're anywhere near Michigan, you need to consider Penguicon. It's an open-source software conference! It's a science-fiction con! It's two great tastes that really do taste great together. There's always a great deal of excellent technical content, and the SF people lend a really healthy sense of relaxation and creativity to the whole thing. Where else can you learn CSS and belly dancing in one weekend?

I'm giving a talk at Penguicon this year: "sqlpython: SQL is fun again". It's sort of a preview of my upcoming SQL*Plus Alternatives talk at IOUG Collaborate... but without the stuff about Oracle-only tools, and with more focus on sqlpython's rapidly developing cross-RDBMS powers, and a healthy plug to pull more people into the project.

In fact, I'm leaving directly from Penguicon to Collaborate. Yikes! It'll be a fun week.

Thursday, April 09, 2009

PyOhio: Call for Proposals

The PyOhio Call for Proposals has been issued!

PyOhio 2009 takes place July 25-26, 2009 at the Ohio State University in Columbus, Ohio. Much like a mini-PyCon, it includes scheduled talks, tutorials, Lightning Talks, Open Spaces, and room for your own unique ideas. If you can make it to Ohio this summer, please consider participating.



PyOhio 2009, the second annual Python programming mini-conference for Ohio and surrounding areas, will take place Saturday-Sunday, July 25-26, 2009 at the Ohio State University in Columbus, Ohio. A variety of activities are planned, including tutorials, scheduled talks, Lightning Talks, and Open Spaces.

PyOhio invites all interested people to submit proposals for scheduled talks and tutorials. PyOhio will accept abstracts on any topics of interest to Python programmers.

Standard presentations are expected to last 40 minutes with a 10 minute question-and-answer period. Other talk formats will also be considered, however; please indicate your preferred format in your proposal. Hands-on tutorial sessions are also welcomed. Tutorial instructors should indicate the expected length.

PyOhio is especially interested in hosting a Beginners' Track for those new to Python or new to programming in general. If your proposal would be suitable for inclusion in the Beginners' Track, please indicate so. Organizers will work with speakers and instructors in the Beginners' Track to help them coordinate their talks/tutorials into a smooth, coherent learning curve for new Python users.

All proposals should include abstracts no longer than 500 words in length. Abstracts must include the title, summary of the presentation, the expertise level targeted, and a brief description of the area of Python programming it relates to.

All proposals should be emailed to cfp@pyohio.org for review. Please submit proposals by May 15, 2009. Accepted speakers will be notified by June 1.

You can read more about the conference at http://pyohio.org

If you have questions about proposals, please email cfp@pyohio.org. You can also contact the PyOhio organizers at pyohio-organizers@python.org.

Monday, April 06, 2009

geekspeakr.com

geekspeakr.com: connecting tech women speakers with event organizers
Many organisers of technical conferences, meetups, and dinners want to have more gender-balance in their lineups, but they don't know where to find technical women speakers.

Enter geekspeakr.com, a simple directory and connections system to help technical women speakers and event organisers to find each other.
I'm really glad to see this. I'm even more glad to see, browsing through speakers, that they really are technical women. I've... um, there's no good way to say this... seen other "tech women" groups that quickly became dominated by women networking for their multilevel marketing careers. It's pretty understandable, since they have a much more obvious need to network than us geeks - but, you know, it's really not the purpose.

Anyway. geekspeakr's are the real deal. w00t! Need more Python and Oracle speakers there, though.

Note to self: just as soon as the PyOhio CFP is out (very soon), do not neglect to spam Pythonistas on geekspeakr! At least, the ones vaguely near Ohio.

Monday, March 30, 2009

Five minutes at PyCon change everything

I gave a five-minute Lightning Talk on sqlpython on Saturday. I hoped it would pique the interest of some people who sometimes use Oracle, and give them a neat example of yet another cool thing being done with Python. It certainly did that, and I got lots of gratifying feedback.

I knew people would ask when it would be available for non-Oracle databases, so I said, tongue-in-cheek, that this was my distant-future ambition for "sqlpython 3000". What I didn't expect was that several of the people buttonholing me over the next two days would ask to collaborate to get multi-RDBMS support in place. Help? Uh, yeah... I guess help would help... I honestly hadn't even been thinking about that...

Brian Dorsey in particular wanted to see the code face-to-face with me, so I put a card on the Open Space board, just in case anybody else wanted to show up, and twittered about it one bare hour in advance.

Nine people came, all of them eager to get going on writing code, bringing great ideas to get started. All this for a project that was basically personal 36 hours before.

Noooo, now other people are going to be exposed to my squiggly code! Now I know what embarrassment-driven development really is.

If I'd had $1 million of startup funding to hire a staff to work on sqlpython, I couldn't have gotten a team that large or that talented. I figure that gives me better than a 1000-to-1 return on my PyCon investment. :)

So anyway, I'm setting up a mailing list for cooperation on sqlpython, and it looks like the far-future dream of multi-RDBMS sqlpython has suddenly become imminent. Stay tuned!

Saturday, March 28, 2009

sqlpython lightning talk follow-up

This morning's lightning talk on sqlpython was an example of what's so great about PyCon - it instantly produced really good suggestions about how to go onward with sqlpython development, and offers of collaboration. Awesome!

Except that I forgot to show the instant graphs using \b / \l terminators. Rats! That's the most eye-catching part!

I wasn't prepared for the common question, though: "Where's the repository?" Well, it's here:
https://www.assembla.com/wiki/show/sqlpython It is crazy-unstable, and if you're actually trying to use it, use the PyPI version instead.

I mean to get a link to that into the docs as soon as possible, but I don't have Sphinx configured right on this machine, so that may have to wait.

Quick review of the lightning talk:

* Unix-like powers: cat, ls, grep, >, |
* Python interactive session; access to resultsets (`r`) and bind variables (`binds`)
* Special output formats with alternate terminators (see `help terminators`)
* The magic that makes sqlpython work:
  - cmd
  - cmd2
  - pyparsing
  - code (for the embedded Python interpreter)
  - cx_Oracle

For further reference, see the sqlpython docs, particularly the comparative review of sqlpython vs. SQL*Plus vs. gqlplus vs. Senora vs. YASQL.