Friday, March 29, 2013

%sql to Pandas

After getting %sql magic for IPython working, my next big goal was to figure out how to get those results into Pandas.

Er, OK, not such a big goal. Even with zero Pandas experience, it took about five minutes of skimming the first page of documentation to figure out:

In [1]: %load_ext sql

In [2]: data = %sql postgresql://will:longliveliz@localhost/shakes select * from work

In [3]: import pandas as pd

In [4]: s = pd.DataFrame.from_records(data, columns=data.keys)

This is not the only way to move data from an RDBMS to Pandas (there's pandas.io.sql, for example), and I don't know enough about Pandas to know if it's the best way. But I bet it's the easiest way.
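
For anyone without the %sql magic handy, here is a rough sketch of the same idea against a plain DB-API cursor. It is not the exact setup above. It assumes an in-memory SQLite database standing in for my PostgreSQL one, and the table columns and rows are made up for illustration. The last line uses pd.read_sql_query, which is how current pandas exposes the pandas.io.sql route.

```python
import sqlite3

import pandas as pd

# Hypothetical stand-in for the Shakespeare database: an in-memory SQLite
# table with made-up columns and rows (not the real opensourceshakespeare schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE work (workid TEXT, title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO work VALUES (?, ?, ?)",
    [("hamlet", "Hamlet", 1600), ("macbeth", "Macbeth", 1605)],
)

# Run the query through a plain DB-API cursor.
cursor = conn.execute("SELECT * FROM work ORDER BY year")
columns = [col[0] for col in cursor.description]  # column names from the cursor
rows = cursor.fetchall()                          # list of tuples

# Same from_records call as in the session above, fed by hand.
df = pd.DataFrame.from_records(rows, columns=columns)

# Alternative: let pandas run the query and build the frame itself.
df2 = pd.read_sql_query("SELECT * FROM work ORDER BY year", conn)
```

Either way you end up with a DataFrame whose columns are named after the query's columns. The %sql version just saves you the cursor bookkeeping.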

released: %sql magic for IPython

Inspired and informed by discussions with the IPython developers at PyCon 2013, I've released ipython-sql, a %sql magic for IPython.

With this, I really think the IPython Notebook will become the most amazing database tool ever. In fact, virtually every computing problem will become a lot more workable when manipulated via the IPython Notebook - you can remember, inspect, and annotate all the steps to investigate whatever your issue is. All hail the Notebook! The FSF really chose well in choosing Fernando Perez for their 2012 award.

Friday, March 22, 2013

The Canadian menace

Incidentally, some people have been asking, "Wait a minute - PyCon-US in Canada? How does that work? Wouldn't it be more correct to call it PyCon-NA for North America?"

It might, if this were a case of nations cooperating to share PyCon. However, that is not the case. You will notice that the Canadians haven't cancelled their own PyCon. Rather, they have seized PyCon-US by ruthless volunteerism and are even now dragging it off to their stronghold on the St. Lawrence, to hoard it along with the PyCon they already have. That's right, they want ALL THE PYCONS.

Pythonistas of the world, be warned! When friendly faces from the North arrive to lend a hand, watch them carefully! Or you'll soon be flying to Vancouver for PyAr and Winnipeg for EuroPython.

Just to clarify: I am giddy over having PyCon in Montreal. I'm so excited that they'll probably need to name a Montreal Syndrome to go along with Jerusalem Syndrome and Paris Syndrome.

post-PyCon post

You might be sick of me saying after each PyCon, "That was the best PyCon ever!", but it's not my fault if it's true.

I hardly know where to start summing up the highlights...

  • PyPGDay was a great addition! I've had virtually no exposure to the PostgreSQL community before, so this was very valuable to me. Evan Klitzke from Uber gave a talk on migrating to PG from MySQL that is going to save me a ton of time, and Jeff Davis from Aster showed off the huge usefulness of range types.
  • Naomi had to talk me into the Education Summit, but I'm glad she did - I got a lot of great ideas and inspiration that will help in teaching future workshops.
  • One of these ideas was the use of Matt Davis' fantastic ipythonblocks, which will let us do graphical exercises right within the IPython Notebook - an amazing, intuitive, seamless learning experience.
  • Speaking of the IPython Notebook, this PyCon was really its tour de force. Seemed like everyone was using it to do and show just amazing things. If you haven't seen some talks on it yet, go watch a bunch of videos immediately and then watch a bunch more when the PyCon 2013 talks are online - the docs alone can't do justice to the possibilities the Notebook creates. We've only begun to take advantage of this fantastic environment.
  • I mentioned my ambition to create an IPython-based SQL client to Fernando Perez from the IPython team, and he jumped to show me what I needed to know. The day after coming home, I checked in a %%sql magic. It's not ready for prime time yet (or even PyPI) and it may warrant merging into a similar project, but it was a delight to play off the capability of the Notebook.
  • Peter Wang and Travis Oliphant showed me - personally! - Wakari, an amazingly powerful online hosted Python environment. I can't wait to play.
  • I'm not going to list all the people I loved seeing and catching up with, because it wouldn't mean much to most blog readers. But the fact is it would be a lonely year if I couldn't see my PyCon friends (and make some new ones).
  • If GitHub had an AI, it would be looking at me funny and asking, "What's got into you lately?" PyCon, that's what. And I'm nowhere near done.

Thank you to the fantastic bunch of volunteers who make this such an amazing conference, and to all the participants who bring their ideas and their friendship.

Saturday, March 16, 2013

HTSQL lightning talk slides

I posted the slideshow from my PyPGDay HTSQL lightning talk here. Thanks to everybody involved with PyPGDay, I loved it!

Saturday, March 09, 2013

Forsooth, a dataset

Do you ever want a demo or sample dataset that doesn't bore you to death? How about one steaming with sex, murder, and mayhem?

I'll be giving a lightning talk on HTSQL at PyPGDay this Wednesday, and wanted to show it off with some data worthy of its awesomeness. How about Shakespeare? Yeah! Luckily, Open Source Shakespeare has published a database of all Will's works. Unluckily, they've only published it as flat text files and as a Microsoft Access database. ("Open Source" Shakespeare? In a closed-source database? And a horrible one at that? Yeah, I know.)

So I fixed that: opensourceshakespeare on GitHub is a port of the opensourceshakespeare.org data to the RDBMS the Bard himself would have used, PostgreSQL. Porting further to MySQL and SQLite is left as an exercise for the reader (for now; maybe I'll add those after PyCon).

Enjoy! And if you're ever in Cincinnati, you have GOT to see the Cincinnati Shakespeare Company. They're like... they're like actually eating the food, when all you've done before is look at the recipes.