After getting %sql magic for IPython working, my next big goal was to figure out how to get those results into Pandas.
Er, OK, not such a big goal. Even with zero Pandas experience, it took about five minutes of skimming the first page of documentation to figure out:
In [1]: %load_ext sql
In [2]: data = %sql postgresql://will:longliveliz@localhost/shakes select * from work
In [3]: import pandas as pd
In [4]: s = pd.DataFrame.from_records(data, columns=data.keys)
This is not the only way to move data from an RDBMS to Pandas (there's pandas.io.sql, for example), and I don't know enough about Pandas to know if it's the best way. But I bet it's the easiest way.
6 comments:
https://bitbucket.org/zzzeek/calchipan/
if this magic uses sqlalchemy as a backend, then probably you should also check out this:
https://bitbucket.org/zzzeek/calchipan/
It does use SQLAlchemy (it's basically a pipeline to SQLAlchemy's executing sql.text), but Calchipan is significantly different; Calchipan works off SQLAlchemy expression constructs, not off actual expression strings. So which to use depends on whether you like raw SQL or SQLAlchemy expression constructs better. I plan to blog on the differences soon...
this is also the opposite of calchipan, this is relational data *into* pandas, calchipan and pandasql are about getting pandas data *out* into a relational API.
Thank you for excellent tool. But how to works with multiline queries and pandas:
result = %%sql WITH summary AS (
SELECT p.city_id,
p.hotel_id,
count(*) as cnt,
ROW_NUMBER() OVER(PARTITION BY p.city_id
ORDER BY count(*) DESC) AS rk
FROM booking p
where
group by
p.city_id, p.hotel_id)
SELECT s.*
FROM summary s
WHERE s.rk < 10;
doesn't work. The same thing with %sql
Hi, Roman. As far as I can tell, it's a fundamental limitation of multiline %%-style magics that you can't use them in assignment statements like that. The best you can do is run your %%sql cell by itself, and then in the next cell, use the `_` IPython history variable (which contains the last result) to assign.
So, in one cell,
%%sql WITH summary AS (
SELECT p.city_id,
<>
and in the next cell,
result = _
See http://ipython.org/ipython-doc/stable/interactive/tutorial.html#history
Post a Comment