Jupyter notebook indicators platform: Baby step #1 – Finding an ORM

Warning: WP_Syntax::substituteToken(): Argument #1 ($match) must be passed by reference, value given in /home/djones/public_html/blog/wp-content/plugins/wp-syntax/wp-syntax.php on line 383

Warning: WP_Syntax::substituteToken(): Argument #1 ($match) must be passed by reference, value given in /home/djones/public_html/blog/wp-content/plugins/wp-syntax/wp-syntax.php on line 383

Warning: WP_Syntax::substituteToken(): Argument #1 ($match) must be passed by reference, value given in /home/djones/public_html/blog/wp-content/plugins/wp-syntax/wp-syntax.php on line 383

The last post documented early explorations of Jupyter notebooks ending with a simple query of a Moodle database. This post takes the first baby steps toward some sort of indicators platform using Jupyter notebooks, Python and github. The focus here is to find some form of ORM or other form of database independent layer.

Problem: the code from the last post was specific to Postgresql. If you’re Moodle database is based on another database that code won’t work. The aim here is to enable some level of sharing of code/analysis/indicators. This means needed a way to keep the code independent of database specifics. This is where object relational mappers (ORMS) enter the picture. See this for an argument why this is a good idea.

Pandas and SQLAlchemy

This book offers some examples using sqlalchemy with pandas. A likely combination. This sqlalchemy cheatsheet offers some useful examples.

Need to install sqlalchemy, it appears. Actually just updated

conda install -c anaconda sqlalchemy

Oh dear, time wasted. Needed to restart the notebook server after that.

Process is

Modify the config stuff to create an sqlalchemy engine.
Read the data

Ends up with the following code

import json
 
# Put the config outside the Indicators directory, keep it out of git
with open('../config.json') as f:
    conf = json.load(f)
 
from sqlalchemy.engine.url import URL 
from sqlalchemy import create_engine
 
engine = create_engine(URL(**conf))
 
df = pd.read_sql('select id,username,firstname,lastname ' +
                 'from moodle.mdl_user where id<100 ',engine)
df

The config.json file looks something like the following. The current plan is that this sits above this directory, as this directory and its contents will eventually end up in github

{
  "drivername": "postgresql",
  "database": "someDbaseName",
  "username": "user", 
  "host": "localhost",
  "port": "5432",
  "password": "myPassword"
}

What’s next?

That works and seems a reasonable. Some ideas for the next step

Figure out how to remove/handle the moodle schema that’s in the SQL above, not to mention the mdl_ prefix on the table.
Linked to allowing the code to be run across different institution’s easily.
Move the config code into a library for this work?
Set up the github repository, get this code shared and start working with @beerc on this.
Experiment with how the assumptions built into the Perl code I was using can be translated appropriately into this environment.
How to develop the class hierarchy (or if) using sqlalchemy.

How the perl idioms translate into python, sqlalchemy and pandas. Pandas has some nice features that might eliminate some practices.

John on Pondering if and how Hax & web components fit in BlackboardDecember 29, 2023
There is some good documentation on this particular CDN here: https://webcomponents.psu.edu/?path=/story/about-getting-started--using-penn-state-cdn
admin on You only get this type of education in class – mythic attributes of the lectureAugust 3, 2023
Has been interesting to revisit this post. First to cringe a touch at the writing (naive use of the "luddite"…
David Snowden on You only get this type of education in class – mythic attributes of the lectureJuly 28, 2023
I've got some sympathy with the academic you have labeled as a Luddite here. I think if the lecture is…
How do we design the need for design? – Col's Weblog on Concrete LoungeJuly 11, 2023
[…] needs (Allen & Varga, 2006). Because our IT systems tend not to evolve (they become concrete lounges), the reality…
Networked Learning – Radical Logos on Exploring connected versus/and networked learningJuly 10, 2023
[…] Also: Exploring connected versus/and networked learning […]

Assembling the heterogeneous elements for (digital) learning

Jupyter notebook indicators platform: Baby step #1 – Finding an ORM

Pandas and SQLAlchemy

What’s next?

Leave a Reply Cancel reply

Jupyter notebook indicators platform: Baby step #1 – Finding an ORM

Pandas and SQLAlchemy

What’s next?

Share this:

Playing with Python and Jupyter notebooks for analytics

Sharing “indicators platform” via github

Leave a Reply Cancel reply