Install Python. Quarto Render All the Things

PyData NYC 2022

Daniel Chen

Hello 👋

Munsee Lenape

Daniel Chen

@chendaniely

Daniel Chen

Python + R

Quarto

What is Quarto?

Who Would Use Quarto?

  • Data Scientist
    • Jupyter Notebooks
      • Analysis
      • Reports + Documentation
  • Academic
    • Papers
  • Technical Writer
    • Blog
    • Website
    • Presentation
    • Book

Doesn’t Jupyter do that?

Julia + Python + R

Let’s talk about Jupyter Notebooks…

Daniel’s List

  • Technical Writing
    • ✅ Literate programming
    • ❌ Editing JSON
  • Data Science
    • More an output format than a source document
    • ✅ Great for posting code+output (e.g. a workshop)
    • ❌ Not great as a source-controlled, collaborative document
  • Teaching
    • ✅ nbgrader for course assignment creation + grading
    • ✅ Restart Kernel > Run All

Quarto vs Jupyter

Quarto ➡️ Jupyter

fast.ai

Let’s do an analysis

Load - EDA - Plot - Model

from palmerpenguins import load_penguins

penguins = load_penguins()
penguins.head()
   species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex  year
0   Adelie  Torgersen            39.1           18.7              181.0       3750.0    male  2007
1   Adelie  Torgersen            39.5           17.4              186.0       3800.0  female  2007
2   Adelie  Torgersen            40.3           18.0              195.0       3250.0  female  2007
3   Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN  2007
4   Adelie  Torgersen            36.7           19.3              193.0       3450.0  female  2007

Load - EDA - Plot - Model

import pandas as pd

penguins.describe()
       bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g         year
count      342.000000     342.000000         342.000000   342.000000   344.000000
mean        43.921930      17.151170         200.915205  4201.754386  2008.029070
std          5.459584       1.974793          14.061714   801.954536     0.818356
min         32.100000      13.100000         172.000000  2700.000000  2007.000000
25%         39.225000      15.600000         190.000000  3550.000000  2007.000000
50%         44.450000      17.300000         197.000000  4050.000000  2008.000000
75%         48.500000      18.700000         213.000000  4750.000000  2009.000000
max         59.600000      21.500000         231.000000  6300.000000  2009.000000

Load - EDA - Plot - Model

from plotnine import ggplot, aes, geom_boxplot, theme_xkcd

(
  ggplot(
    data=penguins,
    mapping=aes(x="sex", y="body_mass_g", color="species")
  )
  + geom_boxplot()
  + theme_xkcd()
)
<ggplot: (327066350)>

Load - EDA - Plot - Model

import statsmodels.formula.api as smf

penguins["sex_01"] = penguins.sex.replace({"male": 1, "female": 0})
pen_no_na = penguins.dropna()

log_reg = smf.logit("sex_01 ~ body_mass_g", data=pen_no_na).fit()
log_reg.params
Optimization terminated successfully.
         Current function value: 0.595563
         Iterations 5
Intercept     -5.162542
body_mass_g    0.001240
dtype: float64
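
To read those coefficients, here is a minimal sketch (plain Python, not from the original slides) that turns them into a predicted probability with the logistic function. The two coefficient values are copied from the output above. The helper name `prob_male` and the 4200 g example are illustrative assumptions.

```python
import math

# Coefficients copied from the log_reg.params output above.
INTERCEPT = -5.162542
BODY_MASS_COEF = 0.001240


def prob_male(body_mass_g: float) -> float:
    """Predicted probability that sex_01 == 1 (male) for a given body mass."""
    log_odds = INTERCEPT + BODY_MASS_COEF * body_mass_g
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical example: a penguin close to the mean body mass in the data.
print(round(prob_male(4200), 3))

# Odds multiplier for each extra 100 g of body mass.
print(round(math.exp(BODY_MASS_COEF * 100), 3))
```

With a positive slope, heavier penguins get a higher predicted probability of being male, and a body mass near the sample mean lands close to 50%.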

Model Ops

Isabel Zimmerman
Holistic MLOps for
Better Science

Let’s go make things

Create .qmd or .ipynb

The YAML header

---
format: html
---

---
format: html
title: Quarto Document
subtitle: Data Science!
author: Daniel Chen
toc: true
toc-depth: 3
code-overflow: scroll
code-line-numbers: true
execute: 
  echo: true
keep-md: true
keep-ipynb: true
jupyter: python3
---

The .qmd code chunk

```{python}
#| echo: true
#| eval: true

from palmerpenguins import load_penguins

penguins = load_penguins()
penguins.head()
```
  • Markdown
  • Engine: {python}, {r}, etc
    • Knitr for {r}
    • Jupyter for any other block {python}, {julia}, etc
  • Chunk options: #|
  • Code
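
To make the anatomy above concrete, here is a rough sketch that splits one chunk into its engine, its `#|` options, and its code. It is not how Quarto itself parses documents; `parse_chunk` is a hypothetical helper, and option values are kept as raw strings rather than parsed as YAML.

```python
import re

# Build the fence from single backticks so this example can sit inside a fence.
FENCE = "`" * 3

CHUNK = "\n".join([
    FENCE + "{python}",
    "#| echo: true",
    "#| eval: true",
    "",
    "from palmerpenguins import load_penguins",
    "penguins = load_penguins()",
    FENCE,
])


def parse_chunk(text: str) -> dict:
    """Split one fenced chunk into engine, #| options, and code."""
    lines = text.splitlines()
    header = re.fullmatch(r"`{3}\{(\w+)\}", lines[0].strip())
    if header is None or lines[-1].strip() != FENCE:
        raise ValueError("not a fenced executable chunk")
    options, code = {}, []
    for line in lines[1:-1]:
        # Option lines only count at the top of the chunk, before any code.
        if line.startswith("#|") and not code:
            key, _, value = line[2:].partition(":")
            options[key.strip()] = value.strip()
        else:
            code.append(line)
    return {
        "engine": header.group(1),
        "options": options,
        "code": "\n".join(code).strip(),
    }


parsed = parse_chunk(CHUNK)
print(parsed["engine"], parsed["options"])
```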

Make the document!

% quarto render examples/01-quarto.qmd

Executing '01-quarto.ipynb'
  Cell 1/4...Done
  Cell 2/4...Done
  Cell 3/4...Done
  Cell 4/4...Done

pandoc 
  to: html
  output-file: 01-quarto.html
  standalone: true
  section-divs: true
  html-math-method: mathjax
  wrap: none
  default-image-extension: png
  toc: true
  
metadata
  document-css: false
  link-citations: true
  date-format: long
  lang: en
  title: Quarto Document
  subtitle: Data Science!
  author: Daniel Chen
  toc-deph: 3
  jupyter: python3
  
Output created: 01-quarto.html

Works on your existing Jupyter notebook

% quarto render examples/02-jupyter.ipynb --execute

Starting python3 kernel...Done

Executing '02-jupyter.ipynb'
  Cell 1/5...Done
  Cell 2/5...Done
  Cell 3/5...Done
  Cell 4/5...Done
  Cell 5/5...Done

pandoc 
  to: html
  output-file: 02-jupyter.html
  standalone: true
  section-divs: true
  html-math-method: mathjax
  wrap: none
  default-image-extension: png
  
metadata
  document-css: false
  link-citations: true
  date-format: long
  lang: en
  
Output created: 02-jupyter.html
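
The two render commands above can also be driven from a script. This is a small sketch, assuming the `quarto` CLI is on your PATH. `build_render_command` and `render` are hypothetical helper names, and only the `--to` and `--execute` flags are used.

```python
import shutil
import subprocess


def build_render_command(path, to=None, execute=False):
    """Assemble a `quarto render` command as an argument list."""
    cmd = ["quarto", "render", path]
    if to is not None:
        cmd += ["--to", to]
    if execute:
        # Re-run the notebook cells instead of reusing stored outputs.
        cmd.append("--execute")
    return cmd


def render(path, **kwargs):
    """Run the command, failing early if quarto is not installed."""
    if shutil.which("quarto") is None:
        raise RuntimeError("quarto CLI not found on PATH")
    return subprocess.run(build_render_command(path, **kwargs), check=True)


print(build_render_command("examples/02-jupyter.ipynb", execute=True))
```

Building the command as a list (rather than a shell string) avoids quoting problems with file names that contain spaces.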

Profit!

Profit More!

This is a Shinylive application embedded in a Quarto doc.

#| standalone: true

from shiny import App, render, ui

app_ui = ui.page_fluid(
    ui.input_slider("n", "N", 0, 100, 40),
    ui.output_text_verbatim("txt"),
)

def server(input, output, session):
    @output
    @render.text
    def txt():
        return f"The value of n*2 is {input.n() * 2}"

app = App(app_ui, server)

Plots make people go WOO

#| standalone: true
#| viewerHeight: 420

from shiny import App, render, ui
import numpy as np
import matplotlib.pyplot as plt

app_ui = ui.page_fluid(
    ui.layout_sidebar(
        ui.panel_sidebar(
            ui.input_slider("period", "Period", 0.5, 2, 1, step=0.5),
            ui.input_slider("amplitude", "Amplitude", 0, 2, 1, step=0.25),
            ui.input_slider("shift", "Phase shift", 0, 2, 0, step=0.1),
        ),
        ui.panel_main(
            ui.output_plot("plot"),
        ),
    ),
)


def server(input, output, session):
    @output
    @render.plot(alt="Sine function")
    def plot():
        t = np.arange(0.0, 4.0, 0.01)
        s = input.amplitude() * np.sin(
            (2 * np.pi / input.period()) * (t - input.shift() / 2)
        )
        fig, ax = plt.subplots()
        ax.set_ylim([-2, 2])
        ax.plot(t, s)
        ax.grid()


app = App(app_ui, server)

Maps?

#| standalone: true
#| viewerHeight: 420

from htmltools import css
from shiny import App, reactive, render, ui
from shinywidgets import output_widget, reactive_read, register_widget

import ipyleaflet as L

app_ui = ui.page_fluid(
    ui.div(
        ui.input_slider("zoom", "Map zoom level", value=12, min=1, max=18),
        ui.output_ui("map_bounds"),
        style=css(
            display="flex", justify_content="center", align_items="center", gap="2rem"
        ),
    ),
    output_widget("map"),
)


def server(input, output, session):
    # Initialize and display when the session starts (1)
    map = L.Map(center=(40.758896, -73.985130), zoom=12, scroll_wheel_zoom=True)
    # Add a distance scale
    map.add_control(L.leaflet.ScaleControl(position="bottomleft"))
    register_widget("map", map)

    # When the slider changes, update the map's zoom attribute (2)
    @reactive.Effect
    def _():
        map.zoom = input.zoom()

    # When zooming directly on the map, update the slider's value (2 and 3)
    @reactive.Effect
    def _():
        ui.update_slider("zoom", value=reactive_read(map, "zoom"))

    # Every time the map's bounds change, update the output message (3)
    @output
    @render.ui
    def map_bounds():
        center = reactive_read(map, "center")
        if len(center) == 0:
            return

        lat = round(center[0], 4)
        lon = (center[1] + 180) % 360 - 180
        lon = round(lon, 4)

        return ui.p(f"Latitude: {lat}", ui.br(), f"Longitude: {lon}")


app = App(app_ui, server)

Shiny for Python!

“Interactive apps and dashboards made easy-ish”

Joe Cheng

Winston Chang

Python…

Common error

$ quarto preview talk.qmd 

Starting python3 kernel...Traceback (most recent call last):
  File "/opt/quarto/share/jupyter/jupyter.py", line 21, in <module>
    from notebook import notebook_execute, RestartKernel
  File "/opt/quarto/share/jupyter/notebook.py", line 16, in <module>
    import nbformat
ModuleNotFoundError: No module named 'nbformat'
Python 3 installation:
  Version: 3.10.8
  Path: /usr/bin/python3
  Jupyter: (None)

Jupyter is not available in this Python installation.
Install with python3 -m pip install jupyter

Python Setup

Virtual Environments

  • Built-in Python 3.5+ venv
  • pyenv-virtualenv plugin
  • pipenv
  • conda

Posit Academy

  • pyenv + pipenv

UBC-MDS

Finding the binaries

  • Make sure you are in the correct environment
    • which python
    • pyenv versions
  • Check your Jupyter settings in Quarto
    • quarto check
  • In VS Code
    • Python: Select Interpreter
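
A quick way to run the same check from inside Python: the sketch below prints which interpreter is active and which packages it cannot import. `nbformat` comes from the traceback above; the other module names in the list are assumptions about what a Jupyter setup typically needs, and `missing_modules` is a hypothetical helper.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Same question `which python` answers, asked from inside the interpreter.
print("Interpreter:", sys.executable)

# nbformat is the module from the traceback; the rest are assumed companions.
print("Missing:", missing_modules(["nbformat", "ipykernel", "jupyter_client"]))
```

Run it with the same `python` that Quarto reports in `quarto check`. An empty list means the modules are importable from that environment.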

The Jupyter kernel

  • In your YAML:
jupyter: python3
  • You do not need to “register” the kernel in your env

python -m ipykernel install --user --name myenv --display-name "Python (myenv)"

All the Things!

What can you do: Gallery

How can you do: Get Started + Guide

Share: GitHub

  • Build from a branch
  • index.qmd

Share: Quarto Pub

Learn more

Try Quarto!

  1. https://quarto.org/
  2. Get Started (aka install)
  3. Guides (Pick a project)
  4. Website: quarto create-project mysite --type website
  5. quarto preview mysite
  6. Profit?