Install Python. Quarto Render All the Things

PyData NYC 2022

Daniel Chen

Hello 👋

Munsee Lenape

Daniel Chen

@chendaniely

Quarto

What is Quarto?

Who Would Use Quarto?

  • Data Scientist
    • Jupyter Notebooks
      • Analysis
      • Reports + Documentation
  • Academic
    • Papers
  • Technical Writer
    • Blog
    • Website
    • Presentation
    • Book

Doesn’t Jupyter do that?

Julia + Python + R

Let’s talk about Jupyter Notebooks…

Joel Grus JupyterCon 2018

Jeremy Howard “I like notebooks”

What I do like about Notebooks

Daniel’s List

  • Technical Writing
    • ✅ Literate programming
    • ❌ Editing JSON
  • Data Science
    • More an output format than a source document
    • ✅ Great for posting code+output (e.g. a workshop)
    • ❌ Not great as a source-controlled, collaborative document
  • Teaching
    • ✅ nbgrader for course assignment creation + grading
    • ✅ Restart Kernel > Run All

Quarto vs Jupyter

Quarto ➡️ Jupyter

Let’s do an analysis

Load - EDA - Plot - Model

from palmerpenguins import load_penguins

penguins = load_penguins()
penguins.head()
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
0 Adelie Torgersen 39.1 18.7 181.0 3750.0 male 2007
1 Adelie Torgersen 39.5 17.4 186.0 3800.0 female 2007
2 Adelie Torgersen 40.3 18.0 195.0 3250.0 female 2007
3 Adelie Torgersen NaN NaN NaN NaN NaN 2007
4 Adelie Torgersen 36.7 19.3 193.0 3450.0 female 2007

Load - EDA - Plot - Model

import pandas as pd

penguins.describe()
bill_length_mm bill_depth_mm flipper_length_mm body_mass_g year
count 342.000000 342.000000 342.000000 342.000000 344.000000
mean 43.921930 17.151170 200.915205 4201.754386 2008.029070
std 5.459584 1.974793 14.061714 801.954536 0.818356
min 32.100000 13.100000 172.000000 2700.000000 2007.000000
25% 39.225000 15.600000 190.000000 3550.000000 2007.000000
50% 44.450000 17.300000 197.000000 4050.000000 2008.000000
75% 48.500000 18.700000 213.000000 4750.000000 2009.000000
max 59.600000 21.500000 231.000000 6300.000000 2009.000000

Load - EDA - Plot - Model

from plotnine import ggplot, aes, geom_boxplot, theme_xkcd

(
  ggplot(
    data=penguins,
    mapping=aes(x="sex", y="body_mass_g", color="species")
  )
  + geom_boxplot()
  + theme_xkcd()
)
<ggplot: (377115112)>

Load - EDA - Plot - Model

import statsmodels.formula.api as smf

penguins["sex_01"] = penguins.sex.replace({"male": 1, "female": 0})
pen_no_na = penguins.dropna()

log_reg = smf.logit("sex_01 ~ body_mass_g", data=pen_no_na).fit()
log_reg.params
Optimization terminated successfully.
         Current function value: 0.595563
         Iterations 5
Intercept     -5.162542
body_mass_g    0.001240
dtype: float64
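
The two coefficients can be turned into a probability by hand. This is a minimal sketch that reuses the rounded values printed above; the helper name `prob_male` and the 4000 g example are illustrative assumptions, not part of the original analysis.

```python
import math

# Coefficients copied from the log_reg.params output above (rounded as printed).
INTERCEPT = -5.162542
SLOPE_BODY_MASS_G = 0.001240


def prob_male(body_mass_g: float) -> float:
    """Logistic model: P(sex_01 = 1) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    log_odds = INTERCEPT + SLOPE_BODY_MASS_G * body_mass_g
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Hypothetical penguin weighing 4000 g; sex_01 = 1 encodes "male".
    print(round(prob_male(4000), 3))
```

With the fitted model in hand, `log_reg.predict(pd.DataFrame({"body_mass_g": [4000]}))` should give the same value up to rounding of the coefficients.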

Model Ops

Isabel Zimmerman
Holistic MLOps for
Better Science

Let’s go make things

Create .qmd or .ipynb

The YAML header

---
format: html
---

---
format: html
title: Quarto Document
subtitle: Data Science!
author: Daniel Chen
toc: true
toc-depth: 3
code-overflow: scroll
code-line-numbers: true
execute: 
  echo: true
keep-md: true
keep-ipynb: true
jupyter: python3
---

The .qmd code chunk

```{python}
#| echo: true
#| eval: true

from palmerpenguins import load_penguins

penguins = load_penguins()
penguins.head()
```
  • Markdown
  • Engine: {python}, {r}, etc
    • Knitr for {r}
    • Jupyter for any other block {python}, {julia}, etc
  • Chunk options: #|
  • Code
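
The pieces listed above (YAML header, Markdown, engine, chunk options, code) are all plain text, so a .qmd file can be generated from a script. A rough sketch follows; the function names `minimal_qmd` and `write_qmd` and the placeholder Markdown sentence are assumptions made for illustration.

```python
from pathlib import Path

# Built programmatically so this example can itself live inside a fenced block.
FENCE = "`" * 3


def minimal_qmd(title: str) -> str:
    """Return the text of a small .qmd: YAML header, Markdown, one Python chunk."""
    header = "\n".join(
        ["---", "format: html", f"title: {title}", "jupyter: python3", "---"]
    )
    chunk = "\n".join(
        [
            FENCE + "{python}",  # engine: {python} is executed through Jupyter
            "#| echo: true",     # chunk options start with #|
            "#| eval: true",
            "",
            "from palmerpenguins import load_penguins",
            "",
            "penguins = load_penguins()",
            "penguins.head()",
            FENCE,
        ]
    )
    return header + "\n\nSome Markdown text.\n\n" + chunk + "\n"


def write_qmd(path: Path, title: str = "Quarto Document") -> Path:
    """Write the generated document to disk and return its path."""
    path.write_text(minimal_qmd(title), encoding="utf-8")
    return path


if __name__ == "__main__":
    print(minimal_qmd("Quarto Document"))
```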

Make the document!

% quarto render examples/01-quarto.qmd

Executing '01-quarto.ipynb'
  Cell 1/4...Done
  Cell 2/4...Done
  Cell 3/4...Done
  Cell 4/4...Done

pandoc 
  to: html
  output-file: 01-quarto.html
  standalone: true
  section-divs: true
  html-math-method: mathjax
  wrap: none
  default-image-extension: png
  toc: true
  
metadata
  document-css: false
  link-citations: true
  date-format: long
  lang: en
  title: Quarto Document
  subtitle: Data Science!
  author: Daniel Chen
  toc-deph: 3
  jupyter: python3
  
Output created: 01-quarto.html

Works on your existing Jupyter notebook

% quarto render examples/02-jupyter.ipynb --execute

Starting python3 kernel...Done

Executing '02-jupyter.ipynb'
  Cell 1/5...Done
  Cell 2/5...Done
  Cell 3/5...Done
  Cell 4/5...Done
  Cell 5/5...Done

pandoc 
  to: html
  output-file: 02-jupyter.html
  standalone: true
  section-divs: true
  html-math-method: mathjax
  wrap: none
  default-image-extension: png
  
metadata
  document-css: false
  link-citations: true
  date-format: long
  lang: en
  
Output created: 02-jupyter.html
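
The same two render calls can be scripted, for example in a build step. A hedged sketch: the helper names `render_command` and `render` are made up here, and it assumes the `quarto` CLI is on your PATH.

```python
import shutil
import subprocess


def render_command(source: str, to: str = "html", execute: bool = False) -> list[str]:
    """Build the argument list for `quarto render`."""
    cmd = ["quarto", "render", source, "--to", to]
    if execute:
        # Ask Quarto to re-run the notebook cells, as in the .ipynb example above.
        cmd.append("--execute")
    return cmd


def render(source: str, **kwargs) -> int:
    """Run quarto and return its exit code; fail early if quarto is missing."""
    if shutil.which("quarto") is None:
        raise RuntimeError("quarto was not found on PATH")
    return subprocess.run(render_command(source, **kwargs), check=False).returncode


if __name__ == "__main__":
    # Only print the commands here; call render(...) to actually run them.
    print(render_command("examples/01-quarto.qmd"))
    print(render_command("examples/02-jupyter.ipynb", execute=True))
```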

Profit!

Profit more!

Shiny for Python: Interactive apps and dashboards made easy-ish

Joe Cheng

Winston Chang

Python…

Common error

$ quarto preview talk.qmd 

Starting python3 kernel...Traceback (most recent call last):
  File "/opt/quarto/share/jupyter/jupyter.py", line 21, in <module>
    from notebook import notebook_execute, RestartKernel
  File "/opt/quarto/share/jupyter/notebook.py", line 16, in <module>
    import nbformat
ModuleNotFoundError: No module named 'nbformat'
Python 3 installation:
  Version: 3.10.8
  Path: /usr/bin/python3
  Jupyter: (None)

Jupyter is not available in this Python installation.
Install with python3 -m pip install jupyter
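
A quick way to see whether the interpreter Quarto found has what it needs is to ask that interpreter directly. A small sketch, assuming `nbformat` (from the traceback above) and `ipykernel` are the packages to look for; the helper name `missing_modules` is hypothetical.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Run this with the same python3 that the Quarto error message reports.
    print("Interpreter:", sys.executable)
    print("Missing:", missing_modules(["nbformat", "ipykernel"]))
```

If the list is not empty, install into that interpreter with `python3 -m pip install jupyter`, as the error message suggests.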

Python Setup

Virtual Environments

  • Built-in Python 3.5+ venv
  • pyenv-virtualenv plugin
  • pipenv
  • conda
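
For the built-in option, the standard library `venv` module can also be driven from Python. A minimal sketch: the helper name `make_env` and the environment name `myenv` are placeholders, and `with_pip=False` is used only to keep the example fast and offline (pass `with_pip=True` for an environment you intend to install Jupyter into).

```python
import tempfile
import venv
from pathlib import Path


def make_env(env_dir: Path) -> Path:
    """Create a virtual environment and return the path to its pyvenv.cfg."""
    venv.create(env_dir, with_pip=False)
    return env_dir / "pyvenv.cfg"


if __name__ == "__main__":
    # Create a throwaway environment and show its configuration file.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = make_env(Path(tmp) / "myenv")
        print(cfg.read_text())
```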

Posit Academy

  • Pyenv + pipenv

UBC-MDS

Finding the binaries

  • Make sure you are in the correct environment
    • which python
    • pyenv versions
  • Check your Jupyter settings in Quarto
    • quarto check
  • In VS Code
    • Python: Select Interpreter

The Jupyter kernel

  • In your YAML:
jupyter: python3
  • You do not need to “register” the kernel in your env

python -m ipykernel install --user --name myenv --display-name "Python (myenv)"

Other formats

What can you do: Gallery

How can you do: Get Started + Guide

Share: GitHub

Share: Quarto Pub

  • https://quartopub.com/

Learn more

Try Quarto!

  1. https://quarto.org/
  2. Get Started (aka install)
  3. Guides (Pick a project)
  4. Website: quarto create-project mysite --type website
  5. quarto preview mysite
  6. Profit?