
Using JupyterHub on DRAC

JupyterHub sessions are deployed as interactive jobs on the compute clusters, but this interactive usage pattern has several key differences and limitations compared to typical High-Performance Computing (HPC) batch job submissions (sbatch).

Deploying JupyterHub as an Interactive Job

You can deploy and use JupyterHub interactively on the clusters. Accessing resources via a Jupyter server is equivalent to accessing them through an interactive job on the corresponding cluster.

  1. Job Submission: When a user starts a new session on JupyterHub, the system automatically submits a new interactive job on the user’s behalf to the cluster scheduler.

  2. Available Clusters: JupyterHub is available on several general-purpose clusters, including Fir, Narval, and Rorqual, and was previously available on Béluga.

  3. Intended Use: JupyterLab and notebooks are designed for short interactive tasks, such as debugging, testing, or quickly visualizing data, typically lasting only a few minutes. For running longer analysis tasks, users must use a non-interactive job submission (sbatch).

  4. Resource Specification: When starting a JupyterHub session, users typically set parameters via a Server Options form, which dictates the requested compute resources for the underlying interactive job. These options can include the Account, Time (hours), Number of (CPU) cores reserved on a single node, Memory (MB) limit, and optionally, the GPU configuration.
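The Server Options form maps directly onto standard scheduler parameters. As a hedged illustration, the same request could be expressed as a batch script for a longer, non-interactive run; the account name and all values below are placeholders, not recommendations:

```shell
#!/bin/bash
# Hypothetical sbatch script mirroring the JupyterHub Server Options form.
#SBATCH --account=def-someuser   # "Account" field (placeholder account name)
#SBATCH --time=3:00:00           # "Time (hours)" field
#SBATCH --nodes=1                # cores are reserved on a single node
#SBATCH --cpus-per-task=2        # "Number of (CPU) cores" field
#SBATCH --mem=4000M              # "Memory (MB) limit" field
#SBATCH --gres=gpu:1             # optional GPU configuration

# The long-running analysis itself would be launched here, e.g.:
# python analyze.py
```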

Features and Use of Jupyter Notebooks

Jupyter Notebook is a notebook authoring application, under the Project Jupyter umbrella. Built on the power of the computational notebook format, Jupyter Notebook offers fast, interactive new ways to prototype and explain your code, explore and visualize your data, and share your ideas with others.

Notebooks extend the console-based approach to interactive computing in a qualitatively new direction, providing a web-based application suitable for capturing the whole computation process: developing, documenting, and executing code, as well as communicating the results. The Jupyter notebook combines two components:

A web application: A browser-based editing program for interactive authoring of computational notebooks which provides a fast interactive environment for prototyping and explaining code, exploring and visualizing data, and sharing ideas with others

Computational Notebook documents: A shareable document that combines computer code, plain language descriptions, data, rich visualizations like 3D models, charts, mathematics, graphs and figures, and interactive controls

Main features of the web application

Notebook documents

Notebook documents contain the inputs and outputs of an interactive session as well as additional text that accompanies the code but is not meant for execution. In this way, notebook files can serve as a complete computational record of a session, interleaving executable code with explanatory text, mathematics, and rich representations of resulting objects. These documents are internally JSON files and are saved with the .ipynb extension. Since JSON is a plain text format, they can be version-controlled and shared with colleagues.
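Because an .ipynb file is just JSON, it can be created and inspected with nothing more than the standard library. This minimal sketch builds a two-cell notebook following the nbformat 4 structure (the cell contents and filename are illustrative):

```python
import json

# A minimal notebook document in the nbformat 4 JSON structure:
# a list of cells plus notebook-level metadata.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Example\n", "Explanatory text lives alongside code."],
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "source": ["print(2 + 2)"],
            "outputs": [],
        },
    ],
}

# Since .ipynb files are plain-text JSON, they can be written,
# re-read, diffed, and version-controlled like any other text file.
with open("example.ipynb", "w") as f:
    json.dump(notebook, f, indent=1)

with open("example.ipynb") as f:
    loaded = json.load(f)

print(len(loaded["cells"]))
```

Interleaving markdown and code cells in one file is what lets a notebook act as the "complete computational record" described above.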

Notebooks may be exported to a range of static formats, including HTML (for example, for blog posts), reStructuredText, LaTeX, PDF, and slide shows, via the nbconvert command.
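For reference, these exports are driven from the command line; `notebook.ipynb` below is a placeholder filename (PDF export additionally requires a LaTeX installation):

```shell
# Convert a notebook to static HTML (e.g., for a blog post):
jupyter nbconvert --to html notebook.ipynb

# Other export targets mentioned above:
jupyter nbconvert --to rst notebook.ipynb
jupyter nbconvert --to latex notebook.ipynb
jupyter nbconvert --to pdf notebook.ipynb
jupyter nbconvert --to slides notebook.ipynb
```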

Furthermore, any .ipynb notebook document available from a public URL can be shared via the Jupyter Notebook Viewer (nbviewer). This service loads the notebook document from the URL and renders it as a static web page. The results may thus be shared with a colleague, or as a public blog post, without other users needing to install Jupyter themselves. In effect, nbviewer is simply nbconvert as a web service, so you can do your own static conversions with nbconvert without relying on nbviewer.


Differences from Typical HPC Usage

The interactive nature of JupyterHub jobs, as well as specific cluster configurations, introduces important distinctions compared to standard non-interactive batch processing on HPC systems:

1. Job Priority and Queueing

The order in which jobs are considered for scheduling on the clusters is determined by priority, typically using the Fair Tree algorithm. While batch submission (sbatch) is generally the most common and efficient way to use the clusters, interactive jobs such as JupyterHub sessions are subject to the same priority-based queueing, so a session may wait in the queue before it starts.

2. Networking Restrictions

A significant difference is the limited internet access on the compute nodes where the Jupyter kernels run.

3. Resource Accounting

JupyterHub sessions are treated as scheduled jobs, so the resources they request count toward usage accounting and affect the priority of future jobs.