Sysadmin:Jupyterhub Notebook Server: Difference between revisions

From Earlham CS Department
Kmmuter11 (talk | contribs)
Pelibby16 (talk | contribs)
 
(13 intermediate revisions by 4 users not shown)
Jupyterhub is the multi-user version of Project Jupyter, which is an open-source web environment for running live, interactive code. If you're familiar with IPython, Jupyter is an upgraded version of that. Within Jupyter, you can run more than just IPython notebooks. There are kernels available for many different languages, including but not limited to: IPython, Julia, R, Ruby, Perl, JavaScript, and Haskell. You can get a full list of kernels [https://github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages here]. If you want to learn more about Project Jupyter and its sub-projects, go to their [http://jupyter.org/index.html website]. A website that I've found useful when working with Jupyter is [http://jupyter.readthedocs.org/en/latest/index.html Read the Docs].
= Architecture =
Jupyterhub is a way to run a series of Jupyter notebooks for many users from one system.
Jupyterhub is installed on Bowie through its conda instance. See <code>/etc/systemd/system/jupyterhub.service</code> for details and <code> /etc/jupyterhub/jupyterhub_config.py</code> for the live configuration.


===Installation===


===Upgrading Jupyterhub===
Always, ALWAYS, ALWAYS back up the config files before upgrading Jupyterhub. You'll want to save jupyterhub_config.py, jupyterhub.sqlite, and jupyterhub_cookie_secret. These live in whichever directory you chose as your config directory.
 
= Management =
 
=== Running the Jupyterhub service ===


Jupyterhub is installed as a systemd service. As such, you can use <code>service jupyterhub $command</code>, where <code>$command</code> is <code>start</code>, <code>stop</code>, <code>restart</code>, or <code>status</code>.


===Installing Python Packages===
Jupyter environments (kernels) are managed through Anaconda.
On Bowie, we have the following Python environments (as of Feb 2023):
<pre>
base    /mounts/bowie/software/anaconda3                ("Python 3" Jupyter kernel)            Python 3.7.4
py27    /mounts/bowie/software/anaconda3/envs/py27      ("Python 2" Jupyter kernel)            Python 2.7.17
py39    /mounts/bowie/software/anaconda3/envs/py39      ("Python 3.9" Jupyter kernel)          Python 3.9.1
</pre>
You can install packages in a couple of different ways with Anaconda. First, activate whichever environment corresponds to the Jupyter kernel you want to install a package into; then use either <code>pip</code> or <code>conda</code> to install that package.


The full process looks something like this:
# <code>sysadmin@bowie:~$ conda activate base</code>                // Load the relevant conda env.
# <code>(base) sysadmin@bowie:~$ pip list | grep [PACKAGE]</code>    // Check to see if the package is already installed, and which version.
# <code>(base) sysadmin@bowie:~$ pip install [PACKAGE]</code>  // Install package using pip.
## Use <code>pip install [PACKAGE]==[VERSION]</code> to install a specific version.
# <code>(base) sysadmin@bowie:~$ python</code>    // Open a python shell to test packages.
## <code>>>> import [PACKAGE]</code>        // Test importing the installed package to make sure it works.


===Installing Kernels===
Within Jupyter, you can run more than just iPython notebooks. There are kernels available for many different languages, including but not limited to: iPython, Julia, R, Ruby, Perl, Javascript, Haskell. You can get a full list of kernels [https://github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages here].
To install the IRKernel, for R, using conda: <code>conda install -c r r-irkernel</code>.
Kernel files, among other things, are in <code>/usr/local/share/jupyter</code> and <code>/usr/share/jupyter</code>.
=== Killing Old Notebooks ===
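JupyterHub has a [https://github.com/jupyterhub/jupyterhub-idle-culler separate project] for culling servers that have been idle for longer than a configurable timeout. It is installed via pip in the conda environment for Jupyter (<code>pip install jupyterhub-idle-culler</code>) and then registered as a Hub service in the config file. The <code>--timeout</code> value is in seconds, so the setting shown here culls servers after one hour of inactivity: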
<pre>
import sys
c.JupyterHub.services = [
    {
            'name': 'idle-culler',
            'admin': True,
            'command': [
                sys.executable,
                '-m', 'jupyterhub_idle_culler',  # the Python module name uses underscores
                '--timeout=3600'
            ],
      }
]
</pre>
====Archival====
This may be out of date, as the Jupyterhub start command appears to also trigger cull-idle. Docs are here for posterity.
Keeping everyone's IPython notebooks alive tends to overload tools.cs.earlham.edu after a while.
There is a useful script called [https://github.com/jupyterhub/jupyterhub/tree/master/examples/cull-idle cull_idle_servers.py] that was added to <code>jupyterhub</code> in 2015, but apparently, it was not included with our installation. My guess is that it's only available from the source repository.
Luckily, I am able to grab the version of that script that corresponds to the version of <code>jupyterhub</code> that we have running.
Hopefully using this script will make tools.cs.earlham.edu more responsive and less likely to crash.
I have added this script, along with some others here:
<pre>
/root/jupyter_0.6.1_examples/
</pre>
I've created a wrapper script to handle authentication in that directory as <code>.../examples/cull-idle/kill_idle_servers.sh</code>.
This script should be run in the background, and by default, kills idle servers every 3 hours.
Unfortunately, I've had to modify the <code>cull_idle_servers.py</code> script to include the keyword argument <code>validate_cert=False</code> since we use a self-signed SSL certificate on tools. If we don't, and the script complains, this is why.
=== Running the script ===
<pre>
cd /root/jupyterhub_0.6.1_examples/cull-idle/
nohup ./kill_idle_servers.sh &
</pre>
==Nbgrader==
See [[Nbgrader notes | our notes for Nbgrader]].
= Troubleshooting and debugging =
===If Jupyter is down===
Our most common problem with Jupyter being "down" is that it is accessible but unresponsive. This in turn is usually because Jupyter Notebooks might consume a lot of RAM and are not consistently closed properly by users (including ourselves). The solution:
# Try to run the kill_idle_servers.sh script in <code>/home/nbgrader/cull-idle</code> (as of September 2018 this is giving an HTTP error still under investigation). This will produce authentication credentials and pass them on to the cull_idle_servers.py script.
# Failing that, restart Jupyter.
If you can't reach Jupyter at all, either it is not running (you can run <code>service jupyterhub restart</code> to bring it back) or the web server is down (verify this by going to, e.g., [https://tools.cs.earlham.edu the tools landing page]).
= Sage =
SageNB appears to have been [https://www.sagemath.org/notebook-vs-cloud.html deprecated]. We now run Sage as a Jupyter kernel. It also works as a command line tool, which requires only a Sage installation.
==Installation==
The steps [https://doc.sagemath.org/html/en/installation/source.html#build-from-source-step-by-step here] work.
===Adding to Jupyter===
Thanks to [https://stackoverflow.com/questions/39296020/how-to-install-sagemath-kernel-in-jupyter Stack Overflow] for the key steps for this.
First make sure you have completely installed both Jupyter and Sage.
Run <code>sudo jupyter kernelspec install /home/sage/sage-8.2/local/share/jupyter/kernels/sagemath</code>.
You'll also have to update kernel.json. The containing folder can be found with <code>jupyter kernelspec list</code>. Edit the path to the sage executable.
===Starting/Stopping===
Sage itself is run by a sage user, so all starting and stopping is done by that user.
# Connect to tools, which runs sage. <tt>ssh tools.cs.earlham.edu</tt>.
# Become the sage user: <tt>sudo su - sage</tt>
# Check if anything is running with <tt>ps auxww | grep sage</tt>.
# If you're restarting or stopping, then kill anything running that's associated with sage.
# There's a sage-x.x (version) dir in the sage user's home dir where all of the source is. <tt>cd sage-x.x</tt>
# Remove the nohup.out file. <tt>rm nohup.out</tt>
# To start: <code>nohup /home/sage/sage-x.x/sage --notebook=sagenb accounts=False automatic_login=False interface=&apos;&apos; port=8080 &</code>
==Other Programs==
For images and animations, Sage likes to have imagemagick and ffmpeg.
* imagemagick: Using sudo as some user (most recently it was sysadmin), follow [https://www.imagemagick.org/script/install-source.php these instructions].
* ffmpeg: <code>sudo apt-get install ffmpeg</code>

Latest revision as of 15:50, 6 March 2023

Jupyterhub is the multi-user version of Project Jupyter, which is an open-source web environment for running live, interactive code. If you're familiar with IPython, Jupyter is an upgraded version of that. Within Jupyter, you can run more than just IPython notebooks. There are kernels available for many different languages, including but not limited to: IPython, Julia, R, Ruby, Perl, JavaScript, and Haskell. You can get a full list of kernels here. If you want to learn more about Project Jupyter and its sub-projects, go to their website. A website that I've found useful when working with Jupyter is Read the Docs.

Architecture

Jupyterhub is a way to run a series of Jupyter notebooks for many users from one system.

Jupyterhub is installed on Bowie through its conda instance. See /etc/systemd/system/jupyterhub.service for details and /etc/jupyterhub/jupyterhub_config.py for the live configuration.

Installation

Project Jupyter makes it pretty easy to install and set up Jupyterhub. For reference, look at their GitHub.

Upgrading Jupyterhub

Always, ALWAYS, ALWAYS back up the config files before upgrading Jupyterhub. You'll want to save jupyterhub_config.py, jupyterhub.sqlite, and jupyterhub_cookie_secret. These live in whichever directory you chose as your config directory.
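The backup step can be scripted. The following is a minimal sketch, not an established part of our setup: the function name `backup_config` and the `jupyterhub-backup-<timestamp>` directory naming are made up for illustration, and it assumes all three files sit in the same config directory.

```python
import shutil
import time
from pathlib import Path

# File names taken from the paragraph above.
CONFIG_FILES = ("jupyterhub_config.py", "jupyterhub.sqlite", "jupyterhub_cookie_secret")


def backup_config(config_dir, backup_root):
    """Copy the Jupyterhub config files into a new timestamped directory.

    Returns the backup directory and the list of file names that were missing.
    (Hypothetical helper; the directory naming scheme is an assumption.)
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"jupyterhub-backup-{stamp}"
    dest.mkdir(parents=True)
    missing = []
    for name in CONFIG_FILES:
        src = Path(config_dir) / name
        if src.is_file():
            shutil.copy2(src, dest / name)  # copy2 keeps timestamps and permissions
        else:
            missing.append(name)
    return dest, missing
```

On Bowie this might be called as `backup_config("/etc/jupyterhub", "/root/jupyterhub-backups")`, where the backup location is only an example. Check the returned `missing` list before upgrading, since the database and cookie secret may live somewhere other than the config directory.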

Management

Running the Jupyterhub service

Jupyterhub is installed as a systemd service. As such, you can use service jupyterhub $command, where $command is start, stop, restart, or status.
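If the service needs to be driven from a script, a thin wrapper that only accepts the four commands listed above helps avoid typos. This is a sketch under assumptions: the helper names `service_command` and `run_service` are hypothetical, and the script is assumed to run with enough privileges to manage the service.

```python
import subprocess

# The four commands mentioned in the text above.
ALLOWED_ACTIONS = ("start", "stop", "restart", "status")


def service_command(action, unit="jupyterhub"):
    """Build the argv list for `service <unit> <action>`, rejecting unknown actions."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return ["service", unit, action]


def run_service(action):
    """Run the command and return its exit code (0 usually means success)."""
    return subprocess.run(service_command(action)).returncode
```

For example, `run_service("status")` prints the usual status output to the terminal and returns the exit code of the `service` command.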

Installing Python Packages

Jupyter environments (kernels) are managed through Anaconda.

On Bowie, we have the following Python environments (as of Feb 2023):

base     /mounts/bowie/software/anaconda3                ("Python 3" Jupyter kernel)            Python 3.7.4
py27     /mounts/bowie/software/anaconda3/envs/py27      ("Python 2" Jupyter kernel)            Python 2.7.17
py39     /mounts/bowie/software/anaconda3/envs/py39      ("Python 3.9" Jupyter kernel)          Python 3.9.1

You can install packages in a couple of different ways with Anaconda. First, activate whichever environment corresponds to the Jupyter kernel you want to install a package into; then use either pip or conda to install that package.

The full process looks something like this:

  1. sysadmin@bowie:~$ conda activate base // Load the relevant conda env.
  2. (base) sysadmin@bowie:~$ pip list | grep [PACKAGE] // Check to see if the package is already installed, and which version.
  3. (base) sysadmin@bowie:~$ pip install [PACKAGE] // Install package using pip.
    1. Use pip install [PACKAGE]==[VERSION] to install a specific version.
  4. (base) sysadmin@bowie:~$ python // Open a python shell to test packages.
    1. >>> import [PACKAGE] // Test importing the installed package to make sure it works.
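The final import test in the steps above can also be run non-interactively. The sketch below is one possible way to do it, using only the standard library; `check_package` is a hypothetical helper, and not every package exposes a `__version__` attribute, so the version may come back as `None`.

```python
import importlib
import importlib.util


def check_package(name):
    """Return (importable, version) for a top-level module name.

    Run this with the python of the conda environment you just installed into.
    """
    if importlib.util.find_spec(name) is None:
        return (False, None)
    module = importlib.import_module(name)  # raises if the package is broken
    return (True, getattr(module, "__version__", None))
```

Note that the module name is not always the same as the pip package name, so pass the name you would use in an `import` statement.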

Installing Kernels

Within Jupyter, you can run more than just iPython notebooks. There are kernels available for many different languages, including but not limited to: iPython, Julia, R, Ruby, Perl, Javascript, Haskell. You can get a full list of kernels here.

To install the IRKernel, for R, using conda: conda install -c r r-irkernel.

Kernel files, among other things, are in /usr/local/share/jupyter and /usr/share/jupyter.
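To see which kernels are registered in those two directories, the kernel spec files can be read directly. This sketch assumes the usual Jupyter layout of `<root>/kernels/<kernel-name>/kernel.json`, which the text above does not spell out; `list_kernels` is a hypothetical helper name.

```python
import json
from pathlib import Path

# Directories named in the text above; per-kernel files are assumed to sit in
# <root>/kernels/<kernel-name>/kernel.json.
KERNEL_ROOTS = ("/usr/local/share/jupyter", "/usr/share/jupyter")


def list_kernels(roots=KERNEL_ROOTS):
    """Return {kernel_name: display_name} for every kernel.json found under roots."""
    found = {}
    for root in roots:
        for spec in sorted(Path(root).glob("kernels/*/kernel.json")):
            with spec.open() as fh:
                data = json.load(fh)
            # If the same kernel name appears in two roots, keep the first one seen.
            found.setdefault(spec.parent.name, data.get("display_name", spec.parent.name))
    return found
```

Kernels installed inside a conda environment live under that environment's own `share/jupyter` directory, so they will not show up unless that path is added to `roots`.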

Killing Old Notebooks

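JupyterHub has a separate project for culling servers that have been idle for longer than a configurable timeout. It is installed via pip in the conda environment for Jupyter (pip install jupyterhub-idle-culler) and then registered as a Hub service in the config file. The --timeout value is in seconds, so the setting shown here culls servers after one hour of inactivity: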

import sys
c.JupyterHub.services = [
     {
            'name': 'idle-culler',
            'admin': True,
            'command': [
                sys.executable,
                '-m', 'jupyterhub_idle_culler',  # the Python module name uses underscores
                '--timeout=3600'
            ],
      }
]
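Because the timeout is given in seconds, it is easy to get the unit wrong when changing it. One option, sketched here with a hypothetical helper `idle_culler_service`, is to build the service entry from a value in hours. The sketch assumes the importable module is named `jupyterhub_idle_culler` (with underscores), which is how the pip package installs it.

```python
import sys


def idle_culler_service(timeout_hours=1.0):
    """Build a services entry that culls servers idle for longer than timeout_hours."""
    timeout_seconds = int(timeout_hours * 3600)  # --timeout is given in seconds
    return {
        "name": "idle-culler",
        "admin": True,
        "command": [
            sys.executable,
            "-m", "jupyterhub_idle_culler",  # assumed module name (underscores)
            f"--timeout={timeout_seconds}",
        ],
    }

# In jupyterhub_config.py one would then write:
#   c.JupyterHub.services = [idle_culler_service(1)]
```

Restart Jupyterhub after changing the config so the service is relaunched with the new timeout.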

Archival

This may be out of date, as the Jupyterhub start command appears to also trigger cull-idle. Docs are here for posterity.

Keeping everyone's IPython notebooks alive tends to overload tools.cs.earlham.edu after a while. There is a useful script called cull_idle_servers.py that was added to jupyterhub in 2015, but apparently, it was not included with our installation. My guess is that it's only available from the source repository.

Luckily, I am able to grab the version of that script that corresponds to the version of jupyterhub that we have running. Hopefully using this script will make tools.cs.earlham.edu more responsive and less likely to crash.

I have added this script, along with some others here:

/root/jupyter_0.6.1_examples/

I've created a wrapper script to handle authentication in that directory as .../examples/cull-idle/kill_idle_servers.sh. This script should be run in the background, and by default, kills idle servers every 3 hours.

Unfortunately, I've had to modify the cull_idle_servers.py script to include the keyword argument validate_cert=False since we use a self-signed SSL certificate on tools. If we don't, and the script complains, this is why.

Running the script

cd /root/jupyterhub_0.6.1_examples/cull-idle/
nohup ./kill_idle_servers.sh &


Nbgrader

See our notes for Nbgrader.

Troubleshooting and debugging

If Jupyter is down

Our most common problem with Jupyter being "down" is that it is accessible but unresponsive. This in turn is usually because Jupyter Notebooks might consume a lot of RAM and are not consistently closed properly by users (including ourselves). The solution:

  1. Try to run the kill_idle_servers.sh script in /home/nbgrader/cull-idle (as of September 2018 this is giving an HTTP error still under investigation). This will produce authentication credentials and pass them on to the cull_idle_servers.py script.
  2. Failing that, restart Jupyter.

If you can't reach Jupyter at all, either it is not running (you can run service jupyterhub restart to bring it back) or the web server is down (verify this by going to, e.g., the tools landing page).
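Before restarting, it can help to see whose processes are holding the RAM. The sketch below sums resident memory per user from `ps` output. It assumes a procps-style `ps` (as on Linux), and it counts all of a user's processes rather than only notebook kernels; `rss_by_user` and `current_usage` are hypothetical helper names.

```python
import subprocess
from collections import defaultdict


def rss_by_user(ps_output):
    """Sum resident memory (KiB) per user from `ps -eo user,rss,comm` output."""
    totals = defaultdict(int)
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # skip the header row and malformed lines
        totals[fields[0]] += int(fields[1])
    # Largest consumers first
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def current_usage():
    """Run ps and return per-user totals (assumes a procps-style ps)."""
    out = subprocess.run(
        ["ps", "-eo", "user,rss,comm"],
        stdout=subprocess.PIPE, universal_newlines=True, check=True,
    ).stdout
    return rss_by_user(out)
```

Calling `current_usage()[:5]` gives the five heaviest users, which is usually enough to tell whether stale notebooks are the cause.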

Sage

SageNB appears to have been deprecated. We now run Sage as a Jupyter kernel. It also works as a command line tool, which requires only a Sage installation.

Installation

The steps here work.

Adding to Jupyter

Thanks to Stack Overflow for the key steps for this.

First make sure you have completely installed both Jupyter and Sage.

Run sudo jupyter kernelspec install /home/sage/sage-8.2/local/share/jupyter/kernels/sagemath.

You'll also have to update kernel.json. The containing folder can be found with jupyter kernelspec list. Edit the path to the sage executable.
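The kernel.json edit can be done by hand, or scripted roughly as follows. This is a sketch that assumes the Sage executable is the first entry of the `argv` list in kernel.json; check the file before relying on that. `set_kernel_executable` is a hypothetical helper name.

```python
import json
from pathlib import Path


def set_kernel_executable(kernel_json, executable):
    """Point the first argv entry of a kernel.json at the given executable.

    Returns the previous value so the change can be reviewed or undone.
    Assumes argv[0] is the path to the sage executable.
    """
    path = Path(kernel_json)
    spec = json.loads(path.read_text())
    previous = spec["argv"][0]
    spec["argv"][0] = str(executable)
    path.write_text(json.dumps(spec, indent=1) + "\n")
    return previous
```

The first argument is the kernel.json inside the folder reported by `jupyter kernelspec list`; the second is the full path to the `sage` script in the Sage install directory. Restart any running Sage kernels afterwards so they pick up the new path.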

Starting/Stopping

Sage itself is run by a sage user, so all starting and stopping is done by that user.

  1. Connect to tools, which runs sage. ssh tools.cs.earlham.edu.
  2. Become the sage user: sudo su - sage
  3. Check if anything is running with ps auxww | grep sage.
  4. If you're restarting or stopping, then kill anything running that's associated with sage.
  5. There's a sage-x.x (version) dir in the sage user's home dir where all of the source is. cd sage-x.x
  6. Remove the nohup.out file. rm nohup.out
  7. To start: nohup /home/sage/sage-x.x/sage --notebook=sagenb accounts=False automatic_login=False interface='' port=8080 &

Other Programs

For images and animations, Sage likes to have imagemagick and ffmpeg.

  • imagemagick: Using sudo as some user (most recently it was sysadmin), follow these instructions.
  • ffmpeg: sudo apt-get install ffmpeg