Build a Jupyter Notebook (Anaconda) Environment on AWS EC2

Keywords: Jupyter Notebook, Anaconda, AWS, EC2, Proxy, WinSCP, PuTTY, Deep Learning

<Background>
  • The company employs a proxy server, through which all external connections must pass.
  • Company computers utilize Virtual Desktop Infrastructure (VDI), restricting the installation of large software such as Anaconda.

<Task>
As a deep learning engineer, I need a Jupyter Notebook environment on AWS. Managed services such as SageMaker are costly, so I build the environment myself on an EC2 instance.

<AWS EC2 Configuration>
Create an EC2 Instance with the following specifications:
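  • AMI: Ubuntu or Amazon Linux 2 (the rest of this post assumes Ubuntu, whose default user is ubuntu; on Amazon Linux 2 the default user is ec2-user)
  • Instance type: t3.large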
  • VPC: Public subnet with an Internet Gateway (IGW)
  • Elastic IP: Required, as dynamic IP addresses are prone to firewall interception.
  • Security Group: Allow inbound TCP 8081 (Jupyter) and 22 (SSH); allow all outbound traffic.
  • Storage: Minimum of 16GB, as Anaconda alone requires 9GB.
  • Key Pair: Generate an RSA key pair and download the private key (.pem) for SSH access.
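
The Security Group bullet above can be written down as data. The following is a minimal sketch, not a definitive implementation: it only builds the rule list and never calls AWS. The function name build_ingress_rules and the source address 203.0.113.10/32 (a documentation address standing in for the proxy's public IP) are assumptions for illustration.

```python
# Sketch: express the inbound rules (SSH on 22, Jupyter on 8081) as data.
# The dict layout follows the IpPermissions structure accepted by boto3's
# authorize_security_group_ingress; nothing in this block contacts AWS.


def build_ingress_rules(source_cidr, ports=(22, 8081)):
    """Return one TCP ingress rule per port, limited to source_cidr."""
    rules = []
    for port in ports:
        rules.append({
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": source_cidr}],
        })
    return rules


if __name__ == "__main__":
    # Hypothetical source address; replace with the address you connect from.
    for rule in build_ingress_rules("203.0.113.10/32"):
        print(rule)
```

If boto3 and AWS credentials are available, the returned list could be passed as the IpPermissions argument of an EC2 client's authorize_security_group_ingress call together with the GroupId of your security group. Limiting the source range to the address you connect from is a design choice of this sketch, not something the setup requires.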

<Connect to the EC2 Instance>
AWS Systems Manager Session Manager can also be used to access the instance, but I don't have the IAM permissions for it, so I connect over SSH.

[SSH Login]
  • Download the latest version of PuTTY and use PuTTYgen to convert the PEM key to a PPK key.
  • PuTTY configuration:
      • Session: set the Host Name to the user name and the Elastic IP address (e.g., ubuntu@12.34.56.78), with port 22.
      • Connection-->Proxy: enter the company proxy settings.
      • Connection-->SSH-->Auth-->Credentials: specify the private key (PPK).
  • Click "Open" to establish the connection.

[File Transfer]
Download the latest version of WinSCP and create a new session with the following settings:
  • File protocol: SFTP
  • Host name: the Elastic IP address
  • User name: ubuntu
  • Password: leave empty
  • Port: 22
  • Connection-->Proxy: enter the company proxy settings.
  • Connection-->Tunnel: leave empty. Configure a tunnel only when accessing an EC2 instance in a private subnet.
  • SSH-->Auth: specify the private key.
  • Click "OK" and then "Login" to connect.
  • Note: Use the latest version of WinSCP. Older versions may not support the newer key format, which causes connection failures. (This cost me over an hour to track down.)

[Set up Jupyter Notebook]
1. Download and install Anaconda

$ sudo mkdir -p /opt/anaconda
$ sudo chown ubuntu:ubuntu /opt/anaconda
$ cd /opt/anaconda
$ wget https://repo.anaconda.com/archive/Anaconda3-2024.02-1-Linux-x86_64.sh
$ bash Anaconda3-2024.02-1-Linux-x86_64.sh

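When the installer asks for the install location, enter /opt/anaconda/anaconda3 (the path the service file later in this post expects).
Afterwards, add Anaconda to the PATH permanently and reload the shell configuration:

$ echo 'export PATH=/opt/anaconda/anaconda3/bin:$PATH' >> ~/.bashrc
$ source ~/.bashrc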
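
Since Anaconda alone needs about 9GB (see the Storage requirement), it can be worth confirming free disk space before running the installer. The following is a minimal sketch using only the Python standard library; the names free_gb and has_room are hypothetical helpers, and the 9GB threshold is simply the figure quoted above.

```python
import shutil

REQUIRED_GB = 9  # figure quoted in the Storage requirement


def free_gb(path="/"):
    """Free space on the filesystem that contains path, in GiB."""
    return shutil.disk_usage(path).free / (1024 ** 3)


def has_room(path="/", required_gb=REQUIRED_GB):
    """True if the filesystem holding path has at least required_gb free."""
    return free_gb(path) >= required_gb


if __name__ == "__main__":
    print(f"free: {free_gb('/'):.1f} GiB, enough for Anaconda: {has_room('/')}")
```

Ubuntu images normally ship a system python3, so this can be run before Anaconda is installed.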


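2. Configure Jupyter Notebook

Create a self-signed certificate for HTTPS. A 2048-bit key is used here because modern browsers and OpenSSL builds may reject 1024-bit RSA keys.

$ mkdir -p ~/.jupyter
$ cd ~/.jupyter
$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout mykey.key -out mycert.pem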
Provide the requested information such as Country Name, State, Locality Name, etc.


Set the Jupyter Notebook password:
$ jupyter notebook password

Add the following settings to the configuration file (still in ~/.jupyter):

$ vi jupyter_notebook_config.py

c.NotebookApp.ip = '*'  # Allows access from any IP address
c.NotebookApp.open_browser = False  # Prevents automatic browser opening on Jupyter startup
c.NotebookApp.port = 8081  # Specifies the port number (here, 8081)
c.NotebookApp.certfile = '/home/ubuntu/.jupyter/mycert.pem'  # Certificate file for HTTPS
c.NotebookApp.keyfile = '/home/ubuntu/.jupyter/mykey.key'  # Private key for HTTPS
c.NotebookApp.notebook_dir = '/home/ubuntu/work'  # Specifies the root directory visible in Jupyter's WebUI

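Create the working directory if it does not exist yet, then start the server in the background:

$ mkdir -p /home/ubuntu/work
$ jupyter notebook &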
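
If the server refuses to start over HTTPS, one quick check is whether the certificate and private key load together as a pair. The following is a minimal sketch with Python's standard ssl module; cert_and_key_match is a hypothetical helper name, and the paths in the usage part are the ones from the configuration above.

```python
import ssl


def cert_and_key_match(certfile, keyfile):
    """Return True if certfile and keyfile load together as a TLS pair."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    except (ssl.SSLError, OSError):
        # Missing or unreadable file, a key that does not belong to the
        # certificate, or a pair the local OpenSSL refuses to load.
        return False
    return True


if __name__ == "__main__":
    ok = cert_and_key_match(
        "/home/ubuntu/.jupyter/mycert.pem",
        "/home/ubuntu/.jupyter/mykey.key",
    )
    print("certificate and key load correctly:", ok)
```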


[Accessing Jupyter Notebook]
Navigate to the following URL in your browser:
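https://<Elastic IP address>:8081
Because the certificate is self-signed, the browser shows a security warning the first time; accept it to continue. After logging in with the password set earlier, you can access your Jupyter Notebook environment.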
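
The same check can be scripted, for example to confirm that port 8081 answers before opening a browser. The following is a minimal sketch, assuming the self-signed certificate from the previous step, so certificate verification is deliberately disabled for this one request. The helper names are hypothetical. urllib picks up proxy settings from the usual environment variables such as https_proxy; whether your proxy allows connections to port 8081 is outside what this sketch can tell you.

```python
import ssl
import sys
import urllib.error
import urllib.request


def jupyter_url(host, port=8081):
    """Build the HTTPS URL of the Jupyter server."""
    return f"https://{host}:{port}"


def self_signed_context():
    """TLS context that skips verification (self-signed certificate only)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def is_reachable(host, port=8081, timeout=5):
    """True if an HTTP response of any status comes back from the server."""
    try:
        with urllib.request.urlopen(
            jupyter_url(host, port),
            context=self_signed_context(),
            timeout=timeout,
        ):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error status
    except OSError:
        return False  # refused, timed out, TLS failure, DNS failure, ...


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_jupyter.py <Elastic IP address>
    print(jupyter_url(sys.argv[1]), "reachable:", is_reachable(sys.argv[1]))
```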




[JupyterLab Permanent Startup]
Create a systemd unit file so that JupyterLab starts at boot and is restarted if it exits:

$ sudo vi /etc/systemd/system/jupyter.service

[Unit]
Description=Jupyter Lab
After=syslog.target
[Service]
Type=simple
WorkingDirectory=/home/ubuntu/work
Restart=always
ExecStart=/opt/anaconda/anaconda3/bin/jupyter-lab
User=ubuntu
Group=ubuntu
[Install]
WantedBy=multi-user.target

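Reload systemd so it picks up the new unit file, start the service, check its status, and enable it at boot:

$ sudo systemctl daemon-reload
$ sudo systemctl start jupyter
$ sudo systemctl status jupyter
$ sudo systemctl enable jupyter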
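
Misspelled keys in a unit file are easy to miss, because systemd only logs a warning and ignores them. The following is a rough sketch that parses the unit text with Python's configparser and reports keys outside a small allow-list. The allow-list covers only the keys used in this post, and unexpected_keys is a hypothetical helper. It is not a replacement for systemd's own checker (systemd-analyze verify).

```python
import configparser

# Keys used by the unit file in this post; extend as needed.
EXPECTED_KEYS = {
    "Unit": {"Description", "After"},
    "Service": {"Type", "WorkingDirectory", "Restart",
                "ExecStart", "User", "Group"},
    "Install": {"WantedBy"},
}


def unexpected_keys(unit_text):
    """Return "[Section] Key" strings for keys not in EXPECTED_KEYS."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep the original key capitalization
    parser.read_string(unit_text)
    problems = []
    for section in parser.sections():
        allowed = EXPECTED_KEYS.get(section, set())
        for key in parser[section]:
            if key not in allowed:
                problems.append(f"[{section}] {key}")
    return problems


if __name__ == "__main__":
    sample = (
        "[Unit]\nDescrption = Jupyter Lab\n"  # deliberate misspelling
        "[Service]\nExecStart=/opt/anaconda/anaconda3/bin/jupyter-lab\n"
        "[Install]\nWantedBy=multi-user.target\n"
    )
    print(unexpected_keys(sample))
```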
