How to Set Up an AWS Server and NVIDIA Xavier NX2 for Deep Learning?

Surya Murugavel Ravishankar
Mar 9, 2021 · 5 min read


Requirements:

  • AWS Server: Amazon EC2 (G3, Tesla M60)
  • Client: Ubuntu 18.04.5 LTS (optional)
  • Board: NVIDIA Xavier NX2

Table of Contents:

  • How to log in to your AWS instance through SSH
  • Initial setup for the AWS server (NVIDIA driver, CUDA and cuDNN, TensorFlow GPU)
  • Initial setup for the NVIDIA Xavier NX2 (JetPack, CUDA, TensorFlow GPU)

How to Log In to your AWS instance through SSH?

You will need the id_rsa and id_rsa.pub key files created for the server. Navigate to the folder containing these files in a terminal and enter:

ssh -i id_rsa your_username@IP_Address_of_AWS_instance

You might have to set up a password for your first-time login, and use it to log in later whenever required. Make sure your username has “sudo” access!
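
If ssh complains that the key permissions are too open, restricting them usually fixes it (a minimal sketch, assuming the key file sits in the current folder):

chmod 400 id_rsa   # ssh refuses private keys that other users can read
sudo -v            # after logging in, confirms your user actually has sudo rights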

Initial setup for the AWS server:

  • CUDA version 10.1
  • TensorFlow version 2.2.0rc2

After logging into the AWS server, enter the following in the terminal to bring the installed packages up to date:

sudo apt-get update
sudo apt-get upgrade

NVIDIA Driver Installation:

Now download the NVIDIA driver from the official site in your client browser, choosing the following options:

  • Product Type: Data Center / Tesla
  • Product Series: M-Class
  • Product: M60
  • Operating System: Linux 64-bit Ubuntu 16.04
  • CUDA Toolkit: 10.1
  • Language: English (US)
  • Recommended/Beta: All

After downloading the NVIDIA driver (version 418.*), send the file from the client to the server using the “scp” command, run from the folder containing the download:

scp -i id_rsa NVIDIA-Linux-x86_64*.run your_username@IP_Address_of_AWS_instance:~/
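
While the file transfers, it is worth making sure a compiler and kernel headers are present on the server, since the driver installer builds a kernel module (package names below assume a stock Ubuntu image):

sudo apt-get install -y gcc make linux-headers-$(uname -r)   # build tools for the driver's kernel module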

Once the file has been copied, install the driver and restart the server with:

sudo /bin/sh ./NVIDIA-Linux-x86_64*.run
sudo shutdown -r now

Your connection will be closed and you will be able to connect again after a while! Don’t panic. Enter the following command after logging into the server:

nvidia-smi

If the installation was successful, you should see a table listing the Tesla M60 GPU and the installed driver version (the exact versions will of course differ).
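
For a more targeted check, nvidia-smi can print just the fields of interest using its standard query flags:

nvidia-smi --query-gpu=name,driver_version --format=csv   # should list the Tesla M60 and driver 418.*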

Congrats, you are one step closer to the apocalypse!

CUDA and cuDNN installation:

Download the CUDA (version 10.1) and cuDNN (v7.6.5, released November 5th, 2019, for CUDA 10.1) installers on the client system and send them to the AWS server with the “scp” command:

scp -i id_rsa cuda_10.1.105_418.39_linux.run your_username@IP_Address_of_AWS_instance:~/
scp -i id_rsa cudnn-10.1-linux-x64-v7.6.5.32.tgz your_username@IP_Address_of_AWS_instance:~/

Note that the version you are using might be different, edit the above command accordingly!

Now log in to the AWS server and install CUDA after making the file executable:

chmod +x cuda_10.1.105_418.39_linux.run
sudo ./cuda_10.1.105_418.39_linux.run

Now extract the cuDNN archive and copy its contents into the CUDA installation directory:

tar -xzf cudnn-10.1-linux-x64-v7.6.5.32.tgz
sudo cp cuda/lib64/libcudnn* /usr/local/cuda-10.1/lib64/
sudo cp cuda/include/cudnn.h /usr/local/cuda-10.1/include/
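
You may also need to make the copied files readable by all users, a common follow-up step when installing cuDNN by hand:

sudo chmod a+r /usr/local/cuda-10.1/include/cudnn.h /usr/local/cuda-10.1/lib64/libcudnn*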

For the system to find the CUDA installation, we have to add the installation directories to the environment manually by editing the “bash.bashrc” file:

sudo nano /etc/bash.bashrc

This opens the file in the nano text editor; add the following lines to the end of the file:

export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Again, make sure the CUDA version you installed matches what you enter here. Save and exit the editor with Ctrl+X and enter “Y” to accept the changes. Now, to check whether the installation was a success, enter the following:

source /etc/bash.bashrc
sudo shutdown -r now
nvcc -V

After logging back in, nvcc should report the CUDA compiler version, release 10.1 in this case.
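
To confirm cuDNN is in place as well, you can read its version straight from the header copied earlier (for cuDNN 7.x the version defines live in cudnn.h):

grep -A 2 CUDNN_MAJOR /usr/local/cuda-10.1/include/cudnn.h   # should print 7, 6, 5 for v7.6.5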

Tensorflow GPU installation:

sudo apt-get install python3-pip
sudo pip3 install -U pip
pip3 install tensorflow-gpu==2.2.0rc2

This should take a while. If everything has been set up properly, you should be able to import TensorFlow and run the following in Python:

python3
import tensorflow as tf
tf.test.is_gpu_available()

If this returns True, the NVIDIA driver, CUDA, and TensorFlow with GPU support are all working. Congrats, you have saved yourself 2 weeks’ worth of headache (Reference)!
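
As a side note, tf.test.is_gpu_available() is deprecated in recent TensorFlow releases; an equivalent one-liner check is:

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

A non-empty list means TensorFlow can see the GPU.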

Initial setup for the NVIDIA Xavier NX2:

  • JetPack version 4.4
  • CUDA version 10.2
  • TensorFlow version 2.2

Initial Setup:

Download the JetPack SDK 4.4 image for the “JETSON XAVIER NX DEVELOPER KIT” and write it to your SD card by following the steps in the NVIDIA Forum. JetPack already ships with CUDA and the NVIDIA drivers. I had a LAN connection, monitor, keyboard, and mouse attached to the development board.

Important: if your SD card is larger than 16 GB, follow the steps below to utilize the full card.

The first boot runs the oem-config setup wizard, but the default root disk partition is only 14 GiB. One approach to make it bigger is:

In the SDK Manager’s “Linux_for_Tegra” subdirectory, before flashing, edit “jetson-xavier-nx-devkit.conf” and add the line:

ROOTFSSIZE=30GiB

Then, on the NVIDIA Xavier NX2 development board, grow the filesystem to fill the partition:

sudo resize2fs /dev/mmcblk0p1
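
You can confirm that the root filesystem now spans the full card with a quick check:

df -h /   # the size of / should roughly match your SD card capacity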

Then run the following commands:

sudo apt-get update
sudo apt-get upgrade
sudo apt-get autoremove
sudo apt-get autoclean
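
To double-check which L4T/JetPack release the board is running (this file is present on standard JetPack images):

cat /etc/nv_tegra_release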

Reboot the device, and enter the following in the terminal:

nvcc -V 

If it doesn’t work, add the CUDA directory to the “.bashrc” file after installing “nano”:

sudo apt-get install nano
sudo nano ~/.bashrc

Add the following lines to the end of the file:

export PATH=$PATH:/usr/local/cuda-10.2/bin
export CUDADIR=/usr/local/cuda-10.2
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.2/lib64

Save and exit by pressing Ctrl+x and entering “Y” for the prompt.
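
Then reload the file and re-run the check:

source ~/.bashrc
nvcc -V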

Tensorflow GPU installation:

sudo apt-get install libhdf5-serial-dev hdf5-tools libhdf5-dev zlib1g-dev zip libjpeg8-dev liblapack-dev libblas-dev gfortran
sudo apt-get install python3-pip
sudo pip3 install -U pip
sudo pip3 install -U pip testresources setuptools numpy==1.16.1 future==0.17.1 mock==3.0.5 keras_preprocessing==1.0.5 keras_applications==1.0.8 gast==0.2.2 futures protobuf pybind11 h5py==2.9.0
sudo pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 tensorflow==2.2.0+nv20.7

This will take a while! If everything has been set up properly, you should be able to import TensorFlow and run the following in Python:

python3
import tensorflow as tf
tf.test.is_gpu_available()

If this returns True, the NVIDIA drivers, CUDA, and TensorFlow with GPU support are working on the Jetson as well. Congrats again, you have saved yourself another 2 weeks’ worth of headache (Reference)!
