How to Create an Ubuntu Server to Build an AI Product Using Docker


— An Ubuntu Machine

Let’s start with Ubuntu 20.04. First, refresh the package index with apt-get, the command-line front end to APT, the package manager used by Ubuntu and other Debian-based distributions. Then install git so that you can check out your codebase whenever needed. The code below forms the very first lines of your Dockerfile.

FROM ubuntu:20.04
RUN apt-get update && apt-get -y install sudo
RUN sudo apt-get -y install software-properties-common
### INSTALL GIT
RUN sudo apt-get -y install git
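
As a possible variant of the base layer, the sketch below installs the same three packages in a single RUN instruction. It assumes that some packages on ubuntu:20.04 (tzdata is a common one) may ask configuration questions during installation, so it sets DEBIAN_FRONTEND for the build only. This is one way to arrange it, not the only one.

```dockerfile
FROM ubuntu:20.04
# Assumption: silence interactive prompts (such as tzdata) during the build only
ARG DEBIAN_FRONTEND=noninteractive
# Docker builds run as root by default, so apt-get is called directly here.
# sudo is still installed because the later sections call it.
RUN apt-get update \
 && apt-get install -y sudo software-properties-common git \
 && git --version
```

Note that sudo resets the environment by default, so the DEBIAN_FRONTEND setting may not reach commands that are prefixed with sudo.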

— How to Install Python 3.7 on Ubuntu

You can install a clean Python 3.7 on your Docker machine by building it from source with the code below. Note that you will still need to install the libraries required by your own project. Here, I tried to keep it simple and clean.

### INSTALL PYTHON
RUN sudo apt-get -y install libssl-dev openssl
RUN sudo apt-get -y install libreadline-gplv2-dev libffi-dev
RUN sudo apt-get -y install libncursesw5-dev libsqlite3-dev
RUN sudo apt-get -y install tk-dev libgdbm-dev libc6-dev libbz2-dev
RUN sudo apt-get -y install wget
# -y is required because the build cannot answer the confirmation prompt
RUN apt-get update && apt-get -y install make
RUN sudo apt-get -y install zlib1g-dev
RUN apt-get -y install gcc mono-mcs && \
    rm -rf /var/lib/apt/lists/*
RUN wget https://www.python.org/ftp/python/3.7.10/Python-3.7.10.tgz
RUN tar xzvf Python-3.7.10.tgz
# Configure and build in one layer, inside the source directory
RUN cd Python-3.7.10 && ./configure && sudo make install
# A shell alias does not persist across layers, so link python to python3 instead
RUN ln -sf /usr/local/bin/python3 /usr/local/bin/python
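
To install the libraries your own project needs, a step along the following lines could come right after the Python build. This is only a sketch: it assumes a requirements.txt file exists at the root of your build context, which is a hypothetical name and not part of the original setup.

```dockerfile
# Confirm that the interpreter and pip built from source are on the PATH
RUN python3 --version && pip3 --version
# Hypothetical file: a requirements.txt at the root of the build context
COPY requirements.txt /tmp/requirements.txt
# --no-cache-dir keeps pip's download cache out of the image layer
RUN pip3 install --no-cache-dir -r /tmp/requirements.txt
```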

— How to Install DATABRICKS-CONNECT on Ubuntu

If you work with real-world data, you may need to process a volume of data that cannot be managed easily on a small VPS or even a powerful dedicated bare-metal server. Databricks offers a useful service that lets you connect to a highly optimized Databricks cluster for big data processing. Its Spark-enabled machines let you process large datasets and build large AI models far faster than a single machine could. To do so, you must connect to the Databricks cluster through a library called databricks-connect. You can install databricks-connect on Ubuntu using the code below. To learn more about the step-by-step configuration of a Databricks cluster, check this article: How to Connect a Local or Remote Machine to a Databricks Cluster.

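### INSTALL JAVA
# -y skips the confirmation prompt, which a Docker build cannot answer
RUN sudo add-apt-repository -y ppa:openjdk-r/ppa
RUN sudo apt-get update && sudo apt-get install -y openjdk-8-jre
### INSTALL DATABRICKS-CONNECT
RUN pip3 install --upgrade pip
# Remove any existing pyspark, which conflicts with databricks-connect
RUN pip3 uninstall -y pyspark
RUN pip3 install -U databricks-connect==7.3.*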
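
Once the library is installed, the connection details can be supplied through environment variables instead of the interactive configure step. The sketch below shows one way this might look. All values are hypothetical placeholders, and the JAVA_HOME path assumes the OpenJDK 8 package on an amd64 image.

```dockerfile
# Assumption: OpenJDK 8 install path on an amd64 Ubuntu image
ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
# Hypothetical placeholder values; replace them with your workspace details
ENV DATABRICKS_ADDRESS="https://<your-workspace-url>"
ENV DATABRICKS_CLUSTER_ID="<your-cluster-id>"
ENV DATABRICKS_PORT=15001
# Do not bake DATABRICKS_API_TOKEN into the image; pass it when the container starts
```

With the token supplied at runtime, running databricks-connect test inside the container is a reasonable way to check that the cluster is reachable.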

— How to Install DOCKER and DOCKER-COMPOSE on Ubuntu

Let’s say you are working on an API service that runs an AI engine under the hood. Every time the AI model is updated, you must build a new image of the solution. In this case, you need to install Docker on your machine so that you can run commands such as docker build or docker push. The code below installs Docker and Docker Compose on an Ubuntu machine.

### INSTALL DOCKER
RUN apt-get update && apt-get install -y curl
RUN curl -fsSL https://get.docker.com -o get-docker.sh
RUN sudo sh get-docker.sh
### INSTALL DOCKER-COMPOSE
RUN sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
RUN sudo chmod +x /usr/local/bin/docker-compose
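
A quick sanity check, in the same spirit as the version checks used for Node.js later on, can confirm that both binaries were installed. This sketch only prints the client versions. It does not start a Docker daemon, so commands such as docker build will still need a daemon to be available when the container runs.

```dockerfile
# Print the client versions to confirm both binaries are on the PATH.
# These commands do not need a running Docker daemon.
RUN docker --version && docker-compose --version
```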

— How to Install GCLOUD and KUBECTL on Ubuntu

If you want to use Google Cloud services for your product, you must install gcloud, the Google Cloud command-line tool, to communicate with the Google infrastructure. For example, if you want to use the Container Registry service to store the Docker images built during the project, you need gcloud. In addition, if you want to deploy your solution on this infrastructure using Google Kubernetes Engine, you must install kubectl. Note that you can also use other providers such as AWS or Azure, which need their own configuration.

### INSTALL GCLOUD
RUN curl https://dl.google.com/dl/cloudsdk/release/google-cloud-sdk.tar.gz > /tmp/google-cloud-sdk.tar.gz
RUN mkdir -p /usr/local/gcloud \
&& tar -C /usr/local/gcloud -xvf /tmp/google-cloud-sdk.tar.gz \
&& /usr/local/gcloud/google-cloud-sdk/install.sh
ENV PATH="$PATH:/usr/local/gcloud/google-cloud-sdk/bin"
### INSTALL KUBECTL
RUN curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
RUN chmod +x ./kubectl
RUN mv ./kubectl /usr/local/bin
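
To confirm that both tools ended up on the PATH, a check like the one below could be appended. It is a minimal sketch that only prints client-side version information, so it needs neither Google Cloud credentials nor a cluster.

```dockerfile
# Fail the build early if either CLI is missing from the PATH
RUN gcloud --version && kubectl version --client
```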

— How to Install NODE.JS on Ubuntu

If you want to create a simple user interface for your AI engine, you will most likely need to install Node.js. I highly recommend keeping the installation clean and light to avoid memory issues down the road.

### INSTALL NODE.JS
ENV NODE_VERSION=12.6.0
RUN apt-get update && apt-get install -y curl
RUN curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.34.0/install.sh | bash
ENV NVM_DIR=/root/.nvm
RUN . "$NVM_DIR/nvm.sh" && nvm install ${NODE_VERSION}
RUN . "$NVM_DIR/nvm.sh" && nvm use v${NODE_VERSION}
RUN . "$NVM_DIR/nvm.sh" && nvm alias default v${NODE_VERSION}
ENV PATH="/root/.nvm/versions/node/v${NODE_VERSION}/bin/:${PATH}"
RUN node --version
RUN npm --version
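
With Node.js in place, the dependencies of the user interface could be installed roughly as follows. This is a sketch under assumptions: the ui directory and the /app/ui path are hypothetical names for where your front-end code lives.

```dockerfile
# Hypothetical layout: the UI source lives in ./ui next to this Dockerfile
WORKDIR /app/ui
# Copy the manifests first so the dependency layer is cached between builds
COPY ui/package*.json ./
RUN npm install
# Copy the rest of the UI source after the dependencies are installed
COPY ui/ ./
```

Copying the manifests before the source means a change to the UI code does not force the dependencies to be reinstalled, which helps keep rebuilds light.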

— Last Words

You may find some redundancies in the code above. For example, I run apt-get update several times. I did this to make sure each part can be executed standalone. Last but not least, it took me some time to find out how to install these essential tools and libraries on an Ubuntu machine. I hope this expedites your development.
