Building a Bioinformatics Docker Environment with Python, R, and Jupyter Lab
This post documents a Docker-based development environment tailored for bioinformatics research, combining Python, R (with Seurat for single-cell analysis), and Jupyter Lab in a single container with proper non-root user setup.
Overview
The setup provides:
- Non-root user configuration for better security and file permissions
- Python 3 with virtual environments and Jupyter Lab
- R with Seurat, ggplot2, data.table, and sf packages
- Chinese mirror sources (Tsinghua University) for faster package downloads
- Docker Compose profiles for development and release environments
- Host network mode for seamless port access
 
Dockerfile
# docker build --build-arg UID=$(id -u) --build-arg GID=$(id -g) --build-arg USERNAME=biouser -t ubuntu-nonroot .
# FROM ubuntu:jammy
FROM ubuntu:25.04
# Install common packages
# Replace Ubuntu mirrors with Tsinghua
# ubuntu:jammy: /etc/apt/sources.list
RUN sed -i 's|archive.ubuntu.com|mirrors.tuna.tsinghua.edu.cn|g' /etc/apt/sources.list.d/ubuntu.sources \
    && sed -i 's|security.ubuntu.com|mirrors.tuna.tsinghua.edu.cn|g' /etc/apt/sources.list.d/ubuntu.sources \
    && apt update
# the adduser package provides deluser and delgroup
RUN apt install -y sudo adduser
# Create a non-root user
ARG USERNAME=biouser
ARG UID=1000
ARG GID=1000
# Remove the default ubuntu user so its UID/GID can be reused.
# deluser --remove-home requires perl, so the home directory is left in place.
# The build args are used here so that --build-arg UID/GID/USERNAME take effect.
RUN deluser ubuntu || true \
    && delgroup ubuntu || true \
    && groupadd -g $GID $USERNAME \
    && useradd -m -u $UID -g $USERNAME -s /bin/bash $USERNAME \
    && echo "$USERNAME ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
# Switch to the new user
USER $USERNAME
WORKDIR /home/$USERNAME
ENV HOME=/home/$USERNAME
# command-not-found needs a new login and an apt update to populate its database
RUN sudo apt install -y command-not-found
RUN sudo apt update
# Set the timezone up front so tzdata does not prompt for input during later apt installs
ENV TZ=Asia/Shanghai
RUN sudo ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ | sudo tee /etc/timezone
RUN sudo apt install -y \
    nano vim \
    build-essential cmake
# ------------------------------------------------------------------- python
RUN sudo apt install -y python3 python3-pip python3-virtualenv
# RUN pip config set global.index-url https://mirrors.cloud.tencent.com/pypi/simple
RUN pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
# pip installs some executables into the ~/.local/bin folder
ENV PATH=$HOME/pyenvs/bio/bin:$HOME/.local/bin:$PATH
ENV BIOPY=$HOME/pyenvs/bio
RUN virtualenv $BIOPY
RUN echo "source $HOME/pyenvs/bio/bin/activate" >> $HOME/.bashrc
# ------------------------------------------------------------------- jupyter
RUN $BIOPY/bin/pip install jupyterlab
ENV JUPYTER_CONFIG_DIR=$HOME/.jupyter
RUN mkdir -p $JUPYTER_CONFIG_DIR
# Set hashed password
RUN echo '{ \
  "IdentityProvider": { \
    "hashed_password": "<replace with the hash generated by jupyter server password>" \
  } \
}' > $JUPYTER_CONFIG_DIR/jupyter_server_config.json
# ------------------------------------------------------------------- R core
RUN sudo apt install -y r-base r-base-dev
# Speed up R compilation
RUN echo "MAKEFLAGS=-j$(nproc)" | sudo tee -a /etc/R/Makeconf
ENV MAKEFLAGS="-j8"
# Create a per-user library for the installed R version and put it on the library path
RUN R_VERSION=$(Rscript -e 'cat(paste0(R.version$major, ".", strsplit(R.version$minor, "[.]")[[1]][1]))') && \
    mkdir -p ~/R/x86_64-pc-linux-gnu-library/${R_VERSION} && \
    echo ".libPaths('~/R/x86_64-pc-linux-gnu-library/${R_VERSION}')" >> ~/.Rprofile
# The default repos sometimes cause download errors; the Tsinghua mirror is very stable
RUN echo 'options(repos = c(CRAN = "https://mirrors.tuna.tsinghua.edu.cn/CRAN/"))' >> ~/.Rprofile
RUN Rscript -e 'install.packages("IRkernel"); IRkernel::installspec(user = TRUE)'
# ------------------------------------------------------------------- R application
# required by Seurat (as hinted by its installation messages)
RUN sudo apt install -y libcurl4-openssl-dev libssl-dev
# Matrix is already bundled with R-base
RUN Rscript -e 'install.packages("Seurat")'
RUN Rscript -e 'install.packages(c("ggplot2", "data.table"))'
# system libraries required by sf (libssl-dev and cmake are already installed above)
RUN sudo apt install -y libudunits2-dev libgdal-dev libgeos-dev libproj-dev libsqlite3-dev
RUN Rscript -e 'install.packages("sf")'
# ------------------------------------------------------------------- jupyter service
ENV JUPYTER_PORT=8889
# set SHELL and use -l (login shell) so the terminal in Jupyter gets a normal prompt
ENV SHELL=/bin/bash
CMD ["bash", "-lc", "jupyter lab --ip=0.0.0.0 --port=$JUPYTER_PORT --no-browser --notebook-dir=/"]
# Keep container alive for background dev
# CMD ["sleep", "infinity"]
Key R packages:
- Seurat: Single-cell RNA-seq analysis
- ggplot2: Data visualization
- data.table: Fast data manipulation
- sf: Spatial data handling
 
Docker Compose Configuration
version: "3.9"
services:
  # =========================================================
  # Development profile: full rebuilds, uses local Dockerfile
  # =========================================================
  biodev:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        UID: ${UID:-1000}
        GID: ${GID:-1000}
        USERNAME: biouser
    image: bio:dev
    container_name: bio-dev
    hostname: docker-dev
    profiles: ["dev"]
    environment:
      - JUPYTER_PORT=8887
    extra_hosts:
      - "docker-dev:127.0.0.1"
    volumes:
      - /work:/work
    working_dir: /work
    tty: true
    network_mode: host
    shm_size: "2g"
  # =========================================================
  # Release profile: use prebuilt stable image
  # =========================================================
  biorelease:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        UID: ${UID:-1000}
        GID: ${GID:-1000}
        USERNAME: biouser  
    image: bio:release
    container_name: bio-release
    hostname: docker
    profiles: ["release"]
    restart: unless-stopped
    environment:
      - JUPYTER_PORT=8889
    extra_hosts:
      - "docker:127.0.0.1"
    volumes:
      - /work:/work
    working_dir: /work
    tty: true
    network_mode: host
    shm_size: "2g"
Key Features
- Profiles: Separate dev and release profiles for different use cases
- Host network mode: Direct access to ports on the host
- Volume mounts: Mounts the host /work directory at /work in the container
- Shared memory: 2GB shm_size for applications that need it (e.g., R)
- TTY: Interactive terminal support
 
Usage
Build and Run Development Container
# Build and start development container
sudo docker compose -f docker-compose.yml --profile dev up -d --build
# Clean up unused images after build
sudo docker image prune -f
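The compose file fills the UID and GID build arguments from ${UID:-1000} and ${GID:-1000}. Bash defines UID as a shell variable but does not export it, and it has no GID variable at all, so Compose will typically fall back to 1000 for both. One possible workaround, sketched here, is to write the host values into a .env file in the project directory before building, since Compose reads that file for variable substitution. The helper name write_compose_env and the temporary directory are illustrative assumptions; in practice the target would be the directory containing docker-compose.yml.

```shell
set -eu

# Hypothetical helper: write the current user's UID and GID into <dir>/.env
# so that ${UID} and ${GID} in docker-compose.yml resolve to host values.
write_compose_env() {
  dir="$1"
  printf 'UID=%s\nGID=%s\n' "$(id -u)" "$(id -g)" > "$dir/.env"
}

# A temporary directory stands in for the compose project directory here.
project_dir="$(mktemp -d)"
write_compose_env "$project_dir"

# Show what Compose would read.
cat "$project_dir/.env"
```

Values exported in the calling shell take precedence over .env, so this only matters when UID and GID are not already exported.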
Run Pre-built Release Container
# Start release container (no rebuild)
sudo docker compose -f docker-compose.yml --profile release up -d
# Or build and run release
sudo docker compose --profile release -f docker-compose.yml up --pull never --build -d && sudo docker image prune -f
Access Jupyter Lab
After starting the container, access Jupyter Lab at:
- Development: http://localhost:8887
- Release: http://localhost:8889
The password is the one whose hash is set in the Dockerfile. Replace the placeholder with your own hash, generated by jupyter server password, before building the image.
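As an alternative to editing the JSON inside the Dockerfile, the same config file can be written by a small script, for example into a directory that is later copied or mounted into the container. This is a minimal sketch, assuming the hash was already produced by jupyter server password and contains no double quotes or backslashes; write_jupyter_config and the placeholder hash value are hypothetical.

```shell
set -eu

# Hypothetical helper: write jupyter_server_config.json with a pre-generated hash.
# Arguments: <config_dir> <hashed_password>
write_jupyter_config() {
  config_dir="$1"
  hashed="$2"
  mkdir -p "$config_dir"
  # %s inserts the hash verbatim, so any "$" characters in it are not expanded.
  printf '{\n  "IdentityProvider": {\n    "hashed_password": "%s"\n  }\n}\n' \
    "$hashed" > "$config_dir/jupyter_server_config.json"
}

# A temporary directory stands in for $JUPYTER_CONFIG_DIR; the hash is a placeholder.
out_dir="$(mktemp -d)"
write_jupyter_config "$out_dir" 'argon2:REPLACE_WITH_REAL_HASH'
cat "$out_dir/jupyter_server_config.json"
```

Passing the hash in single quotes keeps the shell from interpreting the "$" separators that real hashes contain.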
Access Container Shell
sudo docker exec -it bio-release bash
Configuration Notes
Timezone
The Dockerfile sets the timezone to Asia/Shanghai:
ENV TZ=Asia/Shanghai
RUN sudo ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ | sudo tee /etc/timezone
Default Command
The container runs Jupyter Lab by default:
ENV SHELL=/bin/bash
CMD ["bash", "-lc", "jupyter lab --ip=0.0.0.0 --port=$JUPYTER_PORT --no-browser --notebook-dir=/"]
Best Practices
- User IDs: Match container UID/GID with host user for volume mounts
- Password Security: Replace the hardcoded Jupyter password hash with your own
- Mirror Selection: Adjust package mirrors based on your geographic location
- Resource Limits: Consider adding memory/CPU limits in docker-compose for production
- Volume Permissions: Ensure mounted directories have appropriate permissions
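
For the points on user IDs and volume permissions, a quick check is to compare the owner of the mounted directory with the IDs the container user will have. This is a minimal sketch, assuming GNU stat (as shipped on Ubuntu); owner_matches is a hypothetical helper, and the temporary directory stands in for /work.

```shell
set -eu

# Hypothetical helper: succeed if <dir> is owned by the given numeric UID and GID.
# Uses GNU stat's -c format option, so it is Linux-specific.
owner_matches() {
  dir="$1"
  uid="$2"
  gid="$3"
  [ "$(stat -c '%u:%g' "$dir")" = "$uid:$gid" ]
}

# A temporary directory stands in for the mounted /work directory.
demo_dir="$(mktemp -d)"

if owner_matches "$demo_dir" "$(id -u)" "$(id -g)"; then
  echo "ownership matches the current user"
else
  echo "ownership differs: $(stat -c '%u:%g' "$demo_dir")"
fi
```

Run on the host against /work with the UID and GID passed to the build, a mismatch suggests the container user may not be able to write to the mount.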
 
Summary
This Docker setup provides a complete bioinformatics environment with:
- Secure non-root user execution
- Python and R ecosystems
- Jupyter Lab for interactive analysis
- Optimized package installation via Chinese mirrors
- Flexible deployment via Docker Compose profiles
 
It is well suited to single-cell RNA-seq analysis with Seurat, general data science work, and reproducible research environments.
