I have been building some Python Docker images recently. Best practice is obviously not to run containers as the root user, and to remove sudo privileges from the non-privileged user.
But I have been wondering what's the best way to go about this.
Here is an example Dockerfile:
FROM python:3.10
## get UID/GID of host user for remapping to access bindmounts on host
ARG UID
ARG GID
## add a user with same GID and UID as the host user that owns the workspace files on the host (bind mount)
RUN adduser --uid ${UID} --gid ${GID} --no-create-home flaskuser
RUN usermod -aG sudo flaskuser
## install packages as root?
RUN apt update \
&& apt upgrade -y \
&& apt-get install -y --no-install-recommends python3-pip \
#&& [... install some packages ...]
&& apt-get install -y uwsgi-plugin-python3 \
## cleanup
&& apt-get clean \
&& apt-get autoclean \
&& apt-get autoremove --purge -y \
&& rm -rf /var/lib/apt/lists/*
## change to workspace folder and copy requirements.txt
WORKDIR /workspace/web
COPY ./requirements.txt /tmp/requirements.txt
RUN chown flaskuser:users /tmp/requirements.txt
## Install python packages as root?
RUN python3 -m pip install --disable-pip-version-check --no-cache-dir -r /tmp/requirements.txt
RUN chmod -R 777 /usr/local/lib/python3.11/site-packages/*
ENV PYTHONUNBUFFERED 1
ENV PYTHONPATH "${PYTHONPATH}:/workspace/web"
ENV PYTHONPATH "${PYTHONPATH}:/usr/local/lib/python3.10/site-packages"
## change to non-privileged user to run container
USER flaskuser
CMD ["uwsgi", "uwsgi.ini"]
So my questions are:
Is installing packages with apt-get as root ok or should these be installed with the non-privileged user (with sudo which later should be removed)?
Best location to install these packages, i.e. /usr/local/ (as default when installing as root) or would it be preferable to install in user home?
When installing python packages with pip as root, I get the following warning
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
(However I don't need a venv since the docker image is already isolated for a single service, so I guess I can just ignore that warning).
Anything else I am missing or should be aware of?
NB: the bind mounted workspace is only for development, for a production image I would copy the necessary files/artifacts into the image/container.
Thanks
In general, the easiest safe approach is to do everything in your Dockerfile as the root user until the very end, at which point you can declare an alternate USER that gets used when you run the container.
FROM ???
# Debian adduser(8); this does not have a specific known uid
RUN adduser --system --no-create-home nonroot
# ... do the various install and setup steps as root ...
# Specify metadata for when you run the container
USER nonroot
EXPOSE 12345
CMD ["my_application"]
For your more specific questions:
Is installing packages with apt-get as root ok?
It's required; apt-get won't run as non-root. If you have a base image that switches to a non-root user, you need to switch back with USER root before you can run apt-get commands.
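As a rough sketch of that switch-back pattern: the base image name `some-base-image` and its user `app` are hypothetical placeholders, not anything from the question. The package name is the one the question installs.

```dockerfile
# Hypothetical base image that ends with "USER app" (both names are assumptions)
FROM some-base-image

# Switch back to root so that apt-get is allowed to run
USER root
RUN apt-get update \
 && apt-get install -y --no-install-recommends uwsgi-plugin-python3 \
 && rm -rf /var/lib/apt/lists/*

# Return to the unprivileged user for everything that follows
USER app
```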
Best location to install these packages?
The normal system location. If you're using apt-get to install things, it will put them in /usr and that's fine; pip install will want to install things into the system Python site-packages directory; and so on. If you're installing things by hand, /usr/local is a good place for them, particularly since /usr/local/bin is usually in $PATH. The "user home directory" isn't a well-defined concept in Docker and I wouldn't try to use it.
When installing python packages with pip as root, I get the following warning...
You can in fact ignore it, with the justification you state. There are two common paths to using pip in Docker: the one you show where you pip install things directly into the "normal" Python, and a second path using a multi-stage build to create a fully-populated virtual environment that can then be COPYed into a runtime image without build tools. In both cases you'll still probably want to be root.
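The second path might look roughly like the sketch below. It assumes a `requirements.txt` next to the Dockerfile, an entry point named `app.py` (hypothetical), and that both stages use the same Python version and install location so the copied virtual environment keeps working. The `/venv` and `/app` paths and the `nonroot` user name are illustrative choices.

```dockerfile
# Build stage: the full image includes compilers for packages that need them
FROM python:3.10 AS build
RUN python3 -m venv /venv
COPY requirements.txt /tmp/requirements.txt
RUN /venv/bin/pip install --no-cache-dir -r /tmp/requirements.txt

# Runtime stage: smaller image without the build tools
FROM python:3.10-slim
COPY --from=build /venv /venv
# Put the virtual environment first on PATH so "python3" resolves to it
ENV PATH=/venv/bin:$PATH
RUN adduser --system --no-create-home nonroot
WORKDIR /app
COPY . .
USER nonroot
# Assumption: app.py is the application's entry point
CMD ["python3", "app.py"]
```

Every step before the final USER line still runs as root here, in both stages. Packages with compiled extensions may also need their shared libraries installed in the runtime stage.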
Anything else I am missing or should be aware of?
In your Dockerfile:
## get UID/GID of host user for remapping to access bindmounts on host
ARG UID
ARG GID
This is not a best practice, since it means you'll have to rebuild the image whenever someone with a different host uid wants to use it. Create the non-root user with an arbitrary uid, independent from any specific host user.
RUN usermod -aG sudo flaskuser
If your "non-root" user has unrestricted sudo access, they are effectively root. sudo has some significant issues in Docker and is never necessary, since every path to run a command also has a way to specify the user to run it as.
RUN chown flaskuser:users /tmp/requirements.txt
Your code and other source files should have the default root:root ownership. By default they will be world-readable but not writeable, and that's fine. You want to prevent your application from overwriting its own source code, intentionally or otherwise.
RUN chmod -R 777 /usr/local/lib/python3.11/site-packages/*
chmod 0777 is never a best practice. It gives a place for unprivileged code to write out their malware payloads and execute them. For a typical Docker setup you don't need chmod at all.
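Putting those points together, a revised version of the Dockerfile from the question could look something like this. It is only a sketch: the package name, paths, and CMD are carried over unchanged from the question, the `COPY . .` line assumes the application source sits next to the Dockerfile, and the PYTHONPATH settings are left out here even though your setup may still need them.

```dockerfile
FROM python:3.10

# System packages are installed as root; package name taken from the question
RUN apt-get update \
 && apt-get install -y --no-install-recommends uwsgi-plugin-python3 \
 && rm -rf /var/lib/apt/lists/*

# Arbitrary system uid, no ARG UID/GID, and no sudo group membership
RUN adduser --system --no-create-home flaskuser

WORKDIR /workspace/web

# Copied files keep the default root:root ownership; no chown or chmod
COPY requirements.txt /tmp/requirements.txt
RUN python3 -m pip install --disable-pip-version-check --no-cache-dir -r /tmp/requirements.txt

# Assumption: the application source lives in the build context
COPY . .

ENV PYTHONUNBUFFERED=1

# Only the final runtime user is unprivileged
USER flaskuser
CMD ["uwsgi", "uwsgi.ini"]
```

If a particular host uid is needed for a development bind mount, it can be supplied at run time with the `-u` option of `docker run` rather than baked into the image, so no rebuild is required.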
The bind mounted workspace is only for development, for a production image I would copy the necessary files/artifacts into the image/container.
If you use a bind mount to overwrite all of the application code with content from the host, then you're not actually running the code from the image, and some or all of the Dockerfile's work will just be lost. This means that, when you go to production without the bind mount, you're running an untested setup.
Since your development environment will almost always be different from your production environment in some way, I'd recommend using a non-Docker Python virtual environment for day-to-day development, having good (pytest) unit tests that can run outside the container, and doing integration testing on the built container before deploying.
Permission issues can also come up if your application is trying to write out files to a host directory. The best approach here is to restructure your application to avoid it, storing the data somewhere else, like a relational database. In this answer I discuss permission setup for a bind-mounted data directory, though that sounds a little different from what you're asking about here.