This article uses Nebula Graph as an example to show how to debug a process running inside a container as if it were running locally, without modifying the original container or installing any tools in it.
1. Motivation
During development and testing, we often deploy Nebula Graph using the vesoft-inc/nebula-docker-compose repo. To keep the Docker image of each Nebula Graph service as small as possible, none of the tools commonly used during development are installed in these images, not even the Vim editor.
This makes it hard to track down problems inside a container, because we have to install some tools every time before we can do anything else, which is tedious. In fact, there is another way to debug a process inside a container: it requires neither modifying the original container nor installing any tools in it.
This technique is quite common in Kubernetes (k8s) environments, where it is known as the sidecar pattern. The principle is simple: start another container and let it share the same pid/network namespace as the container you want to debug. The processes and network of the original container are then fully visible from the debugging container, which has all the tools you want installed, and the rest is up to you.
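To make the idea of a shared namespace concrete: on Linux, every process exposes its namespaces as symbolic links under /proc/<pid>/ns, and two processes are in the same namespace when the link targets are identical. The sketch below compares those links. The function name same_ns is hypothetical (it is not part of Docker or Nebula Graph), and reading another process's links may require extra privileges depending on how the container was started.

```shell
# same_ns PID_A PID_B TYPE
# Succeeds (exit 0) when both processes are in the same namespace of the
# given TYPE (pid, net, mnt, ...), fails with 1 when they differ, and
# returns 2 when one of the /proc links cannot be read.
# "self" may be used as a PID to refer to the calling side.
same_ns() {
  a="$(readlink "/proc/$1/ns/$3")" || return 2
  b="$(readlink "/proc/$2/ns/$3")" || return 2
  [ "$a" = "$b" ]
}
```

Inside a debugging container that shares only the pid namespace with its target, `same_ns 1 self pid` would be expected to succeed while `same_ns 1 self net` would fail.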
2. Demo
Next, I will demonstrate how this works.
Let’s first deploy a Nebula Graph cluster locally using the docker-compose method described above. For instructions, see the README in the repo. The result after deployment is as follows:
$ docker-compose up -d
Creating network "nebula-docker-compose_nebula-net" with the default driver
Creating nebula-docker-compose_metad1_1 ... done
Creating nebula-docker-compose_metad2_1 ... done
Creating nebula-docker-compose_metad0_1 ... done
Creating nebula-docker-compose_storaged2_1 ... done
Creating nebula-docker-compose_storaged1_1 ... done
Creating nebula-docker-compose_storaged0_1 ... done
Creating nebula-docker-compose_graphd_1 ... done
$ docker-compose ps
Name                                Command                          State                   Ports
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
nebula-docker-compose_graphd_1      ./bin/nebula-graphd --flag ...   Up (health: starting)   0.0.0.0:32907->13000/tcp, 0.0.0.0:32906->13002/tcp, 0.0.0.0:3699->3699/tcp
nebula-docker-compose_metad0_1      ./bin/nebula-metad --flagf ...   Up (health: starting)   0.0.0.0:32898->11000/tcp, 0.0.0.0:32896->11002/tcp, 45500/tcp, 45501/tcp
nebula-docker-compose_metad1_1      ./bin/nebula-metad --flagf ...   Up (health: starting)   0.0.0.0:32895->11000/tcp, 0.0.0.0:32894->11002/tcp, 45500/tcp, 45501/tcp
nebula-docker-compose_metad2_1      ./bin/nebula-metad --flagf ...   Up (health: starting)   0.0.0.0:32899->11000/tcp, 0.0.0.0:32897->11002/tcp, 45500/tcp, 45501/tcp
nebula-docker-compose_storaged0_1   ./bin/nebula-storaged --fl ...   Up (health: starting)   0.0.0.0:32901->12000/tcp, 0.0.0.0:32900->12002/tcp, 44500/tcp, 44501/tcp
nebula-docker-compose_storaged1_1   ./bin/nebula-storaged --fl ...   Up (health: starting)   0.0.0.0:32903->12000/tcp, 0.0.0.0:32902->12002/tcp, 44500/tcp, 44501/tcp
nebula-docker-compose_storaged2_1   ./bin/nebula-storaged --fl ...   Up (health: starting)   0.0.0.0:32905->12000/tcp, 0.0.0.0:32904->12002/tcp, 44500/tcp, 44501/tcp
We will demonstrate two scenarios: one for the process space and one for the network space. First, we need a handy debugging image. Rather than building one ourselves, we will use one that is already published on Docker Hub for this demonstration. If it later turns out to be insufficient, we can maintain our own nebula-debug image with all the debugging tools we want installed. For now, I will borrow nicolaka/netshoot, a solution from the community. First, pull the image locally:
$ docker pull nicolaka/netshoot
$ docker images
REPOSITORY               TAG       IMAGE ID       CREATED        SIZE
vesoft/nebula-graphd     nightly   c67fe54665b7   36 hours ago   282MB
vesoft/nebula-storaged   nightly   5c77dbcdc507   36 hours ago   288MB
vesoft/nebula-console    nightly   f3256c99eda1   36 hours ago   249MB
vesoft/nebula-metad      nightly   5a78d3e3008f   36 hours ago   288MB
nicolaka/netshoot        latest    6d7e8891c980   2 months ago   352MB
Let’s see what happens when we run this image directly:
$ docker run --rm -ti nicolaka/netshoot bash
bash-5.0# ps
PID USER TIME COMMAND
1 root 0:00 bash
8 root 0:00 ps
bash-5.0#
The output above shows that this container cannot see any Nebula Graph service process, so let’s add a few parameters and try again:
$ docker run --rm -ti --pid container:nebula-docker-compose_metad0_1 --cap-add sys_admin nicolaka/netshoot bash
bash-5.0# ps
PID USER TIME COMMAND
1 root 0:03 ./bin/nebula-metad --flagfile=./etc/nebula-metad.conf --daemonize=false --meta_server_addrs=172.28.1.1:45500,172.28.1.2:45500,172.28.1.3:45500 --local_ip=172.28.1.1 --ws_ip=172.28.1.1 --port=45500 --data_path=/data/meta --log_dir=/logs --v=15 --minloglevel=0
452 root 0:00 bash
459 root 0:00 ps
bash-5.0# ls -al /proc/1/net/
total 0
dr-xr-xr-x 6 root root 0 Sep 18 07:17 .
dr-xr-xr-x 9 root root 0 Sep 18 06:55 ..
-r--r--r-- 1 root root 0 Sep 18 07:18 anycast6
-r--r--r-- 1 root root 0 Sep 18 07:18 arp
dr-xr-xr-x 2 root root 0 Sep 18 07:18 bonding
-r--r--r-- 1 root root 0 Sep 18 07:18 dev
...
-r--r--r-- 1 root root 0 Sep 18 07:18 sockstat
-r--r--r-- 1 root root 0 Sep 18 07:18 sockstat6
-r--r--r-- 1 root root 0 Sep 18 07:18 softnet_stat
dr-xr-xr-x 2 root root 0 Sep 18 07:18 stat
-r--r--r-- 1 root root 0 Sep 18 07:18 tcp
-r--r--r-- 1 root root 0 Sep 18 07:18 tcp6
-r--r--r-- 1 root root 0 Sep 18 07:18 udp
-r--r--r-- 1 root root 0 Sep 18 07:18 udp6
-r--r--r-- 1 root root 0 Sep 18 07:18 udplite
-r--r--r-- 1 root root 0 Sep 18 07:18 udplite6
-r--r--r-- 1 root root 0 Sep 18 07:18 unix
-r--r--r-- 1 root root 0 Sep 18 07:18 xfrm_stat
This time the result is different: we can see the metad0 process, and its pid is still 1. Once the process is visible, there is a lot we can do with it. For example, can we attach to it directly with gdb? Since I have no image containing the Nebula binaries at hand, I will leave that for you to explore on your own.
We have already seen that the pid space can be shared by specifying --pid container:<name>. Now let’s look at the network. After all, sometimes we need to capture packets. Run the following command:
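As a starting point for that exploration, here is a minimal sketch of attaching a tracer to the shared pid 1. It uses strace rather than gdb and assumes the debugging image ships strace (check your image). The SYS_PTRACE capability is my assumption: the commands in this article only add sys_admin, but attaching to another process generally needs ptrace permission. The function name trace_container and the DOCKER variable are hypothetical helpers for illustration only.

```shell
# Docker CLI to invoke; overridable so the command can be inspected
# (for example with DOCKER=echo) without actually running it.
DOCKER="${DOCKER:-docker}"

# trace_container TARGET IMAGE [PID]
# Start a throwaway container in TARGET's pid namespace and attach strace
# to PID (default 1, the service process seen in the ps output above).
trace_container() {
  "$DOCKER" run --rm -ti \
    --pid "container:$1" \
    --cap-add SYS_PTRACE \
    "$2" strace -f -p "${3:-1}"
}
```

For example, `trace_container nebula-docker-compose_metad0_1 nicolaka/netshoot` would follow the system calls of metad0. Using gdb in the same way would additionally need an image that contains gdb and, ideally, the Nebula binaries with symbols.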
bash-5.0# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
Nothing is listed, which is not what we expected: a running metad0 process cannot be without any connections. To see the network space of the metad0 container, we need to add one more parameter and start the debugging container as follows:
$ docker run --rm -ti --pid container:nebula-docker-compose_metad0_1 --network container:nebula-docker-compose_metad0_1 --cap-add sys_admin nicolaka/netshoot bash
bash-5.0# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 172.28.1.1:11000        0.0.0.0:*               LISTEN      -
tcp        0      0 172.28.1.1:11002        0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:45500           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:45501           0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.11:33249        0.0.0.0:*               LISTEN      -
udp        0      0 127.0.0.11:51929        0.0.0.0:*                           -
This time the output is different. After adding the --network container:nebula-docker-compose_metad0_1 parameter, the connections inside the metad0 container are also visible, so you can capture packets and debug.
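A packet capture could then look like the sketch below. It assumes the debugging image provides tcpdump, and the usage example filters on port 45500, one of the metad0 listening ports shown in the netstat output. The function name capture_container and the DOCKER variable are hypothetical helpers, not part of Docker or Nebula Graph.

```shell
# Docker CLI to invoke; overridable so the command can be inspected
# (for example with DOCKER=echo) without actually running it.
DOCKER="${DOCKER:-docker}"

# capture_container TARGET IMAGE PORT
# Start a throwaway container in TARGET's network namespace and print the
# traffic on PORT, on all interfaces, without resolving names (-nn).
capture_container() {
  "$DOCKER" run --rm -ti \
    --network "container:$1" \
    "$2" tcpdump -i any -nn port "$3"
}
```

For example, `capture_container nebula-docker-compose_metad0_1 nicolaka/netshoot 45500` would print the packets exchanged on the metad port until interrupted with Ctrl-C.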