Unable to run prep-node on BareOS

I’m trying to run prep-node on a BareOS (Ubuntu) host in an internal network, without success. I’ve already set up NAT on my firewall to forward ports to the servers that run pf9_cli and the pf9 node, following the doc https://docs.platform9.com/kubernetes/getting-started/bareos/networking-prerequisites/, but no luck.

Part of the pf9_cli log is below. Could you please help? Let me know if you need any other logs.
Also, please do not remove my account, as I’m still trying to create a cluster with it.

account URL: https://pmkft-1598610465-10726.platform9.io

bill@c1-master1:~$ pf9ctl cluster prep-node -u bill -p ********** -i 192.168.0.223
Preparing nodes [####################################] 100%
Encountered an error while preparing the provided nodes as Kubernetes nodes. Code: 2, output log: /home/bill/pf9/log/node_provision_2020_09_06-04_25_02.log

TASK [wait-for-convergence : INFO starting wait_for_agent_convergence] **********************************************************************************
Sunday 06 September 2020 04:25:16 +0000 (0:00:00.027) 0:00:13.533 ******
ok: [192.168.0.223] => {
    "msg": "running wait_for_agent_convergence.sh pmkft-1598610465-10726.platform9.io **************************************************** k8s"
}

TASK [wait-for-convergence : wait for pf9-hostagent to converge] ****************************************************************************************
Sunday 06 September 2020 04:25:16 +0000 (0:00:00.032) 0:00:13.565 ******
fatal: [192.168.0.223]: FAILED! => {"changed": true, "msg": "non-zero return code", "rc": 1, "stderr": "Shared connection to 192.168.0.223 closed.\r\n", "stderr_lines": ["Shared connection to 192.168.0.223 closed."], "stdout": "[ waiting for pf9-hostagent to complete convergence ]\r\n--> TIMEOUT = 1200 seconds\r\n--> flag_k8s=0\r\n File "", line 1\r\n import sys, json; print json.load(sys.stdin)["extensions"]["ip_address"]["status"]\r\n ^\r\nSyntaxError: invalid syntax\r\n(23) Failed writing body\r\n File "", line 1\r\n import sys, json; print json.load(sys.stdin)["extensions"]["ip_address"]["status"]\r\n ^\r\nSyntaxError: invalid syntax\r\n(23) Failed writing body\r\n File "", line 1 …

Hi @meokey thanks for your question. I’ve flagged it to our team. As it’s a long weekend in the US a response may be a little delayed. Best, John

Which version of Ubuntu are you running?

20.04.1 LTS (Focal Fossa)

@meokey The output in your initial post appears to be truncated, but it matches an issue that was reported recently: https://github.com/platform9/express-cli/issues/98. The conflict is a Python 2/3 compatibility problem in the logging of one particular script, which monitors for host convergence.
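To illustrate the incompatibility: the one-liner in the log uses the Python 2 `print` statement, which Python 3 rejects with `SyntaxError: invalid syntax`. Below is a minimal sketch of a form that runs on both versions. It is not the actual fix from the linked issue. The function name and the sample payload are hypothetical; only the key path `["extensions"]["ip_address"]["status"]` comes from the log.

```python
from __future__ import print_function  # no-op on Python 3; enables print() on Python 2

import json


def ip_address_status(payload):
    """Return payload["extensions"]["ip_address"]["status"] from a JSON string."""
    return json.loads(payload)["extensions"]["ip_address"]["status"]


# Hypothetical sample; the real response body is not shown in this thread.
sample = '{"extensions": {"ip_address": {"status": "ok"}}}'
print(ip_address_status(sample))
```

Calling `print(...)` as a function is valid in both Python 2 (with the `__future__` import) and Python 3, so the same script no longer depends on which interpreter `python` resolves to on the node.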

It should not affect the actual outcome of the deployment, however. It is most likely that the node itself failed to converge and the hardcoded timeout value was exceeded. In this case, it would be best to check /var/log/pf9/hostagent.log to see if the pf9-kube role has failed. If so, then you would need to subsequently check /var/log/pf9/kube/kube.log to see where it has failed in this process.
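As a rough aid for that check, here is a small sketch that prints the most recent lines mentioning an error or failure in the two log files named above. The keyword pattern is an assumption, since the actual log format is not shown in this thread, and `find_failures` is a hypothetical helper, not part of any Platform9 tooling.

```python
import re
from pathlib import Path

# Assumed keywords; adjust to whatever the real log lines look like.
FAILURE_PATTERN = re.compile(r"error|fail", re.IGNORECASE)


def find_failures(log_path, pattern=FAILURE_PATTERN, limit=20):
    """Return up to `limit` of the latest lines in log_path that match pattern."""
    path = Path(log_path)
    if not path.is_file():
        # A missing file is itself a hint, e.g. the kube role never started.
        return []
    with path.open(errors="replace") as handle:
        matches = [line.rstrip("\n") for line in handle if pattern.search(line)]
    return matches[-limit:]


if __name__ == "__main__":
    for log in ("/var/log/pf9/hostagent.log", "/var/log/pf9/kube/kube.log"):
        print("== " + log)
        for line in find_failures(log):
            print(line)
```

Reading these files may require root, so the script would typically be run with `sudo` on the node.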

The UI may also indicate which step the node failed on, if the node is visible there.

Please share any further details in this thread if you manage to find out more!

@meokey, can you please use Ubuntu 18.04?
We do not support 20.04.

In /var/log/pf9/hostagent.log, all the entries look like the ones below, and there is no /var/log/pf9/kube directory.

In my home directory, I noticed the following error message:
bill@c1-master1:~/pf9/log$ tail pf9ctl.log
2020-09-08 17:12:08,637 - pf9.cluster.commands - INFO - prep-node
2020-09-08 17:12:25,393 - pf9.express - INFO - pf9.expressInitialized
2020-09-08 17:12:25,395 - pf9.cluster.commands - INFO - prep-node
2020-09-08 17:33:55,443 - pf9.cluster.commands - ERROR - Encountered an error while preparing the provided nodes as Kubernetes nodes.
Traceback (most recent call last):
  File "/home/bill/pf9/pf9-venv/lib/python3.8/site-packages/pf9/cluster/commands.py", line 567, in prepnode
    prep_node(ctx, user, password, ssh_key, adj_ips, node_prep_only=True)
  File "/home/bill/pf9/pf9-venv/lib/python3.8/site-packages/pf9/cluster/commands.py", line 60, in prep_node
    raise PrepNodeFailed(msg)
pf9.cluster.exceptions.PrepNodeFailed: Code: 2, output log: /home/bill/pf9/log/node_provision_2020_09_08-17_12_29.log
bill@c1-master1:~/pf9/log$ tail /home/bill/pf9/log/node_provision_2020_09_08-17_12_29.log
===============================================================================
wait-for-convergence ------------------------------------------------- 1204.33s
gather_facts ----------------------------------------------------------- 63.52s
common ------------------------------------------------------------------ 7.92s
pf9-hostagent ----------------------------------------------------------- 4.29s
ntp --------------------------------------------------------------------- 1.38s
disable-swap ------------------------------------------------------------ 0.35s
include_role ------------------------------------------------------------ 0.14s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
total ---------------------------------------------------------------- 1281.92s

And @the_fun_police, you are right: I have no problem running prep-node on Ubuntu 18.04.

Please let me know if you are interested in helping to troubleshoot this on 20.04.

================

Also, which firewall rules do I need to open for the worker? I’ve successfully added the node (18.04) as a master, but it does not show up as a worker. I’ve configured NAT as follows. Please advise. Thanks.

Interface  Protocol  Source Address  Source Ports  Dest. Address  Dest. Ports    NAT IP         NAT Ports      Description
WAN        TCP       52.88.38.208    *             WAN address    30000 - 32767  192.168.0.244  30000 - 32767  Platform9 to pf9-node1
WAN        TCP       52.88.38.208    *             WAN address    4194           192.168.0.244  4194           Platform9 to pf9-node1
WAN        TCP       52.88.38.208    *             WAN address    10255 - 10256  192.168.0.244  10255 - 10256  Platform9 to pf9-node1
WAN        TCP       52.88.38.208    *             WAN address    10250          192.168.0.244  10250          Platform9 to pf9-node1
WAN        TCP       52.88.38.208    *             WAN address    2379 - 2380    192.168.0.222  2379 - 2380    Platform9 to pf9-node1 as master
WAN        TCP       52.88.38.208    *             WAN address    4001           192.168.0.222  4001           Platform9 to pf9-node1 as master
WAN        TCP       52.88.38.208    *             WAN address    443 (HTTPS)    192.168.0.222  443 (HTTPS)    Platform9 to pf9-node1 as master

We only require outbound access on port 443 to platform9.io.
Did you get the cluster running?
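One quick way to confirm the outbound requirement from a node is to try opening a TCP connection to your account host on port 443. The sketch below does only that: it does not validate TLS or account for an HTTP proxy, and `can_connect` is a hypothetical helper name.

```python
import socket
import sys


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Pass one or more hostnames on the command line.
    for host in sys.argv[1:]:
        print(host, "reachable on 443:", can_connect(host))
```

For example, `python3 check_outbound.py pmkft-1598610465-10726.platform9.io` (the script name is arbitrary) run on the node itself would show whether the firewall allows the outbound connection.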