---
myst:
  html_meta:
    "description lang=en": "Kasm Workspaces end-to-end connectivity advanced troubleshooting."
    "keywords": "Kasm, Troubleshooting, Debugging"
    "property=og:locale": "en_US"
---

```{title} Troubleshooting
```

## Connectivity Troubleshooting

When deploying Kasm at scale in enterprise settings, with advanced L7 firewalls, proxies, and other security devices in the path, there is no end to the possible combinations of configurations and devices at play. This advanced guide is therefore intended to help engineers find problems in their environment that may be keeping Kasm from working properly.

```{note}
When troubleshooting this issue, always test by creating new sessions after making changes. Resuming an existing session may not pick up the changes.
```

(session_loading)=

This section assumes that users are able to reach the Kasm Workspaces application and log in, that they can create a session, and that Kasm does provision a container or VM session for them. The user typically sees the following connecting screen, which may loop indefinitely. The screenshot below is shown with Chrome Developer Tools open.

```{figure} /images/troubleshooting/advanced_connectivity/requesting_kasm_devtools.webp
:align: center

**Session hangs at requesting Kasm**
```

### Browser Extensions

As a first step, open a new Incognito Window; ensure no other Incognito Windows are open before opening a new one. This ensures that browser extensions are disabled and that [cookie collisions](#cookie-conflict) do not occur. Be sure to use the Incognito Window for all further testing as you progress through the following sections.

If the problem goes away immediately when using an Incognito Window, try a normal window with all browser extensions disabled. Enable them one at a time until the issue appears again. If you cannot load a session with all browser extensions disabled, but you can load a session in an Incognito Window, then it is likely a cookie issue. Follow the troubleshooting steps in the [Validating Session Cookies](#validate-session-cookies), [Cookie Conflict](#cookie-conflict), and [Cookie Transit](#cookie-transit) sections.

### Validate Session Port

Here is an expanded view of the above example with the URL field widened. In the example below, the port going to the KasmVNC session is 8443, while the Kasm Web application appears to be on port 443. This is likely a misconfiguration. Many organizations will configure Kasm Workspaces to run on a high port number internally, but then proxy to the internet on port 443. This is very common in the DoD and Federal sector, where DISA STIGs disallow the use of privileged low port numbers on internal servers.

```{figure} /images/troubleshooting/advanced_connectivity/devtools_wrong_port.webp
:align: center

**Session hangs at requesting Kasm**
```

What typically occurs is that Kasm Workspaces is installed internally on a high port number. When accessing Kasm directly on the high port number, it works fine. When accessing Kasm externally through a proxy on port 443, Kasm sessions fail to load. The setting that applies here is the [Zone](../zones/deployment_zones.md#configuring-deployment-zones) proxy port setting. This port setting is relative to the client. So if Kasm is configured to run internally on 8443, but is proxied by an F5, for example, on port 443, the Zone setting `Proxy Port` should be set to 443. After changing the setting, you will need to destroy any existing sessions. Newly created sessions will pick up the new port change.
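To confirm the port mapping, you can hit the API health check on both the internal high port and the externally proxied port; both should answer. This is a minimal sketch, and `kasm.example.com`, the internal address `10.0.0.10`, and port `8443` are placeholders for your own environment.

```bash
# Internal high port, directly against a Kasm server (placeholder address and port)
curl -k https://10.0.0.10:8443/api/__healthcheck
{"ok": true}

# Externally proxied port, through the load balancer or reverse proxy (placeholder hostname)
curl -k https://kasm.example.com:443/api/__healthcheck
{"ok": true}
```

Once the Zone `Proxy Port` is set to 443, the KasmVNC URL shown in DevTools for a newly created session should use port 443 as well.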
### Validate Session Cookies

While standard API calls use tokens in the JSON body of the request, requests to KasmVNC use two cookies that authorize the connection. The cookies are validated at each hop in the path to the user's container. KasmVNC itself does not check the cookies; instead, the last NGINX server in the path injects an HTTP Authorization header with a unique token. The client never has access to, or knowledge of, this token. Later sections walk through validating this process; this section focuses on ensuring the cookies are present and confirming they make it all the way to the last hop.

```{figure} /images/troubleshooting/advanced_connectivity/devtools_cookies_present.webp
:align: center

**Session hangs at requesting Kasm**
```

Using the above screenshot as an example, find the request that loads vnc.html and select it. On the Cookies sub-tab for the request, ensure the checkbox `show filtered out request cookies` is unchecked. The username and session_token cookies are the ones required by Kasm. Ensure there is only one of each.

If the username or session_token cookie is missing, check the `show filtered out request cookies` checkbox to see if the cookie is there but being blocked by your browser. See the sub-section [Browser Blocking Cookies](./advanced_connection_troubleshooting.md#browser-blocking-cookies) for troubleshooting this issue. If there are multiple username or multiple session_token cookies, see the [Cookie Conflict](./advanced_connection_troubleshooting.md#cookie-conflict) section. If both cookies are present and there is only one of each, continue on to the [Cookie Transit](./advanced_connection_troubleshooting.md#cookie-transit) subsection.

#### Cookie Conflict

If you have Kasm Workspaces deployed with a subdomain on your company's primary domain name, such as kasm.apps.acme.com, you may see a very long list of cookies. This is because your company may have hundreds of other websites under the apps.acme.com domain name, and these applications may be configured [insecurely](https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies#security) by allowing their cookies to be sent to all applications under acme.com. This can cause several issues. The first is that another application under apps.acme.com may set a username or session_token cookie as well, causing overlapping cookies. To fix this, the offending application(s) must be identified. Ensure the domain field of the offending application's cookies is made more specific so that they are not sent to every hostname under the same domain.

Another issue that can occur is that other applications on your domain may not follow the [specification](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie#attributes) for cookies. Kasm Technologies has identified multiple instances where other applications on a corporate domain both inappropriately set the domain field on the cookie and set cookies with invalid characters in the name. Unfortunately, browsers have sided with being more amenable to servers that do not meet the specification, and as a result these invalid cookies are sent to all web applications on the corporate domain. Not all web servers are as forgiving of invalid cookies as browsers are. The Kasm API service will stop processing cookies on a request once an invalid cookie is hit.
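To see which other applications under the parent domain are setting overly broad or otherwise invalid cookies, you can inspect their `Set-Cookie` response headers directly. This is a minimal sketch; the hostname `intranet.acme.com` and the cookie values in the comments are hypothetical.

```bash
# Dump only the response headers from another application under the same parent
# domain and look at its Set-Cookie headers.
curl -sk -D - -o /dev/null https://intranet.acme.com/ | grep -i '^set-cookie'

# Overly broad scope: sent to every host under acme.com, including Kasm.
#   Set-Cookie: session_token=abc123; Domain=acme.com; Path=/
# Properly scoped: no longer sent to kasm.apps.acme.com.
#   Set-Cookie: session_token=abc123; Domain=intranet.acme.com; Path=/; Secure; HttpOnly
```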
Fortunately, as of Kasm Workspaces 1.14.0, Kasm blocks, at the NGINX container, any cookie whose name does not match one of the cookies Kasm sets. As stated earlier, KasmVNC does not utilize cookies, but the NGINX proxy sitting in front of user containers uses the cookies to authorize the incoming request and places a session token in a header for KasmVNC. While KasmVNC does not use cookies, it is limited in the total length of any header, including the Cookie header. As of Kasm Workspaces 1.14.0, the Cookie header is modified by NGINX and all cookies that are not relevant to KasmVNC are dropped.

In conclusion, if you see a lot of cookies being sent, ensure you are on Kasm Workspaces 1.14.0 or later; this fixes most issues, but it will not fix cookie conflicts. If you see multiple cookies named username or session_token, follow the above guidance.

#### Browser Blocking Cookies

Browsers, browser extensions, and security software can block cookies for any number of reasons. A good place to start is the Console tab within DevTools. If the browser itself is blocking the cookies, it will usually list the reason. Common reasons for the browser itself to block a cookie include [Cross-Origin Resource Sharing](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS) security mechanisms. If the Console tab output indicates CORS issues, ensure you are on the latest release of Kasm Workspaces. Kasm Workspaces 1.14.0 included better handling of CORS issues for common architectures.

CORS issues can come into play when Kasm Workspaces is set up with multiple Zones, each in a different region. Each zone will have a different domain name. When a user navigates to the main site, such as kasm.acme.com, all requests go to the primary region's API servers. When a user creates a session, the iframe for the KasmVNC or RDP session will go to the region-specific hostname, such as us-east.kasm.acme.com. It is important that all zone domain names are subdomains of the primary site, otherwise you will run into CORS issues. Do not, for example, make the primary application domain kasm.acme.com and a zone domain us-east.acme.com. Ensure that zone domain names are subdomains of the primary Kasm Workspaces domain name.

#### Cookie Transit

After you have confirmed that the required cookies are present and being sent by the browser, it is time to confirm the cookies are making it all the way to Kasm. Enterprises with complex architectures may have multiple security or proxy devices in the path, which can interfere with the transmission of cookies and other HTTP headers.

The following guidance assumes you have a multi-server deployment. If you are on a single-server deployment, you may skip the parts that are not applicable. We need to trace the request at each hop to ensure the cookie is received everywhere. First, create a session and do not destroy it. Go to the admin page in Kasm, navigate to the Sessions panel, find your session, and make note of which agent the session is on. We will need to tail logs on all API servers and on the agent server that hosts your session. Load balancers will typically spread requests across all WebApp servers, making troubleshooting more difficult. For the purposes of this demonstration, I have configured my front-end load balancer to send all requests to a single Kasm API (WebApp) server; however, you could also repeat these steps on all API servers and tail the logs on all of them at once, as shown in the sketch below.
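If you prefer to watch every WebApp server at once, a minimal sketch like the following can tail the access log on each of them simultaneously (apply the NGINX logging change shown below to each server first). The hostnames are placeholders, and the loop assumes SSH key access and passwordless sudo on each server.

```bash
# Tail the NGINX access log on several WebApp servers at once (hypothetical hostnames)
for host in webapp1.example.com webapp2.example.com; do
  ssh "$host" "sudo tail -f /opt/kasm/current/log/nginx/access_json.log" &
done
wait
```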
SSH to each API server and the Agent server and modify the NGINX configuration.

```bash
# Modify the NGINX logging to include the session token
sudo sed -i "s#cookie_username\",'#cookie_username\",'\n '\"cookie_session_token\": \"\$cookie_session_token\",'#" /opt/kasm/current/conf/nginx/logging.conf

# Reload nginx
sudo docker exec -it kasm_proxy nginx -s reload
# You may get warnings in the output depending on your configuration, ensure no errors are present

# Tail the NGINX logs to find requests to load vnc.html.
# You can filter further by piping the results to another grep and looking for your username.
sudo tail -f /opt/kasm/current/log/nginx/access_json.log | grep -P 'request":"GET \S+?vnc\.html'
```

Here is an example log.

```json
{"upstream_response_length": "6318","body_bytes_sent": "6318","server_addr": "172.18.0.9","server_port": "443","request_method": "GET","http_referer": "https://139.243.62.99/","http_user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36","http_x_forwarded_for": "","http_x_header": "","nginx_version": "1.25.1","server_protocol": "HTTP/1.1","request":"GET /desktop/3f712ec4-c6c5-49c9-a74b-eb3ba2939eab/vnc/vnc.html?video_quality=2&enable_webp=false&idle_disconnect=20&password=&autoconnect=1&path=desktop/3f712ec4-c6c5-49c9-a74b-eb3ba2939eab/vnc/websockify&cursor=true&resize=remote&clipboard_up=true&clipboard_down=true&clipboard_seamless=true&toggle_control_panel=null HTTP/1.1","request_length": "1336","request_time": "0.025","upstream_response_time": "0.021","request_host":"139.243.62.99","server_name": "ubuntu-base","remote_addr": "119.15.5.44","realip_remote_addr": "119.15.5.44","http_status": "200","time_local":"17/Nov/2023:15:03:26 +0000","time_iso8601":"2023-11-17T15:03:26+00:00","msec":"1700233406.024","upstream_addr": "172.18.0.9:443","upstream_connect_time": "0.001","upstream_response_time": "0.021","upstream_status": "200","uri": "/3f712ec4-c6c5-49c9-a74b-eb3ba2939eab/vnc/vnc.html","query_string": "video_quality=2&enable_webp=false&idle_disconnect=20&password=&autoconnect=1&path=desktop/3f712ec4-c6c5-49c9-a74b-eb3ba2939eab/vnc/websockify&cursor=true&resize=remote&clipboard_up=true&clipboard_down=true&clipboard_seamless=true&toggle_control_panel=null","remote_user": "","cookie_username": "\"matt@kasm.local\"","cookie_session_token": "b5175662-a9ad-41e0-b3b3-69f3cc4fa5bd","upstream_header_time": "0.020"}
```

Ensure that the `cookie_username` and `cookie_session_token` fields are in the JSON and contain valid values. The following is an example log where the cookie values are not present. When the cookies are missing or otherwise invalid, you will notice that there is no upstream_status; NGINX never proxied the request because the request authorization failed.
```json
{"upstream_response_length": "","body_bytes_sent": "177","server_addr": "172.18.0.9","server_port": "443","request_method": "GET","http_referer": "https://139.243.62.99/","http_user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36","http_x_forwarded_for": "","http_x_header": "","nginx_version": "1.25.1","server_protocol": "HTTP/1.1","request":"GET /desktop/3f712ec4-c6c5-49c9-a74b-eb3ba2939eab/vnc/vnc.html?video_quality=2&enable_webp=false&idle_disconnect=20&password=&autoconnect=1&path=desktop/3f712ec4-c6c5-49c9-a74b-eb3ba2939eab/vnc/websockify&cursor=true&resize=remote&clipboard_up=true&clipboard_down=true&clipboard_seamless=true&toggle_control_panel=null HTTP/1.1","request_length": "870","request_time": "0.003","upstream_response_time": "","request_host":"139.243.62.99","server_name": "ubuntu-base","remote_addr": "119.15.5.44","realip_remote_addr": "119.15.5.44","http_status": "403","time_local":"17/Nov/2023:15:09:37 +0000","time_iso8601":"2023-11-17T15:09:37+00:00","msec":"1700233777.141","upstream_addr": "","upstream_connect_time": "","upstream_response_time": "","upstream_status": "","uri": "/3f712ec4-c6c5-49c9-a74b-eb3ba2939eab/vnc/vnc.html","query_string": "video_quality=2&enable_webp=false&idle_disconnect=20&password=&autoconnect=1&path=desktop/3f712ec4-c6c5-49c9-a74b-eb3ba2939eab/vnc/websockify&cursor=true&resize=remote&clipboard_up=true&clipboard_down=true&clipboard_seamless=true&toggle_control_panel=null","remote_user": "","cookie_username": "","cookie_session_token": "","upstream_header_time": ""}
```

Ensure this test passes on all API servers and on the agent server that hosts the session's container. If the session_token or username values are empty in the logs of any API server, then something between the user's browser and the Kasm WebApp server, not including the WebApp server itself, is interfering. This can include the user's own system, since security software installed on the client can intercept communications, either in transit or as a browser extension. Devices between the client and the Kasm WebApp server can also interfere with HTTP headers, such as the Cookie header. Inspect the logs of any security or proxy devices in the network path to ensure the cookies are not being blocked. If possible, output the cookies within the logs of each device, as we did for NGINX, to see whether each device in the path is receiving the cookies.

If you saw the request come into the API server, and the cookies were present with valid values, but the request was denied with a 403 in the `http_status` field, then proceed to the [Validate Session Authorization](./advanced_connection_troubleshooting.md#validate-session-authorization) section. Also proceed to the Validate Session Authorization section if you saw the request come in at the Agent server but it returned any status code outside the 200 series.

If you saw the request come into the API server, and the cookies were present with valid values, but the request returned a 502 or 504, there is likely a connectivity issue to the agent. Use curl to check the health of each agent from each of the API servers.

```bash
curl -k https://<agent_address>:443/agent/__healthcheck
{"ok": true}
```
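If you have several agents to check, a small loop can run the same health check against each of them from a given API server. This is a minimal sketch; the addresses are placeholders for the Docker Agent addresses shown in the Kasm Admin UI under Infrastructure -> Docker Agents.

```bash
# Check each agent's health endpoint from this API server (placeholder addresses)
for agent in 10.0.0.21 10.0.0.22 10.0.0.23; do
  printf '%s: ' "$agent"
  curl -sk --max-time 5 "https://${agent}:443/agent/__healthcheck" || printf 'unreachable'
  printf '\n'
done
```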
If all the API servers can talk to the agent, then proceed to the [Validate Session Authorization](./advanced_connection_troubleshooting.md#validate-session-authorization) section.

### Validate Session Authorization

When a user's request travels to their session container, it first traverses a WebApp server's nginx container, then travels to the agent server that the container is on, traverses that server's nginx container, and finally reaches the user's container. On both servers, the NGINX container makes an API call to the kasm_api container. For the WebApp server, this kasm_api container resides on the same server as nginx and in the same docker network.

The WebApp server's NGINX container makes a call to /api/kasm_connect to retrieve the details of where to forward the request. In a distributed architecture, the agent can be anywhere and may be privately addressed. The client does not have the IP address or hostname of the agent, nor does the user's HTTP request contain this information; NGINX calls the kasm_connect API to retrieve the required information. Run the following command to check that the API container is getting the request and that it returns a 202 status code.

```bash
sudo docker logs -f --tail 10 kasm_api 2>&1 | grep /api/kasm_connect
2023-11-17 18:31:01,529 [INFO] cherrypy.access.140087972390848: 172.18.0.9 - - [17/Nov/2023:18:31:01] "GET /api/kasm_connect/ HTTP/1.0" 202 - "https://kasm.example.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"
```

In the above example, the /api/kasm_connect API call was received and a status code 202 was returned, indicating it should have been successful.

Ensure that the target agent is reachable from each API server. NGINX receives the same IP/hostname that is shown in the Kasm Admin UI under Infrastructure -> Docker Agents, so use the address listed there in the curl example below to confirm the API server can reach the agent in question.

```bash
curl -k https://<agent_address>:443/agent/__healthcheck
{"ok": true}
```

The Agent server will send API calls to the Kasm WebApp server to authorize the incoming request. This can sometimes cause issues in complex environments. First, check your deployment's configuration: go to the Kasm Admin UI, navigate to Infrastructure, Zones, and edit the applicable Zone. In the Zone settings is the `Upstream Auth Address`. For a multi-server deployment the default value is `$request_host$`, which is a Kasm variable that gets replaced at runtime with the hostname of the incoming request. Let's walk through the following example.

Client --> Load Balancer (kasm.example.com) --> Layer 7 Firewall --> Kasm WebApp Servers --> Agent

A client connects to Kasm at https://kasm.example.com, the load balancer forwards that request to one of four Kasm WebApp servers, and the WebApp server forwards it to the agent that the user's container is on. When the agent receives the request, it needs to authorize it. With default Zone settings on a multi-server deployment, it will send that API call to https://kasm.example.com/api/internal_auth. This is an acceptable default that works for many deployments; however, in this example the domain name kasm.example.com does not point directly to the WebApp servers, it points to a public load balancer that sits in a DMZ somewhere else in the enterprise. The Kasm agents may not be able to send API calls there, or the calls might be subject to a forward proxy with SSL inspection. To validate whether your agent can access the API servers through this default setting, run the following command.
```bash
# success
curl -k https://kasm.example.com/api/__healthcheck
{"ok": true}

# name resolution fails
curl -k https://kasm.example.com/api/__healthcheck
curl: (6) Could not resolve host: kasm.example.com

# invalid host
curl -k https://kasm.example.com/api/__healthcheck
curl: (7) Failed to connect to 10.0.0.251 port 443 after 3074 ms: No route to host
```

If you do not get the JSON response shown in the first example above, then your agent likely cannot reach the WebApp servers through the same domain name that your clients use to access Kasm. A better way to architect this for enterprise-grade deployments is to use an internal load balancer with its own hostname. Change the [Zone](../zones/deployment_zones.md) Upstream Auth Address to the hostname of the internal load balancer, and ensure you can curl the API health check through the internal load balancer from the agents. Another approach is to use an internal DNS name that points to all four WebApp servers and change the Zone's Upstream Auth Address to that internal hostname. After changing this setting, you will need to delete any existing sessions; newly created sessions will have the new setting.

Finally, it is good to confirm that an API server actually received the internal_auth API request and to see what it did with the request. Run the command below on each WebApp server to inspect the API container logs for internal_auth requests.

```bash
sudo docker logs -f kasm_api 2>&1 | grep internal_auth
2023-11-15 18:43:18,076 [INFO] cherrypy.access.140168522947744: 172.18.0.9 - - [15/Nov/2023:18:43:18] "GET /api/internal_auth/ HTTP/1.1" 202 - "https://kasm.example.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"
```

The above example log shows a 202 response, indicating the request was not only received, it was authorized.

### WebSockets

The actual stream of the desktop session goes through a websocket connection. Websocket connections are handled differently and may be blocked or otherwise disrupted by security software installed on the client system or by security devices in the path from the client to the Kasm WebApp servers.

First, make sure that the client is sending the request for the websocket connection in the first place. Open DevTools in the browser, go to the Network tab, and then attempt to connect to a session. Click the WS filter, shown in the screenshot below, to show just the websocket connections, and look for the `websockify` request.

```{figure} /images/troubleshooting/advanced_connectivity/devtools_websocket.webp
:align: center

**DevTools WebSocket**
```

Also check the Console tab in DevTools and ensure you don't see errors.

```{figure} /images/troubleshooting/requesting_kasm/websocket_error.png
:align: center

**Session hangs at requesting Kasm**
```

Next, determine whether the websockify request is making it all the way to the target agent. The easiest way to check is to run the following command on the agent.
```bash
sudo docker logs -f kasm_proxy 2>&1 | grep '/vnc/websockify ' | grep -v -P '(internal_auth|kasm_connect)'
123.123.123.123 - - [15/Nov/2023:18:50:20 +0000] "GET /desktop/72248a05-922d-4518-b92f-7a9d1ea529eb/vnc/websockify HTTP/1.1" 101 3104787 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" "-"
```

In the above example, the request was received and the response code was 101, which is what should happen. If you see this request come in on the agent, the websocket has traversed all the way through the stack to the agent; in that case, proceed to the next section. If you do not see any output from the above command on the Agent, then something in your stack prior to Kasm is interfering with the websocket connection.

### KasmVNC Troubleshooting

If you have verified all the above steps, the next step is to troubleshoot KasmVNC. First, enable debug logging on the user container.

1. In the Kasm Admin UI, navigate to Workspaces -> Workspaces
2. Find the target Workspace in the list and click the Edit button.
3. Scroll down to the Docker Run Config Override field and paste in the following. `{ "environment": { "KASM_DEBUG": 1 } }`
4. Launch a new session

Now SSH to the agent that the session was provisioned on and run the following commands to get a shell inside the container.

```bash
# Get a list of running containers and identify your session container; the name of the container contains your partial username and session ID.
sudo docker ps
CONTAINER ID   IMAGE                                                               COMMAND                  CREATED         STATUS         PORTS                          NAMES
125348fe9990   kasmweb/core-ubuntu-jammy-private:feature_KASM-5078_multi-monitor   "/dockerstartup/kasm…"   3 minutes ago   Up 3 minutes   4901/tcp, 5901/tcp, 6901/tcp   mattkasm.loc_4f8ada7f

# Get a shell inside of the container
sudo docker exec -it 125348fe9990 /bin/bash
default:~$

# Tail the KasmVNC logs
default:~$ tail -f .vnc/125348fe9990\:1.log
```

With the tail of the logs running, try to connect to the session. You will see a number of normal HTTP requests that load static page resources such as vnc.html, javascript, stylesheets, etc. They will look like the following:

```bash
2023-11-17 14:51:07,525 [DEBUG] websocket 141: BasicAuth matched
2023-11-17 14:51:07,525 [INFO] websocket 141: /websockify request failed websocket checks, not a GET request, or missing Upgrade header
2023-11-17 14:51:07,525 [DEBUG] websocket 141: Invalid WS request, maybe a HTTP one
2023-11-17 14:51:07,525 [DEBUG] websocket 141: Requested file '/app/sounds/bell.oga'
2023-11-17 14:51:07,525 [INFO] websocket 141: 172.18.0.9 71.62.47.171 kasm_user "GET /app/sounds/bell.oga HTTP/1.1" 200 8701
2023-11-17 14:51:07,525 [DEBUG] websocket 141: No connection after handshake
2023-11-17 14:51:07,525 [DEBUG] websocket 141: handler exit
```

The first field is the date and time. After the time is a comma followed by a number; that number is an HTTP request ID. Group the log lines by ID so that you have all logs for a specific request. In the above example, all the logs were produced for the request for /app/sounds/bell.oga.
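For example, to pull together every log line for a single request ID (request 141 from the log above), a simple grep against the session's log file works; the log file name will match your container's ID and display number, as shown in the tail command above.

```bash
# Collect all log lines belonging to HTTP request ID 141 from the KasmVNC log
grep 'websocket 141:' ~/.vnc/*.log
```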
The log line "/websockify request failed websocket checks, not a GET request, or missing Upgrade header" is misleading. It is not, in and of itself, an issue; it merely means that the incoming request was not a websocket request. In the beginning, KasmVNC only had a websocket server and did not handle other types of web requests. The message is only relevant if the requested file was /websockify.

If you saw the message "/websockify request failed websocket checks, not a GET request, or missing Upgrade header" for the /websockify file, then KasmVNC was unable to identify the request as a valid websocket connection. Below is an example of what you should see for the websocket connection. Note the message "using protocol HyBi/IETF 6455 13", indicating that KasmVNC was able to correctly identify the exact websocket specification being used by the browser.

```bash
2023-11-17 14:51:07,526 [DEBUG] websocket 142: using SSL socket
2023-11-17 14:51:07,526 [DEBUG] websocket 142: X-Forwarded-For ip '71.62.47.171'
2023-11-17 14:51:07,529 [DEBUG] websocket 142: BasicAuth matched
2023-11-17 14:51:07,529 [DEBUG] websocket 142: using protocol HyBi/IETF 6455 13
2023-11-17 14:51:07,529 [DEBUG] websocket 142: connecting to VNC target
2023-11-17 14:51:07,529 [DEBUG] XserverDesktop: new client, sock 32
```

In some cases KasmVNC could have a bug. In the recent past, KasmVNC would crash due to large cookies or if the websocket connection was not exactly to spec. These issues have since been corrected; however, there is no end to the combination of devices and services out there that sit between users and Kasm. In some cases these security devices or services manipulate HTTP requests in a way that brings them out of compliance with the specification, which can cause improper handling by KasmVNC. The following is an example of what a crash would look like in the KasmVNC logs.

```bash
(EE)
(EE) Backtrace:
(EE) 0: /usr/bin/Xvnc (xorg_backtrace+0x4d) [0x5e48dd]
(EE) 1: /usr/bin/Xvnc (0x400000+0x1e8259) [0x5e8259]
(EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f5a57ef6000+0x12980) [0x7f5a57f08980]
(EE) 3: /lib/x86_64-linux-gnu/libc.so.6 (epoll_wait+0x57) [0x7f5a552eca47]
(EE) 4: /usr/bin/Xvnc (ospoll_wait+0x37) [0x5e8d07]
(EE) 5: /usr/bin/Xvnc (WaitForSomething+0x1c3) [0x5e2813]
(EE) 6: /usr/bin/Xvnc (Dispatch+0xa7) [0x597007]
(EE) 7: /usr/bin/Xvnc (dix_main+0x36e) [0x59b1fe]
(EE) 8: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xe7) [0x7f5a551ecbf7]
(EE) 9: /usr/bin/Xvnc (_start+0x2a) [0x46048a]
(EE)
(EE) Received signal 11 sent by process 17182, uid 0
(EE)
Fatal server error:
(EE) Caught signal 11 (Segmentation fault). Server aborting
(EE)
```

KasmVNC will be restarted automatically by the container's entrypoint script, so you may see this repeat. Copy the backtrace output and provide it to Kasm support, along with the output of the following command.

```bash
sudo docker exec -it 125348fe9990 Xvnc -version
Xvnc KasmVNC 1.2.0.e4a5004f4b89b9da78c9b5f5aee59c08c662ccec - built Oct 31 2023 11:22:56
Copyright (C) 1999-2018 KasmVNC Team and many others (see README.me)
See http://kasmweb.com for information on KasmVNC.
Underlying X server release 12008000, The X.Org Foundation
```

With the above information we should be able to symbolize the backtrace and potentially find out what the issue is.

## Server Configuration Issues

The following subsections cover configuration issues on individual servers. These issues are at the host OS level, not with Kasm itself, but with the configuration of the host operating system or its dependencies.

### Confirm Local Connectivity

Sometimes in troubleshooting, if individual Kasm service containers are started, stopped, or restarted, the Kasm proxy container may lose the local hostname resolution of the other containers.
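To check whether name resolution inside the proxy container is the problem before restarting anything, you can try resolving one of the other Kasm container names from inside kasm_proxy. This is a hedged sketch; it assumes the proxy image provides `getent`, and on a multi-server deployment substitute a container name that actually runs on that host (for example `kasm_api` on a WebApp server or `kasm_agent` on an Agent server).

```bash
# Attempt to resolve the kasm_api container name from inside the proxy container.
# No output (or a non-zero exit code) suggests the proxy has lost name resolution.
sudo docker exec -it kasm_proxy getent hosts kasm_api
```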
First, let's stop and start the Kasm services to ensure hostname resolution is refreshed and that all containers are started in the proper order.

```bash
/opt/kasm/bin/stop
/opt/kasm/bin/start
```

Next, let's confirm that all services are up, running, and healthy. The following output, from a single-server deployment, shows all services up, running, and healthy.

```bash
sudo docker ps -a
CONTAINER ID   IMAGE                      COMMAND                  CREATED      STATUS                PORTS                                           NAMES
401963ee87a0   kasmweb/nginx:1.25.1       "/docker-entrypoint.…"   6 days ago   Up 6 days             80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp   kasm_proxy
0eb604899140   kasmweb/agent:1.14.0       "/bin/sh -c '/usr/bi…"   6 days ago   Up 6 days (healthy)   4444/tcp                                        kasm_agent
140660c3c201   kasmweb/manager:1.14.0     "/bin/sh -c '/usr/bi…"   6 days ago   Up 6 days (healthy)   8181/tcp                                        kasm_manager
1ed22b860c6b   kasmweb/kasm-guac:1.14.0   "/dockerentrypoint.sh"   6 days ago   Up 6 days (healthy)                                                   kasm_guac
45fceb6bff08   kasmweb/share:1.14.0       "/bin/sh -c '/usr/bi…"   6 days ago   Up 6 days (healthy)   8182/tcp                                        kasm_share
5759e5692a85   kasmweb/api:1.14.0         "/bin/sh -c 'python3…"   6 days ago   Up 6 days             8080/tcp                                        kasm_api
482059a66347   redis:5-alpine             "docker-entrypoint.s…"   7 days ago   Up 7 days             6379/tcp                                        kasm_redis
670da792ed27   postgres:12-alpine         "docker-entrypoint.s…"   7 days ago   Up 7 days (healthy)   5432/tcp                                        kasm_db
```

For a WebApp server on a multi-server deployment, the output should look like the following.

```bash
sudo docker ps -a
CONTAINER ID   IMAGE                    COMMAND                  CREATED      STATUS                PORTS                                           NAMES
401963ee87a0   kasmweb/nginx:1.25.1     "/docker-entrypoint.…"   6 days ago   Up 6 days             80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp   kasm_proxy
140660c3c201   kasmweb/manager:1.14.0   "/bin/sh -c '/usr/bi…"   6 days ago   Up 6 days (healthy)   8181/tcp                                        kasm_manager
45fceb6bff08   kasmweb/share:1.14.0     "/bin/sh -c '/usr/bi…"   6 days ago   Up 6 days (healthy)   8182/tcp                                        kasm_share
5759e5692a85   kasmweb/api:1.14.0       "/bin/sh -c 'python3…"   6 days ago   Up 6 days             8080/tcp                                        kasm_api
```

For an Agent server on a multi-server deployment, the output should look like the following.

```bash
sudo docker ps -a
CONTAINER ID   IMAGE                  COMMAND                  CREATED      STATUS                PORTS                                           NAMES
401963ee87a0   kasmweb/nginx:1.25.1   "/docker-entrypoint.…"   6 days ago   Up 6 days             80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp   kasm_proxy
0eb604899140   kasmweb/agent:1.14.0   "/bin/sh -c '/usr/bi…"   6 days ago   Up 6 days (healthy)   4444/tcp                                        kasm_agent
```

Make note of the port number in the output for the kasm_proxy container. In the examples above you can see `0.0.0.0:443->443/tcp`, which indicates that the host's port 443 is mapped to the container's port 443. This indicates that Kasm was installed on port 443; ensure this matches your expectation.

Next, ensure that there is a program listening on the target port.

```bash
ss -ltn
State    Recv-Q   Send-Q     Local Address:Port       Peer Address:Port   Process
LISTEN   0        4096             0.0.0.0:111             0.0.0.0:*
LISTEN   0        4096       127.0.0.53%lo:53              0.0.0.0:*
LISTEN   0        128              0.0.0.0:22              0.0.0.0:*
LISTEN   0        4096             0.0.0.0:443             0.0.0.0:*
LISTEN   0        511            127.0.0.1:35521           0.0.0.0:*
LISTEN   0        511              0.0.0.0:9001            0.0.0.0:*
LISTEN   0        4096                [::]:111                [::]:*
LISTEN   0        128                 [::]:22                 [::]:*
LISTEN   0        4096                [::]:443                [::]:*
```

The above output shows the server listening on port 443 on both IPv4 and IPv6. Next, get the local IP address of the user-facing network interface.
```bash
ubuntu@roles-matt:~$ ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:00:17:0f:8c:19 brd ff:ff:ff:ff:ff:ff
    altname enp0s3
    inet 10.0.0.106/24 metric 100 brd 10.0.0.255 scope global ens3
       valid_lft forever preferred_lft forever
    inet6 fe80::17ff:fe0f:8c19/64 scope link
       valid_lft forever preferred_lft forever
```

Ignore loopback interfaces, docker0, bridge interfaces, and any other interfaces without an IP address. In the example above we can see the IP address is 10.0.0.106. Curl the IP address using HTTPS on the expected port number.

```bash
# Correct Output
curl -k https://10.0.0.106:443/api/__healthcheck
{"ok": true}

# Failure
curl -k https://10.0.0.106:443/api/__healthcheck
curl: (7) Failed to connect to 10.0.0.106 port 443 after 0 ms: Connection refused
```

If that command either hung or immediately returned a `Connection refused` message as shown in the failure case above, then your local system likely has a firewall running; see the next section.

### Host Based Firewalls

Host-based firewalls such as McAfee HBSS, and even the Linux default UFW, can interfere with communications. Docker manages its own iptables rules, and other firewalls and security products either apply additional rules or manipulate iptables as well. This can result in corrupted iptables rules.

If you have UFW installed, run the following to allow HTTPS on port 443. See the UFW documentation on how to make this rule permanent, add alternative ports, or for additional usage instructions.

```bash
sudo ufw status
sudo ufw allow https
```

The following commands will completely clear iptables rules and any NAT rules. **This may break your system; ensure you know what you are doing.**

```bash
# shut down kasm
/opt/kasm/bin/stop

# Accept all traffic first to avoid ssh lockdown via iptables firewall rules #
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT

# Flush All Iptables Chains/Firewall rules #
iptables -F

# Delete all Iptables Chains #
iptables -X

# Flush all counters too #
iptables -Z

# Flush and delete all nat and mangle #
iptables -t nat -F
iptables -t nat -X
iptables -t mangle -F
iptables -t mangle -X
iptables -t raw -F
iptables -t raw -X

# Restarting docker will regenerate the iptables rules that docker needs
sudo systemctl restart docker

# Bring Kasm back up
/opt/kasm/bin/start
```

If the above fixes your issues, the fix may only be temporary. You may have security or configuration management software installed on your server that eventually re-applies the offending rules. Please consult the documentation of the offending software for remediation.
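To confirm that Docker regenerated its chains after the restart in the block above, you can list the iptables rules and look for the DOCKER chains. This is a quick sanity check, not a Kasm-specific requirement.

```bash
# After restarting Docker, the DOCKER-related chains and rules should be present again.
sudo iptables -S | grep -i docker | head
```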