Zrok share Fails on AWS EC2 with DNS Resolution Error #791

Title: zrok share Fails on AWS EC2 with DNS Resolution Error

Problem Description

On AWS EC2, when attempting to share a local service using zrok, the following command fails:

zrok share public http://localhost:8009

The error received is:

[ERROR]: error creating proxy backend (error listening: failed to listen: no apiSession, authentication attempt failed: Post "https://ip-10-0-0-10:8441/edge/client/v1/authenticate?method=cert": dial tcp: lookup ip-10-0-0-10 on 8.8.8.8:53: no such host)

Key Details

  • The Ziti Edge Controller is installed on an AWS EC2 instance and advertises its internal AWS hostname (ip-10-0-0-10), which is not resolvable externally.
  • The client machine does not have Ziti installed and relies solely on zrok for accessing the Ziti network.

Root Cause

The Ziti Edge Controller is configured to advertise its internal AWS hostname (ip-10-0-0-10), which is not accessible outside of the AWS VPC. As a result, external clients (e.g., zrok) cannot resolve the hostname, leading to a DNS resolution failure.

Hi @mabrotrix, welcome to the community and to zrok (and OpenZiti/BrowZer)!

Looking at that hostname, I would say this looks like an OpenZiti overlay issue. However you installed the OpenZiti overlay, it was not configured with the proper advertised addresses.

How did you end up creating the OpenZiti overlay? What methodology are you using/following?

I used this video tutorial for the configuration: https://www.youtube.com/watch?v=870A5dke_u4 It worked perfectly the last time, but now that I've hosted it on a new server, I'm facing an issue.


On my hosted server:

root@my-server:~# nslookup my-hostname
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   my-hostname.example.net
Address: 164.x.x.30

On my local PC:

(base) user@local-machine % nslookup my-hostname
Server:         192.168.1.1
Address:        192.168.1.1#53

** server can't find my-hostname: NXDOMAIN

If you can't nslookup the FQDN, I would think you haven't created an A record in your DNS of choice. In the video, that happens around the 2m 45s mark (https://youtu.be/870A5dke_u4?t=160), when Ken makes a * A record for his IP.
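A quick way to repeat the nslookup comparison on both machines is a small helper like the one below. It is only a sketch: the function names `resolves` and `check_name` are made up for illustration, and it assumes `getent` is available on Linux and `dscacheutil` on macOS.

```shell
# Succeed if NAME resolves through the system resolver.
# getent consults /etc/hosts and DNS on Linux; macOS has no getent,
# so fall back to dscacheutil there.
resolves() {
  if command -v getent >/dev/null 2>&1; then
    getent hosts "$1" >/dev/null 2>&1
  else
    dscacheutil -q host -a name "$1" 2>/dev/null | grep -q 'address'
  fi
}

# Print a one-line OK/FAIL report for NAME.
check_name() {
  if resolves "$1"; then
    echo "OK   $1"
  else
    echo "FAIL $1"
  fi
}
```

Running `check_name` with the same hostname on the server and on the local PC should show whether the name is only resolvable from one side.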

The client machine is in the same VPC as your zrok instance, and is failing to resolve the domain name "ip-10-0-0-10" using Google's global recursive nameserver 8.8.8.8, correct?

You encountered the problem when you attempted to create a zrok public share. Does that mean you ran zrok enable without error?

I suspect your new instance is in a VPC that is not the "default" VPC. Default VPCs have hostname DNS enabled by default, and custom (non-default) VPCs require the administrator to enable hostname DNS. See "View and update DNS attributes for your VPC" in the Amazon Virtual Private Cloud documentation.

When you enable DNS hostnames in your AWS VPC settings, the instances in the VPC will be auto-configured to use the VPC's recursive nameserver instead of a global recursive nameserver such as 8.8.8.8 (Google DNS).
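If you prefer the AWS CLI to the console, the DNS hostnames attribute can be inspected and changed roughly as follows. This is a sketch, assuming the AWS CLI is installed and configured with permission to modify the VPC. The function names are made up for illustration, and the VPC ID you pass in is your own.

```shell
# Print "True" or "False" depending on whether DNS hostnames are enabled.
# Usage: vpc_dns_hostnames_status VPC_ID
vpc_dns_hostnames_status() {
  aws ec2 describe-vpc-attribute \
    --vpc-id "$1" \
    --attribute enableDnsHostnames \
    --query 'EnableDnsHostnames.Value' \
    --output text
}

# Enable DNS hostnames for the VPC.
# Usage: vpc_enable_dns_hostnames VPC_ID
vpc_enable_dns_hostnames() {
  aws ec2 modify-vpc-attribute \
    --vpc-id "$1" \
    --enable-dns-hostnames '{"Value":true}'
}
```

Check the status first; the modify call is only needed if the status prints False.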

I added an A record in the format *.chain

Type  Name/Host  IP Address     TTL
A     @          102.167.x.12   600 seconds
A     *.chain    156.52.x.30    600 seconds
A     releases   154.52.x.160   600 seconds
A     www        102.167.x.12   600 seconds

My machine is an AWS EC2 micro instance.
Yes, when I run zrok enable locally on my PC, it works fine. However, when I try to run the zrok share public command, I encounter the following issue:

(base) user@machine zrok_0.4.37_darwin_arm64 % ./zrok share public http://localhost:7899 --headless --verbose

[   0.340]   DEBUG sdk-golang/ziti.(*ContextImpl).authenticate: attempting to authenticate
[ERROR]: error creating proxy backend (error listening: failed to listen: no apiSession, authentication attempt failed: Post "https://<hostname-redacted>:8441/edge/client/v1/authenticate?method=cert": dial tcp: lookup <hostname-redacted>: no such host)
(base) user@machine zrok_0.4.37_darwin_arm64 %

The fact that the error says there is no apiSession would indicate the environment has not been properly initialized. Have you run "zrok enable" on the AWS system, and was it successful? Your output above shows the share command, but not the enable.

I still see "no such host" in the error. The machine you're using can't locate the DNS record imo

@TheLumberjack @mike.gorman @qrkourier
I have set up a Zrok instance hosted on AWS, and everything is working correctly there. However, when I attempt to use Zrok on my local machine, I encounter an error during the share command execution.

Environment Details

AWS Instance (Working Setup)

ubuntu@ip-172-31-46-125:~$ zrok status

Config:

 CONFIG           VALUE                   SOURCE 
 apiEndpoint      http://127.0.0.1:18080  env    
 defaultFrontend  public                  binary 

Environment:

 PROPERTY       VALUE   
 Secret Token   <<SET>> 
 Ziti Identity  <<SET>> 

Command executed on AWS instance:

ubuntu@ip-172-31-46-125:~$ zrok share public http://localhost:7899 --headless
[   0.134]    INFO sdk-golang/ziti.(*listenerManager).createSessionWithBackoff: {session token=[860088a9-3006-4631-8592-cdbedf3db9d2]} new service session
[   0.175]    INFO main.(*sharePublicCommand).run: access your zrok share at the following endpoints:
 https://qdgrkjjbj3sc.chain.ziglinkzag.com

Local Machine (Error Setup)

(base) john@local-machine zrok_0.4.37_darwin_arm64 % ./zrok status    

Config:

 CONFIG           VALUE                             SOURCE 
 apiEndpoint      https://api.chain.ziglinkzag.com  env    
 defaultFrontend  public                            binary 

Environment:

 PROPERTY       VALUE   
 Secret Token   <<SET>> 
 Ziti Identity  <<SET>> 

Command executed on local machine:

(base) john@local-machine zrok_0.4.37_darwin_arm64 % ./zrok share public http://localhost:7899   
[ERROR]: error creating proxy backend (error listening: failed to listen: no apiSession, authentication attempt failed: Post "https://ip-172-31-46-125:8441/edge/client/v1/authenticate?method=cert": dial tcp: lookup ip-172-31-46-125: no such host)

Problem

On my local machine, I get the following error:

error creating proxy backend (error listening: failed to listen: no apiSession, authentication attempt failed: Post "https://ip-172-31-46-125:8441/edge/client/v1/authenticate?method=cert": dial tcp: lookup ip-172-31-46-125: no such host)

Expected Behavior

The share command should work on the local machine as it does on the AWS instance, allowing me to access the shared endpoint.

Actual Behavior

The command fails with a "no such host" error, indicating that it cannot resolve the AWS instance hostname.

Workaround

I found a temporary workaround by adding the AWS instance's public IP to my local /etc/hosts file. The following entry resolves the issue:

sudo nano /etc/hosts

Add the following line:

<aws-public-ip> ip-172-31-46-125

With this change, the command works as expected on my local machine.
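For anyone repeating this workaround, the edit can be made idempotent so the same line is not appended twice. The helper below is a sketch only: `add_hosts_entry` is a made-up name, and it takes the hosts file path as an argument so it can be tried on a copy first.

```shell
# Append "IP NAME" to a hosts file unless NAME already has an entry.
# Usage: add_hosts_entry FILE IP NAME
add_hosts_entry() {
  hosts_file="$1"; ip="$2"; name="$3"
  # Look for NAME as a whole field (after the address) on a non-comment line.
  if awk -v n="$name" '
      !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
      END { exit !found }' "$hosts_file" 2>/dev/null; then
    echo "entry for $name already present"
  else
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    echo "added $ip $name"
  fi
}
```

Writing to /etc/hosts itself requires root, so the function would have to be run from a root shell with the instance's public IP and the internal hostname as arguments.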

Additional Information

  • Zrok Version: 0.4.37
  • Local Machine: macOS (Darwin ARM64)
  • API Endpoint on Local Machine: https://api.chain.ziglinkzag.com
  • API Endpoint on AWS Instance: http://127.0.0.1:18080

Possible Cause

It appears that the issue is related to DNS resolution. The local machine cannot resolve the AWS instance's hostname (ip-172-31-46-125) without manually updating the /etc/hosts file.

Request

Could you please provide guidance on a more permanent solution or configuration change that would avoid the need to update the /etc/hosts file manually?


The ip-172-31-46-125 address isn't a fully qualified domain name; it is the short hostname that appears in the usual shell prompt on EC2 VMs. EC2 instances usually have an FQDN like ec2-3-97-111-194.compute-1.amazonaws.com. So adding it to the hosts file as you did resolved it, but so should using the actual FQDN of the instance.
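To illustrate the difference, here is a rough sketch of two helpers. The function names are made up for illustration. `looks_like_fqdn` is only a heuristic (it checks for a dot), and `ec2_public_hostname` assumes it is run on the EC2 instance itself with the instance metadata service (IMDSv2) reachable and `curl` installed.

```shell
# Succeed if NAME contains at least one dot, i.e. it is more than a bare
# host label such as ip-172-31-46-125. Heuristic only.
looks_like_fqdn() {
  case "$1" in
    *.*) return 0 ;;
    *)   return 1 ;;
  esac
}

# Print the instance's public DNS name as reported by the EC2 instance
# metadata service. Only meaningful on the instance itself.
ec2_public_hostname() {
  token=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
  curl -s -H "X-aws-ec2-metadata-token: $token" \
    "http://169.254.169.254/latest/meta-data/public-hostname"
}
```

The metadata call may return nothing if the instance has no public IP or if DNS hostnames are disabled for the VPC.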

Thanks for the clarification! I am following the installation instructions from this OpenZiti Quickstart guide. Based on your explanation, it seems that I need to set the advertised addresses for the controller and router using environment variables to resolve the DNS issue.

I believe adding the following environment variables should fix it:

export ZITI_CTRL_EDGE_ADVERTISED_ADDRESS="ec2-3-97-111-194.compute-1.amazonaws.com"
export ZITI_ROUTER_ADVERTISED_ADDRESS="ec2-3-97-111-194.compute-1.amazonaws.com"

Could you please confirm if this is the correct approach? This should ensure that the local machine can resolve the public FQDN instead of the internal AWS hostname (ip-172-31-46-125).

Thanks in advance!

That should do it! It will also allow new environments to be properly initialized. (Using the correct FQDN, of course....)

Just FYI... If you don't have an enabled environment, the zrok share command will give you an error to that effect:

$ HOME=/non/existent zrok share public -b web .
[ERROR]: unable to create share (unable to load environment; did you 'zrok enable'?)

You understand correctly. Every device that uses zrok must be able to resolve and reach the FQDNs of the zrok controller, zrok frontend, Ziti controller(s), and Ziti router(s). These can all be found at the same public IP address.

Consider using a static IP and DNS provider instead of the temporary IP and generated domain name. Those are fine for a test, but changing the advertised addresses severely disrupts your zrok instance.

EC2 does not guarantee the stability of the public IP or generated name unless you pay extra for an elastic IP (EIP). The argument for using a DNS provider like Route53 is that your zrok instance must be protected by TLS for security. A wildcard certificate is practical because it allows zrok to invent public share tokens on the fly, and wildcard certificates require DNS verification.

I believe you followed the zrok self-hosting guide for Linux, which talks about using Nginx and Certbot to manage the wildcard certificate. Alternatively, you could use Caddy or Nginx Proxy Manager to replace both Nginx and Certbot.
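The requirement that every device can resolve and reach those FQDNs can be spot-checked from a client with something like the sketch below. It assumes `nc` (netcat) is installed with support for the `-z` and `-w` options; `check_endpoint` is a made-up helper name. A failed name lookup and a blocked port both show up as "unreachable".

```shell
# Report whether a TCP connection to HOST:PORT succeeds within 3 seconds.
# Usage: check_endpoint HOST PORT
check_endpoint() {
  if nc -z -w 3 "$1" "$2" >/dev/null 2>&1; then
    echo "reachable   $1:$2"
  else
    echo "unreachable $1:$2"
  fi
}
```

Going by the output earlier in this thread, the two things to check from the local machine would be the zrok API host on port 443 and the Ziti controller's advertised address on port 8441.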

Working perfectly, thanks so much!
