SSM Agent provides access to EC2 instances through AWS Systems Manager, over AWS endpoints. This adds security to your workloads in the following ways:
- SSH access is not required, so there is no need to create local users or take on the overhead of domain-joined machines.
- Internet access or VPN access on port 22 is not needed, as SSM uses the standard HTTPS port 443.
I did, however, face an unusual error where the SSM agent was installed correctly, yet users could not access the machine via Systems Manager. Thankfully, I had a system user created for accessing the machine on port 22, for admin tasks only. If that is not the case for you, the EC2 serial console can be used instead.
Error
Since the agent was up, I checked the logs at the default logging path for the ssm-agent on the machine. Errors are logged to /var/log/amazon/ssm/errors.log
2024-04-26 10:39:45 ERROR [minLog @ credentialrefresher.go.280] [CredentialRefresher] Retrieve credentials produced error: unexpected error getting instance profile role credentials or calling UpdateInstanceInformation. Skipping default host management fallback: retrieved credentials failed to report to ssm. RequestId: 749daea7-e6dd-40f8-af23-d3f0845f8112 Error: InvalidSignatureException: Signature expired: 20240426T093945Z is now earlier than 20240426T094023Z (20240426T094523Z - 5 min.)
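The error is pretty clear: my machine's clock was more than 5 minutes behind, so the signed request had already expired by the time it reached AWS. I even changed the timezone manually just to reassure myself, yet the issue remained unresolved. That makes sense in hindsight, since the timestamps in the signature are in UTC (note the trailing Z) and a timezone change does not move the underlying clock.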
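To put a number on the drift, the timestamps in the error message can be compared directly. This is only a rough sketch: the regular expression and the function names are my own, and it assumes the timestamp in parentheses is the service's current time while the first timestamp is the one my machine signed the request with.

```python
import re
from datetime import datetime, timezone

# Timestamps in the message look like 20240426T093945Z (UTC).
STAMP = r"(\d{8}T\d{6}Z)"
PATTERN = re.compile(
    rf"Signature expired: {STAMP} is now earlier than {STAMP} \({STAMP} - (\d+) min\.\)"
)


def parse_stamp(stamp):
    """Convert a compact UTC timestamp from the error message to a datetime."""
    return datetime.strptime(stamp, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)


def clock_skew_seconds(message):
    """Return how far the signing clock lags the service clock, or None if no match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    signed_at = parse_stamp(match.group(1))    # assumed: local clock at signing time
    service_now = parse_stamp(match.group(3))  # assumed: service clock at validation
    return (service_now - signed_at).total_seconds()


message = (
    "InvalidSignatureException: Signature expired: 20240426T093945Z is now "
    "earlier than 20240426T094023Z (20240426T094523Z - 5 min.)"
)
print(clock_skew_seconds(message))  # 338.0
```

Under those assumptions the clock was 338 seconds (5 min 38 s) behind, just past the 5-minute window mentioned in the message.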
Solution
I checked the chrony config on my instance, and at first glance everything seemed normal. I then compared it with the steps AWS recommends here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/set-time.html#configure-ec2-ntp
Only after reading it did I realise my machine was not using the Amazon Time Sync Service (the link-local address 169.254.169.123) for time sync. After updating the server entry, I restarted the agent, and this time users connected successfully.
vi /etc/chrony/chrony.conf
server 169.254.169.123 prefer iburst minpoll 4 maxpoll 4
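If you want to check several machines for the same gap, a small script can look for that server entry. This is a minimal sketch, not an official tool: the function name is hypothetical, and it assumes comment lines in chrony.conf start with #, !, ; or %.

```python
AMAZON_TIME_SYNC_IP = "169.254.169.123"
COMMENT_CHARS = "#!;%"  # assumed comment markers at the start of a chrony.conf line


def uses_amazon_time_sync(conf_text):
    """Return True if the config has an active 'server 169.254.169.123' line."""
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line[0] in COMMENT_CHARS:
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2 and fields[0].lower() == "server" and fields[1] == AMAZON_TIME_SYNC_IP:
            return True
    return False


before = "pool ntp.ubuntu.com iburst maxsources 4\n"
after = before + "server 169.254.169.123 prefer iburst minpoll 4 maxpoll 4\n"
commented = "# server 169.254.169.123 prefer iburst minpoll 4 maxpoll 4\n"

print(uses_amazon_time_sync(before))     # False
print(uses_amazon_time_sync(after))      # True
print(uses_amazon_time_sync(commented))  # False
```

On a real machine you would pass in the contents of the chrony config file; the path differs between distributions, so I have left the file reading out. The `before` string is just an illustrative example, not my actual config.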
chronyc sources -v
210 Number of sources = 1

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current synced, '+' = combined , '-' = not combined,
| /   '?' = unreachable, 'x' = time may be in error, '~' = time too variable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 169.254.169.123               3   4   377     0  -3588ns[-3959ns] +/-  291us
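The source line packs a lot into a few columns, so here is a rough sketch of how I read it, following the legend printed in the output header. The function name and the dictionary keys are my own invention; only the column meanings come from that legend.

```python
def parse_source_line(line):
    """Decode the first five columns of a 'chronyc sources' row."""
    fields = line.split()
    flags, name, stratum, poll, reach = fields[:5]
    return {
        "mode": flags[0],                  # '^' = server, '=' = peer, '#' = local clock
        "state": flags[1],                 # '*' = currently synced source
        "name": name,
        "stratum": int(stratum),
        "poll_seconds": 2 ** int(poll),    # Poll column is log2 of the interval
        # Reach is an octal register of the last 8 polls; each set bit is a reply.
        "successful_polls": bin(int(reach, 8)).count("1"),
    }


line = "^* 169.254.169.123 3 4 377 0 -3588ns[-3959ns] +/- 291us"
info = parse_source_line(line)
print(info["state"] == "*", info["poll_seconds"], info["successful_polls"])
# True 16 8
```

So the instance is synced to 169.254.169.123, polls it every 16 seconds (matching minpoll 4 maxpoll 4), and all of the last 8 polls got a reply.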
Pretty straightforward, **but only if you know what you're doing** 😉