I recently ran into an issue with a Debian VM instance that had been upgraded from Stretch to Buster. Prior to the upgrade, SSH via gcloud with OS Login worked just fine. After the upgrade, however, every attempt resulted in a permission denied error.
There are a lot of posts and threads about this particular error, but none of them provided the answers needed for this scenario. Worse yet, the backup user account was (suddenly?) no longer in the sudo group, which added a minor complication to the troubleshooting.
The problem as it appeared after the upgrade:
gcloud beta compute ssh --zone "<zone name>" "<vm name>" --tunnel-through-iap --project "<project name>"
<username>@compute.: Permission denied (publickey).
ERROR: (gcloud.beta.compute.ssh) [/usr/bin/ssh] exited with return code [255].
Troubleshooting – unique elements only
- Log in to Cloud Platform
- Create a snapshot of the VM
- Edit the VM instance
- Enable connecting to serial ports and save
- Connect to the serial port (I found it useful to do this from a terminal window on a separate screen):
- gcloud compute --project=<projectname> connect-to-serial-port <vmname> --zone=<zonename>
- Reset the VM
- Look for the following on the console as the VM reboots:
    localhost systemd[1]: Reloaded OpenBSD Secure Shell server.
    [ 12.027817] google_guest_agent[378]: ERROR oslogin.go:147 Error updating NSS cache: exec: "google_oslogin_nss_cache": executable file not found in $PATH.
    localhost google_guest_agent[378]: ERROR oslogin.go:147 Error updating NSS cache: exec: "google_oslogin_nss_cache": executable file not found in $PATH.
(Note: if you still have sudo access via the serial console, you can skip this next step, since it just adds an existing user to the sudo group.)
- Edit the VM metadata and add:
    key : startup-script
    value :
        #!/bin/bash
        usermod -aG sudo <username>
- Reset the VM
- Log in using the indicated username and verify sudo
- Remove the startup-script from metadata
- sudo systemctl list-unit-files | grep google | grep enabled
- Verify that the following are listed:
    google-disk-expand.service enabled
    google-guest-agent.service enabled
    google-osconfig-agent.service enabled
    google-shutdown-scripts.service enabled
    google-startup-scripts.service enabled
    google-oslogin-cache.timer enabled
  Note especially whether google-oslogin-cache.timer is missing.
- sudo apt-get update
- sudo apt-get install
- curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
- Set DIST to the release codename and write the Google package sources:
    DIST=$(cat /etc/os-release | grep "VERSION=" | sed "s/\"\|(\|)\|VERSION=//g" | awk '{print tolower($NF)}')
    sudo tee /etc/apt/sources.list.d/google-cloud.list << EOM
    deb http://packages.cloud.google.com/apt google-compute-engine-${DIST}-stable main
    deb http://packages.cloud.google.com/apt google-cloud-packages-archive-keyring-${DIST} main
    EOM
- sudo apt update
- sudo apt install -y google-cloud-packages-archive-keyring
- sudo apt install -y google-compute-engine google-osconfig-agent
- sudo reboot
- Check from your local machine to see if the problem is solved:
- gcloud beta compute ssh --zone "<zone name>" "<vm name>" --tunnel-through-iap --project "<project name>"
- If not, then run:
    sudo apt update
    sudo apt install google-compute-engine google-compute-engine-oslogin google-guest-agent google-osconfig-agent
- Check again from your local machine to see if the gcloud compute ssh connection works.
- If it’s all working, remember to clean up any unneeded snapshots
- Edit the VM instance, disable “connecting to serial ports” and save
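Two of the steps above are easy to get wrong by eye: working out the release codename for the package sources, and spotting which Google units are not enabled. Here is a rough sketch of both as small shell helpers. The function names (debian_codename, missing_google_units) and the EXPECTED_UNITS variable are my own and purely illustrative; the unit names are the ones listed in the steps, and the codename helper is a slightly stricter variant of the DIST pipeline used there, not the exact command. Both read from stdin, so they can be tried against saved output before being pointed at a live system.

```shell
#!/bin/sh
# Illustrative helpers only; names are hypothetical, not part of any Google tooling.

# Units this article expects to see as "enabled" after the upgrade.
EXPECTED_UNITS="google-disk-expand.service google-guest-agent.service google-osconfig-agent.service google-shutdown-scripts.service google-startup-scripts.service google-oslogin-cache.timer"

# Reads os-release content on stdin and prints the lowercase codename,
# e.g. the line VERSION="10 (buster)" yields the last word of that value.
debian_codename() {
  grep '^VERSION=' |
    sed -e 's/VERSION=//' -e 's/[()"]//g' |
    awk '{ print tolower($NF) }'
}

# Reads `systemctl list-unit-files` output on stdin and prints every expected
# unit that is not listed with state "enabled", one per line.
missing_google_units() {
  listing=$(cat)
  for unit in $EXPECTED_UNITS; do
    printf '%s\n' "$listing" |
      awk -v u="$unit" '$1 == u && $2 == "enabled" { found = 1 } END { exit !found }' ||
      printf '%s\n' "$unit"
  done
}

# Possible usage on the VM (not run here):
#   DIST=$(debian_codename < /etc/os-release)
#   systemctl list-unit-files | missing_google_units
```

If missing_google_units prints google-oslogin-cache.timer, that matches the situation described above, where the OS Login pieces of the guest environment did not survive the upgrade.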
Wrap up and Links
Hopefully that helped solve your problem. If not, here are a few links that may guide you towards a successful resolution:
- https://cloud.google.com/compute/docs/images/install-guest-environment
- https://cloud.google.com/sdk/gcloud/reference/compute/connect-to-serial-port
- https://cloud.google.com/compute/docs/troubleshooting/troubleshooting-using-serial-console
- https://cloud.google.com/compute/docs/instances/managing-instance-access
- https://cloud.google.com/compute/docs/oslogin
- https://cloud.google.com/compute/docs/oslogin/troubleshoot-os-login
- https://cloud.google.com/compute/docs/instances/startup-scripts/linux
- https://cloud.google.com/compute/docs/troubleshooting/troubleshooting-ssh
- https://bobcares.com/blog/google-cloud-compute-engine-ssh/
- https://cloud.google.com/sdk/gcloud/reference/compute/config-ssh
- https://www.reddit.com/r/googlecloud/comments/6vhrgs/permission_denied_publickey/
- https://medium.com/google-cloud/resolving-getting-locked-out-of-a-compute-engine-85800251890b
- https://stackoverflow.com/questions/62952693/google-compute-engine-virtual-machine-instance-unable-to-login-ssh-the-vm-insta
- https://gist.github.com/pydevops/cffbd3c694d599c6ca18342d3625af97
- https://stackoverflow.com/questions/35016795/get-root-password-for-google-cloud-engine-vm