Disk resize for VM not updated in terraform state #1271

Open
mattburchett opened this issue May 8, 2024 · 14 comments

Labels
🐛 bug Something isn't working 🤷 can't reproduce

@mattburchett

Describe the bug
After increasing the disk size in Terraform and applying, the state still contains the old value, and a subsequent plan tries to increase it again.

To Reproduce
Steps to reproduce the behavior:

  1. Create a VM resource
  2. Increase the disk size by any amount (see the sketch after these steps)
  3. terraform apply
  4. Once complete, run terraform plan
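A minimal sketch of the change from step 2, using the resource and disk attributes from the full config later in this issue; the "local-zfs" datastore name and the 50 GB → 100 GB values are illustrative, taken from the log below:

# disk size bumped from 50 to 100 (GB); everything else unchanged
resource "proxmox_virtual_environment_vm" "vms" {
  # ... other arguments as in the full config below ...

  disk {
    datastore_id = "local-zfs"  # illustrative datastore name
    interface    = "virtio0"
    size         = 100          # previously 50
  }
}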

Expected behavior
Terraform's state should have the new value after applying.
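For reference, a sketch of how the stored value could be checked after the apply (the resource address matches the plan output below; terraform state show prints the attributes currently recorded in state):

# inspect the disk size currently recorded in Terraform state
terraform state show 'proxmox_virtual_environment_vm.vms["web"]' | grep -i size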

Log

# Original Apply
Terraform will perform the following actions:

  # proxmox_virtual_environment_vm.vms["web"] will be updated in-place
  ~ resource "proxmox_virtual_environment_vm" "vms" {
        id                      = "100"
        name                    = "web"
      + protection              = false
        # (26 unchanged attributes hidden)

      ~ disk {
          ~ size              = 50 -> 100
            # (11 unchanged attributes hidden)
        }

        # (8 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

proxmox_virtual_environment_vm.vms["web"]: Modifying... [id=100]
proxmox_virtual_environment_vm.vms["web"]: Still modifying... [id=100, 10s elapsed]
proxmox_virtual_environment_vm.vms["web"]: Still modifying... [id=100, 20s elapsed]
proxmox_virtual_environment_vm.vms["web"]: Still modifying... [id=100, 30s elapsed]
proxmox_virtual_environment_vm.vms["web"]: Still modifying... [id=100, 40s elapsed]
proxmox_virtual_environment_vm.vms["web"]: Still modifying... [id=100, 50s elapsed]
proxmox_virtual_environment_vm.vms["web"]: Still modifying... [id=100, 1m0s elapsed]
proxmox_virtual_environment_vm.vms["web"]: Still modifying... [id=100, 1m10s elapsed]
proxmox_virtual_environment_vm.vms["web"]: Still modifying... [id=100, 1m20s elapsed]
proxmox_virtual_environment_vm.vms["web"]: Still modifying... [id=100, 1m30s elapsed]
proxmox_virtual_environment_vm.vms["web"]: Modifications complete after 1m34s [id=100]

# terraform plan

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

  # proxmox_virtual_environment_vm.vms["web"] will be updated in-place
  ~ resource "proxmox_virtual_environment_vm" "vms" {
        id                      = "100"
        name                    = "web"
        # (27 unchanged attributes hidden)

      ~ disk {
          ~ size              = 50 -> 100
            # (11 unchanged attributes hidden)
        }

        # (8 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

  • Single or clustered Proxmox: Single
  • Proxmox version: 8.x
  • Provider version (ideally it should be the latest version): latest (v0.55.1)
  • Terraform/OpenTofu version: Terraform 1.5.7
  • OS (where you run Terraform/OpenTofu from): Ubuntu 22.04 LTS

Let me know if I can provide any more information that would be useful.

mattburchett added the 🐛 bug Something isn't working label on May 8, 2024
@bpg
Owner

bpg commented May 8, 2024

One question about step 1: did you clone the VM from a template / another VM, or create it from scratch?

@mattburchett
Author

mattburchett commented May 8, 2024

It is a full clone (not a linked clone) from a template.

I'm also not sure if it's relevant, but the template uses one of Ubuntu's cloud-init images.

@KorzunKarl

@mattburchett You need to install and enable qemu-guest-agent on the target VM, because after resizing, the VM requires a reboot, and without an agent this is not possible even from the Proxmox interface. You can install it with cloud-init, and add agent = true to the target VM.
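For illustration, a minimal sketch of that suggestion with this provider, assuming a snippets-enabled "local" datastore and a hypothetical snippet resource name; the agent block mirrors the one already used in the config further down:

# hypothetical cloud-init snippet that installs and starts the guest agent
resource "proxmox_virtual_environment_file" "agent_cloud_config" {
  content_type = "snippets"
  datastore_id = "local"     # assumed: a datastore with snippets enabled
  node_name    = "lrhq-pve"

  source_raw {
    file_name = "agent-cloud-config.yaml"
    data      = <<-EOF
      #cloud-config
      packages:
        - qemu-guest-agent
      runcmd:
        - systemctl enable --now qemu-guest-agent
    EOF
  }
}

# and in the VM resource:
# agent {
#   enabled = true
# }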

@mattburchett
Author

@mattburchett You need to install and enable qemu-guest-agent on the target VM, because after resizing, the VM requires a reboot, and without an agent this is not possible even from the Proxmox interface. You can install it with cloud-init, and add agent = true to the target VM.

It does have qemu-guest-agent enabled.

[screenshot]

And my Terraform config:

resource "proxmox_virtual_environment_vm" "vms" {
  for_each = local.pve-hosts

  name        = each.key
  description = "Managed by Terraform"

  node_name = each.value.target-node
  vm_id     = split(".", each.value.ip.address)[3]

  agent {
    enabled = true
  }

  clone {
    datastore_id = each.value.hardware.storage
    retries      = 10
    node_name    = "lrhq-pve"
    vm_id        = each.value.template
  }

  cpu {
    cores   = each.value.hardware.cores
    sockets = 1
    type    = "host"
  }

  disk {
    datastore_id = each.value.hardware.storage
    interface    = "virtio0"
    size         = each.value.hardware.disk_size
  }

  initialization {
    datastore_id = each.value.hardware.storage
    ip_config {
      ipv4 {
        address = "${each.value.ip.address}/${each.value.ip.cidr}"
        gateway = each.value.ip.gw
      }
    }

    user_data_file_id = proxmox_virtual_environment_file.ubuntu_cloud_config[each.key].id
  }

  lifecycle {
    ignore_changes = [
      initialization[0].user_data_file_id
    ]
  }


  memory {
    dedicated = each.value.hardware.memory
  }

  network_device {}

  on_boot = true

  operating_system {
    type = "l26"
  }

  serial_device {}

  depends_on = [cloudflare_record.proxmox-pve-dns]
}

@KorzunKarl

@mattburchett Could you please check the status of the agent on this VM?
[screenshot]

@mattburchett
Author

web ~ [0]# systemctl status qemu-guest-agent.service
● qemu-guest-agent.service - QEMU Guest Agent
     Loaded: loaded (/lib/systemd/system/qemu-guest-agent.service; static)
     Active: active (running) since Wed 2024-05-08 17:59:16 UTC; 4s ago
   Main PID: 148060 (qemu-ga)
      Tasks: 2 (limit: 2309)
     Memory: 688.0K
        CPU: 6ms
     CGroup: /system.slice/qemu-guest-agent.service
             └─148060 /usr/sbin/qemu-ga

May 08 17:59:16 lrhq-web.linuxrocker.cloud systemd[1]: Started QEMU Guest Agent.

I did restart it a moment ago because the logs showed some guest-ping errors, but those weren't recent.

And after the restart, a terraform plan still shows that it wants to change the disk.

  # proxmox_virtual_environment_vm.vms["web"] will be updated in-place
  ~ resource "proxmox_virtual_environment_vm" "vms" {
        id                      = "100"
        name                    = "web"
        # (27 unchanged attributes hidden)

      ~ disk {
          ~ size              = 50 -> 100
            # (11 unchanged attributes hidden)
        }

        # (8 unchanged blocks hidden)
    }
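A sketch that might help narrow down whether the stale 50 lives in the state file or in what the provider reads back from PVE on refresh; if a refresh-only plan still reports 50, the old size is coming from the PVE side rather than from stale state:

# show what a pure refresh would change in state, without proposing updates
terraform plan -refresh-only

# optionally write the refreshed values back to state
terraform apply -refresh-only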

@bpg
Owner

bpg commented May 10, 2024

@mattburchett Hmm... I can't reproduce this issue in my simple test. 🤔
Does your template VM have more than one disk? Could you possibly take a screenshot of its hardware configuration and post it here?

@mattburchett
Author

mattburchett commented May 11, 2024

Sure thing.

[screenshot: template hardware configuration]

I'll also toss in the information for my template creation:

resource "proxmox_virtual_environment_file" "ubuntu_jammy_template" {
  content_type = "iso"
  datastore_id = "local"
  node_name    = "lrhq-pve"

  source_file {
    path = "https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img"
  }
}

resource "proxmox_virtual_environment_vm" "ubuntu-jammy-template" {
  name        = "ubuntu-jammy-template"
  description = "Managed by Terraform"

  node_name = "lrhq-pve"
  vm_id     = 9006

  cpu {
    cores   = 1
    sockets = 1
    type    = "kvm64"
    flags   = ["+aes"]
  }

  disk {
    datastore_id = "local-zfs"
    file_id      = proxmox_virtual_environment_file.ubuntu_jammy_template.id
    interface    = "virtio0"
  }

  on_boot  = false
  started  = false
  template = true
}

@mattburchett
Author

Well, that's interesting. I actually wonder if it's a Proxmox bug.

On the VM, I can see the disk increase took place:

web ~ [0]# df -h  /
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        97G   19G   79G  19% /

But when I view the hardware for the VM in Proxmox, it shows a disk size of 50G.

[screenshot: VM hardware view in Proxmox showing a 50G disk]

I'm going to run some Proxmox updates to bring myself to the latest version, then test with another VM and see if I can replicate it.
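For cross-checking on the PVE host itself, a sketch (VMID 100 as in the plan output above; qm rescan asks PVE to re-read volume sizes and update the VM config, and the exact command form may vary slightly between PVE versions):

# what the VM config currently records for the virtio0 disk
qm config 100 | grep virtio0

# rescan storages and update disk sizes in the VM config
qm rescan --vmid 100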

@mattburchett
Author

Yeah, I think this might be a Proxmox bug. It still happened on my test VM after an update to 8.2.2 (from 8.0).

I found someone with the same issue, but with an LXC container: https://bugzilla.proxmox.com/show_bug.cgi?id=305

I gave some information over there to see if it's a bug on their end.

@bpg
Owner

bpg commented May 12, 2024

Hmm... I've tried your exact template & VM on both ZFS and LVM storage, and had no issues. PVE v8.2.2 as well. Perhaps something specific to your ZFS config?

@mattburchett
Author

I'm not certain honestly. I don't think I've done anything specific with ZFS on Proxmox. It's pretty much a bog-standard single-node install with ZFS in a RAIDZ2, with ZFS on root, all done through the installer.

I do have some server setup that is done via Ansible, but nothing that messes with the storage arrays. It pretty much just installs monitoring and sets up my shell.

@svengreb
Contributor

@mattburchett Is there a hardware RAID controller that ZFS runs on? If so, this is neither supported nor recommended by ZFS, because ZFS "likes" to see all disks directly, and some features even require this, e.g. to protect against silent bitrot.

@mattburchett
Author

Proxmox is running on a PowerEdge R430 with a PERC H730 Mini. The PERC is not configured and is just passing the devices through via JBOD.

[screenshot]

A zpool status from the host:

lrhq-pve ~ [0]# zpool status
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 01:11:57 with 0 errors on Sun May 12 01:35:58 2024
config:

        NAME                                                   STATE     READ WRITE CKSUM
        rpool                                                  ONLINE       0     0     0
          raidz2-0                                             ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S6PTNS0W113188Z-part3  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S6PTNM0TA47793P-part3  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S6PTNS0W113144X-part3  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S6PTNM0TA47728D-part3  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S6PTNS0W113153M-part3  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S6PTNM0TA47790F-part3  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S6PTNM0TA48482L-part3  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_1TB_S6PTNM0TA49482Y-part3  ONLINE       0     0     0

errors: No known data errors
