
Is it possible to adjust the 512x512 to a different height/width? #22

Open
DustinBrett opened this issue Apr 19, 2023 · 3 comments


DustinBrett commented Apr 19, 2023

I have been adding this amazing software to my personal website, and currently I have it generating new wallpapers every few minutes. What I would love is to sync the height/width of the viewport with the image generation. I see some hardcoded 512s, but I am unable to build this without CUDA (at least in WSL). Is it possible on the JavaScript side to set these values? Thanks!

@MasterJH5574
Collaborator

Hi @DustinBrett, thanks for the suggestion! We're glad to see you using our work to generate new wallpapers, and adjustable image sizes would definitely make things cooler. However, this is still an ongoing effort, and it may take some time to ship. Sorry for the inconvenience. We will post an update once we support it.

@DustinBrett
Author

Thanks, and no problem about the inconvenience; I am just happy to use it. I've also made an "app" within my side project daedalOS to create images at the correct resolution.

[screenshot of the app in daedalOS]


matbee-eth commented Apr 28, 2023

Hey, I'm currently working through the code and trying to figure out where it differs from the current Diffusers implementation described here: https://huggingface.co/blog/stable_diffusion

The docs use the following example to set the width and height in torch.randn when creating the latent tensor:

latents = torch.randn(
    (batch_size, unet.in_channels, height // 8, width // 8),
    generator=generator,
)
latents = latents.to(torch_device)
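
For reference, the // 8 is the VAE's downsampling factor and the 4 latent channels (unet.in_channels) are fixed by the model, so 512x512 corresponds to a (1, 4, 64, 64) latent. A minimal sketch of deriving the latent shape for an arbitrary size, assuming plain torch and dimensions that are multiples of 8:

import torch

# Sketch: derive the latent shape for an arbitrary target size.
# The VAE downsamples by 8 and the UNet consumes 4 latent channels,
# so 512x512 -> (1, 4, 64, 64) and 512x768 -> (1, 4, 64, 96).
def make_latents(height, width, batch_size=1, seed=0):
    assert height % 8 == 0 and width % 8 == 0, "dims must be multiples of 8"
    generator = torch.Generator("cpu").manual_seed(seed)
    return torch.randn(
        (batch_size, 4, height // 8, width // 8),
        generator=generator,
    )

latents = make_latents(512, 768)  # shape: (1, 4, 64, 96)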

I don't see a similar step in the code. How is the latent noise currently being created, and where should I implement this? It's quite a bit different.

I see some references to this.vm.getFunction, but this loses me, as it doesn't point to any location I can find.
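
For what it's worth, the Python-side counterpart in TVM exposes compiled functions by name through the relax VirtualMachine, which appears to be what this.vm.getFunction mirrors in the web runtime. A minimal sketch, assuming TVM's relax API; the artifact path and function name are hypothetical stand-ins:

import tvm
from tvm import relax

# Sketch: load a compiled TVM artifact and fetch a function by name.
# "stable_diffusion.so" and "unet" are hypothetical placeholders for
# whatever deploy.py actually builds and names.
ex = tvm.runtime.load_module("stable_diffusion.so")
vm = relax.VirtualMachine(ex, tvm.cpu())
unet_fn = vm["unet"]  # analogous to this.vm.getFunction("unet") in JS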

There is a deploy.py that contains a section of interest, but again, I don't see how it's connected to the implementation.

        latents = torch.randn(
            (1, 4, 64, 64),
            device="cpu",
            dtype=torch.float32,
        )
        latents = tvm.nd.array(latents.numpy(), self.tvm_device)

        for i in tqdm(range(len(self.scheduler.timesteps))):
            t = self.scheduler.timesteps[i]
            self.debug_dump(f"unet_input_{i}", latents)
            self.debug_dump(f"timestep_{i}", t)
            noise_pred = self.unet_latents_to_noise_pred(latents, t, text_embeddings)
            self.debug_dump(f"unet_output_{i}", noise_pred)
            latents = self.scheduler.step(self.vm, noise_pred, latents, i)

        self.debug_dump("vae_input", latents)
        image = self.vae_to_image(latents)
        self.debug_dump("vae_output", image)
        image = self.image_to_rgba(image)
        return Image.fromarray(image.numpy().view("uint8").reshape(512, 512, 4))

Is this used in the deployed WebGL version?
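
To connect the two snippets: the (1, 4, 64, 64) above is just (1, 4, 512 // 8, 512 // 8), and the final reshape(512, 512, 4) is the RGBA output at full resolution. A minimal sketch of how those constants might be parametrized, assuming the compiled UNet/VAE modules were also built for the new shape (which is what currently pins everything to 512):

import torch
import tvm

# Sketch: parametrize deploy.py's hard-coded shapes. This only helps if
# the UNet/VAE were compiled for the same size; the hard-coded 512s
# exist because the current build fixes that shape at compile time.
def make_initial_latents(height, width, tvm_device):
    assert height % 8 == 0 and width % 8 == 0
    latents = torch.randn(
        (1, 4, height // 8, width // 8),  # was (1, 4, 64, 64) for 512x512
        device="cpu",
        dtype=torch.float32,
    )
    return tvm.nd.array(latents.numpy(), tvm_device)

# ...and the final image reshape would become:
#   Image.fromarray(image.numpy().view("uint8").reshape(height, width, 4))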
