The ability to generate high-quality images quickly is a game-changer for applications like training self-driving cars to navigate complex environments and predict hazards on the road. However, current generative AI techniques for producing such images have significant limitations. Diffusion models, while producing incredibly realistic images, are slow and resource-intensive. Autoregressive models, on the other hand—like the ones behind LLMs such as ChatGPT—are fast but often produce images with errors and poor detail. Now, researchers from MIT and NVIDIA have developed a solution that combines the strengths of both approaches.
Their groundbreaking hybrid image-generation tool, known as HART (Hybrid Autoregressive Transformer), combines an autoregressive model for fast, high-level image generation with a smaller diffusion model that refines and enhances image details. As described in a paper on the arXiv preprint server, HART produces images that match or even surpass the quality of current state-of-the-art diffusion models, all while running up to nine times faster.
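To make the two-stage idea concrete, here is a minimal toy sketch of the pipeline shape the article describes: a fast stage drafts a coarse output token by token, and a small iterative refinement stage cleans up the residual detail. This is purely illustrative—the function names, shapes, and update rule are assumptions for demonstration and do not reflect HART's actual models or training.

```python
# Toy sketch of a coarse-then-refine pipeline (NOT HART's real implementation).
import numpy as np

rng = np.random.default_rng(0)

def autoregressive_stage(n_tokens=16):
    """Stand-in for the fast autoregressive stage: emits a coarse
    'image' as a sequence of discrete tokens, one at a time."""
    tokens = []
    for _ in range(n_tokens):
        # In a real model each token would condition on all previous tokens.
        tokens.append(int(rng.integers(0, 256)))
    return np.array(tokens, dtype=np.float64)

def diffusion_refinement(coarse, steps=4):
    """Stand-in for the small diffusion stage: starts from a noisy
    version of the coarse output and iteratively denoises it."""
    x = coarse + rng.normal(0.0, 8.0, size=coarse.shape)  # noisy start
    for _ in range(steps):
        x = x + 0.5 * (coarse - x)  # each step removes half the residual noise
    return x

coarse = autoregressive_stage()
refined = diffusion_refinement(coarse)
# After a few refinement steps, the output sits close to the coarse draft,
# with the injected noise largely removed.
print(float(np.abs(refined - coarse).mean()))
```

The design point the sketch mirrors is that the expensive iterative loop runs for only a few steps on a residual, rather than generating the whole image from scratch, which is where the speedup over a pure diffusion model comes from.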
