ByteDance launches Bagel, a new AI model it claims excels at image generation and editing
Bagel: ByteDance claims that Bagel edits images better than other existing open-source VLMs. It can handle tasks such as adding emotion to an image, removing, replacing or adding an element, style transfer, and free-form editing, i.e. making changes without being restricted to a fixed framework.
Chinese tech giant ByteDance has introduced its new multimodal artificial intelligence (AI) model, Bagel. It is a visual language model (VLM) that can not only comprehend images but also create and edit them. The biggest news is that the firm has open-sourced the model, which is now available to download from popular AI platforms such as GitHub and Hugging Face.
Features of Bagel:
- Multimodal input: Capable of understanding and processing both text and images simultaneously.
- 14 billion parameters: Of these, 7 billion are active at a time.
- Interleaved training data: The model is trained on text and images together, allowing Bagel to make better connections between the two.
ByteDance claims that Bagel edits images better than other existing open-source VLMs. According to the company, it can handle tasks such as adding emotion to an image, removing, replacing or adding an element, style transfer, and free-form editing, i.e. making changes without being restricted to a fixed framework.