πŸ“ Text Generation

Models like GPT, Gemini, LLaMA create essays, articles, stories, and conversations.

πŸ–ΌοΈ Image Generation

Tools like Stable Diffusion, DALLΒ·E, MidJourney create realistic or artistic images from text prompts.

🎡 Audio & Music

Models like MusicLM, ElevenLabs generate songs, background music, and realistic human voices.

🎬 Video Generation

Platforms like Runway, Pika Labs can create short clips, animations, and even movies from prompts.

πŸ’» Code Generation

AI tools like GitHub Copilot, Tabnine help developers write code faster and smarter.

🌐 Multimodal AI

New models like GPT-4o, Gemini 1.5 handle text, images, audio, and video together for richer interactions.

πŸ”§ Example: Generate an Image

A Python snippet using Stable Diffusion to create an image from a text prompt:

from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

image = pipe("a futuristic cityscape at sunset").images[0]
image.save("city.png")
βœ… Summary

Generative AI spans multiple media types: text, images, audio, video, and code. With multimodal AI, all of these can be combined for advanced applications.