Media creation with artificial intelligence: Difference between revisions

m
no edit summary
mNo edit summary
mNo edit summary
Line 64: Line 64:


==Tools==
==Tools==
===2D images===
===Images===
[https://www.nvidia.com/en-us/glossary/multimodal-large-language-models/ Multimodal LLM]s and plugin-using LLMs such as ChatGPT, Copilot, Gemini, Grok and Meta AI provide image generation capabilities.
[https://www.nvidia.com/en-us/glossary/multimodal-large-language-models/ Multimodal LLM]s and plugin-using LLMs such as ChatGPT, Copilot, Gemini, Grok and Meta AI provide image generation capabilities.
* In addition, there are specialized tools (e.g., diffusion-based systems) that offer more control and customization. However, most users already have access to at least one of the LLM platforms above and can begin generating images immediately.
* In addition, there are specialized tools (e.g., diffusion-based systems) that offer more control and customization. However, most users already have access to at least one of the LLM platforms above and can begin generating images immediately.
Line 94: Line 94:


AUTOMATIC1111 (a web UI for Stable Diffusion), ComfyUI
AUTOMATIC1111 (a web UI for Stable Diffusion), ComfyUI
'''Expert workflow'''
This would include training custom models. The idea is to give the models a mental image of objects that matches what screenshots of 3D scenes would show, so that prompts yield images that almost never include hallucinated details.
Custom models are interesting for rapid prototyping because they reduce the need for a more complex, combined 2D-3D workflow.


===Videos===
===Videos===
'''Google Veo'''
'''Grok''' (xAI)
'''Grok''' (xAI)
'''Gemini''' (Google)
   
   
'''Sora''' (OpenAI)
'''Sora''' (OpenAI)
* In 2026, it was announced that Sora would be discontinued. It will probably just be paused for a few years, until future AI chips have lowered computation costs.


===3D objects===
===3D objects===