'''GenAI systems operate probabilistically.''' Do not expect identical results when repeating prompts with the same inputs. The same text prompts may produce similar, but not identical, outputs. Therefore, in some scenarios, it can be beneficial to generate multiple results and select the most suitable candidates for your intermediate or final goal.
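The "generate several, keep the best" idea can be sketched as a best-of-n loop. The `generate` and `score` functions below are hypothetical stubs standing in for a real model call and for human selection; only the selection pattern itself is the point.

```python
import random

def generate(prompt: str, seed: int) -> str:
    """Stand-in for a GenAI call: same prompt, different seed -> different output.
    (Hypothetical stub; replace with your actual model or API call.)"""
    rng = random.Random(hash(prompt) ^ seed)
    variants = ["sketchy", "detailed", "painterly", "photoreal"]
    return f"{rng.choice(variants)} rendering of: {prompt}"

def score(candidate: str) -> int:
    """Stand-in quality metric; in practice this is a human picking a favorite."""
    return len(candidate)  # placeholder heuristic

def best_of_n(prompt: str, n: int = 4) -> str:
    # Generate several candidates from different seeds, keep the most suitable.
    candidates = [generate(prompt, seed) for seed in range(n)]
    return max(candidates, key=score)
```

In real use, `score` is usually you looking at the candidates; the loop simply makes the probabilistic variation work for you instead of against you.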
==Sounds==
For natural voices you may want to look for emotional text-to-speech.
For professional-level fine-tuning you might want to look at editors like [https://www.celemony.com/en/melodyne/what-is-melodyne Melodyne].
==Images==
===Techniques===
Creating new content
* Text-only prompts
* Text prompts with one or multiple references
Changing existing content
* Expanding
* Inpainting (replacement of subsections)
* Style transfers
===Workflows===
[https://www.nvidia.com/en-us/glossary/multimodal-large-language-models/ Multimodal LLM]s and plugin-using LLMs such as ChatGPT, Copilot, Gemini, Grok and Meta AI provide image generation capabilities.
* In addition, there are specialized tools (e.g., diffusion-based systems) that offer more control and customization. However, most users already have access to at least one of these platforms and can begin generating images immediately.
For image editing and post-processing, dedicated graphics software such as [https://www.gimp.org/downloads/ GIMP], [https://krita.org Krita], or [https://www.adobe.com/products/photoshop.html Photoshop] is recommended. These tools allow precise control (e.g., masking, compositing, color correction) and can complement GenAI workflows.
In general, you will always want to take these steps: generate, refine via text prompts, select final candidates, refine via graphics tools.
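These steps can be sketched as a small pipeline. All four functions are hypothetical stubs (the prompt text, draft count, and the GIMP step are illustrative assumptions); the point is the order of the stages, not the implementations.

```python
# Sketch of the general loop: generate, refine via prompts, select, post-edit.
# Every function here is a stand-in for a model call or a manual tool step.

def generate(prompt):
    # A GenAI call would return several image candidates; we fake three drafts.
    return [f"draft {i}: {prompt}" for i in range(3)]

def refine(drafts, feedback):
    # Re-prompting with feedback produces adjusted candidates.
    return [f"{d} [{feedback}]" for d in drafts]

def select(candidates):
    # In practice this is a human choosing the best candidate.
    return candidates[0]

def post_edit(image):
    # Placeholder for manual work in a graphics tool such as GIMP or Krita.
    return image + " (color-corrected in GIMP)"

drafts = generate("red dragon, side view, cel shading")
drafts = refine(drafts, "less saturation, softer shadows")
final = post_edit(select(drafts))
```

The selection and post-editing stages stay manual on purpose: they are where the tools listed above do the precise work that prompting alone cannot.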
====Beginner workflows====
Despite their limitations, chatbot-based workflows are still an improvement over purely manual workflows: they speed up prototyping and let you explore different creative directions for drafts.
=====ChatGPT=====
* '''Iterative prompting''': Describe the desired result as clearly as possible: motif, perspective, colors, lights, shadows, art style. Refine the prompt step by step based on undesired aspects rather than expecting a perfect result on the first attempt. You can also use negative prompts to explicitly state what you do not want.
* '''Avoid quality loss''': If image quality degrades after too many iterations, start over with a single combined text prompt instead of continuing to edit in place.
* '''Consistency strategies''': When available, use features such as seeds, style references, or controlled variations to maintain visual coherence across multiple images.-->
(Add examples here.)
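The iterative-prompting and fresh-start tips above can be sketched as a helper that folds accumulated refinements into one combined prompt. The base prompt, refinement wording, and the `no ...` convention for negative constraints are illustrative assumptions, not a fixed syntax of any particular chatbot.

```python
# Combine iterative refinements into one fresh prompt, so you can restart
# generation instead of stacking many in-place edits that degrade quality.

base_prompt = "medieval village, bird's-eye view, watercolor style"
refinements = [
    "warmer lighting",
    "fewer houses in the foreground",
    "no people",           # negative constraint, flagged by the "no " prefix
]

# Split refinements into positive additions and negative exclusions.
positive = [r for r in refinements if not r.startswith("no ")]
negative = [r[len("no "):] for r in refinements if r.startswith("no ")]

combined = base_prompt + ", " + ", ".join(positive)
if negative:
    combined += ". Avoid: " + ", ".join(negative)
```

Pasting `combined` into a brand-new chat reproduces all accumulated intent in one generation pass, sidestepping the quality loss from long edit chains.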
=====Copilot=====
=====Gemini=====
====Advanced workflows====
AUTOMATIC1111 (a web UI for Stable Diffusion)
ComfyUI
====Expert workflows====
This includes training your own models. The idea is to give the model a neural representation of objects comparable to taking screenshots of a 3D scene, so that prompt outputs almost never hallucinate details. Artists can thereby reduce post-editing, since the generated outputs already reflect their own style.
Custom models further boost rapid prototyping because they reduce the need for a more complex, combined 3D-to-2D workflow.
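One concrete part of such a training setup is preparing the dataset: pairing multi-angle renders of an object with captions, so the fine-tuned model learns a view-consistent representation. The object name, file-naming scheme, and caption template below are assumptions for illustration, not a required format of any training framework.

```python
# Sketch: build (render, caption) training pairs from multi-angle 3D renders.
# The renders themselves are assumed to exist already (e.g., from Blender).

OBJECT = "hero_spaceship"
ANGLES = range(0, 360, 30)  # one render every 30 degrees around the object

def training_pairs(obj: str, angles) -> list[tuple[str, str]]:
    pairs = []
    for angle in angles:
        image = f"renders/{obj}_{angle:03d}.png"      # pre-rendered screenshot
        caption = f"{obj.replace('_', ' ')}, {angle} degree view, studio lighting"
        pairs.append((image, caption))
    return pairs

dataset = training_pairs(OBJECT, ANGLES)
```

Covering the object from many angles is what lets the model "know" its geometry well enough that prompted views stop hallucinating details.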
==Videos==
'''Google Veo'''
'''Sora''' (OpenAI)
* In 2026, Sora was announced to be discontinued. It will probably just be paused for a few years until future AI chips have lowered computation costs.
==3D content==
There are content generators that turn 2D data into 3D data by calculating plausible values for the missing dimension.
===3D objects===
* ...
===3D animations===
* ...
==World maps==
* ...
* World generator inside Unreal Engine 5
** As of 2026, this is technically speaking still "procedural", but it is plausible to expect an LLM-driven approach in the future. LLM-driven experiments can already be found online.
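"Procedural" here means a seeded, deterministic function produces the terrain with no ML model involved. The sketch below illustrates that distinction with a tiny heightmap generator; the grid size and the single box-blur smoothing pass are arbitrary choices for illustration, not how Unreal Engine 5 works internally.

```python
import random

# Procedural terrain in miniature: the same seed always yields the same map,
# which is exactly the property an LLM-driven generator would not guarantee.

def heightmap(seed: int, size: int = 8) -> list[list[float]]:
    rng = random.Random(seed)
    raw = [[rng.random() for _ in range(size)] for _ in range(size)]
    # One box-blur pass so neighbouring heights vary smoothly, terrain-like.
    smooth = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            cells = [raw[j][i]
                     for j in range(max(0, y - 1), min(size, y + 2))
                     for i in range(max(0, x - 1), min(size, x + 2))]
            smooth[y][x] = sum(cells) / len(cells)
    return smooth

world_a = heightmap(seed=42)
world_b = heightmap(seed=42)  # identical: procedural generation is repeatable
```

An LLM-driven approach would instead interpret a text description ("a volcanic island with a sheltered bay") and produce the map from learned priors, trading this repeatability for controllability via language.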