[[Category:Real World]]
Sub pages:
* [[Media_creation_with_artificial_intelligence/Examples|Examples gallery]]
==Copyright and fair use==
To understand the full picture of copyright, it is necessary to look at its real-world implementation.
===Music generators===
====Suno====
* https://suno.com/create
====Lyria (Google Gemini)====
* https://gemini.google/overview/music-generation/
* Inpainting (replacement of subsections)
* Style transfers
===Select best tool per use-case===
====Overview====
Right now, Gemini seems to perform best in most use-cases.
(Add table.)
====Drafting====
====Upscaling====
Very often you will prefer the generated style of one tool over that of other tools. However, that should not stop you from trying out other tools for other tasks. In the end, a combination of generators can get you closer to the result you had in mind.
For instance, if you like ChatGPT for its style, you may still want to upscale a draft or reference image in Gemini first.
=====Gemini=====
* Free users: generally capped at 1K resolution.
* AI Plus/Pro subscribers: can access 2K resolution.
* AI Ultra subscribers: have full access to the 4K resolution toggle and downloads.
====Fine-drawing====
====Generating====
====Coloring====
====Shading====
====Editing and fine-tuning====
===Workflows===
====Beginner workflows====
[https://www.nvidia.com/en-us/glossary/multimodal-large-language-models/ Multimodal LLM]s and plugin-using LLMs such as ChatGPT, Copilot, Gemini, Grok and Meta AI provide image generation capabilities.
* In addition, there are specialized tools (e.g., diffusion-based systems) that offer more control and customization. However, most users already have access to at least one such '''chatbot''' and can begin generating images immediately.
* For '''high-volume generation''', a '''paid subscription''' or plan is typically '''required'''.
'''For image editing''' and post-processing, '''dedicated graphics software''' such as [https://www.gimp.org/downloads/ GIMP], [https://krita.org Krita], or [https://www.adobe.com/products/photoshop.html Photoshop] '''is recommended'''. These tools allow precise control (e.g., '''masking, compositing, color correction''') and can '''complement GenAI workflows'''.
In general you will always want to take these steps: generate, refine via text prompts, select final candidates, refine via graphics tools.
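The steps above can be sketched as a simple loop. This is purely illustrative: <code>generate</code>, <code>refine</code> and <code>score</code> are stand-ins for whatever tool (and your own eye) actually fills these roles, not a real API.

```python
# Illustrative sketch of the generate -> refine -> select workflow.
# generate(), refine() and score() are placeholders for real tool calls.

def generate(prompt, n=4):
    """Produce n candidate images for a prompt (stubbed as strings)."""
    return [f"{prompt} [candidate {i}]" for i in range(n)]

def refine(candidate, feedback):
    """Refine one candidate via a follow-up text prompt (stubbed)."""
    return f"{candidate} + {feedback}"

def score(candidate):
    """Rank candidates; in practice this step is you, looking at them."""
    return len(candidate)  # placeholder heuristic

def draft_image(prompt, feedback_rounds):
    candidates = generate(prompt)
    for feedback in feedback_rounds:
        candidates = [refine(c, feedback) for c in candidates]
    # Select the best candidate; the final touch-up then happens
    # in a graphics tool such as GIMP, Krita or Photoshop.
    return max(candidates, key=score)

best = draft_image("castle at dusk", ["warmer light", "add fog"])
```

The point of the loop structure is that text-prompt refinement is cheap and repeatable, while manual editing is expensive, so it is deferred to the single selected candidate.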
Despite their limitations, chatbot-based workflows are most often a big '''improvement over purely manual workflows''': they '''speed up prototyping''' and let you explore different creative directions for drafts.
When you want to work with chatbots, you effectively have to learn [[wp:prompt_engineering|prompt engineering]]: the craft of writing ''good prompts''.
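One common prompt-engineering habit is to keep prompts structured rather than free-form: subject first, then style, then details, then exclusions. A minimal sketch (the field choices here are just one convention, not a standard):

```python
# Sketch of a structured image prompt builder.
# The subject/style/details/avoid split is one common convention.

def build_prompt(subject, style, details=(), avoid=()):
    """Assemble a structured image prompt from labeled parts."""
    parts = [subject, f"style: {style}"]
    parts += list(details)
    if avoid:
        parts.append("avoid: " + ", ".join(avoid))
    return ", ".join(parts)

prompt = build_prompt(
    "a lighthouse on a cliff",
    "watercolor",
    details=("stormy sea", "dramatic lighting"),
    avoid=("text", "watermarks"),
)
```

Keeping the parts separate makes it easy to vary one dimension (say, the style) while holding the rest of the prompt constant between attempts.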
=====General notes on image upgrading=====
When upgrading a very low-quality yet important image, you probably want to '''upgrade specific elements first''' so details are not hallucinated to an unacceptable degree.
text prompt + low-quality image + previously upgraded elements used as references in the text prompt = higher-quality image
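The recipe above amounts to bundling the low-quality original together with the already upgraded element references into one request. As a sketch only, with purely hypothetical field names and file names (not any real generator's API):

```python
# Hypothetical request bundle for an element-first upgrading pass.
# All keys and file names are illustrative, not a real API.
upgrade_request = {
    "prompt": (
        "Recreate the original image at high resolution. "
        "Use the reference images for the face and the emblem exactly; "
        "do not invent new details for them."
    ),
    "base_image": "low_quality_original.png",
    "references": {
        "face": "face_upgraded.png",      # element upgraded in an earlier pass
        "emblem": "emblem_upgraded.png",  # element upgraded in an earlier pass
    },
}
```

The references pin down the elements most likely to be hallucinated, so the generator only has to invent the less critical remainder of the image.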
=====ChatGPT=====
=====Gemini=====
=====Paint (Cocreator)=====
With '''Windows 11 and a minimum of 40 TOPS''' you can use '''Microsoft Paint with its Cocreator module'''. Cocreator is sometimes also named Image Creator. (TOPS, tera operations per second, usually refers to INT8 operations on AI accelerator hardware, i.e. NPUs.) Windows PCs carrying the '''Copilot+''' branding can safely be assumed to have this feature.
* You write a prompt, optionally select a style, and then draw a draft that is updated almost in real time in a secondary panel.
=====Photoshop (Adobe Firefly)=====
In Photoshop you can prompt images and immediately start editing them. Or you can choose to expand them first or do partial replacements.
====Advanced workflows====
====Expert workflows====
This would include training your own models. The idea is to give the model a neural representation of objects that is equivalent to taking screenshots from 3D, so that prompted outputs almost never hallucinate details. That way artists can reduce post-editing, as the generated outputs also include their own styles.
Own models further boost rapid prototyping because they reduce the need for a more complex, combined 3D-2D workflow.