Can you tell the AI something like "I want a fairy sitting on a toadstool"? Then once the picture is done, fine-tune with prompts like "Widen the eyes a bit" and "I want the wings to be shaped like hawk wings instead of butterfly wings" and so on until the picture matches your mental image?
That's possible. There was an early image-to-image model called instruct-pix2pix that worked this way, but it wasn't updated and is now way behind the times. There are other ways of getting the generated image to match your mental image that are (currently) easier than prompting modifications in natural language. One example is inpainting, where you select a part of the image with a brush (in your example, the wings) and tell the model to draw hawk wings on it; it then tries to mix the new prompt with the existing image, putting the hawk wings where you want them. Another is tools like gligen, where you draw boxes on a canvas and tell the model to generate specific things in each box, so you get better composition of the image and not just random elements all over the picture.
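To make the inpainting idea a bit more concrete, here is a toy sketch of the masking step only. It is not how any particular tool implements it (real inpainting models do the mixing while generating, not as a simple blend afterwards), and the `composite` function and the tiny pixel grids are made up for illustration. The brush selection becomes a mask, and only the masked pixels are taken from the newly generated content.

```python
def composite(original, generated, mask):
    """Blend two same-sized grayscale images pixel by pixel.

    A mask value of 1.0 takes the generated pixel, 0.0 keeps the
    original, and values in between mix the two (a soft brush edge).
    Hypothetical helper for illustration only.
    """
    return [
        [m * g + (1.0 - m) * o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


# Made-up 2x3 "images": the original picture and freshly generated content.
original = [[10, 10, 10], [10, 10, 10]]
generated = [[200, 200, 200], [200, 200, 200]]

# The "brush": right column fully selected, one soft-edge pixel in the middle.
mask = [[0.0, 0.5, 1.0], [0.0, 0.0, 1.0]]

result = composite(original, generated, mask)
print(result)
```

Everything outside the brushed area comes back untouched, which is why inpainting lets you change the wings without the rest of the fairy shifting around.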
If you can, then art created like that would, for me, cross over from AI-created to artist-created via an AI tool. The ultimate piece of art is the artist's vision, not whatever the AI decided to draw upon for the original picture.
True.
But the question was "how can one tell if an image is AI-generated or not?" and it's often more apparent at very high resolution. I am pretty sure that in a printed book with few details, it is already impossible to tell an AI work (well made and touched up for realism, not just "type a prompt and click generate until you're cool with the result", which is quicker but will keep artifacts) from the work of a human using graphics software. It is already very hard for photorealistic pictures, and the telltale signs are tiny (a slightly deformed reflection in the irises, for example, or errors in the fine detail of a cloth pattern... but that only works to detect AI-generated images passing as photographs, as illustrators generally wouldn't include such a fine level of detail in their drawings anyway). If you stay in the realm of fantasy pictures, how can one say that the pattern on your butterfly fairy wings is not realistic enough? There is more leeway for creativity, and most of the signs at this point can pass as artistic choices.
Arguably, consistency is difficult right now (but that won't stay true for more than a few months), so asking for one character in a series of scenes could help improve detection. But if it was a "contest" and the submitter knew he'd have to pass the scrutiny, he'd probably train the AI on his first creation and tell it to reuse it for the next image. It is becoming very easy to transfer a specific face to improve consistency.
I don't think at this point it's possible to have an overall "test of AI-ness" that works every time. It would depend on the specifics of the art piece (mostly style, adherence to composition and very small details), and thus "AI-art" would need to be narrowed down for a satisfying answer to be determined. After all, if one draws a T-Rex and tells the AI to put it in different backgrounds, it would be AI-art, but identifying it would rely on focusing on the background, not the T-Rex. But I guess you'd say it's not AI art, so it should pass the test of being man-made.
The areas where generative AI makes mistakes are narrowing. It was easier to detect in the long-forgotten past of 2023, as "bad hands" is something that can be apparent in most pictures, but the current difficulties are much less general. Specific interactions between items are hard, especially if the interactions are uncommon and the AI was rarely trained on images representing them. It can draw someone eating with chopsticks, for example, but if you ask for someone in the process of eating a burrito, there is a good chance the AI will draw someone cutting it and not eating it the right way. Same with sushi: it will most probably draw someone eating it with the rice at the bottom instead of turning it so the fish slice is facing the tongue, as one should. So one could commission a specific image of someone eating sushi correctly, and a real human illustrator would be able to do it, while AI would most likely struggle with it. But it's impractical as a test, because one can't draw a whole monster manual's worth of monsters eating sushi the right way.