Exploring and Evaluating Generative Artificial Intelligence Number Three

I decided to make a header image (above) for this little series of posts and have retrofitted it to the two previous posts here and here. I asked the Midjourney app on Discord to depict a silver-skinned android, firstly standing at an easel painting, and then at a computer typing. I am fairly sure that the AI known as Midjourney had no sense of the irony of asking it to anthropomorphise an android doing these activities, because current forms of AI are so far from having the sentience required to appreciate concepts as subtle as irony. Spoiler alert: I approached this evaluative exploration with certain preconceptions about the likely conclusion, although I didn't know for sure how those conclusions might be reached, because I didn't know in detail how AIs work. What I am going to show you today is what I have learned, but I am also going to link you to a very erudite analysis of why we should not be worried about AI taking over the world – in a piece called "Why the AI Doomers Are Wrong", Nafeez Ahmed explains why the direction of travel of AI development simply can't lead to a human-brain-like sentience. I will quote from his article later.

First of all, look at the left-hand side of the header picture, in particular the easel. On close inspection, you can see that the easel is the wrong way round and that the painter/android is standing behind it. Midjourney produces four images by default, in the remarkable time of about 60 seconds, which is almost like magic – indeed, in 1962, the science fiction writer Arthur C. Clarke stated in his book "Profiles of the Future: An Inquiry into the Limits of the Possible" that "Any sufficiently advanced technology is indistinguishable from magic." So despite the apparent magic of creating these images so quickly, the AI has made a fundamental mistake that reveals that it doesn't really understand what an easel is or how it should be used. Nafeez Ahmed is mostly talking about text-generative interactions with AI – ChatGPT and the like – but what he says below is equally applicable to images generated by AI…

The stunning capabilities of the new chatbots and their uncanny ability to engage in human-like conversations has sparked speculation about sentience. Yet there is zero tangible evidence of this beyond a sense of amazement.
What many miss is that the amazing nature of these conversations is not a function of an internal subjective ‘understanding’ by an identifiable entity, but rather the operation of algorithms made up of hundreds of billions of parameters trained on massive datasets of human text to predict words that should follow given strings of text. {…} This is not a breakthrough in intelligence, although it is, certainly, a breakthrough in being able to synthesise human responses to similar questions and thereby mimic patterns in human interaction. This model of AI, therefore, cannot, in itself, generate fundamentally new knowledge or understanding – let alone sentience.

Nafeez Ahmed
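Ahmed's point about algorithms "trained on massive datasets of human text to predict words that should follow given strings of text" can be illustrated with a deliberately tiny sketch. The training sentences below are invented for illustration, and real chatbots use neural networks with hundreds of billions of parameters rather than simple word counts, but the underlying principle – predicting likely continuations from patterns observed in data, with no understanding involved – is the same:

```python
from collections import Counter, defaultdict

# Toy "language model": count which word follows which in the
# training text, then predict the most frequent continuation.
# The training text is made up purely for this example.
training_text = (
    "the android stands at the easel "
    "the android paints at the easel "
    "the android sits at the computer"
)

follow_counts = defaultdict(Counter)
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    follow_counts[current_word][next_word] += 1

def predict_next(word):
    """Return the word most often seen after `word` in training,
    or None if the word was never observed followed by anything."""
    counts = follow_counts.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))       # → android ("the android" occurs most often)
print(predict_next("computer"))  # → None (never seen followed by anything)
```

The model produces plausible-looking continuations without any notion of what an android, an easel or a computer actually is – which is exactly the gap my easel pictures expose.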

Nafeez goes into great detail about how the research is headed in the wrong direction and how unlikely it is that it will ever succeed in equating to human sentience, so if you want to put your mind at rest about a Terminator-style future in which humans are subjugated by machines – nip on over and read the full article. Meanwhile, I am going to show you some more examples of how Midjourney gets things "wrong", how to get the "right" results, and what that says about how useful such a programme can be.

You interact with the Midjourney app by sending it a message (just as if it were really an intelligent entity) containing a prompt, and once you receive your four images, you can choose one to enlarge, if you are satisfied with it, or run variations on one or all of them. Here is the prompt that produced the above set of images: "Silver android painting at an easel by Roy Lichtenstein". The AI places most importance on the object at the beginning of the prompt, then on the activity described, and lastly it attempts to produce the images, in this case, in the style of the Pop Artist Roy Lichtenstein – famous for paintings in the style of close-ups of comic book pictures. These close-ups show the dot screens that were used to shade the illustrations of the comic book, plus the hard black outlining, and Midjourney has picked up well on these style features, particularly in the top-right and bottom-left pictures. The top-left shows a sculpture vaguely resembling an up-cycled easel made of silver, and the bottom-right shows a silver-skinned figure with a dot-screen effect, holding a brush and painting a picture, but with no easel. In the top-right picture, the top of the easel is just showing in the bottom corner and the android "artist" is holding a small blank canvas in her hand and drawing on it. These pictures were as near as I could get, from multiple attempts, to what I wanted, which – as you can see from the header image at the top – was an all-over silver-skinned android. In the images above, the top-right figure has a human face although "her" body is robotic (perhaps cyborg is a better description), whilst the other pictures show a sculpture, a woman and a totally abstract figure. So I decided to change the prompts to "robot" rather than "android", which produced better results.
The reason I had started with "android" was that robots range from automatic hoovers that move around your rooms looking like miniature flying saucers sucking up dirt, to more anthropomorphic devices – which is what I wanted.

"Standing silver robot painting at an easel by Roy Lichtenstein" produced (among others) the above image, in which the robot, possibly standing, is grasping what looks like the top of an easel, but the "painting" does not appear to be on the easel. So I tried "Robot standing painting at an easel" and got this rather cute robot who looks like he is sitting on an invisible chair – "Hey Midjourney, just because you don't show the chair doesn't make it standing!" Notice that with the style reference to Roy Lichtenstein gone, this image is very different. I would like to show you more of the iterations, but Midjourney froze and when I reinstalled it, it had lost the entire session of work – you just can't get the staff…

Another thing that I have discovered in my experiments is that both Midjourney and ChatGPT like to add unspecified embellishments – remember in my first report how ChatGPT found the correct explanation for the phrase "Cold enough to freeze the balls off a brass monkey" but then added a totally made-up explanation? Well, Midjourney does the same thing too. Here is a picture of the railway viaduct at Knaresborough in North Yorkshire, an hour's drive from where I live.

I wanted to see if Midjourney could produce a collage image using fragments of maps. It tried, but didn't really understand the concept – although I am not saying that it can't; at the very least, my prompt wasn't sufficient (one of the oldest sayings amongst computer programmers is "Garbage in – garbage out!"). Here is Midjourney's best effort…

There are some map elements and the whole scene has been chopped up and rearranged but not in a way that makes sense – this one is closer to the real view…

But my first attempt, before I added the collage style, was simply to see how Midjourney would find and represent the viaduct, and it generated the four images below. In the top-left image, Midjourney has added another railway running beneath the viaduct; likewise, in the lower-left it has added a canal; and in the images on the right, Midjourney has transported us into a past sans Knaresborough and a post-apocalyptic future where vegetation is growing over the viaduct.

Enough with all the pretty pictures – what does all this reveal about the way that the AI Midjourney works? Referring to Erik J. Larson's book, The Myth of Artificial Intelligence, Nafeez Ahmed cites a summary of the work by Ben Chugg, lead research analyst at Stanford University (I know – quotes within quotes), as follows:-

“Larson points out that current machine learning models are built on the principle of induction: inferring patterns from specific observations or, more generally, acquiring knowledge from experience. This partially explains the current focus on ‘big-data’ — the more observations, the better the model. We feed an algorithm thousands of labelled pictures of cats, or have it play millions of games of chess, and it correlates which relationships among the input result in the best prediction accuracy. Some models are faster than others, or more sophisticated in their pattern recognition, but at bottom they’re all doing the same thing: statistical generalization from observations.
This inductive approach is useful for building tools for specific tasks on well-defined inputs; analyzing satellite imagery, recommending movies, and detecting cancerous cells, for example. But induction is incapable of the general-purpose knowledge creation exemplified by the human mind.”

https://towardsdatascience.com/the-false-philosophy-plaguing-ai-bdcfd4872c45
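Chugg's description of induction – "statistical generalization from observations", whether from labelled pictures of cats or millions of chess games – can be sketched as a toy classifier. The "features" and labels below are made up for illustration, and real systems use far more data and far more sophisticated models, but the principle is the same: the model is nothing more than its remembered observations, and it can only generalise from the examples it has been fed:

```python
# Toy inductive learner: classify a new input by finding the nearest
# labelled observation. The feature pairs (invented for this example)
# might stand for measurements extracted from images.
labelled_examples = [
    ((0.9, 0.8), "cat"),  # (made-up feature 1, made-up feature 2)
    ((0.8, 0.9), "cat"),
    ((0.2, 0.1), "dog"),
    ((0.1, 0.2), "dog"),
]

def classify(features):
    """1-nearest-neighbour: pure generalisation from observations."""
    def distance(example):
        (x, y), _label = example
        return (x - features[0]) ** 2 + (y - features[1]) ** 2
    _point, label = min(labelled_examples, key=distance)
    return label

print(classify((0.85, 0.75)))  # → cat (nearest remembered examples are cats)
print(classify((0.15, 0.15)))  # → dog
```

Notice that the classifier can never produce an answer outside the labels it has seen – it has no mechanism for forming a genuinely new hypothesis, which is exactly the limitation of induction that Larson identifies.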

Nafeez goes on:-

Current AI has become proficient at both deductive and inductive inference, with the latter becoming a primary focus.
Larson points out that human intelligence is based on a far more creative approach to generating knowledge called ‘abduction’. Abductive inference allows us to creatively select and test hypotheses, quickly eliminate the ones which are proven wrong, and create new ones as we go along before reaching a reliable conclusion. “We guess, out of a background of effectively infinite possibilities, which hypotheses seem likely or plausible,” writes Larson in The Myth of Artificial Intelligence. {…}
And here is Larson’s killer diagnosis: We don’t have a good theory of how abductive inference works in the human mind, and we have no idea how to recreate abductive inference for AI: “We are unlikely to get innovation if we choose to ignore a core mystery rather than face it up,” he writes with reference to the mystery of human intelligence.
Before we can generate genuine artificial intelligence that approaches human capabilities, we need a philosophical and scientific revolution that explains abductive inference. “As long as we keep relying on induction, AI programs will be forever prediction machines hopelessly limited by what data they are fed”, explains Chugg.

https://www.bylinesupplement.com/p/why-the-ai-doomers-are-wrong?utm_source=substack&utm_medium=email

To relate this back to my experiments with Midjourney: the AI could identify what an easel looked like and include it in an image, but it didn't really know what it was or how it was used. Easels are probably present in thousands of pictures of artists' studios, as well as adverts, but I bet there isn't a "Painting 101" that says "First you will need an easel, and this is how you use it". When a human artist goes into a studio and sees canvases fixed onto easels, even if he has never seen one before, he is there to paint canvases, and it is obvious what they are and what they are for. It might be obvious to a human being, with our inferential, deductive and abductive capabilities, but an AI might identify an easel and yet, without finding a clear description of its usage, never fully fathom how to use it…

As for the tendency to add extraneous details: the algorithms that govern generative AIs are designed to mimic human conversational style, so when one has found a relevant answer to the requested information or task, it looks to extend the conversation with what it has learned might plausibly follow – it doesn't know whether that is true or not, because that is way above its pay grade (a metaphor which it probably wouldn't understand either). This phenomenon of AIs making things up is called hallucination – a very anthropocentric term…

I will make one more report on my attempts to get exactly what I wanted from Midjourney and how I found a compromise to be able to work with the results…

3 thoughts on “Exploring and Evaluating Generative Artificial Intelligence Number Three”

  • May 16, 2023 at 7:42 pm

    Discord crash: Try going to https://www.midjourney.com/app/ and login using your Discord username and password. This is Midjourney’s api server, and there you should be able to find previously created work – (select All) to view all artwork you’ve created.

    As for a collage, have you tried the Blend prompt and upload 3 or 4 different images for the Bot to work with? I actually like your first collage a lot. It’s very interesting.

    • May 16, 2023 at 9:59 pm

      Yes! You beauty! It’s all there!
      Thanks from
      Your humble Grasshopper!

