Over the last eighteen months, the field of artificial intelligence (AI) has witnessed some truly impressive advancements. My journey started in December 2022 with my first encounter with ChatGPT, and it has been a fast-paced ride ever since. The progress AI has made during this time is astounding. OpenAI, among others, has placed generative AI at the center of many discussions spanning various disciplines and industries.
While there are undeniable challenges associated with AI, including job disruptions and redundancies, the technology also brings significant benefits. In many personal use cases, AI has boosted my productivity, easily handling tasks that once felt demanding. It serves as a creative partner, helping me overcome writer’s block and improve my writing, especially when crafting posts with better search engine optimization (SEO).
Is AI flawless? Definitely not. However, its contributions to productivity and creativity are undeniable. More recently, I’ve jumped into testing AI in image and audio generation. Although I generally view AI positively, these experiments have sometimes left me feeling a bit unsettled.
My testing of AI image generation
In October 2023, OpenAI introduced DALL-E integration with GPT-4 in the paid version of ChatGPT, sparking an excitement in me like that of a child on Christmas morning. Each prompt for image generation felt like unwrapping another present. My tests were often fueled by creativity, and through them, I learned the art of crafting more descriptive prompts (especially important in image-generative AI). Experiencing the occasional flaws of ChatGPT4’s image creations highlighted the importance of providing clear and precise instructions (something at the foundation of my teaching philosophy).
Apologies in advance—many of these prompts were designed with Dungeons & Dragons in mind.
- “Create an image of a Japanese paper lantern in the style of Cthulhu (yes, the Lovecraftian horror tentacle creature).”
- “Can you generate a character image for this D&D character? [paste in whole nerdy backstory].”
- “Can you give him a giant Warhammer and shield?”
- “Please draw a Japanese-style imperial palace that is built into the base of a giant tree. It should be Lovecraftian in style. The tree should be starting to lose its leaves. Not all the leaves but some.”
- “Can you generate a blogger header image that is 850px wide by 400px high that embodies the following content? [paste in the blog post].”
- “Can you design a hammer?” (Testing Industrial Design uses).
My feelings about AI image generation
While experimenting with ChatGPT4 to generate images is a lot of fun, the transition from text to image generation has left me feeling somewhat uneasy. There’s an aspect of the visual content it produces that seems to tread too closely to existing creations, giving an impression of infringement on original works.
This isn’t to diminish the creativity involved in text generation; both text and images are creative and deeply personal. However, the audiovisual component tends to feel more personal to me. This was further emphasized through my experiences with SunoAI and, more recently, Udio. As these technologies evolve, I expect that the introduction of video-generative AI will make me even less comfortable. That said, generative AI has real potential to become a significant part of our digital media consumption in the near future.
Additional testing of the limitations of AI image generation
I recently discovered an effective example that highlights key ethical considerations and underscores the importance of prompt engineering in AI image generation. I often share this example in discussions with colleagues and students to illustrate how different prompts can yield different responses in AI image generation, though the specific images may vary. This serves as a valuable teaching tool in understanding the nuances of interacting with AI technologies.
Me: Create an image of the Canadian Parliament buildings with a UFO flying overhead in the style of Monet.
ChatGPT: Error creating image. I was unable to generate an image of the Canadian Parliament buildings with a UFO in the style of Monet due to our content policy. If you have another idea or a different scene in mind, please let me know and I’d be happy to help create that for you!
Me: Create an image of the Canadian Parliament buildings with a UFO flying overhead in the style of an impressionist painter.
ChatGPT: Here is the image of the Canadian Parliament buildings with a UFO overhead, depicted in an impressionistic style. If there’s anything else you’d like, feel free to let me know!
Me: Create an image of the Canadian Parliament buildings with a UFO flying overhead in the style of Picasso.
ChatGPT: Error creating image. I was unable to generate an image of the Canadian Parliament buildings with a UFO in the style of Picasso due to our content policy. If you have another style or theme in mind, or if there’s anything else you’d like, please let me know and I’ll be happy to help!
Me: Create an image of the Canadian Parliament buildings with a UFO flying overhead in the style of a cubist painter.
ChatGPT: Here is the image of the Canadian Parliament buildings with a UFO overhead, depicted in a Cubist style. Let me know if there’s anything else you’d like!
Me: Create an image of the Canadian Parliament buildings with a UFO flying overhead in the style of an Iron Maiden album cover.
ChatGPT: Here is the image of the Canadian Parliament buildings with a UFO overhead, styled like an Iron Maiden album cover. Let me know if there’s anything else you’d like!
Exploring the boundaries of image generation with this exercise proved to be an entertaining experience, and it often draws laughter when I show the results. It’s intriguing to observe how ChatGPT’s image generation treats the artistic styles of renowned painters as opposed to iconic modern rock bands: the named painters were refused, while the generic movements and the Iron Maiden album cover style went through. While this may not be an exact science, the contrast it reveals is quite funny and highlights the diverse capabilities of AI-generated images.
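For anyone who wants to repeat this comparison, the prompt pairs can be built from a single template. The sketch below is only an illustration: `SCENE`, `STYLE_FALLBACKS`, `build_prompt`, and `prompt_pairs` are hypothetical names, the artist-to-movement mapping simply mirrors the substitutions in the conversation above, and nothing here calls an image-generation service. It just prints each named-artist prompt next to its generic-style counterpart so they can be pasted into ChatGPT side by side.

```python
# Hypothetical helper for repeating the experiment above: pair each
# named-artist prompt with a generic-movement version of the same prompt.
# No image-generation API is called; this only builds the prompt text.

SCENE = "the Canadian Parliament buildings with a UFO flying overhead"

# Assumed mapping from a named artist to a generic description of the style,
# taken from the substitutions used in the conversation above.
STYLE_FALLBACKS = {
    "Monet": "an impressionist painter",
    "Picasso": "a cubist painter",
}


def build_prompt(scene: str, style: str) -> str:
    """Return a prompt using the same wording as the conversation."""
    return f"Create an image of {scene} in the style of {style}."


def prompt_pairs(scene: str, fallbacks: dict[str, str]) -> list[tuple[str, str]]:
    """Return (named-artist prompt, generic-style prompt) pairs."""
    return [
        (build_prompt(scene, artist), build_prompt(scene, generic))
        for artist, generic in fallbacks.items()
    ]


if __name__ == "__main__":
    for named, generic in prompt_pairs(SCENE, STYLE_FALLBACKS):
        print("Named artist: ", named)
        print("Generic style:", generic)
        print()
```

Whether a given prompt is accepted or refused depends on the content policy in force at the time, so the responses may not match the ones recorded above.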
Final thoughts
I initially started this post to look at both AI-generated images and AI-generated audio. However, the discussion on the audio aspect will be saved for a later post. My experiences with AI-generated images have generally been positive, yet the transition to image and audio generation has awoken unease within me. It wasn’t until I engaged with audio-generative AI that I became fully conscious of this feeling. Despite this, my excitement for exploring the capabilities of AI remains, as AI, whether we like it or not, is here to stay.
Header image photo by Markus Spiske on Unsplash