Is AI-generated artwork getting good enough to put artists out of the job?



Beatrice Milasan

A study of the development of GANs over the years

AI-created art of AI creating art — image credit

You’ve probably seen it on the news. Maybe even heard it discussed at the latest G7 summit. AI (artificial intelligence) is becoming powerful.

According to the World Economic Forum’s 2020 “Future of Jobs Report”, AI is expected to displace 85 million jobs worldwide by 2025, and a 2013 Oxford University study estimated that over the following two decades automation could put 47% of jobs in the United States at risk. Just think about how fast AI has progressed since 2013; it could happen even sooner than we think.

AI is growing fast and taking down jobs like a line of falling dominoes. In fact, that’s a fitting analogy: progress in AI seems to cause a ripple effect, where one discovery leads to a new algorithm being created and trained, which in turn enables more discoveries. It just keeps going.

But now, an opposing viewpoint:

Will AI help us in any way? Apart from the obvious simplification of mundane tasks, another World Economic Forum source states that 95 million new jobs will be created by AI by 2025.

The job race: AI vs. humans — image credit

The use of artificial intelligence has become a controversial topic lately, and for the most part we don’t even really know what AI is. If you went up to the average American and asked them what exactly artificial intelligence is, I’m not sure how many would be able to define it properly. (For the record, this article does an amazing job.)

Now, back to our original question: is AI-generated artwork getting good enough to put artists out of the job? That was not just clickbait; it is a question we will answer in this article. Our first order of business is to find out how exactly art can be created through the use of AI. 🤖

For a while now, the quality of AI-generated images has been increasing, mainly thanks to something called a GAN (generative adversarial network).

You’ve probably never heard the term ‘generative adversarial network’ before, so let’s break it down.

A GAN is a machine learning (ML) model in which two neural networks compete with each other to create new data instances that resemble their training data.

(If you’re unfamiliar: neural networks are the models behind deep learning. They use layers of nodes loosely mimicking the structure of the human brain, creating a system where computers can learn from past errors, just like humans do.)

GANs are structured as two neural networks actively trying to outsmart each other. One is referred to as the generator and the other as the discriminator. The generator’s goal is to artificially generate results that can be mistaken for real data. The discriminator is tasked with telling real data apart from fake.

What a GAN looks like laid out (image credits to Google’s ML hub)

Then, if training goes well, the generator eventually becomes good enough to fool the discriminator.
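To make that loop concrete, here’s a minimal sketch of the generator-versus-discriminator game in PyTorch. Everything here is an illustrative assumption: the toy two-dimensional ‘real’ data, the tiny network sizes, and the hyperparameters are invented for the demo, not taken from any production GAN.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 8, 2  # size of the random noise and of each data sample

# Generator: turns random noise into a fake data point
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
# Discriminator: outputs the probability that its input is real
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, data_dim) * 0.5 + 2.0   # toy "real" distribution
    fake = G(torch.randn(64, latent_dim))          # the generator's forgeries

    # 1) Train the discriminator to label real samples 1 and fakes 0
    opt_D.zero_grad()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    d_loss.backward()
    opt_D.step()

    # 2) Train the generator to make the discriminator say "real"
    opt_G.zero_grad()
    g_loss = bce(D(fake), torch.ones(64, 1))
    g_loss.backward()
    opt_G.step()
```

If the adversarial tug-of-war converges, the generator’s samples drift toward the real distribution until the discriminator can no longer tell the two apart.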

Though this next section isn’t directly linked to adversarial networks, I think it’s a necessary bit of background for a fuller picture of the world of AI.

The way AI learns what it knows is by being fed training data. Here are two commonly used ways to train AI on a dataset:

— Supervised learning

Supervised learning algorithms are trained on labeled datasets. Data scientists provide the model with both input data and the expected output, then give it feedback so that, through practice and repetition, it develops into a predictive model. A good example is how spam detection algorithms work: they are fed many junk emails, and through that repetitive process they learn the characteristics of a junk email and are soon able to pick one out in the wild, on their own. This process is known as classification.
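As a toy illustration of that spam example, here’s a hedged sketch using scikit-learn; the four emails and their ‘spam’/‘ham’ labels are made up for the demo:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Labeled training data: the labels are what make this *supervised*
emails = ["win a free prize now", "claim your free money",
          "meeting at 3pm today", "see you at lunch tomorrow"]
labels = ["spam", "spam", "ham", "ham"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)   # turn each email into word counts

model = MultinomialNB()
model.fit(X, labels)                   # learn from the labeled examples

# Classify a new email "in the wild"
print(model.predict(vectorizer.transform(["claim your free prize"])))
```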

— Unsupervised learning

This learning model trains algorithms on unlabeled datasets, and it is the method used to train GANs. In unsupervised learning, a model must attempt to group the given data based solely on the inputs themselves, refining its groupings as it iterates. Think of it as a trial-and-error approach at the start. An easy way to remember unsupervised learning is that it works by organizing unlabeled data into groups, or clusters; this path of finding structure among data points is commonly known as clustering.

A chart to help differentiate classification vs. clustering — image credit
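And to round out the comparison, here’s a matching clustering sketch with scikit-learn. Unlike the spam example above, the model is never shown any labels; the two blobs of random 2-D points are invented for the demo:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two unlabeled "blobs" of 2-D points the model must organize on its own
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 0.5, size=(50, 2)),
                  rng.normal(5, 0.5, size=(50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10).fit(data)
print(kmeans.labels_[:5], kmeans.labels_[-5:])  # cluster IDs it invented itself
```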

Though a relatively recent invention, GANs have progressed quite quickly over the years.

These networks were first introduced in June 2014, in a paper written by Ian Goodfellow and fellow researchers from the University of Montreal, including Yoshua Bengio, a computer scientist recognized worldwide for his work in deep learning.

Not long after, Yann LeCun, the director of AI research at Facebook, called adversarial training “the most interesting idea in the last ten years in machine learning.”

Though GANs were relatively new at the time, many AI researchers and scientists quickly began to look into them, seeing a road of possibility ahead!

This raises the question: why exactly were generative adversarial networks so exciting to the AI community at the time?

Well, there are plenty of reasons why AI fans quickly became obsessed with these networks. A huge one is that they could generate such convincing results. They also sparked a revolution in the deep learning field, one that later produced major breakthroughs in AI development and in computer science as a whole.

Another great thing about GANs is that they use unsupervised learning. GANs do not need labeled data to get started, which makes these powerful architectures comparatively easy to work with. It also frees up a lot of time that would otherwise be spent labeling data, as is done with supervised models.

QUICK NOTE:

It’s a common myth that GANs can only be used to create images.

Their function is to create new data instances that resemble the training data, and data is not necessarily an image: it could also be video, or even sound. They are algorithms built to create something new, and their output is not limited to images.

It’s safe to say that GANs are a blessing to the world of AI and have made our lives much easier.

And image generation is far from the only upside. There are many examples of how we can use AI for good and how it can simplify tasks on a day-to-day basis: helping us put together the perfect presentation, or optimizing our schedules so we can get the most work done in a day without burning out. AI is capable of so many things that could revolutionize life as we know it; all we need to do is figure out where to apply it.

Ever wanted a behind-the-scenes look into how exactly AI generates images and what that training process looks like? Well, you’re in the right place.

In order to properly grasp this topic, we’ll start off with a case study of how something called DALL-E 2 was built.

But first a little background:

DALL-E 2 was created by OpenAI (yes, the ‘ChatGPT company’) and launched in April 2022. Notice the name DALL-E 2? Yes, there is a version that came before it, launched in January 2021; however, version one was not nearly as advanced.

DALL-E 2 is currently one of the best image generators available and you can try it out yourself here for free. All you need to do is type in a prompt and DALL-E 2 will whip up an image for you straight away!

For example, with the prompt ‘a sunlit indoor lounge area with pink chairs and a pool with clear water in the style of digital art,’ this is what I got:

An image generated by DALL-E 2 for the prompt I provided

Now, as for the process of building something like DALL-E or DALL-E 2, here are some of the big steps you would need to take:

1. Data collection

First, you would need to gather a large dataset of images to train on. This dataset would contain pairs of images and their descriptions. It would probably look something like this at the start, though with a detailed caption for each image:

An example of images a model like DALL-E might be fed — this one is of snails and harps! (source)

2. Preprocessing

A raw dataset, however, isn’t any good for training models straight away. The next crucial step is to make all the necessary modifications, which likely include enhancing image resolution, resizing images, cropping images, et cetera.
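To make step 2 a little more concrete, here’s a rough sketch of per-image preprocessing using the Pillow library. The file path, caption, and 256×256 target size are hypothetical choices for illustration, not DALL-E’s actual pipeline:

```python
from PIL import Image

TARGET_SIZE = (256, 256)  # hypothetical resolution the model expects

def preprocess(path: str, caption: str):
    """Load one (image, caption) pair and normalize the image's shape."""
    img = Image.open(path).convert("RGB")

    # Center-crop to a square so resizing doesn't distort the image
    side = min(img.size)
    left, top = (img.width - side) // 2, (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))

    img = img.resize(TARGET_SIZE)
    return img, caption.strip().lower()

# Hypothetical usage on one dataset entry (assumes this file exists)
img, caption = preprocess("images/0001.jpg", "A snail playing a harp")
```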

3. Training

This is where the model actually learns to connect text with images. (An honest caveat: although this article has focused on GANs, and a GAN training loop is pictured below, DALL-E itself is transformer-based and DALL-E 2 pairs OpenAI’s CLIP model with a diffusion model rather than a GAN. The adversarial setup is still the classic mental model for how a generative model gets trained.)

Training through a generative adversarial network (same model as shown previously) — source

4. Evaluation

Our final step is to evaluate the model’s performance, using a number of different metrics and held-out validation and test datasets, to make sure it meets all necessary requirements.
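And to give a feel for step 4, here’s a small sketch of an evaluation loop in PyTorch. The `model` (mapping caption embeddings to images) and the pixel-level MSE metric are stand-ins I’ve assumed for illustration; real systems use much stronger metrics like FID, CLIP similarity, or human ratings:

```python
import torch

def evaluate(model, val_pairs):
    """Average per-pixel error over a held-out validation set (toy metric)."""
    model.eval()                      # switch off training-only behavior
    total = 0.0
    with torch.no_grad():             # no gradients needed for evaluation
        for caption_embedding, real_image in val_pairs:
            generated = model(caption_embedding)
            total += torch.mean((generated - real_image) ** 2).item()
    return total / len(val_pairs)
```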

Now that we’ve looked into that, a question:

Both DALL-E and DALL-E 2 were built by the same lab for the same job, so why is DALL-E 2 so much better?

Mostly the architecture. DALL-E 2 swapped the original’s transformer-based approach for a CLIP-guided diffusion model and, according to OpenAI, generates more realistic images at four times the resolution of its predecessor, which lets it better tailor results to what each user wants.

Isn’t AI cool? Just look at what we can now do with generative networks: literally produce almost any image we can think of, on command!

It’s safe to say they’re pretty powerful… but what if someone has other plans for how they’re going to use GANs?

It’s no secret that something as powerful as AI can be used for good and for bad. Not to destroy your faith in humanity, but we’ve seen all sorts of cases like these. For example, when ChatGPT first launched, people tricked it into answering questions like ‘how do I hotwire a car?’; for those first few weeks, its guardrails still had gaps that made it easy to fool.

Which brings me to another thought: can GANs be used for evil?

The not-so-amazing answer: yes, they very easily can. A classic example is something called a ‘deepfake’.

Deepfakes are described as “synthetic media that have been digitally manipulated to replace one person’s likeness” or “the manipulation of facial appearance through deep generative methods”.

In short, it’s when you use AI to make it look like someone is saying things they never actually said. Back in 2010 it may not have been much to worry about, but deepfakes are getting better and better at mimicking facial expressions and voices.

This YouTube video does an amazing job of talking about the dangers of deepfakes. Take a look — they’re getting scary good.

A video portraying the danger of deepfakes

And today, in 2023, the technology is stronger and far more accessible than it was just a few years ago. There are over 500,000 deepfake videos on the Internet, and that number will only keep growing. To create one, all you essentially need is a laptop and the right YouTube video… or the wrong one, I guess.

But it’s not all bad news. We can still use AI for good, and just because it may not always be used for its intended purpose, that doesn’t mean it isn’t still helping people around the world. I’m just trying to give you a clear picture of both the good and the bad.

So now that you’ve hopefully learned enough about GANs, we will finally answer that first question I posed at the start of this article: Is AI-generated artwork getting good enough to put artists out of the job?

Well, first of all, ‘artist’ is a general term. If you’re talking about painters and sculptors, then probably not…

But if you mean digital artists, then I’m still not too sure. I know you’re probably sighing right now because you read a whole article only to hear that I can’t predict the future, but I can show you both sides of the story.

On one hand, AI can produce quite high-quality work faster and cheaper than your average digital artist. Say goodbye to those nasty commission prices!

But on the other hand, some people look at a piece of art and pay for the story: the real human emotion and thought that goes into every piece, along with the value of the many years of studying and creating it takes to make something that speaks for itself.

I have no clue which way we, as a society, will go.

For fun, I asked ChatGPT, and here’s what it said:

A short reply given to me by ChatGPT when I asked it about the possibility of AI replacing digital artists

Looks like even the all-knowing ChatGPT is unsure of what the future holds. 🔮

But whatever happens, I don’t think that we should fear the use of AI. It was created to help us, so let’s give it the chance to do so.

Thanks for reading and I hope you enjoyed this article about generative adversarial networks and their development over the years.


