2/6: Exploring the mechanics of AI
Let’s go beyond tech jargon and dig into the basics of AI in simpler terms. Understanding what is happening under the hood empowers designers to make informed decisions that impact the overall experience.
Browse the full six-part series written by Dilip Jagadeesh and Kristina Rakestraw
Do you recall a recent meeting where you were bombarded with technical jargon and felt confused and unable to advocate for the best possible product experience?
As designers, it’s critical that we have a solid grasp of the technology we’re working with — in the same way we understand the business dynamics, customer needs and market competition. This is especially important when working with Generative AI technologies to shape experiences. By gaining a foundational understanding of this technology, we can push the boundaries of product experiences.
In this article we’ll cover the technology behind Generative AI and explore its limits.
Technology behind Generative AI
AI used to be leveraged mainly to understand, recommend or predict information; now it is used for much more. Generative AI is a type of machine learning that uses neural networks to analyze content and then create content. It acts like a creative mind within a computer: it learns from vast amounts of data and uses that knowledge to create something that didn’t exist before, with the potential to generate content that is indistinguishable from what a human can make. There are many creative applications for a system that “learns to generate more objects that look like the data it was trained on” (Source), such as generating new music, producing artwork and making decision recommendations based on complex analysis of large datasets.
Generative models are remarkably efficient at making predictions on structured data, but less so on unstructured data. This is why they are never a full substitute for human knowledge and interaction; many tasks will still require a human, for reasons we’ll go into later.
Additional reading:
Google generative AI, MIT: Explained: Generative AI, What is generative AI? IBM, All Things Generative AI
AI models
AI models are an essential ingredient for designers to understand because they power the effectiveness of what gets generated. While designers don’t explicitly have a hand in crafting them, knowing how they work is important because they shape the experience. Let’s clear up a common confusion between artificial intelligence, machine learning and deep learning. Artificial intelligence is about constructing machines that simulate human knowledge and behavior; machine learning is one way of achieving it, by giving machines the ability to learn on their own without explicit programming; and deep learning is machine learning that uses many-layered neural networks. The value of this emerging class of models is that it keeps getting more efficient at real-world applications and can solve tasks that would otherwise be impossible.
Additional reading:
The Ultimate Guide to Understanding and Using AI Models (2024)
Machine learning: the foundation
Machine learning is the backbone of Generative AI — it is a way for computers to learn and decide on their own by examining lots of data and finding patterns.
Generative AI uses three learning approaches:
- Supervised learning: learns by looking at examples that have correct answers, like being in a classroom with a teacher’s help
- Unsupervised learning: learns by finding patterns in data all by itself, without any specific answers given, like learning from home without any help
- Semi-supervised learning: learns from a mix of examples, some with correct answers and some without. It’s like doing an assignment where some questions come with answers and others don’t, so you need to figure those out on your own. The teacher might review it afterward so you can learn from your mistakes.
Additional reading: Introduction to machine learning
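To make the contrast concrete, here is a toy sketch in plain Python with invented fruit-size data (not any real library or dataset): the supervised part copies the label of the nearest labeled example, like learning from a teacher’s answer key, while the unsupervised part finds two groups in unlabeled numbers entirely on its own.

```python
# Supervised: labeled examples (size, label) act like a teacher.
labeled = [(1.0, "grape"), (1.2, "grape"), (7.0, "apple"), (8.1, "apple")]

def classify(size):
    """Predict a label by copying the nearest labeled example (1-NN)."""
    nearest = min(labeled, key=lambda ex: abs(ex[0] - size))
    return nearest[1]

# Unsupervised: no labels given; the algorithm discovers groups itself.
def two_clusters(values, iters=10):
    """A minimal 1-D k-means with k=2: returns the two cluster centers."""
    a, b = min(values), max(values)          # initial guesses
    for _ in range(iters):
        group_a = [v for v in values if abs(v - a) <= abs(v - b)]
        group_b = [v for v in values if abs(v - a) > abs(v - b)]
        a = sum(group_a) / len(group_a)      # move each center to the
        b = sum(group_b) / len(group_b)      # average of its group
    return a, b

sizes = [1.0, 1.2, 0.9, 7.0, 8.1, 7.6]
centers = two_clusters(sizes)                # finds "small" and "large"
```

Semi-supervised learning would sit between the two: use the handful of labeled points to name the clusters the unsupervised step discovered.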
Neural networks: the team of experts
Neural networks in AI are like a team of experts: each member specializes in one part of the problem, and together they can understand and generate language comprehensively. Technically, they are computer systems designed to work like brains, leveraging many small processing units that pass information along to figure out complex problems, much like the neurons in our own brains.
Additional reading: MIT: Neural networks explained, Stanford NLP group
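As a rough illustration only, here is a miniature network in plain Python with hand-picked weights (a real network would learn them): two hidden “experts” each weigh the inputs differently, and an output neuron combines their opinions into one score between 0 and 1.

```python
import math

def sigmoid(x):
    """Squash any number into the 0..1 range, like a neuron 'firing'."""
    return 1 / (1 + math.exp(-x))

def forward(x1, x2):
    # hidden layer: each 'expert' neuron weighs the inputs differently
    h1 = sigmoid(0.8 * x1 - 0.2 * x2)
    h2 = sigmoid(-0.5 * x1 + 0.9 * x2)
    # output neuron combines the two opinions into a final score
    return sigmoid(1.5 * h1 + 1.5 * h2 - 1.5)

score = forward(1.0, 0.0)   # one example input passing through the team
```

Training would adjust those weight numbers until the scores match known answers; the flow of information stays the same.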
Layers
Let’s break down the five layers that make up Generative AI and the role each plays.
Dataset — think of datasets as the ingredients in a recipe.
You need a large dataset. This dataset is like a training ground where the AI learns, so the better and more varied the data, the better the outcome. An example would be gathering a large collection of ingredients in order to generate a specific recipe.
Pre-process — the quality and variety of the ingredients determine the potential creation the chef can produce.
Before cooking, ingredients need to be cleaned, measured or chopped.
The data needs to be preprocessed to make it easier to learn from. Examples of pre-processing include removing noise or irrelevant data, grouping related data, making data more uniform, simplifying data without losing meaning, balancing the data and removing outliers. A simple example is resizing all images of birds so they are the same size.
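A minimal sketch of two of these steps in plain Python, using made-up numbers: first dropping an outlier (a data-entry error), then rescaling what’s left so every value sits between 0 and 1.

```python
def drop_outliers(values, k=1.5):
    """Remove points more than k standard deviations from the mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    return [v for v in values if abs(v - mean) <= k * std]

def minmax_scale(values):
    """Rescale values to the 0..1 range so features are comparable."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

raw = [10, 12, 11, 13, 250]      # 250 is a data-entry error
clean = drop_outliers(raw)       # the outlier is removed
scaled = minmax_scale(clean)     # everything now sits between 0 and 1
```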
Model — Different chefs have different styles based on their individual training and taste.
Different models have different approaches. While there are many models for training data, below are the most common when it comes to generative AI. The purpose of a model is to define how information flows through the system.
Generative Adversarial Networks (GANs) — like experimental chefs learning through trial and error.
Generative Adversarial Networks are made up of two neural networks — a generator and a discriminator. The generator crafts new content based on input, while the discriminator evaluates and provides feedback on the quality of the output as it continues to learn. Example applications include image, video and text generation or music composition.
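Here is a heavily simplified cartoon of that adversarial push-and-pull in plain Python; it is not a real GAN. The “generator” has a single parameter (where it places its samples), the “discriminator” scores how close a sample sits to the center of some real data, and the generator repeatedly nudges its parameter to raise that score until its output looks real.

```python
import random

random.seed(0)
real_data = [random.gauss(5.0, 0.5) for _ in range(200)]
real_mean = sum(real_data) / len(real_data)

def discriminator(x):
    """Score in (0, 1]: closer to 1 means 'looks more like real data'."""
    return 1.0 / (1.0 + (x - real_mean) ** 2)

g = 0.0                               # generator's single parameter
for _ in range(500):
    eps = 0.01                        # finite-difference gradient estimate
    grad = (discriminator(g + eps) - discriminator(g)) / eps
    g += 0.5 * grad                   # adjust to fool the discriminator more
```

After training, the generator’s samples land where the real data lives; in a real GAN both networks have millions of parameters and the discriminator is learning at the same time.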
Autoencoders — like expert chefs recreating classic dishes.
Autoencoders are made up of two components — an encoder and a decoder. The encoder takes input and compresses it to the simplest form, while the decoder takes the compressed representation of data and reconstructs it back into its original input form. Example applications include efficient data storage and highly realistic image/video generation.
Additional reading: Introduction to Autoencoders [Theory and Implementation]
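To make the compress-then-reconstruct round trip tangible, here is a hand-crafted toy in plain Python. A real autoencoder would learn these mappings from data; here the hidden pattern (each point is roughly (t, 2t)) is baked in by hand so the idea is visible.

```python
def encoder(point):
    """Compress (x, y) into a single code, exploiting the y ≈ 2x pattern."""
    x, y = point
    return (x + y / 2) / 2          # best single-number summary

def decoder(code):
    """Reconstruct a 2-D point from the compressed code."""
    return (code, 2 * code)

original = (3.0, 6.1)               # a slightly noisy sample
code = encoder(original)            # one number instead of two
rebuilt = decoder(code)             # close to the original
error = sum((a - b) ** 2 for a, b in zip(original, rebuilt)) ** 0.5
```

The code is half the size of the input yet the reconstruction error is tiny, which is exactly why autoencoders are useful for efficient storage.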
Variational Autoencoders (VAEs) — like chefs adding their unique twists
Variational Autoencoders (VAEs) add a probabilistic twist: the encoder maps each input to a distribution over compressed codes rather than to a single point. Sampling from that distribution lets the model generate new variations of the data and express how uncertain it is about each result.
Additional reading: Variational Autoencoders (VAEs) for Dummies
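A cartoon of that twist in plain Python, with illustrative (not learned) numbers: the encoder now returns a mean and a spread instead of one fixed code, and decoding random draws from that distribution yields fresh variations rather than exact copies.

```python
import random

random.seed(42)

def encode(point):
    """Map a 2-D point to a distribution over codes: (mean, std)."""
    x, y = point
    return (x + y / 2) / 2, 0.1     # a small spread around the code

def decode(code):
    return (code, 2 * code)

mu, sigma = encode((3.0, 6.0))
variations = [decode(random.gauss(mu, sigma)) for _ in range(5)]
```

Every decoded point is a plausible neighbor of the original input, which is the “unique twist” a plain autoencoder cannot produce.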
Transformers — like master chefs, instinctively handling complex recipes with lots of ingredients.
Transformers can work with large amounts of text and recognize how the words in a sentence relate to each other — how cool is that? Their ability to grasp context, and not just the individual parts, yields more accurate results. Applications include natural language processing tasks such as translation and text summarization. More recently, they have been used for tasks like image and music generation, for example within DALL·E.
Additional reading: Nvidia What is a transformer model
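The mechanism that lets transformers relate words to each other is called self-attention. Here is a minimal dot-product version in plain Python with toy word vectors (real models use learned, much larger embeddings): each word scores its similarity to every other word, turns those scores into weights, and takes a weighted blend, so its new representation depends on its context.

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    out = []
    for q in vectors:                          # each word acts as a "query"
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in vectors]
        weights = softmax(scores)              # how much to attend to each word
        blended = [sum(w * v[i] for w, v in zip(weights, vectors))
                   for i in range(len(q))]     # context-aware blend
        out.append(blended)
    return out

# three toy 2-D word embeddings, e.g. standing in for "the cat sat"
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = self_attention(words)
```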
Train — training an AI model is akin to teaching a chef new skills
Part of what makes a chef successful is learning from successes and mistakes and refining one’s craft.
Training a model is an iterative process. In a GAN, for example, the generator conjures new content based on random inputs, and the discriminator discerns between real and synthetic content. The generator refines its output based on the discriminator’s feedback, gradually achieving the ability to generate content indistinguishable from real data.
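That refinement loop can be sketched in miniature: a one-parameter “model” repeatedly compares its output against feedback and adjusts, so its error shrinks with every pass. Real training does the same thing across millions of parameters; the target value here is purely illustrative.

```python
target = 4.0                 # stands in for "what real content looks like"
w = 0.0                      # the model's lone parameter
losses = []                  # how wrong the model is at each step
for step in range(20):
    error = w - target
    losses.append(error ** 2)        # squared error: the current "wrongness"
    w -= 0.2 * 2 * error             # gradient step: adjust from feedback
```

Each pass cuts the error by the same fraction, which is why the loss curve in real training reports falls steeply at first and then flattens.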
Generate — Crafting a new delicious meal
A well trained chef can quickly improvise to create new meals with a few simple inputs.
Once the AI model has gone through sufficient training, it is capable of generating fresh material based on input. For example, a person can enter a simple prompt that includes a single ingredient and a cuisine preference and ask it to generate an entire meal. Or they could share a single recipe and ask for an entire recipe book.
Limits of Generative AI
As fascinating as Generative AI is and as intelligent as it seems, it’s important to recognize its limits. While Generative AI appears human-like due to its imitation of human behavior, it will never actually be human. AI operates in a closed system where rules are predefined and inputs are offered, while humans operate in an open-ended system where rules are ill-defined and inputs are messy. The real power lies at the intersection between humans and machines. Let’s take a step back and reflect on what humans are good at vs. what machines are good at.
Human decision-making can navigate more open-ended scenarios, while machines can process and analyze significant amounts of data to generate insights within a closed scenario with predefined factors. If external factors change along the way, the machine often fails to produce meaningful results. Human intervention is necessary to adjust programmed assumptions and define the new rules before the machine can learn, evolve and generate value.
Additional reading: MIT: Explained: Generative AI, AI Should Augment Human Intelligence, Not Replace It.
Ethical considerations of Generative AI
As designers working with AI technologies, we believe it is critical to reflect on ethical and social implications. This includes ensuring that prompt interfaces are inclusive, accessible and do not reinforce or perpetuate biases or stereotypes.
By having a foundational understanding of Generative AI and being mindful of its limitations, designers can more effectively craft engaging, intuitive and ethical experiences in collaboration with developers, data scientists and other stakeholders.
Closing thoughts
We believe that rather than focusing on what machines will take away from humans — we should focus our attention on combining the unique skills of humans and machines to unlock the true power of human-computer interaction and achieve what neither could achieve on its own.
—