Maker-in-residence report
Author: Richard Lundquist
Date: 15.12.2019
What I did
I, Richard Lundquist, have been tinkering with the software Runway ML. Here is what came out of it.
What is Runway ML?
It's machine learning software made for creative people. No coding is required, which makes it really easy to use. Normally, you would need some serious technical skills to work with AI. One of the co-founders has compared working creatively with AI to painting before the invention of the paint tube. The paint tube made it possible for anyone with brushes to express themselves without having to grind and combine complicated ingredients to make their own colors, as was part of the craft back in the day. Runway ML is meant to democratize the technology of machine learning.
What is machine learning?
Machine learning is an application of AI that makes systems automatically learn and improve from "experience" without being explicitly programmed.
ML is used in many ways. It can recognize a human face in a picture (and possibly even who it belongs to) or pull together a perfectly tailored playlist based on what you have listened to before (Spotify's Discover Weekly playlist). These are all algorithms trained on a data set (text, images etc.) that recognize patterns in that data. Normally, a huge data set is required for the ML model to produce accurate output.
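To make that "learning from experience" concrete, here is a minimal sketch using scikit-learn's bundled digits data set. This is my own illustration of the general idea, not anything Runway ML exposes; the tool hides all of this behind its UI.

```python
# A minimal sketch of "learning from experience": scikit-learn is assumed
# to be installed, and its built-in digits data set stands in for a real
# image collection.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()  # 8x8 grayscale images of handwritten digits
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

# The model "learns" patterns from the training images...
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# ...and then makes useful guesses on images it has never seen, without
# anyone writing explicit rules for what a "3" looks like.
print("accuracy on unseen images:", model.score(X_test, y_test))
```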
Runway ML offers many different machine learning models to play with. You can create images based on text input, turn simple doodles into realistic landscapes, do face tracking and so on. I tried out these models (there's a sketch of how to query a running model programmatically after the list):
- AttnGAN
- StyleGAN
- DensePose
- YOLACT
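Each of these runs as a local model that Runway can also expose over the network. As a hedged sketch of what that looks like: the exact host, port, route and input keys are shown in Runway's Network tab and differ per model, so everything below (localhost:8000, /query, the "caption" key) is an assumption.

```python
# A hedged sketch of talking to a running Runway ML model over HTTP.
# The endpoint and the input key ("caption", which a text-to-image model
# like AttnGAN might expect) are assumptions; check Runway's Network tab
# for the real values.
import requests

payload = {"caption": "a cat sitting on a red chair"}
response = requests.post("http://localhost:8000/query", json=payload)
result = response.json()  # e.g. a base64-encoded image for AttnGAN
print(result.keys())
```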
Those names might not say much, so I'll show some samples.
Results
attnGAN
AttnGAN is trained on a (big) bunch of images together with corresponding captions, which means the algorithm can somewhat "understand" what a picture contains. When you type in a combination of words, the program generates an image to match, merging the concepts into abstract, surrealist dream images. It's not super accurate: even if you just type in a single word like "cat", the image only vaguely resembles something furry. But sometimes you get surprising and even beautiful output.
GAN stands for Generative Adversarial Network. The basic components are two neural networks: the generator, which synthesises samples, and the discriminator. The discriminator, trained on the data set, tries to determine whether the images created by the generator are "real" or "fake". When the discriminator can no longer tell real from fake, the generator's samples are ready for output.
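To make the tug-of-war concrete, here is a minimal training-loop sketch in PyTorch. This is my own illustration of the general GAN recipe, not how Runway or any specific model is implemented; the network shapes and hyperparameters are arbitrary.

```python
# A minimal GAN training sketch (PyTorch assumed), illustrating the
# generator/discriminator game described above.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64

generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, 1))

loss_fn = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

for step in range(1000):
    real = torch.randn(32, data_dim)  # stand-in for a batch of real samples
    fake = generator(torch.randn(32, latent_dim))

    # Discriminator: label real samples 1, synthesised samples 0.
    d_loss = loss_fn(discriminator(real), torch.ones(32, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(32, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: try to make the discriminator call the fakes real.
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```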
styleGAN
StyleGAN is famous from the website "This Person Does Not Exist", a collection of realistic computer-generated images of people. A similar example can be made in Runway ML: in the "model setup", you can choose to base the output on a collection of Flickr portrait photos.
There are also examples online of fake Pokémon, celebrities, anime characters and so on.
But you can also generate painted portraits, which is what I did. The model has been trained on portraits from different eras in art history.
I made a video of all the generated portraits.
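One common way to produce a video like this programmatically is to interpolate between points in the generator's latent space, so each frame morphs into the next. A hedged sketch of the idea, where generate(z) is a hypothetical wrapper around whatever model you are querying:

```python
# A hedged sketch of a latent-space walk between two portraits.
# `generate` is a hypothetical wrapper around the model being queried,
# and latent_dim=512 matches StyleGAN's usual latent size but is an
# assumption here. Each interpolated vector becomes one video frame.
import numpy as np

def interpolate_frames(generate, steps=60, latent_dim=512):
    z_start = np.random.randn(latent_dim)
    z_end = np.random.randn(latent_dim)
    frames = []
    for t in np.linspace(0.0, 1.0, steps):
        z = (1 - t) * z_start + t * z_end  # linear blend of the two vectors
        frames.append(generate(z))        # one portrait per step
    return frames
```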
DensePose
DensePose is an ML model by Facebook. The goal is to map, in real time, the human pixels of a 2D image onto a 3D surface-based model of the body. It not only recognises a human body, but also estimates the pose with limbs and joints. I tried some footage from a performance of Swan Lake as input.
YOLACT
The YOLACT model does object recognition, or "instance segmentation" as it is known within the AI community. I tried some different images as input. A picture containing ballet dancers was seen as a picture with birds in it. For every object recognised in an image, there is a probability score. The bird probability is pretty low. But the computer is not completely wrong. It's from a performance of Swan Lake, after all!
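Those probability scores make it easy to filter out the model's weakest guesses. A small sketch of that idea; the detection format below is an assumption for illustration, not YOLACT's actual output, which also includes bounding boxes and pixel masks.

```python
# A hedged sketch of filtering instance-segmentation output by confidence.
# The list-of-dicts format is an assumption; real YOLACT output is richer.
detections = [
    {"label": "bird", "score": 0.31},
    {"label": "person", "score": 0.12},
]

def keep_confident(detections, threshold=0.5):
    """Drop guesses below the probability threshold."""
    return [d for d in detections if d["score"] >= threshold]

print(keep_confident(detections, threshold=0.3))  # only the "bird" survives
```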
What I expected
I expected buggy beta software (it's still in beta) with at most two or three machine learning models to try out. I expected the UI to be messy and the software in general to be overrated. Why? Maybe I'm just pessimistic.
What I experienced
I had read that Runway ML was easy to use, but I didn't realise it would be THAT easy. The UI is easy to navigate, and there are a lot of different ML models to experiment with. One could spend hours altering the input to see how an algorithm interprets it and produces different output. Great fun!
How could this benefit my design practice?
What makes Runway ML really good - the simplicity - is also what makes it somewhat lacking. You can try out stuff and understand how AI could be used for creative purposes, but it doesn’t go much further than that.
I could not find a way to train the models on my own data set, for example for StyleGAN, which means I am limited to the data sets available in Runway ML. This of course limits the creative possibilities. But since it is still a beta, that might be too much to ask for. I'm really looking forward to trying it out in the future, when there will most likely be more possibilities.
DensePose can do 3D pose estimation based on a 2D video as input. I'm not sure how accurate the coordinates are when imported into 3D software, but it could potentially make it possible to do motion capture without a motion capture suit.
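As a hedged sketch of that idea, assuming the pose data could be exported as per-frame JSON keypoints; this format is my assumption, not something I verified in Runway.

```python
# A hedged sketch of turning per-frame pose keypoints into simple mocap
# data. The JSON layout (a list of frames, each mapping joint names to
# coordinates) is an assumption; whatever Runway actually exports would
# need adapting.
import json

def load_joint_track(path, joint="left_wrist"):
    """Return one joint's position per frame as (x, y, z) tuples."""
    with open(path) as f:
        frames = json.load(f)
    return [tuple(frame[joint]) for frame in frames]

# track = load_joint_track("densepose_output.json")
# Each tuple could then be keyframed onto a bone in a 3D package.
```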
Right now, I primarily see this tool as a way for people who are not programmers or AI experts to see what could be made with AI. But I believe that designers and artists will make use of AI more and more in the future, just as in other occupations. It will be interesting to see what the future holds!
Working in the lab
The lab is an inspiring place to hang out in. In fact, way more inspiring than the studio. I would be happy to see more useful software on the computers. At the time of writing, there is no video editing software. What!