Google Keeps Shipping: Canvas, Gemini 2.0 Flash Experimental
Dalton Anderson (00:00.874)
Welcome to Venture Step podcast, where we discuss entrepreneurship, industry trends, and the occasional book review. Unfortunately, I had some audio issues last week, and so I will be re-recording this episode. So hopefully when you go to look at the most recent episode and the previous episode, they both have audio. So if you were trying to listen to the episode last week and couldn't, it was because I was having audio issues. And unfortunately, during the week, I was just slammed and I could not
re-record. So here I am recording the same episode that I did last week. I hope that you'll enjoy it this time when it has sound. So what are we discussing in this episode? In this episode, we're pretending like there wasn't a recent release of ChatGPT's image model, and in this world, I don't know this information. It's really just talking about Google and some of the releases that they've had.
And of those releases, they had some really cool ones related to native image generation. Not only can you create images, but the model understands the context of the image and can maintain a consistent character throughout your image creation. So if you provided a character, say, hey, this is what I want to create images with, you can tell a story with it. You can make marketing materials, you could do product designs, you can make models.
And so that is something that you couldn't do before, especially natively in an app. Keep in mind that you can only do this in what they call Google AI Studio, and I'll have some examples of that. They also released another thing called Canvas, and Canvas is supposed to be used for generation of documents and websites and
previews of apps that you create. This isn't a particularly new feature. Anthropic has had their version of Canvas, which they call Artifacts, for quite a while. So don't think it's brand new, but I do think the take of rich text editing within a document, within your chat, is pretty interesting. And I haven't seen that, but I have seen...
Dalton Anderson (02:30.253)
a side panel pop out that previews the app or the website, and then you can toggle it back and forth. So I've seen that where when the artifact is active, it takes up about 75% of your screen and the other 25% is used for chatting and changing the website or changing the app. But that's what we're going to be going over today. And I hope once again that this episode has sound. I tested it a couple of times, so we'll see. Fingers crossed. I haven't had any sound issues
except for last week. So that was 50-plus episodes, 52. Almost a year without any podcast episode issues. Interesting. All right, so the first thing that we'll be discussing is the native image generation. And I have some examples that I'm gonna put on the screen. And if you're listening via audio only, that's fine. I'll narrate a little bit of what's going on and you'll be able to understand. I picked examples that would...
make sense in people's heads, where it's not super complicated. So let me share my screen, and for those watching on video, I'll be looking at the screen that I'm sharing.
Windows.
here.
Dalton Anderson (03:46.572)
Okay, so in this first example, let me turn off the sound, that's annoying. So this first example has a model that is slouched over, not looking very confident, and she's got really bad posture. And this could be from a disability or this could just be because she's not having a good day. I don't know, I don't have the context here, but say that you wanted to change your image and make...
yourself have good posture. There's a side-by-side where one person is doing all the work that it would require to do this in Photoshop. And then there's another version of this to the right of the screen that just says, make the girl stand. What does this say? Make the girl stand in correct standing posture. Okay, weird way to prompt that. Not the best, not the best, but...
And then it just comes back, it says six seconds later and she's standing up straight with good posture. So that's one example. I thought that was interesting.
Then we have another one. So this one I really like, and this emphasizes the consistent subject model that's created. So the first prompt was to create a transparent futuristic vehicle. And it has these big off-road tires and it's got this futuristic look. It's transparent so you can see all the parts in it. It looks really cool.
Like it looks sick. And then the next thing that was prompted was, okay, now create different perspectives of this subject. And so it does a front view, the three-quarter view from the rear, people that know cars will know what I'm talking about. It does the three-quarter view, the front view, and then a side view where it's close up on the top portion of the tire and shows the taillights. It does a really good job.
Dalton Anderson (06:01.505)
And it looks, this one's my favorite, it looks sick. I'm like, wow, this is cool. The only thing that would not be so good is if you're driving a transparent car, like everyone sees you just sitting there awkward in your car. But really cool stuff.
Yeah, I like the tire design and how they went about it. And it creates a studio environment and that's what they asked for in the.
in the prompt. This one's interesting. So this one is of a woman, and there's an original selfie, and she looks like she's in college or, I don't know, she looks like she's in some kind of library, I think. And it's a selfie of her and her arms are out, so you can't even see her hands. So her hands are not visible. The only thing that is seen in the selfie is her smiling and her hair and the top that she's wearing.
So then the prompt is to make her create a heart shape with her hands. And then she does that, with hands that didn't exist in the selfie. And they even give it French tips. So the AI model gave the woman in the image French tips, or they kind of look like French tips. I'm not a nail expert, so maybe some other women that are watching this can chime in, like, are those French tips?
No, those aren't French tips. It's a variant of French tips. I don't know. But they did something with the nails. And so it successfully makes the heart shape, generates hands that don't exist, and it matches the skin tone of her body pretty well. And then the next prompt was make her give a thumbs up, and the thumbs up works pretty well. The hands are, once again, generated
Dalton Anderson (08:04.887)
because there are no hands in the original selfie. So these are
prompted hands. And it's just using the tone of the body to figure out, this is what the inside of your hand should look like given the tone of your skin and the darker coloration of your thumb. It looks good. Like the inside of your thumb, you know how the inside of your thumb's a little darker, like the tips of your fingers are a little darker than the rest of your hand.
It does a great job. Then it was able to successfully execute the thumbs up, which is great.
That was that example. And I've got a couple that are interesting. So this next example emphasizes telling a story. And I see this as a marketing plan. So all of marketing and sales is really telling a story. But a bigger piece is, and I talked about it recently, community building, having in-person events, and also building in public,
and also creating this like compelling marketing campaign where people want to know what happens next. You can create those things now with some prompts. So.
Dalton Anderson (09:28.235)
The original prompt is saying something like, I want a scene of a lonely man on Pluto, imagining a happy life. So it makes the first frame of this lonely man in the middle of nowhere. And then it creates another frame of the same man.
Dalton Anderson (09:50.157)
and this woman.
Dalton Anderson (09:55.438)
A close-up with warm light illuminating the side of the man's face. We see him inside his helmet, eyes closed. I guess he has a helmet on. I'm not sure I follow that one, but now he's holding hands with his partner.
He's eating dinner, and it's the same person. It's five shots. And then it flips back to where he is. He's actually on Pluto, and this is a made-up story.
It keeps going back and forth between the different memories. Compelling, compelling stuff.
Dalton Anderson (10:36.757)
And that was, I think, nine shots. And then the story keeps continuing, but I'm not going to go through the whole thing. It's a long video. This one's cool. And this is also one of the key things that you can do if you can have a
sustained subject in your image generation, you can use it to model products, or you can have your products being used by models, you can have your products being used by people in various different ways. But if you have a whole bunch of different people doing similar things in different ways, it just looks weird. It doesn't look like an actual legitimate
listing. But if you have the same subject modeling the item in different ways or using it in different ways, then it's more compelling. So in this example, it says: create image, make the girl in the photo wear the jewelry in the second photo. And so it's this, what looks to be an expensive piece of jewelry. It's a set. I think this is an emerald. It's a gold emerald necklace that has
pearls, and then the earrings are gold emerald earrings, and then the model generates an
Dalton Anderson (12:11.145)
image of the model wearing the jewelry that was requested, which is great. This next one is great. This is another example of the model. I think this is better than the jewelry one. The jewelry one, I feel like it added the pearls on the earrings, but I can't really tell in the image. So I think it's just a bad image example that they provided.
But here's a legit one, like this one's really good. So the original image is a woman wearing a top and some jeans, and she's got her fingers in a peace sign, she's smiling. And from that, they created one, two, three, four reference images for the model. One is with her smiling, looking to the side, with her hands behind her back. One is with her playing with her hair a little bit and being playful. And then there's another one of her smiling and drinking a cup of coffee. And there's another one from a side angle where she is smirking and has her hands in her pockets. So in this example, I could see that you could take a model and, once again, hopefully have people's permission when you're doing this. Like, I don't suggest you just take random models on the Internet
and start using their likeness for these things. But this is a separate conversation. I'm talking about the capabilities of the model, not whether this is right or not. So anyways.
You could have an image on the internet, you could fabricate a digital model, and you could say, okay, I like the looks of these two people, can you make a made-up person? And then from there you can provide an image of an object that you want them to be wearing or a thing that you want them to do. And then from there you can create these consistent-subject
Dalton Anderson (14:18.433)
variations of your original idea. And this one's a good example where you can take, this is a model wearing a purple and green vest, and there's an image of a black sweater that they want modeled. And it asks, okay, can you put this sweater on the model? And it works. It works perfectly.
It looks crazy legit. The only thing is the image quality is quite bad on this one. I think they screenshotted it and somebody else screenshotted it, and then I'm looking at it on the web and it just doesn't look good. But those are the examples of Gemini 2.0 Flash Experimental, where you can now natively generate images, do image editing, and keep a consistent subject across new prompts, which wasn't possible before, which is great. Like, somehow the model understands the context and the subject that was generated last time and makes sure that it feeds that through and generates the same person every time.
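For anyone who wants to try this outside the AI Studio UI, the same behavior is reachable through the Gemini API. Here's a rough sketch of how a multi-turn image-edit request could be assembled: the model name, config keys, and the `google-genai` SDK call at the bottom are my assumptions about the public API, not something shown in the episode. The key idea is just that prior turns are carried forward in `contents`, which is what lets the model keep the same subject across edits.

```python
import os

# Experimental image-capable model name (assumption, not confirmed in the episode).
MODEL = "gemini-2.0-flash-exp"

def build_request(prompt, history=None):
    """Assemble a request that carries prior turns forward so the model
    can keep the same subject ('consistent character') across edits."""
    contents = list(history or [])  # earlier prompts/images, oldest first
    contents.append(prompt)         # the new edit instruction
    return {
        "model": MODEL,
        "contents": contents,
        # Ask for both text and image parts back (assumed config shape).
        "config": {"response_modalities": ["TEXT", "IMAGE"]},
    }

# Example: re-prompting against an earlier image turn, like the posture demo.
request = build_request(
    "Make the girl stand in correct standing posture",
    history=["<previous image turn>"],
)

if os.environ.get("GEMINI_API_KEY"):
    # Hypothetical call via the google-genai SDK (`pip install google-genai`);
    # only runs when an API key is configured.
    from google import genai
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(**request)
```

Treat this as a sketch of the request shape, not a definitive client: check the current Gemini API docs for the exact model name and config fields before using it.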
I didn't do my own examples simply because it's in Google AI Studio, and I thought that the examples that these creators had were great and figured that I would share my screen and give them credit, rather than generating my own idea, which probably wouldn't have been as good as theirs. Okay, so the next thing that we're going to touch on is Canvas. Canvas is within
Gemini chat, gemini.google.com. And Canvas, as I mentioned, isn't a new idea. Other companies like Anthropic have had Artifacts. I do think it's a great addition to Gemini, and it allows for...
Dalton Anderson (16:42.733)
Quick prototyping of websites or apps, or creating documents and then editing those documents in real time with your adjacent AI buddy. That's how I say it. Okay, so I created a website, a very simple website by the way, nothing crazy, using a
prompt that I created earlier. So in this website, I'm going to be creating a maritime insurance coverage website. I'm going to select Canvas. And also, Canvas formats your code. So I have code in here, and I'm selecting Canvas. See how the code changes?
Oh, it did previously. I don't know why it didn't again. But anyways, so I'm selecting Canvas. So Canvas opens up, and as I was saying, it takes up 75%, or maybe a 70-30, it takes up about two-thirds of your screen, and it creates this website. And I was like, okay, let's see, what are some... can you...
Dalton Anderson (18:09.055)
Add what?
You what?
Dalton Anderson (18:22.413)
Good
Dalton Anderson (18:29.719)
asking for a demo.
Dalton Anderson (18:35.179)
making stuff up.
Dalton Anderson (18:40.353)
I did have issues, when I originally recorded this episode, when you try to undo and then reprompt. So if I press the undo button, then it will take me to the previous version. But the issue with that is, when I was doing this, it would break the prompt. Like, I couldn't prompt anymore. So I think that's just a bug, but keep that in mind. Like, I would only press undo if you're done and you're like, okay, I'm done.
Let me go back and review, and I want to go back to the previous version because the latest one isn't working. But I had that issue where I wanted to do a change and it just didn't do it. Gemini just didn't do the change that I requested, and I wanted to go back, because it jacked up the website, and then I couldn't. And it all happened live, and I just made a whole new website because
I couldn't figure out how to get it to work. It wouldn't understand my prompts anymore. So I think that's just a bug. Like, it's a recent release. I think it's only been out for three weeks, so keep that in mind. I would only use the previous-version button when you're completely finished, and I honestly probably wouldn't even use it at the moment, because I don't think it functions the way it's supposed to.
Dalton Anderson (20:05.911)
Just letting you know. Okay, so in this prompt, I said, can you help improve the website, add whatever you want? I was gonna say I'm presenting this to my boss, but I said, I need this to be good, my boss is asking for a demo on Monday. Great, I was roasting people's prompts earlier in the episode, and that's the prompt that I came up with. That's awesome. Okay.
So it replied back: it wants to do content expansion, and then enhanced styling and responsive design. All right, let's see. So I can see that the website looks visibly better. They added solid header styling. They added one of those lines to separate the content below. And then what happens when I click on this? It brings me, brings me, wow, nice.
Once I click on the headers at the top, so I have headers like cargo insurance, hull insurance, offshore energy, marine liability. When I click on this, it brings me exactly to, so I clicked on offshore energy insurance, and it brings me to that area on the website, which I think is great. So I'm happy with this. I'm sure my imaginary boss would love it. Would love it. He would eat that up.
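That jump-to-section behavior is standard in-page anchor navigation: each header link's href points at the id of a section further down the page, so the browser scrolls there on click. Here's a minimal sketch of the pattern, written in Python so the generated markup is visible; the section names come from the demo, but the actual markup Canvas produced isn't shown in the episode.

```python
# Section titles taken from the maritime insurance demo site.
sections = ["Cargo Insurance", "Hull Insurance", "Offshore Energy", "Marine Liability"]

def slugify(title: str) -> str:
    """Turn a section title into a URL-fragment-friendly id."""
    return title.lower().replace(" ", "-")

# Nav links whose hrefs target the matching section ids below.
nav = "\n".join(f'<a href="#{slugify(s)}">{s}</a>' for s in sections)

# Sections carrying the ids the nav links point at.
body = "\n".join(
    f'<section id="{slugify(s)}"><h2>{s}</h2><p>...</p></section>' for s in sections
)

page = f"<nav>\n{nav}\n</nav>\n{body}"
```

Clicking `<a href="#hull-insurance">` scrolls to the element with `id="hull-insurance"`; no JavaScript is needed for this.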
Okay, so next thing is I want to...
Dalton Anderson (21:43.245)
is up.
Okay, here it is. Perfect.
So for this one I wanna create a document, and it's the same gist, where you click Canvas and Canvas creates a website, an app, or documents for you. So I already created a wonderful outline for the AI to process and turn into a document that I could share. So let's do that.
Dalton Anderson (22:18.101)
Okay, so it breaks down the different types of coverage in maritime insurance, or marine insurance. And so we have transport cargo, we've got hull insurance, offshore energy, marine liability. But one of the things that it doesn't show is what are the things that typically aren't covered. So let's do this: can you please add the things?
Dalton Anderson (22:46.573)
Typically.
Alright.
And then I have two T's for typically.
Awesome. And so let's leave it like that.
Dalton Anderson (23:04.193)
Typical exclusions.
Dalton Anderson (23:11.405)
Okay, so it said it added a common exclusion section.
How can you add the exclusions?
illusion
Dalton Anderson (23:31.917)
I'm messing up. Exclusions to each.
Please. And keep in mind how I'm always polite with my prompts. I'm always very nice to my AI friends.
Dalton Anderson (23:52.312)
Okay, so let's see how this change goes. So it updated the document. Yes, this is exactly what I'm looking for. So each section of the coverages now has common exclusions, exclusions often include. And so if we scroll up and we look at transport cargo insurance:
Exclusions often include inherent vice, spoilage of perishable goods, improper packing, delay, loss of market. If we go to hull insurance, hull insurance is wear and tear, gradual deterioration, latent defects, damage due to lack of maintenance. Okay, so that's Canvas.
And then you can export to Docs, so if you wanted to change the formatting, you could do that. It allows, I think, a more integrated approach to generating documents and creating articles or blogs or, I would say, larger documents. Instead of you copying it back and forth, or having some kind of AI inside the docs, which I don't think works as well. Google Docs with
Gemini integrated, it's not the same thing as being on gemini.google.com. And this Canvas feature allows you to edit the documents and format them in the manner that you want, or request formatting changes to your document and/or new additions to the document in real time, which is something I have not seen before. And I think it works pretty well. It has the same issue with the versioning, though.
The versioning seems to be broken on Canvas, and that's something that they need to work through and figure out, because it doesn't seem to work.
Dalton Anderson (25:45.399)
Say stop sharing.
That was Google's recent release, and once again, sorry about the audio issues that I was having last week. Hopefully that doesn't happen again anytime soon, because it is definitely a pain to re-record an episode. It seems like deja vu. I know the episode in and out. I know exactly what I'm gonna talk about. Don't need an outline, I didn't need to review an outline. I got it memorized. So I hope that you enjoyed this episode and
Appreciate you listening in every week, and wherever you are in this world, have a great day. Good morning, good evening, or good afternoon. Thanks for listening. Hope you listen in next week. Next week we'll be discussing OpenAI's recent release of their model and the things that you can do with that one. Have a great day.