Unlocking the Power of Open-Source AI: Exploring Meta's Llama 3.1 Release
Download MP3Dalton Anderson (00:00)
Welcome to VentureStep podcast where we discuss entrepreneurship, industry trends, and the occasional book review. Have you ever wondered what AI would be like if it wasn't just a tool, but a partner in innovation and something that you could carry around in your pocket, in your glasses, on your watch, or at your home and have your own setup for your AI? Well, you can do that now. First time ever you have a highly capable
foundational model that you could download and run at your house. You only need the compute of multiple computers and a plethora of other things to get it going, but you could do it if you wanted to. But many enterprises are doing so. Llama has 3 ,500 enterprises using their models and Llama recently just surpassed
300 million total downloads.
And that was before they had the 405 billion parameter model, which is insanely good for being open source. I think there was a lot of question on whether or not open source would ever be able to compete with a closed source system. And Meta after using 16 ,000 H100s to train their cluster and almost
3 trillion training tokens at an 8K context and 800 billion tokens at 128K context. It seems like they've done it. Definitely for the most
The range of the results vary, but I won't go through all of the benchmarks. I'll just talk about a very simple diagram they have. Basically they compare Lama 3 .1, 405 million versus Chatchi PT 4 .0 and 4. So 4, which is the previous one, not 4 .0, so 4 Omni is the newer one and then Chatchi PT
is the older Chachi BT4. Chachi BT4 versus Llama 3 .1, $405 billion, won 23 .3 % of the time, tied 52 % of the time. Okay, so it lost around 24%. Llama 3 .1, $405 billion versus Chachi BT4.
That's the newer model that OpenAI is offering, which is the one you need premium for. 19 .1 of the time it won and 51 % of the time it tied. So it lost around 29 % of time.
That's crazy since businesses no longer have to send their data to this closed source model and they can control the weights and the parameters of these AI models, whereas in a closed source model, it's one size fits all.
And it's not
given the opportunity to distill the model, to edit the weights, to do things as you wish, and to run it on your own infrastructure. You can do that now, and I'm super excited about it because I think it will have a positive effect. And one thing I do try to do is I try to just be very positive in my remarks about these releases. There are some things that aren't so good.
and I'll call them out. I've called out Google several times for misleading people with their demos. But for the most part, I try to be positive as these companies are putting themselves out there and they keep pushing each other and the rate of innovation is incredible. Basically every one, maybe every four weeks or something, six weeks, there's some big announcement and you don't really see that kind of stuff.
especially from large companies as they're concerned about their brand recognition and potential mistakes and people utilizing these models to make media stories, to push certain agendas. Because overall, the technology doesn't have any evil use cases, it's the users.
but I'll digress. But before we dive in fully, I'm your host Dalton Anderson. My background is a mix of programming, data science and insurance offline. can feel me, find me running, building my side business, lost in a good book. You can listen to his podcasts and video and audio on YouTube. If audio is more your thing, you can find the podcasts on Spotify, Apple podcasts,
or wherever else you get your podcasts, including YouTube. Okay. So you could have guessed it. We're going to be talking about Meta's Llama 3 .1 and that's going to entail a couple of things. We're just talking about some feature releases. What are my thoughts about Llama, the new one and
Where do I think things are going be going with open source? Like, why do I think open source is great? And maybe some remarks that summarize remarks from Mark, from, from the things I've watched and read from him. He's done a couple of interviews since the release and he has also wrote or not wrote, but he released a, he released a letter to, you know, stakeholders and such.
I read that as well and it was quite interesting. Well basically, these large CEO, like large company CEOs, for these kind of things, these visionary things, they do a good job of honing in on the vision and what they want. And so during these interviews, they almost have a script where they run through the ideas that they have, the messages they wanna release, and that's what they talk about.
And some of the phrases are almost verbatim. And I've heard that Mark say like the same exact phrase like seven times and like the same example like six times. So it's interesting. You hear the same thing from Elon or these other CEOs that you follow during their interviews. If they go on their road shows, they have the same messaging and vision consistently.
So make sure they're heard, I guess. So the power of open source, building with Llama. So how can you go about doing that? Some of the safety and security tools that they built. And then I have some miscellaneous things like the future of AI and Llama, where I can't go over everything today, because there's literally so much. And this is probably the first time where I'm gonna have to break up an episode. I'm not gonna record two separate episodes, but I am gonna have to break these up because it's gonna be too much. So basically,
They released a whole bunch of features. Last week we talked about what I think is gonna be released and the real searching. This week they released an updated model for, updated models for their previous models. So there's 70 and their what, five or seven? No, eight, sorry. There's 70 and there are eight billion parameter models and then there are .5, 405 billion parameter
So those are brand new. So basically they took their 405 billion perimeter model. They did all those training tokens. So they about 2 .75 trillion tokens for their eight, eight K context, 800 billion tokens for their 128 K context window. And if you're not familiar with the context window, that's how, like how much stuff you could put in that text box. If you're on the UI and if you're using API is basically
how many tokens can you put in there basically and that's up, I would say 128 ,000 tokens is like 20 to 30 pages of text considering on what that information is and how it's depicted. But I would say general rule thumb, 128K, like 125K is like 20 to 30 pages of text.
Okay, so let's get right into it. and the other thing that I forgot to mention is, so they released these kind of AI studio demos. They have Segment Anything 2, which recently came out, Seamless Transition. So Segment Anything 2 creates video cutouts and other visuals from a video you upload, or you can use example videos. I uploaded my own video, which is pretty cool.
and I can play around with it. It's going to be a game changer for creators. Seamless transition. can do, sorry, not transition, translation. Translation. I'm not sure I got transition. was kind of just looking over, it's all trans. was like, transition. Seamless translation. You can translate your voice to any language using your own voice as a recording. Translate your, it translates your recording and kind of mimics your voice, which is pretty cool. Animated drawings,
you upload your own drawing. I didn't do my own drawing because I can't draw unfortunately. Maybe I could practice, but I have other interests and so I've never been talented at drawing and I never worked on it so I can't draw. Basically you can upload your own drawing and it'll animate for you, which is really cool I think. And then there's this other thing called Audio Box which is really cool where you can create an AI audio
speech generated story and I asked it to create a sci -fi story and then you can also add in additional sound effects, additional voices and you can even record your own voice and add yourself into the story. And so I added myself, I didn't do that much because I didn't want to like a crazy amount of time on that because I had to like record my voice, I had to upload it. you do it all in the UI but it has to process your voice which is pretty quick, fairly quick.
and then I was having some problems with what I was trying to generate, I wanted to say.
This is an unauthorized launch. Repeat, this is an unauthorized launch. This will be reported to the Galaxy Federation or something like that. But it was giving me an error. I say my context was too long, even though I was at like 113 characters and the max is 125. So it was quite sad, but I used their example prompt, which is like, I have a great story to tell or something. I appear in the beginning. I'm like a narrator. I have a great story to tell.
that goes into the story. So those are really cool. And then they also have an AI agent platform where you can make your own AI agents and share them with your friends and do other things. So I'll be doing that maybe in the next episode. launched, or they didn't launch, but they released their AI research paper for Llama 3 .1, which is 100 pages. I think it's 97, which I have to go through. And I went through it a little bit,
It's just so much information. Just the research paper, the AI platform, these other snippet things that they release for their AI demos.
It's just too much to put in one episode. Like just me explaining it took me like six minutes. All right. So that being said, today we're going to talk about Lama 3 .1, open sourcing AI, building with Lama, safety and security tools that they built. And then we'll talk a little bit about the future of AI and Lama. And that will include some of these creator tools that are built.
which I think are really neat. And I'm not gonna touch on them too long because they're kind of small and you can explore them yourself, but I think they're really cool and valuable to creators long -term once these tools get built in like an enterprise -like manner and less of a demo. But I could see this being a huge feature for creators. And on Facebook and Instagram, there are about 200 million creators or people who consider themselves
creators, I guess I would be a creator. I don't know if I consider myself a creator, I'm more of a talker. But I guess by definition, I am a content creator. And these tools would definitely be useful for someone who is a professional content creator or someone trying to get into the realm of content creation.
with those features, I feel like you're competing with other platforms. And it might not be who could potentially make you famous like TikTok. It might be that who wins the creators with these best AI features, allowing people to easily create content on your platform, you win the content creators. If you win over the content creators, you win the users because people want to interact
the content creators and interact with content and have new refreshing content hit their feeds. And where that is, is where the content is being created. So if you have the features, you've got the users. Okay. Wow. What an intro. 14 minutes. Okay. So Llama 3 .1, 405 billion had an increase
content or context window from I think it was I think it was low like I think it was I think it was low the original content
window was low. Context window, I said content. So much content talking about, I'm creating content. We're talking about other people creating content. We're talking about content, tools. We're getting all over the place, but context window, 128K up from, I think the eight, it was 8K. You couldn't put that much stuff in there. So 128K is huge. It's not the most, I think Google still leads the pack with that with
There, my Google AI is going off now. Everywhere actually, it's like my Google Home, my phone, anyways. It was quite distracting. Google leads that in that regard. They are around 500 ,000 context window and I think they put it up to for their like pro ultra. I think it's at like over a million, so.
They're on another level when it comes to content. my gosh, they're still going on about this content. Context, they're on another level when it comes to context windows. But many people don't need like a million context windows. that's a lot. A million tokens is a lot. Like that's like books. And for the normal user, for many use cases, 128K is plenty. For like,
intense research and testing of AI capabilities, this million, two million is fine. That's great. But for the most part, people don't need that. But it's nice to have. And so what does an increase context window do for the user? Well, it allows Llama for 405 to understand
more complex topics and conversations that you're having. It can remember previous interactions and maintain context. It can do recall, it can recall what you've previously said. It can generate more accurate and relevant responses. And just the overall support for long form text summarization and document analysis.
And so those are the big things that this increase context window provides to you as a user. There's now multi -language support. So the model can now support up to eight languages. I don't know if they listed out the languages on their release. I'm sure they do English though. And
They probably do English, Mandarin, but I don't know the languages, so I'm not gonna say. But I know they do English, I can tell you that for sure. 100%, I know. So they do up to eight languages support, so you can type multiple different languages in your messages if it's a supported language. And it might be able to translate languages it's never even interacted with.
there has been reports of some AI models being able to do that. Chachi Petit did that. and I did that. So there is the ability potentially to do things that the AI model doesn't know that it knows to do. So, and that's happened before multiple times. So I don't think it's out of the realm of possibilities.
This model release of the 405 billion is the first open source frontier model to exist. This hasn't been done before. I mentioned earlier that Meta used 16 ,000 H100s from Nvidia. And to put that in perspective, that's about like $600 million worth of training.
equipment that doesn't count setting it up or getting it all started. But I think that it's about 35 ,000 per one H 100. And so 35 ,000 times 16 ,000. Yeah. It will put you, put you around 600 million, which is a
So they spent basically $600 million on just buying GPUs for their training clusters and architecture support to release this model free to the public, to empower many users and many.
corporations, enterprises, startups, universities, governments.
It's a game changer because many countries have the technical support to utilize these kind of models, but they don't have the infrastructure to put up $600 million on GPUs. Another couple hundred million on CPUs set up the power infrastructure running 700 watts.
I think 700 watts.
I think it's seven, man, I don't remember the watts, but I know it was crazy. think maybe it's 700 ,000 watts. It's a lot. That I don't know. I'll have to look that one up.
Let's see, maybe if I ask Meta, Meta will know. It's in their research paper, but it's quite large.
So I'll see if Menno knows.
Okay, yeah, so it's a lot. it's saying it's 700 kilowatts per day, which is a crazy amount, 700.
Ahem.
Yeah, I'm have AI edit this out. This is a long pause here.
That's gotta be more than
Okay, on insure, I don't know. I don't know.
It doesn't... Man, I wish I didn't even mention
I swear it was 700.
Okay, so it's trained 16 ,000 H100s running 700 watts each. 700 watts each. So that is 700 times 16 ,000, 1 ,120 ,000 watts while they're running. So what is that?
I don't know.
what that is, but I assume that's a lot, because I don't know.
Okay, so we're asking Meta, because Meta knows everything,
my gosh. Okay, so how much is 1 ,120 ,000 watts, which is the equivalent of 1 .12 megawatts and a megawatt is enough power for a thousand homes.
So it's the consumption of a small town, like I guess per day. Small town per day.
Okay. And that would be for the going rate of electricity at 13 cents a kilowatt hour. It would cost around $34 ,944 per day in electricity. So basically spending $50 ,000 a day in electricity. And it took them like four or five, six months or something to train that AI.
So, I mean, they spent like $9 million or something in electricity, just training the AI, training it around. So I try to put these numbers in perspective because these things aren't obtainable. Like the amount of electricity that is being used per day, the upfront costs of getting the GPUs and the CPUs and getting the infrastructure set up to run that amount of electricity to
to the training clusters is immense. even governments can't afford to do these things. Only developed countries will be able to do so.
But once you get the model created, it's not that hard to train the model, you, to do what you want to edit the parameters and in the weights and things like that. Those things aren't very complicated compared to the whole piece of creating a model from scratch. That's what's complicated. Getting the training data meta almost trained three trillion tokens.
You can't get that data. It's not just lying around.
Those things are huge barriers for other people, huge barriers for individuals that are interested, huge barriers for governments, universities, and businesses. And what this does is allows a level of innovation that you normally wouldn't see for a closed source models. And this allows people to use the models at their house.
at their company, wherever they are, they could just download it and they make it really easy. You could download it on Hugging Face. You can download it on Kaggle. You can download it on meta .meta .ai I think forward slash download.
And not only do they lie to download it, they also have ability to use API. So they don't have an API set up because they're not a cloud provider. So they don't have their own API. And you don't have to be a cloud provider to offer an API. That's not what I'm saying, but they're kind of, okay, we're just going to build the best product. We're going to release it and we're going to have partners that support it. So they have partners that support the AI via cloud.
or you can just use a API either or like Huggie face has an API for
Google, AWS, Databricks, they all have offerings to allow you to use the API, not API, but the model on their infrastructure. And they have just some general support, but the two places that have support top to bottom, which would be like real -time inferencing, batch inference, fine -tuning, model evaluation, knowledge base, contextual pre -training, safeguard rails,
synthetic data generation and distillation recipe would be Databricks and NVIDIA. Those are the only ones that have a full suite of offerings. And why that's important is for the first time you can edit these models yourself and you can fully customize it with your own data, with your own training, can distill them how you wish, you can do anything you want. It's Apache 2
That's the licensing, by the way, if you're not familiar.
And so these partners had to build out these features to be able to offer these things. And the ones that really were interested in building out these features to fully utilize this model was Databricks and NVIDIA.
And what are they calling this? They're calling this llama stack API. So it's a standardized inference that allows the developers to, you know, create applications on top of 3 .1. And so it will provide a predefined.
set of tools and functions to allow you to easier integrate into various projects.
I'm getting some potential, some mic feedback, think. But it just allows you to build custom applications on top of Llama or with Llama, either or. And it just allows you to utilize the initial knowledge base.
With that, you can also download, I think they call it the, let me look at the GitHub, within the Llama stack, can download your safety guide rails and Llama Guard in this stack. So Llama Guard, it has this
ability to detect and prevent potential misuse. And so this misuse could be hate speech, harassment, explicit content. And then they have this thing called prompt guard, which tries to prevent people to gain the system. So a lot of times with these closed source models, you can, and other foundation models and other offerings, you can...
force the model to do things you wouldn't normally do. So like if you wanted to know how to make bombs or to do things that you typically wouldn't allow a system to look up on its own, originally it will say, I can't do that, blah, blah, blah, it's against my standards. Well, you can kind of game it into telling you information it normally wouldn't tell you. So you have to word things a certain way and confuse the AI and the AI eventually will.
tell you the things that you wanna know when it's not supposed to tell you. So Prompt Guard tries to prevent and flag and categorize the prompts that you're sending to Llama and if it detects that you're trying
misuse the AI by, you know, confusing it and crafting these
special prompts to jailbreak llama, it will.
prompt you that there's like security threats and potentially somebody's trying to abuse the product. And so it will block you from doing so. So these are important because it gives companies peace of mind, it gives safety. There's some criticisms regarding open source AI with the being that it gives bad actors a chance to use these products
and do malicious and act in a malicious way. And this will help protect against these bad actors. And so Lama Guard and Prompt Guard are already integrated into the model. What I'm talking about is if you want to customize Prompt Guard and Lama Guard yourself for your business or whatever you're trying to do.
you can do so and you can integrate that into your model.
It will prevent harm and people misusing this product, prevents misinformation and malicious content. And it makes sure that llama is used ethnically and responsibly. And what I was saying is that closed source models are really like, if it's open source, it's dangerous. Anyone can do anything with it. And people are going to do bad things, bad things.
And they mainly focus on people attacking the government and doing these things that are normally only, I would say, categorized for sophisticated actors, sophisticated bad actors like governments or large organizations, terrorist organizations that have more resources that are typically backed by another government and they're using the terrorist organization as like a shell company.
those groups of people are gonna be doing bad things no matter what. And the perspective of Mark Zuckerberg and others is if you open source these things, you can potentially protect governments and companies and universities from bad actors because they'll have this sophisticated AI model that they'll be able to launch on their own systems.
to prevent.
issues to prevent attacks. And that's how it's going to be where you have this open source sophisticated model that you can just update on your system and your infrastructure and then wallah you're protected.
And that's, that's Mark's vision where he feels that he doesn't want to be forced on what features he can make because that's some issues that he's had with Apple, where Meta has tried to launch features that will be useful for their users. Apple said, Hey, like this isn't really in your wheelhouse. We're not going to allow you to do though, to do these launches. Meta can't do the launches anymore because they're forced into these closed source models
mobile phones because close, close model one. Meta wants to have the opportunity to make sure that the open source model wins. And Meta is one of the front runners in this technology as they don't want to be reliant on a closed source model from another company. And they want to build a wonderful atmosphere, not atmosphere, ecosystem to allow creators and companies and, and innovators to use their product and interact with their brand.
And then it also allows them to be a front runner in governmental conversations regarding compliance and regulation and making sure that the U .S. is a leader in
revolutionary technology that is going to shape the world in a positive manner and being in control of those conversations definitely gives Meta an edge. And then also to prevent them from having an issue with not having the best and leading technology in this sector.
with them being attacked by sophisticated actors where they don't have the model. And so now they have their own cutting edge model that competes and wins with the best closed source models around right now in all areas. They're 8 billion, they're 405, they're 70 billion. They're all winning across the board pretty much. And their 405 billion did pretty well against OpenAI and their 70 and 8 billion parameter models killed the other competition.
they won like majority of the time. And so it gives them an opportunity to one, innovate their products, protect themselves against sophisticated actors, lead in these governmental conversations regarding regulation and compliance, embrace the open source model and allow people to innovate and help out millions of people because
I'm accounting the people that work for these small businesses and these corporations and these universities and inspire and inspire many people. I mean, they have 300 million downloads already for llama 3 .1 and llama two and just llama in general. And they want to be one of the most used AI assistants in the world, which they're on track to do before the end of the year, according to Mark, recent his recent interview of like three days ago.
with Bloomberg.
because they have this AI assistant in WhatsApp, they have it in Instagram, they have it in Facebook, they have it in Messenger. And so you can constantly interact with this AI wherever you are in any of their ecosystem of apps. It doesn't know, it would be cool if it would have the context of all your conversations that you have with the AI on these different apps, but it doesn't. But I still find it super useful.
And I love it. Honestly, it's great. And I use it all the time on Instagram. And they have those agents, those AI agents that you can talk to. I talked to the cooking guy, I think cooking with Max all the time for different cooking recipes. Those things are super useful where you ask it, you know, what can I cook with these ingredients? Just give it, give it five ingredients and, and go from there. And you know, you can plan a picnic or, or
a charcuterie board or with certain different wines and it's just great, honestly. I love it, I love it, I love every bit of
So with the prompt guard and these other pieces.
And I touched on it a bit with the open source. Open source allows Meta to control the ecosystem. It allows them to become the industry standard. And the example he uses often is the Linux example where a lot of companies were building these closed operating systems. And then Linux came about and became
popular because you could do whatever you want with it. It was easy to use. It wasn't as secure or as fast as the closed models. But since it was easy to use and an open source, the innovation of those models were not models, but of Linux was much faster than the closed source models. And so in that regard, Linux won and that conversation.
And the same could be said as like there's Apple and then there's Windows, Windows one, the computer. Windows isn't necessarily open source, but it's much more open than. And in this card, I would say open ecosystem, not open source. Windows is closed source, but their ecosystem is open where you can basically use any app you can. Do as you wish, whereas Apple is kind of closed where.
You know, if you want to have your messages linked, you got to have so -and -so phone and, and things like that when you are kind of tied into the ecosystem with the walled garden. Windows, not so much. And I'm glad that Windows won. I like Apple though. haven't, I I'm literally recording this on my Mac. So I don't want people to be like, you're, you're such an Apple hater. No, I like Apple quite a
Just there are certain pros and cons of closed source or open source. I would definitely choose open source for sure, as it's the level of innovation that you'll get is probably not obtainable with closed source long -term.
And I keep bringing this up, if Foster's, know, this innovation and collaboration and also in some circumstances.
it puts a higher scrutiny on your work. So when you're releasing your product to say, users in your user group, you can really only get user feedback from your QA or QE, your developers and your users on a close model. Open source model potentially,
you can have feedback on your code, the features, the bugs, security, all that stuff from millions of people, like 300 million people have downloaded llama. So they'll all be able to interact with it. They'll all be able to give suggestions to be able to make their own versions of, of llama or whatever you want to call it. I don't know if they can rename it. It's Apache 2 .0, which means that you can use, you know, I to make correction it originally.
In one of Mark's interviews, he said that he would allow, you know, small companies to use it for free. And if they're larger, you'd want them to pay. But they decided to do Apache 2 .0. Apache 2 .0 basically means that they can do whatever they want. It's open source as much as it gets. You can use it for commercial reasons, education. You can edit the code. You can.
add things, remove things, rename it, repurpose it, reface it, whatever you want, it's fine. So it's free use. It's literally free use.
So with it being free use.
many people, 300 million people have downloaded it. And of that, like almost 4 ,000 or 3 ,500 enterprises have downloaded or not downloaded, using Llama for their work and their company.
And so when you do those things, you get a heightened scrutiny on everything, heightened scrutiny on your code, your security, features, and people can provide suggestions either with a fork or a push request or all these things. I don't know how that works for these large open source platforms, but I know for libraries.
they'll allow you, if you're a contributor, can ask to become a contributor and you can contribute to the code. But I'm sure you could be able to, you should be able to offer suggestions or some comments. And they have some general.
Some general stuff that they're putting in here, like I'm looking at the GitHub right now. This one said, this person has got 20 likes. It says, know, general opinion, I feel like there's a need for a 3 billion or 4 billion base model that makes it easier, faster to do inference on low end hardware or a CPU, which is my goal. The benefits would be less space, reduces resource need and the cost of accuracy, quality, but fine tuning that for the end users case. Use case would be a game changer.
if quanticides of models would be smaller and maybe more usable for text generation. However, it still could be very good at finding probabilities for tokens and use for applications such as compression of text, logs, binary, structured text, yielding high compression ratio.
So that's just like a suggestion they put like, think this person is working on
Yeah, so this person put that comment in there. And then the author of the GitHub for the Llama stack said, you for the comment. I understand there's a need for smaller models. In this release, we provided FB8, I guess that is their $8 billion model, and our largest $4 .5 billion. In addition, we released both dynamic inference code as well as inference code for
persistent weights in implementation, please check them out. So it's basically saying, hey, we've offered our small model of 8 billion and our largest at 4 .405 billion, and you can edit the weights and the code. Please check out the docs, and you can change the parameters.
this one got 15 down votes. Wow. All right, they're killing them. Wow. Okay. So that's open source and open source is just, I feel like better where it gives an opportunity for governments, universities, individuals to really play around with this technology and technology that they would never be able to interact with at this level without having immense amount of resources.
And you never know what comes up from this. There might be some brilliant pie in the sky idea that's possible because of this AI. Or how much of the coding will you have to do when you're building a new product? How fast can you start innovating when you don't have to build everything from scratch? And you can ask an AI system to build it for you.
Who's the person just throwing stuff against the wall and stuff might stick? You never know. you put something in the hands of 300 million people, something is bound to happen that is potentially spectacular. And I'm excited to what comes of it all. And I am thrilled that they're doing these things. And it's just really cool that we're here in this space because this open source is going to push closed source to the brink.
and they're going to get pushed. Closed Source is going to make some big releases. Open Source is going to push right back. And so this is going to be this constant tug of war and this tug of war and competition will bring out the best of both parties. And I just am excited to see what comes of it all. And I am very, very happy where the future is going.
It's honestly great and I couldn't be more inspired about this whole thing. It's awesome, it's awesome. And so with that said, I'm gonna transition over to these small AI demos. We're near the end of the episode. So I'm share my screen and I have an alien, I don't know, little alien guy. He's got one eye, he's blue and I animated him so I'm gonna share, share screen.
Man, it's a lot of, here we go. Share screen and audio. I don't have any audio, but so you can animate them. And so I have this little alien dude in the funny category. And so I can change the little animation. And so this is like a boxing thing. What's this
This is kind of cool. This kind of matches the vibe of the alien guy. Little like alien dance thing. So he's blue. He's walking all weird and he's got really long arms. He's like, like kind of worming them back and forth while he's walking. So quite interesting. I think it's really cool. I think it would be really awesome for a kid to see this and draw something. You upload it and talk about
your drawings came alive or something. That's what I would do if I was a dad. I would get a lot of joy out of it. Okay, and so let's do this one. This is quite crazy. So we'll share this tab instead. Okay, so this is my original clip. I'm gonna share this and hopefully you can hear
Okay, so then this is the non -expressive translation, which I think is female.
So I think that would just be, hey, this is like the straight up translation, and then this is the translation of my voice in Spanish. So this is supposed to sound like me if I was able to speak Spanish.
It sounds like I smoke or something with my voice. And maybe that's what I sound like. I don't know. But let's see. hermano. maybe that's close. I'm not sure. Well, I'll have to play it back. But in the accents a lot better in this translation, obviously. But, you know, I'm working on mine.
Now the next thing, so that was the seamless translation. The previous one was the animated drawings. And then this next one is this Segment Anything demo, which I think is the biggest one that they've released recently. So I'm gonna share this tab instead now.
So this is a platform that allows you to upload videos and.
make changes to them on the fly. So you could select which objects you want to edit. So I did the person. And so this was a race that I did last weekend with my friend and she doesn't know I'm doing this. So I'm not going to show her face, but there's a whole bunch of different options and I could play around with the background. So I could erase the background. I can make it black and white. I can make it white. So you can't even see her anymore. We get a green screen.
and I could put a gradient on it, I could change the gradient a couple times, and you could overall, you could just do a lot of things when you click these objects and you make these objects in the original piece where you select them, you just click, it's super easy. You just click on it and you can pick out articles of clothing, you could pick out her shoes, I could do her hair, I was able to do her shirt or her shorts.
and then you click on the whole body and it will do it. And normally, if you're not familiar with video editing, this stuff takes a long time to make these segments. And so with Adobe,
What is it? What Adobe? The video editor, I'm blanking on it on the podcast. Adobe Premiere, Adobe Premiere. This stuff would take a long time, because you have to get the segment, you have to segment it, and then you have to segment it and then go through the key frames of the video, which it's doing, and see where you're at, and see if this segment is still matching up correctly. And the AI was able to do
on its own, on its own. And it's not like it's super clear on who this person is. There's a lot of stuff going on in the background. There's text, there's different colors, there's different people. There is a lot of movement. And so it's matched up with this person perfectly. And it's mind blowing because something like this would take me a long time to do manually. Like a long time, like 40 minutes just to get
the segment put together. And maybe they've made some AI improvements since the last time I've used Adobe Premiere, but it is something that...
get wowed about. This is a wow, like wow, this is crazy. And you can add text over your video, you can desaturate it and make it in black and white or just kind of just edit the color hues that are allowed to bleed through. You can blur it out, you can do the outline. And so you can do more than this. If you go to Mark's recent post on Instagram, you could see that
You can see that he recently used this to make a video and he did a whole bunch of different objects and different effects in his video and it looks sick. Looks sick. And he's got the T pain chain and he's like thinking T pain. He puts a chain on the effects, start rumbling in the video and T pain starts talking, like not talking, but the music plays. It's cool. And it's something where it allows people to easily create
cool video with not much effort, because to do the same thing, it'd take you an hour or so just to do a segment, and then to have these different options, takes time, and to do all these things, you have to set up different layers, and so it just makes it more accessible, and allows the user to do stuff that normally would take a long time, pretty quick. And it's got plenty of options. Each one's got three.
It's all good. I would show you more, but I thought this video was pretty good and it had a lot, it was challenging for the AI to find the segment. So I thought it would be good, but I don't want to share her face because she doesn't know I'm doing this video. the time that I was able to ask, she wasn't available. yeah, but this is what it looks like. It's pretty cool. Okay. Transitioning over now.
to my story that I built. Share this tab instead. So this is really cool. And this is another one that I was like, wow, this is insane. I can't believe it. Wild. So I asked this storytelling machine to build me a story about a non or not non -fiction, a fictional sci -fi story. And so it came up with this whole thing except
my voice at the end right here. So this is me and it only says I have a story. And this was talking about earlier. I wanted to say, you know, you're not authorized, blah, blah, abort, abort, whatever. And I was saying I was going to report Jamie to the Space Federation or something. So I'm going to give this a play and everyone's going to be able to listen. So they turn it up. Turn it up, baby. All
ether
and me a
Okay, so there was my story that I had AI create for me using the audio box. Really cool stuff, super easy to use. You can easily create your own voice and then you can add stuff on the fly. If I wanted to add stuff, I could add speech, I could use my voice, I could type whatever, you could do examples, happy days, a great day is what I use. And I would say welcome, or I would say thank
Say what if I said venture step podcast is the best? Let's see
So it's generating right now if you are listening and then we'll see how long it takes. It normally didn't take that long, less than 10 seconds. So it should be coming up here shortly. Here we go. So it gives you an option. So here's this one. I don't know why I'm whispering, but.
All right.
I think this sounds more like me, but I don't know why I'm fading out so much at the end. And maybe that's how I talk, I don't know. But this sounds like me, so I like this one. And so if I wanted to add that to my story, I could do that. So you could see how this could be really cool if you had a larger context window where maybe not 125, but maybe 10 ,000 or 1 ,100.
would allow you to make these elaborate stories and sound effects and all this stuff. And then you could animate with your animation tool and then you can make a whole whole like cartoon just with these like couple tools with using AI. And so it unlocks the possibilities that normally you wouldn't be able to do in that short amount of time. So it allows a lot more people to create.
and share their ideas and provide value to millions of other people because there's 200 million creators. so hopefully this thing.
pans out to be a exponent for creativity, growth.
improving the economy and people's lives, which I'm sure it will. And these tools are just getting started. They're just demos. And they're honestly so cool as is. I think it's a sick demo. And this is much better than Google's demos, where Google doesn't really allow you to upload and do your own thing. These tools allow you to upload your own videos, your own voice, all that stuff.
And it's just a demo. It's very impressive. I'm very impressed with it. It's really cool. Next week, we're going to be talking about Metta. We're going to go on a Metta marathon. We're going to become Metta super fans. And so next week, we're going to be discussing. I think I'm most interested to discover the A .I. and to discuss making some A .I. agents with Metta and seeing what you can do and try to push.
push to the limits and see where we could do and if I could share with my friends and what are the things that I find most useful for use cases and where do I see this going? I know Mark has talked about having an AI -generated social media where AI has their own social media to learn from each other. They'll have their own AI channels for the agents. They'll have
the agents would interact with you. Every social media influencer would have their own AI agent and allow them to interact with their fans. They'll use their content and their comments and their videos to train the AI agent to interact and mimic you to allow your users to have more time with you. But, it's not you. So I don't, I don't know where that, that all.
comes about, like the whole thought process, but it would allow you to better service your fans, but it's not you. So I don't know if it means as much if you apply to every single comment of your hundreds of thousands of comments, but it's not physically you. It's an AI algorithm that mimics you. Is that the same
if it interacts with other people, mimicking you using things that you would say and the love that you would provide, is it as genuine as if it was just you?
think many people would say no. And so there's certain parts where human connection is going to fade a bit, but then you'll just free up more time to connect in different ways, right? Like you'll probably have more free time than you will in the future than you do now. So pros and cons, pros and cons. But that's what we're discussing. I want to talk about the AI studio.
And then I have to read the research paper, which is a massive behemoth with just lots of dense information. And then I like to talk about the research paper too. And hopefully by then I can finish the book behind me, The One in Blue. And we'll be on regular book reviews again, because I'm behind by one. I'm behind by like three books. So I have to get those done. And yeah, and I really appreciate you listening in today. And of course, wherever you are in this world.
Good morning, good afternoon, good evening.
Thanks for listening and please listen again. I'll talk to you next week. Goodbye.