Venture Step | Transcript: The AI That Wasn't: Unmasking Reflection 70B

The AI That Wasn't: Unmasking Reflection 70B

October 15, 2024 / 32:36 / E38

Dalton Anderson (00:02.068)
Welcome to the VentureStep podcast, where we discuss entrepreneurship, industry trends, and the occasional book review. In the fast-paced world of AI, there are bound to be some breakthroughs and controversies, and they often go hand in hand. Today, we're talking about a story that shook the open source community: the Reflection 70 billion parameter model.

Today we're gonna discuss what Reflection 70 billion is and why it generated so much excitement. What's the controversy around the model and the company? What are some of the accusations being made surrounding Reflection 70 billion? What really happened? We'll try to separate fact from fiction and look at the lessons learned for the future of open source AI.

But before we dive in, I'm your host, Dalton Anderson. My background is a bit of a mix of programming, data science, and insurance. Offline, you can find me building my side business, running, or lost in a good book. You can listen to the podcast in audio or video format on YouTube. If audio is more your thing, you can find the podcast on Spotify, Apple Podcasts, YouTube, or wherever else you get your podcasts.

Okay, first, before we dive in, I'm going to give you a quick overview of the situation. And I think the first thing that we should do is define what fraud is. So I went on Google and I Googled fraud. Fraud is a noun: "wrongful or criminal deception intended to result in financial or personal gain." Okay. The example is, "He was convicted of fraud." Fraudulence, sharp practice, cheating, swindling, and trickery are some of the words related to fraud. It's also "a person or thing intended to deceive others, typically by unjustifiably claiming or being credited with accomplishments or qualities." Okay. So that being said, I am not accusing Shumer, who is the leader of the company,

Dalton Anderson (02:28.908)
or the people behind Reflection 70 billion of fraud. I am simply defining what fraud is, and you as the listener will be able to make your own decision on what you deem is fraud or not fraud. Is that fair? I think that's pretty fair.

Okay, so.

Dalton Anderson (02:56.305)
Let's see here.

Dalton Anderson (03:10.516)
So the company that made Reflection 70 billion is run by this guy named Matt Shumer, and it is owned by this other company, Glaive AI, which is a company that sells data. And Reflection 70 billion was made by WriteUp, I think, WriteUp AI.

Dalton Anderson (03:41.204)
HyperWrite AI, sorry. Yeah, I couldn't find it for a second. So yeah, HyperWrite is a company owned by this Glaive company, Glaive AI. Glaive AI invested in or created HyperWrite. HyperWrite is an AI-powered tool that helps with writing, emails, and things like that, and you can craft customized personas and such.

And they're okay. I wouldn't say they're seen as a leader in this emerging industry, but they're known. They're not a household name. If you asked somebody at random, I bet they could tell you that OpenAI makes AI products. HyperWrite is a little bit more niche, but it was respected in its own right.

Okay, so now you have the background. There are really two companies involved, and there's this guy.

There's a guy named Matt Shumer, and then there is HyperWrite, and then there's Glaive AI. So that's two companies and one person. And the model generated a lot of excitement because it was supposedly the best open source model, not only for its size, but

just in general the best, and it offered superior performance, reduced costs, all sorts of crazy, crazy numbers. It generated a lot of buzz and excitement. People were clamoring to get this into their systems, and the cloud providers were trying to get it into the cloud. And people were downloading the model. It had,

Dalton Anderson (05:43.518)
you know, like 100,000 pulls and downloads onto people's computers to try to match the results. So there was this big excitement about the model from, I guess, the tech community, I would say. And it was generating a lot of buzz. I was seeing it all day, all the time.

And it just wowed a lot of researchers. A lot of researchers were so taken aback by the progress they had made and their claim of superior performance. The way they went about it, they said, was a breakthrough with their reflection prompt engineering, which is some custom proprietary piece that they built on top of these foundation models that allows the model to see its previous prompts, recognize which prompts are bad, and improve them so it no longer writes bad prompts. Something like that, right? It was this really abstract thing, and it just wasn't clear how that would improve the model so much.

So the model was released September 5th. People were so excited about it because of all the news and all the hype. Then, around September 6th to the 9th,

Dalton Anderson (07:26.856)
there were independent evaluations, and they failed to replicate the results. That's when the accusations started. If you're not familiar with scientific research or the science community (and you should be somewhat familiar, I mean, everyone was in fifth grade once, everyone had a science project), if you publish your experiment and then people can't get the same results as you, well, maybe one or two people, that's fine, but if you're talking about 20 researchers who can't get the same result, it seems sketchy, right? So then people are like, what's going on here? This is not really matching up with what you're saying. I can't get the same results. Some people on Twitter are like, I'm not getting the same results as what's being shown. And then September 10th,

Shumer kind of does a half apology, like, yeah, I was super enthusiastic about it all and I may have gotten too excited. And then in the following weeks, further investigation confirms that they were using a wrapper around Sonnet 3.5, which is a model from Anthropic. And basically the controversy has continued ever since then.

And it's really difficult for them to back out of that one, because it was confirmed that it was Sonnet 3.5. They said that they were using Meta's 70 billion parameter model, and that they did their own little fine-tuning and tweaking to improve it,

and they were changing the industry, blah, blah. And then people realized that, like, you're using Anthropic's model.

Dalton Anderson (09:21.588)
So one of the key issues is misrepresentation of the model's performance. The issue was that they were using one of the best-in-class models, one that has over 400 billion parameters, and then saying it's a 70 billion parameter model. Getting that kind of performance and speed at that model size would have been incredible, and those results would have changed the industry for sure.

And then there's just an overall lack of transparency

around Glaive AI and HyperWrite. All this stuff going on with this is very, very murky, I would say. There is potential evidence for fraud. The model didn't seem to really exist, and they went out in public and had lots of interviews. VentureBeat interviewed them, and a couple of other large tech

news companies or magazines or newsletters, and so on. And those outlets had to publicly post apologies, like, we took these people at their word. We couldn't independently verify the facts, and I'm sorry that we misled you. So there are a lot of people that had to apologize. There are companies that were super upset about using GPUs on this product

to train or put up a cloud instance for customers when the whole thing was fraudulent. Or it's not fraudulent, it's accused of being fraudulent. So I think there's gonna be some blowback from this. There are quite a few angry businesses that spent money and labor on putting these models up in the cloud to be able to offer them to their customers. And the model

Dalton Anderson (11:22.632)
doesn't really seem to be what they said it was.

And it's been covered by a couple of news orgs, so I felt more comfortable talking about it. So we have VentureBeat, Hacker News,

Glaive AI posted an update about it, and then Tom's Guide has an article as well. So now we're gonna transition over to the rise and fall of Reflection 70 billion. As I said, it was said to be the world's top open source model. It claimed superior performance on benchmarks, things that had never been done before in

that small of a model, and they followed up all of this interest with hype and

Dalton Anderson (12:23.442)
media showcases, and they went on a whole road show just hyping up this model. And they were rudely awakened a couple of days later when people started looking into it. I think there was a lot of initial buzz, but there was also skepticism too. It wasn't like no one was skeptical.

Because a lot of times, these things aren't really a surprise. To put it in perspective, I think when X posts Grok 3, and they're training it on 100,000 H100s, the new Grok is gonna be really good. From Grok 1 to Grok 2,

massive improvement. Grok 3, I think, is going to be substantially better and one of the top foundation models, and they're going to open source Grok 2. Same thing with Llama. Llama 1, it was okay. Llama 2, it was a lot better. Llama 3, phenomenal. And same thing with Google Gemini. Gemini, as Bard, was okay,

then they made a Bard 2 or something, and then they did Gemini, and then they upgraded Gemini a couple of times, and now Gemini is pretty much a top model.

And these companies are spending billions of dollars on these things, right? So it's not like you're gonna

Dalton Anderson (14:12.404)
You're gonna just.

I don't know the exact saying, but like, a fly-by-night company that just pops out of nowhere and has this phenomenal, industry-shaking model, with approaches that have never been thought about, and you're just so much better than everyone else. Those things don't really happen when that much money is going into these models.

All these companies are putting so many resources into these initiatives, and they have the best talent in the world. Literally, these are the companies to work at, and this is the most hyped thing in the world right now. So they're paying a million-plus dollars when you bring in total comp with stock and bonuses.

And the salaries at some of these places, like OpenAI or Google, they're making big bucks, like an $800,000 salary and then some bonuses and equity and all that stuff. They're making serious money. And I'm making all those points to really dispel the idea that some random,

I know we can't really say random, but not a top player, is gonna just show up and be that much better than everyone else. It just seemed so out of the ordinary. And I had some doubts, and I've gotten burned on some of this stuff before. I haven't talked about it on the podcast, but there's some stuff I read about and I was like, wow, this is so cool.

Dalton Anderson (16:13.502)
That's crazy. Congratulations to them. And then it comes out like two months later that what they demoed was fake, or what they made wasn't actually theirs. It was just a wrapper. And it's fine to make these wrapper products, but you have to disclose it, like, hey, this is a wrapper, I didn't make this all myself, I'm building off of someone else's work. It's very, very disingenuous to take somebody else's stuff

and then just say it's yours. It's just not a good look for sure, especially when your big selling point is the thing that you supposedly built and it's not really yours. It's quite odd, quite odd. And...

Dalton Anderson (17:02.46)
I would say if you are going to, and I'm not advocating for this, but if you are going to commit fraud, I would not try to commit fraud against some of the smartest people in the world. I don't think that one's going to play out that well. Not only are they some of the smartest people in the world, they're also some of the most curious people in the world. So they're going to be super curious about the model and the results. They're going to download it on their computer. They're going to set up their own instance. They're going to

just comb through it for hours and hours and hours. And they're gonna be like, this doesn't make any sense. And they're gonna post on Twitter or whatever you call it nowadays, X. And then the other people are gonna post the same thing. And before you know it, you've got this fire in the tech community. And that's literally how it played out. People are dropping five hours on these new models, trying them out and figuring them out.

Dalton Anderson (17:59.84)
You're going to get called out if you do something like that. Especially at that scale. Yeah. Oh my goodness. It's a crazy, crazy thing to even attempt to do. That's definitely not a good idea. Definitely not a good idea. So after the model was launched, all the smart people gathered up and were like, hey,

I don't know about this one. This is really not matching up with what they're talking about. I am not getting the same results. And then everyone else kind of was like, yeah, me too, I don't know. And then what really fanned the smoking embers into flames, what was an accelerant, was when actual employees from OpenAI were looking into it

and they questioned it. And then people from Meta and people from Google, actual AI researchers, were like, this doesn't match up. I don't know, something's going on here. And it really started to unravel when those things were happening, because then there was some weight to it. Like, if some random person's like, I don't know, you know, versus

someone who's the director or vice president of AI development at their respective company, and the company is OpenAI or Google or DeepMind or Meta AI. People listen to those people. Those people have like 50,000 followers. Some of these AI engineers at these companies have 100,000-plus followers on X. They have a big following, and a lot of those followers are

highly technical and centralized in that segment, that cluster in the industry.

Dalton Anderson (19:58.174)
So that being said, it didn't work out that well. It was pretty quick that people really questioned Matt Shumer's push that the Reflection 70 billion parameter model was that much better than everyone else's models. And there was a lot of skepticism. And I think that there should be a little bit more transparency and ethical practices for sure, especially since these things are eye-opening,

because a lot of people spent money on these models to get these put together. There's labor, there's time. And if these things consistently happen,

Dalton Anderson (20:40.424)
it doesn't look good, obviously, for AI, right? I think people already are very skeptical of AI. Like, has AI really improved my life? Is AI really worth all the hype? Is AI worth all the money that people are putting into it? Is AI even gonna become useful? People have all these questions like that, or similar to that.

Is AI dangerous? And they have that skepticism in the back of their mind, and then they see something like this, and they're like, yeah, of course, that's AI in a nutshell, just scammers and nonsense. And I think there are some studies that show that people already feel unfavorably toward AI.

Dalton Anderson (21:59.998)
Yeah, the vast majority of Americans feel negatively about artificial intelligence and how it'll impact their future.

Dalton Anderson (22:12.5)
54% of Americans feel that they should be cautious toward AI. There's a negative sentiment

among

Americans about AI. People just don't really feel that favorable toward AI in the first place. 52% of Americans are more concerned than they are excited about AI. That's compared with just 10% who said that they're more excited than concerned, and 36% who said that they feel a mix.

But a lot of people just don't feel that good about AI, right? And there's also, in businesses, such a big push, like every business is like, I'm AI, AI this, AI that. And I think there was another study, this is not on sentiment, but really on how your customer reacts. And a lot of customers, I think like 70%, are turned off by companies pushing AI as an offering.

And so it does already have that sentiment, right? And these kinds of events don't help. And that's kind of my concern. I don't want this to turn into some ETF, not ETF, sorry. ETFs are good. I don't want this to turn into one of those crypto, like, what is it? Crypto dogs? NFT, NFT, yep.

Dalton Anderson (23:57.054)
crypto.

Dalton Anderson (24:10.9)
Mmm.

Dalton Anderson (24:16.936)
I just don't want it to turn into one of those things where it's just like this, this really,

really just nonsense, like crypto stuff. There's so much crypto nonsense. There was so much hype and potential, and then with all these scams and other foolishness involved with crypto, people just turned off from crypto. A lot of people just don't trust crypto. It's not something people would put their money in if they're gonna lose everything, or they're gonna get scammed, and

people are gonna steal their money, or it's gonna get hacked. There are so many issues with crypto. And when you already have this fight to change someone's perception and there are negative events ongoing, it's good luck. Good luck doing that. That's all I have to say. It's not happening. It's not gonna happen.

Dalton Anderson (25:24.498)
So I think the result of the controversy is still playing out. I don't know if there's a clear action plan on what's gonna happen to Glaive AI and HyperWrite AI, or with Matt Shumer. Those things are still being investigated, I think, by a team of investigators. But I'm not really sure what will happen, if

something happens. I don't know. I know that there are investors involved with both companies. I know that there were companies disappointed with what some say was fraud, or at the very minimum,

they would say, misrepresentation.

Dalton Anderson (26:21.392)
Where are the consequences, potentially, for these things? I don't know where that lies. I do know that they're investigating it and trying to really, really have a solid case before formal accusations are made. But from the media that did investigative journalism on it, and from other people close to the issue,

it seems very, very likely, and confirmed, that they were using Anthropic's 400 billion parameter model when they said that they were using a 70 billion parameter model, which was an edited version of Meta's 70 billion parameter model.

So Matt Shumer, he produced, I think, a couple of apologies, but he never really addressed the issue. He just said, hey, we were overly excited. That was the first one. In the second one, they didn't really say anything about it. They just kept it hush-hush, and obviously intentionally. And so with the skepticism and reaction of the AI community,

I think that the Reflection 70 billion parameter model, HyperWrite AI, and Glaive AI, I think they're toast. I don't know how they can come back from something like that. That's a really big breach of trust.

So I'm not sure. Maybe this issue with the Reflection 70 billion parameter model and the controversy associated with it changes how people approach these new models in open source AI, and just AI models in general. Maybe they have to have independent verification before they make these claims, because I think that,

Dalton Anderson (28:29.106)
since everything is moving so fast and a lot of the community is scientific, there's a lot of trust placed in

the person making the claims. That's a big thing in the scientific community: if you say something like that publicly, you're very, very confident that what you're saying is true.

And so don't make these kind of statements if you're unsure or if you're lying, supposedly, right? Supposedly lying.

So maybe there needs to be independent validation from a different group to validate the claims before they're made, so there isn't this huge hype train and potential financial gain. I don't know. I think the main reason why you would do something like that is if you needed to raise more funding, if you're in a funding round and you're trying to get your valuation higher, or something regarding

financial gain. And what if you remove the ability for someone to make these outward claims and make them get validation from so-and-so? Not really a centralized group, because I don't think that works either. They could just pay that centralized group to

Dalton Anderson (29:59.698)
monitor it? I don't know. I mean, I'm not saying that the community open sourcing their validation is bad. I just wish that there was a better way to validate the claims before they're made and

Dalton Anderson (30:17.51)
make the whole industry look bad. I think that's the main concern. I don't know. Maybe there are certain volunteers that will volunteer their time, and they'll do a rotation or something like that. Like, randomly selected researchers will go and

use the model and see. Maybe they could do something like that. I don't know.

Currently, that's an issue. I think this has happened a couple of times, nothing of this severity, but this stuff has happened, and a lot of times it's centered around them raising more money or getting their valuation up. Some sort of financial gain is involved a lot of times. And so just to recap, the Reflection 70 billion

parameter model was led by HyperWrite AI and Glaive AI and a guy named Matt Shumer.

Dalton Anderson (31:26.334)
There were announcements and a showcase of the Reflection 70 billion parameter model being the world's best open source model in its class and outside of its class. So that would mean it's better than the 400 billion parameter models and the 70 billion ones and the ones below it. So they were saying it was the best open source model.

Dalton Anderson (31:58.876)
I think there needs to be an emphasis on skepticism, potentially more so now after a couple of hiccups, because it turns out, and it's been confirmed, that what they said wasn't actually the exact situation. They were using Claude's

400 billion parameter model.

And so

I think that this situation provides a good opportunity for users to become more skeptical of these announcements. Things need to be proven before they are announced so widely like this, especially if you are unknown-ish and not one of the key contributors for these foundation models, at least.

I think with any industry-changing product, or industry-changing opportunity, there are always gonna be people who are not excited about technology to better the world. They're excited about the potential financial gain that they can get from the opportunity. And then there are the genuine people who are excited about

Dalton Anderson (33:30.92)
the technology because it's going to better the world. And those are the people that are interesting. Those are the people that will make the best product. The people that are in it for the financial gain are not going to be around long-term, I don't think. Because it's going to be a 30-year grind, 25-year grind. And I don't know. I don't think that you would provide the best product long-term.

And we'll see how it all plays out, who knows, who knows. But yeah, I'm gonna encourage you guys to share your thoughts on this Reflection 70 billion parameter model. I personally think it's crazy. It's a crazy thing to do, for one. Two, these are like the smartest people in the world. They're very thorough in what they do, and they like to check things out in their free time.

And so I just never, never thought that was going to really pan out that well for, for them. As soon as I started seeing like some weird tweets about it, I was like, yeah, this is, this is over with.

Dalton Anderson (34:48.284)
I don't necessarily have a planned episode for next week right now. I don't think. I'm not sure what route I'm going to take. I will, I will get back to you on that, right? I don't have, I don't have something planned out. I've got a list of episodes that I want to do. Maybe I could do next week. I probably might be able to play around with Meta AI and look at the new, the new

results of their update and some of the abilities that they have now like.

I would like to test out turning my reel into Spanish. I post reels on Instagram, so let's see what it looks like when I speak Spanish. I would like to see and use Meta's voice function. And

I would like to use the Meta glasses, but they're sold out. But I did purchase Meta's VR headset, the Quest 2. So I would be, or no, sorry, I will be doing an episode on that when I get back home. I'm excited. VR looks so cool. I want to work in VR. I would love to go travel and just bring my laptop and my VR headset and be able to work in VR. I know it's not very realistic for a long time because

The headsets are so heavy, but I think long term, I think it would be pretty cool to at least be able to use the VR headset for a couple hours, four hours or something a day. Of course, wherever you are in the world, good morning, good afternoon, good evening. I appreciate you listening and hope you listen in next week. Thank you, goodbye.

Creators and Guests

Host

Dalton Anderson

I like to explore and build stuff.
