As we enter the next stage of AI development, more questions are being raised about the safety implications of AI systems, while the companies themselves are scrambling to establish exclusive data deals to ensure that their models are best equipped to meet expanding use cases.
On the first front, various organizations and governments are working to establish AI safety pledges, which corporations can sign up to, both for PR purposes and for collaborative development.
And there’s a growing range of agreements in progress:
- The Frontier Model Forum (FMF) is a non-profit AI safety collective working to establish industry standards and regulations around AI development. Meta, Amazon, Google, Microsoft, and OpenAI have signed up to this initiative.
- The “Safety by Design” program, initiated by anti-human-trafficking organization Thorn, aims to prevent the misuse of generative AI tools to perpetrate child exploitation. Meta, Google, Amazon, Microsoft, and OpenAI have all signed up to the initiative.
- The U.S. Government has established its own AI Safety Institute Consortium (AISIC), which more than 200 companies and organizations have joined.
- EU officials have also adopted the landmark Artificial Intelligence Act, which will see AI development rules implemented in that region.
At the same time, Meta has also now established its own AI product advisory council, which includes a range of external experts who will advise Meta on evolving AI opportunities.
With many large, well-resourced players looking to dominate the next stage of AI development, it’s critical that the safety implications remain front of mind. These agreements and accords will provide additional protections, based on assurances from the participants and collaborative discussion on next steps.
The big, looming fear, of course, is that AI will eventually become smarter than humans and, at worst, enslave the human race, with robots making us obsolete.
But we’re not close to that yet.
While the latest generative AI tools are impressive in what they can produce, they don’t actually “think” for themselves, and are only matching data based on commonalities in their models. They’re essentially super-smart math machines, but there’s no consciousness there; these systems are not sentient in any way.
As Meta’s chief AI scientist Yann LeCun, one of the most respected voices in AI development, recently explained:
“[LLMs have] a very limited understanding of logic, and don’t understand the physical world, do not have persistent memory, cannot reason in any reasonable definition of the term and cannot plan hierarchically.”
In other words, they can’t replicate a human, or even an animal, brain, despite the content that they generate becoming increasingly human-like. But it’s mimicry, smart replication; the system doesn’t actually understand what it’s outputting, it just works within the parameters of its system.
We could still get to that next stage, with several groups (including Meta) working on artificial general intelligence (AGI), which would simulate human-like thought processes. But we’re not close as yet.
So while the doomers are asking ChatGPT questions like “are you alive,” then freaking out at its responses, that’s not where we’re at, and likely won’t be for some time yet.
As per LeCun again (from an interview in February this year):
“Once we have techniques to learn “world models” by just watching the world go by, and combine this with planning techniques, and perhaps combine this with short-term memory systems, then we might have a path towards, not general intelligence, but let's say cat-level intelligence. Before we get to human level, we're going to have to go through simpler forms of intelligence. And we’re still very far from that.”
Yet, even so, given that AI systems don’t understand their own outputs, and that they’re increasingly being embedded in informational surfaces, like Google Search and X trending topics, AI safety is important, because right now, these systems can produce, and are producing, wholly false reports.
Which is why it’s important that all AI developers sign on to these types of accords. Yet not all of the platforms looking to develop AI models are listed in these programs as yet.
X, which is looking to make AI a key focus, is notably absent from several of these initiatives, as it looks to go it alone on its AI projects, while Snapchat, too, is increasing its focus on AI, yet it’s not yet listed as a signatory to these agreements.
It’s more pressing in the case of X, given that it’s already, as noted, using its Grok AI tools to generate news headlines in the app. That’s already seen Grok amplify a range of false reports and misinformation, due to the system misinterpreting X posts and trends.
AI models are not great with sarcasm, and given that Grok is being trained on X posts, in real time, that’s a difficult challenge, which X clearly hasn’t got right just yet. But the fact that it uses X posts is its key differentiating factor, and as such, it seems likely that Grok will continue to produce misleading and incorrect explanations, because it’s relying on X posts, which are not always clear, or correct.
Which leads into the second consideration. Given the need for more and more data to fuel their evolving AI projects, platforms are now looking at how they can secure data agreements to keep accessing human-created info.
Because theoretically, they could use AI models to create more content, then use that to feed into their own LLMs. But bots training bots is a road to more errors, and eventually, a diluted internet, awash with derivative, repetitive, and non-engaging bot-created junk.
Which makes human-created data a hot commodity that social platforms and publishers are now looking to secure.
Reddit, for example, has restricted access to its API, as has X. Reddit has since made deals with Google and OpenAI to use its insights, while X is seemingly opting to keep its user data in-house, to power its own AI models.
Meta, meanwhile, which has bragged about its unmatched data stores of user insight, is also looking to establish deals with big media entities, while OpenAI recently came to terms with News Corp, the first of many expected publisher deals in the AI race.
Essentially, the current wave of generative AI tools is only as good as the language model behind each of them, and it’ll be interesting to see how such agreements evolve, as each company tries to get ahead and secure its future data stores.
It’s also interesting to see how the process is developing more broadly, with the larger players, who can afford to cut deals with providers, separating from the pack, which, eventually, will force smaller projects out of the race. And with more and more AI safety regulations being enacted, it could also become increasingly difficult for lesser-funded providers to keep up, which will mean that Meta, Google, and Microsoft lead the way, as we look to the next stage of AI development.
Can they be trusted with these systems? Can we trust them with our data?
There are many implications, and it’s worth noting the various agreements and shifts as we progress towards what’s next.