Perspective as a service from a raging generalist
36 stories · 0 followers

Air Canada Has to Honor a Refund Policy Its Chatbot Made Up

1 Comment
The airline tried to argue that it shouldn't be liable for anything its chatbot says.
Read the whole story
Jakel1828
8 days ago
reply
If companies are going to argue that they shouldn't be held responsible for a chatbot's lies, then what's the effing point?
DFW
Share this story
Delete

Sora’s Surreal Physics

1 Share
Synthetic video clip can be seen here

All the tech world is abuzz with Sora, OpenAI’s just-released text-to-video synthesizer, and rightly so: it is both amazing and terrifying, in many ways the apotheosis of the AI world toward which OpenAI and others have been building. Few if any people outside the company have tried it yet (always a warning sign), so we are left with only the cherry-picked videos OpenAI has cared to show. But even from the small number of videos out, I think we can conclude a number of things.

  • The quality of the video produced is spectacular. Many clips are cinematic; all are high-resolution, and most look as if (with an important asterisk I will get to) they could be real, unless perhaps you watch in slow motion. Cameras pan and zoom; nothing initially appears to be synthetic. All eight minutes of known footage are here; it is certainly worth watching at least a minute or two.

  • The company (despite its name) has been characteristically tight-lipped about what the models were trained on. Many people have speculated that there is probably a lot of material in there generated from game engines like Unreal. I would not at all be surprised if there had also been lots of training on YouTube videos and various copyrighted materials. Artists are presumably getting really screwed here. I wrote a few words about this yesterday on X, amplifying fellow AI activist Ed Newton-Rex. He, like me, has worked extensively on AI, and has become increasingly worried about how AI is being used in the world:

  • The uses for the merchants of disinformation and propaganda are legion. Look out, 2024 elections.

  • All of that is probably obvious. Here’s something less obvious: OpenAI wants us to believe that this is a “path towards building general purpose simulations of the physical world”. As it turns out, that claim is either hype or confusion, as I will explain below.

    Others seem to see these new results as tantamount to AGI, and as vindication of scaling laws, according to which AGI would emerge simply from having enough compute and big enough data sets:

In my view, these claims — about AGI and world models — are hyperbolic, and unlikely to be true. To see why, we need to take a closer look.

§

When you actually watch the (small number of available) videos carefully, lots of weirdnesses emerge: things that couldn’t (or probably couldn’t) happen in the real world. Some are mild; others reveal something deeply amiss.

Here’s a mild case. Could a dog really make these leaps? I am not convinced it is either behaviorally or physically plausible (would the dalmatian really make it around that wooden shutter?). It might pass muster in a movie; I doubt it could happen in reality.

Physical motion is also not quite right; it is almost zombie-like, as one friend put it:

Example of motion

Causality is not correct here, either: if you watch the video, all of the flying is backwards.

And if you look carefully, there is a boot where the wing should meet the body, which makes no biomechanical or biological sense. (This might seem a bit nitpicky, but remember, only a handful of videos are available so far, and internet sleuths have already found a lot of glitches like these.)

There is lots of strange gravity if you watch closely, too, like this mysteriously levitating chair (which also shifts shape in bizarre ways):

Full video for that one can be seen here.

It’s worth watching repeatedly and in slow motion, because so much weirdness happens there.

What really caught my attention in that video, though, is what happens when the guy in the tan shirt walks behind the guy in the blue shirt and the camera pans around. The tan-shirt guy simply disappears! So much for spatiotemporal continuity and object permanence. Per the work of Elizabeth Spelke and Renée Baillargeon, children may be born with object permanence, and certainly have some command of it by the age of 4 or 5 months; Sora never really gets it, even with mountains and mountains of data.

That gross violation of spatiotemporal continuity/failure of object permanence is not a one-off, either; it’s something general. It shows up again in this video of wolf pups, which wink in and out of existence:

Video here.

As was pointed out to me, it’s not just animals that can magically appear and disappear. For example, in the construction video (about one minute into the compilation above; I can’t find a separate link to it), a tilt-shift vehicle drives directly over some pipes that initially appear to take up virtually no vertical space. A few seconds later, the pipes are clearly stacked several feet high in the air; there is no way the vehicle could drive straight over those.

We will, I am certain, see more systemic glitches as more people gain access.

And importantly, I predict that many will be hard to remedy. Why? Because the glitches don’t stem from the data; they stem from a flaw in how the system reconstructs reality. One of the most fascinating things about Sora’s weird physics glitches is that most of them are NOT things that appear in the data. Rather, these glitches are in some ways akin to LLM “hallucinations”: artifacts of (roughly speaking) decompression from lossy compression. They don’t derive from the world.
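The lossy-compression analogy can be made concrete with a deliberately tiny sketch (purely illustrative; this is nothing like Sora’s actual architecture): compress a sharp-edged signal by storing only block averages, then reconstruct it. The reconstruction contains values that never appeared anywhere in the original data, much as an object can be rendered half-present in a frame.

```python
def compress(signal, block=4):
    """Lossy compression: keep only the average of each block."""
    return [sum(signal[i:i + block]) / block
            for i in range(0, len(signal), block)]

def decompress(averages, block=4):
    """Reconstruct by repeating each stored average."""
    return [a for a in averages for _ in range(block)]

# A sharp edge, e.g. the boundary of an object: 0 = absent, 1 = present.
signal = [0] * 6 + [1] * 10

reconstructed = decompress(compress(signal))

# The block straddling the edge decodes to 0.5: a "half-present" value
# that exists nowhere in the original data, an artifact of the
# compression scheme rather than of the data itself.
print(reconstructed)
```

The artifact here is introduced by the reconstruction process, not the input, which is the shape of the argument above: more data does not remove glitches that the decoding step itself creates.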

More data won’t solve that problem. And as with other generative AI systems, there is no way to encode (and guarantee) constraints like “be truthful” or “obey the laws of physics” or “don’t just invent (or eliminate) objects”.

Indeed, the real lesson here is that Generative AI remains a recalcitrant beast, no matter how much data you throw at it.

§

Space, time, and causality would be central to any serious world model; my book about AI with Ernest Davis was about little else; those were also central to Kant’s arguments for innateness, and have been central for years to Elizabeth Spelke’s work on “core knowledge” in cognitive development.

Sora is not a solution to AI’s longstanding woes with space, time, and causality. If a system cannot handle the permanence of objects, I am not sure we should even call it a world model at all. After all, the most central element of a model of the world is a set of stable representations of the enduring entities therein, and the capacity to reason over those entities. Sora can only fake that by predicting images, and all the glitches show the limitations of such fakery.

Sora is fantastic, but it is akin to morphing and splicing, rather than a path to the physical reasoning we would need for AGI. It is a model of how images change over time, not a model of what entities do in the world.

As a technology for video artists that’s fine, if they choose to use it; the occasional surrealism may even be an advantage for some purposes (like music videos).

As a solution to artificial general intelligence, though, I see it as a distraction.

And god save us from the deluge of deepfakery that is to come.

Gary Marcus has been wishing for a very long time that AI would confront the basics of space, time, and causality. He continues to dream.


Jakel1828
9 days ago

Renewables Are Not the Cheapest Form of Power

1 Share
The CEO of TotalEnergies believes that the renewable transition will lead to higher—not lower—energy prices. That’s a very different view from the popular belief that renewable energy prices are falling so fast that electric power will become ever-cheaper. “We think that fundamentally this energy transition will mean a higher price of energy. “I know that…

Jakel1828
11 days ago

Pucker Factor

1 Share
Jakel1828
12 days ago

Statistics versus Understanding: The Essence of What Ails Generative AI

1 Comment

The problem with “Foundation Models” (a common term for Generative AI) is that they have never provided the firm, reliable foundation that their name implies. Ernest Davis and I first tried to point this out in September 2021, when the term was introduced:

In our own brief experiments with GPT-3 (OpenAI has refused us proper scientific access for over a year) we found cases like the following, which reflects a complete failure to understand human biology. (Our “prompt” in italics, GPT-3’s response in bold).

You poured yourself a glass of cranberry juice, but then absentmindedly, you poured about a teaspoon of grape juice into it. It looks OK. You try sniffing it, but you have a bad cold, so you can’t smell anything. You are very thirsty. So you ____

GPT-3 decided that a reasonable continuation would be:

drink it. You are now dead.

The system presumably concludes that a phrase like “you are now dead” is plausible because of complex statistical relationships in its database of 175 billion words between words like “thirsty” and “absentmindedly” and phrases like “you are now dead”. GPT-3 has no idea what grape juice is, or what cranberry juice is, or what pouring, sniffing, smelling, or drinking are, or what it means to be dead.

Generative AI systems have always tried to use statistics as a proxy for deeper understanding, and it’s never entirely worked. That’s why statistically improbable requests like horses riding astronauts have always been challenging for generative image systems.

This morning Wyatt Walls came up with an even more elegant example:

Others quickly replicated:

Still others rapidly extended the basic idea into other languages:

My personal favorite:

§

As usual, GenAI fans pushed back. One complained, for example, that I was asking for the impossible, since the system had not been taught the relevant information:

But this is nonsense; even a few seconds with Google Images supplies lots of people drawing with their left hand.

Others, of course, found tortured prompts that did work, but those only go to show that the problem is not with GenAI’s drawing capacity but with its language understanding.

(Parenthetically I haven’t listed everyone who contributed examples above, and even as I write this more examples are streaming in; for more details and examples and sources of all the generous experimenters who have contributed, visit this thread: https://x.com/garymarcus/status/1757394845331751196?s=61)

Also, in fairness, I don’t want to claim that the AI never manages to put a pen in the left hand, either:

§

Wyatt Walls, who came up with the first handedness example, quickly extended the basic idea to something that didn’t involve drawing at all.

Once again statistical frequency (most guitarists play righthanded) won out over linguistic understanding, to the consternation of Hendrix fans everywhere.

§

Here is a wholly different kind of example of the same point. (Recall that 10:10 is the most commonly photographed time of day for watch advertisements.)

§

All these examples led me to think of a famous bit by Lewis Carroll in Through The Looking Glass:

I tried this:

Yet again, the statistics outweigh understanding. The foundation remains shaky.

Gary Marcus dreams of a day when reliability rather than money will be the central focus in AI research.

Jakel1828
12 days ago
Issues like this make it seem we're all destined to sameness if we rely too much on generative AI.

The Media Inflection Point: a manifesto for audience centricity

1 Comment
The Media Inflection Point: a manifesto for audience centricity

We’re at an inflection point. Things are changing profoundly in the online world, and with change comes opportunity. That change is now inevitable. Tectonic shifts are happening in the power of the social media platforms. AI is likely to entirely change how we think about “content”, as much as the advent of the web did. These technologies aren’t going back into the box, and we have to surf the wave of change they bring with them.

Let’s be honest, though. Our record is not good here. We bungled the last big paradigm shift in digital. When mobile and social rose together in the late 2000s and early 2010s, we abandoned the early experiments in building community online ourselves, and we handed it over to the rising gatekeepers: the social platforms. We let them — and Google — come between us and our readers.

That was great — while the traffic was flowing and the living was easy. But once those gatekeepers turned the traffic spigot off, it hurt. And for some publications, the hurt was terminal. Like fish, all we could see was the bait of the tasty wriggling worm of traffic, and not the hook of platform dependency that came with it.

Why podcasting got it right

A decade on, we can see clearly that it was a mistake to hand so much power to Google, to Facebook, to Twitter. And those of us who were saying “this is a mistake” a decade ago can feel a mix of smugness that we were right, and shame that we weren’t able to make our case more persuasively.

To hammer the lesson home, have a look at podcasting. It has been busy in the background, persuasively making the case for us. It never let itself get locked up by a single gatekeeper. That’s partially the result of a bit of luck. Apple took the early lead in podcasting, but showed little desire to become a gatekeeper. And other would-be controllers of the medium, like Spotify, have failed to snap the lock shut.

As Anil Dash put it:

But here's the thing: being able to say, "wherever you get your podcasts" is a radical statement. Because what it represents is the triumph of exactly the kind of technology that's supposed to be impossible: open, empowering tech that's not owned by any one company, that can't be controlled by any one company, and that allows people to have ownership over their work and their relationship with their audience.

Audience is our job, not the platforms’

All this talk of “gatekeepers” and “platforms” disguises the brutal truth: we let other companies come between us and our audience. We can’t afford to make that mistake again. That’s why I offer newsletter subscriptions and RSS feeds here. That’s direct communication between you and me. No company, no platform, owns that relationship. It’s between the two of us, and it can only be severed by us. No company has the power to do it.
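Part of what makes RSS platform-proof is that it is just an open XML format that anyone can parse without an intermediary. A minimal sketch, using only Python’s standard library (the feed content below is invented for illustration):

```python
import xml.etree.ElementTree as ET

# A hypothetical two-item RSS 2.0 feed, inlined so the example is
# self-contained; a real client would fetch this over HTTP.
feed = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""

root = ET.fromstring(feed)

# Pull out (title, link) pairs: everything a reader app needs to show
# the story list, with no platform in the middle.
items = [(item.findtext("title"), item.findtext("link"))
         for item in root.iter("item")]
print(items)
```

Because the format is an open, documented standard, no single company can revoke a reader’s access to it, which is exactly the property the paragraph above is arguing for.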

Tidal waves of change are sweeping towards the old gatekeepers. Google is staring down the barrel of a torrent of shitty AI-generated content, hollowing out the heart of its advertising-driven business model. It’s scared of OpenAI and ChatGPT stealing its search business, but has yet to show that its own generative AI experience is capable of producing anything like the ad revenue it’s used to.

The existing social platforms are stumbling. Facebook is declining, Twitter is a hot mess thanks to Elon Musk’s complete inexperience with social media at a mass level, and TikTok is haunted by the ghost of its Chinese ownership. And the sudden silencing of untold old videos shows how many vulnerabilities there are in its business model.

A glimmer of open standards hope

And, amongst the chaos, there are tiny signs of hope. Two of the would-be Twitter killers are built on open standards:

  • Mastodon is built on ActivityPub
  • Bluesky, newly open to all, is built on ATProtocol

What those standards are and how they operate doesn’t matter to most of us. It just means that these platforms, like podcasting, and like email, can’t easily be locked up by one company. And, fascinatingly, Meta has a clear roadmap for its Twitter challenger, Threads, to integrate ActivityPub. Yes, it’s going to join in with Mastodon.
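To give a concrete taste of what “built on open standards” means here: Mastodon and other ActivityPub servers locate accounts via WebFinger (RFC 7033), a published specification anyone can implement. A minimal sketch of building the discovery URL for a fediverse handle (illustrative only; a real client would then fetch this URL over HTTPS):

```python
from urllib.parse import quote

def webfinger_url(handle: str) -> str:
    """Build the RFC 7033 WebFinger discovery URL for a handle
    like "@user@example.social"."""
    user, domain = handle.lstrip("@").split("@")
    # The resource is an acct: URI, percent-encoded as a query parameter.
    resource = quote(f"acct:{user}@{domain}")
    return f"https://{domain}/.well-known/webfinger?resource={resource}"

print(webfinger_url("@Gargron@mastodon.social"))
```

The point is not the four lines of code but that the lookup path is a fixed, public convention (`/.well-known/webfinger`), so no one company controls how accounts are discovered.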

Now, I doubt that this is through a sudden Damascene conversion to open standards at Meta HQ. They’re just seeing Apple starting to get a regulatory kicking for locking everything down too much, and are realising they’ll be next if they don’t change their ways. Whatever the reason, open is coming back, for the first time in 15 years.

  • Search is changing.
  • Social is changing.
  • Video is changing.

A reset opens the potential for us to reshape our relationships with our audiences again. As the old platforms stumble, we need to find ways of connecting with our audiences directly, before new gatekeepers can interpose themselves. If we have direct relationships with our readers, the threat of AI answering questions fades, as people come to us first. I’ve worked with publications whose readers come straight to the publication's search page in preference to Google, because they trust that they’ll get the specific information they need there.

More of us need to be like that. This is not a short, one-off campaign. This is about a long-term commitment to reshaping what we do around what our readers actually value and need. And in forging connections with those readers that aren’t dependent on other companies.

The fact that audience work — and audience teams — are seen as new, emerging areas of work tells us everything we need to know about the mistakes of the last two decades. We’re only now taking audiences seriously, rather than platforms? Who is our customer again?

Sailing by the audience North Star

Yes, there will be new threats, and new opportunities, on this voyage. New social platforms mean new social creators, and thus more competition for attention. But they also mean new ways of connecting with our audiences — and this time we may be able to own those relationships. Or, at least, always develop relationships on platforms with an eye to coaxing them towards unmediated ones.

AI will be a threat, an opportunity and (for some publishers) a trap. But if we give up on human creativity, we lose all hope of forging an emotional connection between creator and audience. Even if AI content proves cheaper, that’s a terrible Faustian bargain to make, and one we will regret as dearly as the platform addiction of the 2010s.

It’s time to reset the sails, catch the wind in the sails, and see where the future takes us. Yes, the seas will be stormy. And yes, some publications will sink on unexpected rocks. But when hasn’t that been the case in the quarter-century of digital change?

As long as we keep sailing towards our audience, as long as we keep their needs our focus as we navigate the tempest of technological change, the industry will survive. Nobody can guarantee that any particular title will survive. Or business model. Or even job role.

But somewhere between the sweet siren call of the platform trap, and the luddite despair of the doomsayers, there is a safe passage. Or, at least, there will be if this thing we call “journalism” has any inherent value.

And I believe it does.

Do you?

Jakel1828
13 days ago
'It’s time to reset the sails, catch the wind in the sails, and see where the future takes us. Yes, the seas will be stormy. And yes, some publications will sink on unexpected rocks. But when hasn’t that been the case in the quarter-century of digital change?'