
26 Jan

DeepSeek-R1 Initial Notes

While I've been traveling, little-known Chinese research lab DeepSeek released an open-source model that can compete with the best closed-source products OpenAI and Anthropic have to offer. Everyone appears to be freaking out about it.

The reasons for this are its astonishingly low training cost relative to its performance (it reportedly cost about $5 million to train), the fact that it's open-source, and its pricing: even if you choose to pay for its API, it's about 90% cheaper than its American competitors. The model is also available to try for free immediately. This is all possible because of how much cheaper the

Read more
5 min read
17 Jan

1/17 Link Roundup & Screenshot Dump

I've reached day 3 of my hundred days and I'm already out of ideas. A new record! However, I accounted for this inevitability in the goal I set for myself, which was just to "write something on my blog every day". I'd like to add some functionality to the blog to display shorter posts in the main feed - things like TILs or links with short notes attached - which would basically allow me to do a hundred days of tweets, but on my personal site, if I so wish. In the meantime, though, I'm stuck with the blog

Read more
6 min read
15 Jan

AI Inferno: Making A Reverse Roko's Basilisk With AI Agents

For those who weren't in some of the more toxic corners of the tech internet in the late aughts and early teens, Roko's Basilisk is an old thought experiment to the effect that, should a benevolent superintelligence come to exist in the future, it would be incentivized to punish people who were aware of its possibility and did nothing to help bring it about. This is the type of pretty dumb thought experiment that really hits for a certain type of Online Tech Guy: it originated on LessWrong (more like More Wrong imo), made its way across the Slatestar

Read more
11 min read