
26 Jan

DeepSeek-R1 Initial Notes

While I've been traveling, little-known Chinese research lab DeepSeek released an open-source model that can compete with the best closed-source products OpenAI and Anthropic have to offer. Everyone appears to be freaking out about it.

The reasons for this are its astonishingly low training cost relative to its performance (it reportedly cost about $5 million to train), the fact that it's open-source, and its pricing: even if you choose to pay for its API, it's about 90% cheaper than its American competitors. The model is also available to try for free immediately. This is all possible because of how much cheaper the

Read more
5 min read
17 Jan

1/17 Link Roundup & Screenshot Dump

I've reached day 3 of my hundred days and I'm already out of ideas. A new record! However, I accounted for this inevitability in the goal I set for myself, which was just to "write something on my blog every day". I'd like to add some functionality to the blog to display shorter posts in the main feed - things like TILs or links with short notes attached - which would basically allow me to do a hundred days of tweets, but on my personal site, if I so wish. In the meantime, though, I'm stuck with the blog

Read more
6 min read
15 Jan

AI Inferno: Making A Reverse Roko's Basilisk With AI Agents

For those who weren't in some of the more toxic corners of the tech internet in the late aughts and early teens, Roko's Basilisk is an old thought experiment to the effect that, should a benevolent superintelligence come to exist in the future, it would be incentivized to punish people who were aware of its possibility and did nothing to help bring it about. This is the type of pretty dumb thought experiment that really hits for a certain type of Online Tech Guy: it originated on LessWrong (more like More Wrong imo), made its way across the Slatestar

Read more
11 min read