The mysterious torrent release of Mistral 8x7b sent me down the rabbit hole of trying to understand what this term means. I first heard it as a possible explanation for GPT-4’s architecture. I think, with the help of ChatGPT, I was able to grok about 60% of this explainer.
Steven Levy describes a Google product resulting from a collaboration with Steven Johnson. I’m a fan of both authors. I’ve played with this tool just a bit. My summary would be: upload some documents and start asking questions. The answers draw on the content of your documents mixed with general knowledge the LLM already possesses. It generates “notes” as answers to your questions. Those notes have citations attached, so you can review the sources of the information.
I’ve only read the free start of this paid newsletter article, but it is extremely insightful.
One of the funniest trends we see in the Bay area is with top ML researchers bragging about how many GPUs they have or will have access to soon. In fact, this has become so pervasive over the last ~4 months that it’s become a measuring contest that is directly influencing where top researchers decide to go.
So many great quotes in this one. This fits my priors:
Yes, being efficient with GPUs is very important, but in many ways, that’s being ignored by the GPU-poors. They aren’t concerned with efficiency at scale, and their time isn’t being spent productively. What can be done commercially in their GPU-poor environment is mostly irrelevant to a world that will be flooded by more than 3.5 million H100s by the end of next year. For learning, experimenting, smaller weaker gaming GPUs are just fine.
Real-world benchmarks of Apple Silicon vs. gaming-level GPUs.