🗂️Will we run out of data?
Khem Raj September 04, 2025 #meta
The paper by epoch.ai is interesting in many ways, but fundamentally it asks one question:
Have we eaten the internet?
The paper estimates:
We estimate the stock of human-generated public text at around 300 trillion tokens. If trends continue, language models will fully utilize this stock between 2026 and 2032, or even earlier if intensely overtrained.
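To get a feel for how an estimate like this behaves, here is a rough back-of-the-envelope sketch. Only the 300 trillion token stock comes from the quote above. The starting year, the starting dataset size, and the yearly growth factors are hypothetical placeholders chosen for illustration, and the constant-growth model is a simplification, not the paper's method.

```python
def exhaustion_year(stock_tokens, start_year, start_dataset_tokens, annual_growth):
    """Return the first year in which the largest training dataset reaches the stock.

    Assumes the dataset is multiplied by a constant factor every year,
    which is a simplification made for this sketch.
    """
    if annual_growth <= 1:
        raise ValueError("annual_growth must be greater than 1")
    year = start_year
    size = start_dataset_tokens
    while size < stock_tokens:
        size *= annual_growth
        year += 1
    return year


STOCK_TOKENS = 300e12          # from the quoted estimate: ~300 trillion tokens
START_YEAR = 2024              # hypothetical starting point
START_DATASET_TOKENS = 15e12   # hypothetical size of the largest dataset today

if __name__ == "__main__":
    # Hypothetical growth factors: slower growth pushes the date out,
    # faster growth pulls it in.
    for growth in (1.5, 2.0, 3.0):
        year = exhaustion_year(STOCK_TOKENS, START_YEAR, START_DATASET_TOKENS, growth)
        print(f"growth x{growth}/year -> stock reached in {year}")
```

The point of the sketch is the sensitivity: the exhaustion year depends mostly on the assumed growth factor, which is one reason an estimate of this kind comes as a range of years rather than a single date.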
Like any other resource, it will deplete. So what will fuel AI growth in the future?