When you look at the night sky, you are looking into the past. The Andromeda Galaxy is 2.5 million light-years away, meaning the photons striking your retina tonight left when early hominins were still figuring out how to bang rocks together. This is a hard constraint of physics: information takes time to travel. Worse, the photons are distorted by the path they travel: red-shifted, scattered, or otherwise altered before they ever reach you.
The same principle applies to ideas, perspectives, and intellectual progress. Just as astronomical distance creates lag, geographical and institutional distance creates "epistemic lag". And just as the path a photon travels distorts the photon, the path information travels distorts the information. Why did it take so long for everyone to realize what was going to happen with AI after AlexNet was published in 2012? Why did computer vision labs take 2-3 years to adopt convolutional networks after that? Why did it take investors 10 years to realize NVDA was going to explode? Why does the US not understand the Middle East, and vice versa? Why do DC and Silicon Valley differ so much in perspective? Why do an NYC quant interview and an SV deep learning interview have essentially zero Jaccard similarity in vocabulary while probing many of the same underlying topics? Information and perspective are not evenly distributed; they have points of origin, diffusion pathways, and varying rates of transmission. If you want to be on the cutting edge, if you want to live in "the future", how do you do it? Where do you live? What do you do all day? The real question is: where do novel tokens originate, and how do you minimize the time it takes for them to reach you?
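To make the Jaccard point concrete, here is a toy sketch in Python; the topic sets are invented placeholders, not real interview questions. Two interviews can share almost no surface vocabulary while still circling the same underlying ideas.

```python
# Toy illustration: surface-level vocabulary overlap between two interview styles.
# The topic sets below are invented placeholders, not real interview data.

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

nyc_quant = {"brownian motion", "martingales", "order books", "mental math"}
sv_deep_learning = {"backprop", "transformers", "cuda kernels", "scaling laws"}

print(jaccard(nyc_quant, sv_deep_learning))  # 0.0 -- zero token overlap,
# even though both interviews are really probing probability, optimization, and numerical intuition.
```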
Just like photons from Andromeda, watching the news is watching the past. Not just in the trivial sense that it happened hours ago, but in the deeper sense that the underlying research, business decisions, and technical implementations were conceived long before the press release. The rest of us are merely observing the afterglow.
Raw information originates in primary sources: closed-door meetings at OpenAI and Anthropic, experimental labs at DeepMind, informal conversations in Menlo Park cafes, or pitch meetings at A16Z. And even before that, there is a person at a laptop having an "aha" moment; that is the origination point. That is the firsthand smoke. Everything else is secondhand or worse.
Even worse is how we update our priors after receiving that information. In signal processing, there’s a well-known phenomenon called filter lag. Any system that processes information—whether a Kalman filter tracking an object, or a financial analyst digesting market signals—will always be slightly behind reality. Filtering requires aggregation, smoothing, and verification, which introduce delay.
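A minimal sketch of filter lag, using an exponential moving average as a stand-in for any smoothing filter (a Kalman filter shows the same effect): the more you smooth, the further your estimate trails reality.

```python
# Filter lag in one dimension: a smoothed estimate always trails a moving signal.
# An exponential moving average is used here as the simplest possible smoothing filter.

def track(signal, alpha):
    """Smooth `signal` with weight `alpha`; smaller alpha = more smoothing = more lag."""
    estimate = signal[0]
    smoothed = []
    for x in signal:
        estimate = alpha * x + (1 - alpha) * estimate  # blend new data with old belief
        smoothed.append(estimate)
    return smoothed

truth = list(range(20))          # reality moving steadily upward: 0, 1, ..., 19
light = track(truth, alpha=0.8)  # light filtering
heavy = track(truth, alpha=0.2)  # heavy filtering (lots of aggregation and verification)

print(truth[-1], round(light[-1], 1), round(heavy[-1], 1))
# roughly: 19 18.8 15.1 -- the heavier the filtering, the staler the estimate.
```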
The media ecosystem is itself a filtering mechanism: events are gathered, verified, edited, and packaged before they reach you, and each stage adds to the lag between when something is known and when it is known widely.
For those consuming information solely through news outlets, academic journals, or industry reports, the result is clear: you are experiencing a delayed snapshot of a past reality. If you want to operate on real-time information, you need to be closer—physically, socially, or professionally—to the source.
Consider how large language models (LLMs) acquire knowledge. They predict the next token based on statistical correlations in past data. Their epistemology is retrospective; they do not generate fundamentally new information but remix existing tokens in ways that approximate novelty.
The same is true for human knowledge consumption. If your intellectual diet consists primarily of secondhand sources—news articles, research papers, analyst reports—you are engaging in a form of autoregressive token prediction. You are not in the domain where novel tokens are generated; you are in the domain where existing tokens are recombined.
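A minimal sketch of what "recombining existing tokens" means, using a toy bigram sampler over an invented one-line corpus: the model can only ever emit tokens, and transitions between tokens, that it has already seen.

```python
import random
from collections import defaultdict

# Toy autoregressive model: a bigram table built from a tiny, invented corpus.
corpus = "new ideas spread slowly from the lab to the press to the public".split()

bigrams = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev].append(nxt)  # record every observed transition

def generate(start, length=8):
    """Sample forward one token at a time; every output token already exists in the corpus."""
    out = [start]
    for _ in range(length):
        options = bigrams.get(out[-1])
        if not options:  # dead end: the model has nothing left to say
            break
        out.append(random.choice(options))
    return " ".join(out)

print(generate("the"))
# Whatever comes out is a remix of the past; the model cannot emit a token it never saw.
```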
So where do new tokens originate? In environments of high-entropy innovation: places where new concepts, frameworks, and heuristics are actively being developed rather than merely disseminated. Historically, these environments have been geographically concentrated, in the labs, cafes, and pitch rooms described above, with Silicon Valley as the densest cluster.
To return to the LLM analogy: if you are far from these generative environments, your information diet consists of tokens from the past, not real-time novel tokens. Being physically or socially close to these environments increases the probability that your cognitive model is trained on cutting-edge data rather than lagging distributions.
Not everyone can, or wants to, move to the Bay Area. But the core problem remains: how do you minimize epistemic lag without geographical proximity? A few heuristics: go directly to primary sources (papers, code, and talks by the people doing the work) rather than waiting for coverage; build social and professional proximity to the people at the origination points; and count the filters between you and the source, then cut as many of them as you can.
The experience of “living in the future” is ultimately an artifact of information flow dynamics. If you are geographically, socially, or intellectually distant from information-generating hubs, you are necessarily consuming filtered, lagging data. The greater this lag, the more outdated your model of reality becomes.
Silicon Valley is not special because of some intrinsic property of its geography, but because it has become the central node in the global information network for technological innovation. If you want to live in the future, you need to optimize for proximity to novel tokens. This is not a metaphor; it is a fundamental principle of information theory, epistemic economics, and network science.
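A minimal sketch of the "central node" claim, on an invented toy diffusion network: model epistemic lag as the number of hops between you and the node where a token originates, and proximity falls straight out of a breadth-first search.

```python
from collections import deque

# Invented toy diffusion network: each edge is a "talks regularly with" relationship.
edges = [
    ("lab", "founder"), ("founder", "investor"), ("investor", "journalist"),
    ("journalist", "newspaper"), ("newspaper", "reader"),
    ("founder", "cafe_regular"),  # social proximity shortcut
]

graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def hops_from(source):
    """BFS distance from the origination point; hop count is a crude proxy for lag."""
    dist, queue = {source: 0}, deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

print(hops_from("lab"))
# e.g. {'lab': 0, 'founder': 1, 'cafe_regular': 2, 'investor': 2,
#       'journalist': 3, 'newspaper': 4, 'reader': 5}  (key order may vary)
# The reader at the end of the chain is five filtering hops, and that much lag, from the source.
```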
The future is an unevenly distributed function of information flow. Your position in that function determines whether you are predicting the next token—or generating it.