Hi, I'm Francois

My Background: LinkedIn

I am a PhD student at Stanford, co-advised by Chris Ré and Mykel Kochenderfer. Right now, I think the highest and best use of my FLOPS is to help push the AGI ball up the hill in any way I can. This is the first time in the history of our universe (that we are aware of) that a species has a chance at developing a superintelligence, and working on anything else seems remarkably meek in the face of that fact. What a time to have been born, to be alive, and to have the skill set that I have. What else could I possibly be working on? So I am done with the startup game for now, and I am going back to Stanford to get my PhD to do exactly that. I believe strongly in the "Science in the Open" ethos, and as part of that, I want to open a window into the research I am doing, even in its nascent stages. If that means it may get poached, that's OK, so long as it accelerates mankind's research goals. I will try to dedicate a few hours a week to documenting what I am doing and thinking about here, in a raw, open format.

Here is what I am CURRENTLY working on / interested in:

I think that SGD + stacked transformers is perhaps a sufficient path to AGI, but at what cost? We need huge terawatt-scale datacenters, while our brains learn better (as of now) and run on roughly 20 W. Something is missing. If the Universal Approximation Theorem holds, and if neurons don't backprop, then there must exist another learning algorithm (solver) + architecture that gets us there as well, perhaps much faster and much more cheaply. I think there are many "sufficient" paths. We have (perhaps) found one element in the set of AGI, but it is certainly not a necessary one. I agree with Yann LeCun that we did not need flapping wings to fly, but we did need two wings (initially). There are many elements in the set of flight, and they have different trade-offs: helicopters are great for their use case, rockets for theirs, planes for theirs. There will likewise be many elements in the set of AGI, each with different trade-offs, and we will want some for some use cases and others for others.

So right now I want to find another one. The biggest reason LSTMs, DNCs, NTMs, and other RNNs with big external/internal memories did not scale was BPTT. Well, let's get rid of it. Can we find a forward-only, forward-forward-style algorithm that outperforms it? One that achieves infinite context length and does not need to keep all temporal activations around? Let's try! If you are interested in this, please reach out.
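To make the memory argument concrete, here is a minimal sketch in plain numpy. It is my own illustration, not a method from any of the posts below: BPTT must store every hidden state of a length-T sequence before it can backpropagate, so activation memory grows with T, while a forward-only rule that carries a fixed-size eligibility trace (a diagonal RTRL-style approximation, in the spirit of SnAp-1) updates weights online with memory that is constant in T. The RNN, readout, sizes, and learning rate are hypothetical placeholders; the point is only the contrast in what each rule must remember.

```python
import numpy as np

# Minimal sketch: vanilla tanh RNN with a scalar linear readout and a
# per-step squared-error loss. All sizes and rates are placeholders.
rng = np.random.default_rng(0)
H = 32                                   # hidden size (hypothetical)
W = rng.normal(0, 0.1, (H, H))           # recurrent weights
U = rng.normal(0, 0.1, H)                # input weights (scalar input)
v = rng.normal(0, 0.1, H)                # readout
lr = 1e-2

def step(h, x):
    return np.tanh(W @ h + U * x)

def bptt_grads(xs, ys):
    """BPTT: must store all T hidden states before the backward pass."""
    T = len(xs)
    hs = [np.zeros(H)]
    for t in range(T):
        hs.append(step(hs[-1], xs[t]))   # O(T * H) activation storage
    dW, dh = np.zeros_like(W), np.zeros(H)
    for t in reversed(range(T)):
        err = v @ hs[t + 1] - ys[t]      # d loss_t / d output
        da = (dh + err * v) * (1 - hs[t + 1] ** 2)
        dW += np.outer(da, hs[t])
        dh = W.T @ da                    # carry credit backwards in time
    return dW

def forward_only_pass(xs, ys):
    """Online rule: one fixed-size trace E, no stored history (O(1) in T)."""
    global W
    h = np.zeros(H)
    E = np.zeros_like(W)                 # trace E_ij ~ d h_i / d W_ij
    for t in range(len(xs)):
        h_prev, h = h, step(h, xs[t])
        d = 1 - h ** 2                   # tanh'
        # Diagonal (SnAp-1-style) approximation: track each unit's
        # influence on itself only, so the trace stays the size of W.
        E = d[:, None] * (h_prev[None, :] + np.diag(W)[:, None] * E)
        err = v @ h - ys[t]
        W -= lr * (err * v)[:, None] * E  # update immediately, discard past

xs, ys = rng.normal(size=256), rng.normal(size=256)
print("BPTT grad norm:", np.linalg.norm(bptt_grads(xs, ys)))
forward_only_pass(xs, ys)                # memory stays flat no matter T
```

The trade-off, of course, is that the diagonal trace is a biased estimate of the true gradient; the open question driving the work below is whether some forward rule can close that gap while keeping the constant memory footprint.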

Blog links below:

  1. Universal Address Spaces for GPUs (6/1/2024)
  2. Strange Loop Networks (6/6/2024)
  3. Modality Curriculum (6/9/2024)
  4. Performing a Lobotomy on Llama3 8B (6/14/2024)
  5. Proving vanilla SGD just memorizes, the inspiration for the gradient agreement filtering (GAF) paper. (10/1/2024)
  6. Perhaps the Golden Rule is a Nash Equilibrium point for sufficient intelligence, and we can sleep easy while developing AGI (10/4/2024)
  7. Is diffusion all you need? Is diffusion the free lunch we have all been waiting for? I think so! (11/28/2024)
  8. Intelligence complexity is not model complexity; it must include training dynamics and data as well. (1/10/2025)
  9. My master plan to solve AGI (1/12/2025)
  10. Some VERY interesting plots for my upcoming paper "Generalization Efficiency Scaling Laws for RNNs without BPTT" (1/28/2025)
  11. My Journey Going from First Order to Zeroth Order to Reduce the Cost of Intelligence 1-Million-Fold. (2/1/2025)
  12. Are "you" your dual agent training your primary agent? Warning: May break your brain. (2/12/2025)
  13. Novel Token Spickets: Put yourself right next to the source of novel tokens, not second-hand smoke, but first-hand. (2/20/2025)
  14. Future of Work and the Rate of Automation (2/21/2025)
  15. All you need is a World Model + Policy + Value Model; that's it. (4/15/2025)
  16. What's Greek to some is native tongue (6/8/2025)
  17. Brains don't backprop; neurons are one-way! (6/27/2025)
  18. The Copernican Epiphany: Realizing You’re Not the Center of the Universe (7/3/2025)
  19. An After Action Review (AAR) for scientific progress over the last 1000 years: Humanity’s Regret Curve (7/10/2025)
  20. (Train Time) Recurrence as a necessary condition for General Intelligence (8/18/2025)
  21. Le Châtelier's Principle applied to startups, software, and AI (8/19/2025)