Have we discovered an ideal gas law for AI? Head to
https://brilliant.org/WelchLabs/ to try Brilliant for free for 30 days and get 20% off an annual premium subscription.
Welch Labs Imaginary Numbers Book!
https://www.welchlabs.com/resources/imaginary-numbers-book
Welch Labs Posters:
https://www.welchlabs.com/resources
Support Welch Labs on Patreon!
https://www.patreon.com/welchlabs
Special thanks to Patrons: Juan Benet, Ross Hanson, Yan Babitski, AJ Englehardt, Alvin Khaled, Eduardo Barraza, Hitoshi Yamauchi, Jaewon Jung, Mrgoodlight, Shinichi Hayashi, Sid Sarasvati, Dominic Beaumont, Shannon Prater, Ubiquity Ventures, Matias Forti, Brian Henry, Tim Palade, Petar Vecutin
Learn more about WelchLabs!
https://www.welchlabs.com
TikTok:
https://www.tiktok.com/@welchlabs
Instagram:
https://www.instagram.com/welchlabs

REFERENCES
A Neural Scaling Law from the Dimension of the Data Manifold:
https://arxiv.org/pdf/2004.10802
First 2020 OpenAI Scaling Paper:
https://arxiv.org/pdf/2001.08361
GPT-3 Paper:
https://arxiv.org/pdf/2005.14165
Second 2020 OpenAI Scaling Paper:
https://arxiv.org/pdf/2010.14701
Google DeepMind “Chinchilla Scaling” Paper:
https://arxiv.org/abs/2203.15556
Nice summary of Chinchilla Scaling:
https://www.lesswrong.com/posts/6Fpvch8RR29qLEWNH/chinchilla-s-wild-implications
GPT-4 Technical Report:
https://arxiv.org/pdf/2303.08774
Nice Neural Scaling Laws Summary:
https://www.lesswrong.com/posts/Yt5wAXMc7D2zLpQqx/an-140-theoretical-models-that-predict-scaling-laws
Explaining Neural Scaling Laws:
https://arxiv.org/pdf/2102.06701
High Cost of Training GPT-4:
https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/
Nvidia V100 FLOPs:
https://lambdalabs.com/blog/demystifying-gpt-3
Nvidia V100 Original Price:
https://www.microway.com/hpc-tech-tips/nvidia-tesla-v100-price-analysis/#:~:text=Tesla%20GPU%20model,Key%20Points
Great paper on scaling up training infrastructure:
https://arxiv.org/pdf/2104.04473
Eight Things to Know about LLMs:
https://arxiv.org/abs/2304.00612
Emergent Properties of LLMs:
https://arxiv.org/abs/2206.07682
Theoretical Motivation for Cross Entropy (Section 6.2):
https://www.deeplearningbook.org/
Some papers that appear to surpass the compute-efficient frontier:
https://arxiv.org/pdf/2206.14486
https://arxiv.org/abs/2210.11399
Leaked GPT-4 training info:
https://patmcguinness.substack.com/p/gpt-4-details-revealed
https://www.semianalysis.com/p/gpt-4-architecture-infrastructure
https://epochai.org/blog/tracking-large-scale-ai-models