- A Survey of LLMs
- Large Language Models: A Survey: 🏆Well-organized visuals and content [9 Feb 2024]
- A Survey of Transformers:[cnt] [8 Jun 2021]
- A Survey of Large Language Models:[cnt] [v1: 31 Mar 2023 - v15: 13 Oct 2024]
- A Primer on Large Language Models and their Limitations: A primer on LLMs, their strengths, limits, applications, and research, for academia and industry use. [3 Dec 2024]
- Google AI Research Recap
- Gemini [06 Dec 2023]: Three sizes: Ultra, Pro, and Nano. With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU. ref
- Google AI Research Recap (2022 Edition)
- Themes from 2021 and Beyond
- Looking Back at 2020, and Forward to 2021
- Microsoft Research Recap
- Research at Microsoft 2023: A year of groundbreaking AI advances and discoveries
- Survey of Hallucination in Natural Language Generation:[cnt] [8 Feb 2022]
- A Survey on In-context Learning:[cnt] [31 Dec 2022]
- A Survey on Transformers in Reinforcement Learning:[cnt] [8 Jan 2023]
- Multimodal Deep Learning:[cnt] [12 Jan 2023]
- A Survey on Efficient Training of Transformers:[cnt] [2 Feb 2023]
- A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT:[cnt] [7 Mar 2023]
- An Overview on Language Models: Recent Developments and Outlook:[cnt] [10 Mar 2023]
- Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning:[cnt] [28 Mar 2023]
- Summary of ChatGPT/GPT-4 Research and Perspective Towards the Future of Large Language Models:[cnt] [4 Apr 2023]
- A Cookbook of Self-Supervised Learning:[cnt] [24 Apr 2023]
- Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond:[cnt] [26 Apr 2023]
- Challenges & Application of LLMs:[cnt] [11 Jun 2023]
- A Survey on Multimodal Large Language Models:[cnt] [23 Jun 2023]
- A Survey on Evaluation of Large Language Models:[cnt] [6 Jul 2023]
- A Survey of Techniques for Optimizing Transformer Inference:[cnt] [16 Jul 2023]
- Efficient Guided Generation for Large Language Models:[cnt] [19 Jul 2023]
- Survey of Aligned LLMs:[cnt] [24 Jul 2023]
- Foundation Models in Vision:[cnt] [25 Jul 2023]
- Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback:[cnt] [27 Jul 2023]
- Universal and Transferable Adversarial Attacks on Aligned Language Models:[cnt] [27 Jul 2023]
- SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension: [cnt] [30 Jul 2023]
- Trustworthy LLMs:[cnt] [10 Aug 2023]
- Model Compression for LLMs:[cnt] [15 Aug 2023]
- Survey on Instruction Tuning for LLMs:[cnt] [21 Aug 2023]
- A Survey on LLM-based Autonomous Agents:[cnt] [22 Aug 2023]
- A Survey of LLMs for Healthcare:[cnt] [9 Oct 2023]
- Overview of Factuality in LLMs:[cnt] [11 Oct 2023]
- Evaluating Large Language Models: A Comprehensive Survey:[cnt] [30 Oct 2023]
- Hallucination in LLMs:[cnt] [9 Nov 2023]
- A Survey on Language Models for Code:[cnt] [14 Nov 2023]
- ChatGPT’s One-year Anniversary: Are Open-Source Large Language Models Catching up? > Evaluation benchmark: Benchmarks and Performance of LLMs [28 Nov 2023]
- Data Management For Large Language Models: A Survey [4 Dec 2023]
- A Survey of Reasoning with Foundation Models [17 Dec 2023]
- Retrieval-Augmented Generation for Large Language Models: A Survey [cnt] [18 Dec 2023]
- From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape:[cnt] [18 Dec 2023]
- Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems: The survey aims to provide a comprehensive understanding of the current state and future directions in efficient LLM serving [23 Dec 2023]
- Mitigating Hallucination in LLMs: Summarizes 32 techniques to mitigate hallucination in LLMs [cnt] [2 Jan 2024]
- A Comprehensive Survey of Compression Algorithms for Language Models [27 Jan 2024]
- A Survey on Retrieval-Augmented Text Generation for Large Language Models [17 Apr 2024]
- A Survey on Mixture of Experts [26 Jun 2024]
- A Survey of Prompt Engineering Methods in Large Language Models for Different NLP Tasks [17 Jul 2024]
- A Survey of NL2SQL with Large Language Models: Where are we, and where are we going?: [9 Aug 2024] git
- What is the Role of Small Models in the LLM Era: A Survey [10 Sep 2024]
- Small Language Models: Survey, Measurements, and Insights [24 Sep 2024]
- A Survey on Data Synthesis and Augmentation for Large Language Models [16 Oct 2024]
- A Comprehensive Survey of Small Language Models in the Era of Large Language Models [4 Nov 2024]
- A Survey on LLM-as-a-Judge [23 Nov 2024]
- Large Language Model-Brained GUI Agents: A Survey [27 Nov 2024]
- A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges [16 Dec 2024]
- GUI Agents: A Survey [18 Dec 2024]
- Evolutionary Tree of Large Language Models: x-ref
- How real-world businesses are transforming with AI:💡Collected over 200 examples of how organizations are leveraging Microsoft’s AI capabilities. [12 Nov 2024]
- Anthropic Clio: Privacy-preserving insights into real-world AI use [12 Dec 2024]
- Google: 321 real-world gen AI use cases from the world's leading organizations [19 Dec 2024]
- State of AI
- Retool: State of AI: A Report on AI in Production, 2023 -> 2024
- The State of Generative AI in the Enterprise [ⓒ2023]
  1. 96% of AI spend is on inference, not training.
  2. Only 10% of enterprises pre-trained their own models.
  3. 85% of models in use are closed-source.
  4. 60% of enterprises use multiple models.
- Stanford AI Index Annual Report
- State of AI Report 2024 [10 Oct 2024]
- LangChain > State of AI Agents [19 Dec 2024]
- An unnecessarily tiny implementation of GPT-2 in NumPy. picoGPT: Transformer Decoder [Jan 2023]
```python
q = x @ w_q  # [n_seq, n_embd] @ [n_embd, n_embd] -> [n_seq, n_embd]
k = x @ w_k  # [n_seq, n_embd] @ [n_embd, n_embd] -> [n_seq, n_embd]
v = x @ w_v  # [n_seq, n_embd] @ [n_embd, n_embd] -> [n_seq, n_embd]
# In picoGPT, w_q, w_k and w_v are combined into a single matrix w_fc
x = x @ w_fc  # [n_seq, n_embd] @ [n_embd, 3*n_embd] -> [n_seq, 3*n_embd]
```
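A minimal sketch, in the same NumPy style, of how these projections can feed scaled dot-product attention with a causal mask (the helper names, shapes, and toy usage below are illustrative assumptions, not picoGPT's actual code):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(q, k, v):  # q, k, v: [n_seq, n_embd]
    n_seq, n_embd = q.shape
    # causal mask: each position attends only to itself and earlier positions
    mask = np.triu(np.full((n_seq, n_seq), -1e10), k=1)
    scores = q @ k.T / np.sqrt(n_embd) + mask       # [n_seq, n_seq]
    return softmax(scores) @ v                      # [n_seq, n_embd]

# toy usage with random weights
n_seq, n_embd = 4, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(n_seq, n_embd))
w_q, w_k, w_v = (rng.normal(size=(n_embd, n_embd)) for _ in range(3))
print(attention(x @ w_q, x @ w_k, x @ w_v).shape)  # (4, 8)
```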
- lit-gpt: Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed. git [Mar 2023]
- pix2code: Generating Code from a Graphical User Interface Screenshot. Trained dataset as a pair of screenshots and simplified intermediate script for HTML, utilizing image embedding for CNN and text embedding for LSTM, encoder and decoder model. Early adoption of image-to-code. [May 2017]
- Screenshot to code: Turning Design Mockups Into Code With Deep Learning [Oct 2017] ref
- Build a Large Language Model (From Scratch):🏆Implementing a ChatGPT-like LLM from scratch, step by step
- Spreadsheets-are-all-you-need: Implements the forward pass of GPT-2 entirely in Excel using standard spreadsheet functions. [Sep 2023]
- llm.c: LLM training in simple, raw C/CUDA [Apr 2024] | Reproducing GPT-2 (124M) in llm.c in 90 minutes for $20 ref
- llama3-from-scratch: Implementing Llama3 from scratch [May 2024]
- Umar Jamil github:💡LLM Model explanation / building a model from scratch 📺
- Andrej Karpathy📺: Reproduce the GPT-2 (124M) from scratch. [June 2024] / SebastianRaschka📺: Developing an LLM: Building, Training, Finetuning [June 2024]
- Transformer Explainer: an open-source interactive tool to learn about the inner workings of a Transformer model (GPT-2) git [8 Aug 2024]
- Beam Search [1977] in Transformers is an inference algorithm that maintains the `beam_size` most probable sequences until the end token appears or the maximum sequence length is reached. If `beam_size` (k) is 1, it is a `Greedy Search`; if k equals the total vocabulary size, it is an `Exhaustive Search`. ref [Mar 2022] (a minimal sketch follows below)
- Einsum is All you Need: Einstein Summation [5 Feb 2018]
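A minimal sketch of the beam-search decoding loop described in the Beam Search entry above, assuming a hypothetical `next_log_probs(seq)` callable that returns one log-probability per vocabulary token (with `beam_size=1` this reduces to greedy search):

```python
import numpy as np

def beam_search(next_log_probs, bos_id, eos_id, beam_size=3, max_len=20):
    # each beam is (token_ids, cumulative log-probability)
    beams = [([bos_id], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos_id:             # finished sequences are carried over unchanged
                candidates.append((seq, score))
                continue
            log_probs = next_log_probs(seq)   # hypothetical model call -> [vocab_size] log-probs
            for tok in np.argsort(log_probs)[-beam_size:]:   # expand with the best next tokens
                candidates.append((seq + [int(tok)], score + float(log_probs[tok])))
        # keep only the beam_size most probable sequences overall
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        if all(seq[-1] == eos_id for seq, _ in beams):
            break
    return beams[0][0]

# toy usage with a uniform "model": every token is equally likely
print(beam_search(lambda seq: np.log(np.full(8, 1.0 / 8)), bos_id=1, eos_id=0))
```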
- You could have designed state of the art positional encoding: Binary Position Encoding, Sinusoidal positional encoding, Absolute vs Relative Position Encoding, Rotary Positional encoding [17 Nov 2024]
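Following the positional-encoding entry above, a minimal NumPy sketch of the sinusoidal variant from the original Transformer paper (an even `n_embd` is assumed for illustration):

```python
import numpy as np

def sinusoidal_positional_encoding(n_seq, n_embd):
    # PE[pos, 2i]   = sin(pos / 10000^(2i / n_embd))
    # PE[pos, 2i+1] = cos(pos / 10000^(2i / n_embd))
    pos = np.arange(n_seq)[:, None]                 # [n_seq, 1]
    i = np.arange(0, n_embd, 2)[None, :]            # [1, n_embd/2]
    angles = pos / np.power(10000, i / n_embd)      # [n_seq, n_embd/2]
    pe = np.zeros((n_seq, n_embd))
    pe[:, 0::2] = np.sin(angles)                    # even dimensions
    pe[:, 1::2] = np.cos(angles)                    # odd dimensions
    return pe

print(sinusoidal_positional_encoding(4, 8).shape)  # (4, 8)
```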
- ref: Must-Read Starter Guide to Mastering Attention Mechanisms in Machine Learning [12 Jun 2023]
- Soft Attention: Assigns continuous weights to all input elements. Used in neural machine translation.
- Hard Attention: Selects a subset of input elements to focus on while ignoring the rest. Requires specialized training (e.g., reinforcement learning). Used in image captioning.
- Global Attention: Attends to all input elements, capturing long-range dependencies. Suitable for tasks involving small to medium-length sequences.
- Local Attention: Focuses on a localized input region, balancing efficiency and context. Used in time series analysis.
- Self-Attention: Attends to parts of the input sequence itself, capturing dependencies. Core to models like BERT.
- Multi-head Self-Attention: Performs multiple self-attentions in parallel, capturing diverse features. Essential for transformers.
- Sparse Attention: Reduces computation by focusing on a limited selection of similarity scores in a sequence, resulting in a sparse matrix. It includes implementations like "strided" and "fixed" attention and is critical for scaling to very long sequences. ref [23 Oct 2020]
- Cross-Attention: Mixes two different embedding sequences, allowing the model to attend to information from both. In a Transformer, the step where information passes from the encoder to the decoder is known as cross-attention. Plays a vital role in tasks like machine translation. ref / ref [9 Feb 2023]
- Sliding Window Attention (SWA): Used in Longformer. It uses a fixed-size window of attention around each token, allowing the model to scale efficiently to long inputs. Each token attends to half the window size tokens on each side, significantly reducing memory overhead. ref (a minimal mask sketch follows below)
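A minimal sketch of the sliding-window idea as an additive attention mask in NumPy (the window size and the symmetric window are illustrative assumptions, not Longformer's actual implementation):

```python
import numpy as np

def sliding_window_mask(n_seq, window):
    # each position may attend to tokens within window // 2 positions on either side
    idx = np.arange(n_seq)
    allowed = np.abs(idx[:, None] - idx[None, :]) <= window // 2
    # additive mask: 0 where attention is allowed, -1e10 where it is blocked
    return np.where(allowed, 0.0, -1e10)

print(sliding_window_mask(6, 2).astype(int))
```

Adding this mask to the raw attention scores before the softmax restricts each token to a fixed-size local neighborhood, which is what keeps memory and compute roughly linear in sequence length.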
- LLM 研究プロジェクト: ブログ記事一覧: LLM research project: list of blog posts [27 Jul 2023]
- ブレインパッド社員が投稿した Qiita 記事まとめ: Summary of Qiita articles posted by BrainPad employees [Jul 2023]
- rinna: rinna の 36 億パラメータの日本語 GPT 言語モデル: 3.6 billion parameter Japanese GPT language model [17 May 2023]
- rinna: bilingual-gpt-neox-4b: 日英バイリンガル大規模言語モデル: Japanese-English bilingual large language model [17 May 2023]
- 法律:生成 AI の利用ガイドライン: Legal: Guidelines for the Use of Generative AI
- New Era of Computing - ChatGPT がもたらした新時代: The new era ushered in by ChatGPT [May 2023]
- 大規模言語モデルで変わる ML システム開発: ML system development that changes with large-scale language models [Mar 2023]
- GPT-4 登場以降に出てきた ChatGPT/LLM に関する論文や技術の振り返り: Review of ChatGPT/LLM papers and technologies that have emerged since the advent of GPT-4 [Jun 2023]
- LLM を制御するには何をするべきか?: How to control LLM [Jun 2023]
- 1. 生成 AI のマルチモーダルモデルでできること: What can be done with multimodal models of generative AI 2. 生成 AI のマルチモーダリティに関する技術調査: Technical survey on the multimodality of generative AI [Jun 2023]
- LLM の推論を効率化する量子化技術調査: Survey of quantization techniques to make LLM inference more efficient [Sep 2023]
- LLM の出力制御や新モデルについて: About LLM output control and new models [Sep 2023]
- Azure OpenAI を活用したアプリケーション実装のリファレンス: 日本マイクロソフト リファレンスアーキテクチャ: Reference for application implementation using Azure OpenAI; Microsoft Japan reference architecture [Jun 2023]
- 生成 AI・LLM のツール拡張に関する論文の動向調査: Survey of trends in papers on tool extensions for generative AI and LLM [Sep 2023]
- LLM の学習・推論の効率化・高速化に関する技術調査: Technical survey on improving the efficiency and speed of LLM learning and inference [Sep 2023]
- 日本語LLMまとめ - Overview of Japanese LLMs: Information on publicly available Japanese LLMs (LLMs trained mainly on Japanese) and Japanese LLM evaluation benchmarks [Jul 2023]
- Azure OpenAI Service で始める ChatGPT/LLM システム構築入門: Introduction to building ChatGPT/LLM systems with Azure OpenAI Service; sample programs [Aug 2023]
- Azure OpenAI と Azure Cognitive Search の組み合わせを考える: Considering the combination of Azure OpenAI and Azure Cognitive Search [24 May 2023]
- Matsuo Lab: 人工知能・深層学習を学ぶためのロードマップ: Roadmap for learning artificial intelligence and deep learning ref / doc [Dec 2023]
- AI事業者ガイドライン: Guidelines for AI business operators [Apr 2024]
- LLMにまつわる"評価"を整理する: Organizing the "evaluation" of LLMs [06 Jun 2024]
- コード生成を伴う LLM エージェント: LLM agents with code generation [18 Jul 2024]
- Japanese startup Orange uses Anthropic's Claude to translate manga into English: [02 Dec 2024]
- Machine Learning Study 혼자 해보기: Self-study guide for machine learning [Sep 2018]
- LangChain 한국어 튜토리얼: LangChain Korean tutorial [Feb 2024]
- AI 데이터 분석가 ‘물어보새’ 등장 – RAG와 Text-To-SQL 활용: Introducing the AI data analyst ‘물어보새’ – using RAG and Text-To-SQL [Jul 2024]
- LLM, 더 저렴하게, 더 빠르게, 더 똑똑하게: LLMs: cheaper, faster, smarter [09 Sep 2024]
- 생성형 AI 서비스: 게이트웨이로 쉽게 시작하기: Generative AI services: getting started easily with a gateway [07 Nov 2024]
- Harness를 이용해 LLM 애플리케이션 평가 자동화하기: Automating LLM application evaluation with Harness [16 Nov 2024]
- Attention Is All You Need: [cnt]: 🏆 The Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. [12 Jun 2017] Illustrated transformer
- Must read: the 100 most cited AI papers in 2022: doc [8 Mar 2023]
- The Best Machine Learning Resources: doc [20 Aug 2017]
- What are the most influential current AI Papers?: NLLG Quarterly arXiv Report 06/23 git [31 Jul 2023]
- OpenAI Cookbook: Examples and guides for using the OpenAI API
- gpt4free: For educational purposes only [Mar 2023]
- Comparing Adobe Firefly, DALL·E 2, OpenJourney, Stable Diffusion, and Midjourney: Generative AI for images [20 Jun 2023]
- Open Problems and Limitations of RLHF: [cnt]: Provides an overview of open problems and the limitations of RLHF [27 Jul 2023]
- IbrahimSobh/llms: Language models introduction with simple code. [Jun 2023]
- DeepLearning.ai Short courses [2023]
- DAIR.AI: Machine learning & NLP research (omarsar github)
- ML Papers of The Week [Jan 2023]
- Deep Learning cheatsheets for Stanford's CS 230: Super VIP Cheatsheet: Deep Learning [Nov 2019]
- LLM Visualization: A 3D animated visualization of an LLM with a walkthrough
- Best-of Machine Learning with Python:🏆A ranked list of awesome machine learning Python libraries. [Nov 2020]
- Large Language Models: Application through Production: A course on edX & Databricks Academy
- Large Language Model Course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. [Jun 2023]
- CNN Explainer: Learning Convolutional Neural Networks with Interactive Visualization [Apr 2020]
- Foundational concepts like Transformers, Attention, and Vector Database [Feb 2024]
- LLM FineTuning Projects and notes on common practical techniques [Oct 2023]
- But what is a GPT?📺🏆3blue1brown: Visual intro to transformers [Apr 2024]
- Daily Dose of Data Science [Dec 2022]
- Machine learning algorithms: ML algorithms implemented from scratch [Oct 2016]
- eugeneyan blog:💡Lessons from A year of Building with LLMs, Patterns for LLM Systems. git