<aside> 💡 My notes from Karpathy’s YouTube tutorial on building GPT, watched around January 20, 2024. These notes are meant to help me recall the things I want to make sure I remember.
</aside>
Google Colab notebook for the lecture!
<aside> 💡 My own note: one of the biggest differentiators between OpenAI’s system and a GPT that I build on my laptop is probably the data the model is trained on. Elon Musk talks about this on Lex Fridman’s podcast: the model’s codebase is actually quite simple, and the heavy lift is filtering through all of the noise on the internet to get good data to train the model on.
</aside>
first, pull out all unique characters that occur in the text
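A quick sketch of that step (assuming the tiny Shakespeare file from the lecture has already been downloaded as input.txt):

```python
# read in the training text (input.txt = tiny shakespeare from the lecture)
with open('input.txt', 'r', encoding='utf-8') as f:
    text = f.read()

# the sorted set of unique characters in the text is the vocabulary
chars = sorted(list(set(text)))
vocab_size = len(chars)
print(''.join(chars))
print(vocab_size)  # 65 for tiny shakespeare
```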
will then map characters to integers (encode = characters -> integers translation, vice versa for decode)
you do this using lambda functions!
<aside>
💡 side note for me: lambda functions are small, anonymous functions that are usually used for short, simple, one-off operations. their notation is: lambda parameters: expression
</aside>
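A tiny example of my own to make the notation concrete:

```python
# lambda parameters: expression -- the two definitions below are equivalent
square = lambda x: x * x
def square_def(x):
    return x * x

print(square(4))      # 16
print(square_def(4))  # 16
```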
```python
# create a mapping from characters to integers
stoi = { ch:i for i,ch in enumerate(chars) }
itos = { i:ch for i,ch in enumerate(chars) }
encode = lambda s: [stoi[c] for c in s]           # encoder: take a string, output a list of integers
decode = lambda l: ''.join([itos[i] for i in l])  # decoder: take a list of integers, output a string

print(encode("hii there"))          # [46, 47, 47, 1, 58, 46, 43, 56, 43]
print(decode(encode("hii there")))  # hii there
```
can do encoding in different ways; the scheme above is character-level, one of the simplest. Google, for example, uses SentencePiece, which encodes at the sub-word level instead of character by character
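A rough sketch of what that looks like with the sentencepiece Python package (the file name, model_prefix, and vocab_size below are my own illustrative choices, not from the lecture):

```python
import sentencepiece as spm

# train a small sub-word tokenizer on the same text file (parameters are illustrative)
spm.SentencePieceTrainer.train(
    input='input.txt', model_prefix='shakespeare_sp', vocab_size=500
)

# load the trained model and round-trip a string, like the character-level example above
sp = spm.SentencePieceProcessor(model_file='shakespeare_sp.model')
ids = sp.encode('hii there', out_type=int)  # string -> list of sub-word ids
print(ids)
print(sp.decode(ids))                       # list of ids -> string
```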