KV Caching — By Hand

May 12

Doing Context Augmented Generation by hand

2 Comments

Great article! When calculating the attention in the second iteration, should "New Query" be [1.18, 1.15, 1.36, 1.11] instead of [0.80, 0.85, 0.30, 1.07] (used in first iteration)?



Daniel Warfield

May 14

Probably, great catch! The by-hand articles take a fair number of steps to produce, and many of them are prone to human error, so I likely missed an edit.

Under the hood I'm doing the calculations with NumPy, so the end results should be fine; the issues are typically in transcription.
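For readers following the thread, here is a minimal NumPy sketch of the point being discussed: with KV caching, each iteration computes a fresh query from the newest token and attends over the cached keys and values, so the query in the second iteration differs from the one in the first. This is not the article's actual code. The weights, token embeddings, dimension `d`, the `decode_step` helper, and the `sqrt(d)` scaling are all illustrative assumptions, and the numbers will not match the figures in the article.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # embedding size (assumed for illustration)

# Hypothetical projection weights; the article's real values are not reproduced here.
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))


def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()


def decode_step(x_new, k_cache, v_cache):
    """Attend from the newest token only, reusing cached keys and values."""
    q_new = x_new @ W_q  # the query is recomputed from the *current* token
    k_cache = np.vstack([k_cache, x_new @ W_k])  # append this token's key
    v_cache = np.vstack([v_cache, x_new @ W_v])  # append this token's value
    weights = softmax(q_new @ k_cache.T / np.sqrt(d))
    return weights @ v_cache, k_cache, v_cache


tokens = rng.normal(size=(3, d))  # made-up token embeddings
k_cache = np.empty((0, d))
v_cache = np.empty((0, d))
outputs = []
for x in tokens:
    out, k_cache, v_cache = decode_step(x, k_cache, v_cache)
    outputs.append(out)
outputs = np.array(outputs)

# Reference: full causal attention recomputed from scratch, for comparison.
Q, K, V = tokens @ W_q, tokens @ W_k, tokens @ W_v
scores = Q @ K.T / np.sqrt(d)
scores[np.triu_indices(len(tokens), k=1)] = -np.inf  # mask future positions
full = np.array([softmax(row) for row in scores]) @ V

print(np.allclose(outputs, full))
```

The final check compares the cached, step-by-step outputs against full causal attention computed in one pass. Reusing the first iteration's query in the second step would break that agreement, which is the kind of slip the comment above points out.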


Intuitively and Exhaustively Explained
