2 Comments
Leo He

Great article! When calculating the attention in the second iteration, should "New Query" be [1.18, 1.15, 1.36, 1.11] instead of [0.80, 0.85, 0.30, 1.07] (used in first iteration)?

Daniel Warfield

Probably, great catch! The by-hand articles take a fair number of steps that are prone to human error, so I likely missed an edit.

Under the hood I'm doing it with NumPy, so the end results should be fine; the issues are typically in transcription.
