
Introduction
The Tensor Collective. Machine learning is inherently simple: take a random matrix, apply it to your input, and calculate a number that tells you how close you are to your target for that input. Now use an automatic differentiation package to work out how much you need to change the parameters to get a better model. Repeat this enough times and you’re on your way to the vast majority of deep learning algorithms....
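As a rough illustration of that loop, here is a minimal PyTorch sketch (not code from the post; the linear model, mean-squared-error loss, learning rate, and toy data are assumptions made for the example):

```python
import torch

# A "random matrix": a single linear layer with randomly initialised weights.
model = torch.nn.Linear(in_features=4, out_features=1)
loss_fn = torch.nn.MSELoss()                      # a number measuring closeness to the target
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Toy data standing in for real inputs and targets (assumed for the example).
inputs = torch.randn(32, 4)
targets = torch.randn(32, 1)

for step in range(1000):                          # "repeat this enough times"
    predictions = model(inputs)                   # apply the matrix to your input
    loss = loss_fn(predictions, targets)          # how close are we to the target?
    optimizer.zero_grad()
    loss.backward()                               # automatic differentiation
    optimizer.step()                              # nudge the parameters towards a better model
```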

Efficiency is good, but scale is better
Jan 9, 2025. Introduction. I recently came across a very interesting paper from Meta, Pagnoni et al. 2024. Its title, Byte Latent Transformer: Patches Scale Better Than Tokens, carries a rather interesting assumption: it may be more desirable to scale better than to simply perform better. The main result of the paper is, unsurprisingly, that their new transformer, BLT, scales better than previous SOTA techniques (Figure 1, Pagnoni et al. 2024)....

From Matrices to Transformers
Introduction. When I first read about the transformer architecture, I found the reasoning behind the decisions that govern the attention mechanism difficult to decipher from the equations and the many “let’s build a transformer from scratch in PyTorch” articles alone. I’ll give my best attempt here to decompose its building blocks for those interested in understanding the lower-level components that lead to the great models we see today....
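For a concrete anchor before diving into the full post, here is a minimal sketch of the scaled dot-product attention at the heart of that architecture (the tensor shapes and names are assumptions for illustration, not the post’s own code):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V."""
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k**0.5   # similarity of each query to each key
    weights = F.softmax(scores, dim=-1)           # rows sum to 1: a weighting over positions
    return weights @ V                            # weighted sum of the value vectors

# Toy example: a sequence of 5 tokens, each embedded in 8 dimensions (assumed sizes).
x = torch.randn(5, 8)
out = scaled_dot_product_attention(x, x, x)       # self-attention: Q, K, V all come from x
print(out.shape)                                  # torch.Size([5, 8])
```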

Hallucination isn't a bug, it's a feature
Simple insights into the cognition behind large language models. With the advent of large language models (LLMs), much research and debate has naturally revolved around the flaws these models exhibit. Ever wondered how ChatGPT manages to overlook something, yet will immediately apologise and correct its error once prompted? Here, we gently introduce how hallucinations relate to the cognition ongoing within a large language model, and discuss ways to probe the true knowledge within these models....