The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling
Updated Jun 11, 2024 - Python
Testing KAN-based GPT models for text generation
Kolmogorov–Arnold Networks (KAN) in PyTorch
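The defining idea of a KAN layer is that each edge between neurons carries its own learnable univariate activation function, rather than a fixed scalar weight. Below is a minimal, dependency-free sketch of that structure; the class name, the choice of Gaussian radial basis functions (the KAN paper uses B-splines), and all parameters are illustrative assumptions, not the API of any listed repository.

```python
import math
import random

class KANLayer:
    """Minimal KAN-style layer sketch: each edge (i -> j) carries its own
    learnable univariate function, here parameterized as a weighted sum of
    Gaussian radial basis functions on [-1, 1]. (Real KAN implementations
    typically use B-spline bases plus a residual activation.)"""

    def __init__(self, in_dim, out_dim, num_basis=5, seed=0):
        rng = random.Random(seed)
        self.in_dim, self.out_dim = in_dim, out_dim
        # Basis-function centers spread evenly over [-1, 1].
        self.centers = [-1 + 2 * k / (num_basis - 1) for k in range(num_basis)]
        # One learnable coefficient vector per edge (out_dim x in_dim x num_basis).
        self.coef = [[[rng.gauss(0, 0.1) for _ in range(num_basis)]
                      for _ in range(in_dim)] for _ in range(out_dim)]

    def edge_fn(self, j, i, x):
        # Learnable activation on edge i -> j: sum_k c_k * exp(-(x - mu_k)^2).
        return sum(c * math.exp(-(x - mu) ** 2)
                   for c, mu in zip(self.coef[j][i], self.centers))

    def forward(self, x):
        # Each output node j simply sums its per-edge activations; there are
        # no separate linear weights, matching the Kolmogorov-Arnold form.
        return [sum(self.edge_fn(j, i, xi) for i, xi in enumerate(x))
                for j in range(self.out_dim)]
```

Stacking two such layers mirrors the two-level structure of the Kolmogorov–Arnold representation: inner univariate functions of each coordinate, then outer univariate functions of their sums.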
NeuroBender, where the mystical powers of KAN network meet the robust stability of autoencoders!
An implementation of the KAN architecture using learnable activation functions for knowledge distillation on the MNIST handwritten digits dataset. The project distills a three-layer teacher KAN model into a more compact two-layer student model and compares the performance of the distilled student against a non-distilled baseline.
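Knowledge distillation of this kind typically trains the student on a mix of soft teacher targets and hard labels. The sketch below shows the standard temperature-scaled distillation loss (Hinton et al.); the function names, temperature, and mixing weight are illustrative assumptions, not taken from the repository above.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher T yields a softer distribution.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=4.0, alpha=0.5):
    """Blend of a soft-target term (teacher vs. student at temperature T)
    and a hard-target term (standard cross-entropy with the true label)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft term, scaled by T^2 so its gradient magnitude matches the hard term.
    soft = -sum(t * math.log(s)
                for t, s in zip(p_teacher, p_student)) * temperature ** 2
    # Hard term: cross-entropy of the unscaled student distribution.
    q = softmax(student_logits)
    hard = -math.log(q[true_label])
    return alpha * soft + (1 - alpha) * hard
```

In a full training loop this loss would be computed per batch and backpropagated through the student only, with the teacher's parameters frozen.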
Given a β-Hölder continuous function f: [0,1]^d → R, this deep ReLU network approximates f at rate 2^(−Kβ) using on the order of 2^(Kd) parameters. Here, K is a fixed positive integer and d is the input dimension.
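The rate claim above can be written out explicitly. This is a sketch of the standard Hölder-class statement, with the constant C and the uniform norm assumed rather than quoted from the repository:

```latex
% For f in the beta-Hölder ball on [0,1]^d and a fixed K in N, there exists
% a deep ReLU network \hat{f} with O(2^{Kd}) parameters such that
\| f - \hat{f} \|_{L^\infty([0,1]^d)} \;\le\; C \, 2^{-K\beta}.
% Incrementing K multiplies the error bound by 2^{-\beta} but the parameter
% count by 2^{d}, which is the usual curse-of-dimensionality trade-off.
```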