A PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling
An implementation showing how to use Kolmogorov-Arnold Networks (KANs) for classification and regression tasks.
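For orientation, here is a minimal sketch of a KAN-style layer in plain PyTorch. It is not taken from any of the repositories listed here; it parameterizes each edge's learnable 1-D activation as a linear combination of Gaussian radial basis functions over a fixed grid, a common simplification of the B-spline parameterization in the original KAN paper. All names, widths, and the basis choice are illustrative assumptions.

```python
# A minimal KAN-style layer sketch (assumptions: RBF basis instead of
# B-splines, fixed centers on [-2, 2], illustrative widths).
import torch
import torch.nn as nn


class KANLayer(nn.Module):
    def __init__(self, in_features: int, out_features: int, num_basis: int = 8):
        super().__init__()
        # Fixed RBF centers spread over the expected input range [-2, 2].
        self.register_buffer("centers", torch.linspace(-2.0, 2.0, num_basis))
        # One coefficient per (input, output, basis) triple: these weights
        # define the learnable activation on every edge of the layer.
        self.coeffs = nn.Parameter(torch.randn(in_features, out_features, num_basis) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in) -> RBF features: (batch, in, num_basis)
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) ** 2))
        # Sum phi_ij(x_i) over inputs i for every output j.
        return torch.einsum("bik,iok->bo", basis, self.coeffs)


# A two-layer KAN for a toy classification task.
model = nn.Sequential(KANLayer(4, 16), KANLayer(16, 3))
logits = model(torch.randn(32, 4))  # (32, 3) class scores
print(logits.shape)
```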
Improved LBFGS and LBFGS-B optimizers in PyTorch.
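As context for the optimizer repo, the sketch below shows how an LBFGS optimizer is driven in PyTorch. It uses the built-in torch.optim.LBFGS as a stand-in for the improved variants mentioned above, whose exact APIs may differ; the toy objective and hyperparameters are assumptions.

```python
# Driving LBFGS in PyTorch: the optimizer may evaluate the objective
# several times per step, so loss and gradients live in a closure.
import torch

x = torch.randn(10, requires_grad=True)
optimizer = torch.optim.LBFGS([x], lr=0.1, max_iter=20)

def closure():
    optimizer.zero_grad()
    loss = ((x - 3.0) ** 2).sum()  # toy quadratic objective (assumption)
    loss.backward()
    return loss

for _ in range(5):
    loss = optimizer.step(closure)
print(float(loss))
```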
Testing KAN-based GPT models for text generation
KANs for text classification on GLUE tasks
KAN meets Gram Polynomials
An implementation of the KAN architecture using learnable activation functions for knowledge distillation on the MNIST handwritten digits dataset. The project distills a three-layer teacher KAN into a more compact two-layer student and compares performance with and without distillation, as sketched below.
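A hedged sketch of that distillation setup, reusing the KANLayer class from the earlier sketch. The layer widths, temperature, and loss weighting are illustrative assumptions, not the project's actual hyperparameters; the loss is the standard Hinton-style soft/hard target combination.

```python
# Distilling a three-layer teacher KAN into a two-layer student
# (widths, T, and alpha are assumptions; KANLayer is defined above).
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(KANLayer(784, 64), KANLayer(64, 32), KANLayer(32, 10))
student = nn.Sequential(KANLayer(784, 32), KANLayer(32, 10))

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft targets: match the teacher's temperature-softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

x = torch.randn(32, 784)             # stand-in for flattened MNIST digits
labels = torch.randint(0, 10, (32,))
with torch.no_grad():
    t_logits = teacher(x)
loss = distillation_loss(student(x), t_logits, labels)
loss.backward()
```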
The repo for the MixKABRN Neural Network (Mixture of Kolmogorov-Arnold Bit Retentive Networks): an attempt to first adapt it for training on text, and later adjust it for other modalities.