About

MaximoFN

I’m Máximo Fernández and my goal is to help people learn Artificial Intelligence by publishing content.

Come in and learn everything you can


Projects

More projects →

Latest posts

  • DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
    Have you ever talked to an LLM and gotten an answer that sounds like it’s been drinking vending-machine coffee all night long? 😂 That’s what we call a “hallucination” in the LLM world! But don’t worry, it’s not that your language model is crazy (although it can sometimes seem that way 🤪). The truth is that LLMs can be a bit… creative when it comes to generating text. But thanks to DoLa, a method that contrasts the outputs of different layers to improve the factuality of LLMs, we can keep our language models from turning into science fiction writers 😂. In this post, I’ll explain how DoLa works and show you a code example so you can better understand how to make your LLMs more reliable and less prone to making up stories. Let’s save our LLMs from insanity and make them more useful! 🚀
  • QLoRA: Efficient Finetuning of Quantized LLMs
    Hello everyone! 🤗 Today we are going to talk about QLoRA, a technique that lets you fine-tune your language models far more efficiently ⏱️. But how does it do it? 🤔 Well, first it quantizes the model weights to shrink them, which saves memory 📈. Then it applies LoRA (Low-Rank Adaptation), which is like a superpower that lets the model adapt to new data by training only a small set of extra weights instead of retraining the whole model 💪. And so you can see how it works in practice, I’ll leave you with a code example that will make you say ‘Eureka!’ 🎉 Let’s dive into the world of QLoRA and discover how we can make our models smarter and more efficient! 🤓
  • GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
    Attention developers! 🚨 Do you have a language model that is too big and heavy for your application? 🤯 Don’t worry, GPTQ is here to help you! 🤖 This quantization algorithm is like a wizard that makes unnecessary bits and bytes disappear, reducing the size of your model without losing too much precision. 🎩 It’s like compressing a file while keeping almost all of its quality: a way to make your models lighter and faster! 🚀

Previous posts →