Posts

DoLa – Decoding by Contrasting Layers Improves Factuality in Large Language Models

Have you ever talked to an LLM and gotten an answer that sounds like it's been drinking vending-machine coffee all night? 😂 That's what we call a hallucination in the LLM world! But don't worry, because it's not that your language model is crazy (although it can sometimes seem that way 🤪). The truth is that LLMs can be a bit... creative when it comes to generating text. But thanks to DoLa, a decoding method that contrasts the outputs of different layers to improve the factuality of LLMs, we can keep our language models from turning into science fiction writers 😂. In this post, I'll explain how DoLa works and show you a code example so you can better understand how to make your LLMs more reliable and less prone to making up stories. Let's save our LLMs from insanity and make them more useful! 🚀
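
If you want to try it at home, here is a minimal sketch of DoLa decoding with Hugging Face transformers; it assumes a recent transformers release that exposes the `dola_layers` argument of `generate`, and the model name is just an example.

```python
# Minimal sketch: DoLa decoding via transformers' `generate`.
# Assumptions: a recent transformers version with DoLa support, any causal LM works.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # example model, swap for any causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("What is the capital of France?", return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=50,
    dola_layers="high",      # contrast the final layer with the upper "premature" layers
    repetition_penalty=1.2,  # usually recommended together with DoLa
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```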

QLoRA: Efficient Finetuning of Quantized LLMs

Hello everyone! 🤗 Today we are going to talk about QLoRA, the technique that lets you fine-tune your language models efficiently, even on modest hardware ⏱️. But how does it do it? 🤔 Well, first it uses quantization to reduce the size of the model weights, which saves a lot of memory 📈. Then, it applies LoRA (Low-Rank Adaptation), which is like a superpower that lets the model adapt to new data without retraining everything from scratch 💪. And, so you can see how it works in practice, I leave you with a code example that will make you say 'Eureka!' 🎉 Let's dive into the world of QLoRA and discover how we can make our models smarter and more efficient! 🤓
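
In the meantime, here is a minimal QLoRA sketch with transformers, bitsandbytes and peft; the model name and hyperparameters are just examples.

```python
# Minimal QLoRA sketch: 4-bit quantized base model + trainable LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-2-7b-hf"  # example model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize the base weights to 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4, as proposed in the QLoRA paper
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # do the math in bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # example target modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small LoRA matrices are trained
```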

GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers

Attention developers! 🚨 Do you have a language model that is too big and heavy for your application? 🤯 Don't worry, GPTQ is here to help you! 🤖 This quantization algorithm is like a wizard that makes unnecessary bits and bytes disappear, reducing the size of your model without losing too much precision. 🎩 It's like compressing a file without losing quality - it's a way to make your models more efficient and faster! 🚀
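
To give you an idea of what that looks like in code, here is a hedged sketch using the GPTQ integration in transformers; it assumes the optimum / auto-gptq backend is installed, and the model and calibration dataset are just examples.

```python
# Minimal sketch: post-training GPTQ quantization at load time.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_name = "facebook/opt-125m"  # small example model
tokenizer = AutoTokenizer.from_pretrained(model_name)

gptq_config = GPTQConfig(
    bits=4,        # quantize weights to 4 bits
    dataset="c4",  # calibration data used to minimize the quantization error
    tokenizer=tokenizer,
)

# The weights are quantized layer by layer while the model is loaded.
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=gptq_config, device_map="auto"
)
quantized_model.save_pretrained("opt-125m-gptq-4bit")
```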

llm.int8() – 8-bit Matrix Multiplication for Transformers at Scale

Get ready to save space and speed up your models! 💥 In this post, I'm going to explore the llm.int8() method, a quantization technique that lets you reduce the size of your machine learning models without sacrificing too much accuracy. 📊 That means you'll be able to run and deploy larger and more complex models in less space and with lower resource consumption! 💻 Let's see how to use llm.int8() with transformers to quantize a model and make it more efficient, without losing the essence of its artificial intelligence. 🤖
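
As a teaser, here is a minimal sketch of loading a model with llm.int8() through transformers and bitsandbytes; the model name is just an example.

```python
# Minimal sketch: 8-bit inference with llm.int8().
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "bigscience/bloom-1b7"  # example model

bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,       # weights stored in int8, outlier features handled in fp16
    llm_int8_threshold=6.0,  # outlier threshold from the llm.int8() paper
)

model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

print(model.get_memory_footprint())  # roughly half of the fp16 footprint
```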

LLMs quantization

Imagine having a giant language model that can answer any question, from the capital of France to the perfect brownie recipe! 🍞️🇫🇷 But what happens when that model has to fit on a mobile device? 📱 That's where quantization comes in! 🎉 This technique allows us to reduce the size of models without sacrificing their accuracy, which means we can enjoy artificial intelligence on our mobile devices without the need for a supercomputer. 💻 It's like compressing an elephant into a shoebox, but without crushing the elephant! 🐘😂
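
To see the core idea without any library magic, here is a toy sketch of absmax int8 quantization in PyTorch, the basic trick behind shrinking those weights.

```python
# Toy sketch: per-tensor absmax quantization of fp32 weights to int8.
import torch

def quantize_int8(w: torch.Tensor):
    scale = w.abs().max() / 127.0             # the largest weight maps to 127
    q = torch.round(w / scale).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor):
    return q.to(torch.float32) * scale

w = torch.randn(4, 4)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print("max error:", (w - w_hat).abs().max().item())               # small, but not zero
print("fp32 bytes:", w.numel() * 4, "-> int8 bytes:", q.numel())  # 4x smaller
```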

LoRA – low rank adaptation of large language models

Get ready to take your model adaptation to the next level with LoRA! 🚀 This low-rank adaptation technique is like a superhero cape for your neural networks: it helps them learn new tricks without forgetting the old ones 🤯. And the best thing about it? You can implement it in just a few lines of PyTorch code 💻. And if you're like me, GPU-poor and struggling with limited resources 💸, LoRA is like a godsend: it lets you adapt your models without training them from scratch or spending a fortune on hardware 🙏. Check out the post for a step-by-step guide and a practical example!
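
Just to show how few lines we are talking about, here is a minimal, illustrative LoRA layer in PyTorch (the sizes and hyperparameters are made up for the example).

```python
# Minimal LoRA sketch: frozen base weight + trainable low-rank update B @ A.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # the pretrained weight stays frozen
        self.base.bias.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)  # down-projection
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))        # up-projection, starts at zero
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output = frozen path + scaled low-rank adaptation path
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(128, 128, r=8)
out = layer(torch.randn(2, 128))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(out.shape, "trainable params:", trainable)  # only A and B are trained
```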

Fine tuning Florence-2

You've already got Florence-2 on your radar, but now you want to take it to the next level! 🚀 It's time for fine-tuning! 💻 In this post, I'll walk you step-by-step through the process of adapting this multimodal model to your specific needs. 📊 From preparing your data to setting up hyperparameters, I'll show you how to get the most out of Florence-2. 💡 With PyTorch and Python, we'll make this model fit your needs and become your trusted tool for solving language and vision tasks. 📈 So roll up your sleeves and make Florence-2 shine in all its glory! ✨
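
As a preview, here is a hedged sketch of a single training step; it assumes Florence-2 is loaded from the Hub with trust_remote_code and that your data comes as (task prompt, image, target text) triplets, with all names being illustrative.

```python
# Hedged sketch of one Florence-2 fine-tuning step (names are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-base-ft"  # assumed checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True).to(device)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-6)

def train_step(prompt, image, target_text):
    inputs = processor(text=prompt, images=image, return_tensors="pt").to(device)
    labels = processor.tokenizer(
        target_text, return_tensors="pt", return_token_type_ids=False
    ).input_ids.to(device)

    outputs = model(
        input_ids=inputs["input_ids"], pixel_values=inputs["pixel_values"], labels=labels
    )
    loss = outputs.loss  # cross-entropy over the target tokens
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```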

Florence-2

Attention everyone! 🚨 We have a new king in town! 👑 Florence-2, the multimodal model that is revolutionizing the artificial intelligence game. 🤯 With only 200M parameters in its base version (or 700M in its large version, for those who want to go all out 💥), this model is sweeping the benchmarks, beating models with 10 and even 100 times more parameters. 🤯 It's like having a Swiss army knife in your AI toolkit! 🗡️ Modify the prompt and voila, Florence-2 adapts to any task you need. 🔧 In short, Florence-2 is the new SOTA (State-Of-The-Art) model among MLLMs (Multimodal Large Language Models) and you can't miss it. 🚀 Let's see what other surprises this beast has in store! 🤔
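
If you want a quick taste of that "change the prompt, change the task" magic, here is a hedged inference sketch; the model id, the image URL and the task tokens are assumptions based on the model card.

```python
# Hedged sketch: Florence-2 inference, where the prompt selects the task.
import requests
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-base"  # assumed checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True).to(device)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open(requests.get("https://example.com/cat.jpg", stream=True).raw)  # placeholder URL
task = "<CAPTION>"  # swap for "<OD>", "<OCR>", etc. to switch tasks

inputs = processor(text=task, images=image, return_tensors="pt").to(device)
generated_ids = model.generate(
    input_ids=inputs["input_ids"], pixel_values=inputs["pixel_values"], max_new_tokens=128
)
text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
print(processor.post_process_generation(text, task=task, image_size=image.size))
```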

GPT1 – Improving Language Understanding by Generative Pre-Training

Unlock the power of language! 💥 In my latest post, I take you by the hand through the GPT-1 paper, explaining in a clear and concise way how this pioneering model in natural language processing works. And not only that! I also show you how to fine-tune the model so you can adapt it to your specific needs 📊 Don't miss the opportunity to learn about one of the most influential models in NLP history! 🚀 Read my post and find out how you can improve your artificial intelligence skills! 📄

Hugging Face Transformers

🤖 Transform your world with Hugging Face Transformers! 🚀 Ready to make magic with natural language? From super-fast techniques with pipeline 🌪️ to ninja tricks with AutoModel 🥷, this post takes you by the hand on an epic adventure into the NLP universe. Explore how to generate text that surprises, train models that dazzle, and share your creations on the Hugging Face Hub like a pro. Get ready to code and laugh, because the future of NLP is now and it's hilarious! 😂
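
Here is a tiny sketch of the two styles the post plays with, the high-level pipeline and the lower-level Auto classes; the model name is just a small example.

```python
# Minimal sketch: text generation with pipeline vs. the Auto classes.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# 1) The quick way: pipeline handles tokenization, inference and decoding.
generator = pipeline("text-generation", model="gpt2")
print(generator("Hugging Face Transformers makes NLP", max_new_tokens=20)[0]["generated_text"])

# 2) The "ninja" way: load tokenizer and model explicitly for full control.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Hugging Face Transformers makes NLP", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```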

CSS

CSS, or Cascading Style Sheets, is a fundamental technology in web design that allows developers and designers to style and present HTML documents in a sophisticated and efficient manner. Through CSS, we can control layout, colors, fonts, and much more, enabling the creation of rich and visually appealing user experiences. This post explores the basics of CSS, from selectors and properties to the box model, offering a detailed guide for those looking to get started or improve their web design skills. With practical examples and helpful tips, it demonstrates how CSS makes it easy to separate content from presentation, improving the accessibility and maintainability of web projects.

OpenAI API

🚀 Discover the power of the OpenAI API in this post! 🌟 Learn how to install the OpenAI library ✨, and I'll guide you through the first steps to become an artificial intelligence guru. 🤖 No matter if you're a curious beginner or a coding expert looking for new adventures, this post has everything you need to get started. Get ready to explore the universe of GPT for text generation and DALL-E for image generation, all with a touch of fun and a lot of innovation! 🎉👩‍💻 Dive into the exciting world of AI and start your journey to unlimited creativity! 🌈💻
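
To get you started, here is a hedged sketch with the official openai Python client; it assumes OPENAI_API_KEY is set in your environment and that the model names shown are available to your account.

```python
# Hedged sketch: text generation and image generation with the OpenAI API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Text generation with a GPT model
chat = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name
    messages=[{"role": "user", "content": "Explain what an LLM hallucination is in one sentence."}],
)
print(chat.choices[0].message.content)

# Image generation with DALL-E
image = client.images.generate(
    model="dall-e-3",  # assumed model name
    prompt="A robot reading a blog post about AI, digital art",
    n=1,
    size="1024x1024",
)
print(image.data[0].url)
```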


Last posts -->

Have you seen these projects?

Subtify

Subtitle generator for videos in whatever language you want. It also gives each speaker's subtitles a different color.

View all projects -->

Do you want to apply AI in your project? Contact me!

Do you want to improve with these tips?

Last tips -->

Use this locally

Hugging Face Spaces let us run models through very simple demos, but what if the demo breaks, or its creator deletes it? That's why I've created Docker containers for some interesting Spaces, so they can be used locally no matter what happens. In fact, if you click on any project's view button, it may take you to a Space that no longer works.

View all containers -->


Do you want to train your model with these datasets?

short-jokes-dataset

Dataset with jokes in English

opus100

Dataset with translations from English to Spanish

netflix_titles

Dataset with Netflix movies and series

View more datasets -->