LM Studio on Mac: Install and take full advantage of local AI

  • LM Studio allows you to run LLM models locally on Macs with Apple Silicon, prioritizing privacy and full control over your data.
  • Requires macOS 13.4 or higher and at least 16 GB of RAM to smoothly handle 7–8B parameter models.
  • The app makes it easy to download, configure, and compare models like DeepSeek, Mistral, Gemma, or Phi, adjusting context and creativity.
  • Features like RAG and Developer mode turn your Mac into a local AI server useful for documents, development, and automation.


If you'd like to have your own "ChatGPT" running directly on your Mac without relying on the cloud, LM Studio is currently one of the most comprehensive and user-friendly options. Instead of sending your data to external servers, AI models run on your own machine, giving you complete control over privacy, resource consumption, and the type of model you want to use.

In this guide we'll see how to install, configure, and get the most out of LM Studio on a Mac. We'll cover what requirements you need, which models are best suited to your hardware, and how to leverage advanced features like Developer mode and RAG to work with your own documents. The goal is for you to finish this article with your first local model running on your Mac, ready to help you program, write, summarize texts, or simply experiment with AI.

What is LM Studio and why is it worth using on a Mac?

LM Studio is a free desktop application designed to download and run LLM models (Large Language Models) locally. It works like an AI chat in the style of ChatGPT, but all the processing is done on your computer, without needing an account, an API, or a permanent internet connection.

The interface is very visual and easy to use: you choose a model from the Discover section, download it, load it in the Chats section, and start chatting. Everything is based on GGUF-format models and, on Mac, also MLX models optimized for Metal, so it makes excellent use of Apple Silicon chips (M1, M2, M3, and M4).

The models that run in LM Studio can generate coherent and creative text, summarize documents, translate, extract ideas, or analyze information. You can also expose LM Studio as an OpenAI API-compatible server to use your local models from other applications, making it a completely private "AI headquarters" on your Mac.
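To illustrate that last point, here is a minimal sketch of how another application might talk to LM Studio's local server using only the Python standard library. The address http://localhost:1234/v1 is LM Studio's usual default, but the port and endpoint path here are assumptions; check the Developer tab for the actual values on your machine.

```python
import json
import urllib.request

# Assumed default address of LM Studio's OpenAI-compatible local server.
BASE_URL = "http://localhost:1234/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the reply text."""
    payload = build_chat_request(model, prompt)
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Any library or tool that speaks the OpenAI API (editors, scripts, automation frameworks) can be pointed at the same base URL instead of OpenAI's cloud endpoint.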

Compared to alternatives like Ollama, LM Studio stands out by offering more configuration controls per model, letting you choose between multiple variants (different sizes and quantizations), and by providing a somewhat friendlier experience for those who don't want to wrestle with the command line.

Advantages and disadvantages of using local AI models

Before we get started, it's important to be clear about what you gain and what you lose by moving AI to your Mac instead of always relying on cloud services.

Main advantages

The first advantage is absolute privacy: everything you type and the documents you upload stay on your computer. This is especially useful if you work with contracts, internal reports, sensitive technical documentation, or personal data that you don't want to upload to external services.

Another big advantage is autonomy: once the models are downloaded, you can use them even without an internet connection; perfect for working on the move, while traveling, or on restricted networks. You're not dependent on a server's health or on sudden changes in usage policies.

You also save on recurring costs. Having one or more models installed in LM Studio lets you avoid monthly subscriptions or token limits, provided the performance you get is sufficient for your daily needs. For writing, programming, translating, or generating summaries, a good 7-8B model on your Mac is usually more than enough.

Finally, using AI locally is a great way to learn how LLM models actually work. You can experiment with parameters, swap models, adjust the context, or try RAG, which helps you better understand where the limits are and how to get the most out of them in your projects.

Drawbacks to consider

The main limit is your Mac's hardware: performance and smoothness depend directly on the available RAM, the type of chip, and the size of the model you load. Models that are too large may run slowly or not load at all.

LM Studio interface on Mac

Furthermore, a local model cannot look up updated information online on its own. Its knowledge is limited to what it was originally trained on and the documents you provide via RAG, so don't expect real-time data such as prices, daily news, or recent sports results.

Finally, the heaviest models consume a lot of storage. It's easy to use 10 to 30 GB of SSD space, or even more, across several 7-20B models, so it's wise to keep some disk space free if you plan to collect multiple options; you can also consider deleting local Time Machine snapshots if you need to free up space.

LM Studio requirements on Mac and other systems

Although this guide focuses on macOS, it's helpful to understand LM Studio's requirements on each system, so you know how far you can go if you also use Windows or Linux.

macOS Requirements

To use LM Studio comfortably on a Mac, you need a computer with an Apple Silicon processor: M1, M2, M3, or M4. Macs with Intel processors are not recommended for the current version of LM Studio; if you're sticking with Intel and want local AI, the most sensible option is to choose alternatives like Msty, which are designed with that hardware in mind.

Regarding the operating system, LM Studio requires macOS 13.4 or higher. If you're running an older version of macOS, the first step is to upgrade, provided your Mac is compatible, to avoid compatibility issues.

Regarding memory and storage, the reasonable recommendation is 16 GB of RAM for stable operation with medium-sized models (7-8B), plus 10 to 30 GB of SSD space reserved for models and caches. The more models you want to test, the more space you'll need, because each model can occupy anywhere from about 2 GB to more than 20 GB.

You don't need a dedicated GPU: Apple Silicon chips already integrate an optimized CPU, GPU, and AI accelerators, and LM Studio takes advantage of that architecture using MLX files when they are available.

Windows Requirements

If you also have a PC in addition to your Mac, you'll be interested to know that LM Studio on Windows requires a 64-bit CPU with AVX2 support. Most modern Intel and AMD processors meet this requirement, but it's always best to check the official specifications.

To ensure everything runs smoothly, 16 GB of RAM is recommended for 7-8B models. With only 8 GB you can run smaller models (3-4 GB) with short contexts, but you'll notice limitations right away with very long chats or summaries of lengthy documents.


A GPU is not mandatory, although a decent graphics card helps speed up the heaviest models. In terms of storage, expect each model to take a significant bite out of your drive; keeping a minimum of 20 GB free is reasonable if you want to have several models installed.

Linux Requirements

On Linux, LM Studio is typically distributed in AppImage format for x64, with support for distributions like Ubuntu 20.04 and later. If your CPU does not support AVX2, the experience may be very limited in performance and compatibility with certain models.

For memory and storage, the criteria are the same as on Windows: 16 GB of RAM for mid-range models and around 20 GB of SSD space if you want room for multiple models. On some distributions you'll have to mark the AppImage as executable and allow its desktop integration so it works normally.

Install LM Studio on your Mac step by step

Once the requirements are clear, let's get down to business with the installation process on macOS, which is quite straightforward.

LM Studio download

The first step is to go to the tool's official website, where you'll find a download selector for macOS, Windows, and Linux. Click the Mac option and download the installer for your architecture: it's usually clearly labeled as being for Apple Silicon.

Once the file has been downloaded, you just have to drag the LM Studio app to the Applications folder, just like any other macOS application. There are no complex wizards or complicated settings at this stage.

First boot and permissions on macOS

On the first startup, macOS may show you a warning because the app doesn't come from the App Store. Nothing unusual: this is standard Gatekeeper behavior for software downloaded from the web.

To allow it to run, go to System Settings > Privacy & Security and click the "Open Anyway" button next to the message about LM Studio. From then on, you'll be able to open the application normally from Launchpad or the Applications folder.

Activate advanced mode and prepare the interface

When you open LM Studio, you'll see a fairly clean interface. It's recommended to activate "Power User" (or advanced) mode, usually found at the bottom left of the window. This unlocks additional buttons in the sidebar, with features such as Discover, My Models, and the Developer section.

The main structure is organized into four icons in the left vertical bar. Each section has a specific color and function: the chat section, the Developer area for the local API, the list of installed models, and the discovery area. We'll look at them in more detail shortly.

LM Studio structure: key sections

To navigate the application smoothly, it's important to understand what each section of the sidebar does and in what order to use them when you want to start chatting with a model.


On the left side you will usually find these icons:

  • Yellow message icon (Chats): the section where you open and manage your conversations. Here you choose the model you want to use, write your prompts, and read the responses, just like in any online AI chat.
  • Green window (Developer): activates a local server compatible with the OpenAI API. From here you can review the endpoint and use your LM Studio models with other apps that expect an OpenAI-style API, ideal for integrations and automations.
  • Red folder (My Models): displays an inventory of all the models downloaded to your machine. From here you can view sizes and versions, open advanced settings, or launch a specific model.
  • Purple magnifying glass (Discover): the gateway to the catalog of open models available for download. You can search by name, filter by popularity, or see which ones were recently updated.

The typical workflow on a Mac is simple: first go to "Discover" to find and download a model, then go to "My Models" or directly to "Chats" to load it and start using it.

Choose and download your first model on Mac

Once inside the Discover tab, you'll see a fairly long list of open AI models ready to use with LM Studio: from families like Qwen, LLaMA, Gemma, Mistral, or Phi to specific models focused on reasoning or programming.

Recommended models to start with

At the top of the list, or using the search bar, you can find models specially optimized for local use and very popular among users:

  • OpenAI gpt-oss 20B: an open model from the creators of ChatGPT, licensed under Apache 2.0. It is instruction-tuned and configurable, positioned as the "official" local alternative to ChatGPT, available in GGUF/MLX formats.
  • DeepSeek R1 Distill Qwen 7B: a distillation of DeepSeek's reasoning model onto Qwen 7B. It offers a good balance between quality and requirements, supporting Q4-Q6 quantizations and long contexts. A highly recommended option for machines with 16 GB of RAM.
  • Gemma 3n E4B: a Gemma 3 variant, Google's open alternative to Gemini. It is multimodal and optimized for everyday devices, with GGUF builds ready for LM Studio and contexts of up to roughly 32,000 tokens.
  • Qwen3 4B Thinking: these "Thinking" versions are designed for computers with fewer resources. They offer contexts of around 8,000 tokens and aim to give well-developed responses with low memory consumption, very useful if your machine is short on RAM.
  • Magistral Small 2509: a lighter model from Mistral AI, optimized for instruction following and general reasoning on machines with limited resources; ideal for writing, summaries, and technical support.
  • Mistral 7B: the flagship Mistral AI model, highly proficient in Spanish and at logical and mathematical tasks. Its GGUF variants typically offer contexts of up to 32,000 tokens, perfect for long chats and analysis of lengthy texts.
  • Phi 4: Microsoft's compact model, geared towards concise and precise answers. It consumes little memory and is ideal for quick dialogues, technical help, or generating short texts.

You'll also find models like google/gemma-3n-e4b, mistralai/mistral-small-3.2, or deepseek/deepseek-r1-0528-qwen3-8b, all of them well supported in LM Studio and suitable for different uses and hardware levels.

Download process from the “Discover” section


To download a model, simply search for its name in the search bar at the top, or locate it in the lists filtered by popularity or update date. Clicking on the model opens a fact sheet with detailed information on the right side.

In that fact sheet you'll see, among other data, the format (GGUF, MLX), the file size in GB, the supported context length, and a description of its focus (instructions, reasoning, code, etc.). In the bottom right corner you'll find a "Download" button followed by the model's size.

Clicking "Download" opens LM Studio's internal download manager, where you can check progress and speed. You just have to wait until it reaches 100%; the time will depend on your connection and the size of the model, which can range from a few GB to more than 12 GB.

One important tip for Mac users: don't choose models larger than your RAM. Try to ensure the model's on-disk size doesn't greatly exceed your available physical memory to avoid bottlenecks, but don't go for tiny models either if you need quality for complex tasks.

Configure and load a model in LM Studio

Once the model has been successfully downloaded, you have two options: open the My Models section to manage it, or go directly to Chats to load it and start chatting. We'll follow the typical process from the chat area.

Select the model to chat

In the Chats section, in the central area, you'll see a panel where you can choose the model you want to load. If this is your first time using LM Studio, a very small model may already be installed by default (such as Llama 3.2 1B) for initial testing.

When you click on the model selection dropdown, you should see the model you just downloaded, for example DeepSeek R1 0528 Qwen 3 8B or the Mistral variant you have chosen. Click on its name and a configuration preview panel will open.

Key adjustments before launching the model

In that settings panel you'll find several important parameters. The first is "Model file": here you can choose the specific model file if you have multiple quantizations or downloaded versions. If there's only one, you can usually leave it as is.

Next you'll see a control for the context length. This setting defines how many tokens (units of text) the model can "remember" within a conversation or task. The more context you give it, the more memory it uses and the slower it can run, but it will also be able to maintain long conversation threads or analyze extensive documents without forgetting important parts.

Moving the context slider to the right raises the limit, which translates into higher RAM/VRAM consumption and more computing power per token. If your Mac is short on memory, it's best to start with moderate values (for example, 8-16K tokens) and only increase them if performance holds up well.
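To get a feel for why longer contexts cost memory, here is a rough back-of-the-envelope sketch of the KV cache a model keeps per context token. The layer and head counts below are illustrative figures assumed for an 8B-class model, not exact values for any specific download:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_elem: int = 2) -> int:
    """Estimate KV-cache size: keys and values (x2) for every layer,
    KV head, and head dimension, one entry per context token (fp16 = 2 bytes)."""
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_elem

# Assumed figures for an 8B-class model: 36 layers, 8 KV heads, head_dim 128.
cache_16k = kv_cache_bytes(36, 8, 128, 16_384)
print(f"16K context ≈ {cache_16k / 2**30:.2f} GiB of KV cache")
```

Under these assumptions a 16K context already costs on the order of a couple of GiB on top of the model weights, and doubling the context doubles the cache, which is why 8-16K is a sensible starting point on a 16 GB Mac.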

Another key parameter is the GPU offload (this is more explicit on Windows and Linux; on Mac it's managed through MLX and Metal). The idea is to offload part of the model's layers to the GPU to accelerate generation, especially in models with 7-20B parameters. If you exceed the available memory, you may notice drastic performance drops, so it's prudent to increase it gradually until you find the optimal point.

In the specific case of DeepSeek R1 0528 Qwen 3 8B, a reasonable starting point is a context of 8-16K tokens. Although the model supports up to 128K, beyond 16-32K the speed usually starts to suffer on standard consumer hardware. For GPU offload, you can start with middling values and adjust according to the token rate.

Launch the model and start chatting

Once you have adjusted these parameters, click "Load model". LM Studio will take a few seconds to initialize the model in memory, after which you'll see the chat window ready for typing.

From here you can launch any prompt: from a simple "Hello, who are you?" to complex requests like "Write a summary of this text", "Help me proofread this email", or "Explain World War II to me, focusing on Japan and the bombing of Hiroshima". The model will respond with text generated on the fly, and LM Studio will also show you statistics such as tokens generated and generation speed.

Attach files and use RAG in LM Studio


Language models have one major limitation: they only know what they were taught during training. They cannot, on their own, access your private documents or learn new things in real time about your company or your projects.

This is where Retrieval-Augmented Generation (RAG) comes into play. LM Studio lets you upload documents from your Mac so that the model takes them into account when answering your questions, without needing to retrain it from scratch. It's like giving it a temporary mini reference library.

In practice, you can upload up to 5 files at a time, with a combined maximum of 30 MB. Supported formats include PDF, DOCX, TXT, and CSV, which covers most common working documents: reports, contracts, data tables, notes, etc.

Once the documents have been added to the conversation, it's key to be as specific as possible in your questions. Instead of asking "what does the contract say?", something like "According to the penalty clause in the contract I uploaded, what happens if I am more than 30 days late in making the payment?" is much more helpful. The more details you provide, the better the model will be able to retrieve the relevant fragment.

The model will analyze both your query and the content of the attached files and generate a response based on that information. You can experiment by uploading different sets of documents and trying different prompt strategies to see how the accuracy and usefulness of the answers vary.
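Under the hood, RAG boils down to retrieving the document fragments most relevant to your question and prepending them to the prompt. A toy sketch of that idea follows; LM Studio's actual retrieval is more sophisticated, and the word-overlap scoring here is purely illustrative:

```python
def retrieve(chunks: list, question: str, k: int = 2) -> list:
    """Toy retrieval: rank document chunks by word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_rag_prompt(chunks: list, question: str) -> str:
    """Prepend the best-matching chunks to the question as context."""
    context = "\n---\n".join(retrieve(chunks, question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

This also explains the advice above about being specific: the more distinctive words your question shares with the relevant passage, the more likely that passage is to be retrieved and placed in front of the model.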

Developer mode and advanced generation options

LM Studio doesn't stop at a simple chat window. With Developer mode you can adjust very fine-grained generation parameters and expose a local server compatible with the OpenAI API, so that other applications treat your local model as if it were a remote endpoint.

Creativity and response diversity controls

One of the most important parameters is the temperature. With low values, the model behaves in a more deterministic and conservative way, repeating the most likely answers, which is ideal for summaries, technical explanations, or tasks where you want consistency over creativity.

If you raise the temperature, the responses become more varied and creative, but the risk of inconsistencies or minor "fantasies" from the model also increases. For writing stories, brainstorming, or more imaginative texts, a higher temperature might suit you very well.

In addition to temperature, LM Studio lets you play with Top-K and Top-P. Top-K limits how many token options are considered at each step, which tends to produce somewhat more rigid and controlled responses. Top-P, on the other hand, works on cumulative probability and allows a smoother balance between precision and diversity.
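Conceptually, both filters trim the candidate-token list before sampling. A toy sketch of the two (not LM Studio's internal code) makes the difference concrete:

```python
def top_k_filter(probs: dict, k: int = 3) -> list:
    """Keep only the k most likely tokens, regardless of their probability."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    return [tok for tok, _ in ranked[:k]]

def top_p_filter(probs: dict, p: float = 0.9) -> list:
    """Keep the smallest set of top tokens whose cumulative probability >= p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for tok, prob in ranked:
        kept.append(tok)
        total += prob
        if total >= p:
            break
    return kept
```

Note how Top-K always keeps a fixed number of candidates, while Top-P adapts: when the model is very sure, the nucleus shrinks to one or two tokens; when it's uncertain, more candidates survive.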

System Prompt: define the character of your local AI

Another key adjustment is the System Prompt, a text that serves as a permanent base instruction for the model. Here you can tell it, for example: "You are an expert in Spanish labor law", "Always answer in a friendly and clear tone", or "Be concise and add practical examples to each answer".

This system prompt is applied before your questions, so it defines the style and role of the model throughout the entire session. It is especially useful for repetitive tasks, such as summarizing texts, answering professional emails, or generating reports with a consistent format.
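In OpenAI-style message lists, which LM Studio's local server accepts, the system prompt is simply a first message with the role "system" that stays at the head of the conversation history. A minimal, hypothetical helper to keep it there across turns:

```python
class ChatSession:
    """Minimal sketch: pin a fixed system prompt at the head of the
    message history across turns (illustrative helper, not LM Studio's API)."""

    def __init__(self, system_prompt: str):
        self.messages = [{"role": "system", "content": system_prompt}]

    def add_user(self, text: str) -> list:
        """Append a user turn; the system message stays first."""
        self.messages.append({"role": "user", "content": text})
        return self.messages
```

Because the system message is sent with every request, the role and tone you define persist for the whole session without you having to repeat the instruction in each prompt.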

Keep in mind that all these parameters (temperature, Top-K, Top-P, context, etc.) have a direct impact on performance and perceived quality. Adjusting them carefully lets you find the sweet spot between speed, consistency, and creativity that best suits your way of working.

Practical benefits of having a local LLM on your Mac


Beyond the technical aspects, what's interesting is what you can do day to day with LM Studio on your Mac. The use cases are many, and they multiply if you add RAG and Developer mode.

First, there is everything related to managing personal and professional documents. Imagine having a pile of PDFs of invoices, contracts, or historical reports: with LM Studio you can upload them and ask questions like "Find me the most important confidentiality clauses" or "locate emails related to this project in 2022".

If you study or do research, a local model is perfect for summarizing long articles, generating study outlines, and extracting key ideas without needing to upload anything to the cloud. You can upload your notes and ask it for lists of concepts, comparisons between theories, or simpler explanations.

In the professional sphere, LM Studio can serve as an assistant for writing formal emails, preparing draft reports, or doing quick translations. By adjusting the System Prompt, you can tell it to act as an "action-oriented professional assistant" that writes clearly, to the point, and with a tone appropriate to the type of client.

And if you enjoy programming, you can connect the local API server to your editor or your own scripts to autocomplete code, generate functions, check for errors, or document modules. All with your local models, without exposing your repository to third parties.

Ultimately, LM Studio on a Mac with Apple Silicon gives you a very powerful combination of privacy, control, performance, and flexibility. You can switch models depending on the task, adjust parameters to your liking, and work both locally and by integrating your AI with other tools in your usual workflow, all without depending on subscriptions or a permanent cloud connection.
