
How to install and use Ollama on Windows 11 to run LLMs locally

by MixaGame Staff
2 minute read
[Image: Windows 11 PC running Ollama locally, with terminal commands, localhost:11434, and the GPU/CPU highlighted]

Want private, high-speed AI on your own rig instead of the cloud? Ollama makes running local models simple on Windows 11. Here is a clean, compact guide that gets you from zero to your first response in minutes, plus performance tips and a few friendly tools.

Why pick Ollama

  • Runs a wide range of open models locally
  • Works on Windows 11, macOS, Linux and WSL
  • Lightweight service with a local API on localhost:11434
  • CLI and a basic GUI, plus plug-in UIs such as OpenWebUI or Page Assist

Minimum specs and expectations

  • 8 GB RAM or more recommended
  • Dedicated GPU helps a lot, but CPU only is possible with small models
  • Bigger models need more VRAM and system memory
  • On laptops or older PCs, start with small sizes like 1B to 4B or quantized builds

Install steps on Windows 11

  1. Download and install
    Get the Windows installer from the official Ollama website. Run it and follow the prompts.
  2. Launch the service
    After install, Ollama runs in the background. You will see a tray icon.
    Check that it is running by visiting http://localhost:11434 in your browser; a quick terminal check is shown after these steps.
  3. Open a terminal
    Use PowerShell or Windows Terminal. The CLI is where Ollama shines.
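Before pulling anything, a quick sanity check confirms both the CLI and the background service are reachable. The root URL answers with a short "Ollama is running" message; curl.exe (rather than curl) avoids PowerShell's built-in alias.

# confirm the CLI is on your PATH
ollama --version

# confirm the background service answers
curl.exe http://localhost:11434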

Your first model

# pull a model
ollama pull llama3:instruct

# chat with it
ollama run llama3:instruct
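Once the model downloads, ollama run drops you into an interactive chat. Two in-session commands are worth knowing:

# list the available in-session commands
/?

# end the chat and return to your terminal
/bye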

Helpful alternatives: mistral, phi3, qwen2, or smaller sizes like llama3:8b-instruct. If your PC struggles, try a quantized tag (for example, a q4 variant) when one is available.
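Models can take several gigabytes each, so a few housekeeping commands help you keep track of them. A quick sketch; the model name below (mistral) is just an example:

# list downloaded models and their sizes
ollama list

# show which models are currently loaded into memory
ollama ps

# remove a model you no longer need
ollama rm mistral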

Use the local API

Apps can call Ollama over HTTP. A quick check with curl (the escaped-quote form below works as-is in Command Prompt; PowerShell quotes arguments differently, so see the PowerShell example further down):

curl http://localhost:11434/api/generate ^
  -H "Content-Type: application/json" ^
  -d "{ \"model\": \"llama3:instruct\", \"prompt\": \"Explain DLSS in one sentence.\" }"

This is great for hooking the model into dev tools, chat front ends, or automations.
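If you prefer to stay in PowerShell, Invoke-RestMethod sidesteps the quoting issues entirely. A minimal sketch against the same /api/generate endpoint; stream is set to false so the call returns a single JSON object whose response field holds the generated text (per the Ollama API):

# build the request body as JSON
$body = @{
  model  = "llama3:instruct"
  prompt = "Explain DLSS in one sentence."
  stream = $false   # return one JSON object instead of streamed chunks
} | ConvertTo-Json

# POST it to the local Ollama API and print the generated text
(Invoke-RestMethod -Uri "http://localhost:11434/api/generate" -Method Post -ContentType "application/json" -Body $body).response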

Simple GUI options

  • Ollama app shows a basic interface and manages the background service
  • OpenWebUI offers a full chat web app on top of your local models (quick-start sketch after this list)
  • Page Assist runs from your browser and talks to Ollama behind the scenes
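For OpenWebUI specifically, the usual route is Docker Desktop. A hedged sketch of the commonly documented quick-start; the port mapping, volume name, and image tag are assumptions to verify against the current OpenWebUI docs:

# run OpenWebUI in a container, then browse to http://localhost:3000
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main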

Performance tips

  • Close VRAM heavy apps before loading a large model
  • Prefer smaller or quantized models for laptops
  • Update GPU drivers and keep your motherboard BIOS current
  • If you get errors when loading on GPU, test CPU-only operation with setx OLLAMA_NUM_GPU 0, restart the service, confirm it runs, then step back up to a modestly sized GPU model (see the sequence below)
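A concrete sequence for that last tip. OLLAMA_NUM_GPU is the variable named in the tip above; setx only affects new terminals, so quit Ollama from the tray icon first.

# persist the variable for future sessions
setx OLLAMA_NUM_GPU 0

# in a fresh terminal, start the service manually and leave it running
ollama serve

# in a second terminal, confirm it answers
curl.exe http://localhost:11434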

Troubleshooting checklist

  • Service not reachable
    • Start it manually: ollama serve
    • Reboot after install if the tray icon does not appear
  • Model too slow or failing to load
    • Try a smaller size or quantized variant
    • Reduce other GPU workloads and ensure sufficient free disk space
  • Want Linux tools on Windows
    • Install WSL and run Ollama inside your distro for a Linux-style environment (sketch below)
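A minimal sketch of the WSL route, assuming the default Ubuntu distro; the install script URL is Ollama's official Linux installer.

# from an elevated PowerShell: install WSL with the default distro, then reboot
wsl --install

# inside the WSL distro: install Ollama the Linux way and pull a model
curl -fsSL https://ollama.com/install.sh | sh
ollama run llama3:instruct

# if the background service is not running, start it first with: ollama serve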
