Ollama is the easiest way to get up and running with large language models such
as gpt-oss, Gemma 3, Qwen3 and more.
Software Category: data
For detailed information, visit the Ollama website.
Available Versions
To find the available versions and learn how to load them, run:
```
module spider ollama
```
The output of the command shows the available Ollama module versions.
For detailed information about a particular Ollama module, including how to load the module, run the module spider command with the module’s full version label. For example:
```
module spider ollama/0.13.1
```
| Module | Version | Module Load Command |
| ------ | ------- | ------------------- |
| ollama | 0.13.1 | `module load apptainer/1.3.4 ollama/0.13.1` |
Ollama Open OnDemand Interactive App
Request a session
To get to the interactive app:
- Open a web browser and go to: https://ood.hpc.virginia.edu.
- Log in with your NetBadge credentials.
- Click on “Interactive Apps” on the top bar.
- In the drop-down menu, click “Ollama.”
To fill out the form:
- Choose a model directory. Select “Predownloaded” if you wish to use the listed models. Otherwise, select “Home” to use your own models.
- Choose a partition. Only partitions that contain GPUs can be selected; the session will run on a single GPU device.
- Under “Optional GPU Type,” choose a GPU type or leave it as “default” (first available).
Click Launch to start the session.
This will start Ollama inside a JupyterLab session. The Ollama server is backed by an Apptainer container instance. The Python API is provided by a separate module, ollama-python.
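To confirm that a notebook can reach the Ollama server, a minimal check is shown below. This sketch assumes the session exports the OLLAMA_HOST environment variable (which the Python client reads automatically), as in the OpenAI API example further down.

```python
import ollama

# Query the running server for its installed models;
# this fails with a connection error if the server is unreachable
print(ollama.list())
```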
Download a model
If you selected “Home” for the model directory and wish to download a new LLM, click on File→New→Terminal to open a terminal window. Run:
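```
ollama pull <LLM>
```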
where <LLM> is the name of a large language model listed on the Ollama Models page. "Cloud" models require an API key. (Note: For your convenience, we set up an `ollama` alias for the actual Apptainer command.)
To list all available models, run:
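```
ollama list
```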
To remove a model, run:
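```
ollama rm <LLM>
```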
To remove all models, you may simply wipe the model directory (by default, ~/.ollama/models in your home directory):
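```
# Assumes models are stored in the default location ~/.ollama/models
rm -rf ~/.ollama/models
```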
Sample code
Copy and paste the following into a notebook cell. You may modify the prompt and the model; the model name must exactly match one of the models listed in the OOD form.
Ollama API example
```python
from ollama import chat
from IPython.display import display, Markdown, clear_output

prompt = "Why is the sky blue?"
model = 'gemma3:27b'

# Request a streaming response so tokens can be rendered as they arrive
response_stream = chat(
    model=model,
    messages=[{'role': 'user', 'content': prompt}],
    stream=True
)

# Accumulate tokens and re-render the Markdown output on each update
streamed_response = ""
for token in response_stream:
    streamed_response += token['message']['content']
    clear_output(wait=True)
    display(Markdown(f"**LLM Response (Streaming):**\n\n{streamed_response}"))
```
OpenAI API example
```python
import os
from openai import OpenAI

# The Ollama server exposes an OpenAI-compatible endpoint at $OLLAMA_HOST;
# no real key is needed, but the client requires a non-empty api_key
client = OpenAI(base_url=f"http://{os.environ['OLLAMA_HOST']}/v1", api_key='ollama')

response = client.chat.completions.create(
    model='gemma3:27b',
    messages=[
        {"role": "system", "content": "You are a friendly dog."},
        {"role": "user", "content": "Do you want a bone?"}
    ]
)
print(response.choices[0].message.content)
```