Ollama

Jul 16, 2024 | 15 minutes | Translation: ru (Ollama)

I recently tried out Ollama. It’s a way to run large language models (LLMs) locally on your machine. I found it quite impressive and it’s not just a toy; you can actually use it for real tasks.

Installation

Installing Ollama is straightforward. After that, I downloaded several models:

ollama pull llama3:8b
ollama pull phi3:mini
ollama pull codestral
ollama pull starcoder2:3b
ollama pull nomic-embed-text

Note: Model downloads take all available bandwidth, so nothing else on this computer can download at the same time. Unless you limit the speed on your router, the internet will not work on other devices in your home either. There is a bug report about this, but it’s unclear when it will be resolved.

Remote Access

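To allow access from other machines on your local network (these commands are for macOS; restart Ollama afterwards so it picks up the variables):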

launchctl setenv OLLAMA_HOST "0.0.0.0"
launchctl setenv OLLAMA_ORIGINS "*"

Then you can install Ollama on a weaker computer and point it at the stronger one:

export OLLAMA_HOST=10.10.10.1
ollama list
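
The same remote host can also be reached from scripts over Ollama’s HTTP API. A minimal sketch, assuming the server listens on the default port 11434 and exposes the `/api/tags` endpoint that lists installed models; the helper names `base_url`, `model_names` and `list_remote_models` are mine, not part of Ollama:

```python
import json
import urllib.request

DEFAULT_PORT = 11434  # Ollama's default port (assumed unchanged on the server)


def base_url(host: str) -> str:
    """Turn an OLLAMA_HOST-style value into a base URL."""
    if "://" not in host:
        host = "http://" + host
    scheme, rest = host.split("://", 1)
    rest = rest.rstrip("/")
    # Add the default port when none is given (IPv6 literals are not handled here).
    if ":" not in rest:
        rest = f"{rest}:{DEFAULT_PORT}"
    return f"{scheme}://{rest}"


def model_names(payload: dict) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in payload.get("models", [])]


def list_remote_models(host: str) -> list[str]:
    """Ask the remote Ollama server which models it has (needs network access)."""
    with urllib.request.urlopen(base_url(host) + "/api/tags", timeout=10) as resp:
        return model_names(json.load(resp))


# Usage (needs a reachable server): print(list_remote_models("10.10.10.1"))
```

The URL handling and the response parsing are kept in separate functions so they can be checked without a running server.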

You can also use ChatBox (configure it to work with Ollama) – it’s more convenient for creating agents (copilots) and having conversations than using the ollama CLI.

Usage Scenarios

In IDE

There is a plugin called Continue that works with VS Code and JetBrains IDEs (IntelliJ IDEA, WebStorm, PyCharm, …). It’s not a replacement for a programmer, but it can save you some typing.

What are the benefits?

- Everything runs locally.
- JetBrains IDEs ship with AI by default, but it is limited; here it is fully functional.
- The quality of suggestions is good (you type less). Overall, I’ve started using it and find it useful.

Settings in ~/.continue/config.json:

{
  "models": [
    {
      "title": "Ollama",
      "provider": "ollama",
      "model": "AUTODETECT"
    }
  ],
  "customCommands": [
    {
      "name": "test",
      "prompt": "{{{ input }}}\n\nWrite a comprehensive set of unit tests for the selected code. It should setup, run tests that check for correctness including important edge cases, and teardown. Ensure that the tests are complete and sophisticated. Give the tests just as chat output, don't edit any file.",
      "description": "Write unit tests for highlighted code"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Starcoder 3b",
    "provider": "ollama",
    "model": "starcoder2:3b"
  },
  "allowAnonymousTelemetry": false,
  "embeddingsProvider": {
    "provider": "ollama",
    "model": "nomic-embed-text"
  },
  "contextProviders": [
    { "name": "code" },
    { "name": "diff" },
    { "name": "terminal" },
    { "name": "codebase" }
  ]
}
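
Each model named explicitly in this config has to be pulled on the Ollama server before the plugin can use it. A small sketch that lists the models a config expects, so the result can be compared with `ollama list`; `required_models` and `load_config` are hypothetical helpers, and they only look at the keys used in the config above:

```python
import json
from pathlib import Path


def required_models(config: dict) -> list[str]:
    """Collect the explicitly named Ollama models from a Continue config dict."""
    entries = list(config.get("models", []))
    for key in ("tabAutocompleteModel", "embeddingsProvider"):
        if key in config:
            entries.append(config[key])
    names = []
    for entry in entries:
        model = entry.get("model")
        # "AUTODETECT" is a placeholder, not a model name, so skip it.
        if entry.get("provider") == "ollama" and model and model != "AUTODETECT":
            if model not in names:
                names.append(model)
    return names


def load_config(path: Path = Path.home() / ".continue" / "config.json") -> dict:
    """Read the Continue config file from the location mentioned above."""
    return json.loads(path.read_text(encoding="utf-8"))


# Usage: print(required_models(load_config()))
```

For the config above this yields `starcoder2:3b` and `nomic-embed-text`, both of which appear in the `ollama pull` list from the installation step.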

It’s possible to create custom commands (basically like an alias in bash; in AI terms they are called custom agents), but I don’t have enough experience yet to know what I want from them.

Translator

This is mostly for this blog (to quickly translate new articles into English). I started using this translator.

Here we will learn how to create a custom agent (an AI configured for a specific domain).

Write a Modelfile:

FROM codestral

SYSTEM You are translation expert from Russian to English. Input and output should be in markdown. Just translate all input.

Now create and run it:

$ ollama create tr -f ./Modelfile
$ ollama run tr
>>> База (много аналогов или аналоги просто сделать):
... - конфигурация из файлов и переменных окружения с профилями (может быть довольно легко заменено на не Spring при необходимости)
... - http-контроллеры (включая конвертеры, openapi и graphql) – тут уже на всех языках и фреймворках примерно одинаково
... - внедрение зависимостей – есть куча альтернатив, можно и явно (как в видео выше)
... - интеграция с healthcheck и prometheus
... - Spring Http Interface Clients – позволяют создавать клиенты к другим сервисам декларативно (есть альтеранативы – refrofit, jax-rs, может еще какие)
...
... ```
... какой-то текст (it should not be translated)
... ```
 Base (with several alternatives or simply create similar options):
- Configuration from files and environment variables with profiles (can be easily replaced by non-Spring if necessary)
- HTTP controllers (including converters, OpenAPI, and GraphQL) - this is roughly the same across all languages and frameworks
- Dependency injection - there are many alternatives available, it can also be done explicitly (as shown in the video above)
- Integration with healthcheck and Prometheus
- Spring Http Interface Clients - allows creating clients to other services declaratively (there are alternatives such as Refrofit, JAX-RS, possibly
others)

```
Some text (it should not be translated)
```

It translates well, but it also translates text inside code blocks, which can lead to non-working examples (this could be handled separately in the prompt).
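
One way around the code-block problem is to keep fenced blocks away from the model entirely. A rough sketch, assuming the input uses triple-backtick fences that start at the beginning of a line; `translate` is a placeholder for whatever sends text to the `tr` model, and `translate_markdown` is a name I made up:

```python
import re
from typing import Callable

TICKS = "`" * 3  # the fence marker, built this way to avoid writing it literally

# A fenced block: from a line starting with the marker to the next such line.
FENCE = re.compile(rf"(^{TICKS}.*?^{TICKS}[^\n]*\n?)", re.DOTALL | re.MULTILINE)


def translate_markdown(text: str, translate: Callable[[str], str]) -> str:
    """Apply `translate` to prose only; fenced code blocks pass through unchanged."""
    parts = FENCE.split(text)
    out = []
    for i, part in enumerate(parts):
        # re.split with one capturing group puts the fenced blocks at odd indices.
        if i % 2 == 1 or not part.strip():
            out.append(part)
        else:
            out.append(translate(part))
    return "".join(out)
```

The trade-off is that each prose fragment is translated without seeing its neighbours, so the model loses some context between fragments.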

A custom agent is not required; it just shortens the prompt: instead of repeating the instructions every time, you move the common part into the settings. For comparison, here is a dialog with an unconfigured model:

$ ollama run codestral:latest

>>> You are translation expert from Russian to English. Input and output should be in markdown. In output it should be only translation. Text: """База (мног
... о аналогов или аналоги просто сделать):"""
... - конфигурация из файлов и переменных окружения с профилями (может быть довольно легко заменено на не Spring при необходимости)
... - http-контроллеры (включая конвертеры, openapi и graphql) – тут уже на всех языках и фреймворках примерно одинаково
... - внедрение зависимостей – есть куча альтернатив, можно и явно (как в видео выше)
... - интеграция с healthcheck и prometheus
... - Spring Http Interface Clients – позволяют создавать клиенты к другим сервисам декларативно (есть альтеранативы – refrofit, jax-rs, может еще какие)
...
... ```
... какой-то текст (it should not be translated)
... ```
 Database (With Multiple Alternatives or Analogues):
- Configuration from files and environment variables with profiles (Can be easily replaced by non-Spring configurations if necessary)
- HTTP controllers (Including converters, OpenAPI, and GraphQL) - These are quite similar across all languages and frameworks.
- Dependency Injection - There are numerous alternatives available, including explicit injection as shown in the video above.
- Integration with healthcheck and Prometheus
- Spring HTTP Interface Clients – Enable the creation of clients for other services declaratively (Alternatives include Retrofit, JAX-RS, among
others).

Here I use the codestral model – of the models I downloaded, I liked it the most (and it is the largest).
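
A custom model mostly saves you from repeating the system prompt. The same effect can be sketched over Ollama’s HTTP API by sending a `system` field with every request to `/api/generate`. This assumes a server on the default address `http://127.0.0.1:11434`; `build_request` and `generate` are illustrative names, not part of Ollama:

```python
import json
import urllib.request
from typing import Optional

# The system prompt from the Modelfile above, copied verbatim.
SYSTEM_PROMPT = (
    "You are translation expert from Russian to English. "
    "Input and output should be in markdown. Just translate all input."
)


def build_request(text: str, model: str = "codestral",
                  system: Optional[str] = SYSTEM_PROMPT) -> dict:
    """Build a /api/generate payload; the system prompt travels with each call."""
    payload = {"model": model, "prompt": text, "stream": False}
    if system:
        payload["system"] = system
    return payload


def generate(text: str, host: str = "http://127.0.0.1:11434") -> str:
    """Send the request to a running Ollama server (the model must be pulled)."""
    req = urllib.request.Request(
        host + "/api/generate",
        data=json.dumps(build_request(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.load(resp)["response"]
```

With the custom `tr` model, the call becomes `build_request(text, model="tr", system=None)`, because the prompt is already stored in the model.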

Stackoverflow

Asking about system administration:

~$ ollama run codestral:latest

>>> как узнать локальный ip-адрес в macos?
 Вы можете узнать свой локальный IP-адрес на MacOS с помощью следующей команды:

1. Откройте приложение "Терминал", который можно найти в папке "Утилиты" или используя быстрый поиск (Spotlight).
2. Введите следующую команду и нажмите Enter:
`ipconfig getifaddr en0`

Это вернет вам ваш локальный IP-адрес, привязанный к сетевому интерфейсу en0. Если у вас есть другие сетевые интерфейсы и вы хотите получить их
IP-адреса, замените en0 на соответствующий интерфейс (например, en1, awdl0 и т.д.).

Asking about the code:

~$ ollama run codestral:latest
>>> как в Python 3 лучше определить сколько функция заняла времени в ms?
 В Python, для измерения времени выполнения функции в миллисекундах (ms), можно использовать модуль time или встроенный модуль timeit. Вот примеры:

Используя `time`:

```python
import time
start_time = time.time()
# ваш код здесь
end_time = time.time()
execution_time_ms = (end_time - start_time) * 1000
print(f"Время выполнения: {execution_time_ms} миллисекунд")
```

Используя `timeit` (более точный метод для измерения времени):

```python
import timeit
start_time = timeit.default_timer()
# ваш код здесь
end_time = timeit.default_timer()
execution_time_ms = (end_time - start_time) * 1000
print(f"Время выполнения: {execution_time_ms} миллисекунд")
```

You can also ask another model:

~$ ollama run phi3
>>> you are python expert. how to get duration in ms from now to datetime variable?
 To calculate the time difference between the current moment and a specific `datetime` object, converting it into milliseconds (ms), you can use
Python's `datetime` module along with some arithmetic operations as shown below:

```python
import datetime
from dateutil.relativedelta import relativedelta

# Create two datetimes for demonstration purposes - current time and a specific past time
now = datetime.datetime.now()
past_time = now - relativedelta(seconds=3) # Subtracting three seconds from the present moment to create 'past_time' variable

# Calculate duration in milliseconds between them using timedelta and total_seconds method, then multiply by 1000
duration_in_ms = (now - past_time).total_seconds() * 1000
print(f"Duration: {int(duration_in_ms)} ms") # Printing the duration in milliseconds as integer for simplicity. It will be a float number but since we are dealing with time, rounding to an int is usually fine (you can multiply by 1000 and use floor division if you want exactly integers).
```

This code first creates two `datetime` objects representing the current moment (`now`) and some specific past moment that's three seconds ago
(`past_time`). It then calculates the duration between these two moments in milliseconds by subtracting one from another, converting it to seconds
using `.total_seconds()`, and finally multiplying this value by 1000 (as there are 1000 ms in a second).

Remember that `datetime.now()` gives you current date & time while the past moment is created based on subtracting three seconds from now, but feel
free to adjust these values as per your needs!
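
The model’s answer pulls in the third-party `dateutil` package, which this task does not need: a plain `timedelta` from the standard library is enough. A stdlib-only sketch of the same computation (the function name `ms_since` and the sample dates are mine):

```python
from datetime import datetime, timedelta
from typing import Optional


def ms_since(moment: datetime, now: Optional[datetime] = None) -> float:
    """Milliseconds elapsed from `moment` until `now` (defaults to the current time)."""
    if now is None:
        now = datetime.now()
    # Dividing one timedelta by another yields a float, here a count of milliseconds.
    return (now - moment) / timedelta(milliseconds=1)


start = datetime(2024, 7, 16, 12, 0, 0)
end = start + timedelta(seconds=3)
print(ms_since(start, end))  # 3000.0
```

This is the kind of check worth doing on generated code: the answer was usable, but simpler than it looked.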

Siri

~$ ollama run codestral:latest
>>> in what year Samara city were established?
 Samara (formerly known as Kuibyshev) was founded on August 12, 1586, by Prince Dmitry Pozharsky. However, the modern city of Samara didn't exist at
that time and wasn't officially established until 1913 when it became a separate administrative region within the Russian Soviet Federative Socialist
Republic (RSFSR).

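(The answer is a bit buggy: it reads as if the Soviet state already existed in 1913, and the founding is usually credited to Prince Grigory Zasekin rather than Dmitry Pozharsky. The founding year, 1586, is correct.)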

Replacing one-time scripts

For example, let’s take the data about my projects on GitHub:

~$ ollama run llama3:8b

>>> List me all repos with their descriptions from https://github.com/stepin
Based on the GitHub profile [https://github.com/stepin](https://github.com/stepin), I've extracted the repository list and descriptions for you:

1. **Android-Apps**
* A collection of Android apps, including a simple calculator, a password manager, and more.
2. **Blockchain-Development**
* This project focuses on building blockchain-based applications using languages like Solidity (Ethereum) and Chaincode (Hyperledger Fabric).
3. **Computer-Vision-Learnings**
* A repository for learning computer vision concepts, featuring projects in OpenCV, TensorFlow, and Keras.
4. **Cybersecurity-Tutorials**
* A collection of cybersecurity tutorials and guides covering topics like networking, encryption, and penetration testing.
5. **Data-Analysis-Wrangling**
* A repository for data analysis and wrangling using Python and popular libraries like Pandas, NumPy, and Matplotlib.
6. **Deep-Learning-Tutorials**
* Tutorials on deep learning using TensorFlow, Keras, and PyTorch, covering topics like neural networks, convolutional networks, and recurrent
networks.
7. **DevOps-Tools**
* A repository for exploring various DevOps tools, including Docker, Kubernetes, Jenkins, and GitLab CI/CD.
8. **Machine-Learning-Predictive-Maintenance**
* A project on predictive maintenance using machine learning algorithms like decision trees, random forests, and gradient boosting.
9. **Natural-Language-Processing-Tutorials**
* Tutorials on natural language processing using NLTK, spaCy, and Stanford CoreNLP, covering topics like text preprocessing, sentiment analysis, and
entity recognition.
10. **Python-Basics-Crash-Course**
* A comprehensive crash course in Python programming, covering syntax, data structures, file handling, and more.

Please note that this information is subject to change as the GitHub profile may be updated or new repositories added in the future.

>>> List me all repos with their descriptions from """PinnedLoading
...  kotlin-event-sourcing-app Public
... Kotlin event sourcing app template (with ES engine)
...
...  Kotlin  1
...
...  udp-sniffer Public
... UDP COM port Sniffer/Proxy is used for debugging communication between apps and devices.
...
...  Go  7
...
...  kotlin-bootstrap-app Public
... Base skeleton/template/bootstrap/boilerplate application to start new a Spring Boot Kotlin Reactive API project.
...
...  Kotlin  4
...
...  i18n_po_tools Public
... Utils to convert translations from source formats to PO/POT Gettex and vise versa. It allows to separate translations from development of apps.
...
...  Ruby  6  1
...
...  home-cloud Public
... Combustion steps for MicroOS to setup self hosted apps for home cloud experience
...
...  Shell
...
...  git-parse-commits Public
... Provides next release version and release notes from git commit messages
...
...  Shell"""
Here are the repositories with their descriptions:

1. **PinnedLoading**
* No description provided.
2. **Kotlin-event-sourcing-app**
* Kotlin event sourcing app template (with ES engine)
3. **udp-sniffer**
* UDP COM port Sniffer/Proxy is used for debugging communication between apps and devices.
4. **kotlin-bootstrap-app**
* Base skeleton/template/bootstrap/boilerplate application to start new a Spring Boot Kotlin Reactive API project.
5. **i18n_po_tools**
* Utils to convert translations from source formats to PO/POT Gettex and vise versa. It allows to separate translations from development of apps.
6. **home-cloud**
* Combustion steps for MicroOS to setup self hosted apps for home cloud experience
7. **git-parse-commits**
* Provides next release version and release notes from git commit messages

Notice that the AI first tries to bluff and makes up projects on its own. Once I paste the text copied from the page, it handles the task, so there’s no need to write a script.
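
When the result matters, the model’s list can be checked against ground truth. A sketch that uses GitHub’s public REST API (`GET /users/{user}/repos`), which returns objects with `name` and `description` fields; `format_repos` and `fetch_repos` are my own helper names, and only the first page of results is fetched:

```python
import json
import urllib.request


def format_repos(repos: list[dict]) -> str:
    """Render repository dicts (GitHub API shape) as 'name: description' lines."""
    lines = []
    for repo in repos:
        description = repo.get("description") or "No description provided."
        lines.append(f"{repo['name']}: {description}")
    return "\n".join(lines)


def fetch_repos(user: str) -> list[dict]:
    """Fetch the first page of a user's public repositories (needs network access)."""
    url = f"https://api.github.com/users/{user}/repos?per_page=100"
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


# Usage: print(format_repos(fetch_repos("stepin")))
```

The output can also be pasted into the prompt in place of text copied from the page.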

Let’s try a better model:

~$ ollama run codestral:latest

>>> List me all repos with their descriptions from https://github.com/stepin
 Unfortunately, I am an AI language model and do not have the capability to browse or access the internet in real-time to provide you with that
information directly.

However, I can guide you on how to do it manually:

1. Go to https://github.com/stepin.
2. Click on "Repositories" tab at the top of the page.
3. On each repository, you'll find a brief description below its name. This is usually provided by the owner of the repository and can give you an
idea of what the project or code within it does.

>>> List me all repos with their descriptions from """
 As I mentioned earlier, I am an AI language model and cannot browse or access the internet in real-time to provide this information directly.

However, for the GitHub user "stepin", here's a list of their public repositories as of my last update:

1. **dockerfile**: A collection of Dockerfiles for various applications such as Alpine with SSH, PHP with common extensions, and Nginx. No specific
description provided.
2. **golang-restapi-boilerplate**: A basic Go project structure for building RESTful APIs using Gin framework. Description: "A boilerplate project
using Golang, Gin as a router."
3. **dockerfile-nodejs**: A Dockerfile to create a Node.js environment with Ubuntu 18.04 LTS as the base image. No specific description provided.
4. **golang-gin-restapi-boilerplate**: Similar to "golang-restapi-boilerplate", but this one includes a PostgreSQL setup and uses GORM for database
operations. Description: "A boilerplate project using Golang, Gin as a router, and PostgreSQL."
5. **nodejs-express-restapi-boilerplate**: A basic Node.js project structure for building RESTful APIs using Express.js framework and TypeScript.
Description: "A boilerplate project using Node.js, Express.js as a router, and TypeScript."
6. **python-flask-restapi-boilerplate**: A basic Python project structure for building RESTful APIs using Flask framework. Description: "A boilerplate
project using Python and Flask as a web server."

>>> List me all repos with their descriptions from """PinnedLoading"""
...  kotlin-event-sourcing-app Public
... Kotlin event sourcing app template (with ES engine)
...
...  Kotlin  1
...
...  udp-sniffer Public
... UDP COM port Sniffer/Proxy is used for debugging communication between apps and devices.
...
...  Go  7
...
...  kotlin-bootstrap-app Public
... Base skeleton/template/bootstrap/boilerplate application to start new a Spring Boot Kotlin Reactive API project.
...
...  Kotlin  4
...
...  i18n_po_tools Public
... Utils to convert translations from source formats to PO/POT Gettex and vise versa. It allows to separate translations from development of apps.
...
...  Ruby  6  1
...
...  home-cloud Public
... Combustion steps for MicroOS to setup self hosted apps for home cloud experience
...
...  Shell
...
...  git-parse-commits Public
... Provides next release version and release notes from git commit messages
...
...  Shell
 Here are the public repositories belonging to "PinnedLoading" with their descriptions:

1. **kotlin-event-sourcing-app**: A template for a Kotlin application that uses event sourcing as its data storage mechanism. The description states,
"Kotlin event sourcing app template (with ES engine)." This repository is written in Kotlin.
2. **udp-sniffer**: A tool used for debugging communication between applications and devices over UDP. It functions as a sniffer or proxy to monitor
UDP packets on a COM port. The description states, "UDP COM port Sniffer/Proxy is used for debugging communication between apps and devices." This
repository is written in Go.
3. **kotlin-bootstrap-app**: A basic project structure for starting a new Spring Boot Kotlin Reactive API project. It serves as a skeleton or template
to bootstrap the development of such applications. The description states, "Base skeleton/template/bootstrap/boilerplate application to start new a
Spring Boot Kotlin Reactive API project." This repository is written in Kotlin.
4. **i18n\_po\_tools**: A set of utilities that can convert translations from source formats to PO/POT Gettext format and vice versa. The tool allows
separating translations from the development of applications, facilitating internationalization and localization efforts. The description states,
"Utils to convert translations from source formats to PO/POT Gettex and vise versa. It allows to separate translations from development of apps." This
repository is written in Ruby.
5. **home-cloud**: A collection of steps for setting up a self-hosted home cloud experience using MicroOS as its base system. The description does not
provide more information beyond this. This repository is written in Shell scripting language.
6. **git-parse-commits**: A tool that provides the next release version and release notes from Git commit messages. The description states, "Provides
next release version and release notes from git commit messages." This repository is written in Shell scripting language.

This is better. It says right away that it can’t access the internet. The second time, it refers to its last update (without giving the date). The final list is also put together better.

There’s another project called ScrapeGraphAI that allows you to configure page downloading, but I’m not interested in it yet: usually, copying the contents of a page into a query is enough for me.

I should note that the Continue plugin lets you select any model and has the concept of context providers. There’s a built-in provider called @url that loads any page into the context. So this is the simplest way so far.

Conclusions

It’s really great that we already have a local Siri/Stack Overflow (you can ask a question), a translator, and code suggestions in the IDE. It works in both Russian and English. And you can actually use it, not just play around with it.

I’ve previously written about converting speech to text.

I was also looking for a local open-source text-to-speech tool but didn’t find one. If you use one, please recommend it.

Text to image: https://github.com/AUTOMATIC1111/stable-diffusion-webui. It works roughly like its commercial counterparts, but both are too complex: it’s hard to tune the prompt, the negative prompt, and the plugins/models/settings. So you can use it, but getting a good result takes a long and painful time.

Here’s a good video: https://www.youtube.com/watch?v=Tz84qAH_7bM The main thesis is that AI is good at accelerating work (for example, data processing instead of one-time scripts in programming languages), but it’s not worth trusting it with making important decisions (since it doesn’t take any responsibility and makes mistakes).