Cloud AI is convenient, nobody argues with that.

At the same time, it's already quite expensive: with active use, the same LLMs cost $200+ per month. And rumor has it these are still subsidized rates aimed at capturing the market.

Problems with cloud services:

  • data protection
  • no availability without internet
  • unpredictable updates – the service can be updated badly or break at any moment
  • unpredictable pricing – the further ahead cloud LLMs pull from local ones, the more expensive they will get. Competition won't really help, since globally there are only a handful of players: expect explicit or implicit price collusion.
  • cost – $2,400 per year already makes you seriously think about buying your own hardware

Therefore, it's extremely important to have a local alternative: even if you don't use it, its mere existence restrains price growth in the cloud. Besides, many simple queries can be handled locally, which saves on cloud usage.

Right now the best platform for local AI is Apple M-series machines (especially used ones): quiet, power-efficient, compact, priced roughly like the alternatives, and they can be resold or used as a regular computer. There are machines with 512 GB of RAM ($8,550 new) and with 64 GB ($1,840 new), as well as laptops with 32 GB.
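
For a sense of scale, here is a back-of-the-envelope payback estimate (a minimal Python sketch; the $200/month subscription and the hardware prices are the figures quoted in this post, electricity and real usage patterns are ignored):

```python
# Back-of-the-envelope payback: local Apple hardware vs. a cloud subscription.
# Prices are taken from this post; running costs are ignored.

CLOUD_MONTHLY_USD = 200

hardware_usd = {
    "Mac, 64 GB unified RAM (new)": 1840,
    "Mac, 512 GB unified RAM (new)": 8550,
}

for name, price in hardware_usd.items():
    months = price / CLOUD_MONTHLY_USD
    print(f"{name}: pays for itself in about {months:.0f} months of cloud fees")
```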

I hope that over time other computers with unified memory (regular and graphics memory combined) will appear to compete with Apple.

I don't have dedicated hardware yet, but some things can already be run. Throughout the first half of 2025 my favorites were (a quick sketch of running them follows the list):

  • codestral (12 GB)
  • deepseek-r1:14b (9 GB)
  • qwen3:30b-a3b (18 GB)
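
A minimal sketch of pulling and querying one of these locally, assuming Ollama as the runtime (which I use anyway, see below) and its official Python client (`pip install ollama`); the model name and prompt are just an illustration:

```python
import ollama  # needs a running local Ollama server

# Download the model once; Ollama caches it locally.
ollama.pull("qwen3:30b-a3b")

# Ask a question entirely offline: nothing leaves the machine.
response = ollama.chat(
    model="qwen3:30b-a3b",
    messages=[{"role": "user", "content": "Explain unified memory in two sentences."}],
)
print(response["message"]["content"])
```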

The release of ChatGPT 5 didn’t change anything, but the release of gpt-oss:20b is a quiet revolution:

  • genuinely answers well
  • small (fits on a laptop with 32 GB RAM alongside other open programs)
  • answers quickly
  • works well with MCP

At the moment it's my local favorite: when I ask something locally, it goes only to this model.
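
For completeness, here is how a question reaches it through Ollama's local HTTP API (default port 11434), the same local endpoint that editor plugins such as Continue talk to; the prompt is only an example:

```python
import requests

# Query the local gpt-oss:20b via Ollama's HTTP API (default: localhost:11434).
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "gpt-oss:20b",
        "messages": [{"role": "user", "content": "What is MCP, in two sentences?"}],
        "stream": False,  # return one complete JSON object instead of a stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```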

Two usage modes:

  • through the Ollama UI, when context from local files isn't needed
  • VS Code + Continue, when it is (a config sketch follows the list)
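
For the second mode, Continue has to be pointed at the local model. A sketch of such an entry, assuming the older ~/.continue/config.json format (newer Continue releases switched to config.yaml with a similar but not identical schema, so check the docs):

```python
import json

# Sketch of a Continue model entry pointing at the local Ollama model.
# Assumption: the older ~/.continue/config.json format.
entry = {
    "models": [
        {
            "title": "gpt-oss 20b (local)",
            "provider": "ollama",
            "model": "gpt-oss:20b",
        }
    ]
}

print(json.dumps(entry, indent=2))
```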