Sber has introduced a major update to its AI assistant GigaChat, which is now powered by the new flagship model GigaChat Ultra. The updated assistant can retain user-specific facts for personalized interactions and tailored solutions, autonomously search for information online, and generate text responses twice as fast.
The release of this new model creates opportunities not only for end users but also for developers building applied AI products and services with GigaChat Ultra. Users can also run code directly in the interface and ask the assistant about its own capabilities, with answers based on up-to-date documentation.
Anton Frolov, senior vice president, head of Generative AI Development, Sberbank:
“We’re moving beyond simply providing answers and evolving into a multi-agent AI assistant. But our ambitions don’t stop there: we’re building a future where neural-network-based interfaces replace traditional mobile apps. Needed features will appear upon request, making navigation through the digital world seamless. GigaChat Ultra is one of the world’s largest models fully developed and trained in Russia. It remembers your preferences, works faster, understands tasks more deeply, and provides higher-quality recommendations. We are removing the last barriers in human-to-machine interaction.”
A key innovation is long-term memory. While contextual (short-term) memory is limited to a single conversation session and resets once that session ends, GigaChat’s long-term memory works differently—it preserves user-specific information across sessions and uses it in future conversations.
Here is exactly what GigaChat remembers:
The system automatically identifies significant facts without overloading memory with trivialities such as short-term plans or widely known general knowledge. All data is stored in a unified profile that is synchronized across the web version, the mobile apps, and the Telegram bot via Sber ID sign-in. Users have full control over this feature: memory can be enabled or disabled at any time in settings.
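The difference between short-term context and long-term memory can be illustrated with a minimal sketch. It is not GigaChat's implementation: the class names `UserProfile` and `Session`, the fact keys, and the sample message are all hypothetical and exist only to show a per-session context that resets next to a profile that persists and can be switched off.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Hypothetical long-term store: facts that persist across sessions."""
    memory_enabled: bool = True
    facts: dict[str, str] = field(default_factory=dict)

    def remember(self, key: str, value: str) -> bool:
        # Respect the user's setting: nothing is stored while memory is off.
        if not self.memory_enabled:
            return False
        self.facts[key] = value
        return True


@dataclass
class Session:
    """Short-term context: lives only as long as one conversation."""
    profile: UserProfile
    context: list[str] = field(default_factory=list)

    def say(self, message: str) -> None:
        self.context.append(message)

    def prompt_facts(self) -> dict[str, str]:
        # Long-term facts are made available only while memory is enabled.
        return dict(self.profile.facts) if self.profile.memory_enabled else {}


profile = UserProfile()
first = Session(profile)
first.say("I am vegetarian and I live in Kazan.")
profile.remember("diet", "vegetarian")
profile.remember("city", "Kazan")

second = Session(profile)     # a new session starts with an empty context
print(second.context)         # the earlier message is not carried over
print(second.prompt_facts())  # the stored facts are still available
```

Keeping the profile separate from the session is what lets one set of facts be shared by several clients, and a single flag is enough to model the on/off switch in settings.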
GigaChat produces textual responses twice as fast as Sber’s previous flagship model. This directly affects how quickly users receive replies: even for complex queries that demand detailed reasoning, results appear almost instantly.
This increase in speed was achieved thanks to the Mixture of Experts (MoE) architecture. The model acts like a team of specialized experts, each handling specific types of tasks. Only relevant “experts” respond to any given query rather than the entire model working simultaneously.
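The routing idea behind Mixture of Experts can be sketched in a few lines. This is a generic top-k gating toy, not GigaChat Ultra's actual architecture: the number of experts, the value of `k`, the scalar "experts", and the gate function are all illustrative assumptions.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    order = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    top = order[:k]
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, gate, k=2):
    """Evaluate only the k selected experts and mix their outputs."""
    routed = route_top_k(gate(x), k)
    output = sum(w * experts[i](x) for i, w in routed)
    return output, [i for i, _ in routed]


# Toy setup (assumed sizes): 8 scalar "experts" and a hand-written gate.
experts = [lambda x, s=s: s * x for s in range(1, 9)]


def gate(x):
    # Scores are highest for experts whose scale is close to |x|.
    return [-(abs(x) - s) ** 2 for s in range(1, 9)]


output, used = moe_forward(3.0, experts, gate, k=2)
print(f"experts used: {used} of {len(experts)}, output: {output:.3f}")
```

Because only `k` of the experts are evaluated per input, the compute per query depends on `k` rather than on the total number of experts, which is the usual reason this design is associated with faster generation.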
GigaChat now initiates internet searches for real-time information on its own, so users no longer need to enable this feature manually. This helps keep answers accurate when discussing recent news, stock quotes, and other frequently changing data. The search functionality also includes a dedicated rephraser, a system that restructures user queries to boost relevance and improve the quality of the final response.
Online search is now also available in voice mode. Dialogues have become truly interactive: users can interrupt the model, clarify details, or change topics at any moment, with no delay in processing context shifts. Once a chat session ends, a complete transcript of the dialogue is saved.
GigaChat now includes a self-awareness mechanism that enables the model to answer questions about its own characteristics correctly. When answering such questions, the model consults current documentation that describes its latest version, supported features, limitations, and behavioral nuances. This addresses a problem common among language models: giving incorrect or outdated information about their own abilities, for example by claiming features that do not exist or failing to recognize ones that do.
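The principle of answering from documentation rather than from the model's own guesses can be shown with a small sketch. The lookup below is purely illustrative: the `DOCS` entries are placeholders paraphrased from this announcement, and `answer_about_self` is a hypothetical helper, not part of any GigaChat API.

```python
# Placeholder documentation entries, paraphrased from this announcement.
DOCS = {
    "long-term memory": "Supported; can be enabled or disabled in settings.",
    "code interpreter": "Supported; code runs in a sandbox.",
    "web search": "Supported; triggered automatically, also in voice mode.",
}

NOT_DOCUMENTED = "Not covered by the current documentation."


def answer_about_self(question: str) -> str:
    """Answer only from documentation; never guess about undocumented features."""
    q = question.lower()
    hits = [f"{name}: {text}" for name, text in DOCS.items() if name in q]
    return " ".join(hits) if hits else NOT_DOCUMENTED


print(answer_about_self("Do you have a code interpreter?"))
print(answer_about_self("Can you edit video?"))
```

A production system would presumably use retrieval over full documentation instead of substring matching, but the fallback branch carries the main idea: when the documentation is silent, the assistant says so instead of inventing a capability.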
An integrated code interpreter gives GigaChat an isolated execution environment for running code right inside the assistant’s interface. Before this feature was introduced, the model could only write code and display it to users; executing it and testing the results required external tools. Now GigaChat generates code and runs it immediately in a secure sandbox, with no impact on the user’s system.
The interpreter supports uploaded files, performs advanced numerical calculations, validates data structures, and creates graphs and charts directly in chats. This makes GigaChat a comprehensive analytical tool suitable for reports, tables, and large datasets.
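The generate-then-run loop can be approximated with a short sketch. It is an assumption-laden illustration, not a description of GigaChat's sandbox: a child process with a timeout and a throwaway working directory limits accidents, but it is not a real security boundary, and the helper name `run_snippet` is hypothetical.

```python
import subprocess
import sys
import tempfile


def run_snippet(code: str, timeout_s: float = 5.0) -> dict:
    """Run code in a child interpreter and capture its output."""
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                # -I starts Python in isolated mode (no user site-packages,
                # no PYTHON* environment variables).
                [sys.executable, "-I", "-c", code],
                cwd=workdir,          # files the snippet writes land in a temp dir
                capture_output=True,
                text=True,
                timeout=timeout_s,    # stop runaway code
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": "timed out"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}


result = run_snippet("print(sum(i * i for i in range(10)))")
print(result["ok"], result["stdout"].strip())
```

Returning the captured output and error text as data, rather than raising, is what would let an assistant read a failure and revise the code it generated.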
Training involved three stages. In the first stage, the scope of knowledge was expanded by adding academic books and materials on mathematics and programming and by increasing the volume of multilingual data, which now covers ten languages. In the intermediate stage, specialized skills were enhanced: the code corpus was enlarged, data on physics, medicine, and finance and records of actual dialogs were added, and security measures were strengthened. In the final stage, tuning on examples (editor texts, dialogs triggering functions, system prompts) ensured stable performance under real-world conditions.
Substantial improvements were recorded in answering both open-ended and closed-ended questions, as well as in tasks requiring advanced logical reasoning. Benchmark evaluations for Russian-language usage showed strong performance in grammatical accuracy, natural speech flow, readability, and response structure. Enhancements also extended to practical industry scenarios: the model became more adept at legal, cybersecurity, medical, financial, and trade-related tasks—especially those involving Russian-specific nuances and sectoral terminology. Notable progress was made in mathematical computations and code generation, broadening applicability in fintech, education, and development sectors.
Sber is making the source code and weights of its flagship GigaChat Ultra model openly available to the public. According to company experts’ assessments, it already outperforms DeepSeek V3.1, Qwen3-235B and its predecessor GigaChat 2 Max in Russian-language tasks, maths and general reasoning. With the repository released, organizations ranging from large banks to small startups will be able to install the neural network in their private environments and adapt it to corporate data, marking a move toward genuine technological sovereignty.
Users can try the updated model free of charge in the web version, in the Android apps available in RuStore and AppGallery, and in the Telegram bot and the MAX messenger. To activate voice mode and memory, simply sign in via Sber ID and turn on the desired options in profile settings.


