Will Google's Gemini File Search kill homebrew RAG solutions? We test drive it to compare function, performance, and costs. Plus: sample code for a PDF Q&A app.

Google Gemini File Search - The End of Homebrew RAG?

2025/11/21 21:00


Introduction

Google announced Gemini File Search, and pundits claim it’s the death knell for homebrew RAG (Retrieval-Augmented Generation). The reason: the app developer no longer needs to worry about chunking, embedding, file storage, the vector database, metadata, retrieval optimization, context management, and more. The entire document Q&A stack (formerly a middleware layer plus application-level logic) is now absorbed by the Gemini model and its surrounding cloud offerings.

In this article, we try out Gemini File Search and compare it with a homebrew RAG system in terms of capabilities, performance, cost, flexibility, and transparency, so you can make an educated decision for your use case. To speed up your development, I have included my example app on GitHub.

Here is the original Google announcement:

Build Your Own Agentic RAG

Traditional RAG - A Refresher

The architecture of a traditional RAG system consists of a few sequential steps.


  1. The documents are first chunked, embedded, and inserted into a vector database. Often, related metadata are included in the database entries.
  2. The user query is embedded and converted into a vector DB search that retrieves the relevant chunks.
  3. Finally, the original user query and the retrieved chunks (as context) are fed into the AI model to generate the answer for the user.
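The retrieval core of steps 1 and 2 can be sketched in a few lines of Python. This is a toy illustration only: the bag-of-words "embedding" and in-memory index stand in for a real embedding model and vector database.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would call an embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Step 1: chunk the documents and "embed" each chunk into the index.
chunks = [
    "Open the back cover before loading the film leader onto the take-up spool.",
    "The frame counter resets automatically when the back cover is opened.",
    "Set the shutter speed dial before cocking the film advance lever.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# Step 2: embed the user query and retrieve the best-matching chunk.
query = "how do I reset the frame counter"
best_chunk, _ = max(index, key=lambda item: cosine(embed(query), item[1]))
print(best_chunk)
```

In step 3, `best_chunk` (usually the top-k chunks, not just one) would be pasted into the LLM prompt as context alongside the original question.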

Agentic RAG

The architecture of an agentic RAG system adds a reflect-and-react loop: the agent checks whether the retrieved results are relevant and complete, and rewrites the query if the search quality falls short. So the AI model is used in several places: to rewrite the user query into a vector DB query, to assess whether the retrieval is satisfactory, and finally to generate the answer for the user.
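The reflect-and-react loop can be sketched as follows. The `search`, `is_satisfactory`, and `rewrite` functions are deterministic stubs for illustration; in a real agentic RAG system, the latter two would each be LLM calls and `search` would hit the vector DB.

```python
def search(query: str) -> list[str]:
    # Stubbed retrieval; a real agent would query the vector DB here.
    corpus = {
        "film loading steps": ["Open the back, thread the leader, advance twice."],
        "frame counter reset": ["The counter resets when the back is opened."],
    }
    return corpus.get(query, [])

def is_satisfactory(chunks: list[str]) -> bool:
    # Stubbed reflection; a real agent would ask the LLM to grade relevance.
    return len(chunks) > 0

def rewrite(query: str) -> str:
    # Stubbed rewrite; a real agent would ask the LLM to rephrase the query.
    rewrites = {"how to zero the exposure counter?": "frame counter reset"}
    return rewrites.get(query, query)

def agentic_retrieve(query: str, max_rounds: int = 3) -> list[str]:
    for _ in range(max_rounds):
        chunks = search(query)
        if is_satisfactory(chunks):   # reflect: good enough?
            return chunks
        query = rewrite(query)        # react: try a better query
    return []

print(agentic_retrieve("how to zero the exposure counter?"))
```

The `max_rounds` cap matters in practice: without it, a query the corpus simply cannot answer would loop forever, burning LLM calls on rewrites.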

An Example Use Case - Camera Manual Q&A

There are many new photographers who are interested in using old film cameras. One of the main challenges for them is that many old cameras have unique and sometimes quirky ways to operate, even the basic things, such as loading film and resetting the film frame counter. Worse, you can even damage the camera if you do certain things in the “wrong order.” Therefore, accurate and exact instructions from a camera manual are essential.

A camera manual archive hosts 9,000 old camera manuals, mostly as scanned PDFs. In an ideal world, you would just download a few for your camera, study them, get familiar, and be done. But we are modern humans, neither patient nor inclined to plan ahead. So we need Q&A against camera manual PDFs on the go, e.g., in a phone app.

This fits the agentic RAG scope very well, and I assume it applies equally to many hobbies (musical instruments, Hi-Fi equipment, vintage cars) that require digging information out of ancient user manuals.

Homebrew RAG for PDF Q&A

Our RAG system was implemented earlier this year based on the LlamaIndex RAG workflow, with substantial customization:

  1. Use the Qdrant vector database: good price-performance ratio, supports metadata.
  2. Use the Mistral OCR API to ingest the PDFs: good performance in understanding complex PDF files with illustrations and tables.
  3. Keep images of each PDF page so users can directly access a graphic illustration of complex camera operations, in addition to text instructions.
  4. Add an agentic reflect-and-react loop based on the Google/LangChain example for agentic search.

How About Multi-Modal LLMs?

Since 2024, multi-modal LLMs have become genuinely good. An obvious alternative approach is to feed the user query and the entire PDF to the LLM and get an answer. This is a much simpler solution that does not require maintaining any vector DB or middleware.

Our main concern was cost, so we ran a cost calculation and comparison. The short answer is that RAG is faster, more efficient, and much less costly once the number of user queries per day exceeds 10. So, directly feeding the user query and the entire matching PDF to a multi-modal LLM only really works for prototyping or very low-volume use (a few queries a day).
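A back-of-the-envelope version of that comparison looks like this. Every number below is an illustrative assumption (manual size, context size, token price, hosting cost), not a measured figure; plug in your own rates. The point is the shape of the curves: full-PDF cost scales with manual size per query, while RAG pays a fixed infrastructure cost plus a tiny per-query cost.

```python
# Illustrative cost model; all constants are assumptions, not quoted prices.
INPUT_PRICE_PER_M  = 1.00     # $ per 1M input tokens (hypothetical model rate)
MANUAL_TOKENS      = 100_000  # a long scanned manual after OCR (assumed)
RAG_CONTEXT_TOKENS = 2_000    # query + retrieved chunks (assumed)
RAG_FIXED_MONTHLY  = 20.00    # vector DB hosting + middleware (assumed)

def monthly_cost_full_pdf(queries_per_day: int) -> float:
    # Every query re-sends the whole manual as input tokens.
    return queries_per_day * 30 * MANUAL_TOKENS * INPUT_PRICE_PER_M / 1_000_000

def monthly_cost_rag(queries_per_day: int) -> float:
    # Fixed infra cost plus a small per-query context cost.
    per_query = RAG_CONTEXT_TOKENS * INPUT_PRICE_PER_M / 1_000_000
    return RAG_FIXED_MONTHLY + queries_per_day * 30 * per_query

for qpd in (1, 10, 100):
    print(f"{qpd:>3}/day  full-PDF ${monthly_cost_full_pdf(qpd):7.2f}"
          f"  RAG ${monthly_cost_rag(qpd):6.2f}")
```

With these assumed numbers, full-PDF wins at one query a day, and RAG wins from roughly ten queries a day upward, which is the same ballpark as our break-even figure above.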

At the time, this confirmed our belief that homebrew RAG was still critically important. Then Google dropped Gemini File Search, and I think the decision is no longer that simple.

The Gemini File Search - An Example

I built an example app for the camera manual Q&A use case, based on the Google AI Studio example. It is open source on GitHub, so you can try it very quickly. Here is a screenshot of the user interface and the chat thread.


Example Q&A with PDFs using Gemini File Search:

https://github.com/zbruceli/pdf_qa

The main steps involved in the source code:

  1. Create a file search store and persist it across sessions.
  2. Upload multiple files concurrently; the Google backend handles all the chunking and embedding. It even generates sample questions for the users. In addition, you can modify the chunking strategy and upload custom metadata.
  3. Run a standard generation query (RAG): behind the scenes, it is agentic and can assess the quality of the results before generating the final answer.
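The three steps above can be sketched with the google-genai Python SDK roughly as follows. This is a sketch, not my app's actual code: the store name, model name, and file path are placeholders, the long-running upload operation should be polled until done in real code, and you should verify the exact signatures against the official File Search documentation.

```python
import os

def ask_manual(question: str, pdf_path: str = "manuals/nikon_f3.pdf") -> str:
    # Imports kept local so the sketch can be read without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client()  # reads GEMINI_API_KEY from the environment

    # 1. Create a file search store (persist store.name to reuse it
    #    across sessions instead of re-creating and re-uploading).
    store = client.file_search_stores.create(
        config={"display_name": "camera-manuals"}
    )

    # 2. Upload a PDF; chunking and embedding happen on the Google backend.
    #    This returns a long-running operation that real code should poll.
    client.file_search_stores.upload_to_file_search_store(
        file_search_store_name=store.name, file=pdf_path
    )

    # 3. Standard generation query with the file_search tool attached;
    #    retrieval and context injection happen behind the scenes.
    resp = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=question,
        config=types.GenerateContentConfig(
            tools=[types.Tool(
                file_search=types.FileSearch(
                    file_search_store_names=[store.name]
                )
            )]
        ),
    )
    return resp.text

if __name__ == "__main__" and os.environ.get("GEMINI_API_KEY"):
    print(ask_manual("How do I reset the frame counter?"))
```

Note how little surface area remains: no embedding model, no vector DB client, no prompt assembly; the whole middleware layer collapses into the `file_search` tool configuration.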

More Developer Information

Gemini File Search API doc

https://ai.google.dev/gemini-api/docs/file-search

Tutorial by Philipp Schmid

https://www.philschmid.de/gemini-file-search-javascript

Pricing of Gemini File Search

  • Developers are charged for embeddings at indexing time based on existing embeddings pricing ($0.15 per 1M tokens).
  • Storage is free of charge.
  • Query time embeddings are free of charge.
  • Retrieved document tokens are charged as regular context tokens.
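At the published $0.15 per 1M token embedding rate, the one-time indexing cost stays small even for a large corpus. Here is a worked example for the camera manual archive; the average-tokens-per-manual figure is an assumption for illustration.

```python
# One-time indexing cost at the published rate of $0.15 per 1M embedded tokens.
EMBED_PRICE_PER_M = 0.15
manuals           = 9_000    # size of the camera manual archive
tokens_per_manual = 50_000   # assumed average per scanned manual after OCR

total_tokens  = manuals * tokens_per_manual
one_time_cost = total_tokens / 1_000_000 * EMBED_PRICE_PER_M
print(f"{total_tokens:,} tokens -> ${one_time_cost:.2f} one-time indexing cost")
```

Under these assumptions, embedding all 9,000 manuals costs well under $100 once, after which storage and query-time embeddings are free and only the retrieved context tokens are billed per query.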

So, Which Is Better?

Since Gemini File Search is still fairly new, my assessment is based purely on about a week of initial testing.

Capability Comparison

Gemini File Search has all the basic features of a homebrew RAG system:

  • Chunking (can configure size and overlap)
  • Embedding
  • Vector DB supporting custom metadata input
  • Retrieval
  • Generative output

And more advanced features under the hood:

  • Agentic capability to assess retrieval quality

If I have to nitpick, image output is currently missing. So far, the output of Gemini File Search is limited to text only, while a custom-built RAG can return images from the scanned PDF. I imagine it won’t be too difficult for Gemini File Search to offer multi-modal output in the future.

Performance Comparison

  • Accuracy: on par. There is no tangible improvement in retrieval or generation quality.
  • Speed: mostly on par. Gemini File Search might be slightly faster, since the vector DB and LLM both sit inside the Google Cloud infrastructure.

Cost Comparison

Finally, Gemini File Search is a fully hosted system that might cost less than a homebrew system.

The embedding of documents runs only once and costs $0.15 per million tokens. This is a fixed cost common to all RAG systems, and it can be amortized over the lifespan of the document Q&A application. In my camera-manual use case, this fixed cost is a very small portion of the total cost.

Since Gemini File Search offers free file storage and database hosting, this is a saving over the homebrew RAG system.

Inference cost is about the same, since the input tokens (question plus vector search results as context) and output tokens are comparable between Gemini File Search and the homebrew system.

Flexibility & Transparency for Tuning and Debugging

Naturally, Gemini File Search marries you to the Gemini models for embedding and inference. You essentially gain convenience while sacrificing flexibility and choice.

In terms of fine-tuning your RAG system, Gemini File Search provides some level of customization. For example, you can define a chunkingConfig during upload to specify parameters such as maxTokensPerChunk and maxOverlapTokens, and attach key-value pairs to a document via customMetadata.
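A sketch of an upload config using those two knobs is shown below. The field names follow the File Search documentation, but treat the exact nesting and value types as assumptions to verify against the API reference; the metadata values here are made up for the camera-manual use case.

```python
# Hypothetical upload config for one camera manual, combining the
# chunkingConfig and customMetadata options mentioned above.
upload_config = {
    "chunking_config": {
        "white_space_config": {
            "max_tokens_per_chunk": 200,  # smaller chunks -> tighter retrieval
            "max_overlap_tokens": 20,     # overlap preserves cross-chunk context
        }
    },
    "custom_metadata": [
        {"key": "brand", "string_value": "Nikon"},   # filterable at query time
        {"key": "year", "numeric_value": 1980},
    ],
}
print(upload_config["chunking_config"]["white_space_config"])
```

Metadata like `brand` is what lets a query scope retrieval to a single camera's manual instead of searching the whole 9,000-manual store.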

However, there seems to be no way to obtain an internal trace of the Gemini File Search system for debugging and performance tuning, so you are using it more or less as a black box.

Conclusions

Google’s Gemini File Search is good enough for most applications and most people at a very attractive price. It is super easy to use and has minimal operational overhead. It is not only good for quick prototyping and mock-ups, but also good enough for a production system with thousands of users.

However, there are a few scenarios in which you might still consider a homebrew RAG system:

  • You don’t trust Google to host your proprietary documents.
  • You need to return images to the user from the original documents.
  • You want full flexibility and transparency in terms of which LLM to use for embedding and inference, how to do chunking, how to control the agentic flow of the RAG, and how to debug potential retrieval quality issues.

So, give Gemini File Search a try and decide for yourself. You can use Google AI Studio as a playground, or you can use my example code on GitHub. Please comment below with your findings for your use cases.

