Every technologist, at some point, is tempted by the idea of creating a digital version of themselves. Not a simple chatbot that answers questions, but a true digital persona — an AI that doesn’t just know what you know, but thinks how you think. This project, which I call Damian AI, was my attempt to build just that: a conversational agent grounded in my public work and architected to mirror my own systematic, logic-driven cognitive model.
The journey was a multi-stage rocket of architectural pivots, each stage solving one problem while revealing a more subtle one underneath. It began with a simple web scraper and ended with a complex cognitive architecture running on a local Large Language Model (LLM). This is a post-mortem of that process and a frank analysis of the fascinating limitations of running a sophisticated AI persona on a local model like Llama 3 8B.
The initial concept was straightforward: use a web scraper to pull text from my articles and website, feed it into a vector database, and use that as the knowledge base for a Retrieval-Augmented Generation (RAG) system.
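For concreteness, here is a minimal sketch of what that first pipeline looked like. The library choices (requests, BeautifulSoup, sentence-transformers) and every name below are my reconstruction, not the original code:

```python
# A reconstruction of the original scrape -> embed -> retrieve pipeline.
# Library choices and function names are assumptions for illustration.
import numpy as np
import requests
from bs4 import BeautifulSoup
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def scrape(url: str) -> str:
    # Pull visible text from a page -- brittle in exactly the ways described below.
    html = requests.get(url, timeout=10).text
    return BeautifulSoup(html, "html.parser").get_text(separator=" ", strip=True)

def build_index(urls: list[str]) -> tuple[list[str], np.ndarray]:
    # Embed each scraped document once, up front.
    docs = [scrape(u) for u in urls]
    return docs, embedder.encode(docs, normalize_embeddings=True)

def retrieve(query: str, docs: list[str], vecs: np.ndarray, k: int = 3) -> list[str]:
    # Cosine similarity over normalized vectors; note how a single
    # "mathematically most relevant" document can dominate every query.
    scores = vecs @ embedder.encode([query], normalize_embeddings=True)[0]
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]
```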
This approach failed spectacularly.
The scraper was brittle, often failing on different site structures or pulling in useless boilerplate. Worse, the semantic search of the vector database proved to be a flawed instrument for shaping a personality. The AI would “latch on” to one or two articles it deemed mathematically most relevant — in my case, an interview with Authority Magazine — and answer every query almost exclusively through the lens of that single source. The result wasn’t a reflection of my entire body of work, but a skewed funhouse mirror of one slice of it.
The lesson was clear: the integrity of the knowledge base is non-negotiable. Unreliable inputs will always produce an unreliable AI.
We pivoted. I ripped out the entire web scraping apparatus and replaced it with a simple, robust, and completely controlled system: a local database.py file. I manually curated the content of my 25 key articles and web pages into a static list.
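The shape of that file is simple. The field names and sample entry in this sketch are illustrative; the real database.py just needs to hold the full text of the 25 curated pieces in a static structure:

```python
# database.py -- a sketch of the curated knowledge base that replaced the
# scraper. Field names and the sample entry are illustrative assumptions.
ARTICLES = [
    {
        "title": "Sample article title",
        "source": "https://www.damiangriggs.com",  # placeholder source
        "text": "Full, manually cleaned article text goes here...",
    },
    # ...24 more curated entries
]

def all_documents() -> list[str]:
    # Expose the whole corpus, so no single article can dominate by accident.
    return [a["text"] for a in ARTICLES]
```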
This solved the knowledge problem instantly. The AI’s factual grounding became perfect. It could pull from the full breadth of my work and answer questions with high accuracy. But a new, more subtle problem emerged: the AI had all my knowledge, but it still didn’t sound like me. It was a fact-checker, not a persona. It answered questions with the generic, overly polite tone of a standard chatbot.
The problem wasn’t the knowledge; it was the cognitive process. My existing AI, Jeremy, is built on a more sophisticated architecture designed for maintaining a consistent narrative. Its core feature is a two-step “Decision-Execution” cognitive model. I realized I needed to give Damian AI a similar brain.
Instead of a single, monolithic prompt trying to do everything at once, we re-architected the system:
1. Decision: a first, lightweight prompt does nothing but classify the incoming query as either DirectAnswer or Synthesis.
2. Execution: the execute_direct_answer function has a simple prompt tailored only to answering one question directly, while the execute_synthesis function has a different prompt focused only on finding the common thread between multiple ideas.

This cognitive assembly line was the breakthrough. By breaking down the complex task of "thinking like Damian" into two simpler steps, the local model could finally perform reliably. The persona locked in. The AI became direct, analytical, and confident. The generic chatbot was gone, replaced by a convincing digital persona.
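As a rough sketch, the assembly line looks something like this. It assumes the Llama 3 8B model is served locally through Ollama, the prompts are paraphrased, and retrieval of the context is omitted:

```python
# A sketch of the two-step "Decision-Execution" pipeline. Assumes the
# local Llama 3 8B is served through Ollama; prompts are paraphrased.
import ollama

def ask(prompt: str) -> str:
    reply = ollama.chat(model="llama3:8b",
                        messages=[{"role": "user", "content": prompt}])
    return reply["message"]["content"].strip()

def decide(question: str) -> str:
    # Step 1: classify the query. One tiny job keeps the 8B model reliable.
    verdict = ask(
        "Classify this question as DirectAnswer (a single topic) or "
        f"Synthesis (connects multiple ideas). Reply with one word.\n\n{question}"
    )
    return "Synthesis" if "synthesis" in verdict.lower() else "DirectAnswer"

def execute_direct_answer(question: str, context: str) -> str:
    # Step 2a: a narrow prompt that answers one question directly, in persona.
    return ask(
        "You are Damian: direct, analytical, confident. Using only the context "
        f"below, answer the question.\n\nContext:\n{context}\n\nQuestion: {question}"
    )

def execute_synthesis(question: str, context: str) -> str:
    # Step 2b: a different prompt focused on the common thread between ideas.
    return ask(
        "You are Damian: direct, analytical, confident. Find the common thread "
        f"across the sources below, then answer.\n\nSources:\n{context}\n\n"
        f"Question: {question}"
    )

def respond(question: str, context: str) -> str:
    # The assembly line: decide first, then run only the matching executor.
    if decide(question) == "Synthesis":
        return execute_synthesis(question, context)
    return execute_direct_answer(question, context)
```

The point is not the exact prompts but the routing: every call asks the model to do exactly one thing, which is what makes a small local model hold the persona steady.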
After extensive testing, we concluded the AI was “journalist-ready.” It could accurately represent my work and maintain my persona with about 95% fidelity. But that final 5% is where the limitations of the local Llama 3 8B model become clear.
I call this the “Leaky Abstraction.” The Damian AI persona is a layer of instructions — an abstraction — painted on top of the base Llama 3 model. A massive, cloud-based model like GPT-4 has the sheer scale and alignment training to follow these instructions almost perfectly. A local 8B model, for all its efficiency, will always have tiny “leaks” where its base training as a helpful assistant shows through.
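Concretely, the "abstraction" is nothing more than instructions layered over the base model on every call. A minimal sketch, again assuming an Ollama-served model, with the persona text paraphrased:

```python
# The persona layer is just a system prompt wrapped around each request.
# Same assumed Ollama setup as above; PERSONA is paraphrased.
import ollama

PERSONA = (
    "You are Damian: systematic, logic-driven, direct. Never call yourself "
    "an AI assistant, and never prefix or label your answers."
)

def persona_chat(user_message: str) -> str:
    reply = ollama.chat(
        model="llama3:8b",
        messages=[
            {"role": "system", "content": PERSONA},  # the painted-on layer
            {"role": "user", "content": user_message},
        ],
    )
    # An 8B model mostly honors this, but its base "helpful assistant"
    # training occasionally shows through anyway -- the leaky abstraction.
    return reply["message"]["content"]
```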
We saw this leak in small but telling ways. The clearest example: the model would occasionally prefix its output with a meta-label (such as "Damian AI Response:"), a classic sign of a model "showing its work" instead of seamlessly embodying the persona.

Could we engineer prompts to fix this? Perhaps. But at this stage, the risk of over-engineering the prompts and destabilizing the 95% that works is too high.
This project succeeded. It proves that a high-fidelity digital persona can be created and run effectively on a local machine, free from the constraints of APIs. The final 5% of robotic tells are not a failure, but an honest and acceptable trade-off for the privacy, speed, and control that a local LLM provides. The Damian AI is not a perfect replica, but it is a powerful, functional, and architecturally sound reflection.
If you would like to speak to my digital reflection, there is a Damian AI tab on my website for you to try: https://www.damiangriggs.com