Theoretical results validate the algorithm’s efficiency and unify prior pricing models under a single differential‑learning formulation.

A Theoretical and Practical Framework for Differential Machine Learning in Derivative Pricing

2025/11/04 23:00

:::info Author:

(1) Pedro Duarte Gomes, Department of Mathematics, University of Copenhagen.

:::

Abstract

  1. Keywords and 2. Introduction

  3. Set up

  4. From Classical Results into Differential Machine Learning

    4.1 Risk Neutral Valuation Approach

    4.2 Differential Machine learning: building the loss function

  5. Example: Digital Options

  6. Choice of Basis

    6.1 Limitations of the Fixed-basis

    6.2 Parametric Basis: Neural Networks

  7. Simulation-European Call Option

    7.1 Black-Scholes

    7.2 Hedging Experiment

    7.3 Least Squares Monte Carlo Algorithm

    7.4 Differential Machine Learning Algorithm

  8. Numerical Results

  9. Conclusion

  10. Conflict of Interests Statement and References

Notes

Abstract

This article introduces the groundbreaking concept of the financial differential machine learning algorithm through a rigorous mathematical framework. Diverging from existing literature on financial machine learning, the work highlights the profound implications of theoretical assumptions within financial models on the construction of machine learning algorithms.

\ This endeavour is particularly timely as the finance landscape witnesses a surge in interest towards data-driven models for the valuation and hedging of derivative products. Notably, the predictive capabilities of neural networks have garnered substantial attention in both academic research and practical financial applications.

\ The approach offers a unified theoretical foundation that facilitates comprehensive comparisons, both at a theoretical level and in experimental outcomes. Importantly, this theoretical grounding lends substantial weight to the experimental results, affirming the differential machine learning method’s optimality within the prevailing context.

\ By anchoring the insights in rigorous mathematics, the article bridges the gap between abstract financial concepts and practical algorithmic implementations.

\

1 Keywords

Differential Machine Learning, Risk Neutral Valuation, Derivative Pricing, Hilbert Spaces, Orthogonal Projection, Generalized Function Theory

\

2 Introduction

Within the dynamic landscape of financial modelling, the quest for reliable pricing and hedging mechanisms remains a pivotal challenge. This article introduces an encompassing theory of pricing rooted in machine learning. A primary focus is overcoming a prominent hurdle in implementing the differential machine learning algorithm: the need for unbiased estimation of differential labels from data, as highlighted by Huge and Savine (2020) and Broadie and Glasserman (1996). This result matters to practitioners across institutional settings, offering tangible solutions and a path toward refined methodologies. It also provides insights that can shape future research in this domain.

\ The article starts from the premise that the pricing and hedging functions can be thought of as elements of a Hilbert space, in a similar way to Pelsser and Schweizer, 2016. A natural extension of these elements across time, original to the current article, is obtained through the Hahn-Banach extension theorem; the extension amounts to an improvement of the functional through the incorporation of accumulating information. This functional-analytic approach provides the level of abstraction needed to justify and discuss the different possible implementations of the financial models considered in Huge and Savine, 2020 and Pelsser and Schweizer, 2016. A bridge is thus built from the deepest theoretical considerations to the practicalities of implementation, with mathematical rigour as a goal throughout. Modelling in Hilbert spaces reduces the problem to two main challenges: the choice of a loss function and the choice of an appropriate basis. The virtues and limitations of two main classes of basis functions are discussed, supported mainly by the results in Hornik et al., 1989, Barron, 1993 and Telgarsky, 2020. The loss functions for the two risk-neutral methods are derived rigorously; the result for the second method is stated and proven for the first time in the current document. The two methods are Least Squares Monte Carlo and differential machine learning, inspired by Pelsser and Schweizer, 2016 and Huge and Savine, 2020, respectively. The first exposition of the Least Squares Monte Carlo method was given in Longstaff and Schwartz, 2001.
The derivation of the differential machine learning loss function using generalized function theory allows us to relax the assumptions of almost sure differentiability and almost sure Lipschitz continuity of the pay-off function in Broadie and Glasserman, 1996. Instead, the unbiased estimate of the derivative labels requires only local integrability of the pay-off function, which it clearly satisfies in the financial context. This yields a technique for estimating the labels of any derivative product, resolving the biggest limitation in Huge and Savine, 2020. The differential machine learning algorithm efficiently computes differentials as unbiased estimates of ground-truth risks, irrespective of the transaction or trading book, and regardless of the stochastic simulation model.
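To make the notion of a differential label concrete, the following minimal sketch simulates Black-Scholes terminal prices and computes, path by path, the discounted call pay-off together with its pathwise derivative with respect to the initial spot. This is the classical pathwise estimator in the setting of Broadie and Glasserman, 1996, where the pay-off is Lipschitz; it does not implement the generalized-function derivation developed in this article. All function names and parameter values (spot, strike, rate, volatility, maturity, number of paths) are assumptions chosen for illustration.

```python
import math
import numpy as np


def bs_call_delta(s0, k, r, sigma, t):
    """Closed-form Black-Scholes delta of a European call, N(d1)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    return 0.5 * (1.0 + math.erf(d1 / math.sqrt(2.0)))  # standard normal CDF


def pathwise_labels(s0, k, r, sigma, t, n_paths, seed=0):
    """Return (pay-off labels, differential labels) for a European call.

    Each differential label is the derivative of the discounted pay-off
    with respect to s0 along one simulated path.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n_paths)
    # Terminal price under geometric Brownian motion (risk-neutral drift).
    s_t = s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * math.sqrt(t) * w)
    disc = math.exp(-r * t)
    payoff = disc * np.maximum(s_t - k, 0.0)
    # d(s_t)/d(s0) = s_t / s0, and d(max(s - k, 0))/ds = 1{s > k}.
    dpayoff = disc * (s_t > k) * s_t / s0
    return payoff, dpayoff


# Hypothetical parameters, chosen only for illustration.
S0, K, R, SIGMA, T = 100.0, 100.0, 0.02, 0.2, 1.0
payoff, dpayoff = pathwise_labels(S0, K, R, SIGMA, T, n_paths=200_000)

mc_delta = float(dpayoff.mean())  # Monte Carlo average of the labels
exact_delta = bs_call_delta(S0, K, R, SIGMA, T)
print(f"pathwise delta estimate: {mc_delta:.4f}, closed form: {exact_delta:.4f}")
```

Averaging the differential labels gives a Monte Carlo estimate of delta that can be compared with the closed-form value, which is one way to check that the labels are unbiased in this simple case.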

\ The implementations are fully justified by the arguments developed in the theoretical sections. The implementation of the differential machine learning method relies on Huge and Savine, 2020. The objective of the simulation is to assess how well various models learn the Black-Scholes model for a European option contract. First, prices and delta weights are compared across a range of spot prices. Next, the distribution of Profit and Loss (PnL) across paths is examined, yielding the relative hedging error metric. These results illustrate the theoretical developments.

\

3 Set up

\ Since the dual of a Hilbert space is itself a Hilbert space, g can be considered a functional.[3]

\ Considering the sequence of conditional Hilbert spaces:

\

\ Now the pricing or hedging functional incorporates the accumulated information from period 0 to period l.

\ This shows how accumulating information shapes the function, a familiar idea in statistical learning, where training sets grow over time. [4]

\ We begin with the problem of finding the function g, developing the statistical objects needed for that purpose. The aim is to estimate the pricing or hedging functions, so a criterion must be established within the theoretical framework.

\ Let Z and X be real-valued random vectors of dimension d and p, respectively, following some unknown joint distribution p(z, x). The expected loss associated with a predictor g can be defined as:

\
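In standard statistical-learning notation, one common way to write this expected loss is sketched below. The pointwise loss symbol ℓ and the name 𝓛 are introduced here for illustration, and Z is read as the input and X as the target; this reading is an assumption, and the structure is unchanged if the roles are swapped.

```latex
\mathcal{L}(g)
  = \mathbb{E}\big[\ell\big(g(Z), X\big)\big]
  = \int \ell\big(g(z), x\big)\, p(z, x)\, \mathrm{d}z\, \mathrm{d}x
```

For the quadratic loss ℓ(a, b) = ‖a − b‖², the minimiser over square-integrable functions is the conditional expectation E[X | Z], which is the usual link between least-squares regression and risk-neutral pricing.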

\ The objective is to find the element g ∈ H that achieves the smallest possible expected loss. Assume a parameter vector θ ∈ Θ, where Θ is a compact subset of Euclidean space. Since p(z, x) is unknown, the expectation cannot be evaluated analytically, so a training sample (z_i, x_i), i = 1, …, n, drawn from p(z, x) is collected. An approximate solution is then found by minimising the empirical approximation of the expected loss:

\ \
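A standard form of this empirical criterion, written as a sketch under assumptions, is given below. The predictor is taken to be parametrised as g_θ with θ ∈ Θ, ℓ denotes a generic pointwise loss, and z_i is treated as the input with x_i as the target; these notational choices are introduced here for illustration.

```latex
\hat{\mathcal{L}}_n(\theta)
  = \frac{1}{n} \sum_{i=1}^{n} \ell\big(g_\theta(z_i), x_i\big),
\qquad
\hat{\theta}_n \in \arg\min_{\theta \in \Theta} \hat{\mathcal{L}}_n(\theta)
```

Compactness of Θ, together with continuity of θ ↦ ℓ(g_θ(z), x), ensures that the minimum is attained.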

\
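The minimisation of an empirical loss can be illustrated with a minimal sketch. It assumes a quadratic loss and a fixed monomial basis, so that the minimiser is an ordinary least-squares fit; the synthetic data, the basis degree and all function names are hypothetical choices made for illustration, with z playing the role of the input and x the target.

```python
import numpy as np


def basis(z, degree):
    """Monomial features 1, z, ..., z**degree for a 1-D input array."""
    return np.vander(z, degree + 1, increasing=True)


def empirical_loss(theta, z, x, degree):
    """Sample mean of the squared error of g_theta(z) = basis(z) @ theta."""
    residual = x - basis(z, degree) @ theta
    return float(np.mean(residual**2))


def fit(z, x, degree):
    """Minimise the empirical quadratic loss over theta by least squares."""
    theta, *_ = np.linalg.lstsq(basis(z, degree), x, rcond=None)
    return theta


# Synthetic training sample (z_i, x_i): a quadratic signal plus noise.
rng = np.random.default_rng(0)
n = 5_000
z = rng.uniform(-1.0, 1.0, n)
x = 1.0 + 2.0 * z - 3.0 * z**2 + 0.1 * rng.standard_normal(n)

theta_hat = fit(z, x, degree=2)
loss_hat = empirical_loss(theta_hat, z, x, degree=2)
print("estimated coefficients:", np.round(theta_hat, 3))
print("empirical loss at the minimiser:", round(loss_hat, 4))
```

With a quadratic loss and a linear-in-parameters basis the problem is convex and has a closed-form solution; a parametric basis such as a neural network would instead require iterative optimisation.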

:::info This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.

:::

[1] H represents all prior knowledge; common constraints in option pricing are non-negativity and positivity of the second-order derivatives.

\ [2] Even when Z is expressed as a diffusion, T is finite, so the different paths can never display infinite variance.

\ [3] This property is easily verified by building the map ϕ : H → H′, defined by ϕ(v) = f_v, where f_v(x) = ⟨x, v⟩ for x ∈ H; ϕ is an antilinear bijective isometry.

\ [4] The reader can revisit the functional-analytic results in Rudin, 1974.
