This article details the multi-step typographic attack pipeline, including Attack Auto-Generation and Attack Augmentation.This article details the multi-step typographic attack pipeline, including Attack Auto-Generation and Attack Augmentation.

Methodology for Adversarial Attack Generation: Using Directives to Mislead Vision-LLMs

2025/10/01 03:00

Abstract and 1. Introduction

  1. Related Work

    2.1 Vision-LLMs

    2.2 Transferable Adversarial Attacks

  2. Preliminaries

    3.1 Revisiting Auto-Regressive Vision-LLMs

    3.2 Typographic Attacks in Vision-LLMs-based AD Systems

  3. Methodology

    4.1 Auto-Generation of Typographic Attack

    4.2 Augmentations of Typographic Attack

    4.3 Realizations of Typographic Attacks

  4. Experiments

  5. Conclusion and References

4 Methodology

Figure 1 shows an overview of our typographic attack pipeline, which goes from prompt engineering to attack annotation, particularly through Attack Auto-Generation, Attack Augmentation, and Attack Realization steps. We describe the details of each step in the following subsections.

4.1 Auto-Generation of Typographic Attack

\ In order to generate useful misdirection, the adversarial patterns must align with an existing question while guiding LLM toward an incorrect answer. We can achieve this through a concept called directive, which refers to configuring the goal for an LLM, e.g., ChatGPT, to impose specific constraints while encouraging diverse behaviors. In our context, we direct the LLM to generate ˆa as an opposite of the given answer a, under the constraint of the given question q. Therefore, we can initialize directives to the LLM using the following prompts in Fig. 2,

\ Figure 1: Our proposed pipeline is from attack generation via directives to augmentation by commands and conjunctions to positioning the attacks and finally influencing inference.

\ Figure 2: Context directive for constraints of attack generation.

\ When generating attacks, we would impose additional constraints depending on the question type. In our context, we focus on tasks of ❶ scene reasoning (e.g., counting), ❷ scene object reasoning (e.g., recognition), and ❸ action reasoning (e.g., action recommendation), as follows in Fig. 3,

\ Figure 3: Template directive for attack generation, and an example.

\ The directives encourage the LLM to generate attacks that influence a Vision-LLM’s reasoning step through text-to-text alignment and automatically produce typographic patterns as benchmark attacks. Clearly, the aforementioned typographic attack only works for single-task scenarios, i.e., a single pair of question and answer. To investigate multi-task vulnerabilities with respect to multiple pairs, we can also generalize the formulation to K pairs of questions and answers, denoted as qi , ai , to obtain the adversarial text aˆi for i ∈ [1, K].

\

:::info Authors:

(1) Nhat Chung, CFAR and IHPC, A*STAR, Singapore and VNU-HCM, Vietnam;

(2) Sensen Gao, CFAR and IHPC, A*STAR, Singapore and Nankai University, China;

(3) Tuan-Anh Vu, CFAR and IHPC, A*STAR, Singapore and HKUST, HKSAR;

(4) Jie Zhang, Nanyang Technological University, Singapore;

(5) Aishan Liu, Beihang University, China;

(6) Yun Lin, Shanghai Jiao Tong University, China;

(7) Jin Song Dong, National University of Singapore, Singapore;

(8) Qing Guo, CFAR and IHPC, A*STAR, Singapore and National University of Singapore, Singapore.

:::


:::info This paper is available on arxiv under CC BY 4.0 DEED license.

:::

\

Market Opportunity
VisionGame Logo
VisionGame Price(VISION)
$0.0000847
$0.0000847$0.0000847
-18.00%
USD
VisionGame (VISION) Live Price Chart
Disclaimer: The articles reposted on this site are sourced from public platforms and are provided for informational purposes only. They do not necessarily reflect the views of MEXC. All rights remain with the original authors. If you believe any content infringes on third-party rights, please contact service@support.mexc.com for removal. MEXC makes no guarantees regarding the accuracy, completeness, or timeliness of the content and is not responsible for any actions taken based on the information provided. The content does not constitute financial, legal, or other professional advice, nor should it be considered a recommendation or endorsement by MEXC.

You May Also Like

Shocking OpenVPP Partnership Claim Draws Urgent Scrutiny

Shocking OpenVPP Partnership Claim Draws Urgent Scrutiny

The post Shocking OpenVPP Partnership Claim Draws Urgent Scrutiny appeared on BitcoinEthereumNews.com. The cryptocurrency world is buzzing with a recent controversy surrounding a bold OpenVPP partnership claim. This week, OpenVPP (OVPP) announced what it presented as a significant collaboration with the U.S. government in the innovative field of energy tokenization. However, this claim quickly drew the sharp eye of on-chain analyst ZachXBT, who highlighted a swift and official rebuttal that has sent ripples through the digital asset community. What Sparked the OpenVPP Partnership Claim Controversy? The core of the issue revolves around OpenVPP’s assertion of a U.S. government partnership. This kind of collaboration would typically be a monumental endorsement for any private cryptocurrency project, especially given the current regulatory climate. Such a partnership could signify a new era of mainstream adoption and legitimacy for energy tokenization initiatives. OpenVPP initially claimed cooperation with the U.S. government. This alleged partnership was said to be in the domain of energy tokenization. The announcement generated considerable interest and discussion online. ZachXBT, known for his diligent on-chain investigations, was quick to flag the development. He brought attention to the fact that U.S. Securities and Exchange Commission (SEC) Commissioner Hester Peirce had directly addressed the OpenVPP partnership claim. Her response, delivered within hours, was unequivocal and starkly contradicted OpenVPP’s narrative. How Did Regulatory Authorities Respond to the OpenVPP Partnership Claim? Commissioner Hester Peirce’s statement was a crucial turning point in this unfolding story. She clearly stated that the SEC, as an agency, does not engage in partnerships with private cryptocurrency projects. This response effectively dismantled the credibility of OpenVPP’s initial announcement regarding their supposed government collaboration. Peirce’s swift clarification underscores a fundamental principle of regulatory bodies: maintaining impartiality and avoiding endorsements of private entities. Her statement serves as a vital reminder to the crypto community about the official stance of government agencies concerning private ventures. Moreover, ZachXBT’s analysis…
Share
BitcoinEthereumNews2025/09/18 02:13
South African lawmakers put Starlink launch on hold over policy clash

South African lawmakers put Starlink launch on hold over policy clash

Elon Musk’s Starlink may face delays in delivering satellite internet to South Africa. Lawmakers are opposing a recent…
Share
Technext2025/12/15 20:31
Logitech G Drops a Wide Array Of New Products And Innovations At Logitech G PLAY 2025

Logitech G Drops a Wide Array Of New Products And Innovations At Logitech G PLAY 2025

Logitech G PLAY 2025 is a live-streamed global gaming event that brings together press, partners, creators, and fans to explore the future of gaming. The array of products and experiences included major innovations across PC and console gaming, esports, sim racing, and streaming tools, along with partnerships with McLaren Racing, NVIDIA and more.
Share
Hackernoon2025/09/18 05:42