This article introduces a novel disk scrubbing framework powered by Mondrian Conformal Prediction (MCP) to optimize maintenance in data storage systems. The approach uses system and storage statistics (including SMART parameters, Background Media Scanning (BMS) data, and CPU/disk utilization metrics) to predict drive health and workload patterns. By turning these predictions into scrubbing frequencies and schedules, the system intelligently prioritizes drives that require attention, thereby reducing downtime, extending disk lifespan, and improving overall storage reliability.

Why Predictive AI Might Be the Future of Disk Hygiene

Abstract and 1. Introduction

  2. Motivation and design goals

  3. Related Work

  4. Conformal prediction

    4.1. Mondrian conformal prediction (MCP)

    4.2. Evaluation metrics

  5. Mondrian conformal prediction for Disk Scrubbing: our approach

    5.1. System and Storage statistics

    5.2. Which disk to scrub: Drive health predictor

    5.3. When to scrub: Workload predictor

  6. Experimental setting and 6.1. Open-source Baidu dataset

    6.2. Experimental results

  7. Discussion

    7.1. Optimal scheduling aspect

    7.2. Performance metrics and 7.3. Power saving from selective scrubbing

  8. Conclusion and References

5. Mondrian conformal prediction for Disk Scrubbing: our approach

In contrast to the conventional studies mentioned above, we propose a novel approach for disk drive scrubbing based on Mondrian conformal prediction to quantitatively assess the health status of disk drives and use it as a metric for selecting drives for scrubbing. Figure 1 shows a high-level overview of the proposed method.

Figure 1: Overall approach of Mondrian conformal disk drive scrubbing.

The proposed architecture consists of three subsystems. The first subsystem collects storage and system statistics: it retrieves disk drive data from the storage array and captures the CPU and disk busy statuses. The second subsystem, the drive health predictor engine, predicts the health status of the drives. It uses MCP to output two sets: a set of "No concern" drives, i.e. either unhealthy or dying drives that can be flagged for manual diagnostics by experts (not discussed in this paper) or completely healthy drives that do not need any scrubbing, and a set of "Concern" drives with health scores assigned from the predictor's confidence. The scrubbing frequency indicator then turns these health scores into scrubbing frequencies. The underlying non-conformity score is the margin error function. The third subsystem, the workload predictor engine, first predicts the utilization percentage of the resources from SAR logs[2], and then combines this result with the scrubbing frequencies to schedule when and how frequently disk drive scrubbing is performed. Finally, the scrubbing operation is triggered on the storage array based on the scrubbing cycle. In the following subsections, each component of the overall architecture is described in detail.
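To make the drive health predictor described above more concrete, here is a minimal sketch of class-conditional (Mondrian) conformal p-values with a margin-style non-conformity score. It is an illustration under assumptions, not the authors' implementation: the label encoding (0 = healthy, 1 = failing), the function names, and the exact form of the margin score (highest probability among the other labels minus the probability of the candidate label) are all hypothetical, and the class probabilities are assumed to come from any underlying classifier trained on the drive features.

```python
from typing import Dict, List, Sequence, Set

def margin_score(probs: Sequence[float], label: int) -> float:
    """Margin non-conformity (assumed form): best competing probability
    minus the probability of `label`. Larger means less conforming."""
    others = [p for i, p in enumerate(probs) if i != label]
    return max(others) - probs[label]

def calibrate(cal_probs: Sequence[Sequence[float]],
              cal_labels: Sequence[int]) -> Dict[int, List[float]]:
    """Group calibration scores by true label (the Mondrian categories)."""
    scores: Dict[int, List[float]] = {}
    for probs, label in zip(cal_probs, cal_labels):
        scores.setdefault(label, []).append(margin_score(probs, label))
    return scores

def p_values(probs: Sequence[float],
             cal_scores: Dict[int, List[float]]) -> Dict[int, float]:
    """Per-label p-value, computed only against calibration examples
    of that same label."""
    result: Dict[int, float] = {}
    for label, scores in cal_scores.items():
        alpha = margin_score(probs, label)
        at_least = sum(1 for s in scores if s >= alpha)
        result[label] = (at_least + 1) / (len(scores) + 1)
    return result

def prediction_set(probs: Sequence[float],
                   cal_scores: Dict[int, List[float]],
                   significance: float) -> Set[int]:
    """Labels whose p-value exceeds the significance level."""
    return {lbl for lbl, p in p_values(probs, cal_scores).items()
            if p > significance}

# Hypothetical calibration data: (P(healthy), P(failing)) and true labels.
cal_probs = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.3), (0.6, 0.4),
             (0.2, 0.8), (0.3, 0.7), (0.1, 0.9), (0.45, 0.55)]
cal_labels = [0, 0, 0, 0, 1, 1, 1, 1]
cal_scores = calibrate(cal_probs, cal_labels)

new_drive = (0.95, 0.05)
pvals = p_values(new_drive, cal_scores)
labels = prediction_set(new_drive, cal_scores, significance=0.2)
```

Because each p-value is computed within a single class, the error guarantee holds per class rather than only on average, which matters when failing drives are much rarer than healthy ones. How the resulting p-values are converted into health scores and scrubbing frequencies is left out here, since this section does not specify that mapping.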

5.1. System and Storage statistics

The main components of this subsystem are:

• SMART: stands for Self-Monitoring, Analysis, and Reporting Technology, and refers to a set of predefined parameters provided by device manufacturers that offer insights into various aspects of a storage device's performance, including temperature, error rates, reallocated sectors, and more. Each attribute has a threshold value assigned by the manufacturer, indicating the acceptable limit for that parameter. When a parameter exceeds its threshold value, it may indicate a potential issue with the storage device. We use SMART parameters as input features for the drive health predictor engine.

• BMS: stands for Background Media Scanning, a passive process that differs from disk scrubbing, which actively scans the disk for errors during idle periods without reading or writing data. BMS scans the disk for errors in the background without interrupting normal operations. In our proposed architecture, we also extract the BMS feature, a numerical value for the number of times BMS encounters errors while scanning the same drive, and feed it to the drive health predictor engine.

• Disk and CPU busy time: The performance of a drive is heavily dependent on its critical processes, such as data access and write speed. The busy-time values are percentages ranging from 1 to 100 and change over time, with a sampling period of 1 hour. These system statistics are extracted from the SAR logs (standard logs for system utilization) and converted into time series data, which can then be used by the workload predictor engine.
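As a rough illustration of how the hourly busy-time statistics listed above might feed a scheduling decision, the sketch below averages past samples per hour of day and picks the quietest hours. This is a deliberately naive stand-in, not the workload predictor proposed in the paper: the sample layout `(hour_of_day, cpu_busy_pct, disk_busy_pct)`, the use of the larger of the two percentages as the combined load, and the function names are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# One hourly sample: (hour_of_day, cpu_busy_pct, disk_busy_pct).
Sample = Tuple[int, float, float]

def hourly_profile(samples: List[Sample]) -> Dict[int, float]:
    """Average combined load per hour of day.

    Assumption: the combined load of a sample is the larger of its
    CPU and disk busy percentages.
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for hour, cpu_busy, disk_busy in samples:
        buckets[hour % 24].append(max(cpu_busy, disk_busy))
    return {hour: mean(loads) for hour, loads in buckets.items()}

def quietest_hours(profile: Dict[int, float], count: int) -> List[int]:
    """Return the `count` hours with the lowest average load, in clock order."""
    ranked = sorted(profile, key=lambda h: (profile[h], h))
    return sorted(ranked[:count])

# Hypothetical samples, as if already parsed from SAR logs.
samples = [
    (0, 10.0, 20.0), (0, 30.0, 10.0),
    (1, 50.0, 60.0),
    (2, 5.0, 5.0),
    (3, 90.0, 40.0),
]
profile = hourly_profile(samples)
windows = quietest_hours(profile, count=2)
```

In a fuller pipeline, the per-hour averages would be replaced by the output of a proper time series forecaster, and the number of windows selected for each drive would follow from its scrubbing frequency.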


:::info This paper is available on arXiv under the CC BY-NC-ND 4.0 Deed (Attribution-Noncommercial-Noderivs 4.0 International) license.

:::


[2] The System Activity Report (SAR) is a command that provides information about different aspects of system performance, for example data on CPU usage, memory and paging, interrupts, device workload, network activity, and swap space utilization.


:::info Authors:

(1) Rahul Vishwakarma, California State University Long Beach, 1250 Bellflower Blvd, Long Beach, CA 90840, United States (rahuldeo.vishwakarma01@student.csullb.edu);

(2) Jinha Hwang, California State University Long Beach, 1250 Bellflower Blvd, Long Beach, CA 90840, United States (jinha.hwang01@student.csulb.edu);

(3) Soundouss Messoudi, HEUDIASYC - UMR CNRS 7253, Université de Technologie de Compiègne, 57 avenue de Landshut, 60203 Compiègne Cedex - France (soundouss.messoudi@hds.utc.fr);

(4) Ava Hedayatipour, California State University Long Beach, 1250 Bellflower Blvd, Long Beach, CA 90840, United States (ava.hedayatipour@csulb.edu).

:::
