Smartctl Command on Linux with Examples

Modern hard drives and solid-state drives incorporate Self-Monitoring, Analysis, and Reporting Technology (SMART) to track their health and performance metrics. Linux system administrators rely on the smartctl utility to access this critical information, enabling proactive drive maintenance and preventing catastrophic data loss. This comprehensive guide explores smartctl’s capabilities, from basic health checks to advanced diagnostic procedures.

SMART technology continuously monitors various drive attributes, including read error rates, temperature fluctuations, and operational hours. When potential issues arise, SMART provides early warnings that allow administrators to take preventive action before complete drive failure occurs.

What is SMART Technology and Why It Matters

Self-Monitoring, Analysis, and Reporting Technology represents an embedded feature in modern storage devices that assesses drive health while anticipating potential malfunctions. Every modern HDD and SSD includes SMART capabilities that monitor current status and health through various attributes.

SMART monitoring provides several critical benefits for system administrators. The technology enables early detection of drive problems, allowing for timely data backups and drive replacements. Temperature monitoring prevents overheating damage, while error rate tracking identifies deteriorating drive components before complete failure.

The smartctl utility serves as the primary interface for accessing SMART data on Linux systems. According to its man page, smartctl is a command-line utility designed to perform SMART tasks such as printing the SMART self-test and error logs, enabling and disabling SMART automatic testing, and initiating device self-tests. Its companion daemon, smartd, handles continuous background monitoring.

Installing Smartmontools Across Linux Distributions

The smartctl utility comes as part of the smartmontools package, which requires installation on most Linux distributions before use.

Ubuntu and Debian Installation

Ubuntu and Debian users can install smartmontools using the APT package manager. First, update the package database to ensure access to the latest versions:

sudo apt update

Install the smartmontools package with the following command:

sudo apt install smartmontools

Verify successful installation by checking the smartctl version:

smartctl --version

Red Hat, CentOS, and Fedora Installation

Red Hat-based distributions use different package managers depending on the version. For older systems using YUM:

sudo yum install smartmontools

For newer systems using DNF:

sudo dnf install smartmontools

After installation, enable and start the smartd service for continuous monitoring:

sudo systemctl enable smartd
sudo systemctl start smartd

Arch Linux Installation

Arch Linux users can install smartmontools using the pacman package manager:

sudo pacman -S smartmontools

The installation process remains consistent across most distributions, with minor variations in package manager commands.

Understanding Smartctl Command Structure and Syntax

The smartctl command follows a straightforward syntax pattern that accommodates various monitoring tasks. The general command structure appears as:

smartctl [options] device

Understanding device paths is crucial for effective smartctl usage. Linux systems typically identify storage devices as /dev/sda, /dev/sdb, /dev/sdc, and so forth. NVMe drives may appear as /dev/nvme0n1, /dev/nvme1n1, etc.

Essential command options provide access to different aspects of drive information:

-h or --help: Displays comprehensive help text
-i or --info: Shows device identity information
-c or --capabilities: Displays device SMART capabilities
-a or --all: Shows all available SMART information
-x or --xall: Displays extended comprehensive information
-H or --health: Checks device SMART health status
-A or --attributes: Shows SMART attributes and values
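-l TYPE or --log=TYPE: Prints the selected SMART log (for example error or selftest)
-t TEST or --test=TEST: Starts a SMART self-test (short, long, conveyance, or select,N-M)
-s VALUE or --smart=VALUE: Enables or disables SMART functionality (on or off)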

Performing Basic Drive Health Assessments

The most fundamental smartctl operation involves checking overall drive health status. This quick assessment provides immediate insight into drive condition without requiring detailed analysis.

Quick Health Status Check

Execute a basic health check using the -H flag:

sudo smartctl -H /dev/sda

Typical output resembles:

smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

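The wording depends on the drive type. ATA and NVMe drives report “SMART overall-health self-assessment test result: PASSED”, while SCSI and SAS drives, as in the sample above, report “SMART Health Status: OK”. Either result indicates the drive considers itself within normal parameters. A “FAILED” status calls for an immediate backup and likely drive replacement.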
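Because the status line differs between drive types, scripts that parse it should accept both forms. The following is a minimal sketch; health_verdict is a hypothetical helper name, and it assumes the status strings shown in this section.

```shell
# Classify `smartctl -H` output read from stdin.
# Prints PASS, FAIL, or UNKNOWN. health_verdict is a hypothetical helper name.
health_verdict() {
    awk '
        # ATA and NVMe drives
        /SMART overall-health self-assessment test result:/ { status = $NF }
        # SCSI and SAS drives
        /SMART Health Status:/                              { status = $NF }
        END {
            if (status == "PASSED" || status == "OK") print "PASS"
            else if (status != "")                    print "FAIL"
            else                                      print "UNKNOWN"
        }'
}

# Usage (needs root and a real device):
#   sudo smartctl -H /dev/sda | health_verdict
```

Treating a missing status line as UNKNOWN rather than PASS avoids silently passing drives that did not answer the query.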

Comprehensive Device Information

Retrieve detailed device information using the -i flag:

sudo smartctl -i /dev/sda

This command displays model number, serial number, firmware version, and SMART capability status. The output helps verify drive specifications and confirms SMART functionality availability.

Enabling SMART When Disabled

Occasionally, SMART capabilities may be disabled on storage devices. Enable SMART functionality with:

sudo smartctl -s on /dev/sda

Conversely, disable SMART if necessary:

sudo smartctl -s off /dev/sda

Most modern drives ship with SMART enabled by default, but manual enabling ensures full monitoring capability access.
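To decide whether enabling is needed at all, a script can read the “SMART support is:” lines that smartctl -i prints for ATA drives. This is a rough sketch; smart_support_state is a hypothetical helper, and it assumes the last such line carries the Enabled or Disabled state.

```shell
# Print the last "SMART support is:" value from `smartctl -i` output on stdin.
# smart_support_state is a hypothetical helper name.
smart_support_state() {
    awk -F': *' '
        /^SMART support is:/ { state = $2 }
        END { print (state == "" ? "unknown" : state) }'
}

# Usage (needs root and a real device):
#   state=$(sudo smartctl -i /dev/sda | smart_support_state)
#   [ "$state" = "Disabled" ] && sudo smartctl -s on /dev/sda
```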

Interpreting SMART Attributes and Values

SMART attributes provide detailed insights into drive performance and health metrics. These attributes track various operational parameters that indicate potential problems before complete failure occurs.

Displaying SMART Attributes

View complete SMART attribute tables using the -A flag:

sudo smartctl -A /dev/sda

The output displays a comprehensive table containing multiple columns:

ID#: Unique identifier for each attribute
ATTRIBUTE_NAME: Descriptive name for the monitored parameter
FLAG: Indicates attribute type and update frequency
VALUE: Current normalized value (higher is generally better)
WORST: Lowest recorded value during drive lifetime
THRESH: Failure threshold (the attribute counts as failed when VALUE is at or below this)
TYPE: Pre-fail or Old_age classification
UPDATED: Update frequency (Always or Offline)
WHEN_FAILED: Indicates if/when attribute failed
RAW_VALUE: Vendor-specific raw value, often a count or a physical unit such as hours or degrees Celsius
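The column layout above makes the table easy to filter with awk. As an illustrative sketch, the hypothetical helper below lists attributes whose normalized VALUE has reached THRESH; it assumes the ten-column ATA table format and skips attributes with a threshold of zero.

```shell
# List attributes whose VALUE (column 4) is at or below THRESH (column 6).
# Reads `smartctl -A` output from stdin. attrs_at_threshold is a hypothetical name.
attrs_at_threshold() {
    awk '
        # Attribute rows start with a numeric ID and carry a 0x... flag in column 3
        $1 ~ /^[0-9]+$/ && $3 ~ /^0x/ && $6 + 0 > 0 && $4 + 0 <= $6 + 0 {
            print $1, $2, $7
        }'
}

# Usage (needs root and a real device):
#   sudo smartctl -A /dev/sda | attrs_at_threshold
```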

Critical Attributes to Monitor

Several SMART attributes deserve special attention for drive health assessment:

Raw_Read_Error_Rate (ID 1): Tracks read errors during normal operations. Increasing values may indicate surface defects or head problems.

Reallocated_Sector_Ct (ID 5): Shows sectors moved to the spare area due to read/write errors. Any non-zero raw value warrants attention.

Power_On_Hours (ID 9): Records total operational time. Higher values indicate older drives approaching end-of-life.

Temperature_Celsius (ID 194): Monitors drive operating temperature. Excessive heat accelerates component degradation.

Current_Pending_Sector (ID 197): Indicates sectors awaiting reallocation. Non-zero values suggest developing problems.

Offline_Uncorrectable (ID 198): Tracks sectors that cannot be read during offline scans. Critical indicator of drive failure.
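The three sector-related attributes (IDs 5, 197, and 198) can be checked together. A minimal sketch follows, assuming the ATA attribute table with the raw value in column 10; nonzero_sector_counters is a hypothetical helper that behaves like grep, returning success only when it finds something.

```shell
# Print sector-related attributes (IDs 5, 197, 198) with a non-zero raw value.
# Reads `smartctl -A` output from stdin; exit status is 0 if any were found.
# nonzero_sector_counters is a hypothetical helper name.
nonzero_sector_counters() {
    awk '
        $3 ~ /^0x/ && ($1 == 5 || $1 == 197 || $1 == 198) && $10 + 0 > 0 {
            print $2 "=" $10
            found = 1
        }
        END { exit (found ? 0 : 1) }'
}

# Usage (needs root and a real device):
#   if sudo smartctl -A /dev/sda | nonzero_sector_counters; then
#       echo "sector problems reported on /dev/sda"
#   fi
```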

Understanding Attribute Types

SMART attributes fall into two primary categories:

Pre-fail Attributes: Signal likely imminent failure when the normalized VALUE falls to or below THRESH. These attributes require immediate attention as they approach their thresholds.

Old_age Attributes: Indicate normal wear patterns and aging. While informative, these attributes don’t necessarily predict imminent failure.

Running Comprehensive SMART Self-Tests

SMART self-tests provide proactive drive assessment capabilities that identify potential problems through systematic testing procedures. Different test types offer varying levels of thoroughness and time requirements.

Available Test Types

smartctl supports several self-test variants:

Short Test: Quick diagnostic covering common failure modes. Typically completes within 2-10 minutes and checks mechanical, electrical, and read performance.

Long/Extended Test: Comprehensive surface scan examining entire drive capacity. Duration ranges from tens of minutes to several hours depending on drive size.

Conveyance Test: Specifically designed to detect transportation damage. Available only on ATA devices and usually completes within minutes.

Selective Test: Examines specified LBA (Logical Block Address) ranges. Useful for testing specific drive areas suspected of problems.

Executing Self-Tests

Check estimated test durations before starting:

sudo smartctl -c /dev/sda

This command displays approximate completion times for different test types.

Start a short test in background mode:

sudo smartctl -t short /dev/sda

Launch an extended test for comprehensive analysis:

sudo smartctl -t long /dev/sda

Execute conveyance test for transportation damage assessment:

sudo smartctl -t conveyance /dev/sda

Background vs Foreground Testing

Background testing allows continued system operation during test execution. The drive automatically pauses testing when I/O activity increases, resuming when resources become available.

Foreground testing monopolizes drive resources for faster completion. Use the -C flag for captive mode testing:

sudo smartctl -t short -C /dev/sda

Foreground testing should only be performed when the drive isn’t actively used by other processes.

Monitoring Test Progress

Check ongoing test status:

sudo smartctl -l selftest /dev/sda

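This command displays results from previous tests. For a test still running on an ATA drive, the “Self-test execution status” field of the -c output reports the percentage remaining. Abort running tests if necessary: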

sudo smartctl -X /dev/sda
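For scripted waiting on ATA drives, the remaining percentage can be read from the “Self-test execution status” field that smartctl -c prints. The sketch below assumes the “NN% of test remaining.” wording of that field; selftest_remaining is a hypothetical helper name.

```shell
# Print the percentage of a running self-test that remains, or "none"
# when no test is in progress. Reads `smartctl -c` output from stdin.
# selftest_remaining is a hypothetical helper name.
selftest_remaining() {
    awk '
        /% of test remaining/ {
            sub(/%.*/, "", $1)   # "90%" -> "90"
            print $1
            found = 1
        }
        END { if (!found) print "none" }'
}

# Usage (needs root and a real ATA device): poll once a minute until done.
#   while [ "$(sudo smartctl -c /dev/sda | selftest_remaining)" != "none" ]; do
#       sleep 60
#   done
```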

Accessing Drive Logs and Error Information

SMART logs contain valuable historical data about drive performance, errors, and test results. Analyzing these logs helps identify patterns and recurring issues that may indicate developing problems.

Viewing Error Logs

Access drive error logs using the -l error option:

sudo smartctl -l error /dev/sda

Error logs record read/write failures, seek errors, and other operational problems. Empty error logs typically indicate healthy drive operation, while populated logs suggest potential issues requiring investigation.
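For ATA drives, the error log output contains either an “ATA Error Count:” line or “No Errors Logged”, which makes the count easy to track over time. This is a sketch under that assumption; ata_error_count is a hypothetical helper and does not cover SCSI or NVMe log formats.

```shell
# Print the number of errors recorded in `smartctl -l error` output (ATA drives).
# Prints "unknown" when neither expected line is present.
# ata_error_count is a hypothetical helper name.
ata_error_count() {
    awk '
        /^ATA Error Count:/ { print $4; found = 1; exit }
        /^No Errors Logged/ { print 0;  found = 1; exit }
        END { if (!found) print "unknown" }'
}

# Usage (needs root and a real device):
#   sudo smartctl -l error /dev/sda | ata_error_count
```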

Self-Test Result Logs

Review self-test history and results:

sudo smartctl -l selftest /dev/sda

Self-test logs display each test's type, completion status, and the power-on hour count at which it ran, along with any errors discovered during testing. Failed tests often include the LBA of the first problematic sector, helping pinpoint specific drive areas with issues.

Sample self-test log output might show:

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       20%           717           555027747

This output shows a short test that stopped with 20% of the test remaining after a read failure at LBA 555027747, recorded when the drive had 717 power-on hours.
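A reported LBA can be fed back into a selective test, which takes a range in the form select,N-M. The helper below is a hypothetical sketch that builds such a range around a failing LBA; the default margin of 1000 sectors is an arbitrary assumption, and the upper bound is not clamped to the drive's capacity.

```shell
# Build a `-t select,N-M` argument covering MARGIN sectors on each side of LBA.
# selective_range is a hypothetical helper; the default margin is arbitrary.
selective_range() {
    local lba=$1
    local margin=${2:-1000}
    # Do not let the start of the range go below LBA 0
    local start=$(( lba > margin ? lba - margin : 0 ))
    local end=$(( lba + margin ))
    printf 'select,%d-%d\n' "$start" "$end"
}

# Usage (needs root and a real ATA device):
#   sudo smartctl -t "$(selective_range 555027747)" /dev/sda
```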

Additional Log Types

smartctl provides access to various specialized logs:

Selective test logs (-l selective): Results from selective LBA range testing
Background scan logs (-l background): Background scan results on SCSI drives
Temperature logs (-l scttemp): Historical temperature data on ATA drives that support SCT
Device statistics (-l devstat): Comprehensive operational metrics

Advanced Smartctl Usage and Configuration Options

Advanced smartctl features provide deeper drive analysis capabilities and specialized configurations for complex storage environments.

Comprehensive Information Display

The -a flag generates complete SMART reports including device information, attributes, logs, and test results:

sudo smartctl -a /dev/sda

For even more detailed output, use the -x flag:

sudo smartctl -x /dev/sda

Extended output adds non-SMART device information and additional logs, such as the extended error and self-test logs, device statistics, and SCT temperature history, where the drive supports them.

Device Type Specifications

Different storage technologies may require specific device type parameters. Specify device types when automatic detection fails:

sudo smartctl -d ata /dev/sda      # For ATA/SATA drives
sudo smartctl -d scsi /dev/sdb     # For SCSI drives
sudo smartctl -d nvme /dev/nvme0n1 # For NVMe drives

Working with RAID Controllers

RAID environments require special consideration for SMART monitoring. LSI MegaRAID controllers support direct drive access through smartctl:

sudo smartctl -a -d megaraid,N /dev/sdX

Replace N with the drive's device ID as reported by the RAID controller (the DID column in StorCLI output). Use StorCLI to identify device IDs:

sudo storcli /c0 /eall /sall show

USB Drive Considerations

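External USB drives often require special handling due to adapter limitations. Many USB-to-SATA adapters don’t properly pass SMART commands, resulting in errors like “unsupported SCSI opcode”. When automatic detection fails, specifying the SAT pass-through type with -d sat works for many bridges.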

Practical Examples and Real-World Applications

Understanding smartctl through practical scenarios helps system administrators implement effective drive monitoring strategies.

Daily Health Monitoring Script

Create automated health checking scripts for regular drive assessment:

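#!/bin/bash
# Basic drive health check script

DRIVES=("/dev/sda" "/dev/sdb" "/dev/sdc")

for drive in "${DRIVES[@]}"; do
    echo "Checking $drive..."
    # ATA/NVMe drives print "SMART overall-health ...", SCSI/SAS drives "SMART Health Status"
    health=$(sudo smartctl -H "$drive" | grep -E "SMART overall-health|SMART Health Status")
    echo "$drive: ${health:-health status not reported}"

    # Check for reallocated sectors (raw value is column 10 of the attribute table)
    reallocated=$(sudo smartctl -A "$drive" | awk '/Reallocated_Sector_Ct/ {print $10}')
    # Default to 0 when the drive does not report this attribute
    if [ "${reallocated:-0}" -gt 0 ]; then
        echo "WARNING: $drive has $reallocated reallocated sectors"
    fi
done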

Temperature Monitoring

Monitor drive temperatures to prevent overheating damage:

sudo smartctl -A /dev/sda | grep -i temperature

Implement temperature alerts when drives exceed safe operating ranges (typically above 50-60°C for mechanical drives).
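A temperature alert can be sketched by reading the raw value of attribute 194 and comparing it with a limit. The helper names below are hypothetical, and the code assumes an ATA attribute table in which the raw value begins with the temperature in degrees Celsius; NVMe and SCSI drives report temperature in a different format.

```shell
# Print the drive temperature from `smartctl -A` output on stdin (attribute 194).
# drive_temp_c and temp_exceeds are hypothetical helper names.
drive_temp_c() {
    awk '$3 ~ /^0x/ && $1 == 194 { print $10 + 0; exit }'
}

# Succeed when the temperature read from stdin is above LIMIT (degrees Celsius).
temp_exceeds() {
    local limit=$1
    local temp
    temp=$(drive_temp_c)
    [ -n "$temp" ] && [ "$temp" -gt "$limit" ]
}

# Usage (needs root and a real device):
#   if sudo smartctl -A /dev/sda | temp_exceeds 55; then
#       echo "WARNING: /dev/sda is running hot"
#   fi
```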

Identifying Failing Drives

Combine multiple indicators to assess drive health comprehensively:

Check overall health status
Review error logs for patterns
Monitor critical attributes (reallocated sectors, pending sectors)
Run periodic self-tests
Track attribute trends over time

Best Practices for Drive Monitoring and Maintenance

Effective SMART monitoring requires systematic approaches and regular maintenance schedules.

Establishing Monitoring Schedules

Implement regular testing schedules based on drive importance and usage patterns:

Critical systems: Daily health checks, weekly short tests, monthly extended tests
Standard workstations: Weekly health checks, monthly comprehensive tests
Archive storage: Monthly health checks, quarterly extended tests

Interpreting Results and Taking Action

Develop clear criteria for drive replacement decisions:

Any failed SMART health status requires immediate attention
Reallocated sector counts above 5-10 warrant drive replacement planning
Increasing error rates indicate developing problems
Temperature consistently above manufacturer specifications suggests cooling issues

Automated Alerting Systems

Integrate smartctl with monitoring systems for automated alerting:

# Example cron job for daily health checks
0 6 * * * /usr/local/bin/smart-check.sh | mail -s "Daily SMART Report" admin@example.com
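Alerting scripts can also use smartctl's exit status, which the man page documents as a bit mask (bits 0 to 7). The decoder below is an illustrative sketch with a hypothetical function name; the messages paraphrase the man page descriptions.

```shell
# Translate a smartctl exit status (a bit mask) into one message per set bit.
# decode_smartctl_status is a hypothetical helper name.
decode_smartctl_status() {
    local code=$1
    local bit
    for bit in 0 1 2 3 4 5 6 7; do
        if [ $(( (code >> bit) & 1 )) -eq 1 ]; then
            case $bit in
                0) echo "command line did not parse" ;;
                1) echo "device open failed" ;;
                2) echo "SMART or ATA command failed" ;;
                3) echo "SMART status: DISK FAILING" ;;
                4) echo "prefail attribute at or below threshold" ;;
                5) echo "attribute was at or below threshold in the past" ;;
                6) echo "device error log contains errors" ;;
                7) echo "self-test log contains errors" ;;
            esac
        fi
    done
}

# Usage (needs root and a real device):
#   sudo smartctl -H -A /dev/sda >/dev/null
#   decode_smartctl_status $?
```

An exit status of 0 produces no output, so an empty result can be treated as "nothing to report".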

Data Backup Strategies

SMART monitoring enables proactive backup scheduling based on drive health trends. Increase backup frequency when drives show early warning signs.

Troubleshooting Common Issues and Solutions

Understanding common smartctl problems helps system administrators resolve monitoring issues effectively.

Permission and Access Issues

Ensure proper permissions for drive access:

sudo smartctl -i /dev/sda

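smartctl generally needs root privileges to send commands to a drive. Adding users to the disk group is usually not sufficient on its own, and it also grants raw read/write access to every disk, so running smartctl through sudo is the safer choice.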

RAID Controller Limitations

Some RAID controllers don’t pass SMART data to the operating system. Use controller-specific tools or configure pass-through modes when available.

Device Detection Problems

Use device scanning to identify available drives:

sudo smartctl --scan

This command lists all detected storage devices and appropriate device types.
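The scan output can drive a loop instead of a hard-coded device list. The sketch below assumes each line has the form “/dev/sda -d scsi # ...” and extracts the device path and type; scan_devices is a hypothetical helper name.

```shell
# Print "DEVICE TYPE" pairs from `smartctl --scan` output on stdin.
# scan_devices is a hypothetical helper name.
scan_devices() {
    awk '$2 == "-d" { print $1, $3 }'
}

# Usage (needs root): check the health of every detected device.
#   sudo smartctl --scan | scan_devices | while read -r dev type; do
#       sudo smartctl -H -d "$type" "$dev"
#   done
```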

r00t