What are the common commands for diagnosing Linux hard disk problems
Time : 2024-11-29 15:00:05
Edit : Jtti

Identifying and resolving hard disk bottlenecks is important for keeping your system running smoothly. A bottleneck occurs when overall system performance is limited by a single component, and slow disk operations can affect the performance of applications, databases, and even the entire system. What tools and commands are available in Linux to identify hard disk bottlenecks and to solve disk-related problems?

A disk bottleneck occurs when the disk cannot read or write data fast enough to meet system demands, resulting in slow response times, delays, and, in extreme cases, system crashes. Common causes are disk I/O overload, disk fragmentation, hardware limitations, and disk errors.

If you want to find hard disk bottlenecks in Linux, a number of commands and tools can help. iostat, for example, is a command-line tool from the sysstat package that reports CPU usage and per-device I/O statistics, which can help pinpoint disk bottlenecks:

iostat -x 1

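The key metrics to look for are:

%util: The percentage of elapsed time during which the device was busy handling requests. A value that stays above 80-90% suggests the disk is a bottleneck, although devices that serve requests in parallel (SSDs, RAID arrays) can show a high %util without being saturated.

await: The average time, in milliseconds, for I/O requests to complete, including time spent waiting in the queue. Newer sysstat versions report it separately as r_await and w_await. A higher value indicates poorer disk performance.

svctm: The average service time for I/O requests. A higher value indicates a longer disk response time, but the sysstat documentation marks this field as unreliable and recent versions no longer print it.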
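As a rough sketch of how the %util check could be automated, the shell function below reads iostat -x output on standard input and prints the devices above a threshold. The name busy_devices and the default threshold of 80 are illustrative assumptions, not part of iostat. The column is located by its header because the column layout differs between sysstat versions.

```shell
# busy_devices [THRESHOLD]: hypothetical helper, not part of sysstat.
# Reads `iostat -x` output on stdin and prints "device %util" for every
# device line whose %util is above THRESHOLD (default 80).
busy_devices() {
  awk -v limit="${1:-80}" '
    # Header line: remember which column holds %util.
    $1 ~ /^Device/ {
      col = 0
      for (i = 1; i <= NF; i++) if ($i == "%util") col = i
      next
    }
    # A blank line ends the device table.
    NF == 0 { col = 0; next }
    # Device line: compare numerically against the threshold.
    col && NF >= col && ($col + 0) > (limit + 0) { print $1, $col }
  '
}
```

A possible invocation is LC_ALL=C iostat -dx 1 3 | busy_devices 80, where LC_ALL=C keeps the decimal separator a dot. Note that the first iostat report covers the time since boot, so only the later reports reflect current load.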

iotop monitors I/O in real time, showing the disk activity of each process to help identify which processes are consuming too much disk bandwidth:

sudo iotop

This displays the processes and threads performing disk I/O, together with their read and write rates.

Read/Write: Look for processes with high DISK READ or DISK WRITE values. These processes can cause disk bottlenecks.

IO priority: Check whether any process is consuming a disproportionate share of I/O resources. You can use ionice to adjust the I/O scheduling class and priority of such a process.
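The ionice adjustment mentioned above might look like the following minimal sketch. The wrapper name lower_io_priority is hypothetical, and the choice of the best-effort class at its lowest level is only one reasonable option.

```shell
# lower_io_priority PID: hypothetical wrapper around ionice.
# Moves the process to the best-effort class (-c 2) at its lowest
# priority level (-n 7), so that other processes are served first.
lower_io_priority() {
  ionice -c 2 -n 7 -p "$1"
}
```

For example, after sudo iotop -o -b -n 3 (batch mode, three iterations, only processes actually doing I/O) has pointed to a heavy process, lower_io_priority followed by its PID would demote it. I/O priorities only take effect with I/O schedulers that honor them, such as BFQ or CFQ on older kernels.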

Using the df command, you can view the disk space usage of all mounted file systems. A file system that is nearly full can cause significant slowdowns, especially on the root or primary partition:

df -h

Make sure the disk is not full. If a file system is more than 85-90% full, the system may slow down or fail because too little free space is left for temporary files and new writes.
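The fullness check can be sketched as a small filter over df output. The function name full_filesystems and the default threshold of 85 are assumptions for illustration. It expects the POSIX output format of df -P, where the fifth column is the usage percentage and the sixth is the mount point.

```shell
# full_filesystems [THRESHOLD]: hypothetical helper.
# Reads `df -P` output on stdin and prints "mountpoint use%" for every
# file system at or above THRESHOLD percent (default 85).
# Assumes device names and mount points contain no spaces.
full_filesystems() {
  awk -v limit="${1:-85}" '
    NR > 1 {
      pct = $5
      sub(/%$/, "", pct)            # "91%" -> "91"
      if (pct + 0 >= limit + 0) print $6, $5
    }
  '
}
```

It could be used as df -P | full_filesystems 90 to list only the file systems that need attention.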

dstat is a versatile monitoring tool that reports several system resources, such as disk I/O, side by side and gives a real-time overview of system performance:

dstat -dny

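Disk read/write: Watch the read and writ columns for peaks in disk activity. Sustained heavy disk activity may indicate a bottleneck. This view also shows network throughput (-n) and interrupts and context switches (-y), but not per-request latency. To see how long each I/O operation takes, use the await column of iostat -x; a long wait per request points to a disk bottleneck.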

The sar command is a powerful tool for collecting, reporting, and saving information about system activity, making it ideal for historical performance analysis.

sar -d 1 5

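tps: The number of transfers (I/O requests) issued to the device per second. A higher value indicates that the disk is processing a large number of I/O requests.

Read and write rates: The rate at which data is read or written, shown as rkB/s and wkB/s in recent sysstat versions (older versions show rd_sec/s and wr_sec/s, in sectors). Unusually high rates that coincide with high await or %util values may indicate a bottleneck.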

smartctl checks the health of hard disks by querying their SMART data. It can help identify physical problems with a disk, such as failing sectors or components:

sudo apt install smartmontools

sudo smartctl -a /dev/sda

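Reallocated_Sector_Ct: The number of sectors remapped to spare sectors because of errors. A nonzero and growing raw value indicates that the disk may be failing.

Seek_Error_Rate: Reflects errors while positioning the heads. A poor value can be a sign of physical damage, but the raw number is vendor-specific (some drives report large raw values when healthy), so compare the normalized VALUE against THRESH instead of reading the raw value alone.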
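To pull a single attribute out of the SMART table, something like the sketch below could be used. The function name smart_raw_value is hypothetical, and it assumes the ten-column ATA attribute table printed by smartctl -A, in which the attribute name is the second column and the raw value starts in the tenth. NVMe devices use a different report layout that this does not handle.

```shell
# smart_raw_value ATTRIBUTE: hypothetical helper.
# Reads `smartctl -A` output on stdin and prints the first token of the
# RAW_VALUE column for the named attribute (ATA attribute table only).
smart_raw_value() {
  awk -v name="$1" '$2 == name { print $10 }'
}
```

For example, sudo smartctl -A /dev/sda | smart_raw_value Reallocated_Sector_Ct prints the reallocated sector count, which can be logged periodically to see whether it is growing. For a quick overall verdict, sudo smartctl -H /dev/sda reports the drive's own health self-assessment.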

The lsblk command can list all block devices in the system, such as hard drives and partitions, and obtain useful information about the system storage:

lsblk -o NAME,SIZE,ROTA,TYPE,MOUNTPOINT

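In this output, ROTA is 1 for rotational disks (HDDs) and 0 for non-rotational devices such as SSDs. SSDs generally provide better performance than HDDs, so make sure a spinning disk or one of its partitions is not overloaded by many concurrent workloads, which can cause performance bottlenecks.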
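The ROTA column lends itself to a quick classification of whole disks. The following is a minimal sketch with a hypothetical function name, classify_disks, that expects one name and ROTA value per line.

```shell
# classify_disks: hypothetical helper.
# Reads `lsblk -d -n -o NAME,ROTA` output on stdin (one "name rota" pair
# per line) and labels each device by its ROTA flag.
classify_disks() {
  awk '{ print $1, ($2 == 1 ? "rotational (HDD)" : "non-rotational (SSD/NVMe)") }'
}
```

It could be run as lsblk -d -n -o NAME,ROTA | classify_disks, where -d omits partitions and -n omits the header line. The flag reflects what the kernel reports, so virtual disks and devices behind RAID controllers may not be labeled accurately.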

vmstat reports virtual memory statistics. Although it mainly shows memory usage, it also reports disk I/O activity and how the system handles memory swapping:

vmstat 1

bi (blocks in): The number of blocks read from block devices per second.

bo (blocks out): The number of blocks written to block devices per second.

si and so (swap in and swap out): If these values are consistently high, the system is swapping, which usually means RAM is insufficient and adds extra disk load.
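A simple way to spot swapping in a stream of vmstat samples is sketched below. The function name swapping_samples is an assumption for illustration. It finds the si and so columns from the header line, so it does not depend on a fixed column position, and it prints only the samples in which swap traffic is nonzero.

```shell
# swapping_samples: hypothetical helper.
# Reads `vmstat` output on stdin and prints the si/so values of every
# sample line in which swap-in plus swap-out is greater than zero.
swapping_samples() {
  awk '
    # Learn the positions of the si and so columns from the header line.
    { for (i = 1; i <= NF; i++) { if ($i == "si") si = i; if ($i == "so") so = i } }
    # Numeric sample lines only: report nonzero swap activity.
    si && so && $si ~ /^[0-9]+$/ && ($si + $so) > 0 { print "si=" $si, "so=" $so }
  '
}
```

For example, vmstat 1 10 | swapping_samples prints nothing on a system that is not swapping. The first vmstat line shows averages since boot, so a single nonzero line at the start is less significant than repeated ones.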

There are many reasons for disk bottlenecks. You need to use monitoring tools to identify and rectify these faults.
