One way to transfer large files over a network is to divide them into blocks of a fixed size and send the blocks separately. This article shows how to use the tar tool and the split command on Linux to create an archive file and split it into smaller pieces for slow network environments.
First, review the syntax of the tar and split commands:
tar options archive-name files
split options file "prefix"
To split a tar file into 10MB files, first create an archive file:
tar -cvjf home.tar.bz2 /home/aaronkilik/Documents/*
Confirm archive file size:
ls -lh home.tar.bz2
Then use the split utility to split the home.tar.bz2 archive into chunks of 10MB each:
split -b 10M home.tar.bz2 "home.tar.bz2.part"
ls -lh home.tar.bz2.parta*
In the command above, the -b option sets the size of each piece, and "home.tar.bz2.part" is the prefix for the names of the pieces created by the split. By default, split appends a two-letter suffix (aa, ab, ac, and so on) to this prefix.
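As a minimal sketch of how split names its output, the following creates a throwaway 25 KiB file in a temporary directory and splits it into 10 KiB pieces. The file name demo.bin and both prefixes are hypothetical. It assumes GNU coreutils; the -d option, which switches to numeric suffixes, is not part of POSIX split.

```shell
set -eu
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create a 25 KiB sample file (hypothetical name).
head -c 25600 /dev/zero > demo.bin

# Split into 10 KiB pieces using the default alphabetic suffixes.
split -b 10K demo.bin "demo.bin.part"
ls -l demo.bin.part*

# Same split with numeric suffixes (GNU split option -d).
split -b 10K -d demo.bin "demo.bin.num"
ls -l demo.bin.num*
```

The last piece is smaller than the others whenever the file size is not an exact multiple of the chunk size.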
You can split an ISO image file into multiple parts in the same way. Start by creating an archive of the Linux Mint ISO file:
tar -cvzf linux-mint-18.tar.gz linuxmint-18-cinnamon-64bit.iso
Then check the size of the archive and split it into 200MB pieces:
ls -lh linux-mint-18.tar.gz
split -b 200M linux-mint-18.tar.gz "ISO-archive.part"
ls -lh ISO-archive.parta*
To create the archive and split it in a single step, without first writing the full archive to disk, pipe the tar output to split:
tar -cvzf - wget/* | split -b 150M - "downloads-part"
Confirm that the files were created:
ls -lh downloads-parta*
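Pieces produced by such a pipeline can also be unpacked without rebuilding the archive on disk, by concatenating them straight into tar. A rough sketch, run in a temporary directory; the names sample-data, demo-part, and restored are hypothetical:

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Build a small sample directory to archive (hypothetical contents).
mkdir sample-data
head -c 40000 /dev/urandom > sample-data/a.bin
echo "hello" > sample-data/b.txt

# Archive to stdout and split the stream into 16 KiB pieces in one pass.
tar -czf - sample-data | split -b 16K - "demo-part"

# Reverse the pipeline: join the pieces in order and extract from stdin.
mkdir restored
cat demo-part* | tar -xzf - -C restored

# diff exits with status 0 only if the restored tree matches the source.
diff -r sample-data restored/sample-data
```

The shell expands demo-part* in sorted order, which is the order in which split wrote the pieces, so no explicit ordering is needed.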
To check the split files, run ls with the prefix you chose and confirm that every piece was created:
ls -lh archive_name_part_*
This command lists every piece along with its size.
To merge the pieces back into a tar file after splitting, use the cat command. cat is an efficient and reliable way to join them, because the shell expands the wildcard in sorted order, which matches the order in which split created the pieces:
cat home.tar.bz2.parta* > home.tar.bz2.joined
Running the command above merges all the pieces created earlier into a single file of the same size as the original tar archive. To check that tar can read the merged archive, list its contents with the -tf options:
tar -tf archive_name.tar
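To illustrate what the listing check can catch, here is a small sketch that joins a complete set of pieces and then an incomplete one, recording whether tar can read each result to the end. All names are hypothetical, and the sample data is random so that the compressed archive spans three pieces.

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Sample archive split into 16 KiB pieces (all names are hypothetical).
mkdir docs
head -c 40000 /dev/urandom > docs/data.bin
tar -czf docs.tar.gz docs
split -b 16K docs.tar.gz "docs.tar.gz.part"

# Join every piece, then list the contents; tar exits 0 if it reads to the end.
cat docs.tar.gz.part* > docs.joined.tar.gz
if tar -tzf docs.joined.tar.gz > /dev/null 2>&1; then complete_ok=yes; else complete_ok=no; fi

# Simulate a lost final piece: join only the first two and list again.
cat docs.tar.gz.partaa docs.tar.gz.partab > docs.truncated.tar.gz
if tar -tzf docs.truncated.tar.gz > /dev/null 2>&1; then truncated_ok=yes; else truncated_ok=no; fi

echo "complete archive readable:  $complete_ok"
echo "truncated archive readable: $truncated_ok"
```

A successful listing shows only that the archive is structurally readable. It does not prove that the bytes are identical to the original.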
A more reliable way is to calculate a checksum, such as MD5 or SHA256, of both the original file and the reassembled file, and check that the two match. Calculate the MD5 checksum of the original tar file:
md5sum < original_archive.tar > original_md5.txt
Calculate the MD5 checksum of the reconstructed tar file:
md5sum < archive_name.tar > rebuilt_md5.txt
Each file is read from standard input so that the saved line contains only the hash, not a file name that would always differ between the two. Compare the two MD5 checksum files:
diff original_md5.txt rebuilt_md5.txt
If diff prints nothing, the checksums are identical, so the reconstructed file is the same as the original file and is not damaged.
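The same comparison can be scripted so that it gives a clear pass or fail. A minimal sketch with hypothetical file names, using only md5sum and cut on a throwaway file in a temporary directory:

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Stand-ins for the original and rebuilt archives (hypothetical names).
head -c 50000 /dev/urandom > original_archive.tar
split -b 20K original_archive.tar "archive.part"
cat archive.part* > rebuilt_archive.tar

# Keep only the hash field; md5sum also prints the file name, which differs.
orig_sum=$(md5sum original_archive.tar | cut -d ' ' -f 1)
new_sum=$(md5sum rebuilt_archive.tar | cut -d ' ' -f 1)

if [ "$orig_sum" = "$new_sum" ]; then
    result="match"
else
    result="MISMATCH"
fi
echo "checksum $result"
```

Comparing the hash strings directly avoids intermediate text files and makes the result easy to use in a larger script.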
In addition to MD5, you can use SHA256 or another hash to verify the integrity of the file. Reading each file from standard input keeps the file name out of the saved line, so diff compares only the hashes:
sha256sum < original_archive.tar > original_sha256.txt
sha256sum < archive_name.tar > rebuilt_sha256.txt
diff original_sha256.txt rebuilt_sha256.txt
The checksum tools are usually part of the coreutils package, which is preinstalled on most Linux distributions. If a checksum of the original file is not available, you can at least compare the checksum of the tar file before and after splitting. For critical data, verify the checksum both after splitting and after joining the pieces.
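One possible way to apply this to the pieces themselves is to record a checksum for every piece before the transfer and verify them with sha256sum -c before joining. The sketch below assumes GNU coreutils and uses hypothetical names (big.tar, parts.sha256). It also simulates a piece damaged in transit to show the check failing.

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Sender side: split a sample file and record one checksum per piece.
head -c 50000 /dev/urandom > big.tar
split -b 20K big.tar "big.tar.part"
sha256sum big.tar.part* > parts.sha256

# Receiver side: verify every piece against the manifest before joining.
if sha256sum -c parts.sha256 > /dev/null 2>&1; then intact=yes; else intact=no; fi

# Simulate corruption of one piece in transit and verify again.
echo "garbage" >> big.tar.partab
if sha256sum -c parts.sha256 > /dev/null 2>&1; then after_damage=yes; else after_damage=no; fi

echo "before damage: $intact, after damage: $after_damage"
```

Checking each piece separately means that only a damaged piece needs to be sent again, not the whole archive.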