File Checking With File (Linux)

File Man page

It does not take much for a file / folder structure to become a bit messy, especially if it is a folder containing downloads. One common issue is file extensions being lost or overwritten by mistake.

For today's post I am going to use the MagPi magazine PDFs I downloaded via wget a few days ago with a script. As you can see in the screenshot, I have 30 issues of MagPi magazine in my Downloads folder.

Downloads folder containing 30 PDFs

I am going to be using a Linux command called file, which inspects a file's contents to detect and report its type. So for “MagPi 50.pdf” it checks and confirms that the file is a PDF document.

Linux file command in action

This may not seem impressive, as the filename already had the .pdf extension. To simulate misnaming during a download, I gave some of the files the wrong extensions (.txt, .jpg, .gif, .png), so at a casual glance the folder appears to contain several PDFs, a text file and some pictures.

Files with wrong file extensions

Using the file command along with the filename will still output the correct filetype (i.e. PDF) even for the files with the wrong file extension.
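As a small sketch of why this works, the snippet below writes a PDF header into a file with a .txt extension and asks file what it really is. The temporary folder and the name issue.txt are made up for the demo, and the --brief and --mime-type options are used only to keep the output short.

```shell
#!/usr/bin/env bash
# Sketch: show that `file` looks at content, not at the filename.
# The folder and file names here are invented for the demo.
demo_dir=$(mktemp -d)

# Write a minimal PDF header into a file that carries a .txt extension.
printf '%s\n' '%PDF-1.4' > "$demo_dir/issue.txt"

# --brief leaves out the filename; --mime-type prints only the MIME type.
detected=$(file --brief --mime-type "$demo_dir/issue.txt")
echo "issue.txt is really: $detected"

# Clean up the demo folder.
rm -r "$demo_dir"
```

The extension never enters into it: file compares the first bytes of the content against its database of known signatures.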

If you have a folder full of files, though, you probably do not want to run the command individually against each file. Instead, with a few lines you can get Linux to run the command for you against every file in the folder:

filetype checker

I created this script in Nano and named it filetype.sh:

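# Check the type of every entry in the current folder.
# The glob and the quotes keep names with spaces intact.
for i in *
do
    file "$i"
done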

I then ran it using the command bash filetype.sh

bash filetype.sh in action
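The same check can also be done without a loop. The sketch below is one possible alternative, not the script from the screenshot: it lets find hand the filenames to file, so names containing spaces such as “MagPi 50.pdf” stay in one piece. The function name check_folder is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: check every regular file in a folder without a shell loop.
# check_folder is a hypothetical helper name.
check_folder() {
    local dir=${1:-.}
    # -maxdepth 1 stays inside this folder; -type f skips subfolders.
    # `-exec ... {} +` passes many filenames to a single `file` call.
    find "$dir" -maxdepth 1 -type f -exec file {} +
}

# With no argument, the current folder is checked.
check_folder "$@"
```

Saved as a script, it can be run as bash filetype.sh ~/Downloads, or with no argument to check the current folder.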

file correctly identifies that the .txt, .jpg etc. files are actually PDFs, and even correctly reports that the .sh file is ASCII text. If the folder contains compressed files, try running file with the -z or --uncompress option, which asks file to try to look inside them.
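To see what -z changes, here is a minimal sketch that compresses a small text file with gzip and runs file on it both ways. The file notes.txt is a made-up sample, and the exact wording of the output may differ between versions of file.

```shell
#!/usr/bin/env bash
# Sketch: compare `file` with and without -z on a gzip-compressed text file.
# notes.txt is a made-up sample file.
demo_dir=$(mktemp -d)
printf '%s\n' 'just some plain text' > "$demo_dir/notes.txt"
gzip "$demo_dir/notes.txt"   # replaces it with notes.txt.gz

# Without -z, file describes the compressed container itself.
plain=$(file --brief "$demo_dir/notes.txt.gz")
# With -z, file also tries to describe what is inside.
inside=$(file --brief -z "$demo_dir/notes.txt.gz")

echo "without -z: $plain"
echo "with -z:    $inside"

# Clean up the demo folder.
rm -r "$demo_dir"
```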

file man page

You can read more about file by running the command man file