Today’s post is about one leg of the CIA Triad, Integrity, which means protecting data from unauthorized modification. There are a number of things needed to ensure data integrity, but let’s focus on detection. More specifically, let’s focus on detecting changes to files.
Detecting changes
The easiest way to detect whether a file has been changed is to hash it and compare the result to a previous hash of the same file. A good hashing algorithm turns even a tiny change in the input into a completely different output. For example, this is the SHA256 hash of Allen Ginsberg’s Howl:
902761fbc2b389b0142d70efddb5ed0b31c4fd7cafc0497a199b98fdc9d0966b
and this is the SHA256 hash of Allen Ginsberg’s Howl with an extra space at the end:
05480bfeb192bffd7a702654173e27f1789bdb0a716e93e0f6855de76c27f906
So, having a hash of a file will allow you to quickly determine if it’s been modified. For this reason, hashing is frequently used in File Integrity Monitoring solutions.
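Here’s a minimal sketch of that compare-against-a-baseline idea. The function names `hash_file` and `is_unchanged` are made up for illustration, and the choice to use `sha256sum` with a fallback to `openssl` is my assumption about what is installed, not a reference to any particular File Integrity Monitoring product.

```shell
#!/bin/bash
# Print only the SHA-256 hex digest of the file given as $1.
# Uses sha256sum when present, otherwise falls back to openssl.
hash_file() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum < "$1" | awk '{print $1}'
  else
    openssl dgst -sha256 -r < "$1" | awk '{print $1}'
  fi
}

# Succeed (exit status 0) only if file $1 still hashes to the baseline $2.
is_unchanged() {
  [ "$(hash_file "$1")" = "$2" ]
}

# Demo: record a baseline, append one space, then check again.
DEMO_FILE=$(mktemp)
printf 'hello' > "$DEMO_FILE"
BASELINE=$(hash_file "$DEMO_FILE")
printf ' ' >> "$DEMO_FILE"
if is_unchanged "$DEMO_FILE" "$BASELINE"; then
  echo "file is unchanged"
else
  echo "file has been modified"
fi
rm -f "$DEMO_FILE"
```

Because the demo appends a space after recording the baseline, it should report that the file has been modified. In practice you would store the baseline somewhere the file’s editor can’t reach, since a baseline that can be rewritten alongside the file proves nothing.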
Of course, your choice of hashing algorithm matters. MD5 is vulnerable to practical collision attacks, meaning an attacker can craft two different files that share the same MD5 hash; a quick search can give you more details. TL;DR: don’t use MD5 for integrity checking, use SHA256.
Use Case
You can use hashing to verify the integrity of important documents and files. For instance, when submitting an Incident Response Report, you should always include an additional document providing hashes of every file included with the report. This way, it will be easy to detect if the report or supporting documents have been modified.
Here’s a little script I wrote to quickly hash all files in a directory and create a CSV:
Script
#!/bin/bash
# Recursively hash every file in a directory and write the results to a CSV.

# directory from command line to recursively hash
if [ -z "$1" ]; then
  echo "supply directory with files to hash" >&2
  exit 1
fi

# strip any trailing slash so file paths can be made relative consistently
DIR="${1%/}"

# get basename of directory to name output file
SAVETO=$(basename "$DIR")
OUT="hashed_$SAVETO.csv"

# write CSV header
echo "File, SHA256" > "$OUT"

# find all files in directory and:
#   skip .DS_Store files
#   hash each remaining file
#   print its path (relative to the directory) and its hash to file
find "$DIR" -type f ! -name '.DS_Store' -print0 |
  while IFS= read -r -d '' FILE; do
    # -r prints "hash *name", so the hash is the first field
    HASH=$(openssl dgst -sha256 -r < "$FILE" | awk '{print $1}')
    printf '%s, %s\n' "${FILE#"$DIR"/}" "$HASH"
  done >> "$OUT"

echo "hashes saved to $OUT"
Just include the directory of files as a command line argument and you’re good to go!
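The other half of the job is checking the files against the CSV later. Below is a rough sketch of how that could look; `verify_hashes` is a hypothetical helper I’m naming for illustration, and it assumes the CSV has a `File, SHA256` header followed by rows of `relative/path, hash`.

```shell
#!/bin/bash
# Print only the SHA-256 hex digest of the file given as $1.
# Uses sha256sum when present, otherwise falls back to openssl.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum < "$1" | awk '{print $1}'
  else
    openssl dgst -sha256 -r < "$1" | awk '{print $1}'
  fi
}

# Hypothetical helper: verify_hashes CSV DIR
# Re-hash each file listed in CSV (rows of "relative/path, sha256") under DIR
# and report any file that is missing or whose hash no longer matches.
# Returns 0 if everything matches, 1 otherwise.
verify_hashes() {
  local csv="$1" dir="${2%/}" status=0 line header file expected actual
  {
    read -r header   # skip the "File, SHA256" header row
    while IFS= read -r line || [ -n "$line" ]; do
      [ -z "$line" ] && continue
      expected="${line##*, }"   # text after the last ", " is the hash
      file="${line%, *}"        # everything before it is the file path
      if [ ! -f "$dir/$file" ]; then
        echo "MISSING: $file"
        status=1
        continue
      fi
      actual=$(sha256_of "$dir/$file")
      if [ "$actual" != "$expected" ]; then
        echo "MODIFIED: $file"
        status=1
      fi
    done
  } < "$csv"
  return $status
}
```

You would call it with something like `verify_hashes hashed_reports.csv reports/ && echo "all files match"`, where the CSV and directory names are placeholders. Note that this only catches changes to files listed in the CSV; files added to the directory afterwards go unnoticed.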