What is Shannon’s Entropy?
Well, Shannon’s Entropy measures how disordered (how unpredictable) the bits of information in a piece of data are. This is useful because modern encryption standards have a property called **diffusion**, in which the algorithm scrambles the data it’s encrypting, making analysis-based attacks impractical. However, it also makes it very obvious that the data is encrypted.
*NOTE: This also goes for compression algorithms, because compression removes redundancy and leaves output that looks close to random.*
How does it work then?
Well, it is defined by the following function:
H(x) = -∑ p(x_i) * log_2(p(x_i))
Here `H(x)` is the overall entropy of the input x, and `p(x_i)` is the proportion of the input made up of the unique character `x_i` (its count divided by the total length of x). `log_2()` is used so that the result is measured in bits, which means a file read byte by byte can score at most 8.
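As a quick sanity check of the formula, here is a minimal sketch that applies it to a short string. The function name `shannonEntropy` is made up for this illustration and is not part of the program below.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration: returns H(x) in bits per byte.
double shannonEntropy(const std::string& data)
{
    if(data.empty())
    {
        return 0.0; // Treat empty input as zero entropy to avoid dividing by zero
    }

    // Count how many times each byte value (0 to 255) occurs
    std::array<std::size_t, 256> counts{};
    for(unsigned char byte : data)
    {
        counts[byte] += 1;
    }

    double entropy = 0.0;
    for(std::size_t count : counts)
    {
        if(count == 0)
        {
            continue; // A byte that never occurs contributes nothing to the sum
        }
        double p = static_cast<double>(count) / data.size();
        entropy -= p * std::log2(p);
    }
    return entropy;
}
```

For `"aaaa"` there is only one symbol with p = 1, so the entropy is 0 bits. For `"aabb"` both symbols have p = 0.5, giving -(0.5 * -1 + 0.5 * -1) = 1 bit, and four distinct bytes such as `"abcd"` give 2 bits.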
#include <cmath>    // For std::log2
#include <fstream>  // For file handling
#include <iostream> // For input-output operations

int main(int argc, char* argv[])
{
    // Check if exactly one argument (file path) is provided
    if(argc != 2)
    {
        std::cout << "[-] Usage: " << argv[0] << " {filepath}\n"; // Print usage message
        return 1; // Exit with error code
    }

    // Array to store byte values (column 0) and their frequencies (column 1)
    int distribution[256][2] = {0};
    double shannons = 0.00; // Variable to store Shannon entropy

    // Initialize the first column of the distribution array with byte values (0 to 255)
    for(int i = 0; i < 256; ++i)
    {
        distribution[i][0] = i;
    }

    // Open the specified file in binary mode
    std::ifstream file(argv[1], std::ios::binary);

    // Check if the file could not be opened
    if(!file.is_open())
    {
        std::cout << "[-] error opening file!\n"; // Print error message
        return 1; // Exit with error code
    }

    // Move the file pointer to the end to determine the file size
    file.seekg(0, std::ios::end);
    long length = static_cast<long>(file.tellg()); // Get the total file size
    file.seekg(0, std::ios::beg); // Reset the file pointer to the beginning

    // An empty file has no byte distribution, so stop before dividing by zero
    if(length <= 0)
    {
        std::cout << "[-] file is empty!\n";
        return 1;
    }

    char byte; // Variable to store each byte read from the file

    // Count the frequency of each byte in the file
    while(file.get(byte))
    {
        distribution[(int)(unsigned char)byte][1] += 1;
    }
    file.close(); // Close the file after reading

    // Print the total file size
    std::cout << "Length: " << length << "\n";

    // Calculate and display the entropy contribution of each byte value
    for(int (&x)[2] : distribution) // Loop through the distribution array
    {
        // Proportion of the file made up of this byte value
        double p = x[1] / (double)length;

        // A byte that never occurs contributes 0; this also avoids evaluating log2(0)
        double contribution = (x[1] == 0) ? 0.0 : -p * std::log2(p);

        // Print the byte value, its frequency, and the entropy contribution
        std::cout << x[0] << ": " << x[1] << " --> " << contribution << "\n";

        // Accumulate the entropy for the byte
        shannons += contribution;
    }

    // Print the total Shannon entropy of the file
    std::cout << "Entropy: " << shannons << "\n";

    // Calculate and print the likeliness of packing based on Shannon entropy
    // 8.00 is used because 8 bits per byte is the maximum entropy value
    std::cout << "Likeliness of packing is: " << (shannons / 8.00) * 100 << "%\n";

    // Return 0 to indicate successful execution
    return 0;
}
What this does is first count the frequency of every byte in the file. Then, for each byte value, it divides that count by the total file size to get the proportion p, multiplies p by log base 2 of p, and subtracts the result from a variable called `shannons`.
Check out more amazing work by LazyLearner on GitHub:
LazyLearner’s GitHub
But does it even work??
This is the output of the `hhh.exe` payload that I previously analyzed.
This is the output of the `hhh.exe` payload under zip compression.
Let’s Play a Game!!
It’s called “Can you spot which file is encrypted with AES-256-CTR?”
This is the output of `file1` shown above.
This is the output of `file2` shown above.
If you guessed `file2`, you guessed correctly!
In conclusion, Shannon’s Entropy can be used to detect whether data is encrypted or compressed. However, it is not foolproof: a basic XOR cipher, Caesar cipher, or Vigenère cipher won’t increase the entropy by much, if at all.
This is the output when the char ‘c’ was added to every byte in `hhh.exe` (literally no difference).
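The reason there is no difference: adding the same value to every byte (or XORing every byte with the same value) only relabels the byte values, so the frequency counts stay the same and just move to different slots. The sketch below illustrates this under that assumption; the helper names `byteEntropy`, `addToEveryByte`, and `xorEveryByte` are hypothetical and are not taken from the programs in this post.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <string>

// Hypothetical helper: Shannon entropy of the bytes in data, in bits per byte.
double byteEntropy(const std::string& data)
{
    if(data.empty())
    {
        return 0.0;
    }
    std::array<std::size_t, 256> counts{};
    for(unsigned char byte : data)
    {
        counts[byte] += 1;
    }
    double entropy = 0.0;
    for(std::size_t count : counts)
    {
        if(count == 0)
        {
            continue; // Unused byte values contribute nothing
        }
        double p = static_cast<double>(count) / data.size();
        entropy -= p * std::log2(p);
    }
    return entropy;
}

// Hypothetical helper: adds the same value to every byte, wrapping modulo 256.
std::string addToEveryByte(std::string data, unsigned char shift)
{
    for(char& c : data)
    {
        unsigned char shifted = static_cast<unsigned char>(static_cast<unsigned char>(c) + shift);
        c = static_cast<char>(shifted);
    }
    return data;
}

// Hypothetical helper: XORs every byte with the same single-byte key.
std::string xorEveryByte(std::string data, unsigned char key)
{
    for(char& c : data)
    {
        unsigned char mixed = static_cast<unsigned char>(static_cast<unsigned char>(c) ^ key);
        c = static_cast<char>(mixed);
    }
    return data;
}
```

Calling `byteEntropy` on a buffer before and after `addToEveryByte(buffer, 'c')` or `xorEveryByte(buffer, 0x5A)` should give the same value up to floating-point rounding, even though the bytes themselves have changed.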
This is the output when `hhh.exe` was encrypted with a Vigenère cipher using a 5-byte key.
(This gave a total increase of ~8%, which is not a lot of diffusion compared to the 60% from AES.)
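One way to see why a short key adds so little: every byte is shifted by one of only five key bytes, and knowing a byte's position modulo the key length undoes the shift. That means the ciphertext can gain at most log2(key length) bits of byte entropy, which is about 2.32 bits (29% of the 8-bit maximum) for a 5-byte key, and usually much less in practice. The sketch below checks that bound on arbitrary input; `entropyBits`, `addRepeatingKey`, and `entropyUpperBound` are hypothetical names, and the byte-wise addition is assumed to match the Vigenère code in this post.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <string>

// Hypothetical helper: Shannon entropy of the bytes in data, in bits per byte.
double entropyBits(const std::string& data)
{
    if(data.empty())
    {
        return 0.0;
    }
    std::array<std::size_t, 256> counts{};
    for(unsigned char byte : data)
    {
        counts[byte] += 1;
    }
    double entropy = 0.0;
    for(std::size_t count : counts)
    {
        if(count == 0)
        {
            continue; // Unused byte values contribute nothing
        }
        double p = static_cast<double>(count) / data.size();
        entropy -= p * std::log2(p);
    }
    return entropy;
}

// Hypothetical helper: adds a repeating key to the data byte by byte, modulo 256.
std::string addRepeatingKey(std::string data, const std::string& key)
{
    if(key.empty())
    {
        return data; // Nothing to add
    }
    for(std::size_t i = 0; i < data.size(); ++i)
    {
        unsigned char sum = static_cast<unsigned char>(
            static_cast<unsigned char>(data[i]) + static_cast<unsigned char>(key[i % key.size()]));
        data[i] = static_cast<char>(sum);
    }
    return data;
}

// Hypothetical helper: the most byte entropy the ciphertext can have,
// given the plaintext and the key length (capped at 8 bits per byte).
double entropyUpperBound(const std::string& plain, std::size_t keyLength)
{
    double bound = entropyBits(plain) + std::log2(static_cast<double>(keyLength));
    return bound < 8.0 ? bound : 8.0;
}
```

For any input, `entropyBits(addRepeatingKey(plain, key))` should never exceed `entropyUpperBound(plain, key.size())`. AES in CTR mode is not limited this way, which is why its output sits near the full 8 bits.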
The code for the Vigenère cipher encryption is given below (very sloppily made, but it works).
#include <cstddef>  // For std::size_t
#include <fstream>  // For file handling
#include <iostream> // For input-output operations
#include <sstream>  // For reading entire file content into a stringstream
#include <string>   // For string manipulations

int main(int argc, char* argv[])
{
    // Simple "encryption key" (array of characters used to modify the file content)
    char code[] = {'h', 'e', 'l', '1', 'o'};

    // Check if the correct number of arguments is provided
    if(argc != 2)
    {
        std::cout << "[-] Usage: " << argv[0] << " {filepath}\n"; // Print usage message
        return 1; // Exit with error code
    }

    // Open the specified file in binary mode for reading
    std::ifstream file(argv[1], std::ios::binary);
    if(!file.is_open())
    {
        std::cout << "[-] error opening file!\n"; // Print error message
        return 1; // Exit with error code
    }

    // Read the entire file content into a string using a stringstream
    std::stringstream buffer;
    buffer << file.rdbuf(); // Read the file into the buffer
    std::string fileContent = buffer.str(); // Store the file content as a string
    file.close(); // Close the input file

    // Encrypt the file content using the "code" array
    for(std::size_t i = 0; i < fileContent.length(); ++i)
    {
        // Add the corresponding key byte to each byte, wrapping around modulo 256
        // Use the modulo operator to cycle through the "code" array for long files
        unsigned char sum = static_cast<unsigned char>(
            static_cast<unsigned char>(fileContent[i]) + static_cast<unsigned char>(code[i % sizeof(code)]));
        fileContent[i] = static_cast<char>(sum);
    }

    // Open a new file named "let" in binary mode for writing
    std::ofstream outFile("let", std::ios::binary);
    if(!outFile.is_open())
    {
        std::cout << "[-] error opening output file!\n"; // Print error message
        return 1; // Exit with error code
    }
    outFile << fileContent; // Write the encrypted content to the file
    outFile.close(); // Close the output file

    // Return 0 to indicate successful execution
    return 0;
}
This blog was written by LazyLearner.
Special thanks for the detailed code examples, clear explanations, and engaging content!