Codecs
Stego Suite provides a number of methods for hiding a message within an image, some practical and some less so. These methods are referred to as codecs.
The following codecs are implemented:
- LSB: Least Significant Bit
- Edges: Edge Embedding
- SSDB: Randomly Spaced LSB
- Noise: Embedding in Noise
- Not: Inverted LSB
- Concat: Data Concatenation
Computers store images as a list of numbers, each of which represents a colour. The LSB codec hides data in an image by adding or subtracting one from some of these numbers, so that we can decide whether each one is odd or even. When we want to retrieve the data from the image, the LSB codec can use these tiny bits of information - whether each number is odd or even - to decode our message. This works because even though we only store a tiny amount of data in each number, there are lots and lots of numbers in each image. And because we're only changing each colour by a small amount each way, the visual change to the image is not noticeable when you look at it.
More technically, the LSB codec works by storing data in the least significant bit of every byte of image data. The message is also prefixed with a variable-length integer representing the message length, so that we can tell where the message ends. A variable-length integer is used because a fixed-size one (e.g. a 64-bit integer) would usually start with a large block of zeroes, which could give away the presence of a message.
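The scheme above can be sketched in a few lines of Python. This is a minimal illustration, not the suite's own code: the function names are hypothetical, the image is treated as a flat `bytes` object, and the length prefix is assumed to be a LEB128-style integer (7 payload bits per byte), since the text does not say which variable-length format is used.

```python
def encode_varint(n: int) -> bytes:
    """Assumed LEB128-style prefix: 7 bits per byte, high bit set on all but the last."""
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


def bits_of(data: bytes):
    """Yield the bits of `data`, most significant bit of each byte first."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def lsb_embed(pixels: bytes, message: bytes) -> bytes:
    """Write length prefix + message into the lowest bit of each image byte."""
    bits = list(bits_of(encode_varint(len(message)) + message))
    if len(bits) > len(pixels):
        raise ValueError("message too long for this image")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the LSB, then set it to the data bit
    return bytes(out)


def lsb_extract(pixels: bytes) -> bytes:
    """Read the length prefix from the LSBs, then that many message bytes."""
    lsbs = (p & 1 for p in pixels)

    def read_byte() -> int:
        value = 0
        for _ in range(8):
            value = (value << 1) | next(lsbs)
        return value

    length, shift = 0, 0
    while True:
        b = read_byte()
        length |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:  # high bit clear marks the last prefix byte
            break
    return bytes(read_byte() for _ in range(length))
```

Each image byte changes by at most one, which is why the result looks the same as the original to the eye.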
You will notice two options when using the LSB codec. The first, "bits", specifies how many bits from each byte to change. This can be anything between one (the default) and eight: higher values mean that more data can be stored in a smaller image, but they make the changes to the image more noticeable. When the second option - "MSB" - is enabled, the codec will store data in the most significant bits instead of least. This intentionally makes the changes to the image more obvious, and is not intended for practical steganography. You can make some pretty cool pictures though :)
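As a rough illustration of what the two options do to a single byte, the sketch below overwrites `bits` bits at either the low or the high end. The helper names `embed_chunk` and `extract_chunk` are hypothetical and only mirror the options described above.

```python
def embed_chunk(byte: int, chunk: int, bits: int = 1, msb: bool = False) -> int:
    """Overwrite `bits` bits of `byte` with `chunk` (0 <= chunk < 2**bits)."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    shift = 8 - bits if msb else 0       # MSB mode targets the top of the byte
    mask = ((1 << bits) - 1) << shift    # the bits that will be replaced
    return (byte & ~mask & 0xFF) | (chunk << shift)


def extract_chunk(byte: int, bits: int = 1, msb: bool = False) -> int:
    """Read back the `bits` bits written by embed_chunk."""
    shift = 8 - bits if msb else 0
    return (byte >> shift) & ((1 << bits) - 1)
```

With the default of one bit, a colour value moves by at most 1. In MSB mode a single bit can move it by 128, which is why the changes become so visible.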
The Edges codec hides data in the edges of objects detected in the image. This aims to ensure that the data cannot be read by someone simply looking at all the least significant bits in the image.
This method only works if the image has more than one channel, and each channel is represented by one byte per pixel. For it to achieve good results, the image should not be too uniform.
The codec starts by splitting an image into its three channels. Then, one channel is chosen, and an edge detection algorithm is applied to it.
Using this image as an example:
In this example, the blue channel will be chosen. This would be the greyscale image of the edges that have been detected in it:
To encode the data, we first need to indicate which channel has been chosen. This is achieved by setting the least significant bit of that channel in pixel (0,0) to one. In the other two channels, it will be set to 0.
Now, the actual encoding of the message starts. The data is saved by modifying the LSB of all the pixels that are detected as edges. The data will only be stored in the other two channels, starting from the top-left and continuing row by row. This results in one channel containing the mask, while the other two channels contain the data.
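The steps above can be sketched as follows. Several details here are assumptions made for illustration only: the edge detector is a simple neighbour-difference threshold rather than whatever algorithm the suite uses, the image is a nested list of `[r, g, b]` values, both data channels of a pixel are filled before moving to the next pixel, and the number of bits to read back is passed in explicitly. All function names are hypothetical.

```python
def edge_mask(channel, threshold=16):
    """Stand-in detector: a pixel is an edge if it differs strongly from its
    right or lower neighbour. LSBs are ignored so the marker bit written into
    pixel (0,0) cannot change the mask between encoding and decoding."""
    h, w = len(channel), len(channel[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            here = channel[y][x] >> 1
            right = channel[y][x + 1] >> 1 if x + 1 < w else here
            below = channel[y + 1][x] >> 1 if y + 1 < h else here
            mask[y][x] = max(abs(here - right), abs(here - below)) * 2 > threshold
    mask[0][0] = False  # pixel (0,0) is reserved for the channel marker
    return mask


def edges_embed(image, bits, chosen=2):
    """image[y][x] is a mutable [r, g, b] list; bits is a list of 0/1 values."""
    mask = edge_mask([[px[chosen] for px in row] for row in image])
    others = [c for c in range(3) if c != chosen]
    slots = [(y, x, c)
             for y, row in enumerate(mask)
             for x, is_edge in enumerate(row) if is_edge
             for c in others]
    if len(bits) > len(slots):
        raise ValueError("not enough edge pixels for this message")
    for c in range(3):  # marker: LSB is 1 only in the chosen channel
        image[0][0][c] = (image[0][0][c] & 0xFE) | int(c == chosen)
    for (y, x, c), bit in zip(slots, bits):
        image[y][x][c] = (image[y][x][c] & 0xFE) | bit
    return image


def edges_extract(image, n_bits):
    """Find the mask channel from pixel (0,0), rebuild the mask, read the LSBs."""
    chosen = next(c for c in range(3) if image[0][0][c] & 1)
    mask = edge_mask([[px[chosen] for px in row] for row in image])
    others = [c for c in range(3) if c != chosen]
    out = []
    for y, row in enumerate(mask):
        for x, is_edge in enumerate(row):
            if is_edge:
                out.extend(image[y][x][c] & 1 for c in others)
    return out[:n_bits]
```

The important property is that the mask channel is left alone apart from the marker bit, so the decoder can recompute exactly the same set of edge pixels.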
The SSDB codec randomly chooses which byte to use to save the next bit of the data using the LSB method. If we used true randomness, there would be no way that the data could be decoded. That's why the use of a password is necessary. It gets hashed and then used as the seed for the random number generation.
One thing to note is that, because the bytes whose LSB is changed are chosen randomly, the number of bits that can be stored in practice is lower than the number of bytes available in the image. As the image fills up, it becomes increasingly difficult to land on the last few bytes that have not already been changed.
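A minimal sketch of the password-seeded selection is shown below. It assumes SHA-256 for the hash and Python's `random.Random` as the generator, neither of which is confirmed by the text, and the function names are hypothetical. It also swaps in `sample`, which draws distinct positions up front, instead of picking one byte at a time and retrying on collisions; that avoids the slowdown described above but is not necessarily what the codec does.

```python
import hashlib
import random


def positions(password: str, n_bytes: int, n_bits: int) -> list[int]:
    """Derive a repeatable list of distinct byte positions from the password."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()  # assumed hash
    rng = random.Random(int.from_bytes(digest, "big"))          # seeded PRNG
    return rng.sample(range(n_bytes), n_bits)                   # no repeats


def ssdb_embed(pixels: bytes, bits: list[int], password: str) -> bytes:
    """Write each bit into the LSB of a password-determined byte."""
    out = bytearray(pixels)
    for pos, bit in zip(positions(password, len(out), len(bits)), bits):
        out[pos] = (out[pos] & 0xFE) | bit
    return bytes(out)


def ssdb_extract(pixels: bytes, n_bits: int, password: str) -> list[int]:
    """Regenerate the same positions from the password and read the LSBs."""
    return [pixels[pos] & 1 for pos in positions(password, len(pixels), n_bits)]
```

Because the same password always produces the same seed, the decoder visits the bytes in the same order as the encoder. Note that `random.Random` is not a cryptographic generator, so this sketch should not be read as a security recommendation.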
As explained above, the LSB encoding stores encoded data in the least significant bit of each byte of the file. However, this causes a problem when using an image taken from the web. With access to the original file, it is reasonably simple to compare the two files byte-for-byte, revealing any changes made to the file. This can expose the changed bits and, in turn, your encoded message.
The Noise codec hides your message in the same way as LSB, but uses an image of random noise instead. This means that it is impossible to tell which bits have been changed, as the "source image" is completely random.
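The idea can be illustrated with a short sketch: generate a fresh random cover, embed into its LSBs, and discard the cover. The function names are hypothetical, and `secrets.token_bytes` is an assumed source of randomness; the text does not say how the noise is generated.

```python
import secrets


def noise_cover(width: int, height: int, channels: int = 3) -> bytes:
    """Generate one random byte per channel per pixel."""
    return secrets.token_bytes(width * height * channels)


def embed_lsb(cover: bytes, bits: list[int]) -> bytes:
    """Plain LSB embedding, as in the LSB codec, applied to the noise cover."""
    if len(bits) > len(cover):
        raise ValueError("message too long for this cover")
    out = bytearray(cover)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


cover = noise_cover(4, 4)
stego = embed_lsb(cover, [1, 0, 1, 1])
# The cover is never published, so there is no original to compare against.
```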
The Not codec is a joke codec. It is not intended to be taken seriously. :p
Unlike the LSB encoding where your message is hidden within an image, the "not" codec hides the image inside another image of your secret message. In a subversion of expectations, your secret message is freely available, and your image data is hidden instead.
There may be issues if your source file is too large.
Although we use text wrapping and scaling to make sure your text fits nicely, if your message is really large it may cause wrapping or rendering issues, or it may just be too small to read.
The data concatenation method adds data (bytes) to the beginning of the image using a custom encoding format. The encoding is built from two basic operations: a bit shift followed by an XOR. In more detail, each byte secret[i] is shifted left by 1 and the result is XORed with the original secret[i], i.e. (secret[i] << 1) XOR secret[i]. The encoded data is stored between START and END header bytes so that the decoding process can easily identify it.
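The transform and framing can be sketched as below. This is an illustration under stated assumptions, not the suite's implementation: the marker values `START` and `END` are made up, the function names are hypothetical, and the shifted result is assumed to be truncated to one byte. Under that assumption the transform happens to be exactly reversible, which the sketch uses for decoding; the suite's own decoder may work differently.

```python
START = b"<<START>>"  # hypothetical marker values, not the suite's real ones
END = b"<<END>>"


def scramble(b: int) -> int:
    """(b << 1) XOR b, truncated to 8 bits (the truncation is an assumption)."""
    return ((b << 1) ^ b) & 0xFF


def unscramble(y: int) -> int:
    """Invert scramble: each output bit is the XOR of all lower-or-equal input bits."""
    x = y
    x ^= (x << 1) & 0xFF
    x ^= (x << 2) & 0xFF
    x ^= (x << 4) & 0xFF
    return x


def concat_embed(image: bytes, secret: bytes) -> bytes:
    """Place the framed, scrambled secret in front of the image bytes."""
    return START + bytes(scramble(b) for b in secret) + END + image


def concat_extract(stego: bytes) -> bytes:
    """Locate the framed block and undo the scrambling."""
    begin = stego.index(START) + len(START)
    end = stego.index(END, begin)
    return bytes(unscramble(b) for b in stego[begin:end])
```

A real format would also need to handle the case where the scrambled data happens to contain the END marker; this sketch ignores that.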
To decode and retrieve the extra data from the image, make sure the secret consists only of letters (upper and lower case) and digits; otherwise the decoded secret will be hard to make out. This is because the decoding process guesses whether each character could be part of the secret, so a secret that uses characters outside this set may be hard to read. Once the secret is decoded, it is written onto the image itself as well as being extracted.