Tuesday, September 10, 2019

COMPRESSION SYSTEM T.G.V.


Use in Cryptography

Starting from a set of n bytes (characters), take the first four bytes.
Each byte is split into two nibbles (groups of 4 bits), each representing a value from 0 to 15 in decimal. Each of the 8 nibbles is then compared with the following table:
0000
0001
0010
0011
0100
0101
0110
0111
1000
1001
1010
1011
1100
1101
1110
1111
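
As a minimal sketch of this first step, assuming Python and a hypothetical helper name `split_nibbles` (neither appears in the original post), the first four bytes can be broken into their eight nibbles like this:

```python
def split_nibbles(data: bytes) -> list[int]:
    """Return the high and low nibble of each of the first four bytes."""
    nibbles = []
    for byte in data[:4]:
        nibbles.append(byte >> 4)    # high nibble (upper 4 bits)
        nibbles.append(byte & 0x0F)  # low nibble (lower 4 bits)
    return nibbles


nibbles = split_nibbles(b"\x12\xab\xf0\x07rest")
print([format(n, "04b") for n in nibbles])
```

Each printed entry is one row of the table above.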

Suppose we want to express 4 bits with only 3 bits. We can refer to an element of the upper or lower half of the table with 3 bits, plus one flag bit per nibble (0 or 1) that indicates which half.
4 2 1
x x x
Therefore we can write a value from 0 to 7, plus 1 flag bit that will be 1 or 0.
At this point we have 8 groups of 3 bits plus 1 byte (8 bits) of flags, ordered from 0 to 7, where each flag refers to the nibble at that position.
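
The split into a 3-bit value and a flag bit can be sketched as follows. This is only an illustration under assumptions: the flag is taken to be the top bit of each nibble (upper or lower half of the table), the flags are packed into one byte with nibble 0 first, and the names `split_flag` and `join_flag` are hypothetical.

```python
def split_flag(nibbles: list[int]) -> tuple[list[int], int]:
    """Split each 4-bit nibble into a 3-bit value and a 1-bit flag.

    Assumption: the flag is the top bit of the nibble. The eight flags
    are packed into one byte, with the flag of nibble 0 in the top bit.
    """
    values = [n & 0b0111 for n in nibbles]
    flags = 0
    for n in nibbles:
        flags = (flags << 1) | (n >> 3)
    return values, flags


def join_flag(values: list[int], flags: int) -> list[int]:
    """Rebuild the original nibbles from the 3-bit values and flag byte."""
    count = len(values)
    return [(((flags >> (count - 1 - i)) & 1) << 3) | v
            for i, v in enumerate(values)]


nibbles = [1, 2, 10, 11, 15, 0, 0, 7]
values, flags = split_flag(nibbles)
print(values, format(flags, "08b"))
print(len(values) * 3 + 8, "bits after the split,", len(nibbles) * 4, "before")
```

The bit count is the same before and after (8 × 3 + 8 = 32), so the split rearranges the data without shrinking it.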
If we want to represent 3 bits with only 2 bits we find a problem similar to the previous one.
2 1
x x
Therefore we can write from 0 to 3, which will indicate its value plus 1 bit that will be 1 or 0.
x000
x001
x010
x011
x100
x101
x110
x111

So now the signal bit tells me whether the two bits of each nibble sit in the left or right position (the same as before, but left and right instead of below and above). Now we have 8 groups of 2 bits and one more "pointer" byte (8 bits); together with the previous one, that makes 2 pointer bytes.
If we want to represent 2 bits with only 1 bit, we find a problem similar to the previous one.
1
x
Therefore we can write from 0 to 1, which will indicate its value plus 1 bit that will be 1 or 0.
xx00
xx01
xx10
xx11
So now the signal bit indicates the left or right position, 0 or 1, and I keep the 8 lowest-weight bits (1 byte) and one more “pointer” byte (8 bits). In other words, we have not reduced the number of bytes or nibbles. We have only swapped rows for columns.
0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
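
The "rows for columns" observation can be sketched as a bit transposition. This is an illustration, not necessarily the exact layout the post intends: it assumes the eight nibbles are regrouped into four 8-bit planes, one per bit weight, and the names `transpose` and `untranspose` are hypothetical.

```python
def transpose(nibbles: list[int]) -> list[int]:
    """Turn eight 4-bit nibbles into four 8-bit planes (rows to columns)."""
    planes = []
    for bit in (3, 2, 1, 0):           # bit weights 8, 4, 2, 1
        plane = 0
        for n in nibbles:
            plane = (plane << 1) | ((n >> bit) & 1)
        planes.append(plane)
    return planes


def untranspose(planes: list[int]) -> list[int]:
    """Inverse of transpose: rebuild the eight nibbles from four planes."""
    nibbles = []
    for pos in range(7, -1, -1):       # nibble 0 sits in the top bit of each plane
        n = 0
        for plane in planes:
            n = (n << 1) | ((plane >> pos) & 1)
        nibbles.append(n)
    return nibbles


nibbles = [1, 2, 10, 11, 15, 0, 0, 7]
planes = transpose(nibbles)
print([format(p, "08b") for p in planes])
print(untranspose(planes) == nibbles)
```

Eight nibbles (32 bits) go in and four bytes (32 bits) come out, which is why no space is saved.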
Compression systems

Existing systems offer several options for lossless schemes, but the dictionary-based ones are an interesting example. Let's start from the above and suppose that we define the probability that a 0 will be followed by another 0 or by a 1; that is, 6 bits that tell me this, and I take 1 single byte.
If a 0 is followed by another 0 in 2 bits, the code would be 101xxx; if it were a 0 followed by a 1, it would be 110xxx; and in the best case, 111xxx would indicate a 0 followed by a 1 and then a 0. Now let's compare the different values and see what we get. Of course, we are generating 2 bytes (16 bits) to represent only 1 byte (8 bits), but those 16 bits contain 14 zeros and 2 ones. That is to say, we have reduced 8 bits to 2 bits plus their position, which takes 4 bits, plus 1 more bit that indicates left or right.
But what if the byte consisted of two equal nibbles? We could represent it by 1 nibble together with its position. That is, for two nibbles, when the same value follows, 0001 would indicate the position of movement (0000 being the initial position) and the 4 lowest-weight bits would indicate the nibble.
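
To get a feel for how often this case applies, here is a small sketch that only tests the condition itself (the position encoding is not specified precisely enough in the text to implement). The helper name `has_equal_nibbles` is hypothetical.

```python
def has_equal_nibbles(byte: int) -> bool:
    """True when the high and low nibble of the byte are identical."""
    return (byte >> 4) == (byte & 0x0F)


matches = [b for b in range(256) if has_equal_nibbles(b)]
print(len(matches), "of 256 byte values have two equal nibbles")
print([format(b, "08b") for b in matches[:4]])
```

Only the bytes of the form 0x00, 0x11, 0x22 and so on qualify, so the shortcut covers a small fraction of all possible byte values.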
Then the tables are simplified:
00 01 10 11 00 01 10 11
This works because they are the same in both nibbles. But what if, instead of doing it with the two nibbles, we did it with the 8 nibbles of 4 bytes and, instead of 2 bits, used 4 bits saying where the value repeats, from 0 to 15?
In the worst case I need 16 bytes to represent 16 nibbles (8 bytes), so I am doubling the size. But in case the bits a, b, c and d were 0 (the worst case), I introduce another 8 bytes (16 nibbles), and then at least I have 16 bytes to represent 16 bytes, that is, a 1:1 compression ratio.
But what if we used a dictionary with the possible values of nibbles?

00 00 01
00 00 10
00 00 11
00 01 00
00 01 01
00 01 10
00 01 11
00 10 00
00 10 01
00 10 10
00 10 11
00 11 00
01 00 00
01 00 01
01 00 10
01 00 11
01 01 00
01 01 01
01 01 10
01 01 11
01 10 00
01 10 01
01 10 10
01 10 11
01 11 00
01 11 01
01 11 10
01 11 11
Now we are going to make a 'dictionary' with the possible values of a data set of 23 pairs of nibbles (ATGC: Adenine, Thymine, Guanine and Cytosine as binary values), the human genome. This is the information that we will send, and it will correspond to wider information in the table stored in the receiver/sender. The TGV compressor/decompressor will compress and decompress the information in 'DNA/DNA chains'. This method HAS NO LOSS OF INFORMATION, LIKE THE ABOVE.
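
A shared-table scheme of this kind can be illustrated with plain 2-bit packing of the four bases. This is a simplified stand-in, not the T.G.V. method itself: the code assignments in `ENCODE` and the function names `compress` and `decompress` are assumptions made for the example.

```python
# Hypothetical shared table: sender and receiver hold the same mapping.
ENCODE = {"A": 0b00, "T": 0b01, "G": 0b10, "C": 0b11}
DECODE = {code: base for base, code in ENCODE.items()}


def compress(sequence: str) -> tuple[bytes, int]:
    """Pack each base into 2 bits; return the packed bytes and base count."""
    out = bytearray()
    for i in range(0, len(sequence), 4):
        chunk = sequence[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | ENCODE[base]
        byte <<= 2 * (4 - len(chunk))   # left-align a short final chunk
        out.append(byte)
    return bytes(out), len(sequence)


def decompress(packed: bytes, count: int) -> str:
    """Unpack 2-bit codes back into bases, dropping the padding."""
    bases = []
    for byte in packed:
        for shift in (6, 4, 2, 0):
            bases.append(DECODE[(byte >> shift) & 0b11])
    return "".join(bases[:count])


packed, count = compress("ATGCGA")
print(packed.hex(), count)
print(decompress(packed, count))
```

Because both sides use the same table and the base count travels with the data, the round trip restores the sequence exactly. Compared with one byte per character, four bases fit in each byte.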
The idea is that the sender and receiver of the message share a table with the 'conversion values'. snehtuaÆ
