Network Security Library
Javascript Feeds    RSS Feed    Security Dashboard    SearchSecurity.com
About | Contact | Advertise | Site Map
Print Printer Friendly      PDF PDF Version
intrusion detection E-mail      Save Save This

Application Level Cryptography


{LANG_NAVORIGIN} Encryption
Ashish Anand 07/05/2005



Scenario:

Vulnerability:

My work:

Real time working of all concepts would be demonstrated…

“If I take a letter, lock it in a safe, hide the safe somewhere in New York, then tell you to read the letter, that’s not security. That’s obscurity. On the other hand, if I take a letter and lock it in a safe, and then give you the safe along with the design specifications of the safe and a hundred identical safes with their combinations so that you and the world’s best safecrackers can study the locking mechanism – and you still can’t open the safe and read the letter – that’s security!” [1]


Fig. 1. Basic Cryptography [7]

In addition to providing confidentiality, cryptography is often asked to do other jobs [1]:

Authentication: It should be possible for the receiver of a message to ascertain its origin; an intruder should not be able to masquerade as someone else.

Integrity: It should be possible for the receiver of a message to verify that it has not been modified in transit; in intruder should not be able to substitute a false message for a legitimate one.

Nonrepudiation: A sender should not be able to falsely deny later that he sent a message.

This paper is inclined primarily towards the integrity aspect of a successful cryptosystem.


Algorithms & Keys



A cryptographic algorithm, also called a cipher, is the mathematical function used for encryption and decryption. If the security of an algorithm is based on keeping the way that algorithm works a secret, it is a restricted algorithm. Restricted algorithms are woefully inadequate by today’s standards. A large or changing group of users cannot use them, because every time a user leaves the group, everyone else must switch to a different algorithm. If someone accidentally reveals the secret, everyone must change their algorithm.

Modern cryptography solves this problem with a key. All of the security in key based algorithms is based in the key (or keys); none is based in the details of the algorithm. This means that the algorithm can be published and analyzed. Products using the algorithm can be mass produced. It doesn’t matter if an eavesdropper knows your algorithm; if he doesn’t know your particular key, he can’t read your messages.

“If privacy is outlawed, only outlaws will have privacy…”


Choosing an algorithm



When it comes to evaluating and choosing algorithms, people have several alternatives:
  1. They can chose a published algorithm, based on the belief that a published algorithm has been scrutinized by many cryptographers; if no one has broken the algorithm yet, then it must be pretty good
  2. They can trust a manufacturer, based on the belief that a well-known manufacturer has a reputation to uphold and is unlikely to risk that reputation by selling equipment or programs with inferior algorithms.
  3. They can trust a private consultant, based on the belief that an impartial consultant is best equipped to make a reliable evaluation of different algorithms.
  4. They can trust the government, based on the belief that the government is trustworthy and wouldn’t steer its citizens wrong.
  5. They can write their own algorithms, based on the belief that their cryptographic ability is second-to-none and that they should trust nobody but themselves.
How I decided to design my own algorithm considering the above mentioned:
  1. The DES (Digital Encryption Standard) and IDEA (International Data Encryption Standard) are the most popular patented algorithms around today. I chose a much simpler and thus faster implementation of a combination of what these algorithms do, including whitening, transposition, block and stream ciphering resulting in the evolution of an inherited version of the DES variations (DESX) and IDEA used in PGP. Thus besides originality, elements of something widely published make the project qualitative.
  2. Trusting a manufacturer would mean purchasing a hardware implementation. Practically speaking, one man alone (me); cannot design an implementation of this magnitude, so the second option was ruled out.
  3. Hiring a private consultant seems feasible, but cryptography is not my profession, I haven’t dedicated my life to it either. You’re probably insane if you’re expecting someone to come up with a radical new idea all by himself and I’d rather not say why!
  4. The Indian Government is hardly involved in such projects on a scale comparable to the US or the Europeans. Besides, this project is not intended to be implemented on a scale so large that it would require the intervention of a nations Government.
  5. Writing my own algorithm…sounds fun, though the decision is certainly not based on the belief that my ability is second-to-none and that I don’t trust anyone but myself!
“Don’t worry that you’re reinventing the wheel all over again, that’s what learning is all about…!”


Compression & Encryption



Using a data compression algorithm together with an encryption algorithm makes sense for two reasons:
  1. Cryptanalysis relies on exploiting redundancies in the plaintext; compressing a file before encryption reduces these redundancies
  2. Encryption is time -consuming; compressing a file before encryption speeds up the entire process
The important thing to remember is to compress before encryption. If the encryption algorithm is any good, the cipher-text will not be compressible; it will look like random data. (This makes a reasonable test of an encryption algorithm; if the cipher-text can be compressed, then the algorithm probably isn’t very good).


Table 1. Observations upon encrypting a sample text file without any compression using the program I developed

The catch here is that the input text consisted of the source code of my program, which has about 1000 line breaks. Combine that with multiple repetitions of keywords like cout, printf, get, read, write and so on, and you have more redundancies than one could imagine. More the redundancies, greater is the compression.

In a way, this means that there are no bounds to the range of the input character set (by no bounds, I mean that it ranges across the entire ASCII character set; codes 0-255). Now that’s at least about 200 different symbols for any conventional input stream. But after the second round of my block ciphering algorithm, I’m limiting the output cipher text to an ASCII character set ranging from codes 33 – 132, that’s just about 100 different symbols. This leads to one observation and one question:

There would be a lot of character repetitions in the resulting cipher text. Imagine representing 16,947 (~17,000) characters consisting of 200 different symbols using a character set of just 100 symbols. No wonder the size of the cipher text is almost twice as that of the plain text!

Which one of the following ideal cases would lead to greater compression?
  1. Compressing an input plain text of 200 characters in which each symbol is unique (because it uses the 0-255 range, with some exceptions)
  2. Compressing an input plain text of 200 characters in which at least 100 symbols are unique (33-132) and each symbol occurs exactly twice
If you couldn’t already guess, it’s the latter. Better still, a practical example would verify that symbol occurrences would actually be more than twice, thus leading to even great amounts of compression.

What conventional block ciphers don’t do is limit their output to a smaller character set. And that’s exactly what I’ve done. Thus, I’m purposely compressing after encryption, rather than the other way round. One may argue that if the entire algorithm is intended to be made public, then compressing the data wouldn’t add to security as an eavesdropper could uncompress it to obtain the cipher text with all the redundancies. Now that would be undesirable.

To overcome this issue, I decided to treat the compressed data with a transposition function that would take the private key as an argument. Thus without the key, it would be impossible to uncompress the cipher-text.


IDEA - Block Cipher Algorithm



The first incarnation of the IDEA cipher, by Xuejia Lai and Kames Massey, surfaced in 1990. It was called PES (Proposed Encryption Standard). After subsequent strengthening of the algorithm, it was renamed to IDEA (International Data Encryption Algorithm) in 1992. IDEA is patented in Europe and the United States; the patent is held by Ascom-Tech AG.

IDEA’s key length is 128 bits. Assuming that a brute force attack is the most efficient, it would require 2128 (1038) encryptions to recover the key. Design a chip that can test a billion keys per second and throw a billion of them at the problem, and it will still take 1013 years – that’s longer than the age of the universe. An array of 1024 such chips can find the key in a day, but there aren’t enough silicon atoms in the universe to build such a machine.

This remarkable observation about this algorithm prompted me to devise a software implementation of something similar, but something that would work much faster, since IDEA uses 8 rounds of XOR and modular addition and multiplication operations along with 52 sub keys derived from the 128 bit key. I cut it down to just one round of XOR operations, while using just 2 sub keys derived from a single 10 bit key. Though increasing the key length to about 32 or 64 bits wouldn’t hurt, I shall stick to just one round of XOR operations. (Inspite of this simplification, encryption seems to remain a slow process for comparatively large amounts of data, as well as theoretically much less secure. But my implementation is obviously not intended to aim at achieving a Patent or anything, and the outputs produced from my program will appear secure enough to be implemented on a personal basis as well as within internal networks of an organization).

As already stated, one may keep adding rounds and rounds of complexity to your algorithm, but the power of your cryptosystem depends more on how you generate the key.

Some concepts involved


Transposition Ciphers [Discussed later]

Whitening


Whitening is the name given to the technique of XORing some key material with the input to a block algorithm, and XORing some other key material with the output. This was first done in the DESX variant developed by RSA Data Security, Inc.



Measuring Keyboard Latency


People’s typing patterns are both random and nonrandom. They are nonrandom enough that thy can be used as a means of identification, but they are random enough that they can be used to generate random bits. Here’s what I did:

Measure the time between successive keystrokes, then take the least significant bits of those measurements. These bits are going to be pretty random. This technique may not work on a UNIX system, since the keystrokes pass through filters and other mechanisms before they get to the program, but it will work just fine in a DOS/Windows environment.

Next, find the sum of these random numbers. Divide it by another random number between 1 – 11 generated by the C++ internal randomize() function. If this number is greater than 5, then divide the sum by this number, else multiply the sum with it.

Calculate the 10-bit binary of the result and use it as the private key.

Other sources of obtaining random data [1]: “Random Number Generators in your Compiler are NOT a source of randomness…!”

Statistics


During the program testing phase, the random key generator's results at each run were recorded, based on the typing patterns of a number of different users. 5 different users were made to generate 20 keys each, and the total number of keys (100), were entered into a database for further analysis.

Each user's data set was filtered out to obtain the number of repetitions in the 20 keys generated. The efficiency of my technique was obtained by dividing the number of unique keys generated with the total number of keys generated. Individual efficiencies for each user's data were calculated, along with the average efficiency of the entire key generation algorithm. Some users were made to type different strings each time, while others were asked to monotonously repeat the same string each time.


Table 2. Key Generation Statistics - If all 100 items are filtered, 23 repetitions are found. This implies an average efficiency of 77%


PGP – Pretty Good Privacy



“If all the personal computers in the world—260 million—were put to work on a single PGPencrypted message, it would still take an estimated 12 million times the age of the universe, on average, to break a single message.”

PGP is an e-mail security program written by Phil Zimmermann, based on the IDEA algorithm for encryption of plaintext and uses the RSA Public Key algorithm for encryption of the private key.


Fig. 2. The details of how PGP works, is out of the scope of this document [6]

PGP uses a pass phrase to encrypt your private key on your machine. You use the pass phrase to decrypt and use your private key. A pass phrase should be hard for you to forget and difficult for others to guess. It should be something already firmly embedded in your long-term memory, rather than something you make up from scratch. Why? Because if you forget your pass phrase, the game’s over! Your private key is totally and absolutely useless without your pass phrase and nothing can be done about it. PGP is cryptography that will keep major governments out of your files. It will certainly keep you out of your files, too!

Sounds pretty neat! Top that up with the fact that it’s all OpenSource, which has its own plethora of advantages which we don’t want to get into right now.




More Encryption tutorials and guides













E-Mail Link

Your IP address will be sent with this e-mail
From e-mail to e-mail



12756 Views
4.41/5 Rating
44 Votes
Newest
Highest Rated
Most Viewed
Reference

Javascript Feeds
RSS (New Papers)
Security Dashboard

About SecurityDocs
Advertise
Contact

Valid HTML 4.01!
Valid CSS!


Unless otherwise noted, all paper copyrights are owned by the author. The rest copyright 2003-2005 TechTarget

Privacy : Contact