Mirror of https://github.com/TheAlgorithms/Python.git, synced 2025-07-06 10:31:29 +08:00
Fix sphinx/build_docs warnings for ciphers (#12485)
* Fix sphinx/build_docs warnings for ciphers
* Fix
@@ -11,33 +11,31 @@ def decrypt_caesar_with_chi_squared(
 """
 Basic Usage
 ===========

 Arguments:
-* ciphertext (str): the text to decode (encoded with the caesar cipher)
+* `ciphertext` (str): the text to decode (encoded with the caesar cipher)

 Optional Arguments:
-* cipher_alphabet (list): the alphabet used for the cipher (each letter is
-a string separated by commas)
-* frequencies_dict (dict): a dictionary of letter frequencies where keys are
-the letters and values are a percentage representation of the frequency as
-a decimal/float
-* case_sensitive (bool): a boolean value: True if the case matters during
-decryption, False if it doesn't
+* `cipher_alphabet` (list): the alphabet used for the cipher (each letter is
+a string separated by commas)
+* `frequencies_dict` (dict): a dictionary of letter frequencies where keys are
+the letters and values are a percentage representation of the frequency as
+a decimal/float
+* `case_sensitive` (bool): a boolean value: ``True`` if the case matters during
+decryption, ``False`` if it doesn't

 Returns:
-* A tuple in the form of:
-(
-most_likely_cipher,
-most_likely_cipher_chi_squared_value,
-decoded_most_likely_cipher
-)
+* A tuple in the form of:
+(`most_likely_cipher`, `most_likely_cipher_chi_squared_value`,
+`decoded_most_likely_cipher`)

-where...
-- most_likely_cipher is an integer representing the shift of the smallest
-chi-squared statistic (most likely key)
-- most_likely_cipher_chi_squared_value is a float representing the
-chi-squared statistic of the most likely shift
-- decoded_most_likely_cipher is a string with the decoded cipher
-(decoded by the most_likely_cipher key)
+where...
+- `most_likely_cipher` is an integer representing the shift of the smallest
+chi-squared statistic (most likely key)
+- `most_likely_cipher_chi_squared_value` is a float representing the
+chi-squared statistic of the most likely shift
+- `decoded_most_likely_cipher` is a string with the decoded cipher
+(decoded by the most_likely_cipher key)


 The Chi-squared test
@@ -45,52 +43,57 @@ def decrypt_caesar_with_chi_squared(

 The caesar cipher
 -----------------

 The caesar cipher is a very insecure encryption algorithm, however it has
 been used since Julius Caesar. The cipher is a simple substitution cipher
 where each character in the plain text is replaced by a character in the
 alphabet a certain number of characters after the original character. The
 number of characters away is called the shift or key. For example:

-Plain text: hello
-Key: 1
-Cipher text: ifmmp
-(each letter in hello has been shifted one to the right in the English alphabet)
+| Plain text: ``hello``
+| Key: ``1``
+| Cipher text: ``ifmmp``
+| (each letter in ``hello`` has been shifted one to the right in the English alphabet)

 As you can imagine, this doesn't provide lots of security. In fact
 decrypting ciphertext by brute-force is extremely easy even by hand. However
-one way to do that is the chi-squared test.
+one way to do that is the chi-squared test.

 The chi-squared test
--------------------
+--------------------

 Each letter in the English alphabet has a frequency, or the amount of times
 it shows up compared to other letters (usually expressed as a decimal
 representing the percentage likelihood). The most common letter in the
-English language is "e" with a frequency of 0.11162 or 11.162%. The test is
-completed in the following fashion.
+English language is ``e`` with a frequency of ``0.11162`` or ``11.162%``.
+The test is completed in the following fashion.

 1. The ciphertext is decoded in a brute force way (every combination of the
-26 possible combinations)
+``26`` possible combinations)
 2. For every combination, for each letter in the combination, the average
 amount of times the letter should appear in the message is calculated by
-multiplying the total number of characters by the frequency of the letter
+multiplying the total number of characters by the frequency of the letter.

-For example:
-In a message of 100 characters, e should appear around 11.162 times.
+| For example:
+| In a message of ``100`` characters, ``e`` should appear around ``11.162``
+times.

-3. Then, to calculate the margin of error (the amount of times the letter
-SHOULD appear with the amount of times the letter DOES appear), we use
-the chi-squared test. The following formula is used:
+3. Then, to calculate the margin of error (the amount of times the letter
+SHOULD appear with the amount of times the letter DOES appear), we use
+the chi-squared test. The following formula is used:

-Let:
-- n be the number of times the letter actually appears
-- p be the predicted value of the number of times the letter should
-appear (see #2)
-- let v be the chi-squared test result (referred to here as chi-squared
-value/statistic)
+Let:
+- n be the number of times the letter actually appears
+- p be the predicted value of the number of times the letter should
+appear (see item ``2``)
+- let v be the chi-squared test result (referred to here as chi-squared
+value/statistic)

-(n - p)^2
---------- = v
-p
+::
+
+(n - p)^2
+--------- = v
+p

 4. Each chi squared value for each letter is then added up to the total.
 The total is the chi-squared statistic for that encryption key.
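For readers of this hunk, the shift described in the docstring's "The caesar cipher" section can be sketched in a few lines. This is only an illustration and is not code from the commit: the helper name `caesar_shift` and its `alphabet` parameter are hypothetical, and it assumes a lowercase alphabet with every other character passed through unchanged.

```python
import string


def caesar_shift(text: str, key: int, alphabet: str = string.ascii_lowercase) -> str:
    """Shift each character of `text` found in `alphabet` by `key` places.

    Hypothetical helper for illustration; characters outside `alphabet`
    (spaces, punctuation) are copied through unchanged.
    """
    shifted = []
    for char in text:
        if char in alphabet:
            # Wrap around the end of the alphabet with the modulo operator.
            new_index = (alphabet.index(char) + key) % len(alphabet)
            shifted.append(alphabet[new_index])
        else:
            shifted.append(char)
    return "".join(shifted)


print(caesar_shift("hello", 1))   # ifmmp
print(caesar_shift("ifmmp", -1))  # hello
```

A negative key undoes the shift, which is why trying every key in turn is enough to brute-force the cipher.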
@@ -98,16 +101,16 @@ def decrypt_caesar_with_chi_squared(
 to be the decoded answer.

 Further Reading
-================
+===============

-* http://practicalcryptography.com/cryptanalysis/text-characterisation/chi-squared-
-statistic/
+* http://practicalcryptography.com/cryptanalysis/text-characterisation/chi-squared-statistic/
 * https://en.wikipedia.org/wiki/Letter_frequency
 * https://en.wikipedia.org/wiki/Chi-squared_test
 * https://en.m.wikipedia.org/wiki/Caesar_cipher

 Doctests
 ========

 >>> decrypt_caesar_with_chi_squared(
 ... 'dof pz aol jhlzhy jpwoly zv wvwbshy? pa pz avv lhzf av jyhjr!'
 ... ) # doctest: +NORMALIZE_WHITESPACE
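The numbered steps in the docstring (try every shift, compute the expected count `p` for each letter, sum `(n - p)^2 / p`, keep the smallest total) can be sketched as below. This is a textbook-style illustration and may differ from what `decrypt_caesar_with_chi_squared` actually does internally. The names `chi_squared` and `rank_shifts` are hypothetical, and the three-letter alphabet with its frequencies is a made-up toy chosen so the result can be checked by hand.

```python
from collections import Counter


def chi_squared(text: str, frequencies: dict[str, float]) -> float:
    """Sum (n - p)^2 / p over every letter listed in `frequencies`."""
    letters = [char for char in text if char in frequencies]
    total = len(letters)
    if total == 0:
        return 0.0  # nothing to score; avoids dividing by zero below
    counts = Counter(letters)
    statistic = 0.0
    for letter, frequency in frequencies.items():
        n = counts[letter]     # times the letter actually appears
        p = frequency * total  # times the letter is expected to appear
        statistic += (n - p) ** 2 / p
    return statistic


def rank_shifts(
    ciphertext: str, alphabet: str, frequencies: dict[str, float]
) -> tuple[int, float, str]:
    """Return (shift, statistic, decoded) for the lowest-scoring shift."""
    best = None
    for shift in range(len(alphabet)):
        # Undo the candidate shift; leave non-alphabet characters alone.
        decoded = "".join(
            alphabet[(alphabet.index(char) - shift) % len(alphabet)]
            if char in alphabet
            else char
            for char in ciphertext
        )
        statistic = chi_squared(decoded, frequencies)
        if best is None or statistic < best[1]:
            best = (shift, statistic, decoded)
    return best


# Toy data (an assumption for illustration): "a" is by far the most common letter.
toy_frequencies = {"a": 0.8, "b": 0.1, "c": 0.1}
print(rank_shifts("bbbbbbbbca", "abc", toy_frequencies))
# (1, 0.0, 'aaaaaaaabc')
```

In the toy run, shift `1` decodes to eight `a`s, one `b` and one `c`, which matches the expected counts exactly, so its statistic is `0.0`; the other two shifts each score `55.125`.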