I am always thinking about new, fun projects that fit into an afternoon or a weekend. My last project was a weekend project where I brushed up my Python, Web Mining and Natural Language Processing skills.
This project is a little smaller, perfect for an afternoon. I wanted to practice some more Python and my idea was self-destructing code. So once it runs, the file would be overwritten or gone. Additionally, I got the idea to combine this with learning more about encoding and encrypting data in Python. I sat down for an afternoon and this project was created.
You can find the full code on my GitHub.
The idea for this project started with creating self-destructing code, so let’s start with that. The goal was to write a short function that would overwrite the existing code.
To be able to do this, the first step was to get the correct filename. Instead of statically adding a string in the code, I wanted it to work even if the file were renamed. I found two options for this:
- path = sys.argv[0]: When writing a command-line tool, you can pass arguments when executing the script. These arguments are stored in a list that can be accessed with sys.argv. The very first element is the name of the script.
- path = __file__: Python offers some special attributes and functions. (If you have worked with Python before, you might have seen this: __name__ == '__main__'.) One of these attributes is __file__, which contains the path of the file.
Knowing which file to write to, I then used Python's built-in open() function to open and write to the file:
- Using the with ... as ... construction, I do not need to worry about closing the file later.
- I want to open the file in write mode (‘w’), because I want the already existing data to be overwritten.
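A minimal sketch of these steps could look like this. The function name self_destruct and the replacement parameter are illustrative placeholders, not necessarily what the full code uses.

```python
def self_destruct(path, replacement=""):
    """Overwrite the file at `path`; its previous content is lost."""
    # Mode "w" truncates the file before writing, and the with block
    # closes the file automatically when it ends.
    with open(path, "w", encoding="utf-8") as f:
        f.write(replacement)


# Inside the script itself, either option from above yields the path:
#   self_destruct(sys.argv[0])   # requires `import sys`
#   self_destruct(__file__)
```

Passing the path as a parameter keeps the function easy to try out on a throwaway file before pointing it at the script itself.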
Self Encoding Code
Since the ‘Self-Destructing Code’ was rather small, I thought I would combine it with other interesting programming and Python concepts. I decided to expand the program so that it can also encode itself, to see which encodings are possible in Python.
First, I looked at binary encoding. This transforms text/characters into zeros and ones (base 2). This is achieved the following way:
- Iterate over the string to operate on each character individually:
for x in data
- For each character, get integer representation:
- Format this integer representation to binary:
- The output of the for loop is a list. However, I wanted to get a string again, so the last step was to join the separate binary representations of the characters.
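The steps above can be sketched as follows. The function name to_binary, the 8-digit padding and the space separator are assumptions made for this illustration; the full code may format and join the values differently.

```python
def to_binary(data):
    """Return the characters of `data` as space-separated binary numbers."""
    # ord(x) gives the integer code point of the character, and
    # format(..., "08b") renders it in base 2, padded to at least 8 digits.
    return " ".join(format(ord(x), "08b") for x in data)


print(to_binary("Hi"))  # 01001000 01101001
```

Characters with code points above 255 produce more than 8 digits, so the padding only gives a uniform width for plain ASCII or Latin-1 text.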
Hex encoding transforms text/characters into hex (base 16) numbers. Hex is generally an easier, more readable representation of binary data. Python offers a method for this, .hex(), which is available once the text has been encoded to bytes, for example as UTF-8.
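As a small sketch of this (the function name to_hex is a placeholder chosen for the example):

```python
def to_hex(data):
    """Encode the text as UTF-8 bytes and return their hex representation."""
    # str.encode() produces a bytes object; bytes.hex() turns every byte
    # into two hexadecimal digits.
    return data.encode("utf-8").hex()


print(to_hex("Hi"))  # 4869
```

The reverse direction works with bytes.fromhex(...).decode("utf-8").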
Decimal encoding transforms text/characters into decimal (base 10) numbers. I only transformed letters, but this could be extended to also transform other characters.
The code is similar to the binary encoding part, except that instead of using format() to transform the integer into binary, I convert the integer directly to a string and then join all of these strings into one.
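One way this could look is sketched below. How the full code detects letters, separates the numbers and treats other characters is not described here, so str.isalpha(), the space separator and leaving non-letters unchanged are all assumptions of this example, as is the name to_decimal.

```python
def to_decimal(data):
    """Replace every letter by its decimal code point, separated by spaces."""
    # Letters become str(ord(x)); every other character is kept as it is
    # (an assumption for this sketch).
    return " ".join(str(ord(x)) if x.isalpha() else x for x in data)


print(to_decimal("Hi!"))  # 72 105 !
```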
Base64 is an encoding that allows representing binary data in an ASCII (human-readable) string format. To easily encode data in Base64 in Python, the base64 library can be used.
To encode in Base64, the string first needs to be converted to bytes. The function bytes(data, "utf-8") takes the string and returns a bytes object of that string. (When printing the bytes object it will look something like this: b'The String content'.) Then the library function base64.b64encode() can be used, with the bytes object as a parameter.
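A short sketch of these two steps follows. The function name to_base64 is a placeholder, and decoding the result back to a string at the end is an addition for this example so that it can be written to a text file.

```python
import base64


def to_base64(data):
    """Return the Base64 representation of `data` as a string."""
    raw = bytes(data, "utf-8")        # str -> bytes
    encoded = base64.b64encode(raw)   # bytes -> Base64 bytes
    # Base64 output only contains ASCII characters, so this is safe.
    return encoded.decode("ascii")


print(to_base64("Hi"))  # SGk=
```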
Self Encrypting Code
Finally, I wanted to look at some easy encryption options. Since I wanted to encrypt a rather large text, I didn’t think asymmetric encryption would make a lot of sense. So instead, I looked into ways of encrypting the code with symmetric encryption.
Substitution ciphers are some of the older, simpler and no longer secure ciphers. One very popular cipher that is based on the Caesar cipher is ROT13, which shifts each letter by 13 places. The shift of 13 is chosen because the Latin alphabet has 26 letters, so performing ROT13 twice on the same text gives back the original text.
To make this a little more interesting, I wanted to find a way to code ROT13 as a one-liner.
- Iterate over the string to operate on each character individually:
for c in data
- Check if the character is a letter; if not, the character should stay unchanged:
if c in abc else c
- If the character is a letter, do the substitution:
abc[(abc.find(c) + 13) % 26]
abc.find(c) looks for the index of the character in the alphabet string
+ 13 shifts the character
% 26 makes sure that the index stays in the range of the alphabet string
abc[idx] then returns the shifted character
- Finally, join the list elements into a string:
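Assembled from the pieces above, the one-liner could look like the sketch below. It assumes that abc holds the 26 lowercase letters, which means uppercase letters are left unchanged; the full code may define abc differently.

```python
import string

abc = string.ascii_lowercase  # assumption: lowercase alphabet only


def rot13(data):
    """Shift every lowercase letter by 13 places; keep everything else."""
    return "".join(abc[(abc.find(c) + 13) % 26] if c in abc else c for c in data)


print(rot13("hello"))         # uryyb
print(rot13(rot13("hello")))  # hello
```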
I also wanted to look at a slightly more sophisticated symmetric encryption. I found Fernet, which is part of the cryptography package. I have not looked too deeply into how secure this implementation is, since this project does not need very secure encryption.
The first step for encrypting is to generate a key. The data can then be encrypted with the help of this key. I found that Fernet expects the data as bytes, so I encoded it as UTF-8 first. To write the encrypted data and the key to a file, I decoded both back to strings.
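These steps might be sketched as follows, assuming the third-party cryptography package is installed. The function name encrypt_text and returning both values as strings are choices made for this example. The import sits inside the function so that the rest of a script would still run without the package.

```python
def encrypt_text(data):
    """Encrypt `data` with a fresh Fernet key; return (token, key) as strings."""
    # Third-party dependency: pip install cryptography
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()                  # URL-safe Base64 bytes
    token = Fernet(key).encrypt(data.encode("utf-8"))
    # Both values are ASCII-only bytes, so they can be decoded and
    # written to text files.
    return token.decode("utf-8"), key.decode("utf-8")
```

To get the original text back, Fernet(key).decrypt(token) can be used with the saved key, followed by decoding the resulting bytes as UTF-8.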
I decided to save the key to another file that starts with the same name. To make this possible, I needed to remove the file ending .py. To achieve this, I split the string at the ‘.’: path.split("."). This returns a list with the file name at position 0 and the file ending at position 1.
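A small sketch of building the key file name this way is shown below. The .key ending and both function names are hypothetical. Splitting at every ‘.’ only works for simple names such as destruct.py, so the second function shows os.path.splitext from the standard library as a sturdier alternative for paths that contain additional dots.

```python
import os


def key_path_split(path):
    """Build the key file name by splitting at '.', as described above."""
    name = path.split(".")[0]   # "destruct.py" -> "destruct"
    return name + ".key"        # hypothetical file ending


def key_path_splitext(path):
    """Alternative: only strip the last extension, even if the path has more dots."""
    root, _ext = os.path.splitext(path)
    return root + ".key"


print(key_path_split("destruct.py"))         # destruct.key
print(key_path_splitext("./my.script.py"))   # ./my.script.key
```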
Now that I had created the different functions to replace the original code in the file, I needed a way to choose which function to use.
First, you can see the already mentioned line if __name__ == '__main__':. This means that when the file is executed directly (and not imported into another file), the code after it is run, like a main function. An alternative would have been to put this code into def main(): and then call this main function.
The last Python concept from this small project that I want to talk about is asking for user input. This is done with the line instruction = input("Enter Instruction: "). The parameter of the input function is shown on the command line as a prompt. The program waits for an input and then saves it in the variable instruction, where it can be used afterwards.
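How the input might then select a function is sketched below. The instruction names "hex" and "base64", the ACTIONS dictionary and the read parameter are hypothetical and only serve the illustration; the full code may branch differently, for example with if/elif.

```python
import base64


def to_hex(data):
    return data.encode("utf-8").hex()


def to_base64(data):
    return base64.b64encode(data.encode("utf-8")).decode("ascii")


# Hypothetical instruction names mapped to the functions that handle them.
ACTIONS = {"hex": to_hex, "base64": to_base64}


def main(source, read=input):
    """Ask for an instruction and return `source` transformed accordingly."""
    instruction = read("Enter Instruction: ")
    action = ACTIONS.get(instruction.strip().lower())
    if action is None:
        return "Unknown instruction: " + instruction
    return action(source)


# In a script, the call would sit under the guard discussed above:
#   if __name__ == '__main__':
#       print(main("print('hello')"))
```

Passing the reader function as a parameter makes it possible to try main() without typing anything, for example main("abc", read=lambda prompt: "hex").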
For me, this was a fun little project to brush up on my Python programming skills. I especially enjoyed learning more about Python's encoding formats and how to convert between them, which is something I have struggled with before.
Even though this project did not have a real purpose, I believe that learning and practising concepts and programming languages in a fun way makes these kinds of projects worth it.
If you have a small project in mind that you want to code, but you don’t see a purpose for it, I encourage you to do it anyway!