Phrases Generator
A.k.a. The Supercalifragilisti script
Cracking 50% of a password list is easy.
Reaching 60% is nice.
Achieving 70% requires some work (or patience).
Getting beyond that needs some creative thinking.
The remaining passwords are harder to guess, since they are most likely very long or have a crazy amount of entropy. In the latter case there's nothing we can do: we have to rely on bruteforcing, or get a good wordlist and start generating random rules.
However, we know that users tend to create passphrases. We can scrape Twitter to find some interesting word combinations, but people also love to use quotes from their favorite book.
This is why I created the Phrases script.
The script
Phrases does a very simple thing: given a text, it slides a "window" of several words over it, one word at a time, and writes each window as a row of a new file.
For example, given the first few lines of the Bible:
In the beginning God created the heaven and the earth. And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.
the output starts with:
In the beginning God
the beginning God created
beginning God created the
God created the heaven
created the heaven and
the heaven and the
heaven and the earth
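The sliding window described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual source of phrases.py: the function name `sliding_phrases` is made up, and dropping punctuation with a regular expression is an assumption based on the sample output, where "earth." appears without its period.

```python
import re


def sliding_phrases(text, words=4):
    """Yield every run of `words` consecutive words found in `text`.

    Assumption: punctuation is dropped and only word characters and
    apostrophes are kept; the real script may tokenize differently.
    """
    tokens = re.findall(r"[\w']+", text)
    # The window moves forward one word at a time.
    for start in range(len(tokens) - words + 1):
        yield " ".join(tokens[start:start + words])


sample = "In the beginning God created the heaven and the earth."
for phrase in sliding_phrases(sample, words=4):
    print(phrase)
```

A text with fewer words than the window size simply produces no rows.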
Usage
This is a very light script, with no external library requirements.
You can simply invoke the script with:
python phrases.py [OPTIONS] original_file.txt
Options
Phrases Generator has some options for fine tuning:
-o OUTFILE --outfile OUTFILE    Output file (default to phrases.txt)
-w WORDS --words WORDS          Number of words for each row (default to 4)
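For reference, a command line with these options could be declared with the standard argparse module roughly as follows. This is a hedged sketch rather than the script's real code: the flags and defaults come from the list above, while the positional name `infile` and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (sketch only)."""
    parser = argparse.ArgumentParser(prog="phrases.py")
    # Hypothetical name for the positional input file.
    parser.add_argument("infile", help="text file to read phrases from")
    parser.add_argument("-o", "--outfile", default="phrases.txt",
                        help="output file")
    parser.add_argument("-w", "--words", type=int, default=4,
                        help="number of words for each row")
    return parser


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(["-w", "5", "bible.txt"])
print(args.infile, args.outfile, args.words)
```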
Let's see it in action
Honestly, I wrote this script just for fun, to check if my hunch was correct. I created 4 different wordlists based on the Bible, using windows of 4 to 7 words; then I ran Hashcat on a very large password collection (thank you LinkedIn).
I was baffled when I checked the recovered list (first column is the length):
25 withhisstripeswearehealed
23 amnotashamedofthegospel
22 rejoiceinthelordalways
21 wonderswithoutnumbers
20 theearthisthelords11
20 lordofheavenandearth
20 fearnotiwillhelpthee
20 andthewordwaswithgod
19 onehourwiththebeast
19 hardennotyourhearts
This is just a small snippet of the final output, but as you can see I was able to recover passwords longer than 20 characters!
Tips and Tricks
Phrases are written without any modification, keeping the original spaces between words. This is useful if you want to apply some Hashcat rules.
Here you can find some basic rules that I used in my tests:
# Lowercase everything and remove spaces
l@ 
# Uppercase everything and remove spaces
u@ 
# Capitalize the first letter of each word and remove all spaces
E@ 
Please note: every rule ends with a space (the character to remove). Sadly, it's hard to spot if you don't copy/paste the rules.
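To make the effect of these rules concrete, the sketch below emulates them in plain Python on one generated phrase. It is not how Hashcat applies rules internally, and the helper names are invented for the example; it only shows what kind of candidate each rule produces.

```python
def lower_no_spaces(phrase):
    """Emulates 'l' followed by '@ ': lowercase, then purge spaces."""
    return phrase.lower().replace(" ", "")


def upper_no_spaces(phrase):
    """Emulates 'u' followed by '@ ': uppercase, then purge spaces."""
    return phrase.upper().replace(" ", "")


def title_no_spaces(phrase):
    """Emulates 'E' followed by '@ ': lowercase everything, uppercase the
    first letter and every letter after a space, then purge spaces."""
    parts = phrase.lower().split(" ")
    return "".join(part[:1].upper() + part[1:] for part in parts)


phrase = "With his stripes we are healed"
for transform in (lower_no_spaces, upper_no_spaces, title_no_spaces):
    print(transform(phrase))
```

The first candidate is the 25-character entry at the top of the recovered list.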
Conclusions
Obviously this shouldn't be your first option when trying to crack a password list: this is something you should use when you're pretty close to throwing in the towel and giving up (or you're really willing to start bruteforcing 16+ chars...).
As usual, you should choose the right text for the right scenario: using the Bible against the Ashley Madison dump produced no results ;)