l, s, x, i, a, c, t, f, b, g, k, u, n, q, h, z, m, w, d, p, o, y, j, r, v, e
A part of why I made my last (and below) post is because I couldn't get this idea of random alphabetical order out of my mind - the order of a, b, c, d... and so on is so fundamental to our existence (especially in a digital world where everything must be filed away and searchable) that trying to work over in my head how to sort titles based off of a random order of our letters was tough.
So instead I worked it around in code.
Python below
Creating a random order of letters was easy (once string and random were imported):
import random
import string
alpha = list(string.ascii_lowercase)
random.shuffle(alpha)
Give the script a list to look at (I used some CSV from this massive Excel file) and find the length of the longest list item (movie titles in my case):
listToSort = ['Some', 'List', 'Of', 'Strings']
n = len(max(listToSort, key=len))
All that's left are some For Loops!
These took me a while to work out because they do the alphabetizing, which isn't simple: any built-in 'sort by alphabet' isn't going to work as we aren't using an A = 1, Z = 26 alphabet! This was the trickiest part to try and conceive as I was lying in bed thinking about this last night.
Right. Why did this come to mind in bed, you ask? Well, I was struggling to get to sleep and I figured that I'd do something monotonous and boring (like counting sheep) so I began to count letters and numbers
A, 1. B, 2. C, 3.
and so on.
As I was getting down lower in the ranks (past R is like a dark alley to me seeing as JACK! only requires dipping into the first 11 characters (and a !) to spell), I was thinking about how the first few letters are so rote to me:
A, B, C, D, E;
1, 2, 3, 4, 5.
Those combos are so synonymous and comfortable to me that as I was working further in I started to feel like I didn't really know the letters past the elimenopee turn. Silly I know.
But then something resonated with me - the book I had been reading before bed was talking about not limiting people to just be certain identities and either including them to just be that! or excluding them because their ways are too foreign, scary, or an affront. Those letters that were troubling me began to take on that personality - the people shunned out, the "Marginalized Outsiders" as Nate Collins puts it in All but Invisible. At that point, the deep Southern Baptist in me must have perked up because the first thought I had was Object Lesson!
Anyhoo - the hard part about trying to formulate categorizing by a new alphabet was how to somewhat elegantly sort and place words. First letters only are easy - assign a value and rank 'em. But, as we learned in elementary school, it's not that simple. Some words have more than one letter, what do you do then, huh?
As I was trying to figure out if I could just assign a value to the word and not have to constantly be tracking words and comparing them to figure out who comes in third, I thought that maybe I could use a count - a 'slot' type system where the first letter would really be weighted and it would taper down - for the point of my object lesson I wouldn't need anything too granular - the first five or so letters should give enough variation to work...
Then I realized that wouldn't work. At least not simply as I hoped I could just have a slot multiplier work with the index of the alpha list and work backwards:
3 letters means 3 slots; multipliers of 3, 2, 1 respectively
ABC would be worth 10 points
So while that worked for the simple; AAA, AAB, BAA, I realized that something as basic as BAA and ABB would have the same values (9) and that means no worky. Shocker.
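For the curious, here is a minimal sketch of that slot-multiplier idea, using the ordinary A = 1 alphabet so the numbers match the ones above. The helper name slot_score is made up for illustration; it is not part of my actual script.

```python
import string

# Hypothetical helper: weights taper from len(word) down to 1,
# and each letter is worth its 1-based position in the alphabet.
def slot_score(word, alphabet=string.ascii_lowercase):
    word = word.lower()
    weights = range(len(word), 0, -1)
    return sum(w * (alphabet.index(ch) + 1) for w, ch in zip(weights, word))

print(slot_score("ABC"))                     # 3*1 + 2*2 + 1*3 = 10
print(slot_score("BAA"), slot_score("ABB"))  # both 9, so they tie
```

Two different words landing on the same score is exactly why a single summed value can't be used to rank them.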
Before dozing off I did think that I could just smash the independent indices of the letters together. 'AAAA' would be '1111' and 'BBBB' would be '2222' and ABBA would be '1221'. Perfect! So with that solved, I went off to bed playing a congratulatory 'Dancing Queen' in my head.
So, so good. And Fernando!? Don't get me started.
Flash forward to today, when I'm putting some code to paper: I forgot all about my 'just smash the values together!' idea and kept trying to make a calculated value work and, of course, it didn't. But then I remembered my idea to put the values together and quickly hit the wall of 'oh, a lot of letters have 2 digits of value but nine of them just have 1 digit...', plus the realization that to make this fairly decent I'd need to look out further than just the first four letters (Heat was the shortest in my sample data) because lots of stuff uses the same first words (Death was represented twice in my 50 title sample set).
But then, just as I was learning about "%02d" % and feeling hopeless it wouldn't work, it did and it's all thanks to strings.
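Here is a rough sketch of the one-digit/two-digit snag and how "%02d" gets around it. It uses the regular alphabet (A = 1, K = 11) so it's easy to follow, and smash is a made-up helper name, not something from my script.

```python
import string

# Hypothetical helper: glue each letter's 1-based index together,
# optionally zero-padding every index to two digits.
def smash(word, alphabet=string.ascii_lowercase, pad=False):
    fmt = "%02d" if pad else "%d"
    return "".join(fmt % (alphabet.index(ch) + 1) for ch in word.lower())

print(smash("ABBA"))                                 # 1221
print(smash("AK"), smash("KA"))                      # 111 111 (ambiguous)
print(smash("AK", pad=True), smash("KA", pad=True))  # 0111 1101
```

With padding, every letter always takes up exactly two characters, so the slots line up no matter which letters are in the word.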
Integers are compared by their whole value, yeah? We count in places up from the decimal point - ones, tens, hundreds, etc. - so the number with more digits wins. But strings! Strings are read left to right! So when comparing 1002 and 10010 as strings, guess which one is bigger? 1002! As they're read left to right, the computer (or at least Python) is going 1, 1... same - keep looking. 0, 0... same... 0, 0 ..SAME, 2, 1... Oooh! Bigger! That's my guy! I'll stop there.
Bless you, quotation marks
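A quick sanity check of that number-versus-string difference, if you want to try it yourself (the variable names are just for illustration):

```python
as_ints = 1002 > 10010      # compared by value
as_strs = "1002" > "10010"  # compared character by character, left to right

print(as_ints)  # False
print(as_strs)  # True

# Sorting strings uses the same left-to-right rule.
print(sorted(["10010", "1002", "0111"]))  # ['0111', '10010', '1002']
```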
This method works really well right now - I can look out past the end of the word because Python just ignores that you're a dolt looking for indices that don't exist when you do "Fernando"[:42]. Just stick in an If..Else to make sure the index you're looking for exists and, if it doesn't, give that 'slot' a low value (say 00) and it's off to the races!
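A tiny sketch of the slicing behaviour, plus one related thing I'm fairly sure helps here: when one string is a prefix of a longer one, Python sorts the shorter one first, so a word that runs out of letters early already acts like it has a low value in the missing slots. The sample values below are made up.

```python
word = "Fernando"
sliced = word[:42]  # slicing past the end just returns what is there

print(sliced)       # Fernando
print(len(sliced))  # 8

# A shorter string that is a prefix of a longer one compares as smaller.
short_key = "0102"
long_key = "010203"
print(short_key < long_key)  # True
```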
listDict = dict()
for listItem in listToSort:
    listItemValue = list()
    for letter in listItem[:n].lower():
        if letter in alpha:
            listItemValue.append("%02d" % (alpha.index(letter) + 1))
        else:
            listItemValue.append("00")
    listDict[listItem] = "".join(listItemValue)
sortedList = sorted(listDict, key=listDict.get)
Make a dictionary to store the title and the value of said title to sort later
Iterate through your list, then iterate through the letters of each item
Format the index (add one to it so no letter can get the 00) to always be two digits
If the character (' ', number, punctuation, etc.) doesn't exist in alpha, give it the double zero
Append, append, append and then smash that list together with ol' .join
Update the dictionary and sort it.
Mamma Mia. Done.
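If you want something to copy and paste, here is one way the whole thing could be wrapped up. It's a sketch, not my exact script: sort_key and random_alpha_sort are names I'm making up here, the titles are invented sample data, and the alphabet is the shuffled order from the top of this post. It also skips the dictionary and hands the key straight to sorted, which (I believe) has the side benefit of keeping duplicate titles that a dictionary would collapse.

```python
# Hypothetical helper: build the zero-padded string key for one title.
def sort_key(title, alpha):
    return "".join(
        "%02d" % (alpha.index(ch) + 1) if ch in alpha else "00"
        for ch in title.lower()
    )

# Hypothetical helper: sort titles by their position in a custom alphabet.
def random_alpha_sort(titles, alpha):
    return sorted(titles, key=lambda t: sort_key(t, alpha))

# The shuffled order from the top of the post.
alpha = list("lsxiactfbgkunqhzmwdpoyjrve")

# Invented sample titles, just for illustration.
titles = ["Heat", "Death Wish", "Death Race", "Alien", "Se7en"]

print(sort_key("Heat", alpha))  # 15260507
print(random_alpha_sort(titles, alpha))
# ['Se7en', 'Alien', 'Heat', 'Death Wish', 'Death Race']
```

To get a fresh order each run, swap the fixed alpha for list(string.ascii_lowercase) followed by random.shuffle, like in the setup earlier.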
Yeah, it's simple and I bet it can be more simple but man, it's not the easiest task to sort alphabetically manually.
I'd like to do this in JavaScript and embed a stupid little tool on my website.