Turns out I’ve once again read the wrong readings for the week. So, this morning I burned through the 80/20 Rule and the Rich Get Richer, both of which I found to be surprisingly entertaining. I’m serious!
In high school I was the nerdy kid who hung out with the arty kids. As in, I did best in the maths and science subjects, and found the humanities to be pretentious and overwrought. I didn’t like thinking too much, and for the most part mathematical theories and chemistry and biology are quantifiable, largely unbreakable systems based on rules and processes that don’t change depending on interpretation. There are exceptions, but there are always exceptions. For the most part, the number-y shit was just easier to comprehend, and I appreciated that. So to have a complex notion, such as why some sites succeed on the web while others fail, expressed using rules was great. Especially when they’re mathematical rules. I’m all over that shit!
I mean, look: NORMAL DISTRIBUTIONS!
Ok, that normal isn’t actually relevant to the topic at hand, but you get the idea. Most quantities with a random varying factor (someone who can English good could explain this better) end up distributed like this: heights, rainfall, stuff along those lines. A normal distribution means that the vast majority of whatever is being graphed falls within a fairly narrow band around the average. To understand this properly you’d have to look into standard deviation and stuff, but for the most part it doesn’t allow for much in the way of extremes. That said, unlike what the reading implies, it does allow a very, very small chance for something to arise that significantly deviates from the average, but the chances are minuscule.
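If you want to see that in action, here’s a quick toy sketch of my own (not from the reading — the heights and numbers are made up): sample a pile of “heights” from a normal distribution and count how many land near the average, and how many land way out in the extremes.

```python
import random

random.seed(42)  # make the run repeatable

mean, sd = 170, 10  # hypothetical heights in cm
heights = [random.gauss(mean, sd) for _ in range(100_000)]

# Fraction within two standard deviations of the mean -- about 95% for a normal.
within_2sd = sum(abs(h - mean) <= 2 * sd for h in heights) / len(heights)
print(f"within 2 sd: {within_2sd:.3f}")

# Extreme outliers (more than 5 sd from the mean) are vanishingly rare.
extreme = sum(abs(h - mean) > 5 * sd for h in heights)
print(f"beyond 5 sd: {extreme}")
```

Nearly everything huddles around the average, and the 5-standard-deviation freaks basically never show up. That’s the “doesn’t allow for much in the way of extremes” bit.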
Apparently, if you measure the number of links per node in a network like the web, you don’t get a normal distribution. There’s no meaningful average to cluster around, pretty much, which means you end up with a power-law distribution, i.e. a curve that starts high and trails off in a long, long tail.
That means the drop-off away from the average is nowhere near as steep as in a normal curve, which makes it far more likely that you’ll get values that deviate wildly from the average — the massively linked hubs. There you go. I get this.
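Here’s another toy sketch of mine (again, my own made-up numbers, not the reading’s): compare how the total gets shared out in a heavy-tailed sample versus a normal one. The 80/20 figure from the reading falls straight out of the heavy tail.

```python
import random

random.seed(1)

n = 100_000
# paretovariate(1.16) gives the classic heavy-tailed shape where roughly
# 20% of the sample holds roughly 80% of the total.
pareto_links = sorted((random.paretovariate(1.16) for _ in range(n)), reverse=True)
# Versus a normal-ish sample (clamped at zero so link counts stay non-negative).
normal_links = sorted((max(0.0, random.gauss(50, 10)) for _ in range(n)), reverse=True)

def top20_share(values):
    """What fraction of the grand total is held by the top 20% of the list?"""
    cut = len(values) // 5
    return sum(values[:cut]) / sum(values)

print(f"Pareto: top 20% hold {top20_share(pareto_links):.0%}")
print(f"Normal: top 20% hold {top20_share(normal_links):.0%}")
```

In the normal case the top fifth holds only a bit more than its fair fifth of the total; in the heavy-tailed case it hogs most of it. Same number of samples, wildly different worlds.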
In the case of the power-law graph, the number of links goes on the x-axis, while the number of pages with that many links goes on the y-axis. Simples. Apparently the shape comes from preferential attachment, or the ‘Rich Get Richer’ idea. Sites with lots of links are more likely to be linked to than sites with few, so naturally sites move along the x-axis as they pick up more links, picking up speed as they go and moving faster and faster down the line. Still, 80% of pages just linger in no-man’s land, but the 20% that make it down the winding x-axis do well for themselves, and props to them.
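And because I can’t help myself, here’s a little simulation of preferential attachment (my own sketch of the ‘rich get richer’ idea, not code from the reading): each new page links to an existing page with probability proportional to how many links that page already has.

```python
import random

random.seed(7)

# Trick: keep a flat list of link endpoints. Picking uniformly at random from
# that list is the same as picking a page weighted by its current link count.
endpoints = [0, 1]        # start with two pages linked to each other
degree = {0: 1, 1: 1}     # links per page

for new_page in range(2, 10_000):
    target = random.choice(endpoints)   # well-linked pages get picked more often
    endpoints += [new_page, target]
    degree[new_page] = 1
    degree[target] += 1

degrees = sorted(degree.values(), reverse=True)
avg = sum(degrees) / len(degrees)
print(f"average links per page: {avg:.1f}")
print(f"best-connected page:    {degrees[0]} links")
```

The average page ends up with about two links, while the luckiest early movers rack up orders of magnitude more — there’s your no-man’s-land majority and your winding-x-axis winners.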