Monday, March 19, 2012

How Does PageRank Really Work?

There is a lot of mysticism on the web about what PageRank really is. Let's take a look behind the scenes at Google to see how they actually do it. How do we obtain these secrets? We simply have to read the original paper Google's founders wrote on the subject. Here is a snippet from the paper that really gets the idea across:
PageRank can be thought of as a model of user behavior. We assume there is a "random surfer" who is given a web page at random and keeps clicking on links, never hitting "back" but eventually gets bored and starts on another random page. The probability that the random surfer visits a page is its PageRank. And, the d damping factor is the probability at each page the "random surfer" will get bored and request another random page.
Even that is a bit rocket-sciency, but we'll break it up to see what it means. What they are trying to explain here is how a person finds your website at all. PageRank says that there are two possibilities:
  1. Someone can type the URL directly into the address bar at the top of their browser - best of luck arranging for that to happen!
  2. Someone can arrive at your page through a link from another page.
Incidentally, Google itself is a web page, so just being listed by Google increases your PageRank, although only by a little. We will see why.

Let's forget about option one and concentrate on the second one.
The probability that the random surfer visits a page is its PageRank.
In other words, your PageRank is how likely it is that someone will reach your webpage if they were just randomly clicking on web page links (essentially a monkey at a typewriter). Say your visitor started at an incredibly popular website that linked directly to your article. I'm sure you understand that would be very good for you, because people can find you easily and most likely will. Why? Because many people go to the original website, and since there is a link there to you, they may very well come to your site. No rocket science there. This is what PageRank does its best to calculate.
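To make the random surfer concrete, here is a minimal sketch that works out the visit probabilities for a tiny made-up web. The page names and the `pagerank` function are my own illustration, not anything from the paper. One thing to watch: the code uses `d` the way the paper's formula does, as the chance the surfer follows a link (the paper says it is usually set to 0.85), so `1 - d` is the chance of getting bored. How to treat a page with no outgoing links is also an assumption here; I send that surfer to a random page.

```python
def pagerank(links, d=0.85, iterations=100):
    """Estimate visit probabilities; `links` maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # surfer starts on a random page
    for _ in range(iterations):
        # With probability 1 - d the surfer gets bored and jumps to a random page.
        new_rank = {page: (1.0 - d) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # With probability d the surfer clicks one of the links at random.
                share = d * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Assumption: on a page with no links, jump to a random page.
                for target in pages:
                    new_rank[target] += d * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web: everyone links to "popular", which links to two pages.
links = {
    "popular": ["you", "other"],
    "other": ["popular"],
    "fan": ["popular"],
    "you": ["popular"],
}
ranks = pagerank(links)
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {value:.3f}")
```

In this toy web "fan" has nobody linking to it, so the only way to land there is a random jump, while "you" picks up half of what "popular" passes along.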

So to get extra PageRank you need to have popular sites linking in to you. You probably knew that before reading this. What might not be so clear is that the more links there are on the page linking to you, the less that link is worth for improving your PageRank. That monkey randomly clicking links has a smaller chance of clicking through to you if there are more links to choose from. A prime example here is Google itself. They basically point to every page on the web, so even though they have an enormous PageRank, their link to you is essentially worthless. Now we see why being listed by Google will not significantly increase your PageRank.
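The arithmetic behind that is simple enough to sketch: a page splits what it passes on evenly across its outgoing links. The function name and all the numbers below are hypothetical, picked only to show the effect, and `d` is again the chance of following a link.

```python
def value_per_link(page_rank, outgoing_links, d=0.85):
    """Rank a page passes along through each one of its outgoing links."""
    if outgoing_links < 1:
        raise ValueError("page needs at least one outgoing link")
    return d * page_rank / outgoing_links


# Hypothetical numbers, chosen only to illustrate the dilution.
huge_index = value_per_link(page_rank=0.30, outgoing_links=1_000_000)
small_blog = value_per_link(page_rank=0.001, outgoing_links=2)
only_link = value_per_link(page_rank=0.30, outgoing_links=1)

print(f"huge index page: {huge_index:.2e}")
print(f"small blog:      {small_blog:.2e}")
print(f"single link:     {only_link:.2e}")
```

With these made-up figures, a link from the small blog is worth far more than one of the million links on the huge page, even though the huge page has 300 times the rank.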

The best situation is to have an incredibly popular site point to you through the ONLY link on their site. Fat chance! This of course means that when you are the owner of a high-PageRank site you have something valuable to offer, namely who you link to. Google did a good job with this one, and it makes sense too.

But this is only one half of the story... what can you do with your own site to increase your PageRank? It comes back to that explanation from the original PageRank paper.
The probability that the random surfer visits a page is its PageRank.
Now it's true that linking out to a random place doesn't directly affect your PageRank. The PageRank is the chance that someone comes to your website, not the chance that they leave it. But there is an indirect effect. If someone left your site, wouldn't you want them to come back? Remember we are talking about someone randomly clicking on links here. If you are going to have an outgoing link, the best place to send the random surfer is to another site that is both popular and points back to you! This is the idea behind BackLinking. But beware, this doesn't actually increase your PageRank, it's just the most effective way to keep your PageRank high when having outgoing links.

Why? Here are some scenarios to think about:
  1. A high page rank site points to you and you don't point back.
  2. A high page rank site points to you and you do point back.
  3. You point to a high page rank site, then they point to you.
In the first situation, from PageRank's point of view, once someone randomly comes to your site they don't leave. In the second one, they come to your site and you send them back from whence they came, but they might not come back to you at all. So the second one is worse than the first: PageRank is the chance that someone is at your site, not someone else's. The third one is the common practice. This way you build your network and might get the attention of another site, which then links to you. This actually puts you in situation two, since the order of who linked to whom doesn't matter as far as PageRank is concerned. Of course, you could then delete your link to them and get into situation one... if you so choose.
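Here is a rough sketch that puts numbers on situations one and two. It leans on one big assumption taken from the description above: a surfer who lands on a page with no outgoing links stays there until they get bored. Implementations that treat such dead-end pages differently can give a different ordering, so take this as an illustration of the reasoning, not of what Google runs. The page names and the `pagerank_sticky` function are made up for the example.

```python
def pagerank_sticky(links, d=0.85, iterations=200):
    """Visit probabilities when a surfer on a page with no links stays put until bored."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # With probability 1 - d the surfer gets bored and jumps to a random page.
        new_rank = {page: (1.0 - d) / n for page in pages}
        for page, outlinks in links.items():
            # Assumption: a page without outgoing links keeps its visitor.
            targets = outlinks or [page]
            for target in targets:
                new_rank[target] += d * rank[page] / len(targets)
        rank = new_rank
    return rank


# Situation one: the popular site links to you and you don't link back.
no_link_back = {
    "fan1": ["popular"], "fan2": ["popular"],
    "popular": ["you"], "you": [],
}
# Situation two: the popular site links to you and you link back.
link_back = {
    "fan1": ["popular"], "fan2": ["popular"],
    "popular": ["you"], "you": ["popular"],
}

rank_one = pagerank_sticky(no_link_back)["you"]
rank_two = pagerank_sticky(link_back)["you"]
print(f"situation one: {rank_one:.3f}")
print(f"situation two: {rank_two:.3f}")
```

Under that stay-put assumption your page ends up with the larger share in situation one, because nothing you publish sends the surfer away.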
