Composed by Galina Vitkova
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works remains obscure to many in the online SEO community.
Nevertheless, given the importance of PageRank, it is worth trying to examine how it is calculated. The study is meaningful even though Google keeps the real algorithm of PageRank calculations secret.
In any case, PageRank calculations are performed in accordance with the PageRank algorithm.
Let us consider an example consisting of 4 pages: Page A, Page B, Page C and Page D (or simply A, B, C, D, with their PageRanks denoted by the same letters). The pages link to each other as shown in the following picture: A links to B and C, B links to C, C links to A, and D links to C. In the beginning the PageRanks of the pages are unknown, so we’ll just assign "1" to each page.
This means that the first calculation begins with the following PageRanks:
A = 1, B = 1, C = 1, D = 1
According to the rules for passing rank, which follow from the mentioned formula, each page passes a part of its PageRank to other pages. First we apply the damping factor d, which ensures that a page cannot pass its entire PageRank to another page. Then the remaining value is divided by the number of outgoing links on the page. Finally, the values passed to each page are summed up and added to that page's PageRank. The first table below shows the PageRank values passed from one page to another:
A (2 links) = 1*0.85 / 2 = 0.425 | passes 0.425 to B, 0.425 to C |
B (1 link) = 1*0.85 = 0.85 | passes 0.85 to C |
C (1 link) = 1*0.85 = 0.85 | passes 0.85 to A |
D (1 link) = 1*0.85 = 0.85 | passes 0.85 to C |
The resulting PageRanks are shown in the table below:
A = 1 + 0.85 = 1.85 |
B = 1 + 0.425 = 1.425 |
C = 1 + 0.425+0.85+0.85 = 3.125 |
D = 1 |
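The first run can be reproduced with a short Python sketch. It follows this article's worked example, in which each page keeps its current value and adds what it receives; it is not Google's undisclosed implementation. The names `links` and `pass_rank` and the dictionary layout are illustrative assumptions.

```python
# Link structure of the example: page -> pages it links to.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

def pass_rank(ranks, links, d=0.85):
    """One run of the worked example: each page passes d * rank / out-links."""
    received = {page: 0.0 for page in ranks}
    for page, targets in links.items():
        share = ranks[page] * d / len(targets)  # value passed along each link
        for target in targets:
            received[target] += share
    # As in the tables above, the received value is added to the current one.
    return {page: ranks[page] + received[page] for page in ranks}

start = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
first_run = pass_rank(start, links)
for page, value in sorted(first_run.items()):
    print(page, round(value, 5))
```

Calling `pass_rank(first_run, links)` would perform the next run on the same link structure.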
So, the next run of calculations begins with:
A = 1.85, B = 1.425, C = 3.125, D = 1
After performing the same operations, the result is as follows:
A = 4.50625, B = 2.21125, C = 5.9725, D = 1
In practice, the same operations have to be repeated 50 to 100 times to reach sufficient accuracy.
It is worth noting that in the first run of the calculations, Page C increases the PageRank of Page A. In the next run, Page C itself gets an increase in PageRank that is proportional to the new, improved PageRank of Page A. This means that Page C gets a proportion of its PageRank back. This is PageRank feedback, an essential part of how PageRank works.
Links to and from your site
PageRank is the hardest factor to manipulate when optimising your pages. It is difficult to achieve and even more difficult to keep up with.
When trying to optimise your PageRank, the following factors should be taken into consideration:
When looking for links to your site, from a purely PageRank point of view, the pages with the highest Toolbar PageRank seem to be the best choice. However, this is not quite true.
As more and more people try to get links only from high-PageRank sites, doing so becomes less and less profitable. Sites that need to improve their PageRank should therefore be more receptive and exchange links with sites that have similar interests. Moreover, the number of links on the page linking to you will alter the amount of feedback, etc.
Therefore, perhaps the best solution is to get links from sites that seem appropriate and are of good quality, regardless of their current PageRank. Quality sites will either help your PageRank now or will do so in the future.
As for the best strategy concerning links out from your site, the general rule is: keep PageRank within your own site. Controlling feedback through the internal pages of your site is much easier than controlling it through links to external pages. This means placing outbound links on a page of your site that has a low PageRank itself and that also contains many internal links. Then, when linking out, choose external sites that do not point to your page with a significant number of links. This yields a better increase in PageRank, in particular thanks to the power of feedback.
Pointing some of your links back into your own site, rather than letting them go to external pages, improves the PageRanks of your pages. That is why larger sites generally have a better PageRank than smaller ones.
Dear friend of technical English, Do you want to improve your professional English? Do you want at the same time to gain comprehensive information about the Internet and Web?
Subscribe to “Why Technical English“ clicking SIGN ME UP at the top of the sidebar
PageRank is a link analysis algorithm used by the Google Internet search engine. The algorithm assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web, with the purpose of “measuring” its relative importance within the set. According to Google's theory, if Page A links to Page B, then Page A is saying that Page B is an important page. If a page has more important links pointing to it, then its links to other pages also become more important.
PageRank was developed at Stanford University by Larry Page (hence the name PageRank) and Sergey Brin as part of a research project about a new kind of search engine. “PageRank” is now a trademark of Google. The PageRank process has been patented, and the patent is assigned to Stanford University, not to Google. Google has exclusive license rights to this patent from the university. The university received 1.8 million shares of Google in exchange for use of the patent; the shares were sold in 2005 for $336 million.
The first paper about the project, describing PageRank and the initial prototype of the Google search engine, was published in 1998. Shortly after, Page and Brin founded the company Google Inc. Although PageRank is now only one of about 200 factors that determine the ranking of Google search results, it continues to provide the basis for all of Google's web search tools.
As early as 1996, a small search engine called “RankDex”, designed by Robin Li, was already exploring a similar strategy for site scoring and page ranking. This technology was patented by 1999 and was later used by Li when he founded Baidu in China.
Some basic information is needed to understand PageRank.
First, PageRank is a number that evaluates only the voting ability of all incoming (inbound) links to a page.
Second, every unique page of a site that is indexed in Google has its own PageRank.
Third, internal site links also take part in passing PageRank to other pages of the site.
Fourth, PageRank stands on its own. It is not tied to the anchor text of links.
Fifth, there are two values of PageRank that should be distinguished:
a. The PageRank which you can get from the Google Toolbar for Internet Explorer (http://toolbar.google.com);
b. The actual or real PageRank that Google uses when calculating the ranking of web pages.
PageRank from the toolbar (sometimes called the Nominal PageRank) has a value from zero to ten. It is not very accurate information about site pages, but it is the only thing that gives you any idea of the value. It is updated approximately once every three months, while the real PageRank is recalculated continuously as the Google bots crawl the web finding new web pages and new backlinks.
Thus, in the following text the term actual PageRank refers to the actual PageRank value stored by Google, and the term Toolbar PageRank refers to the representation of that value that you see on the Google Toolbar.
The Toolbar value is just a representation of the actual PageRank. While real PageRank is linear, Google uses a non-linear graph to show its representation. So on the toolbar, moving from a PageRank of 2 to a PageRank of 3 takes less of an increase than moving from a PageRank of 3 to a PageRank of 4.
This is illustrated by a comparison table (from PageRank Explained by Chris Ridings). The actual figures are kept secret, so for demonstration purposes some guessed figures were used:
If the actual PageRank is between | The Toolbar shows |
0.00000001 and 5 | 1 |
6 and 25 | 2 |
26 and 125 | 3 |
126 and 625 | 4 |
626 and 3125 | 5 |
3126 and 15625 | 6 |
15626 and 78125 | 7 |
78126 and 390625 | 8 |
390626 and 1953125 | 9 |
1953126 and infinity | 10 |
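The guessed ranges grow by a factor of five per toolbar step, so the mapping can be sketched as a lookup over powers of 5. This is only an illustration of the non-linear scale using the guessed figures from the table, not Google's real thresholds. The function name `toolbar_value` is hypothetical, and values that fall between two listed ranges (for example 5.5) are assumed to belong to the higher range.

```python
from bisect import bisect_left

# Upper bounds of the guessed ranges for toolbar values 1..9 (powers of 5).
# These figures are the guessed ones from the table above, not Google's.
UPPER_BOUNDS = [5 ** k for k in range(1, 10)]  # 5, 25, ..., 1953125

def toolbar_value(actual_pagerank):
    """Map a (hypothetical) actual PageRank to a toolbar value from 1 to 10."""
    if actual_pagerank <= 0:
        raise ValueError("the guessed table only covers positive values")
    # Number of upper bounds strictly below the value, plus one.
    return bisect_left(UPPER_BOUNDS, actual_pagerank) + 1

for value in (3, 20, 100, 2_000_000):
    print(value, "->", toolbar_value(value))
```

Under these guessed figures, each additional toolbar point requires roughly five times more actual PageRank than the previous one.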
Lawrence Page and Sergey Brin have published two different versions of their PageRank algorithm in different papers.
The first version (the so-called Random Surfer Model) was published in the Stanford research paper titled The Anatomy of a Large-Scale Hypertextual Web Search Engine in 1998:
PR(A) = (1-d) + d(PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
Where PR(A) is the PageRank of page A. d is a damping factor, which is set between 0 and 1; nominally it is set to 0.85. PR(T1) is the PageRank of a page T1 pointing to page A. C(T1) is the number of outgoing links on page T1.
In the second version of the algorithm, the PageRank of page A is given as:
PR(A) = (1-d) / N + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
Where N is the total number of all pages on the Web.
The first model is based on a very simple intuitive concept. PageRank is defined as a model of user behaviour, where a surfer clicks on links at random. The probability that the surfer visits a page is the page's PageRank. The probability that the surfer clicks on a particular link on the page is determined by the number of links on the page. The damping factor d is the probability that the surfer keeps following links; with probability 1 - d the surfer gets bored and jumps to another random page.
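The first version of the formula can be applied repeatedly until the values settle. The following Python sketch does this for the four-page example used earlier in this article (A links to B and C, B to C, C to A, D to C). The function name `pagerank_v1`, the fixed iteration count and the starting value of 1 per page are assumptions made for illustration.

```python
def pagerank_v1(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over all pages T linking to A."""
    ranks = {page: 1.0 for page in links}  # starting guess of 1 per page
    for _ in range(iterations):
        new_ranks = {}
        for page in links:
            # Sum PR(T)/C(T) over every page T that links to this page.
            incoming = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * incoming
        ranks = new_ranks
    return ranks

# Four-page example: page -> pages it links to.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank_v1(links)
for page in sorted(ranks):
    print(page, round(ranks[page], 4))
print("sum:", round(sum(ranks.values()), 4))
```

Because every page in this small graph has at least one outgoing link, the values add up to the number of pages. Page D has no incoming links, so it is left with only the (1 - d) term.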
The second version treats the PageRank of a page as the actual probability of a surfer reaching that page after clicking on many links. The PageRanks then form a probability distribution over web pages, so the sum of all pages' PageRanks is one.
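The probability interpretation can be checked on a toy graph. The sketch below is a minimal illustration: N is the number of pages in the toy graph rather than the whole Web, every page is assumed to have at least one outgoing link, and the name `pagerank_v2` is hypothetical.

```python
def pagerank_v2(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1-d)/N + d * sum(PR(T)/C(T)), starting from 1/N per page."""
    n = len(links)
    ranks = {page: 1.0 / n for page in links}
    for _ in range(iterations):
        # The whole dictionary is rebuilt from the previous iteration's values.
        ranks = {
            page: (1 - d) / n + d * sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            for page in links
        }
    return ranks

# Toy graph: A links to B and C, B to C, C to A, D to C.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank_v2(links)
print({page: round(value, 4) for page, value in sorted(ranks.items())})
print("sum:", round(sum(ranks.values()), 4))
```

With these assumptions the values sum to one, as a probability distribution should.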
As for calculating PageRank, the first model is easier to compute because the total number of web pages is disregarded.