User:Iritscen/PageCountAudit
This page audits the math that MediaWiki uses to produce its various page counts.
Magic words
These first two magic words offer easy answers about how much content the wiki holds, but are they accurate and useful?
NUMBEROFPAGES: 4,642
MW code says: Simply count all entries in page table.
Iritscen says: Though problematic in the past, this magic word now matches the grand total (see below) of all PAGESINNS counts, including files and redirects. However, we don't really want to display that catch-all number on our main page.
NUMBEROFARTICLES: 901
MW code says: If the count method global is set to 'link', the software takes a distinct count of the "pl_from" values in the pagelinks table that match the IDs of the pages being counted. In other words, it filters out pages that do not link to other pages (the reasoning presumably being that "those aren't real wiki pages" if they're not connecting to anything else). It also filters out redirects. If the method is set to 'comma', it counts all non-blank pages (yes, really).
Iritscen says: Okay, $wgArticleCountMethod has now been set to 'comma'. (Note: As of MW 1.31, "The 'comma' value for $wgArticleCountMethod is no longer supported for performance reasons, and installations with this setting will now work as if it was configured with 'any'." It appears that 'any' will return the same result, counting all pages that are not redirects.)
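To make the difference between the three methods concrete, here is a minimal sketch that models them as described above. It is a toy model, not MediaWiki's actual code: the `Page` record, its fields, and the sample pages are all hypothetical, and restricting the count to the content namespaces is an assumption based on how $wgContentNamespaces is used later on this page.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a row of the page table; not MediaWiki's schema.
@dataclass
class Page:
    title: str
    namespace: int
    is_redirect: bool
    length: int          # page size in bytes; 0 means a blank page
    outgoing_links: int  # how many pagelinks rows this page would have

# Content namespace IDs as listed for $wgContentNamespaces on this page.
CONTENT_NAMESPACES = {0, 2, 100, 102, 104, 108, 110}

def count_articles(pages, method):
    """Count pages under the 'link', 'comma', or 'any' rule as described above."""
    if method not in ("link", "comma", "any"):
        raise ValueError(f"unknown count method: {method!r}")
    count = 0
    for page in pages:
        # Assumption: only non-redirect pages in content namespaces qualify.
        if page.namespace not in CONTENT_NAMESPACES or page.is_redirect:
            continue
        if method == "link" and page.outgoing_links == 0:
            continue  # 'link' drops pages that link to nothing
        if method == "comma" and page.length == 0:
            continue  # 'comma' only drops blank pages
        count += 1
    return count

# Made-up sample data, purely for illustration.
sample_pages = [
    Page("Linked page", 0, False, 1200, 3),
    Page("Isolated page", 0, False, 300, 0),
    Page("Blank page", 0, False, 0, 0),
    Page("Old name", 0, True, 25, 1),
    Page("Some talk", 1, False, 500, 2),
]

for method in ("link", "comma", "any"):
    print(method, count_articles(sample_pages, method))
```

On this sample, 'link' counts one page, 'comma' two, and 'any' three. The 'comma' and 'any' results differ only when blank pages exist, which is presumably why the two settings give the same figure in practice.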
PAGESINNS, AKA PAGESINNAMESPACE: These counts agree with the number of pages displayed for each namespace on Special:AllPages, which provides some much-needed verifiability. However, since Special:AllPages counts redirect pages, PAGESINNS does too. Therefore, we can't use a straight sum of PAGESINNS results as our page count. See the Conclusion section for the adjusted number.
PAGESINNS breakdown:
Namespace | ID | Page count |
---|---|---|
Media | -2 | <not available> |
Special | -1 | <not available> |
Main | 0 | 749 |
Talk | 1 | 113 |
User | 2 | 123 |
User talk | 3 | 51 |
OniGalore | 4 | 15 |
OniGalore talk | 5 | 2 |
File | 6 | 2,665 |
File talk | 7 | 17 |
MediaWiki | 8 | 39 |
MediaWiki talk | 9 | 1 |
Template | 10 | 131 |
Template talk | 11 | 6 |
Help | 12 | 2 |
Help talk | 13 | 2 |
Category | 14 | 191 |
Category talk | 15 | 6 |
BSL | 100 | 60 |
BSL talk | 101 | 6 |
OBD | 102 | 191 |
OBD talk | 103 | 36 |
AE | 104 | 23 |
AE talk | 105 | 15 |
Oni2 | 108 | 37 |
Oni2 talk | 109 | 18 |
XML | 110 | 125 |
XML talk | 111 | 18 |
All articlespaces (without File) totalled using PAGESINNS: 1,686
All talkspaces (without File talk) totalled using PAGESINNS: 274
All contentspaces (as currently defined in $wgContentNamespaces = {0, 2, 100, 102, 104, 108, 110}) totalled using PAGESINNS: 1,308
The grand total for all namespaces (including File and File talk) is: 4,642
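The four totals can be re-derived from the table with a short script. This is only a sketch for checking the arithmetic: the dictionary is hand-copied from the table, and the variable names are my own. It relies on the MediaWiki convention that subject namespaces have even IDs and talk namespaces odd IDs. Note that reproducing the talkspace figure requires leaving out File talk as well as File.

```python
# Page counts per namespace ID, hand-copied from the PAGESINNS table.
PAGES_IN_NS = {
    0: 749, 1: 113, 2: 123, 3: 51, 4: 15, 5: 2, 6: 2665, 7: 17,
    8: 39, 9: 1, 10: 131, 11: 6, 12: 2, 13: 2, 14: 191, 15: 6,
    100: 60, 101: 6, 102: 191, 103: 36, 104: 23, 105: 15,
    108: 37, 109: 18, 110: 125, 111: 18,
}
FILE_NS, FILE_TALK_NS = 6, 7
CONTENT_NAMESPACES = {0, 2, 100, 102, 104, 108, 110}

# Even IDs are subject ("article") namespaces, odd IDs are talk namespaces.
article_total = sum(
    n for ns, n in PAGES_IN_NS.items() if ns % 2 == 0 and ns != FILE_NS
)
talk_total = sum(
    n for ns, n in PAGES_IN_NS.items() if ns % 2 == 1 and ns != FILE_TALK_NS
)
content_total = sum(PAGES_IN_NS[ns] for ns in CONTENT_NAMESPACES)
grand_total = sum(PAGES_IN_NS.values())

print(article_total, talk_total, content_total, grand_total)
# prints: 1686 274 1308 4642
```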
Redirects
There were 305 redirects as of 4/5/13 according to Special:ListRedirects.
Redirect breakdown:
- Main: 267
- Talk: 0
- Help: 1
- Help talk: 1
- File: 0
- File talk: 0
- AE: 5
- AE talk: 1
- BSL: 2
- BSL talk: 0
- OBD: 22
- OBD talk: 0
- OniGalore: 3
- Oni2: 3
- Oni2 talk: 0
- User: 0
- User talk: 0
- XML: 0
- XML talk: 0
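As a quick sanity check, the breakdown can be summed to confirm that it matches the Special:ListRedirects figure, and to isolate the redirects that fall in the content namespaces. A minimal sketch, with the numbers hand-copied from the list above and variable names of my own choosing:

```python
# Redirect counts per namespace, hand-copied from the breakdown list.
REDIRECTS = {
    "Main": 267, "Talk": 0, "Help": 1, "Help talk": 1,
    "File": 0, "File talk": 0, "AE": 5, "AE talk": 1,
    "BSL": 2, "BSL talk": 0, "OBD": 22, "OBD talk": 0,
    "OniGalore": 3, "Oni2": 3, "Oni2 talk": 0,
    "User": 0, "User talk": 0, "XML": 0, "XML talk": 0,
}

# Names of the namespaces with IDs 0, 2, 100, 102, 104, 108, and 110.
CONTENT_NAMES = ["Main", "User", "BSL", "OBD", "AE", "Oni2", "XML"]

total_redirects = sum(REDIRECTS.values())
content_redirects = sum(REDIRECTS[name] for name in CONTENT_NAMES)

print(total_redirects, content_redirects)
# prints: 305 299
```

The remaining six redirects sit outside the content namespaces (Help, Help talk, AE talk, and OniGalore).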
Conclusion
NUMBEROFPAGES is too broad to be useful, but now that the article-count method is 'comma', I am able to reconcile NUMBEROFARTICLES with PAGESINNS. PAGESINNS in turn reconciles with Special:AllPages, which lists each page onscreen and is thus verifiable by a direct count (which I have done in the past). So to see how the math works out, we can get the directly verifiable count by summing PAGESINNS over all "content" namespaces, and then manually subtracting the redirects counted above.
Namespaces Main, User, BSL, OBD, XML, AE, and Oni2 totalled using PAGESINNS: 1,308
Minus the 299 redirects that I've counted in those namespaces (Main 267, OBD 22, AE 5, Oni2 3, BSL 2): 1,009
At the time of this writing (4/5/13), my personal total is only two higher than the value returned by NUMBEROFARTICLES, which is certainly an acceptable margin of error. So it looks like NUMBEROFARTICLES is reliable!
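For anyone who wants to repeat the reconciliation later, the subtraction can be done per namespace, which also shows where the non-redirect pages actually live. This is a sketch using the figures recorded on this page; the dictionary and variable names are my own, and fresh PAGESINNS and redirect counts would need to be substituted to rerun the check.

```python
# PAGESINNS counts for the content namespaces, as recorded on this page.
PAGES = {
    "Main": 749, "User": 123, "BSL": 60, "OBD": 191,
    "AE": 23, "Oni2": 37, "XML": 125,
}

# Manually counted redirects in the same namespaces.
REDIRECTS = {
    "Main": 267, "User": 0, "BSL": 2, "OBD": 22,
    "AE": 5, "Oni2": 3, "XML": 0,
}

# Non-redirect pages per namespace, then the adjusted grand total.
non_redirects = {name: PAGES[name] - REDIRECTS[name] for name in PAGES}
adjusted_total = sum(non_redirects.values())

for name, count in non_redirects.items():
    print(f"{name}: {count}")
print("Adjusted total:", adjusted_total)
```

With the numbers above, Main contributes 482 of the 1,009 non-redirect content pages.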