User:Iritscen/PageCountAudit
Magic words
The first two magic words below give quick answers about how much content is on the wiki, but are they accurate and useful?
NUMBEROFPAGES: 4,783
MW code says: Simply count all entries in page table.
Iritscen says: Though problematic in the past, this magic word now matches the grand total (see below) of all PAGESINNS counts, including files and redirects. However, we don't really want to display that catch-all number on our main page.
NUMBEROFARTICLES: 904
MW code says: From MW 1.18 on, if the count method global is set to 'link', the software counts the distinct "pl_from" values in the pagelinks table that match page ids. In other words, it filters out pages that do not link to other pages (the reasoning presumably being that "those aren't real wiki pages" if they're not connecting to anything else). It also filters out redirects. If the method is set to 'comma', it counts all non-blank pages (yes, really).
Iritscen says: Okay, $wgArticleCountMethod has now been set to 'comma'.
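As a rough illustration of the two methods described above, here is a Python sketch that models them as filters over a toy list of pages. The Page record, its fields, and the count_articles helper are hypothetical names made up for this example; this is not MediaWiki's code, only the logic as summarized here. It also assumes that both methods skip redirects and look only at content namespaces, which is what the reconciliation in the Conclusion implies.

```python
from dataclasses import dataclass


@dataclass
class Page:
    # Hypothetical stand-in for a row of the page table.
    page_id: int
    namespace: int
    is_redirect: bool
    length: int  # page size; 0 means the page is blank


def count_articles(pages, pagelinks_from, content_namespaces, method):
    """Count 'articles' the way the text above describes each method."""
    if method not in ("link", "comma"):
        raise ValueError(f"unknown count method: {method!r}")
    linking_ids = set(pagelinks_from)  # distinct pl_from values
    total = 0
    for page in pages:
        # Assumption: both methods ignore redirects and non-content namespaces.
        if page.is_redirect or page.namespace not in content_namespaces:
            continue
        if method == "link":
            countable = page.page_id in linking_ids  # links to something
        else:
            countable = page.length > 0  # 'comma': any non-blank page
        if countable:
            total += 1
    return total


# Toy data: five pages, one redirect, one blank page, one talk page.
pages = [
    Page(1, 0, False, 120),
    Page(2, 0, True, 30),
    Page(3, 0, False, 0),
    Page(4, 1, False, 50),
    Page(5, 0, False, 80),
]
pagelinks_from = [1, 1, 4]  # page 1 has two outgoing links, page 4 has one

link_count = count_articles(pages, pagelinks_from, {0}, "link")
comma_count = count_articles(pages, pagelinks_from, {0}, "comma")
print(link_count, comma_count)
```

In this toy data the 'comma' count is higher than the 'link' count, because page 5 has content but links to nothing.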
PAGESINNS, AKA PAGESINNAMESPACE: These counts agree with the number of pages displayed for each namespace on the Special:AllPages page, which provides some much-needed verifiability. However, since Special:AllPages counts redirect pages, PAGESINNS does too. Therefore, we can't use a straight sum of PAGESINNS results as our page count. See final section for the adjusted number.
PAGESINNS breakdown:
- Namespace Media has ID -2, but we can't get a page count
- Namespace Special has ID -1, but we can't get a page count
- Namespace Main does not return an ID number, but apparently it's 0, because {{PAGESINNS:0}} returns 756 pages, which agrees with Special:AllPages
- Namespace Talk has ID 1 and 114 pages
- Namespace User has ID 2 and 124 pages
- Namespace User talk has ID 3 and 51 pages
- Namespace OniGalore has ID 4 and 18 pages
- Namespace OniGalore talk has ID 5 and 2 pages
- Namespace File has ID 6 and 2,772 pages
- Namespace File talk has ID 7 and 18 pages
- Namespace MediaWiki has ID 8 and 42 pages
- Namespace MediaWiki talk has ID 9 and 1 page
- Namespace Template has ID 10 and 135 pages
- Namespace Template talk has ID 11 and 6 pages
- Namespace Help has ID 12 and 3 pages
- Namespace Help talk has ID 13 and 2 pages
- Namespace Category has ID 14 and 202 pages
- Namespace Category talk has ID 15 and 6 pages
- Namespace BSL has ID 100 and 60 pages
- Namespace BSL talk has ID 101 and 6 pages
- Namespace OBD has ID 102 and 189 pages
- Namespace OBD talk has ID 103 and 37 pages
- Namespace AE has ID 104 and 23 pages
- Namespace AE talk has ID 105 and 15 pages
- Namespace Oni2 has ID 108 and 37 pages
- Namespace Oni2 talk has ID 109 and 18 pages
- Namespace XML has ID 110 and 127 pages
- Namespace XML talk has ID 111 and 19 pages
All articlespaces (without File) totalled using PAGESINNS: 1716
All talkspaces (without File talk) totalled using PAGESINNS: 277
All contentspaces (as currently defined in $wgContentNamespaces = {0, 2, 100, 102, 104, 108, 110}) totalled using PAGESINNS: 1316
The grand total for all namespaces (including File) is: 4783
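The totals above can be rechecked from the per-namespace counts with a short Python sketch. The numbers are copied from the breakdown; the variable names are just for illustration. One thing the check brings out: the listed talk counts sum to 295, and the 277 figure corresponds to that sum minus the 18 File talk pages, while the grand total only works out to 4,783 if all 295 talk pages are included.

```python
# Namespace ID -> PAGESINNS count, copied from the breakdown above.
PAGES = {
    0: 756, 1: 114, 2: 124, 3: 51, 4: 18, 5: 2,
    6: 2772, 7: 18, 8: 42, 9: 1, 10: 135, 11: 6,
    12: 3, 13: 2, 14: 202, 15: 6,
    100: 60, 101: 6, 102: 189, 103: 37,
    104: 23, 105: 15, 108: 37, 109: 18, 110: 127, 111: 19,
}
CONTENT_NAMESPACES = {0, 2, 100, 102, 104, 108, 110}
FILE, FILE_TALK = 6, 7

# Subject namespaces have even IDs; their talk namespaces have odd IDs.
article_total = sum(n for ns, n in PAGES.items() if ns % 2 == 0 and ns != FILE)
talk_total = sum(n for ns, n in PAGES.items() if ns % 2 == 1)
# Assumption: the 277 figure leaves out File talk, mirroring "without File".
talk_without_file_talk = talk_total - PAGES[FILE_TALK]
content_total = sum(PAGES[ns] for ns in CONTENT_NAMESPACES)
grand_total = sum(PAGES.values())

print(article_total)           # 1716
print(talk_total)              # 295
print(talk_without_file_talk)  # 277
print(content_total)           # 1316
print(grand_total)             # 4783
```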
Redirects
There were 305 redirects as of 4/5/13 according to Special:ListRedirects.
Redirect breakdown:
- Main: 267
- Talk: 0
- Help: 1
- Help talk: 1
- File: 0
- File talk: 0
- AE: 5
- AE talk: 1
- BSL: 2
- BSL talk: 0
- OBD: 22
- OBD talk: 0
- OniGalore: 3
- Oni2: 3
- Oni2 talk: 0
- User: 0
- User talk: 0
- XML: 0
- XML talk: 0
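As a quick sanity check, the breakdown above can be summed in Python and compared with the figure from Special:ListRedirects. The counts are copied from the list; namespaces that do not appear in the list (Template, Category, and so on) are assumed here to have no redirects, which fits the fact that the listed counts already add up to the full 305.

```python
# Redirect counts per namespace, copied from the breakdown above.
REDIRECTS = {
    "Main": 267, "Talk": 0, "Help": 1, "Help talk": 1,
    "File": 0, "File talk": 0, "AE": 5, "AE talk": 1,
    "BSL": 2, "BSL talk": 0, "OBD": 22, "OBD talk": 0,
    "OniGalore": 3, "Oni2": 3, "Oni2 talk": 0,
    "User": 0, "User talk": 0, "XML": 0, "XML talk": 0,
}
LISTED_TOTAL = 305  # from Special:ListRedirects

redirect_total = sum(REDIRECTS.values())
print(redirect_total)                  # 305
print(redirect_total == LISTED_TOTAL)  # True
```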
Conclusion
NUMBEROFPAGES is too broad to be useful, but now that the article-count method is 'comma', I am able to reconcile NUMBEROFARTICLES with PAGESINNS. PAGESINNS in turn reconciles with Special:AllPages, which lists each page onscreen and is thus verifiable by a direct count (which I have done in the past). So, to see how the math works out, we can get the directly verifiable count by using PAGESINNS on all "content" namespaces and then manually subtracting the redirects counted above.
Namespaces Main, User, BSL, OBD, XML, AE, and Oni2 totaled using PAGESINNS: 1316
Minus redirects that I've counted in those namespaces: 1017
At the time of this writing (4/5/13), my personal total is only two higher than the value returned by NUMBEROFARTICLES, which is certainly an acceptable margin of error. So it looks like NUMBEROFARTICLES is reliable!
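The adjusted count above can be reproduced with a few lines of Python. The page counts come from the PAGESINNS breakdown and the redirect counts from the redirect breakdown, limited to the seven content namespaces; the variable names are only illustrative.

```python
# PAGESINNS counts for the content namespaces.
CONTENT_PAGES = {
    "Main": 756, "User": 124, "BSL": 60, "OBD": 189,
    "XML": 127, "AE": 23, "Oni2": 37,
}
# Redirects counted in those same namespaces.
CONTENT_REDIRECTS = {
    "Main": 267, "User": 0, "BSL": 2, "OBD": 22,
    "XML": 0, "AE": 5, "Oni2": 3,
}

pages = sum(CONTENT_PAGES.values())
redirects = sum(CONTENT_REDIRECTS.values())
adjusted = pages - redirects

print(pages)      # 1316
print(redirects)  # 299
print(adjusted)   # 1017
```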