User:Iritscen/PageCountAudit

Revision as of 17:55, 5 April 2013 by Iritscen

Magic words

These first two magic words provide easy answers to the amount of content on the wiki, but are they accurate and useful?

NUMBEROFPAGES: 4,783

MW code says: Simply count all entries in page table.

Iritscen says: Though problematic in the past, this magic word now matches the grand total (see below) of all PAGESINNS counts, including files and redirects. However, we don't really want to display that catch-all number on our main page.

NUMBEROFARTICLES: 904

MW code says: From MW 1.18 on, if the global $wgArticleCountMethod is set to 'link', the software counts the distinct page IDs that appear in the "pl_from" field of the pagelinks table. In other words, it filters out pages that do not link to other pages (the reasoning presumably being that "those aren't real wiki pages" if they're not connecting to anything else). It also filters out redirects. If the method is set to 'comma', it counts all non-blank pages (yes, really).

Iritscen says: Okay, $wgArticleCountMethod has now been set to 'comma'.
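The two counting methods described above can be sketched as a toy model. This is not MediaWiki's code: the `Page` record, the sample pages, and the `count_articles` helper are all hypothetical, and the restriction to content namespaces is an assumption taken from the way the Conclusion reconciles the numbers.

```python
from dataclasses import dataclass


@dataclass
class Page:
    page_id: int
    namespace: int
    is_redirect: bool
    length: int  # page size in bytes; 0 means the page is blank


def count_articles(pages, pagelinks_from, content_namespaces, method):
    """Toy model of the 'link' and 'comma' article-count methods."""
    count = 0
    for p in pages:
        # Assumption: only non-redirect pages in content namespaces qualify.
        if p.namespace not in content_namespaces or p.is_redirect:
            continue
        # 'link': the page must appear as a source (pl_from) in pagelinks.
        if method == "link" and p.page_id not in pagelinks_from:
            continue
        # 'comma': as described above, this only checks that the page is non-blank.
        if method == "comma" and p.length == 0:
            continue
        count += 1
    return count


# Hypothetical sample data, invented for illustration.
pages = [
    Page(1, 0, False, 120),  # normal article that links elsewhere
    Page(2, 0, False, 40),   # article with no outgoing links
    Page(3, 0, True, 25),    # redirect
    Page(4, 0, False, 0),    # blank page
    Page(5, 1, False, 80),   # talk page (not a content namespace)
]
pagelinks_from = {1, 3, 5}  # page IDs that have at least one outgoing link

print(count_articles(pages, pagelinks_from, {0}, "link"))   # 1
print(count_articles(pages, pagelinks_from, {0}, "comma"))  # 2
```

In this model, switching from 'link' to 'comma' only ever adds pages: the non-blank ones that link to nothing.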

PAGESINNS, AKA PAGESINNAMESPACE: These counts agree with the number of pages displayed for each namespace on the Special:AllPages page, which provides some much-needed verifiability. However, since AllPages counts redirect pages, PAGESINNS does too. Therefore, we can't use a straight sum of PAGESINNS results as our page count. See the final section for the adjusted number.

PAGESINNS breakdown:

  • Namespace Media has ID -2, but we can't get a page count
  • Namespace Special has ID -1, but we can't get a page count
  • Namespace Main does not return an ID number, but apparently it's 0, because {{PAGESINNS:0}} returns 756 pages, which agrees with Special:AllPages
  • Namespace Talk has ID 1 and 114 pages
  • Namespace User has ID 2 and 124 pages
  • Namespace User talk has ID 3 and 51 pages
  • Namespace OniGalore has ID 4 and 18 pages
  • Namespace OniGalore talk has ID 5 and 2 pages
  • Namespace File has ID 6 and 2,772 pages
  • Namespace File talk has ID 7 and 18 pages
  • Namespace MediaWiki has ID 8 and 42 pages
  • Namespace MediaWiki talk has ID 9 and 1 page
  • Namespace Template has ID 10 and 135 pages
  • Namespace Template talk has ID 11 and 6 pages
  • Namespace Help has ID 12 and 3 pages
  • Namespace Help talk has ID 13 and 2 pages
  • Namespace Category has ID 14 and 202 pages
  • Namespace Category talk has ID 15 and 6 pages
  • Namespace BSL has ID 100 and 60 pages
  • Namespace BSL talk has ID 101 and 6 pages
  • Namespace OBD has ID 102 and 189 pages
  • Namespace OBD talk has ID 103 and 37 pages
  • Namespace AE has ID 104 and 23 pages
  • Namespace AE talk has ID 105 and 15 pages
  • Namespace Oni2 has ID 108 and 37 pages
  • Namespace Oni2 talk has ID 109 and 18 pages
  • Namespace XML has ID 110 and 127 pages
  • Namespace XML talk has ID 111 and 19 pages

All articlespaces (without File) totalled using PAGESINNS: 1716

All talkspaces (without File talk) totalled using PAGESINNS: 277

All contentspaces (as currently defined in $wgContentNamespaces = {0, 2, 100, 102, 104, 108, 110}) totalled using PAGESINNS: 1316

The grand total for all namespaces (including File) is: 4783
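The totals above can be re-derived from the breakdown with a short script. This is only a sketch: the variable names are hypothetical, and it relies on the MediaWiki convention that subject namespaces have even IDs and talk namespaces have odd IDs. Note that the listed talk counts add up to 277 only when one 18-page namespace is left out; the script assumes that namespace is File talk, mirroring the exclusion of File from the articlespace total.

```python
# Page counts per namespace ID, copied from the PAGESINNS breakdown.
pages_in_ns = {
    0: 756, 1: 114, 2: 124, 3: 51, 4: 18, 5: 2, 6: 2772, 7: 18,
    8: 42, 9: 1, 10: 135, 11: 6, 12: 3, 13: 2, 14: 202, 15: 6,
    100: 60, 101: 6, 102: 189, 103: 37, 104: 23, 105: 15,
    108: 37, 109: 18, 110: 127, 111: 19,
}
FILE, FILE_TALK = 6, 7
content_namespaces = [0, 2, 100, 102, 104, 108, 110]  # $wgContentNamespaces

# Even IDs are subject namespaces, odd IDs are their talk namespaces.
article_total = sum(n for ns, n in pages_in_ns.items()
                    if ns % 2 == 0 and ns != FILE)
# Assumption: File talk is excluded here, as File is excluded above.
talk_total = sum(n for ns, n in pages_in_ns.items()
                 if ns % 2 == 1 and ns != FILE_TALK)
content_total = sum(pages_in_ns[ns] for ns in content_namespaces)
grand_total = sum(pages_in_ns.values())

print(article_total)  # 1716
print(talk_total)     # 277
print(content_total)  # 1316
print(grand_total)    # 4783
```

The grand total counts every namespace, File and File talk included, which is why it is larger than the sum of the first two figures plus File.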

Redirects

There were 305 redirects as of 5 April 2013, according to Special:ListRedirects.

Redirect breakdown:

  • Main: 267
  • Talk: 0
  • Help: 1
  • Help talk: 1
  • File: 0
  • File talk: 0
  • AE: 5
  • AE talk: 1
  • BSL: 2
  • BSL talk: 0
  • OBD: 22
  • OBD talk: 0
  • OniGalore: 3
  • Oni2: 3
  • Oni2 talk: 0
  • User: 0
  • User talk: 0
  • XML: 0
  • XML talk: 0
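As a quick sanity check, the per-namespace redirect counts listed above can be summed and compared with the Special:ListRedirects figure. The dictionary name below is hypothetical; the numbers are copied from the list.

```python
# Redirects per namespace, copied from the breakdown above.
redirects_by_ns = {
    "Main": 267, "Talk": 0, "Help": 1, "Help talk": 1,
    "File": 0, "File talk": 0, "AE": 5, "AE talk": 1,
    "BSL": 2, "BSL talk": 0, "OBD": 22, "OBD talk": 0,
    "OniGalore": 3, "Oni2": 3, "Oni2 talk": 0,
    "User": 0, "User talk": 0, "XML": 0, "XML talk": 0,
}

redirect_total = sum(redirects_by_ns.values())
print(redirect_total)  # 305, matching Special:ListRedirects
```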

Conclusion

NUMBEROFPAGES is too broad to be useful, but now that $wgArticleCountMethod is set to 'comma', I am able to reconcile NUMBEROFARTICLES with PAGESINNS. PAGESINNS in turn reconciles with Special:AllPages, which lists each page onscreen and is thus verifiable by a direct count (which I have done in the past). To see how the math works out, we can get the directly verifiable count by using PAGESINNS on all "content" namespaces and then manually subtracting the redirects counted above.

Namespaces Main, User, BSL, OBD, XML, AE, and Oni2 totalled using PAGESINNS: 1316

Minus the 299 redirects that I've counted in those namespaces: 1017
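The subtraction can be spelled out per namespace. This is a sketch with hypothetical variable names, using only the PAGESINNS and redirect figures reported earlier on this page.

```python
# (PAGESINNS count, redirects) for each content namespace.
content = {
    "Main": (756, 267),
    "User": (124, 0),
    "BSL": (60, 2),
    "OBD": (189, 22),
    "XML": (127, 0),
    "AE": (23, 5),
    "Oni2": (37, 3),
}

pages_total = sum(pages for pages, _ in content.values())
content_redirects = sum(redirs for _, redirs in content.values())
adjusted_total = pages_total - content_redirects

print(pages_total)        # 1316
print(content_redirects)  # 299
print(adjusted_total)     # 1017
```

Nearly all of the adjustment comes from Main, which holds 267 of the 299 redirects in the content namespaces.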

At the time of this writing (5 April 2013), my personal total is only two higher than the value returned by NUMBEROFARTICLES, which is certainly an acceptable margin of error. So it looks like NUMBEROFARTICLES is reliable!