User:Iritscen/PageCountAudit

From OniGalore
Revision as of 17:28, 16 April 2013 by Iritscen (talk | contribs) (+cat)

Magic words

These first two magic words provide easy answers to the amount of content on the wiki, but are they accurate and useful?

NUMBEROFPAGES: 4,642

MW code says: Simply count all entries in page table.

Iritscen says: Though problematic in the past, this magic word now matches the grand total (see below) of all PAGESINNS counts, including files and redirects. However, we don't really want to display that catch-all number on our main page.

NUMBEROFARTICLES: 901

MW code says: From MW 1.18 on, if the count method global is set to 'link', the software takes a distinct count of the "pl_from" entries in the pagelinks table that match the page IDs under consideration. In other words, it filters out pages that do not link to other pages (the reasoning presumably being that "those aren't real wiki pages" if they're not connecting to anything else). It also filters out redirects. If the method is set to 'comma', it does not actually look for a comma; it counts all non-blank pages (yes, really).

Iritscen says: Okay, $wgArticleCountMethod has now been set to 'comma'.
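
As a rough illustration of the two count methods described above, here is a minimal Python sketch. It is not MediaWiki's code: the Page record, the count_articles function, and the sample data are all hypothetical, and the content-namespace filter is an assumption about which pages are considered before the method-specific test is applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """Hypothetical stand-in for one row of the page table."""
    page_id: int
    namespace: int
    is_redirect: bool
    length: int  # size in bytes; 0 means the page is blank


def count_articles(pages, link_sources, method, content_namespaces=frozenset({0})):
    """Model the 'link' and 'comma' count methods as described in the text.

    link_sources is the set of page ids that appear in pagelinks.pl_from.
    """
    # Assumption: only non-redirect pages in content namespaces are candidates.
    candidates = [
        p for p in pages
        if p.namespace in content_namespaces and not p.is_redirect
    ]
    if method == "link":
        # Keep only pages that link to at least one other page.
        return sum(1 for p in candidates if p.page_id in link_sources)
    if method == "comma":
        # Despite the name, this only checks that the page is not blank.
        return sum(1 for p in candidates if p.length > 0)
    raise ValueError(f"unsupported count method: {method!r}")


# Made-up sample data, purely for illustration.
SAMPLE_PAGES = [
    Page(1, 0, False, 1200),  # normal article with outgoing links
    Page(2, 0, False, 300),   # article with no outgoing links
    Page(3, 0, True, 40),     # redirect, never counted
    Page(4, 0, False, 0),     # blank page
    Page(5, 1, False, 500),   # talk page, outside the content namespaces
]
SAMPLE_LINK_SOURCES = {1, 3, 5}

link_count = count_articles(SAMPLE_PAGES, SAMPLE_LINK_SOURCES, "link")
comma_count = count_articles(SAMPLE_PAGES, SAMPLE_LINK_SOURCES, "comma")
print(link_count, comma_count)
```

In this toy data set the 'comma' method counts the linkless article that 'link' leaves out, which is the practical difference between the two settings.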

PAGESINNS, AKA PAGESINNAMESPACE: These counts agree with the number of pages displayed for each namespace on the Special:AllPages page, which provides some much-needed verifiability. However, since Special:AllPages counts redirect pages, PAGESINNS does too. Therefore, we can't use a straight sum of PAGESINNS results as our page count. See the Conclusion for the adjusted number.

PAGESINNS breakdown:

  • Namespace Media has ID -2, but it is a virtual namespace, so we can't get a page count
  • Namespace Special has ID -1, but it is a virtual namespace, so we can't get a page count
  • Namespace Main does not return an ID number, but its ID is 0: {{PAGESINNS:0}} returns 749 pages, which agrees with Special:AllPages
  • Namespace Talk has ID 1 and 113 pages
  • Namespace User has ID 2 and 123 pages
  • Namespace User talk has ID 3 and 51 pages
  • Namespace OniGalore has ID 4 and 15 pages
  • Namespace OniGalore talk has ID 5 and 2 pages
  • Namespace File has ID 6 and 2,665 pages
  • Namespace File talk has ID 7 and 17 pages
  • Namespace MediaWiki has ID 8 and 39 pages
  • Namespace MediaWiki talk has ID 9 and 1 page
  • Namespace Template has ID 10 and 131 pages
  • Namespace Template talk has ID 11 and 6 pages
  • Namespace Help has ID 12 and 2 pages
  • Namespace Help talk has ID 13 and 2 pages
  • Namespace Category has ID 14 and 191 pages
  • Namespace Category talk has ID 15 and 6 pages
  • Namespace BSL has ID 100 and 60 pages
  • Namespace BSL talk has ID 101 and 6 pages
  • Namespace OBD has ID 102 and 191 pages
  • Namespace OBD talk has ID 103 and 36 pages
  • Namespace AE has ID 104 and 23 pages
  • Namespace AE talk has ID 105 and 15 pages
  • Namespace Oni2 has ID 108 and 37 pages
  • Namespace Oni2 talk has ID 109 and 18 pages
  • Namespace XML has ID 110 and 125 pages
  • Namespace XML talk has ID 111 and 18 pages

All articlespaces (without File) totalled using PAGESINNS: 1686

All talkspaces (without File talk) totalled using PAGESINNS: 274

All contentspaces (as currently defined in $wgContentNamespaces = {0, 2, 100, 102, 104, 108, 110}) totalled using PAGESINNS: 1308

The grand total for all namespaces (including File) is: 4642
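
The totals above can be double-checked with a short script. This is only a sketch: the dictionary transcribes the counts from the PAGESINNS breakdown, the variable names are made up for illustration, and it relies on the MediaWiki convention that talk namespaces have odd IDs.

```python
# Page counts per namespace ID, transcribed from the PAGESINNS breakdown.
PAGES_IN_NS = {
    0: 749, 1: 113, 2: 123, 3: 51, 4: 15, 5: 2, 6: 2665, 7: 17,
    8: 39, 9: 1, 10: 131, 11: 6, 12: 2, 13: 2, 14: 191, 15: 6,
    100: 60, 101: 6, 102: 191, 103: 36, 104: 23, 105: 15,
    108: 37, 109: 18, 110: 125, 111: 18,
}
FILE_NS, FILE_TALK_NS = 6, 7
CONTENT_NAMESPACES = (0, 2, 100, 102, 104, 108, 110)

# Even IDs are subject namespaces; odd IDs are their talk namespaces.
article_total = sum(
    count for ns, count in PAGES_IN_NS.items()
    if ns % 2 == 0 and ns != FILE_NS
)
talk_total = sum(count for ns, count in PAGES_IN_NS.items() if ns % 2 == 1)
talk_total_without_file = talk_total - PAGES_IN_NS[FILE_TALK_NS]
content_total = sum(PAGES_IN_NS[ns] for ns in CONTENT_NAMESPACES)
grand_total = sum(PAGES_IN_NS.values())

print("articlespaces without File:", article_total)         # 1686
print("talkspaces incl. File talk:", talk_total)            # 291
print("talkspaces without File talk:", talk_total_without_file)  # 274
print("contentspaces:", content_total)                      # 1308
print("grand total:", grand_total)                          # 4642
```

Note that the grand total of 4642 only works out when File talk's 17 pages are included, so the 274 figure for talkspaces corresponds to the sum without File talk.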

Redirects

There were 305 redirects as of 5 April 2013 according to Special:ListRedirects.

Redirect breakdown:

  • Main: 267
  • Talk: 0
  • Help: 1
  • Help talk: 1
  • File: 0
  • File talk: 0
  • AE: 5
  • AE talk: 1
  • BSL: 2
  • BSL talk: 0
  • OBD: 22
  • OBD talk: 0
  • OniGalore: 3
  • Oni2: 3
  • Oni2 talk: 0
  • User: 0
  • User talk: 0
  • XML: 0
  • XML talk: 0
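
As a sanity check, the per-namespace redirect counts should add up to the figure reported by Special:ListRedirects. A minimal sketch, with the counts transcribed from the list above and hypothetical variable names:

```python
# Redirect counts per namespace, transcribed from the redirect breakdown.
REDIRECTS = {
    "Main": 267, "Talk": 0, "Help": 1, "Help talk": 1,
    "File": 0, "File talk": 0, "AE": 5, "AE talk": 1,
    "BSL": 2, "BSL talk": 0, "OBD": 22, "OBD talk": 0,
    "OniGalore": 3, "Oni2": 3, "Oni2 talk": 0,
    "User": 0, "User talk": 0, "XML": 0, "XML talk": 0,
}
LIST_REDIRECTS_TOTAL = 305  # figure reported by Special:ListRedirects

redirect_sum = sum(REDIRECTS.values())
print(redirect_sum, redirect_sum == LIST_REDIRECTS_TOTAL)  # 305 True
```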

Conclusion

NUMBEROFPAGES is too broad to be useful, but now that $wgArticleCountMethod is 'comma', I am able to reconcile NUMBEROFARTICLES with PAGESINNS. PAGESINNS in turn reconciles with Special:AllPages, which lists each page onscreen and is thus verifiable by a direct count (which I have done in the past). To see how the math works out, we take the directly verifiable count by using PAGESINNS on all "content" namespaces, then manually subtract the redirects counted above.

Namespaces Main, User, BSL, OBD, XML, AE, and Oni2 totalled using PAGESINNS: 1308

Minus the 299 redirects that I've counted in those namespaces: 1009
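
The adjustment can be reproduced with a few lines of Python. This is just a sketch of the arithmetic: both dictionaries transcribe figures from the PAGESINNS and redirect breakdowns for the seven content namespaces, and the names are made up for illustration.

```python
# PAGESINNS counts for the content namespaces.
CONTENT_PAGES = {
    "Main": 749, "User": 123, "BSL": 60, "OBD": 191,
    "XML": 125, "AE": 23, "Oni2": 37,
}
# Redirects counted in those same namespaces.
CONTENT_REDIRECTS = {
    "Main": 267, "User": 0, "BSL": 2, "OBD": 22,
    "XML": 0, "AE": 5, "Oni2": 3,
}

pages_total = sum(CONTENT_PAGES.values())
redirects_total = sum(CONTENT_REDIRECTS.values())
adjusted_total = pages_total - redirects_total

print(pages_total, redirects_total, adjusted_total)  # 1308 299 1009
```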

At the time of this writing (5 April 2013), my manual total is only two higher than the value returned by NUMBEROFARTICLES, which is certainly an acceptable margin of error. So it looks like NUMBEROFARTICLES is reliable!