(→Exceptions: about the audit) |
(this feature already exists, I just haven't been using it) |
Line 4:
While MediaWiki makes it easy to find bad intrawiki links (links to nonexistent pages on our own wiki), marking them in red and providing tools like [[Special:Wantedpages]], there is no automatic check of external (outbound) links. MediaWiki compiles external links into a table, but it does not ping the URLs to see if they give any response. Over the years, many links on our wiki went dead as the Web changed and various file hosts went out of business. ValExtLinks has been used to fix over 1,000 link issues on OniGalore such as 404s and redirects.
Here's how the process works: at 6:20am and 2:20pm (GMT) each day, a script written by [[User:Admin|Alloc]] dumps the wiki's external links table to [https://wiki.oni2.net/w/extlinks.csv this location]. ValExtLinks, which Iritscen runs on his computer periodically, walks through the exported table and looks for URLs that return problematic codes such as 404. It also detects other lesser problems with links. Val then makes suggestions for fixing these links and uploads its report in HTML, RTF and TXT formats to [http://iritscen.oni2.net/val/ this directory]. A wiki editor can then review the report and act accordingly.
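The walk through the exported table can be pictured with a minimal Python sketch. This is not Val's actual code: the function names, the assumption that the URL sits in the first CSV column, and the labels "ok" and "redirect" are all hypothetical. Only "NG" is a label taken from the report itself.

```python
import csv
import io
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so that 3xx codes stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    opener = urllib.request.build_opener(NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 3xx, 4xx and 5xx responses all land here
    except (urllib.error.URLError, OSError, ValueError):
        return None  # DNS failure, timeout, refused connection, malformed URL


def classify(status):
    """Map a status code to a label (labels other than "NG" are hypothetical)."""
    if status is None:
        return "NG"  # assumption: an unreachable site is reported like a dead link
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect"
    return "NG"


def check_links(csv_text, fetch=fetch_status, url_column=0):
    """Return (url, label) for every row whose link is not "ok".

    Assumes the URL is in column url_column; the real export layout may differ.
    """
    problems = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) <= url_column or not row[url_column].strip():
            continue  # skip blank or short rows
        url = row[url_column].strip()
        label = classify(fetch(url))
        if label != "ok":
            problems.append((url, label))
    return problems
```

Passing a stand-in for <code>fetch</code> (for example a dictionary's <code>get</code> method) lets the classification be tried out without touching the network.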
==How to fix link issues==
Here are the codes that you'll see on problem links in the report.
*'''NG''': In most cases, fixing an NG ("no good") link will mean finding the desired web page in the Internet Archive's [https://archive.org/web/ Wayback Machine] and linking to that archived page instead. In some cases, an NG link will not be salvageable. It should then either be removed from the page or, if it was part of a conversation where its absence would be confusing, wrapped in nowiki tags [[Special:Diff/16377/26212|like this]] to prevent it from showing up in future reports.
**Val automatically queries the Archive for the latest snapshot of each NG page and will put the returned snapshot URL in its report. Note that you still have to verify this link by clicking on it, as it may not have the correct content. You may have to go further back in the Wayback Machine to find the proper snapshot to use. Sometimes the Archive simply never got around to archiving a given site. In that case, you will need to follow the advice above about deleting the link or marking it with nowiki tags.
**Note: In a typical run of Val across the 3,000+ links on the wiki, 1-3 sites will happen to be offline at the moment, or the HTTP requests sent to them will get lost in transit. It's best to wait for another Val report to make sure that the URL is really dead before performing any of the above fixes.
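The snapshot lookup described above can be approximated with the Internet Archive's public availability endpoint. The sketch below only builds the query URL and parses a response. Whether Val uses this particular endpoint is an assumption, and the function names are hypothetical.

```python
import json
import urllib.parse

# Public Wayback Machine availability endpoint (assumed, not confirmed for Val).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url):
    """Build the query URL that asks the Archive about a given page."""
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode({"url": url})


def latest_snapshot(response_text):
    """Return the snapshot URL from an availability response, or None.

    None means the Archive reported no usable snapshot for the page.
    """
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest.get("url")
```

A <code>None</code> result corresponds to the case where the Archive never captured the page. A returned URL still needs the manual check described above, since the snapshot may not hold the right content.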
Line 24:
In the summary at the bottom of the report, Val will list any exception that didn't have the intended effect because the link is no longer present on the listed page, or because it doesn't return that error code anymore. You can then edit the above exceptions list accordingly. Note that the HTML report only gives the number of issues detected, and the list of issues is found in the RTF and TXT versions of the report.
==Source code==