Validate External Links: Difference between revisions

→‎How to fix bad links: improving documentation, linking to new reference subpages
(→‎Background: giving specific run times)
(→‎How to fix bad links: improving documentation, linking to new reference subpages)
Line 6: Line 6:
So here's how ValExtLinks helps with this: at 6:20am and 2:20pm (GMT) each day, a script written by [[User:Admin|Alloc]] dumps the wiki's external links table to [http://wiki.oni2.net/w/extlinks.csv this location]. Val, which runs on Iritscen's computer at 3:00pm (GMT) each day, then walks through the table and looks for URLs that return problematic codes such as 404. It also detects other lesser problems with links. Val then makes suggestions for fixing these links.
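
As a rough illustration of the check described above, here is a minimal sketch of sorting HTTP response codes into report categories. The function name <code>classify_status</code> and the exact mapping (2xx is fine, 3xx is a redirect, everything else is a problem) are assumptions made for this example; Val's actual rules may differ.

```python
def classify_status(code):
    """Map an HTTP response code to a report label (hypothetical mapping)."""
    if 200 <= code < 300:
        return "OK"   # page loaded normally
    if 300 <= code < 400:
        return "RD"   # server redirected the request elsewhere
    return "NG"       # e.g. 404 Not Found, 410 Gone, 5xx server errors


# Example: a dead link and a moved link
print(classify_status(404))
print(classify_status(301))
```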


==How to fix bad links==
==How to fix link issues==
*In most cases, fixing an [[wp:List_of_Japanese_Latin_alphabetic_abbreviations#N|"NG"]] link will mean finding the desired web page in the Internet Archive's [https://archive.org/web/ Wayback Machine] and linking to that archived page instead. In some cases, an "NG" link will not be recoverable and should be either removed from the page or, if the link was a part of a conversation and it would be confusing for it to be absent, it should be surrounded in nowiki tags [[Special:Diff/16377/26212|like this]] to prevent it from showing up in future reports.
*'''NG''': In most cases, fixing an [[wp:List_of_Japanese_Latin_alphabetic_abbreviations#N|"NG"]] link will mean finding the desired web page in the Internet Archive's [https://archive.org/web/ Wayback Machine] and linking to that archived page instead. In some cases, an NG link will not be recoverable and should either be removed from the page or, if the link was part of a conversation and its absence would be confusing, surrounded in nowiki tags [[Special:Diff/16377/26212|like this]] to prevent it from showing up in future reports.
**Val automatically queries the Archive for the latest snapshot of each NG page and will put the returned snapshot URL in its report. Note that you still have to verify this link by clicking on it, as it may not have the correct content. You may have to go further back in the Wayback Machine to find the proper snapshot to use. Sometimes the Archive simply never got around to archiving a given site. In that case, you will need to follow the advice above as to deleting the link or marking it with nowiki tags.
*A link marked as "RD" is redirecting the browser to a new page. The new page should be evaluated, and if it has the content we intended to link to then we should update the link to point to the new location. However, many redirects actually are "soft 404s" and simply redirect the browser to the site's main page. In this case, an RD link needs to be treated like an NG link (see above).
*'''RD''': The site is redirecting the browser to a new page. The new page should be evaluated, and if it has the content we intended to link to then we should update the link to point to the new location. However, many redirects actually are "soft 404s" and simply redirect the browser to the site's main page. In this case, an RD link needs to be treated like an NG link (see above).
*A link marked as "IW" is an external link that could be an [[Help:Editing#Interwiki_links|interwiki link]]. Interwiki links are shorter and more resistant to rot. The suggested interwiki link markup will be given in the report. For Wikipedia, you can add a language code in order to link to a page in a specific language, e.g. <nowiki>[[wp:de:Test]]</nowiki>.
*'''EI''': An external link (bare URL) for a page on our own wiki that could simply be an [[Help:Editing#Intrawiki_links|intrawiki link]]. Sometimes an "external internal" may seem to be necessary, but there's a special wiki feature that allows you to avoid it:
*A link marked as "EI" is an external link to a page on our own wiki that could simply be an [[Help:Editing#Intrawiki_links|intrawiki link]].
**If you want to link to a specific version of a page, you used to have to supply the full URL, [http://wiki.oni2.net/w/index.php?title=Oni&oldid=7685 like this]. In fact, there's no need to name any page at all, as the "ID" of an edit, like the one you see in that sample URL, is unique wiki-wide. All you need to do is supply the revision ID to the Special:Permalink page like this: [[Special:Permalink/7685]].
**Sometimes an "external internal" may seem to be necessary; for instance you may wish to link to a specific version of a page, which used to require putting the full URL, [http://wiki.oni2.net/w/index.php?title=Oni&oldid=7685 like this]. In fact, there's no need to link to any page at all, as the "ID" of an edit, like the one you see in that sample URL, is unique wiki-wide. All you need to do is supply the revision ID to the special Permalink page like this — [[Special:Permalink/7685]] — and you're done.
**If you need to link to a diff between two revisions of a page, or between two different pages, plug the old and new revision numbers into the Special:Diff page like this: [[Special:Diff/21491/21492]] (no need for page names, as explained above).
**Sometimes you need to link to a diff between two revisions of a page, or between two different pages. In this case you plug the old and new revision numbers into the special Diff page like this [[Special:Diff/21491/21492]] (no need for page names as the revision IDs are unique, as explained above).
**If there's no provision like this for replacing a bare URL with a smarter link, see "Exceptions" below to remove the link from the report.
*Some links simply have to be presented the way that they are, and some links return error codes but actually work fine. These links can be added to the [[/Exceptions|exceptions list]] in order to hide them in future reports.
*'''IW''': An external link (bare URL) that could be an [[Help:Editing#Interwiki_links|interwiki link]]. Interwiki links are shorter and more resistant to rot. The suggested interwiki link markup will be given in the report. For foreign-language Wikipedia pages, you can add a language code, e.g. <nowiki>[[wp:de:Test]]</nowiki> for the German version of the page.
*'''(xxx)''': The HTTP response code (see reference [[/HTTP codes|HERE]]).
*'''(000-xx)''': The Unix tool 'curl' did not get an HTTP response code, but instead returned this exit code (see [[/Curl codes|HERE]]). The most common by far is "000-28", a timeout.
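
The EI advice above (swapping a full index.php URL for Special:Permalink or Special:Diff) can be sketched as a small helper. This is only an illustration: the function name <code>smart_link</code> is made up for this example, and it assumes the usual MediaWiki URL parameters <code>oldid</code> (the revision, or the old side of a diff) and <code>diff</code> (the new side of a diff). It is not part of Val.

```python
from urllib.parse import urlparse, parse_qs


def smart_link(url):
    """Return Special:Permalink/Special:Diff markup for an index.php URL, or None."""
    query = parse_qs(urlparse(url).query)
    oldid = query.get("oldid", [""])[0]
    diff = query.get("diff", [""])[0]
    if oldid.isdigit() and diff.isdigit():
        # Diff between two revisions: old revision ID first, then new
        return f"[[Special:Diff/{oldid}/{diff}]]"
    if oldid.isdigit() and not diff:
        # Single revision of a page
        return f"[[Special:Permalink/{oldid}]]"
    # No numeric revision IDs: nothing to convert (see "Exceptions")
    return None


print(smart_link("http://wiki.oni2.net/w/index.php?title=Oni&oldid=7685"))
```

URLs that carry no numeric revision IDs return <code>None</code>, which corresponds to the case where no smarter link exists and the link may need to go on the exceptions list instead.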
 
===Exceptions===
Some links simply have to be presented the way that they are. Some links return error codes but actually work fine. These links can be added to the [[/Exceptions|exceptions list]] in order to hide them in future reports.


==Coming features==