UESPWiki talk:Upgrade History/Archive 1
This is an archive of past UESPWiki talk:Upgrade History discussions. Do not edit the contents of this page, except for maintenance such as updating links.
July 30, 2006 Upgrades
Problems that have been fixed
- Categories are now being sorted properly (i.e., they are taking into account changes made to the underlying templates). See Category talk:Oblivion-Quests.
- It is now possible to use a transclusion as a parameter for a template. See UESPWiki_talk:Editing#Template Problems.
- Special:Statistics is giving a valid page count.
- When browsing with IE7 beta 3, the user menu used to appear on the top left, covered by the site logo, instead of the top right.
- Thumbnails are displaying correctly, and images can now be re-uploaded to the site.
- ParserFunctions are indeed working. The only trick is that, unlike stock MediaWiki and Wikipedia, the functions are used without the leading #. There's one test in my sandbox, User:Nephele/Sandbox/1, being used on the page User:Nephele/Sandbox/2.--Nephele 17:58, 31 July 2006 (EDT)
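For illustration, the difference is just the leading character (a made-up call, not one of the actual sandbox tests):

{{if: {{{1|}}} | yes | no }}    (syntax on this wiki, per the note above)
{{#if: {{{1|}}} | yes | no }}   (standard ParserFunctions syntax on Wikipedia)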
New problems that have cropped up
- The wanted pages list (Special:Wantedpages) isn't recognizing that pages already exist
- Editing a page results in "Sorry! We could not process your edit due to a loss of session data. Please try again. If it still doesn't work, try logging out and logging back in." The suggested solution doesn't work. Previewing will remove the message, and saving works whether the message is there or not. GarrettTalk 01:03, 31 July 2006 (EDT)
- I've seen this message a couple of times now. Both times it was when I'd started editing a page, then been idle long enough to get logged out, logged in again in another window, then went back to editing the page. Each time, just hitting "show preview" a second time made the message go away. So I'm guessing it's triggered by the window thinking that you're logged out (although you're not any more). --Nephele 00:59, 4 August 2006 (EDT)
- Will it go away if we log out/log in? --Aristeo 07:41, 4 August 2006 (EDT)
Test Ignore...--Davetest1 19:53, 31 October 2006 (EST)
October 31, 2006 Upgrades
- A bizarre quirk that I've noticed since the upgrade is that an empty line does not necessarily create a paragraph break any more. See, for example, Lore:Frontier, Conquest, or really just about any book, or Template:Book_Info. In some testing I've done, the glitch appears to be triggered by the "center" tags in the Book Info template... everything after those tags in a document has messed-up formatting. I'll try to look into this some more myself, but I thought I'd post a notice about it in case anyone else knows what's going on. --Nephele 00:59, 2 November 2006 (EST)
- The problem is specifically caused by having the "center" open tag and the "/center" closing tag on different lines. I've implemented the easy fix: replace the "center" tag with a "div style=text-align:center" tag (illustrated after this list), so the above-linked pages are now working. To document the problem, I instead set up some examples in my sandbox. My research suggests that this is actually a wiki glitch in Parser.php (and looking at the code "doBlockLevels" near line 2076 it may affect more tags than just center), rather than an issue with a local setting. --Nephele 21:12, 2 November 2006 (EST)
- Special:Undelete does not work when accessed by itself. When accessed with a subpage, for example Special:Undelete/Nirn:Nirn, it does work.
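To illustrate the center-tag workaround from the first item above (generic markup, not copied from the actual template). Broken version, with the open and close tags on separate lines:

<center>
Book text here...
</center>

Working replacement:

<div style="text-align:center">
Book text here...
</div>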
wgAjaxSearch
So far, I don't particularly like the change to use the Ajax search feature that was just added to the site. Personally, I really don't like having the page that I'm viewing replaced by a search box every time I type something into the box. But even more importantly, the Ajax search feature doesn't seem to work. It doesn't take into account any of the transparent namespace features that have been added to the site. Therefore, it only tries to match what you type with articles in the main namespace, i.e., 99.99% of the time it has no chance of finding the article that a reader is looking for. Given that I can't see it ever helping readers, it seems to be a huge number of extra server requests without any apparent benefit. That's my impression right now, at least. --NepheleTalk 02:50, 18 November 2007 (EST)
Proposal for Performance Tweaks
Warning: long rambling post ahead, full of obscure technical details ;)
I've spent some time today researching some ideas on how to improve our site's performance. In particular, while at the book store I impulsively picked up a book titled "High Performance Web Sites" (by Steve Souders; O'Reilly, of course!). It covers more than a few interesting points that I think could be really useful for UESP. Especially since UESP scores an F on the book's performance tests ;) (Using YSlow for those who are interested). A lot of what follows may only be of interest to Daveh and me (for example, the technical details of what configuration lines to add). However, there are also a few points that may be of interest to the community as a whole, in that they may have a visible effect. I've tried to arrange each topic so that the general interest points and any tradeoffs are near the beginning of each topic... just in case any one else is intrepid enough to wade through this :)
First, one basic point of the Souders book is that in addition to the actual "page" that you request (i.e., UESPWiki_talk:Upgrade_History), there are a whole slew of other files (components) that end up also being requested. In particular with the wiki, every single page requests the various wiki logos/wallpaper (4+ images per page), a batch of script files (.js files, 8 per page), and a batch of style sheet files (.css files, 7 per page). These 19+ files are the same for every wiki page that gets requested; some pages will also have a bunch of extra files (i.e., any icons or images on the page). Huge performance boosts can be gained by optimizing all of those additional components.
Another general point that I learned while reading this book is how caching works, which has puzzled me a few times. Those 19+ fixed wiki files will get cached on your local machine the first time you visit UESP. But those cached files do not automatically get used. Before using them, your browser checks with UESP each and every time to find out whether or not the cached version is the currently valid one. If the cached one is valid, it doesn't need to get it again from UESP; so the file is not re-transferred, saving bandwidth. But just the process of checking means that 20 times as many HTTP requests are being made as necessary. And that checking process becomes a huge obstacle whenever the site is slow: your browser won't show you the page you just downloaded from UESP until the browser has checked on the status of all 19+ components. In particular, whenever you're shown a page with a messed up style sheet/logo/layout, the problem is that your browser couldn't get confirmation that the cached version is OK for use, so your browser just ignores the cache completely.
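To illustrate that revalidation round trip, here's a generic example (the file, dates, and headers are made up for illustration, not captured from UESP):

GET /w/skins/common/wikibits.js HTTP/1.1
Host: www.uesp.net
If-Modified-Since: Thu, 03 Jan 2008 12:00:00 GMT

HTTP/1.1 304 Not Modified

No file content is transferred, but the browser still pays for one full round trip like this per cached component before it will render the page.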
Based on that, I've got a few ideas about things that can be done. The ideas are basically in decreasing order of usefulness (increasing order of work that needs to be done before moving forward). In other words, I'm really enthusiastic about idea #1 (mod_expires) and think it could be added to the site right away without too much effort. Further down the list, more feedback and/or work will be necessary for the ideas.
So into the gory details....
#1: Use the mod_expires apache module
I think this alone could make a huge difference, especially for regulars on the site, because it will fix the caching issue I just described: browsers will assume that any cached version of the file is OK for use, and won't double check with UESP first. This is done by giving an expiration date to every file ahead of time, the first time it gets downloaded. Until the expiration date that file will be treated as a valid file.
The tradeoff here is that if the file at UESP does get changed, browsers won't immediately find out about the file change. So:
- This setting will not be used for any of the site's "actual" content (i.e., the wiki pages). It will only be used for the auxiliary files (js, css, images, icons, etc.)
- I'm thinking of starting by setting the expiration period to be just one day. I can't imagine a one day delay in updates to js/css/images being a catastrophe. And hard refreshes can always be used to bypass this setting (which most of us probably use already anyway whenever a change to an image doesn't appear right away).
- Once some other issues get optimized, I think we can get much more aggressive with the core mediawiki files (i.e., the ones that only Daveh has permission to access), and set the expiration on those to longer periods (1 month, 1 year?). These files don't ever change except when the software gets upgraded (at which point the wiki software already has mechanisms in place to ensure that these files get updated for all readers, regardless of mod_expires settings). And increasing the expiration date on these universally used files ensures that these performance benefits apply even to readers who only pull up a wiki page once a day or once a week.
Into the technical details.... mod_expires has to be installed and enabled in apache first and foremost. Then the configuration settings I'm proposing are (cross my fingers I've caught all the typos...):
# images, CSS, etc. that come with Mediawiki (with version ID). Later this can be changed to "access plus 1 year"
<FilesMatch "\/w\/skins\/.*\.(gif|png|xcf|js|css)\?\d+?$">
ExpiresDefault "access plus 1 day"
</FilesMatch>
# images, CSS, etc. that come with Mediawiki (without version ID). Later this can be changed to "access plus 1 month"
<FilesMatch "\/w\/skins\/.*\.(gif|png|xcf|js|css)?$">
ExpiresDefault "access plus 1 day"
</FilesMatch>
# standard site images used on all wiki pages. Later this can be changed to "access plus 1 month"
<FilesMatch "\/w\/images\/(Somerights|Mainpage-logo|Wiki|Parchment_bg)\.png$">
ExpiresDefault "access plus 1 day"
</FilesMatch>
# wiki images (will apply to all images; most useful for icons that get used on a lot of pages...)
<FilesMatch "\/w\/images\/.*\.(gif|png|jpe?g)$">
ExpiresDefault "access plus 1 day"
</FilesMatch>
# customizable js/css
<FilesMatch "\/w\/index.php\?.*\.(js|css)\&">
ExpiresDefault "access plus 1 day"
</FilesMatch>
# Oblivion map and SI map tiles. Once the map tiles are finalized this can be changed to "access plus 1 month"
<FilesMatch "\/oblivion\/(simap|map)\/.*\.jpg$">
ExpiresDefault "access plus 1 day"
</FilesMatch>
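If those directives behave as intended, responses for matching files should carry headers along these lines (dates illustrative), telling browsers not to re-check for a day:

HTTP/1.1 200 OK
Expires: Fri, 11 Jan 2008 06:22:00 GMT
Cache-Control: max-age=86400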
#2: Add the mod_gzip module
Here the idea is just to reduce the size of files that get transferred by compressing them before sending them out. Smaller files = less bandwidth.
The tradeoff is that compressing the files takes up server CPU cycles. And right now the site seems to be CPU-limited, rather than bandwidth-limited. Even so, I think we can get some performance boosts, depending upon how aggressive we want to be (perhaps we want to experiment some to find a good balance). For example:
- Minimal compression, option A (no effect on server CPU): we manually compress and permanently save to hard disk some of the core, fixed files that get downloaded with every request (in particular, the .js and .css files that come with the mediawiki distributions). Just those files add up to 102.6 kB of data that gets sent every time a new reader visits the site. Configure gzip so that those files are the only ones that get compressed. (For comparison, the html for the site's Main Page is 30 kB).
- Minimal compression, option B (almost no effect on server CPU): we allow mod_gzip to automatically create the .gz files on the hard disk, but still only compress .js and .css files. Saves us some manual work and ensures that if the files are updated, the .gz file also gets updated. But it does mean that mod_gzip will be automatically creating files in the web directories (and potentially replacing any existing .gz files). Given that the files of interest are hardly ever going to change, that may be an unnecessary risk.
- Moderate compression (more CPU usage, but better bandwidth): in addition to the .js and .css files we allow some subset of the site's content to be compressed. I'm running out of steam for tonight, so I haven't researched the options here in too much detail, mainly because of my impression that, at the moment, we can't afford any extra CPU. But it's worth remembering that if we do free up some CPU, it might be worth experimenting.
So, after getting mod_gzip installed (from what I've seen, version 1.3.26 or higher is required for some of the following features), the options that I'd suggest starting with, to give us option A above, are:
mod_gzip_can_negotiate Yes
mod_gzip_static_suffix .gz
AddEncoding gzip .gz
mod_gzip_update_static No
mod_gzip_item_include file \.js$
mod_gzip_item_include mime ^application/x-javascript$
mod_gzip_item_include file \.css$
mod_gzip_item_include mime ^text/css$
mod_gzip_item_exclude mime ^text/html$
mod_gzip_item_exclude mime ^text/plain$
mod_gzip_item_exclude mime ^httpd/unix-directory$
Notes:
- The include/exclude settings are the opposite of the default values: by default, .js/.css are excluded; html and plain files are included. From what I can tell, the only reason why .js and .css are by default excluded is because of obsolete problems with Netscape 4.
- Someone has to manually create .gz files for the .js and .css files found under uesp/w/skins (a one-time job; sketched after these notes)
- Changing mod_gzip_update_static to Yes changes this from option A to option B
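The manual compression itself is just a shell one-liner per file. A sketch, assuming the web root path shown here (the file names are the standard MediaWiki ones):

# path assumed; adjust to the actual web root
cd /var/www/uesp/w/skins
gzip -9 -c common/wikibits.js > common/wikibits.js.gz
gzip -9 -c monobook/main.css > monobook/main.css.gz
# ...and likewise for the remaining .js/.css files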
#3: Minifying/combining .js and .css files
This is another (complementary) way of reducing the overhead associated with those 100 kB of .js and .css files that get downloaded all of the time, in addition to compressing them. The possible methods to use include:
- Reduce the number of files (currently at 15!)
- Merge content from the wiki (editable) files (such as Mediawiki:Common.js, Mediawiki:Common.css, Mediawiki:Monobook.css) into the core files. The merge would reduce unnecessary overlap, and move the content to a location where it can be gzipped.
- Eliminate all unnecessary characters from the version of the file served out to users (delete comments, whitespace, etc.). The original, legible file would be kept and used for any future updates/modifications; a separate "minified" version of the file is what gets distributed.
The tradeoffs here are, first, that it requires some manual work to edit the various files. Second, I'm not too sure we can really reduce the number of files (the number is that high because of the various layers of wiki options)... more research is necessary to see what's possible on file number. But even without changing the number, I'd guess we can get the 100 kB total size of the files down to something far smaller (less than 20 kB, maybe less than 10 kB). And that's before compression, which will take off another 50% or more.
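As a trivial illustration of minification (a made-up rule, not taken from the actual style sheets):

/* original, legible version (kept for future editing) */
#p-logo a {
    /* center the background image behind the logo link */
    background-position: 35% 50% !important;
}

/* minified version actually served to readers */
#p-logo a{background-position:35% 50%!important}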
#4: Tweaking the wiki search engine
Okay, for this one I haven't researched the details yet, but let me just throw the idea out there. We know the site is CPU strapped; I know that every time I look at the server logs I see wiki search requests showing up, and their number seems to be larger whenever the site is being sluggish. Those two facts alone make me suspect that a lot of the site's CPU is being used up by search requests.
What I think could be easily done to reduce the CPU used by searches is make it so that whenever someone hits "search" in the standard search box (the one seen on every wiki page), the search only scans article titles; it does not scan the entire text of the articles. Given the huge number of redirects we now have on the site, I think the chances are that just a search of article titles is likely to pull up the information that most readers are looking for. For those readers who do want to do a full text scan, doing a search once you're already on the search page would still work the way it always has, and would give you both titles and full text.
If it's something others think is worth pursuing, I'd be happy to research the details. See whether there are already wiki options that we can take advantage of (perhaps even just some temporary setting tweaks to quickly see whether searches alone really are a major culprit). Then if necessary tweak the relevant PHP code. I'm already familiar enough with that part of the code (from the work I did with the namespaces) to know that these tweaks are very realistic, and wouldn't even take too much work.
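To make the idea concrete, here's a rough sketch of the routing logic. This is hypothetical code (the function name and the $fromSidebar flag are invented for illustration), though the searchTitle()/searchText() split mirrors the division the wiki's SearchEngine classes already make:

// Hypothetical sketch; not actual MediaWiki 1.10 code.
function uespRouteSearch( $term, $fromSidebar, $engine ) {
    if ( $fromSidebar ) {
        // Search box shown on every page: cheap title-only match.
        // Given all our redirects, this should satisfy most readers.
        return $engine->searchTitle( $term );
    }
    // Request made from the search page itself: scan titles plus
    // full article text, exactly as it works now.
    return $engine->searchText( $term );
}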
Conclusion
That's my long ramble / attempt to share findings with everyone else. Of course, any feedback on whether any editors are concerned about the tradeoffs would be welcome. Or any feedback on which ideas are worth pursuing in more detail (or which ones seem like harebrained ideas resulting from a day of over-immersion in performance books). Also, does anyone have any other ideas on things to tackle that might have a big effect on site performance? For the record, I know Daveh has been working on implementing some type of squid cache system, which should be another way to get some big performance boosts (I just don't know the details myself!). --NepheleTalk 01:22, 10 January 2008 (EST)
- I've enabled mod_expires and made most of the requested settings. The first regex gave me an error and I'm too tired to bother to figure it out atm. A quick check with FireBug doesn't seem to show any expires header for the content that should have it (mod_expires itself is working, judging from other tests). -- Daveh 23:19, 23 January 2008 (EST)
- Thanks, Daveh! (Assuming there are no other anonymous editors who'd be able to modify our apache settings!) I think I found the typo in the first entry; delete the last ?:
# images, CSS, etc. that come with Mediawiki (with version ID). Later this can be changed to "access plus 1 year"
<FilesMatch "\/w\/skins\/.*\.(gif|png|xcf|js|css)\?\d+$">
ExpiresDefault "access plus 1 day"
</FilesMatch>
- Another chunk to add in would probably be:
# images used by the forums
<FilesMatch "phpbb\/templates\/.*\.(gif|png|jpg)$">
ExpiresDefault "access plus 1 month"
</FilesMatch>
Minor Tweaks
This time around, just some minor things that can be done:
- Copy w/images/Wiki.png to a non-images location, e.g., to w/skins/common/images/UESPlogo.png (make it so that someone can't accidentally or maliciously overwrite our site's logo just by uploading a new image to Image:Wiki.png).
- Set $wgLogo to "{$wgStylePath}/common/images/UESPlogo.png", or whatever corresponds to the new logo location/name (make it so that the logo is actually treated as the site logo instead of the old image at wiki.png; once that change is made I can change the Mediawiki files so that there are no longer two logos on every page). See the sketch after this list.
- Copy w/images/Parchment_bg.jpg to w/skins/monobook/headbg.jpg, overwriting the existing file; you may want to rename the original first (same concept as for the logo, but with the page background image).
- Set $wgEnableSidebarCache = true; (cache the links shown in the righthand sidebar and reduce the overhead of re-creating the sidebar for every page).
- Install the ProtectSection extension (based on discussion at UESPWiki:Community Portal/User Page Warnings).
- Install the ExpandTemplates extension (creates a useful special page for recursively expanding templates).
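A minimal LocalSettings.php sketch for the settings above (the logo path matches the proposed copy; the extension file names are assumptions based on the usual layout, not verified):

// Treat the protected copy as the site logo.
$wgLogo = "{$wgStylePath}/common/images/UESPlogo.png";
// Cache the sidebar links instead of rebuilding them for every page view.
$wgEnableSidebarCache = true;
// Extensions, assuming each unpacks into its own directory under extensions/.
require_once( "$IP/extensions/ProtectSection/ProtectSection.php" );
require_once( "$IP/extensions/ExpandTemplates/ExpandTemplates.php" );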
I'll add more as I think of them ;) --NepheleTalk 12:46, 10 January 2008 (EST)
- Could I add an upgrade to the latest version of the API to the list? The newer versions [1] have several more functions that could be of use, and we seem to have fallen a few versions behind. –Rpeh•T•C•E• 12:17, 15 January 2008 (EST)
- The API is bundled in with the entire mediawiki package, and gets upgraded every time we upgrade mediawiki as a whole. Replacing just the one component without a complete package upgrade seems a bit strange to me. It seems like a safe bet that none of our wiki software counts on API functions that don't exist in our API, given that they were all released together as one package; our software definitely wouldn't be calling API functions that hadn't been invented yet at the time the software was written. Also, it's not just a matter of replacing the file "api.php": that file is just a minimal bit of scripting that collects information on the functions made available by all the other wiki components.
- Overall, yes, we are one version behind the current mediawiki distribution (we're using 1.10.0; MediaWiki has 1.11.0 as the current version). But given that mediawiki automatically releases a new "stable branch" every quarter, being one quarter behind isn't too bad ;) And with all of our customizations to the software, upgrading the wiki software requires a bit more than a single mouse click. --NepheleTalk 13:03, 15 January 2008 (EST)
More minor tweaks. Today's focus: clearing out the apache error logs, which are swamped with error messages for requests for non-existent files. Besides the wasted overhead of unnecessary requests and unnecessary error logging, getting these errors out of the way would also make it much easier to notice any legitimate errors that might be popping up in the logs.
- requests for subSilver/images//: There are hundreds of requests per hour for this non-existent image, all being generated by the forum's style sheets (specifically various components with "background-image: url(templates/subSilver/images/);"). Two possible ways to fix it:
- option A: add a "transparent.gif" image to the subSilver/images directory; then in the forum admin/theme software change all empty background image settings (should be th_class1, th_class2, and th_class3... or at least that's what they get saved to) to "transparent.gif".
- option B: edit the phpbb template files, specifically "overall_header.tpl" and "simple_header.tpl": change any line that currently refers to the variables T_TH_CLASS1, T_TH_CLASS2, or T_TH_CLASS3 to read "background-image: none;" (see the sketch after this list). (Overall, this seems better for the site, but it does require editing the php software.)
- requests for eqwiki/favicon.ico: Perhaps just bad timing when I was viewing the logs, but there are more than a few of these showing up, too. One easy fix: copy, for example, everquest's favicon to eqwiki/favicon.ico.
- requests for siteicon.ico: There are requests looking for this file in various directories from the old site (morrow/hints/images/siteicon.ico, morrow/map/images/siteicon.ico, morrow/quest/images/siteicon.ico, etc.) Fix it by just linking images/siteicon.ico to all the other images subdirectories?
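For option B, the edit would look something like the following. This is illustrative only; the exact selectors and lines in the .tpl files may differ:

/* before: with th_class* unset, this expands to url(...images//),
   which triggers the bogus request on every page view */
th.thHead { background-image: url(templates/subSilver/{T_TH_CLASS1}); }

/* after: no image reference, no request */
th.thHead { background-image: none; }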
If you'd like me to help with any of these edits, just let me know. Most of them involve write access to various directories, of course, which is why I can't do them right now ;) --NepheleTalk 18:46, 16 January 2008 (EST)
Timeout Too Short
Decreasing apache's timeout from 300 seconds to 60 seconds does seem to have helped with making the lingering "R" (reading request) connections less of a problem today. On the other hand, it's now become nearly impossible to edit a significant number of pages; other pages have become blank because of previous problems saving pages and can't be fixed (e.g., Oblivion:Artifacts is completely inaccessible right now). Basically, there are legitimate apache requests that take more than a minute to process, and all of those legitimate requests are now being cut off. I'll try later tonight and see whether some of the pages I've noticed can be fixed when the site is less busy, but it seems very possible that there are some pages that fundamentally cannot be processed within 60 seconds, no matter how well the site is operating. We already had pages that were difficult to save with a 300-second limit. (Also I'll check and see whether I can figure out how to purge some of the broken pages on the server itself; it doesn't help with the editing problem but at least I can try to make all of our pages available for readers.)
So we seem to have an ugly balancing act here: a short timeout makes it easier to access the site, but at the expense of making it impossible to edit significant portions of the site. If the short timeout is absolutely necessary as a short term measure just to keep the site available for readers, maybe it's a sacrifice we all have to make. But I'd much rather find other ways to deal with these IPs as quickly as possible, and get the timeout back up to 300 seconds. --NepheleTalk 22:02, 24 January 2008 (EST)
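For reference, the setting in question is a single httpd.conf directive; 300 seconds is also Apache's shipped default:

# Seconds Apache waits on a request's I/O before giving up on it.
# 60 frees up stuck connections faster; 300 lets long page saves finish.
Timeout 300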
- Yah, I was wondering about that. I'll reset it back to 300. -- Daveh 23:22, 24 January 2008 (EST)
Total Site Outage - 09:50 GMT
The site just died completely for about two minutes - every request came back with:
- UESPWiki has a problem
- Sorry! This site is experiencing technical difficulties.
- Try waiting a few minutes and reloading.
- (Can't contact the database server: Too many connections (localhost))
The interesting thing was what Server Status showed - 99 out of the 100 connections were being made by 72.55.137.132, the IP address of those notorious vandals, www.uesp.net :)
I assume that's because of the way the squid cache works now but it was rather odd. After a couple of minutes the connections started timing out and everything got back to normal. Just thought I ought to mention it. –Rpeh•T•C•E• 04:55, 25 January 2008 (EST)
- Strange. It might have just been a cache/DNS snafu when it was just starting, or possibly a short-lived DoS, as if someone asked for a few hundred pages in a short time. I'll look into it but unless it happens again I'm not too worried. -- Daveh 21:42, 25 January 2008 (EST)
- It just happened again - exactly the same time and exactly the same error message. As I type, the proper style sheet isn't coming up and so everything is black and white. Server Status showed the same thing - every connection taken up by squid and then gradually timing out. The fact that it happened at the same time is interesting. Are there any common features in the logs? Ah - when I did show preview I got the proper style sheet. –Rpeh•T•C•E• 04:56, 30 January 2008 (EST)
- Strange...the cron jobs on content1/squid1 are set up to run at 4:02 daily, 4:22 weekly, and 4:42 monthly. Since you saw the same problem within five days it would have to be a daily cron issue if anything. Offhand I cannot think of any of the daily scripts that would cause an issue at 4:50 but will take a closer look tonight.
- I assume this is not an issue on your end? If you happen to be online at 9:50 GMT again, check the site out as well as www.iweb8.com. If you cannot reach either site it may be something else (although from your description it does seem like a site issue). Whatever it is, it is short enough not to trigger any of the site monitoring trackers I use (so under 5 minutes). -- Daveh 10:24, 30 January 2008 (EST)
ERROR
The requested URL could not be retrieved

While trying to retrieve the URL: http://www.uesp.net/server-status

The following error was encountered:

* Unable to forward this request at this time.

This request could not be forwarded to the origin server or to any parent caches. The most likely cause for this error is that:

* The cache administrator does not allow this cache to make direct connections to origin servers, and
* All configured parent caches are currently unreachable.

Your cache administrator is root.

Generated Sun, 03 Feb 2008 09:48:43 GMT by cl-t021-251cl.privatedns.com (squid/2.6.STABLE6)
Another Feature Request
In addition to the two extensions I mentioned above, there's another addition to the wiki that looks like it would be very useful. At its most basic, what I'd like to request is that we add $wgUseTidy = true; to LocalSettings.php. But in order to do that, Tidy needs to be installed on the server first. Also, there are some explanatory comments in the DefaultSettings.php file. Installing Tidy will provide improved HTML sanitizing (i.e., the behind-the-scenes cleanup that ensures that our wiki pages spit out valid HTML no matter what random stuff gets inserted into a wiki page).
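A minimal sketch of the LocalSettings.php side once Tidy is installed. The setting names are the ones documented in DefaultSettings.php, but the binary and config paths here are assumptions to be adjusted:

// Pipe every rendered page through HTML Tidy before output.
$wgUseTidy = true;
// Paths assumed; point these at wherever tidy actually lands.
$wgTidyBin  = '/usr/bin/tidy';
$wgTidyConf = "$IP/includes/tidy/tidy.conf";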
It's a somewhat obscure request, but it's a missing feature that has now tripped us up a couple of times. Probably the main reason we need it is that Wikipedia is set up to use Tidy, and therefore Wikipedia templates end up being built assuming that it exists. That's the root of the problem that Lurlock's hit trying to implement the Familytree template (which would be useful for family trees such as the Septim lineage, but perhaps even more important for wikifying and updating the quest flow charts, such as the one at Morrowind:Main Quest). More generally, the benefit of adding Tidy is that, with it, the wiki will no longer force all HTML tags to be closed at the end of a template: you can set up a template to provide formatting at the top of a page and actually have it work. This is the problem that's shut down my attempts to add some improved book formatting.
In other words, getting this done will allow us to move forward with several useful projects that are currently stuck in impossible-to-implement limbo. Thanks! --NepheleTalk 18:03, 30 January 2008 (EST)
- I've created an example on Wikipedia. It still needs a little work, but it shows the ease of creating such a thing. If I had to rate the difficulty of using this on a scale of 1-10, it'd have to be a 2; it requires basic knowledge of tables. --Brandol 04:30, 31 January 2008 (EST)