V for vandalism

April 25, 2014



The idea behind Wikipedia was to create the largest online encyclopaedia, both in terms of the range of topics and the depth of discussion. This was made possible through crowdsourcing and a consistent drive to acquire and share knowledge. However, as the platform grew and people started depending on it for quick, reliable information, what began as occasional pranks and innocuous mischief-editing escalated into objectionable vandalism.

Several recent studies and experiments reveal that most vandalism is reverted within seconds to minutes. However, there’s still a lot of work to be done. But before we break into our zombie-apocalypse version of vandal-slaying stories, let’s see how it all comes together.

User Groups

Wikipedia is an intricate community based on regions, roles and specialties. Various stakeholders are categorised through ‘User Groups’ (also called ‘Flags’ or ‘Bits’).

User/Editor: anyone with an internet connection can view pages, create an account and edit Wikipedia pages

Confirmed/AutoConfirmed: users with additional rights, like page creation

Administrators/Sysops: can perform tasks like page deletion, page protection, blocking/unblocking etcetera

Patrollers: can check newly created pages and create auto-checked new pages

Bots: automated programs used to perform pre-assigned tasks

Cyborgs: users patrolling Wikipedia with the help of various software tools

Besides these, there’s an array of specialised roles such as Bureaucrat, Steward, Ombudsman, Reviewer, Rollback, Template Editor, Importer, TransWiki and Course Coordinator.

Inglorious “editors”

One of the first notorious vandals, “the Squidward Vandal”, replaced the text of several wiki pages with an image of Squidward Tentacles, a character from the animated series SpongeBob SquarePants. The user claimed to be a programmer and used various hiding tactics such as IP proxies and multiple user accounts. This led the community to fight back with its first automated script, AntiVandalBot. The Squidward Vandal was eventually caught through editor patrols, but the bots proved to be a valuable asset.

Wikipedia maintains a list of approved bots; any developer around the world can create such a bot in compliance with the Bot Policy and then apply for approval. There are more than 700 bots currently working on Wikipedia, but the most popular one is probably ‘ClueBot NG’, programmed using machine learning over a massive data set of vandalism samples categorised by users. It works non-stop and executes more than 9,000 edits per minute.
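To make the idea of “machine learning over labelled vandalism samples” concrete, here is a toy sketch: a tiny Naive Bayes text classifier trained on a handful of hand-labelled edits. Everything here (the word list, the samples, the function names) is an illustrative assumption; ClueBot NG’s real system is far more sophisticated and uses many engineered features, not just words.

```python
# Toy sketch of learning "vandalism" vs "good" from labelled edit text.
# All training data and names are hypothetical, for illustration only.
from collections import Counter
import math

def tokenize(text):
    return text.lower().split()

def train(samples):
    """samples: list of (edit_text, label) pairs, label in {'vandalism', 'good'}."""
    counts = {"vandalism": Counter(), "good": Counter()}
    totals = Counter()
    for text, label in samples:
        counts[label].update(tokenize(text))
        totals[label] += 1
    return counts, totals

def classify(model, text):
    counts, totals = model
    vocab = set(counts["vandalism"]) | set(counts["good"])
    best_label, best_score = None, float("-inf")
    for label in counts:
        # log prior + log likelihood with add-one smoothing
        score = math.log(totals[label] / sum(totals.values()))
        denom = sum(counts[label].values()) + len(vocab)
        for word in tokenize(text):
            score += math.log((counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# A (deliberately tiny) hypothetical training set.
training_data = [
    ("u r all dumb lol", "vandalism"),
    ("page is poop haha", "vandalism"),
    ("added citation for the 2010 census figures", "good"),
    ("fixed typo in the infobox", "good"),
]
model = train(training_data)
```

In practice the training set would contain hundreds of thousands of user-categorised samples, which is what makes this approach viable at Wikipedia’s scale.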

Another popular tool is ‘WikiScanner’, which specialises in catching organisations that change Wikipedia content for their own benefit. It has traced the IPs of many well-known organisations, including Apple, Microsoft, the BBC, The Guardian, Reuters, the FBI and Amnesty International. The tool also drew some backlash when Stephen Colbert joked about its workings, mock-declaring it a democratic process and a right of organisations and governments to decide what should be considered the ‘truth’.

Besides bots, semi-automated tools/scripts also help users perform tedious sorting and data monitoring tasks, earning them the (unofficial) title of Cyborgs. For example, ‘Twinkle’ provides quick rollback options, user warning and reporting features.
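The “rollback” such tools offer has a simple core: undo every consecutive top edit by the same user, restoring the newest revision made by someone else. The sketch below captures that logic with an assumed in-memory revision structure; the field names are illustrative, not MediaWiki’s actual API.

```python
# Hypothetical sketch of rollback target selection. A page history is a
# list of revisions, newest first; each revision records who made it.

def rollback_target(history):
    """Return the newest revision by someone other than the top editor,
    i.e. the state to restore, or None if one user wrote every revision."""
    top_user = history[0]["user"]
    for revision in history:
        if revision["user"] != top_user:
            return revision
    return None

# Example: two vandal edits sit on top of a good revision.
history = [
    {"user": "VandalX", "text": "SQUIDWARD"},
    {"user": "VandalX", "text": "squidward lol"},
    {"user": "GoodEditor", "text": "Wikipedia is a free online encyclopaedia."},
]
```

Skipping *all* consecutive edits by the offender, rather than just the latest one, is what makes rollback a one-click action even against repeated vandalism.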

Types of vandalism

One of the “absolute and non-negotiable” Wikipedia policies is the maintenance of a Neutral Point of View (NPOV), which helps define, in general terms, which acts count as vandalism.

Vandalism can be any act that invalidates or diminishes a Wikipedia page’s reliability. It could be in the form of silly or crude humour, image edits, link edits, spam insertion, advertising content, copyrighted material, hoax pages, removal of valid page content, etcetera. Some clever vandals make minor edits that are almost plausible or well hidden. More technical vandals attack the source code, templates, page tags, CSS, or even create VandalBots.
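Several of the patterns above lend themselves to simple rules before any machine learning is involved. The sketch below scores an edit against three of them: crude humour, newly inserted external links (possible spam) and large-scale blanking. The word list and thresholds are toy assumptions, not any real patrol tool’s configuration.

```python
# Hypothetical rule-based vandalism scorer; higher means more suspicious.
import re

def vandalism_score(old_text, new_text):
    score = 0
    # Crude/silly humour, using a deliberately tiny illustrative word list.
    if re.search(r"\b(lol|poop|dumb)\b", new_text, re.IGNORECASE):
        score += 2
    # More external links than before: possible spam or advertising.
    if new_text.lower().count("http://") > old_text.lower().count("http://"):
        score += 1
    # The new text is a small fraction of the old: most content was removed.
    if len(new_text) < 0.2 * len(old_text):
        score += 3
    return score
```

Real patrol tools combine far more signals, but the shape is the same: cheap rules flag obvious cases instantly, leaving subtler edits for classifiers and human patrollers.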

Edit warring is another form of vandalism on Wikipedia. When rival groups of editors repeatedly edit a controversial page, for example the wiki page of George W. Bush, the page is placed in a semi-protected/protected state, where both sides must agree on all segments of the content before it makes the final cut. More contentious sections are sometimes split off into new articles so the dispute can be resolved separately.

Repercussions

Wikipedia co-founder Larry Sanger (who left Wikipedia in 2002 and has been critical of the project since) points out that if trolls are dealt with strongly, it reflects worse on the organisation than on the trolls, and some may even (arguably) scream: “censorship!”

Wikipedia’s solution to this problem is the development of various levels of dispute resolution schemes. This includes Mediation, Arbitration and a Counter Vandalism Unit.

Mediation requests are voluntary and in good faith, involving third-party editors or administrators as mediators. Arbitration, however, requires evidence of incorrect/misleading information and may lead to binding rulings, including possible sanctions against the involved users. The Counter Vandalism Unit is a resource centre to help pool the available resources, including editors, bots, tools, unit members and recorded policies.

The most basic action is reverting the vandal’s edits and issuing a warning. If the vandalism persists, the vandal may be temporarily blocked, even if he/she is behind a proxied IP or has multiple user logins, and his/her contributions to Wikipedia will automatically come under scrutiny. The next step is ‘Indefinite Blocking’, where he/she can be ‘Banned’ for life. This activity is also recorded on the user’s public records, including ‘contribution pages’ and ‘talk pages’, eliminating any chance of respect or promotion in the community.
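The escalation ladder just described (revert and warn, escalate warnings, temporary block, indefinite block/ban) can be sketched as a simple decision function. The number of warning levels and the thresholds below are illustrative assumptions, not Wikipedia’s exact policy.

```python
# Hypothetical sketch of the sanctions ladder described above.
# Warning levels and thresholds are illustrative, not actual policy.

def next_action(prior_warnings, indefinitely_blocked=False):
    if indefinitely_blocked:
        return "banned"            # final step: banned for life
    if prior_warnings == 0:
        return "revert and warn"   # first offence: revert plus a warning
    if prior_warnings < 4:
        return "revert and escalate warning"
    return "temporary block"       # warnings exhausted: block the account
```

The point of the ladder is proportionality: each step is recorded publicly, so repeat offenders arrive at blocks with a documented trail behind them.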

Are bots anti-community?

Are bots against the communal ideology of Wikipedia? The community fosters a sense of ownership, which turns into disappointment when an unforeseen restriction is applied to a user. Bots are known to frequently revert changes that were not made in bad faith, but were technical faults or honest mistakes. Public/shared IPs get blocked all the time, causing distress to editors.

Efforts are already underway to humanise the bots through more intelligent algorithms and to add more human patrollers (cyborgs), preserving the social element of Wikipedia that is much needed to welcome new (and returning) contributors. Tools like Snuggle help monitor and mentor new users through social messages.

Participate

Besides the obvious means of contributing content, there are several ways users can opt to get more involved. Users can visit ‘Talk Pages’: discussion boards available for most Wikipedia pages (even the policy and guideline pages). This provides an off-the-page means of discussing ideas rather than engaging in edit wars. Users can also use the ‘Watchlist’ tool to monitor changes to talk pages (or regular pages they are interested in).

Software developers can help develop/improve bots, tools and platform algorithms; enthusiasts/subject matter experts can help refine the content, and data analysts or people with keen eyes for detail can help patrol the pages. In whatever capacity we can help, knowledge must prevail.