The origins of WikiWash

It all started because we thought it would be fun to map the sort of chicanery Toronto's absurdly partisan politics inspired on Wikipedia.

Shortly after Metro's digital team arrived at Techraking's inaugural Toronto edition, we were assigned a fairly simple task: Come up with a data journalism pitch to throw your freshly tuned techniques at.

Our kernel was telling the story of Toronto's municipal election through Wikipedia edits. What invented indiscretions would worm their way into our notorious incumbent's entry? If somebody was spotted sowing confusion on a candidate's page, what other entries drew their roguish attention?

That kernel went on to be heated by several rounds of prodding, questions and encouragement from our Center for Investigative Reporting and The Working Group hosts.

They challenged us to expand our idea outside Toronto's electoral boundaries and cast our eye toward Wikipedia itself. The site's open nature pulls double duty as its greatest strength and its greatest weakness.

A way to easily sort which entries are being targeted by ill-intentioned edits, where those edits are being made from and even how they're being corrected could go a long way toward taming the more damaging aspects of Wikipedia's crowd-sourcing model.

So it was time to think big. How much data could realistically be scraped from Wikipedia? Would we be able to search a constellation of pages to dig up potential patterns? What would be the best way to display it?

Eventually, chats with our hosts grew shorter and the question primarily became "Wait, this doesn't already exist?"

It didn't. Now it does. WikiWash still has a distance to go before it becomes what we all think it can be, but we couldn't be happier to share it with you while we watch it grow.