This requires a Wikipedia account and an email address (which can be used exclusively for Wikipedia). Signing up for a Wikipedia account involves providing a username and a password, but no personal information is needed.
The "Advice to users using Tor" page has more information:
You have to "demonstrate the need" and "be trusted not to abuse".
> Editors who may reasonably request an exemption include users who show they can contribute to the encyclopedia, and existing users with a history of valid non-disruptive contribution.
So, "a Wikipedia account and an email address", as you write, is not enough.
EDIT to add: As acknowledged by Wikipedia in your second link:
> IP block exemptions for this purpose should be requested by following these instructions. However, it may be difficult to establish good standing and remain completely anonymous, as the former requires editing without using Tor.
but once i have an account, why would i still need tor?
what protection does tor give to someone logged in to wikipedia?
i don't like to get an account because that would leave a public trail of all my edits, and given the eclectic selection of topics i am interested in, this would allow anyone to deanonymize me. whereas if my edits get spread over multiple ip addresses, you may tie a particular edit to an ip of mine, but you won't be able to build a profile of my person from all my edits.
hackernews has the same problem, but on hn the topic selection is more limited, and not all topics i am interested in are being discussed here. also it is a lot more difficult to categorize hn comments compared to wikipedia edits.
If you believe you require privacy in your contributions, you can create multiple accounts pursuant to the policy on sockpuppetry: https://en.wikipedia.org/wiki/Wikipedia:Sock_puppetry#Legiti.... In these cases, you "must not use alternative accounts to mislead, deceive, disrupt, or undermine consensus". It is also recommended to "notify a checkuser or members of the arbitration committee if they believe editing will attract scrutiny".
Any email address? Surely there's still unpopular email services out there that don't require a phone number. There's even websites that provide instant temporary email addresses.
Last time I tried to create an e-mail address via Tor, ProtonMail worked flawlessly. No phone number required. They even have a .onion domain available exclusively via Tor.
But you are partially right: other e-mail providers like GMail and Hotmail won't like Tor and will ask the user to provide a phone number. So the solution is simply not to use those e-mail providers.
Even if you get the account created it will tend to self-lock after a short time and ask for SMS verification.
Email services that don't SMS-verify on VPN or Tor endpoints usually find themselves on anti-spam blacklists sooner rather than later. Spammers are constantly on the lookout for email services that don't have trashed reputations, to get around Gmail's and other providers' filters.
ProtonMail does not allow new account creation over VPN or Tor without a donation or SMS. I checked just now. The point of creating an account with Tor is that it's completely untraceable. Yandex does allow it, however.
You can use Tor to circumvent firewalls or NAT (e.g. SSH tunneling, or avoiding proxies and censorship).
You can use Tor to ensure Facebook or Wikipedia does not have your IP address (you might not trust the admins).
You can use Tor to provide others the above mentioned benefits.
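The SSH-tunneling case mentioned above is usually done by pointing SSH at Tor's local SOCKS proxy. A sketch of an ssh_config fragment, assuming Tor is listening on its default port 9050 and that a netcat with SOCKS support (`nc -X 5`) is installed; the host and user names are placeholders:

```
# ~/.ssh/config -- route one host through the local Tor SOCKS proxy.
# Assumes Tor's SocksPort is the default 127.0.0.1:9050.
Host example-via-tor
    HostName example.com          # placeholder destination
    User alice                    # placeholder user
    ProxyCommand nc -X 5 -x 127.0.0.1:9050 %h %p
```

With this in place, `ssh example-via-tor` reaches the destination from a Tor exit node rather than from your own IP.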
And in an authoritarian country such as China, trying to hide from the authorities (as you put it, the "national secret police") makes sense quite quickly, as so much is forbidden. That being said, Tor was not designed to make you hide from the "national secret police"; at least not specifically (and which nation anyway?). IIRC it started out as a project by the US military.
The Venn diagram of "national secret police" and the foreign intelligence agencies the Navy was trying to avoid probably has quite a bit of overlap in many areas of the world. So, yeah, it was in fact originally designed for just that.
Doesn't that just shift things down the road? Now you need to buy BTC, and in the UK I think you need a bank account (to set up PayPal, say) or a phone number (to buy a disposable payment card).
I don't know your particular situation, but there are places you can buy bitcoin in cash. At least in the US, this doesn't need a bank account or phone number.
You'll need a domain, and getting one anonymously is another issue roughly on par with getting an anonymous email account. In fact the domain provider is pretty much always going to ask you for an email account and a form of payment. The whole point of this exercise is to avoid having your nation's secret police kidnap and kill you because you posted the list of details about how the leaders are stealing from the people and killing dissidents.
Unfortunately, the incentive structure at Wikipedia has challenges. Editors suffer under every ounce of abuse but the cost of excluded contributions is nearly invisible and not felt personally by anyone.
The result is that convincing people to take even small risks (of abuse) or costs (of tech measures to mitigate abuse without compromising user privacy) is extremely hard.
Hopefully this research will help shift the balance.
Hope so. I nearly always use a VPN, and can't contribute to Wikipedia (even if logged in) because of that. I once requested an exemption, but it was not granted.
So, they basically exclude anyone that's somewhat privacy conscious, or frequently travels to weird jurisdictions that make use of a VPN advisable, or lives in such jurisdictions. As you say, an invisible loss.
At the same time, as you mention, excluding contributions is cheap, and that's what would-be contributors feel worst about: that they can't contribute effectively.
I like how the article starts with:
> An activist in the Middle East can provide a different perspective on an article about politics in their own country than a collaborator in northern Europe. And they deserve to add their voices to the conversation safely.
When they will most likely get an "original research" deletion of their contribution.
I think this somewhat misses the point though - i don't think people are worried about the average Tor user. The average Tor user is probably just fine. People are worried about the one person with an axe to grind, making everyone's life miserable, who turns to Tor after being blocked through normal means.
Should people be worried about that? Idk, i think there are probably ways to mitigate the risk of that at least somewhat. However the article didn't address that concern, and you can't change hearts and minds if you don't talk about what people are actually worried about.
Exactly. With an IP, it can be pretty easy to stop someone from doing anything: block one IP (maybe a small range for IPv6), and maybe another for their phone, and then block account creation from those IPs (I'm a bit rusty on the technical details of how that works -- whether it happens automatically when blocking an IP, or is separate; I think I heard something about doing it based on a cookie at one point). No such luck with Tor.
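For illustration, matching an address against a block list of single IPs and CIDR ranges can be sketched with Python's standard `ipaddress` module. This is a hypothetical toy, not MediaWiki's actual blocking code; the addresses are documentation ranges, and the /64 reflects the common per-subscriber unit for IPv6:

```python
import ipaddress

# Hypothetical block list: single IPv4 addresses and CIDR ranges.
# A /64 is the customary per-subscriber allocation for IPv6.
BLOCKS = [
    ipaddress.ip_network("192.0.2.55/32"),          # one IPv4 address
    ipaddress.ip_network("198.51.100.0/24"),        # a small IPv4 range
    ipaddress.ip_network("2001:db8:abcd:12::/64"),  # one IPv6 subscriber
]

def is_blocked(addr: str) -> bool:
    """True if addr falls inside any blocked network of the same IP version."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKS if net.version == ip.version)

print(is_blocked("198.51.100.7"))         # True: inside the /24
print(is_blocked("2001:db8:abcd:12::1"))  # True: inside the /64
print(is_blocked("192.0.2.56"))           # False: neighbor of a blocked /32
```

The Tor problem is exactly that no such stable address or range exists to put in `BLOCKS`.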
The kind of dedicated habitual abuser bawolff is talking about will also use an endless series of free/cheap VPSes and VPN services, at least the subset of abusers that might also use Tor.
So yes, letting them use Tor will let them in a little more, but it's not a huge qualitative change compared to not allowing it.
Not allowing the access, however, blocks a significant amount of legitimate contributions too.
Wikipedia has rather heavy-handed admins. I remember being caught in a /17 IP range ban. It was apparently because of a single vandal whose ISP granted dynamic IPs and used carrier-grade NAT. It caught tens of thousands of households. Luckily the damage was not that great, because the ban was only on English Wikipedia, but imho still overkill.
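For a sense of scale: a /17 leaves 32 - 17 = 15 host bits, i.e. 2^15 = 32,768 addresses, which matches the "tens of thousands of households" above. Python's standard `ipaddress` module shows this directly (the example range is arbitrary, not the actual banned block):

```python
import ipaddress

# A /17 leaves 32 - 17 = 15 host bits, i.e. 2**15 = 32,768 addresses.
# 203.0.0.0/17 is just an illustrative range, not the real banned block.
block = ipaddress.ip_network("203.0.0.0/17")
print(block.num_addresses)     # 32768
print(block.network_address)   # 203.0.0.0
print(block.broadcast_address) # 203.0.127.255
```

With carrier-grade NAT on top of that, a single vandal's traffic can appear from anywhere in such a range, which is why admins reach for range bans in the first place.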
I'd be interested to know more about the nature of the actual vandalism and abuse that came from Tor exit nodes prior to Wikipedia implementing the ban. How long did it stay up before being reverted? How difficult was it to track, revert, etc. compared to non-Tor-based vandalism? And how effective a tool was IP-banning for non-Tor-based vandalism at the time?
Plus anything else I haven't thought about regarding the severity of the vandalism during that time.
Not an administrator, but an onlooker during that period.
Before Wikipedia had any explicit policy re. Tor exit nodes, their IPs would typically be blocked anyway -- either under longstanding policy regarding open proxies, or as a result of spam/vandalism edits originating from the IP. Automatically blocking all Tor exit nodes wasn't a huge change in practice; it just meant that the process was automatic (so new exit nodes would be blocked more quickly, and old exit node IPs would be unblocked automatically), and that the block messages for users on those IPs became more informative.
Given Wikipedia's demand for citations, is there really any advantage to allowing anonymous edits? If somebody has privileged information, that may be valuable knowledge, but it's not what Wikipedia was intended for. I'd expect anything like that to be deleted as Original Research, or marked as Citation Needed.
If the citation is public, a non-anonymous person could make the edit as easily as an anonymous one. That's not to say it will necessarily be made, but there doesn't seem to be a case that only one person could make the edit. Allowing anonymous contributions does increase the work force to include people who feel the need to be more secure, but they don't have access to special information that makes them uniquely qualified.
Under your argument, why should Wikipedia even exist at all? If it's sufficient that the information is "out there" -- well then it's all already out there.
The relevant expertise-- which sometimes includes privileged information-- is part of what lets you know which public information is valuable, relevant, and worth the effort to bother including.
A user's reason for protecting their privacy may also have absolutely nothing to do with their possession of any privileged knowledge. It's really impossible to predict what the long-term consequences of compromised privacy are, and we know that any interaction online can be an invitation to abuse by crazy people.
So, for example, as part of some discussion online I might find myself reading an article on some venereal disease. While reading the article I might notice some omissions or errors and decide to fix them. Later, my edits could end up as part of a debate about the content of the article (even if my edits were utterly unobjectionable) ... with an end result of this potentially embarrassing subject turning up in search results about me, or being discovered by a political opponent in an entirely unrelated debate a decade later, and being pulled out of context just to smear me.
This isn't conjectural. I can speak to it personally: For example, over 13 years ago I got in an edit war on Wikipedia over some site policy thing about users including copyright-law-violating images on their user pages. I was a bit of a hothead about it and got myself blocked from editing for 24 hours. I was appropriately chastised for being an idiot about how it was handled. The edits all ultimately went through. But I get regularly slandered over it by abusive anonymous accounts that are mad at me about unrelated Bitcoin debates (and can find literally nothing else negative to say about me). They love to characterize it in various ways ("fired from wikipedia!"), divorce it from the context, and yank out completely inaccurate off-the-cuff comments from other wikipedians made during the event (apparently some random troll was mistaken for me for a little bit during discussions about the incident).
I would have been much better off contributing anonymously as a result. And for the little personal benefit I got out of contributing, I would have been better off not contributing at all rather than end up with this nonsense.
Is Wikipedia or the world really a better place where only people who either fail to make the above calculation correctly or whom expect some big personal pay-off by contributing are left editing the site? I don't think so.
Totally makes sense. Fun fact: In the past i was involved in a company that implemented an anonymous feedback system. The feedback system did so well... that they shut it down in a month and never discussed the results. It uncovered too many problems that nobody in management wanted to address. This company is on the verge of bankruptcy now, 5 years after those surveys.
When people are allowed to speak up freely, truth will come out quickly. (Yes, you will get some noise too, but i'd rather adjust my signal-to-noise ratio than just deal with meaningless noise all the time.)
Not going to happen so long as self-appointed kings decide what gets to stay. Exhaustive list of Pokemon? Sure thing. "Non-notable" programming language? Not worthy.
Equally, everything now on Bulbapedia was within the scope of the project initially, so per-generation Pokémon stats are feeling the same exclusion that programming languages are. This piece is pretty good on the topic: https://www.gwern.net/In-Defense-Of-Inclusionism
Yep, I see so many pages get deleted due to "non-notability". Meanwhile you see exhaustive content about temporarily-popular subjects like a complete plot synopsis for an entire TV series, plus detailed information about every location ever described in that series. Like this comprehensive list of all the characters in the ReBoot series[0]. Hey, I liked the series at the time, but come on. I can't help but feel a bit frustrated when valuable information is deleted permanently, but content hundreds of times the size persists for super-obscure topics that a couple people feel passionate about enough to fight for and rally their fellow account-holders to vote for.
There should be a flag indicating that content is considered 'Not Notable' by the admins so it is easy to ignore in searches if you so desire.
The idea that content is actually being deleted because it isn't considered 'notable' by some person in 2020 is sad. Who knows what will be considered 'notable' in 2050?
>"Wikipedia has tried to block users coming from the Tor network since 2007, alleging vandalism, spam, and abuse. This research tells a different story: that people use Tor to make meaningful contributions to Wikipedia, and Tor may allow some users to add their voice to conversations in which they may not otherwise be safely able to participate."
A few philosophical observations:
>"Wikipedia has tried to block users coming from the Tor network since 2007, alleging vandalism, spam, and abuse."
Question #1. If let's say a country like China blocks users and/or content, citing "vandalism, spam, and abuse" as their reasons for doing so, then is this Censorship, or is this blocking content/users for "vandalism, spam, and abuse"?
That is, how exactly does blocking content/users for "vandalism, spam, and abuse" differ from Censorship -- if the net effect is the same in both cases?
See, whatever reasons we give as criteria for this distinguishing process -- we must then be able to apply them equally to the other party.
If we say that it's OK for a company to block users based on "vandalism, spam, and abuse" -- then how do we know that the Chinese government (or any other government that engages in censorship) does not apply the exact same criteria when it apparently censors, and if so, is it justified in those content takedowns?
In other words, if we're saying that it's OK for Wikipedia to engage in its actions, then why is China not justified in doing the same thing?
And if it's not OK for China to do, then why is it justified for Wikipedia to do?
Disclaimer: I am neither for nor against China, and I am neither for nor against Wikipedia.
I merely think that it would make for an interesting philosophical debate as to why one's actions are justified, and the other's actions are not, IF BOTH SETS OF ACTIONS RESULT IN THE SAME EFFECT.
Help me to understand. Pretend I am Socrates, that is, I claim to know nothing, and it's your goal to educate me...
It doesn't seem to discuss the justification for considering the ban in the first place. I can imagine, as an example, Exxon-Mobil trying to dominate the climate change discussion through anonymous edits (though in fact they weren't fully anonymous: they were retroactively caught trying to modify relevant pages from IPs associated with their business).
Technically speaking, does Wikipedia just keep the IPs of all exit nodes? Other than that, I'm curious what attributes designate traffic as "Tor" traffic.
Tor is a pretty centralized architecture: the client using Tor gets to choose its path through the Tor network, so it needs to know all the nodes in the system in order to construct a valid path.
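Because the node list is public, the Tor Project publishes the current exit-node IPs (e.g. the bulk exit list at check.torproject.org), and a site can periodically fetch that list and match incoming connections against it. A minimal sketch of the matching side, assuming the one-IP-per-line list format; the sample addresses here are made up, and a real deployment would fetch and refresh the list on a schedule:

```python
def parse_exit_list(text: str) -> set:
    """Parse a one-IP-per-line exit list, skipping comments and blank lines."""
    return {
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    }

def is_tor_exit(ip: str, exits: set) -> bool:
    """True if the connecting IP is a known exit node."""
    return ip in exits

# Made-up sample standing in for a periodically refreshed download.
sample = "# exit list\n192.0.2.10\n198.51.100.20\n"
exits = parse_exit_list(sample)
print(is_tor_exit("192.0.2.10", exits))   # True
print(is_tor_exit("203.0.113.1", exits))  # False
```

MediaWiki's actual exit-node blocking is more involved than this, but the underlying signal is the same public list.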
This proposal echoes my concerns and ideas very closely, thank you. Unfortunately, the project is “currently in very early phases,” there’s no “particular deadline,” and there’s a lot of opposition to the proposal [0], so perhaps it’s wishful thinking after all.
Doing so would diminish the user community's ability to self-police those edits somewhat. It's important to understand that Wikipedia is a public collaboration to a much greater degree than any other large website is -- it's more like a big open source software project.
Wikipedia's viability depends critically on hundreds of millions of dollars a year of uncompensated volunteer labor by self-selecting contributors -- and a major component of that is anti-vandalism.
As my sibling comment laments, Wikipedians directly feel the pain of any reduction in their anti-abuse toolchest -- while the contributor's privacy is almost a pure externality.
It's trivial to create an account, so any contributor that cares can self-serve a solution-- at least if they're sophisticated enough. If they're not even that sophisticated they're not exactly likely to make a case for this kind of change.
It also provides a bit of a false sense of privacy, since the information is still retained by the site and available to many -- though the same situation exists for logged-in users.
I think advocating access via Tor would be more important: it's unfortunate to create a situation where a user's life or freedom might be endangered by their contributions if an unethical-but-lawful order (or an accidental data breach) forces Wikimedia to expose them.
As for privacy, allowing everyone to associate your IP address with your edit history is quite different than only allowing the website to see it.
The reason I bring this up is because there have been quite a few times where I've wanted to make a quick copy edit, but I don't, because I don't want the edit and the name of the article I read to be permanently associated with my IP address. Preferring anonymity and the ability to change identities, I hate having a permanent account for online interactions. I've made a lot of temporary accounts on various websites, but most of the time it's not worth creating a temporary account for a tiny edit on Wikipedia. So I don't bother.
Maybe I’m a unique case, but I’m still willing to bet that there would be more interaction from non-logged-in users on Wikipedia if their IP address were hidden.
(As a side note, the reason why I’ve stuck with this HN account for so long is because there is a limit to how many accounts an IP address can make before getting banned.)
I guess you're one of the few with a static address, where doing something via your IP can automatically be connected to you as a person. I've made plenty of Wikipedia contributions, but going to "my" user page with my current IP, I find none. I don't care if someone can connect the IP with my contributions.
But I completely understand if you're making more "risky" edits and/or have an IP address attached to you like that. Then it makes sense. And if it's too much to create an account for the tiny edit, I guess it'll be left as it is. Some things I see on Wikipedia are wrong too, but the work to verify, archive, link and do the edit is more than the edit is worth in terms of shared knowledge, so I skip it too.
On the other hand, I quite like that you can connect the ASN with edits on Wikipedia. Makes it very obvious which organizations are trying to influence what, and in which direction. But maybe the information doesn't have to be fully public; it could require a different access level than an ordinary user has.
I don’t have a static IP (but I do share one), and I don’t really make any “risky” edits. Yet the thought of my online activity potentially being tied back to myself makes me uncomfortable enough to not contribute.
> But how does allowing everyone to see IP addresses help with anti-vandalism? There’s already the CheckUser if there needs to be an IP-block ban.
It scales better.
Pretty much just that.
> Maybe I’m a unique case, but I’m still willing to bet that there would be more interaction from non-logged-in users on Wikipedia if their IP address were hidden.
You're not. Though WP would generally prefer you not use throwaway accounts (as would HN, for that matter!).
It's unclear how many people it discourages-- a lot of people manage to make edits without realizing their IP will be displayed because they miss the warnings. Users asking the site to delete edits that were accidentally IP-exposing is a not-infrequent support request.
Users suffering unwelcome exposure is largely an external cost (except for those support requests, and the invisible long term discouragement-to-contribution).
Another thought: If anyone can make an account to hide their IP anyways, why not hide the IP of all users? But I guess the answer is that enough vandals don’t create accounts to warrant keeping IP addresses public.
And other users can make an account easily enough to not bother fixing it.
Probably on the balance, considering everyone's costs and risks it would be better to make IP edits private. But the decision to do this is made by Wikipedians, and considering just their costs, favours not doing it more strongly.
Wait, am I just completely misreading this or is this contradicting itself?
> the research team found that Tor users made similar quality edits to those of IP editors [...] and first-time editors. The paper notes that Tor users, on average, contributed higher-quality changes to articles than non-logged-in IP editors.
Is it similar quality or higher-quality now? The text also appears (word for word) on the linked website at nyu.edu. Reading the original paper, guess what I found?
> Using hand-coded data and a machine-learning classifier, we estimated that edits from Tor users are of similar quality to those by IP editors and First-time editors. We estimated that Tor users make more higher quality contributions than other IP editors, on average, as measured by PTRs.
Almost the same wording and contradiction again. There is a subtle change, namely that the "of similar quality" judgement is a result of hand- and machine-classifying, and "more higher quality contributions" is the judgement of a metric called PTR* . There might also be a difference between "similar quality edits" and "more high quality edits" (e.g. Tor users might do more crap edits and more great edits by one metric but simply be about average by another metric), but I'm not sure if that's just random variation in phrasing or intentional.
* PTRs are "persistent token revisions". I don't find it very succinctly/adequately explained at first use, but probably if you read the whole paper it makes more sense. To my understanding, it's basically just how much of the contribution was later changed (within a fixed number of subsequent edits), presuming that if it was largely left unchanged, it was probably a welcome and correct edit.
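As a toy illustration of the persistent-token idea (my own much-simplified version: whitespace tokens, survival counted over a fixed window of subsequent revisions; the paper's actual PTR measure is more involved):

```python
def persistent_token_rate(edit_tokens, later_revisions, window=5):
    """Toy persistence metric: the fraction of an edit's added tokens
    that are still present in each of the next `window` revisions.
    Simplified stand-in for the paper's persistent-token-revision idea."""
    added = set(edit_tokens)
    if not added or not later_revisions:
        return 0.0
    surviving = 0
    for rev in later_revisions[:window]:
        surviving += len(added & set(rev.split()))
    return surviving / (len(added) * min(window, len(later_revisions)))

# An edit adds two tokens; "cited" survives both later revisions,
# "vandalism" is removed immediately, giving 2 survivals out of 4 chances.
rate = persistent_token_rate(
    ["cited", "vandalism"],
    ["the cited source remains", "the cited source remains intact"],
)
print(rate)  # 0.5
```

The intuition is exactly as described above: an edit whose text is largely left in place by later editors was probably a welcome, correct contribution.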
While the article and paper are all positive, I'm not sure whether this might just be because of course we'd all love to hear how great Tor users are (many of us are also Tor users: we like to think of ourselves as freedom fighters, privacy advocates, etc.), but I'm not sure that's what this unambiguously shows. Perhaps it's worth the moderation effort to unban them, perhaps not. The paper does acknowledge this to an extent: "We simply cannot know if our sample of Tor edits is representative of the edits that would occur if Wikipedia did not block anonymity-seeking users."
Perhaps, instead of ban vs unban, we just need another system to anonymously contribute changes, like a moderation queue, which would make it less attractive for vandalism.
(On StackOverflow/StackExchange, anyone can edit without logging in and not even your IP address is shown. While moderating it, I very very rarely see trolls or spambots there. I'm not sure if that's because of some magic system I don't know about or if it's simply because a manual review filters all the garbage and there is no point in trying.)
Oh cool, that explains why we see so little spam. Still though, Wikipedia is also a big site with a huge audience, they must have similar issues and perhaps protections. Assuming there's more to it than just IP bans, the same could be applied to Tor plus a review queue to make non-spam vandalism also unattractive.
I also read the paper. Depending on what metric you choose, it is either similar quality (based on revert numbers) or higher-quality (based on PTR - basically the characters that survive subsequent edits).
> While the article and paper are all positive... I'm not sure that's what this unambiguously shows.
You are absolutely correct!
The paper actually shows that Tor edits were much worse than any other group ("Overall, 41.12% of Tor edits on article pages are reverted, while only 30.3% of IP edits, 35.2% of First-time edits, and 5.5% of Registered edits are reverted.") before the ban was implemented.
After the 2013 ban they "consider the small number of edits made in this later period. Using the methods described above, we computed the reversion rate and the PTRs for the population of 536 edits made after 2013 along with the same number of time-matched edits from other groups as described above." - and THIS dataset is what the headline comes from!
So this positive conclusion is based on these 536 edits, not the much bigger dataset! Additionally, they filter out edits to non-article pages where "this analysis provides evidence that Tor was used to engage in edit warring in violation of Wikipedia policy."
It's a pity, but this paper doesn't really support the idea that Tor users are good Wikipedia editors.
I use Tor for most of my browsing these days. While I'm under no illusion that it protects me from the US government (it IS NSA software after all), I'm fairly confident that it does protect me against non-15-eyes and corporate spying and data collection.
https://en.wikipedia.org/wiki/Wikipedia:IP_block_exemption
https://en.wikipedia.org/wiki/Wikipedia:Advice_to_users_usin...