Add basic abuse prevention to UrlShortener
Closed, Resolved (Public)

Description

If I read the spec correctly, UrlShortener will not produce deterministic short URLs. As such, they should be lazily created as needed, afaik.

Overhead should be minimal, but one possible way (depending on the implementation) may be to add a basic ping limiter so that a single user cannot submit hundreds of URLs at the same time. Attackers may do that in an attempt to poison the database and take all the shorter keys, or in an effort to arrive at a specific short ID (if it's based on auto-increment internally, to try and spell some word).

Event Timeline

The extension already has a ping limiter set up?

We currently have ping limiter functionality implemented, but no rate limits set. What would you suggest we make the default be?

As for trying to spell words, I think it would be more likely that an attacker just waits for other people to fill up the database until it gets near their word, and then creates their URL. Of course, since we only hand out new short codes for unique URLs, they'd have to submit unique things each time, giving them only one chance (or they include a dummy query string or anchor that is unique). All that said, I'm not sure it's an attack we should necessarily be worried about, since many external URL shorteners (e.g. tinyurl, bit.ly) allow you to pick your own word when shortening.
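To make the word-spelling scenario concrete, here is a minimal sketch of how sequential IDs could map to short codes, and why a dummy query string gives an attacker more than one attempt. The base-36 alphabet, the `Shortener` class and all function names are assumptions for illustration only; the extension's actual alphabet and storage are not described in this thread.

```python
# Hypothetical alphabet; the real extension may use a different character set.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def encode_id(n: int) -> str:
    """Encode a non-negative integer ID as a short code in base len(ALPHABET)."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    base = len(ALPHABET)
    digits = []
    while n > 0:
        n, rem = divmod(n, base)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_code(code: str) -> int:
    """Inverse of encode_id: the ID at which a given word would be handed out."""
    base = len(ALPHABET)
    n = 0
    for ch in code:
        n = n * base + ALPHABET.index(ch)
    return n


class Shortener:
    """In-memory stand-in: one auto-increment ID per unique URL."""

    def __init__(self) -> None:
        self._id_by_url: dict[str, int] = {}
        self._next_id = 1

    def shorten(self, url: str) -> str:
        # A URL that was already shortened gets its existing code back.
        if url not in self._id_by_url:
            self._id_by_url[url] = self._next_id
            self._next_id += 1
        return encode_id(self._id_by_url[url])
```

Under these assumptions, `decode_code("cat")` tells an attacker which ID to wait for, and appending a unique query string to the same target produces a fresh URL and therefore a fresh ID.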

I suggest no more than 10 per 2 minutes by default for anon and newbie users, applying to the API POST module and/or the dedicated special page.

There'd be no rate limit for native requests that shorten canonical page view urls, which would presumably happen automatically whenever a new page is first viewed, rendered and/or requested through the Query GET API.

Yeah, I think a rate limit could prevent most of this. It makes it a lot harder to try to force a specific, problematic URL. 10 per 2 is probably ok for anon/newbie (building it in allows us to adjust if we see people trying to abuse it anyway). Honestly, for users without rate limit exemption (or higher limits in general) I would think it doesn't need to be set crazy high either (50-100?) but could certainly be higher than 10.
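As a rough illustration of the "10 per 2 minutes" proposal, the sketch below implements a generic sliding-window limiter keyed by user or IP. It is not MediaWiki's ping limiter; the class name, the `clock` parameter and the keys are assumptions made for the example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_hits actions per key within window_seconds."""

    def __init__(self, max_hits: int, window_seconds: float, clock=time.monotonic):
        self._max_hits = max_hits
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # key -> timestamps of accepted actions

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Forget accepted actions that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max_hits:
            return False
        hits.append(now)
        return True


# The limits suggested in the discussion: 10 per 120 seconds for anon/newbie.
anon_limiter = SlidingWindowLimiter(max_hits=10, window_seconds=120)
```

A caller would check `anon_limiter.allow(ip_address)` before creating a short URL and reject the request when it returns `False`. A higher limit for established users would just be a second instance with different numbers.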

Also: How hard is it to deactivate a url if worst comes to worst? Is it just a db row being dropped (or changed) or is it something more massively pita?

Technically: Dropping a database row and having someone in ops purge the URL from varnish.

However, one of the problems with URL shorteners that we're trying to fix is that they are a reliability concern, so I would like to set the expectation that once created, w.wiki URLs will never break. Given that we have a whitelist of domains, what scenarios are you thinking of that would require breaking short links?

Change 327707 had a related patch set uploaded (by Legoktm):
Set default rate limits

https://gerrit.wikimedia.org/r/327707

If people can abuse it, they will. People will create URLs like en.wikipedia.org/wiki/The_real_name_of_Legoktm_is_Santa_Claus that could raise privacy concerns. I would counter abuse with the existing tools we have: make it possible to use AbuseFilter rules (so you don't have to mess with the software) and make it possible for meta admins to delete an entry. Deletion should of course not delete the actual row, but just set a field like rev_deleted in the revision table (see https://www.mediawiki.org/wiki/Manual:Revision_table#rev_deleted ).

Does the extension keep track of who created a URL? If so, where?

Change 327707 merged by jenkins-bot:
Set default rate limits

https://gerrit.wikimedia.org/r/327707

I have a rather wild idea for how to do this. I don't know how hard it would be to implement, but it seems to me that: 1. implementing an admin system is too much work, and 2. currently you can scan through short URLs (by incrementing the ID). So instead of adding the admin system, let's make it hard to scan through short URLs. For example, I can make a short URL in bitly for https://en.wikipedia.org/wiki/The_real_name_of_Legoktm_is_Santa_Claus but it will be useless because bitly generates a random (?) ID for that URL. So the URL shortener would produce something like https://w.wiki/rYEDA3JcQqw instead of a number.
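A minimal sketch of the random-ID idea, assuming an 11-character code drawn from letters and digits (matching the shape of the example above). The alphabet, the length and the in-memory `taken` set are illustrative assumptions, not anything the extension is known to do.

```python
import secrets
import string

# Hypothetical alphabet and length, chosen only to resemble "rYEDA3JcQqw".
CODE_ALPHABET = string.ascii_letters + string.digits
CODE_LENGTH = 11


def random_code(length: int = CODE_LENGTH) -> str:
    """Return an unpredictable code; neighbouring codes cannot be guessed by counting."""
    return "".join(secrets.choice(CODE_ALPHABET) for _ in range(length))


def allocate_code(taken: set[str], length: int = CODE_LENGTH) -> str:
    """Draw codes until an unused one is found, then reserve it in `taken`."""
    while True:
        code = random_code(length)
        if code not in taken:
            taken.add(code)
            return code
```

The trade-off is the one the thread is weighing: random codes make scanning impractical, but they are longer than sequential ones and need a uniqueness check on every allocation.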

I looked into what other URL shorteners do for abuse prevention...and it doesn't look like much. So to resolve this (as the last blocker before deployment), I will create a Special:ManageShortUrls page, which allows privileged users (meta sysops?) to disable a short url by setting urlshortcodes.usc_deleted = 1. If that field is set, then the URL will no longer redirect, and will be redacted from future dumps.
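The behaviour described above (keep the row, set `usc_deleted`, stop redirecting, and leave the entry out of dumps) could look roughly like this. Only the table name `urlshortcodes` and the column `usc_deleted` come from this task; the other columns, the function names and the use of SQLite are assumptions made to keep the sketch self-contained.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table; columns other than usc_deleted are assumed."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE urlshortcodes (
               usc_id INTEGER PRIMARY KEY,
               usc_url TEXT NOT NULL UNIQUE,
               usc_deleted INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return db


def add_url(db: sqlite3.Connection, url: str) -> int:
    cur = db.execute("INSERT INTO urlshortcodes (usc_url) VALUES (?)", (url,))
    return cur.lastrowid


def hide_url(db: sqlite3.Connection, usc_id: int) -> None:
    # Soft delete: the row stays, only the flag changes.
    db.execute("UPDATE urlshortcodes SET usc_deleted = 1 WHERE usc_id = ?", (usc_id,))


def resolve(db: sqlite3.Connection, usc_id: int):
    """Return the redirect target, or None if the code is unknown or hidden."""
    row = db.execute(
        "SELECT usc_url FROM urlshortcodes WHERE usc_id = ? AND usc_deleted = 0",
        (usc_id,),
    ).fetchone()
    return row[0] if row else None


def dump_rows(db: sqlite3.Connection) -> list:
    """Rows that would appear in a public dump: hidden entries are left out."""
    return db.execute(
        "SELECT usc_id, usc_url FROM urlshortcodes WHERE usc_deleted = 0 ORDER BY usc_id"
    ).fetchall()
```

Because the row is never dropped, the short code is not reissued for another URL, and hiding can be undone by resetting the flag.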

Thanks @Legoktm for helping move this forward :)

Change 422070 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/extensions/UrlShortener@master] [WIP] Support hiding short URLs

https://gerrit.wikimedia.org/r/422070

Change 422070 merged by jenkins-bot:
[mediawiki/extensions/UrlShortener@master] Support hiding short URLs

https://gerrit.wikimedia.org/r/422070

Change 496805 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/extensions/UrlShortener@master] Add SpecialManageShortUrls

https://gerrit.wikimedia.org/r/496805

Change 496808 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/extensions/UrlShortener@master] Prevent blocked users from making short URLs

https://gerrit.wikimedia.org/r/496808

Smalyshev triaged this task as Medium priority. Mar 25 2019, 10:31 PM

Change 496808 merged by jenkins-bot:
[mediawiki/extensions/UrlShortener@master] Prevent blocked users from making short URLs

https://gerrit.wikimedia.org/r/496808

Change 496805 merged by jenkins-bot:
[mediawiki/extensions/UrlShortener@master] Add SpecialManageShortUrls

https://gerrit.wikimedia.org/r/496805

Change 499777 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[operations/mediawiki-config@master] Add the 'urlshortener-manage-url' right and enable it for stewards

https://gerrit.wikimedia.org/r/499777

Change 499777 merged by jenkins-bot:
[operations/mediawiki-config@master] Add the 'urlshortener-manage-url' right and enable it for stewards

https://gerrit.wikimedia.org/r/499777

Mentioned in SAL (#wikimedia-operations) [2019-04-02T11:21:02Z] <ladsgroup@deploy1001> Synchronized wmf-config/CommonSettings.php: SWAT: [[gerrit:499777|Add the urlshortener-manage-url right and enable it for stewards (T133109)]], Part I (duration: 00m 53s)

Mentioned in SAL (#wikimedia-operations) [2019-04-02T11:22:26Z] <ladsgroup@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:499777|Add the urlshortener-manage-url right and enable it for stewards (T133109)]], Part I (duration: 00m 51s)

@Ladsgroup Is there documentation about this new user right and special page? I see the basics are covered in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/UrlShortener/+/496805/10/i18n/en.json
But I'm not sure if there are more details that show up for the stewards themselves in https://en.wikipedia.beta.wmflabs.org/wiki/Special:ManageShortUrls (which we obviously cannot see), or if it's mentioned anywhere onwiki yet. Thanks!

We just put up https://meta.wikimedia.org/wiki/Wikimedia_URL_Shortener#Delete_a_link; I will improve it later.