The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Automated “take-down” algorithm simulation: thread by @AlecMuffett: “Regards Article13, I wrote up a little command-line false-positive emulator; it tests 10 million events with a test (for copyrighted material […]” #Article13

Posted by jpluimers on 2018/07/08

Via [WayBack] Artikel 13 (Uploadfilter) vs. Math – Math wins – Kristian Köhntopp – Google+:

Simulating the effects of the proposed law is easy: [WayBack] Thread by @AlecMuffett: “Regards Article13, I wrote up a little command-line false-positive emulator; it tests 10 million events with a test (for copyrighted material) […]” #Article13

What it shows is that an automated test for content originality only breaks even when a sizeable share of uploads actually are copyrighted material rather than original content:

about 1 in 67 postings have to be “bad” in order to break even

So if fewer than 1% of uploads are actually bad, then even with 98.5% accuracy (which is very, very good for a take-down algorithm!) you will piss off far more people through good items wrongly marked bad (false positives) than you will catch bad items correctly marked bad.
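The “1 in 67” break-even figure can be reproduced with a quick back-of-the-envelope calculation (assuming “98.5% accurate” means the filter is right 98.5% of the time on both good and bad uploads):

```python
# Assumption: "98.5% accurate" means both the true-positive rate (bad
# uploads flagged) and the true-negative rate (good uploads passed)
# are 0.985, so 1.5% of good uploads get wrongly flagged.
accuracy = 0.985
fpr = 1 - accuracy  # false-positive rate on good uploads
tpr = accuracy      # true-positive rate on bad uploads

# Break even when (bad fraction * tpr) == (good fraction * fpr):
#   b * tpr = (1 - b) * fpr  =>  b = fpr / (tpr + fpr)
break_even = fpr / (tpr + fpr)
print(f"break even when about 1 in {round(1 / break_even)} uploads is bad")
```

Below that bad-upload rate, false positives outnumber correct take-downs.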

When the accuracy drops further, you piss off even more original-content uploaders, while also catching fewer copyrighted-material uploads.

This phenomenon goes by the far less “sexy” term False positive paradox – Wikipedia, which is a specialisation of the even more dull-sounding Base rate fallacy – Wikipedia.
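To see how the base rate dominates, here is a small sketch applying Bayes’ theorem with the example numbers from the original thread (1 bad upload in 10,000, 99.5% accuracy); these rates are the thread’s illustration, not measured data:

```python
# Hypothetical rates from the thread's example: 1 in 10,000 uploads is
# bad, and the filter is 99.5% accurate on both good and bad uploads.
bad_rate = 1 / 10_000
accuracy = 0.995

# Bayes' theorem: P(actually bad | flagged)
p_flagged_bad = bad_rate * accuracy               # true positives
p_flagged_good = (1 - bad_rate) * (1 - accuracy)  # false positives
p_bad_given_flagged = p_flagged_bad / (p_flagged_bad + p_flagged_good)
print(f"{p_bad_given_flagged:.1%} of flagged uploads are actually bad")
```

So even with a filter that is right 99.5% of the time, roughly 98% of the take-downs hit innocent uploads.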

Source code: [WayBack] random-code-samples/falsepos.py at master · alecmuffett/random-code-samples · GitHub
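The linked falsepos.py is Alec Muffett’s own emulator; as a hedge against link rot, here is an independent sketch of the same Monte Carlo idea (the function name and defaults are mine, chosen to match the numbers in the tweet):

```python
import random

def simulate(events=10_000_000, bad_rate=1 / 10_000,
             accuracy=0.995, seed=2018):
    """Monte Carlo take-down filter: returns (true_pos, false_pos)."""
    rng = random.Random(seed)
    true_pos = false_pos = 0
    for _ in range(events):
        bad = rng.random() < bad_rate
        # The filter is right with probability `accuracy`, whether the
        # upload is bad (flag it) or good (pass it).
        flagged = rng.random() < (accuracy if bad else 1 - accuracy)
        if flagged and bad:
            true_pos += 1
        elif flagged:
            false_pos += 1
    return true_pos, false_pos

tp, fp = simulate(events=1_000_000)  # 1M events keeps the run quick
print(f"{tp} bad uploads caught, {fp} good uploads wrongly taken down")
```

With these rates the false positives outnumber the correct take-downs by roughly fifty to one, which is the whole point of the thread.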

Original thread:

[WayBack] Alec Muffett on Twitter: “Regards #Article13, I wrote up a little command-line false-positive emulator; it tests 10 million events with a test (for copyrighted material, abusive material, whatever) that is 99.5% accurate, with a rate of 1-in-10,000 items actually being bad.… https://t.co/CJvxdvkiom”

https://twitter.com/alecmuffett/status/1015594170424193024

and

[WayBack] next_ghost on Twitter: “And for the nerds who want to learn more, this is called a “False positive paradox”. https://t.co/CIvw2ni21q… “

 

–jeroen

One Response to “Automated “take-down” algorithm simulation: thread by @AlecMuffett: “Regards Article13, I wrote up a little command-line false-positive emulator; it tests 10 million events with a test (for copyrighted material […]” #Article13”

  1. thaddy said

    It is not a paradox, as Karl Popper already showed in the 1930s.
    I still wonder why the powerful concept of falsifiability is structurally overlooked in computer science.
    https://en.wikipedia.org/wiki/Falsifiability
