One of the coolest Twitter bot commands: @AltTextCrew OCR please
Posted by jpluimers on 2021/10/28
Twitter account [Archive.is] @AltTextCrew is cool: it can OCR text from images, which is great for visually impaired people.
Just reply to a tweet containing such an image with the command below, and it replies with a series of tweets containing the text from that image:
@AltTextCrew OCR please
You can also have it check and analyze the links in a tweet; just reply to that tweet with:
@AltTextCrew analyze links
[Archive.is] @hbeckpdx is the driving force behind both @AltTextCrew and [Archive.is] @AltTxtReminder:
- [Archive.is] databass 🏳️⚧️⚢ (@hbeckpdx) | Twitter
[Archive.is] World, I’d like you to meet @AltTextCrew!
Are you a live-tweeter that needs help describing the media they post? DM me on this account.
Are you willing to help them to describe their content? Follow the bot.
Thanks to @fraxstal for the idea!
- [Archive.is] Alt Text Reminder on Twitter:
Just followed? It may take 15 minutes for reminders to start
Sometimes tweets non-bot things about alt text, those will be tagged #AltTxtReminderOOC
To opt-out, block the bot
DMs not checked, message @hbeckpdx instead
Written/Maintained by a sighted person
Edit 20220510: AltTxtReminder got open sourced!
Below are two examples of @AltTextCrew usage:
OCR
- image: [Archive.is] databass 🏳️⚧️⚢ on Twitter: “@AltTextCrew OCR please… “
- text: [Wayback] Thread by @AltTextCrew on Thread Reader App – Thread Reader App
Text 1/5:
CVE-2021-20022 Arbitrary file upload through post- authenticated “branding” feature Like many enterprise products with a web- based user interface, SonicWall Email Security includes a feature known as
Text 2/5:
“branding” which gives administrators the ability to customize and add certain assets to the interface, such as company logos. These branding assets are managed via packages, and new packages can be
Text 3/5:
created by uploading ZIP archives containing custom text, image files, and layout settings. A lack of file validation can enable an adversary to upload arbitrary files, including executable code, such
Text 4/5:
as web shells. Once uploaded, these branding package ZIP archives are normally expanded and saved to the <SonicWall ES install path>\data\branding directory. However, an adversary could place
Text 5/5:
malicious files in arbitrary locations, such as a web accessible Apache Tomcat directory, by crafting a ZIP
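The "Text 1/5" to "Text 5/5" replies above suggest the OCR result is cut at word boundaries into tweet-sized pieces. I do not know how the bot actually does this; the following is only a minimal Python sketch of that idea. The function name `split_into_thread`, the 280 character limit and the "Text i/n: " header format are my assumptions.

```python
def split_into_thread(text: str, limit: int = 280) -> list[str]:
    """Split text into numbered chunks of at most `limit` characters each.

    Hypothetical helper: the name, the limit and the header format are
    assumptions, not taken from the bot's source.
    """
    # Reserve room for the longest header this sketch expects ("Text 99/99: "),
    # so it assumes fewer than 100 chunks.
    budget = limit - len("Text 99/99: ")
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > budget and current:
            # The next word does not fit: close this chunk and start a new one.
            chunks.append(current)
            current = word
        else:
            # Note: a single word longer than the budget is kept whole.
            current = candidate
    if current:
        chunks.append(current)
    total = len(chunks)
    return [f"Text {i}/{total}: {chunk}" for i, chunk in enumerate(chunks, 1)]


if __name__ == "__main__":
    for tweet in split_into_thread("lorem ipsum dolor sit amet " * 30):
        print(len(tweet), tweet)
```

Splitting on whitespace means a word is never cut in half, which matches the example: every chunk above ends on a complete word.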
Link analysis
- Original: [Archive.is] Clinton St. Theater on Twitter: “Did you catch the Oregonian’s article about our Saturday night tradition? @oregonian …”
- Request: [Archive.is] databass 🏳️⚧️⚢ on Twitter: “Analyze links please, @AltTextCrew… “
- Analysis: [Archive.is] In Need Of Description on Twitter: “https://t.co/z0L1581pnY…: Preview has no alt text ❌, 4/5 images have alt text… “
Explanation
- [Archive.is] In Need Of Description on Twitter: “I mostly boost media from live-tweeters in need of description, but if you tag me I can also: OCR images if you reply and include “OCR” Analyze linked pages if you reply and include “Analyze Links” CW: Retweeted media may contain a variety of triggering content!”
- [Archive.is] In Need Of Description on Twitter: “Maintenance has concluded. The bot has a new feature, though it is not yet entirely tested If you @-mention it and include the phrase “analyze links” it will provide a rough analysis of alt text usage on the linked page, as well as info on whether any preview images have alt text”
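A rough analysis like "4/5 images have alt text" can be approximated by counting the `img` elements on a page and checking which ones carry a non-empty `alt` attribute. This is a minimal Python sketch using only the standard library, not the bot's actual code; the names `AltTextCounter` and `summarize_alt_text` are hypothetical, and fetching the page is left out.

```python
from html.parser import HTMLParser


class AltTextCounter(HTMLParser):
    """Counts <img> tags and how many of them have non-empty alt text."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.with_alt = 0

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        self.total += 1
        alt = dict(attrs).get("alt")
        # Assumption: an empty alt="" counts as missing here, even though it
        # is a legitimate way to mark purely decorative images.
        if alt and alt.strip():
            self.with_alt += 1


def summarize_alt_text(html: str) -> str:
    counter = AltTextCounter()
    counter.feed(html)
    return f"{counter.with_alt}/{counter.total} images have alt text"


if __name__ == "__main__":
    page = '<img src="a.png" alt="A cat"><img src="b.png">'
    print(summarize_alt_text(page))
```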
I really want to know what programming languages, frameworks, libraries and APIs they use for this bot.
Edit 20211028:
It uses the Google Vision API, as Tesseract was too slow and inaccurate:
- [Archive.is] Hannah Kolbeck 🏳️⚧️ #GWOut on Twitter: “It uses google’s vision API 🙂… “
- [Archive.is] Hannah Kolbeck 🏳️⚧️ #GWOut on Twitter: “It started out using tesseract, but it was very slow and pretty inaccurate 😕… “
- [Wayback/Archive.is] Cloud Vision documentation | Cloud Vision API | Google Cloud
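For reference, this is roughly what an OCR call to the Cloud Vision REST endpoint `images:annotate` looks like. It is a minimal Python sketch that only builds the request body and reads the text from a response; it makes no network call and leaves out authentication. How the bot itself calls the API is unknown to me: the helper names `build_ocr_request` and `extract_text`, and the choice of the `TEXT_DETECTION` feature, are assumptions.

```python
import base64
import json

# Public REST endpoint for Cloud Vision image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes: bytes) -> str:
    """Build the JSON body for a text detection request (hypothetical helper)."""
    body = {
        "requests": [
            {
                # The image is sent inline as base64 encoded bytes.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                # Assumption: the bot might use another feature type instead.
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }
    return json.dumps(body)


def extract_text(response_json: str) -> str:
    """Return the full recognized text, or an empty string if none was found."""
    responses = json.loads(response_json).get("responses", [])
    if not responses:
        return ""
    return responses[0].get("fullTextAnnotation", {}).get("text", "")


if __name__ == "__main__":
    print(build_ocr_request(b"not really an image"))
    print(extract_text('{"responses": [{"fullTextAnnotation": {"text": "CVE-2021-20022"}}]}'))
```

The body from `build_ocr_request` would be sent with an HTTP POST to `VISION_ENDPOINT`, using an API key or OAuth token.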
Edit 20211211:
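Note that the bot usually publishes the text as alt text on images in its reply; it only falls back to a thread when the OCR result is too long to fit in four images' worth of alt text: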
[Archive] Hannah Kolbeck 🏳️⚧️ on Twitter: “@jpluimers @AltTextCrew No, it always prefers to tweet images with alt text. Right now if the ocr result from the targeted tweet is too long to fit in 4 images worth it will fall back to posting a thread.” / Twitter
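The decision described in that tweet could look something like this. It is a minimal Python sketch under assumptions: I use 1000 characters as the alt text limit per image and 280 characters per tweet, and the name `plan_reply` is hypothetical. The bot's real limits and logic may differ.

```python
import textwrap

ALT_TEXT_LIMIT = 1000  # assumed maximum alt text length per image
TWEET_LIMIT = 280      # assumed maximum tweet length
MAX_IMAGES = 4         # a tweet can carry up to four images


def plan_reply(ocr_text: str) -> tuple[str, list[str]]:
    """Decide between alt text on images and a plain thread (hypothetical)."""
    # Collapse line breaks and repeated spaces from the OCR output.
    normalized = " ".join(ocr_text.split())
    alt_texts = textwrap.wrap(normalized, width=ALT_TEXT_LIMIT)
    if len(alt_texts) <= MAX_IMAGES:
        # Preferred: everything fits in the alt text of at most four images.
        return "images", alt_texts
    # Fallback: too long, so post the text itself as a thread of tweets.
    return "thread", textwrap.wrap(normalized, width=TWEET_LIMIT)


if __name__ == "__main__":
    mode, parts = plan_reply("some recognized text " * 20)
    print(mode, len(parts))
```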
–jeroen
@AltTextCrew OCR please
— Hannah Kolbeck 🏳️⚧️ (@hbeckpdx) April 20, 2021
I mostly boost media from live-tweeters in need of description, but if you tag me I can also:
OCR images if you reply and include "OCR"
Analyze linked pages if you reply and include "Analyze Links"
CW: Retweeted media may contain a variety of triggering content!
— Alt Text Crew (@AltTextCrew) April 19, 2021
Maintenance has concluded. The bot has a new feature, though it is not yet entirely tested
If you @-mention it and include the phrase "analyze links" it will provide a rough analysis of alt text usage on the linked page, as well as info on whether any preview images have alt text
— Alt Text Crew (@AltTextCrew) April 13, 2021
Did you catch the Oregonian’s article about our Saturday night tradition? @oregonian https://t.co/f0yFHQd90I
— Clinton Street Theater (@cstpdx) April 13, 2021
https://t.co/z0L1581pnY…: Preview has no alt text ❌, 4/5 images have alt text
— Alt Text Crew (@AltTextCrew) April 13, 2021
This entry was posted on 2021/10/28 at 06:00 and is filed under OCR, Power User, SocialMedia, Twitter, TwitterBot.