The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

Archive for the ‘Cloud’ Category

GitLab pages issues today again? (and report on 2023-10-30: Gitlab.com is down (#17054) · Issues · GitLab.com / GitLab Infrastructure Team / production · GitLab)

Posted by jpluimers on 2024/03/12

Still working on handling open Chrome tabs after having moved in the period that GitLab had quite a few issues causing my PagerDuty alerts to go wild.

Today PagerDuty gave me 7 calls in 4 hours again (see [Wayback/Archive] Jeroen Wiert Pluimers @wiert@mastodon.social on X: “@gitlab Since 20240312T1727Z I get PagerDuty alerts from HetrixTools for some pages hosted on GitLab. It would be nice if someone could have a look at gitlab.com/gitlab-com/gl-infra/production/-/issues/17717”).

In addition, I need to check if anything made it to the GitLab issue list from the 20230827 connectivity issues I mentioned at [Wayback/Archive] Jeroen Wiert Pluimers @wiert@mastodon.social on X: “Is it @gitlab hosting having transcontinental issues, or are other continental connections affected as well? These are from two different *.gitlab.io pages as measured via @HetrixTools . No issues are listed at status.gitlab.com.”

Back then, this was the most important one: [Wayback/Archive] GitLab System Status: GitLab.com availability issues – October 30, 2023 15:39 UTC

Likely because of this, wiert.me.gitlab.io had been down for a while as well on 20231031 (see [Wayback/Archive] wiert.me.gitlab.io (Recent History) – HetrixTools down from 2023-10-30T15:24Z until 2023-10-30T16:14Z for 3 + 3 + 11 + 27 = 44 minutes.)

Back then, the hardest part was to quickly find out if there was indeed an issue being investigated at all.

The GitLab status multi-media account on Twitter just points to the status page, which makes it hard to find the underlying issue.

I didn’t archive that one in time, but when I got the alerts it didn’t show anything, and when it was resolved it was already beyond the cut-off timestamp to mark it as “same day”. The [Wayback/Archive] GitLab System Status graph didn’t show much down-time:

Read the rest of this entry »

Posted in *nix, Cloud, Development, DVCS - Distributed Version Control, GitLab, hetrixtools, Infrastructure, Monitoring, PagerDuty, Power User, Software Development, Source Code Management | Leave a Comment »

b0rk does fun things with DNS: CNAME records at the root of the domain; technically not allowed, definitely not recommended, but somehow work for web browsing

Posted by jpluimers on 2023/12/21

[Wayback/Archive] 🔎Julia Evans🔍 on Twitter: “I’ve always heard that you can’t create CNAME records at the root of the domain. But apparently you can? It seems to work fine as far as I can tell but I’m curious about the possible consequences. (yes, I registered cnameroot.com just to make this tweet) “
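One commonly cited reason for the "technically not allowed" part: RFC 1034 says a name that has a CNAME record should carry no other data, while the root of a zone always needs at least SOA and NS records. A minimal sketch of that rule in Python follows. It does not query real DNS; the `(name, type)` tuples, the `example.com` zone and the function name `cname_conflicts` are made up for illustration, and DNSSEC record types (which may coexist with a CNAME) are ignored for simplicity.

```python
from collections import defaultdict


def cname_conflicts(records):
    """Return the names that have a CNAME next to any other record type.

    `records` is an iterable of (name, rtype) tuples; this input format is
    an assumption for the sketch, not a real zone-file parser.
    """
    types_by_name = defaultdict(set)
    for name, rtype in records:
        # Normalise case and the optional trailing dot before grouping.
        types_by_name[name.lower().rstrip(".")].add(rtype.upper())
    return sorted(
        name
        for name, types in types_by_name.items()
        if "CNAME" in types and len(types) > 1
    )


# Hypothetical zone: the root needs SOA and NS, so a CNAME there conflicts,
# while the CNAME on the www subdomain stands alone and is fine.
zone = [
    ("example.com.", "SOA"),
    ("example.com.", "NS"),
    ("example.com.", "CNAME"),
    ("www.example.com.", "CNAME"),
]
print(cname_conflicts(zone))
```

Only the root name is reported here, which matches the idea that the conflict comes from the mandatory root records rather than from the CNAME itself.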

Read the rest of this entry »

Posted in Cloud, Cloudflare, DNS, Infrastructure, Internet, Power User | Leave a Comment »

Looking for maintainer(s) for fritzcap (Python project that captures calls from a Fritz!Box)

Posted by jpluimers on 2023/07/12

Given my health uncertainty, I am looking for maintainers for the fritzcap project (it captures calls from a Fritz!Box modem/router and is written in Python).

History

The fritzcap project was originally started in 2007 by [Wayback/Archive] spongebob | IP Phone Forum, first as a binary fritzcap.exe Windows executable (see his first post at [Wayback/Archive] FritzBox: Tool für Etherreal Trace und Audiodaten-Extraktion | IP Phone Forum). In 2010 it became an open source Python project at [Wayback/Archive] Google Code Archive – Long-term storage for Google Code Project Hosting.

Read the rest of this entry »

Posted in About, Audio, Cloud, Communications Development, Containers, Development, Docker, ffmpeg, Fritz!, Fritz!Box, fritzcap, Hardware, HTTP, Infrastructure, Internet protocol suite, Media, Network-and-equipment, Personal, Power User, Python, Scripting, Software Development, TCP | Leave a Comment »

An unexpected turn of events when Jeff Geerling posted “I’m hosting my website on a FARM!”

Posted by jpluimers on 2023/07/06

Some links on the unexpected turn of events after [Archive] Jeff Geerling (@geerlingguy) / Twitter posted

First his site got more traffic because of the post, then within an hour traffic exploded because of a DDoS overflowing both his Raspberry Pi cluster and his mobile data capacity.

Jeff will likely do blog posts on these and update the underlying GitHub repository at [Wayback/Archive] geerlingguy/turing-pi-2-cluster: Turing Pi 2 Cluster , but until then (since his Tweets were not threaded), this is what happened on 20220209 as it taught me a few bits:

Read the rest of this entry »

Posted in Cloud, Cloudflare, Containers, Development, Docker, Hardware Development, Infrastructure, Internet, Kubernetes (k8n), LifeHacker, OpenSpeedTest, Power User, Raspberry Pi, SpeedTest | Leave a Comment »

GitLab pages on a custom domain are nice, but be aware of intermittent 502 and certificate errors

Posted by jpluimers on 2023/06/05

Reminder to self: GitLab pages on gitlab.com are free, so Setting up a GitLab project so it is served over https as a gitlab.io and a custom subdomain comes with two caveats:

  1. Intermittent HTTP error 502 Bad Gateway
  2. Intermittent NET::ERR_CERT_COMMON_NAME_INVALID (Chrome) or SSL_ERROR_BAD_CERT_DOMAIN (Firefox):
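Because both errors are intermittent, a single manual reload tells you little; repeating the request and counting outcomes gives a better picture. Below is a rough sketch using only the Python standard library. The labels (`bad_gateway`, `cert_error`, and so on) and the function names are my own invention, and treating `ssl.SSLCertVerificationError` as the Python-side counterpart of the browser certificate errors above is an assumption.

```python
import ssl
import urllib.error
import urllib.request
from collections import Counter


def classify(exc):
    """Map a probe exception to a coarse label (labels are made up here)."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return "bad_gateway" if exc.code == 502 else f"http_{exc.code}"
    # urlopen wraps TLS verification failures in URLError.reason.
    if isinstance(exc, urllib.error.URLError) and isinstance(
        exc.reason, ssl.SSLCertVerificationError
    ):
        return "cert_error"
    return "other_error"


def probe(url, timeout=10):
    """Fetch `url` once and return "ok" or a label from classify()."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return "ok"
    except OSError as exc:  # URLError and HTTPError are OSError subclasses
        return classify(exc)


def tally(url, attempts=20):
    """Probe `url` several times and count how often each outcome occurs."""
    return Counter(probe(url) for _ in range(attempts))


# Example use (performs real requests; the host name is a placeholder):
# print(tally("https://pages.example.org/"))
```

A tally with mostly `ok` and a few `bad_gateway` or `cert_error` entries is what "intermittent" looks like from the client side.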

–jeroen

Read the rest of this entry »

Posted in Cloud, Development, GitLab, Infrastructure, Power User, Source Code Management | Leave a Comment »

The CPU load average metric often is not a good one to alert on

Posted by jpluimers on 2023/04/20

Boy I wish threads with more than one person could be saved by the ThreadReaderApp.

Anyway:

[WayBack] Thread by @mipsytipsy: oh boy.. i was just idly musing over how the single most ubiquitous/useless metric is “CPU load average”, lol i wonder if you could use CPU…

oh boy.. i was just idly musing over how the single most ubiquitous/useless metric is “CPU load average”, lol

i wonder if you could use CPU load alerts to score how modern and powerful a team’s toolchain is, like a Waffle House Index for tooling. 🤔

…oh oh! but i was gonna say, this thread between @drk and @shelbyspees is a killer nanotutorial in how to ask better questions about your code — where to start, how to drill down and dig in, how to instrument, and how to approach such an open-ended exploratory jaunt. 👏🐝❤️

it’s a really good illustration of this thing we end up saying all the time, which is “don’t fear the future, it is simpler and clearer and *easier* here! the way you are doing it NOW is the hard way!” 😖

time for cpu load average to go the way of the PC LOAD LETTER …

Read the rest of this entry »

Posted in *nix, Cloud, Development, DevOps, Infrastructure, Power User, Software Development, Systems Architecture | Leave a Comment »

smile.amazon.de: AmazonSmile Is Ending (in fact it already has ended)

Posted by jpluimers on 2023/04/07

I missed this 2 weeks ago, but in fact “AmazonSmile Is Ending” means it has already ended. Amazon.de purchases do not help charity any more.

[Wayback/Archive] smile.amazon.de

AmazonSmile Is Ending

Thank you for your support over the past decade. We appreciate your help in donating more than $450 million across hundreds of thousands of charities worldwide. Moving forward, we are excited to focus on other philanthropic initiatives.

You can continue shopping on http://www.amazon.de for the same selection of products you know and love.

in German:

AmazonSmile wird eingestellt

Wir danken Ihnen für Ihre Unterstützung in den letzten sieben Jahren. Mit Ihrer Hilfe konnten hunderttausende gemeinnützige Organisationen weltweit mit mehr als 450 Millionen US-Dollar unterstützt werden. Amazon wird sein soziales Engagement künftig auf andere Bereiche fokussieren.

Sie können weiterhin auf http://www.amazon.de einkaufen, wo Sie die gleiche Produktauswahl vorfinden.

–jeroen

Posted in Amazon.com/.de/.fr/.uk/..., Cloud, Infrastructure | Leave a Comment »

Need to take a look at Scaleway

Posted by jpluimers on 2023/04/05

Based on

I need to take a look at Scaleway, at least at the links via [Wayback/Archive] scaleway instance – Google Search:

Related blog post: Dave Anderson on Twitter: “Cool minor @Tailscale moment: I’m recommissioning a server that got moved from a different network, so all its network config was wrong, and generally I couldn’t get at it over the network, only IPKVM console. But then my ping over Tailscale started working?!” / Twitter

–jeroen

Posted in Cloud, Infrastructure, Power User, Scaleway, Scoop, Windows | 1 Comment »

howisthecloud (@howisthecloud) / Twitter

Posted by jpluimers on 2023/02/09

Getting the latest cloud outages is easy: just follow [Archive] howisthecloud (@howisthecloud) / Twitter

Feeding cloud provider statuses from all over the Internets. Currently tweeting AWS,SFDC,Heroku,GoogleApps,Azure,Sakura,Rackspace,Pivotal statuses. By

I used it to keep an eye on the December 2021 US-EAST-1 AWS outage I wrote about in Does it still hold: “Never keep anything important on AWS in US-EAST-1”?.

–jeroen

Posted in Amazon.com/.de/.fr/.uk/..., AWS Amazon Web Services, Azure Cloud, Cloud, Infrastructure, Uncategorized | Leave a Comment »

Does it still hold: “Never keep anything important on AWS in US-EAST-1”?

Posted by jpluimers on 2023/01/31

Reminder to self to check if this still holds: [Archive] Varun Krishnan on Twitter: “Never keep anything important on AWS in US-EAST-1” / Twitter

Slightly more than a year ago, the Amazon Web Services region US-EAST-1 collapsed with world-wide downtime consequences for many AWS services. It took some 8 hours to recover most of the services.

Before that, it was plagued with outages, maybe because it was their first ever region:

The outage was covered many times. I have included this El Reg link, as I like their tone of voice: [Wayback/Archive] AWS technical woes in US East region cause widespread outage • The Register.

Basically, any cloud stack is founded on these three layers:

  • Storage (S3 or Simple Storage Service in AWS speak)
  • Compute (EC2 or Elastic Compute Cloud in AWS speak)
  • Authentication and Authorisation (IAM or Identity and Access Management in AWS speak)

On top of that, any other services are implemented. And for Amazon Web Services, many of these have become available over the last two decades.

Indeed Anders Borum was right in his tweet: US-EAST-1 is the first ever AWS EC2 region and started in 2006, more than 15 years ago. It is also the region with the largest capacity. Likely both play a role in US-EAST-1 being part of, or the initiating factor in, many of the major AWS outages. If you look at all AWS outages, US-EAST-1 plays a role in most if not all outages since 2017.

So for now, if hosting at AWS, I would host outside of US-EAST-1.

Depending on the kind of application and money involved, I would consider hosting in multiple regions, and if a truckload of money was involved: hosting on multiple clouds.

I fully agree with [Archive] Gergely Orosz on Twitter: “If you were impacted by the recent AWS outage, the decision to invest in multi-cloud / multi-datacenter is simple: How much did this outage cost you vs the cost of adding a (lot) more complexity & maintenance with multi-cloud/DC? If outage cost >> this, only then do it.” / Twitter

Some more insight on multi-cloud hosting is via [Archive] Redmond on Twitter: “New feature from @jdanton: A full post-mortem from AWS is still to come, but in the meantime, IT pros should start bolstering their cloud disaster recovery strategies now — before the next outage. https://t.co/ios5Re5ZCs” / Twitter at [Wayback/Archive] AWS Outage Fallout: What Lessons You Should Learn — Redmondmag.com

Is It Time to Go Multicloud?

No. Well…if you are running a major property with a big customer-facing presence, it can be a good strategy to have static Web and app content hosted in a second cloud. In the case of an outage like yesterday’s, you’d have the option to direct traffic to the static presence, which can supply some level of experience for your users.

A good example of how this approach can be useful is an outage dashboard. Whenever a cloud provider has an outage, they are notoriously bad at properly reporting ongoing status. This is because they have hosted their dashboards in their own clouds using their own APIs — and when these APIs go down, they take the monitoring with them. Using DNS, you can quickly redirect traffic to this static site, where your engineers can update the page with status updates.
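The "redirect traffic using DNS" idea from the quote boils down to a small decision: keep pointing at the primary site unless it has clearly failed, then switch to the static copy in the other cloud. A minimal sketch of just that decision in Python is below. The host names, the function name `choose_target` and the threshold of three consecutive failures are all assumptions for illustration; actually changing the DNS record depends on your DNS provider and is left out.

```python
def choose_target(recent_checks, primary, fallback, threshold=3):
    """Pick which host DNS should point at.

    `recent_checks` is a list of health-check results, oldest first,
    where True means the primary answered correctly. The fallback is
    chosen only after `threshold` consecutive failures, so a single
    intermittent error does not flip the record back and forth.
    """
    last = recent_checks[-threshold:]
    if len(last) == threshold and not any(last):
        return fallback
    return primary


# Hypothetical host names for the primary site and the static status page.
PRIMARY = "app.example.com"
FALLBACK = "static-status.example.net"

print(choose_target([True, False, False, False], PRIMARY, FALLBACK))
print(choose_target([False, False, True], PRIMARY, FALLBACK))
```

The first call prints the fallback host (three failures in a row), the second keeps the primary because the most recent check succeeded.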

Related

–jeroen

Read the rest of this entry »

Posted in AWS Amazon Web Services, Cloud, Cloud Development, Deployment, Development, DevOps, Infrastructure, Power User, Software Development | Leave a Comment »