The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Archive for the ‘DevOps’ Category

The CPU load average metric often is not a good one to alert on

Posted by jpluimers on 2023/04/20

Boy I wish threads with more than one person could be saved by the ThreadReaderApp.

Anyway:

[WayBack] Thread by @mipsytipsy: oh boy.. i was just idly musing over how the single most ubiquitous/useless metric is “CPU load average”, lol i wonder if you could use CPU…

oh boy.. i was just idly musing over how the single most ubiquitous/useless metric is “CPU load average”, lol

i wonder if you could use CPU load alerts to score how modern and powerful a team’s toolchain is, like a Waffle House Index for tooling. 🤔

 

…oh oh! but i was gonna say, this thread between @drk and @shelbyspees is a killer nanotutorial in how to ask better questions about your code — where to start, how to drill down and dig in, how to instrument, and how to approach such an open-ended exploratory jaunt. 👏🐝❤️

it’s a really good illustration of this thing we end up saying all the time, which is “don’t fear the future, it is simpler and clearer and *easier* here! the way you are doing it NOW is the hard way!” 😖

time for cpu load average to go the way of the PC LOAD LETTER …


Read the rest of this entry »

Posted in *nix, Cloud, Development, DevOps, Infrastructure, Power User, Software Development, Systems Architecture | Leave a Comment »

Does it still hold: “Never keep anything important on AWS in US-EAST-1”?

Posted by jpluimers on 2023/01/31

Reminder to self to check if this still holds: [Archive] Varun Krishnan on Twitter: “Never keep anything important on AWS in US-EAST-1” / Twitter

Slightly more than a year ago, the Amazon Web Services region US-EAST-1 collapsed, with worldwide downtime consequences for many AWS services. It took some 8 hours to recover most of the services.

Before that, it was plagued with outages, maybe because it was their first ever region:

The outage was covered many times. I have included this El Reg link, as I like their tone of voice: [Wayback/Archive] AWS technical woes in US East region cause widespread outage • The Register.

Basically, any cloud stack is founded on these three layers:

  • Storage (S3 or Simple Storage Service in AWS speak)
  • Compute (EC2 or Elastic Compute Cloud in AWS speak)
  • Authentication and Authorisation (IAM or Identity and Access Management in AWS speak)

All other services are built on top of these. For Amazon Web Services, many of them have become available over the last two decades.

Indeed Anders Borum was right in his tweet: US-EAST-1 is the first ever AWS EC2 region and started in 2006, more than 15 years ago. It is also the region with the largest capacity. Likely both play a role in US-EAST-1 being part of, or the initiating factor in, many of the major AWS outages. If you look at all AWS outages, US-EAST-1 plays a role in most if not all of them since 2017.

So for now, if hosting at AWS, I would host outside of US-EAST-1.

Depending on the kind of application and money involved, I would consider hosting in multiple regions, and if a truckload of money was involved: hosting on multiple clouds.

I fully agree with [Archive] Gergely Orosz on Twitter: “If you were impacted by the recent AWS outage, the decision to invest in multi-cloud / multi-datacenter is simple: How much did this outage cost you vs the cost of adding a (lot) more complexity & maintenance with multi-cloud/DC? If outage cost >> this, only then do it.” / Twitter

Some more insight on multi-cloud hosting is via [Archive] Redmond on Twitter: “New feature from @jdanton: A full post-mortem from AWS is still to come, but in the meantime, IT pros should start bolstering their cloud disaster recovery strategies now — before the next outage. https://t.co/ios5Re5ZCs” / Twitter at [Wayback/Archive] AWS Outage Fallout: What Lessons You Should Learn — Redmondmag.com

Is It Time to Go Multicloud?

No. Well…if you are running a major property with a big customer-facing presence, it can be a good strategy to have static Web and app content hosted in a second cloud. In the case of an outage like yesterday’s, you’d have the option to direct traffic to the static presence, which can supply some level of experience for your users.

A good example of how this approach can be useful is an outage dashboard. Whenever a cloud provider has an outage, they are notoriously bad at properly reporting ongoing status. This is because they have hosted their dashboards in their own clouds using their own APIs — and when these APIs go down, they take the monitoring with them. Using DNS, you can quickly redirect traffic to this static site, where your engineers can update the page with status updates.

Related

–jeroen

Read the rest of this entry »

Posted in AWS Amazon Web Services, Cloud, Cloud Development, Deployment, Development, DevOps, Infrastructure, Power User, Software Development | Leave a Comment »

Web issues are either DNS or caching

Posted by jpluimers on 2022/12/22

I can relate to both DNS and caching posts:

–jeroen

Read the rest of this entry »

Posted in Development, DevOps, Infrastructure, Power User | Leave a Comment »

A twitter call to say nice things about technology sparked interesting threads

Posted by jpluimers on 2022/05/27

A while ago [Archive.is] Adam Jacob on Twitter: “Let’s say nice things about technology today. I’ll start. If it wasn’t for @lkanies and @puppetize, there is no way we would have been able to adapt as an industry to the rise of the cloud. Quote tweet me with your own.” sparked some interesting threads.

First posts are below; click on them to see the full threads.

Read the rest of this entry »

Posted in Chrome, Configuration Management, Development, DevOps, Firefox, History, IaC - Infrastructure as Code, Infocom and Z-machine, Infrastructure, KVM Kernel-based Virtual Machine, LSI/3ware, Open Source, PDP-11, Power User, PowerShell, Puppet, Python, Qemu, Rust, Safari, Scripting, Software Development, UCSD Pascal, Vagrant, Veewee, Virtualization, Web Browsers, Xen | Leave a Comment »

Need to revisit osquery: SQL powered operating system instrumentation, monitoring, and analytics supports more platforms and also aggregates to central log locations

Posted by jpluimers on 2022/01/18

Almost two years ago, GitHub – facebook/osquery: SQL powered operating system instrumentation, monitoring, and analytics was published from the automatic blog queue.

It was in the midst of my rectal cancer treatment, so I was glad the blog queue back then was still about 18 months deep.

This meant I looked into osquery in 2018, which I remember because I needed it on MacOS as I did not want to remember the syntax for MacOS specific commands on getting system information. It also coincides with how much my repository fork was behind: [Wayback: jpluimers/osquery commits/Archive: jpluimers/osquery commits].

Fast forward to now, the breadth of systems I’m involved with has widened, so I was glad to see that Kristian Köhntopp mentioned it:

So time to try it again (:

The links he mentioned:

  • [Wayback/Archive] Welcome to osquery – osquery

    osquery is an operating system instrumentation framework for Windows, OS X (macOS), Linux, and FreeBSD. The tools make low-level operating system analytics and monitoring both performant and intuitive.

  • [Wayback/Archive] Welcome to osquery – osquery: High Level Features
    The high-performance and low-footprint distributed host monitoring daemon, osqueryd, allows you to schedule queries to be executed across your entire infrastructure. The daemon takes care of aggregating the query results over time and generates logs which indicate state changes in your infrastructure. You can use this to maintain insight into the security, performance, configuration, and state of your entire infrastructure. osqueryd’s logging can integrate into your internal log aggregation pipeline, regardless of your technology stack, via a robust plugin architecture.
    The interactive query console, osqueryi, gives you a SQL interface to try out new queries and explore your operating system. With the power of a complete SQL language and dozens of useful tables built-in, osqueryi is an invaluable tool when performing incident response, diagnosing a systems operations problem, troubleshooting a performance issue, etc.
  • [Wayback/Archive] osqueryd (daemon) – osquery
  • [Wayback/Archive] osqueryi (shell) – osquery
  • [Wayback/Archive] Aggregating Logs – osquery
  • [Wayback/Archive] AWS Logging – osquery
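
To give an idea of what the SQL interface of osqueryi looks like from a script, here is a minimal Python sketch. It assumes osqueryi is installed and on the PATH, that its `--json` flag prints the result as a JSON array with one object per row, and that the built-in `processes` table is available. The helper names (`build_command`, `parse_rows`, `run_query`) are made up for this illustration and are not part of osquery.

```python
import json
import shutil
import subprocess


def build_command(sql: str, binary: str = "osqueryi") -> list[str]:
    """Build the argument list for a one-shot osqueryi query with JSON output."""
    return [binary, "--json", sql]


def parse_rows(raw: str) -> list[dict]:
    """Parse osqueryi JSON output (assumed: a JSON array, one object per row)."""
    rows = json.loads(raw)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of rows")
    return rows


def run_query(sql: str, binary: str = "osqueryi") -> list[dict]:
    """Run a single query; raises if osqueryi is missing or exits non-zero."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    result = subprocess.run(
        build_command(sql, binary),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_rows(result.stdout)
```

On a machine with osquery installed, something like `run_query("SELECT pid, name FROM processes LIMIT 5;")` should return a list of dictionaries with `pid` and `name` keys. As far as I know the values come back as strings, so convert them before doing arithmetic.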

Main site: [Wayback/Archive] osquery | Easily ask questions about your Linux, Windows, and macOS infrastructure

Repository: [Wayback/Archive] osquery/osquery: SQL powered operating system instrumentation, monitoring, and analytics.

–jeroen

Posted in *nix, *nix-tools, Apple, Development, DevOps, Facebook, Infrastructure, Mac, Mac OS X / OS X / MacOS, Power User, SocialMedia, Software Development, Windows | Leave a Comment »

 