Archive for the ‘Algorithms’ Category
Posted by jpluimers on 2021/07/16
Everyone can learn from an outage. CloudFlare shows how to do it right, for instance on the RegEx-going-wild downtime 2 years ago.
So it’s time to link to that one again: [WayBack] Details of the Cloudflare outage on July 2, 2019
More like these at [WayBack] Post Mortem – The Cloudflare Blog.
More on evaluating regular expressions in linear time:
- [WayBack] Regular Expression Search Algorithm KEN THOMPSON Bell Telephone Laboratories, Inc., Murray Hill, New Jersey
- [WayBack] Programming Techniques: Regular expression search algorithm / [WayBack] Programming Techniques: Regular expression search algorithm
A method for locating specific character strings embedded in character text is described and an implementation of this method in the form of a compiler is discussed. The compiler accepts a regular expression as source language and produces an IBM 7094 program as object language. The object program then accepts the text to be searched as input and produces a signal every time an embedded string in the text matches the given regular expression. Examples, problems, and solutions are also presented.
Programming Techniques: Regular expression search algorithm
| Full Text: |
PDF |
| Author: |
Ken Thompson |
Bell Telphone Labs, Inc., Murray Hill |
Published in:
 |
|
| · Magazine |
| Communications of the ACM CACM Homepage archive |
Volume 11 Issue 6, June 1968
Pages 419-422
ACM New York, NY, USA
table of contents doi>10.1145/363347.363387 |
|
|
- Thompson’s construction – Wikipedia
is a method of transforming a regular expression into an equivalent nondeterministic finite automaton (NFA)
The algorithm works recursively by splitting an expression into its constituent subexpressions, from which the NFA will be constructed using a set of rules.[3] More precisely, from a regular expression E, the obtained automaton A with the transition function δ respects the following properties:
- A has exactly one initial state q0, which is not accessible from any other state. That is, for any state q and any letter a, {\displaystyle \delta (q,a)}
does not contain q0.
- A has exactly one final state qf, which is not co-accessible from any other state. That is, for any letter a, {\displaystyle \delta (q_{f},a)=\emptyset }
.
- Let c be the number of concatenation of the regular expression E and let s be the number of symbols apart from parentheses — that is, |, *, a and ε. Then, the number of states of A is 2s − c (linear in the size of E).
- The number of transitions leaving any state is at most two.
- Since an NFA of m states and at most e transitions from each state can match a string of length n in time O(emn), a Thompson NFA can do pattern matching in linear time, assuming a fixed-size alphabet.
- [WayBack] A Regular Expression Matcher Code by Rob Pike Exegesis by Brian Kernighan
Via [WayBack] Details of the Cloudflare outage on July 2, 2019 | Hacker News
–jeroen
Posted in Algorithms, Development, Power User, RegEx, Software Development | Leave a Comment »
Posted by jpluimers on 2021/07/15
For my link archive: [WayBack] c# – Algorithm to check whether a certain hour falls into a given time period – Stack Overflow answered by [WayBack] kennytm:
Assume you only have the time and not the date.
if end_time >= start_time:
return start_time <= current_time <= end_time
else:
return start_time <= current_time or current_time <= end_time
I totally agree with this comment:
– [WayBack]
I love it when algorithms are simple and elegant.
It reminded me of another scheduling related algorithm: [WayBack] Activity Selection Problem | Greedy Algo-1 – GeeksforGeeks
–jeroen
Posted in .NET, Algorithms, C#, Development, Software Development | Leave a Comment »
Posted by jpluimers on 2021/05/13
A while ago I needed an estimate for [WayBack] How to calculate bitmap size? – Stack Overflow, which does an accurate calculation for the size of the pixel storage (i.e. without headers):
pixelStorageSize = (( width*bitsPerPixel + 31) / 32) * 4 * height
Here:
- 4 is the number of bytes per quad-byte as bitmaps are stored with pixel lines aligned on quad-byte boundaries
- 32 is the bits per quad-byte
- 31 helps to round up to the quad-word boundary
It was derived from [WayBack] Capturing an Image – Windows applications | Microsoft Docs and shorthand for what is explained at [WayBack] BMP file format: Pixel storage – Wikipedia
The bits representing the bitmap pixels are packed in rows. The size of each row is rounded up to a multiple of 4 bytes (a 32-bit DWORD) by padding.
For images with height above 1, multiple padded rows are stored consecutively, forming a Pixel Array.
The total number of bytes necessary to store one row of pixels can be calculated as:
- {\displaystyle {\text{RowSize}}=\left\lceil {\frac {{\text{BitsPerPixel}}\cdot {\text{ImageWidth}}}{32}}\right\rceil \cdot 4=\left\lfloor {\frac {{\text{BitsPerPixel}}\cdot {\text{ImageWidth}}+31}{32}}\right\rfloor \cdot 4,}

- ImageWidth is expressed in pixels, note the special parentheses.
The total amount of bytes necessary to store an array of pixels in an n bits per pixel (bpp) image, with 2ncolors, can be calculated by accounting for the effect of rounding up the size of each row to a multiple of 4 bytes, as follows:
- PixelArraySize = RowSize · |ImageHeight|
- ImageHeight is expressed in pixels. The absolute value is necessary because ImageHeight can be negative.
I think I needed this to estimate how many bitmaps would fit in
[WayBack] Virtual address space – Wikipedia, which on 32-bit Windows by default is limited to 2 GiB, and can be extended to 3 GIB by enabling the IMAGE_FILE_LARGE_ADDRESS_AWARE executable header flag.
This flag imposes the risk of many libraries and DLLs to fail because they do not get the pointer math right. You often do not have control of future versions of DLLs being loaded in your process space and their risks, so do not ever use that flag.
–jeroen
Posted in Algorithms, Development, Software Development | 1 Comment »
Posted by jpluimers on 2021/04/07
For past timestamps (or date-times), as long as you know the associated location, you always know the time zone rule that applies, no matter if you store them in UTC or local time zone.
For future dates, UTC might not be the best option, as you have no knowledge on future time zone rules. There you need to have at least three fields:
Read the rest of this entry »
Posted in .NET, Algorithms, Development, Jon Skeet, Software Development | Leave a Comment »
Posted by jpluimers on 2021/03/18
Originally created for anime upscaling, the waifu2x algorithm works very well for photos, logos and text.
It is open source too!
More information:
- [WayBack] GitHub-lltcggie / waifu2x-caffe: Caffe version of waifu2x (in Japanese, but Google Translate does an amazing job:
rewritten for Windows, using Caffe to rewrite only the conversion function of the image conversion software ” waifu2x ” . It can be converted by CPU, but it can be converted faster than CPU by using CUDA (or cuDNN).
GUI supports English, Japanese, Simplified Chinese, Traditional Chinese, Korean, Turkish, Spanish, Russian, and French.
- [WayBack] waifu2x: Einzelbild Super-Auflösungs Konvertierung für Bilder im Anime Stil mithilfe eines künstlichen neuronalen Netzwerk. Zusätzlich dazu unterstützt es auch Fotos.
- [WayBack] waifu2x: Single-Image Super-Resolution for Anime-Style Art using Deep Convolutional Neural Networks. And it supports photo.
- waifu2x – Wikipedia
- Comparison gallery of image scaling algorithms – Wikipedia
–jeroen
Read the rest of this entry »
Posted in Algorithms, Development, Software Development | Leave a Comment »
Posted by jpluimers on 2021/03/10
TL;DR of https://css-tricks.com/debouncing-throttling-explained-examples/:
- debounce: Grouping a sudden burst of events (like keystrokes) into a single one.
- throttle: Guaranteeing a constant flow of executions every X milliseconds. Like checking every 200ms your scroll position to trigger a CSS animation.
- requestAnimationFrame: a throttle alternative. When your function recalculates and renders elements on screen and you want to guarantee smooth changes or animations. Note: no IE9 support.
Full article [WayBack] Debouncing and Throttling Explained Through Examples | CSS-Tricks
Delphi implementations:
–jeroen
Posted in Algorithms, Delphi, Development, JavaScript/ECMAScript, Scripting, Software Development | Leave a Comment »
Posted by jpluimers on 2020/06/24
From about 4 years ago, so time to see how many development stacks support Unum by now: [WayBack] The End of (Numeric) Error
Crunching numbers was the prime task of early computers. The common element of these early computers is they all used integer arithmetic. John Gustafson, one of the foremost experts in scientific computing, has proposed a new number format that provides more accurate answers than standard floats, yet saves space and energy. The new format might well revolutionize the way we do numerical calculations.
Back then, I found these links through my G+ circles:
Read the rest of this entry »
Posted in Algorithms, Development, Floating point handling, Software Development, Unum | Leave a Comment »
Posted by jpluimers on 2020/04/29
Naming is number # in the Programmers’ hardest tasks so when you have to name things, read [WayBack] Approved Verbs for Windows PowerShell Commands.
It is guidance on naming schemes and lists of appropriate nouns and verbs to use in names.
–jeroen
From twitter:
$PARAMETERS = get-command | where-object {$_.CommandType -eq "Cmdlet" } | select Parameters
$PARAMETER_KEYS = foreach ($P in $PARAMETERS)
{
foreach ($K in $($P.Parameters | select keys))
{
$K.Keys
}
}
$PARAMETER_KEYS | sort-object -unique
Resulting in this list with PowerShell 5.1.18362.752 on Windows 10.0.18363.815:
AccessMode
Action
Activity
Add
AddToHistoryHandler
Adjust
After
Alias
AliasDefinitions
AliasesToExport
All
AllMatches
AllowClobber
AllowRedirection
Amended
AnsiEscapeTimeout
Any
AppDomainName
Append
AppendPath
ApplicationArguments
ApplicationBase
ApplicationName
ArgumentList
Arguments
As
AsBaseObject
AsCustomObject
AsHashTable
AsHtml
AsJob
AsSecureString
AssembliesToLoad
Assembly
AssemblyName
AsString
Attachments
Attributes
Authentication
Author
Authority
AutoRemoveJob
AutoSize
Average
BackgroundColor
BaseDirectory
Bcc
Before
Begin
BellStyle
BinaryPathName
BindingVariable
Body
BodyAsHtml
Bound
BreakAll
Breakpoint
BriefDescription
BufferSize
CancelTimeout
CanonicalName
CaseSensitive
Category
CategoryActivity
CategoryReason
CategoryResourceFile
CategoryTargetName
CategoryTargetType
Cc
CContains
CEQ
Certificate
CertificateThumbprint
CGE
CGT
Character
ChildJobState
ChildPath
Chord
CimNamespace
CimResourceUri
CimSession
CIn
Class
CLE
CLike
ClrVersion
CLT
CMatch
Cmdlet
CmdletsToExport
CNE
CNotContains
CNotIn
CNotLike
CNotMatch
CodeDomProvider
Colors
Column
Command
CommandLine
CommandName
CommandType
CommandValidationHandler
ComObject
CompanyName
CompatiblePSEditions
CompilerParameters
Completed
CompletionQueryItems
Component
Compress
ComputerName
ConfigurationName
ConfigurationTypeName
Confirm
ConnectionUri
Container
ContainerId
Contains
Content
ContentType
Context
ContinuationPrompt
Copyright
Count
Credential
CssUri
Culture
CurrentOperation
Date
Day
Days
DcomAuthentication
Debug
Debugger
DefaultCommandPrefix
DefaultDisplayProperty
DefaultDisplayPropertySet
DefaultKeyPropertySet
Definition
DefinitionName
DefinitionPath
Delay
Delimiter
DeliveryNotificationOption
DependentServices
DependsOn
Depth
Descending
Description
Destination
DestinationPath
Detailed
DifferenceObject
DingDuration
DingTone
Directory
DirectRead
DisableKeepAlive
DisableNameChecking
DisplayError
DisplayHint
DisplayName
DomainCredential
DomainName
DotNetFrameworkVersion
Drive
DriveLetter
DscResourcesToExport
EditMode
EnableAllPrivileges
EnableNetworkAccess
Encoding
End
EntryType
EnvironmentVariables
EQ
ErrorAction
ErrorId
ErrorPopup
ErrorRecord
ErrorVariable
EventArguments
EventId
EventIdentifier
EventName
Example
Examples
Exception
Exclude
ExcludeDifferent
ExcludeProperty
ExecutionPolicy
Expand
ExpandProperty
Expression
ExtraPromptLineCount
File
FileList
FileName
FilePath
FileVersionInfo
Filter
FilterScript
First
For
Force
ForegroundColor
Format
FormatsToProcess
FormatTypeName
Forward
Fragment
From
FromSession
Full
FullyQualifiedModule
FullyQualifiedName
Function
Functionality
FunctionDefinitions
FunctionsToExport
GE
Global
GroupBy
GroupManagedServiceAccount
GT
Guid
HasMoreData
Head
Header
Headers
Height
HelpInfoUri
Hidden
HideComputerName
HideTableHeaders
HistoryNoDuplicates
HistorySavePath
HistorySaveStyle
HistorySearchCaseSensitive
HistorySearchCursorMovesToEnd
HostProcessInfo
Hour
Hours
IconUri
Id
IdleTimeout
IdleTimeoutSec
IgnoreWarnings
IgnoreWhiteSpace
Impersonation
In
Include
IncludeChildJob
IncludeEqual
IncludeExtent
IncludePortInSPN
IncludeScriptBlock
IncludeTotalCount
IncludeUserName
Independent
Index
InDisconnectedSession
InFile
InformationAction
InformationVariable
InheritPropertySerializationSet
InitializationScript
InputObject
InstanceId
Is
IsAbsolute
IsNot
IsValid
ItemType
Job
JobName
Keep
Language
LanguageMode
Last
LastStatus
LE
Leaf
LicenseUri
Like
Line
List
ListAvailable
ListenerOption
ListImported
LiteralName
LiteralPath
LoadUserProfile
LocalCredential
Locale
Location
LogName
LT
Match
MaxConcurrentCommandsPerSession
MaxConcurrentUsers
MaxConnectionRetryCount
MaxIdleTimeoutSec
Maximum
MaximumHistoryCount
MaximumKillRingCount
MaximumReceivedDataSizePerCommand
MaximumReceivedDataSizePerCommandMB
MaximumReceivedObjectSize
MaximumReceivedObjectSizeMB
MaximumRedirection
MaximumSize
MaximumVersion
MaxMemoryPerSessionMB
MaxProcessesPerSession
MaxSessions
MaxSessionsPerUser
MaxTriggerCount
MemberDefinition
MemberName
MemberType
Message
MessageData
MessageResourceFile
Method
Millisecond
Milliseconds
Minimum
MinimumVersion
Minute
Minutes
Mode
Module
ModuleInfo
ModuleList
ModulesToImport
ModuleVersion
Month
MountUserDrive
Name
Namespace
Native
NE
NestedModules
NewerThan
Newest
NewName
NoClobber
NoCommonParameter
NoCompression
NoElement
NoEncryption
NoEnumerate
NoMachineProfile
NoNewline
NoNewScope
NoNewWindow
NoQualifier
NoRecurse
NoServiceRestart
NotContains
NotePropertyMembers
NotePropertyName
NotePropertyValue
NotIn
NotLike
NotMatch
NoTypeInformation
Noun
NoWait
Object
Off
OlderThan
Online
OnType
OpenTimeout
OperationTimeout
Option
Options
OUPath
OutBuffer
OutFile
OutputAssembly
OutputBufferingMode
OutputMode
OutputModule
OutputType
OutTarget
OutVariable
OverflowAction
Paging
Parameter
ParameterName
ParameterResourceFile
ParameterType
Parent
ParentId
PassThru
Path
PathType
Pattern
PercentComplete
Persist
PipelineVariable
Port
PostContent
PowerShellHostName
PowerShellHostVersion
PowerShellVersion
PreContent
Prefix
PrependPath
Priority
PrivateData
Process
ProcessIdleTimeoutSec
ProcessName
ProcessorArchitecture
ProjectUri
Prompt
PromptText
Property
PropertyNames
PropertySerializationSet
PropertyType
Protocol
Proxy
ProxyAccessType
ProxyAuthentication
ProxyCredential
ProxyUseDefaultCredentials
PSDrive
PSEdition
PSHost
PSProvider
PSSession
PSVersion
PutType
Qualifier
Query
Quiet
Raw
RawData
ReadCount
ReadOnly
RecommendedAction
Recurse
RedirectStandardError
RedirectStandardInput
RedirectStandardOutput
ReferencedAssemblies
ReferenceObject
Refresh
Registered
Relative
ReleaseNotes
RemainingScripts
Remove
RemoveFileListener
RemoveListener
Repair
RepeatHeader
Replace
RequiredAssemblies
RequiredGroups
RequiredModules
RequiredServices
RequiredVersion
Resolve
Restart
RestorePoint
RestorePointType
RetentionDays
ReturnResult
Role
RoleDefinitions
RollbackPreference
Root
RootModule
RunAs32
RunAsAdministrator
RunAsCredential
RunAsVirtualAccount
RunAsVirtualAccountGroups
Runspace
RunspaceId
RunspaceInstanceId
RunspaceName
SchemaVersion
Scope
Script
ScriptBlock
ScriptsToProcess
Second
Seconds
SecondsRemaining
SecondValue
SecurityDescriptorSddl
Sender
Separator
SerializationDepth
SerializationMethod
Server
Session
SessionName
SessionOption
SessionType
SessionTypeOption
SessionVariable
SetSeed
ShowCommandInfo
ShowError
ShowSecurityDescriptorUI
ShowToolTips
ShowWindow
SimpleMatch
Skip
SkipCACheck
SkipCNCheck
SkipLast
SkipNetworkProfileCheck
SkipRevocationCheck
SmtpServer
Source
SourceId
SourceIdentifier
SourcePath
Stack
StackName
Start
StartupScript
StartupType
State
Static
Status
Step
Stream
Strict
StringData
StringSerializationSource
Subject
SubscriptionId
Sum
SupportedCommand
SupportEvent
SyncWindow
Syntax
System
Tags
Tail
TargetObject
TargetTypeForDeserialization
TemplateContent
TemplateFile
TextFormatType
ThreadApartmentState
ThreadOptions
ThrottleLimit
Timeout
TimeoutSec
TimeToLive
Title
To
ToSession
TotalCount
Trace
TransactedScript
Transcript
TranscriptDirectory
TransferEncoding
TransportOption
Type
TypeAdapter
TypeConverter
TypeData
TypeDefinition
TypeName
TypesToProcess
UFormat
UICulture
Unbound
Unique
UnjoinDomainCredential
Unsecure
UpdateTemplate
Uri
UseBasicParsing
UseCulture
UseDefaultCredential
UseDefaultCredentials
UseNewEnvironment
UserAgent
UserDriveMaximumSize
UserName
UseSharedProcess
UseSSL
UseTransaction
UseUTF16
UsingNamespace
Value
ValueOnly
Variable
VariableDefinitions
VariablesToExport
Verb
Verbose
Version
View
ViMode
ViModeIndicator
Visibility
VisibleAliases
VisibleCmdlets
VisibleExternalCommands
VisibleFunctions
VisibleProviders
VMId
VMName
Wait
WarningAction
WarningVariable
WebSession
WhatIf
Width
WindowStyle
Word
WordDelimiters
WorkgroupName
WorkingDirectory
Wrap
WriteEvents
WriteJobInResults
WsmanAuthentication
Xml
XPath
Year
Posted in Agile, Algorithms, Code Quality, Development, Software Development | Leave a Comment »