The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Archive for the ‘NTFS’ Category

More on empty files

Posted by jpluimers on 2021/10/07

TL;DR: Empty files are indeed of size zero, but they still take up some disk space for their metadata (such as name, permissions, and timestamps).

Some links (via [WayBack] create zero sized file – Google Search):

  • [WayBack] Zero-byte file – Wikipedia
  • [WayBack] filesystems – How can a file size be zero? – Super User (thanks [WayBack] phuclv):

    Filesystems store a lot of information about a file such as file name, file size, creation time, access time, modified time, created user, user and group permissions, fragments, pointer to clusters that store the file, hard/soft links, attributes… Those are called file metadata. Why do you count those metadata into file size when users do not (need to) care about them and don’t know about them? They only really care about the file content

    Moreover each filesystem stores different types of metadata which take different amounts of space on disk. For example POSIX permissions are very different from NTFS permission, and there are also inode numbers in POSIX which do not exist on Windows. Even POSIX filesystems vary a lot, like ext3 with 32-bit block address, ext4 with 48-bit, Btrfs with 64-bit and ZFS with 128-bit address. So how will you count those metadata into file size?

    Take another example with a 100-byte file whose metadata consumes 56 bytes on the current filesystem. We copy the file to another filesystem and now it takes 128 bytes of metadata. However the file contents are exactly the same, the number of bytes in the files are also the same. So displaying file size as 156 bytes on a system but 228 bytes on another is very confusing and counter-intuitive.

  • [WayBack] What is the concept of creating a file with zero bytes in Linux? – Unix & Linux Stack Exchange:

    touch will create an inode, and ls -i or stat will show info about the inode:

    $ touch test
    $ ls -i test
    28971114 test
    $ stat test
      File: ‘test’
      Size: 0           Blocks: 0          IO Block: 4096   regular empty file
    Device: fc01h/64513d    Inode: 28971114    Links: 1
    Access: (0664/-rw-rw-r--)  Uid: ( 1000/1000)   Gid: ( 1000/1000)
    Access: 2017-03-28 17:38:07.221131925 +0200
    Modify: 2017-03-28 17:38:07.221131925 +0200
    Change: 2017-03-28 17:38:07.221131925 +0200
     Birth: -
    

    Notice that test uses 0 blocks. To store the data displayed, the inode uses some bytes. Those bytes are stored in the inode table. Look at the ext2 page for an example of an inode structure [WayBack].
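The same observation can be reproduced from Python. This is a minimal sketch, assuming a file system that exposes inode numbers through `os.stat`; the helper name `describe_empty_file` and the file name `test` are only for illustration:

```python
import os
import tempfile

def describe_empty_file(directory):
    """Create a zero-byte file and return (size, inode number, link count)."""
    path = os.path.join(directory, "test")
    # Opening in "x" mode and closing right away is a stand-in for `touch`.
    with open(path, "x"):
        pass
    info = os.stat(path)
    return info.st_size, info.st_ino, info.st_nlink

with tempfile.TemporaryDirectory() as tmp:
    size, inode, links = describe_empty_file(tmp)
    print(f"size={size} inode={inode} links={links}")
```

The size is 0 while the inode number is not: the content is empty, but the metadata still lives somewhere.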

Oh and a nice NTFS thing (thanks [WayBack] Paweł Bulwan):

and in case of NTFS, the size of file reported by Windows and most tools is actually the size of the main stream of the file, which we perceive as the content of the file. The file stored on NTFS partition can additionally have some data stored in alternative data streams, and still have the reported size of 0. It’s a nice filesystem feature to know if you want to have the full picture :)

Related: my really old post command line – create empty text file from a batch file (via: Stack Overflow)

–jeroen

Posted in *nix, btrfs, Development, File-Systems, NTFS, Power User, Software Development, Windows

Twitter thread by @0xdade; More unicode shit: zero width space and a zero width nonjoiner in filenames

Posted by jpluimers on 2021/09/22

[WayBack] Thread by @0xdade: Today I learned that you can put zero width spaces in file names on Linux. Have fun. I’m playing with this because punycode/IDN is fascinati…

Today I learned that you can put zero width spaces in file names on Linux. Have fun.

I’m playing with this because punycode/IDN is fascinating, and I wanted to know what happened when I started shoving unicode in the path portion of the url, which isn’t part of how browsers try to protect URLs, as far as I can tell

wiki.mozilla.org/IDN_Display_Al…

I think it’s more entertaining to have a file that is named *only* a zero width space, but I think using them throughout a filename is better to break tab completion and not stand out too much. A filename that is just blank looks strange in ls output.
Thank goodness adduser is looking out for our best interests.
Oooh this one is pretty subtle.
Just about pissed myself with this one.

Not related to the terminal fun, but related to zero width characters:

You can:
– Break url previews https://0xda​​​​​​.​de
– @​0xdade without tagging
– Make a word like system​d not searchable twitter.com/search?q=from%…

Okay but back to command line crap. I really like this one. Create a directory named .[ZWS]

One thing that is cool about using zero width spaces is that “ls” has a flag, “-b”, that is meant to escape non-graphic characters. Inserting a newline, for instance, would be escaped to \n. But the zero width space is technically a graphic character, so nothing happens.

Fun.

Have no fear, though. It’s not unbeatable. It’s only fun if the language and LC settings are set to support utf-8. If you set LC_ALL=C or whatever that isn’t utf-8, then it looks like this.

Putting a link to this tweet here so that I don’t lose it again in the future.

dade@0xdade

My god, it is beautiful. I mean except all the whitespace I can’t get rid of before the command lmao.

But on the other hand if you just have a search for the zws, then whatever you find is probably worth investigating. 
I guess I’ll start the hashtag before @QW5kcmV3 does for #irresponsibleutf8 🤭😏😂 
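Searching for such characters is easy to script. Here is a rough sketch in Python, assuming that "invisible" means Unicode general category Cf (format characters), which covers both U+200B ZERO WIDTH SPACE and U+200C ZERO WIDTH NON-JOINER. The function names are made up for this example:

```python
import os
import unicodedata

def reveal_invisible(name):
    """Return name with every format character (category Cf) spelled out as <U+XXXX>."""
    return "".join(
        f"<U+{ord(ch):04X}>" if unicodedata.category(ch) == "Cf" else ch
        for ch in name
    )

def suspicious_names(directory):
    """Yield (name, revealed name) for directory entries containing format characters."""
    for name in os.listdir(directory):
        revealed = reveal_invisible(name)
        if revealed != name:
            yield name, revealed

print(reveal_invisible("system\u200bd"))  # system<U+200B>d
```

Category Cf also contains other format characters such as the soft hyphen and the bidirectional controls, so a hit is a reason to look closer rather than proof of mischief.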

And these tweets:

[WayBack] Thread by @Plazmaz: @0xdade Was doing some real fucking around with urls recently: gist.github.com/Plazmaz/565a5c… (was gonna flesh it out more but didn’t find…:

Was doing some real fucking around with urls recently:
This one is my fave:
‘⁄’ (\u2044)
or
‘∕’ (\u2215)
Allow for this:
google.com⁄search⁄query⁄.example.com
google.com⁄search⁄query⁄@example.com 

[WayBack] url-screwiness.md · GitHub:

This is a list of methods for messing with urls. These are often useful for bypassing filters, SSRF, or creating convincing links that are difficult to differentiate from legitimate urls.
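For the two slash lookalikes quoted above, a check is equally short. This is only a sketch: the set below contains just U+2044 and U+2215, not a complete list of confusable characters, and `find_slash_lookalikes` is a hypothetical helper:

```python
import unicodedata

# Only the two characters quoted above; real confusable lists are much longer.
SLASH_LOOKALIKES = {"\u2044", "\u2215"}

def find_slash_lookalikes(url):
    """Return (index, Unicode name) for each slash lookalike found in url."""
    return [
        (index, unicodedata.name(ch))
        for index, ch in enumerate(url)
        if ch in SLASH_LOOKALIKES
    ]

# [(10, 'FRACTION SLASH'), (17, 'DIVISION SLASH')]
print(find_slash_lookalikes("google.com\u2044search\u2215query"))
```

An ordinary URL with real `/` characters yields an empty list.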

And a bit of documentation links:

–jeroen

 

Posted in *nix, .NET, C#, Development, NTFS, Power User, Python, Scripting, Software Development, Windows

Unix and NTFS file systems, hardlinks, inodes, files, directories, dot directories, bugs and implementation details

Posted by jpluimers on 2021/09/21

Lots of interesting tidbits on unix and NTFS file systems.

If you want to blow up your tooling, try creating a recursive hardlink, which is likely one of the reasons that *nix file systems do not support them.

Covered and related topics:

The tweets (especially follow the train of thought in the various subtrees: a great way to learn new things!):

It is important to understand that the concept of File IDs and inodes/vnodes has far-reaching consequences, for instance (from [WayBack] inode – Wikipedia):

  • Files can have multiple names. If multiple names hard link to the same inode then the names are equivalent; i.e., the first to be created has no special status. This is unlike symbolic links, which depend on the original name, not the inode (number).
  • An inode may have no links. An unlinked file is removed from disk, and its resources are freed for reallocation but deletion must wait until all processes that have opened it finish accessing it. This includes executable files which are implicitly held open by the processes executing them.
  • It is typically not possible to map from an open file to the filename that was used to open it. The operating system immediately converts the filename to an inode number then discards the filename. This means that the getcwd() and getwd() library functions search the parent directory to find a file with an inode matching the working directory, then search that directory’s parent, and so on until reaching the root directory. SVR4 and Linux systems maintain extra information to make this possible.
  • Historically, it was possible to hard link directories. This made the directory structure into an arbitrary directed graph contrary to a directed acyclic graph. It was even possible for a directory to be its own parent. Modern systems generally prohibit this confusing state, except that the parent of root is still defined as root. The most notable exception to this prohibition is found in Mac OS X (versions 10.5 and higher) which allows hard links of directories to be created by the superuser.
  • A file’s inode number stays the same when it is moved to another directory on the same device, or when the disk is defragmented which may change its physical location. This also implies that completely conforming inode behavior is impossible to implement with many non-Unix file systems, such as FAT and its descendants, which don’t have a way of storing this invariance when both a file’s directory entry and its data are moved around.
  • Installation of new libraries is simple with inode file systems. A running process can access a library file while another process replaces that file, creating a new inode, and an all-new mapping will exist for the new file so that subsequent attempts to access the library get the new version. This facility eliminates the need to reboot to replace currently mapped libraries.
  • It is possible for a device to run out of inodes. When this happens, new files cannot be created on the device, even though there may be free space available. This is most common for use cases like mail servers which contain many small files. File systems (such as JFS or XFS) escape this limitation with extents or dynamic inode allocation, which can “grow” the file system or increase the number of inodes.
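Some of these consequences are easy to observe. The sketch below assumes a file system that supports hard links and reports inode numbers through `os.stat`; the file names and the function name are arbitrary:

```python
import os
import tempfile

def hardlink_demo(directory):
    """Show that hard-linked names are equivalent and that a move keeps the inode."""
    original = os.path.join(directory, "original.txt")
    alias = os.path.join(directory, "alias.txt")
    with open(original, "w") as handle:
        handle.write("hello")

    # A second name for the same inode; neither name is "the real one".
    os.link(original, alias)
    same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
    link_count = os.stat(original).st_nlink

    # Removing the first name leaves the data reachable through the second.
    os.remove(original)
    with open(alias) as handle:
        content = handle.read()

    # Moving within the same device changes the name, not the inode.
    inode_before = os.stat(alias).st_ino
    moved = os.path.join(directory, "moved.txt")
    os.rename(alias, moved)
    inode_kept = os.stat(moved).st_ino == inode_before

    return same_inode, link_count, content, inode_kept

with tempfile.TemporaryDirectory() as tmp:
    print(hardlink_demo(tmp))
```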

A very cool read in the midst of the tweet tree was this reference to a former Google Plus post by [WayBack] Rob Pike – Wikipedia (of Golang, Unix team and Plan 9 fame).

WayBack: A lesson in shortcuts.Long ago, as the design of the Unix file system was being worked out, the entries . and .. appeared, to make navigation easier. … – Rob Pike – Google+

A lesson in shortcuts.

Long ago, as the design of the Unix file system was being worked out, the entries . and .. appeared, to make navigation easier. I’m not sure but I believe .. went in during the Version 2 rewrite, when the file system became hierarchical (it had a very different structure early on).  When one typed ls, however, these files appeared, so either Ken or Dennis added a simple test to the program. It was in assembler then, but the code in question was equivalent to something like this:
if (name[0] == '.') continue;
This statement was a little shorter than what it should have been, which is
if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0) continue;
but hey, it was easy.

Two things resulted.

First, a bad precedent was set. A lot of other lazy programmers introduced bugs by making the same simplification. Actual files beginning with periods are often skipped when they should be counted.

Second, and much worse, the idea of a “hidden” or “dot” file was created. As a consequence, more lazy programmers started dropping files into everyone’s home directory. I don’t have all that much stuff installed on the machine I’m using to type this, but my home directory has about a hundred dot files and I don’t even know what most of them are or whether they’re still needed. Every file name evaluation that goes through my home directory is slowed down by this accumulated sludge.

I’m pretty sure the concept of a hidden file was an unintended consequence. It was certainly a mistake.

How many bugs and wasted CPU cycles and instances of human frustration (not to mention bad design) have resulted from that one small shortcut about 40 years ago?

Keep that in mind next time you want to cut a corner in your code.

(For those who object that dot files serve a purpose, I don’t dispute that but counter that it’s the files that serve the purpose, not the convention for their names. They could just as easily be in $HOME/cfg or $HOME/lib, which is what we did in Plan 9, which had no dot files. Lessons can be learned.)
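The difference between the two tests is easy to see on a small list of names. This is a Python rendition for illustration only (the original was assembler, later C); the function names are invented here:

```python
def shortcut_filter(names):
    """The shortcut: skip everything that starts with a dot."""
    return [name for name in names if not name.startswith(".")]

def intended_filter(names):
    """What was meant: skip only the entries . and .."""
    return [name for name in names if name not in (".", "..")]

names = [".", "..", ".profile", "notes.txt"]
print(shortcut_filter(names))  # ['notes.txt']
print(intended_filter(names))  # ['.profile', 'notes.txt']
```

The shortcut silently drops `.profile` as well, which is how "hidden" files came to exist.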

–jeroen


Posted in *nix, Development, File-Systems, History, NTFS, Power User, Software Development, Windows, Windows Development

Cipher: a command-line tool to decrypt/encrypt files and directories (even recursively) on Windows

Posted by jpluimers on 2020/07/03

A while ago, I had to mass encrypt a lot of directories and files on Windows for some directories in an existing directory structure.

This helped me to find out which ones were already done (it lists all encrypted files on all drives; the /n ensures the files or encryption keys are not altered):

cipher.exe /u /n /h

This encrypted recursively in one directory B:\Directory (/E encrypts; /D is the decrypting counterpart):

cipher /E /S:B:\Directory /A

It also has options to wipe data (/W), export keys into transferrable files (/X) and many more.

If you prefer Windows Explorer to encrypt/decrypt (it is a tedious process): [WayBack] How do I encrypt/decrypt a file? | IT Pro.

Via:

–jeroen

Posted in Encryption, NTFS, Power User, Security, Windows

When storing huge files under NTFS compression, ensure you have twice the disk space

Posted by jpluimers on 2020/01/31

When copying a 400 gigabyte file over the network to an NTFS compressed folder on a drive with 600 gigabytes of free space, the volume became full after copying ~350 gigabytes.

What I learned is that compressing huge files for later read-only access is fine, but you need about twice the disk space while the copy operation is in progress.

For non-compressed files you can go without this extra reservation.
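A pre-flight check along these lines would have caught my case. It is only a sketch of the "twice the size" rule of thumb from this post, not a documented NTFS requirement; the function names and the default factor are my own assumptions:

```python
import os
import shutil

def enough_room_for_compressed_copy(file_size, free_bytes, factor=2):
    """Rule of thumb: free space should be at least factor times the file size."""
    return free_bytes >= factor * file_size

def check_copy(source_file, target_directory, factor=2):
    """Compare the size of source_file with the free space on the target volume."""
    file_size = os.path.getsize(source_file)
    free_bytes = shutil.disk_usage(target_directory).free
    return enough_room_for_compressed_copy(file_size, free_bytes, factor)

GB = 10**9
print(enough_room_for_compressed_copy(400 * GB, 600 * GB))  # False
```

With 400 gigabytes to copy and 600 gigabytes free, the check fails, matching what happened above.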

Background information:

Note there are also issues with NTFS compression and de-duplication. I’m not sure about sparse files. Be careful when you try to compress the system drive that your Windows OS lives on:

–jeroen

Posted in NTFS, Power User, Windows

 