windows – What does SetFileValidData doing ? what is the difference with SetEndOfFile? – Stack Overflow
Posted by jpluimers on 2024/08/21
While researching how to allocate space for empty Windows files, I bumped into this: [Wayback/Archive] windows – What does SetFileValidData doing ? what is the difference with SetEndOfFile? – Stack Overflow.
Interesting but dangerous: SetFileValidData allows setting the end of the “valid” file data to a point into the file without Windows pretending the content was zero-filled.
The big important thing here (a drawback for security, a blessing for adversaries): the file will incorporate data that was on disk before it got incorporated into the file, potentially leaking deleted data.
That’s why SetFileValidData requires at least the SE_MANAGE_VOLUME_NAME privilege.
QA content and salvaged/archived related links:
Q (thanks [Wayback/Archive] wenxibo)
I look for a way to extend a file asynchronously and efficiently .
In a support document Asynchronous Disk I/O Appears as Synchronous on Windows NT, Windows 2000, and Windows XP said:
NOTE: Applications can make the previously mentioned write operation asynchronous by changing the Valid Data Length of the file by using the SetFileValidData function, and then issuing a WriteFile.
in MSDN, SetFileValidData is a function for Sets the valid data length of the specified file. But I still not understand what is the “valid data”, what is the difference between it and the size of file?
I can use SetFilePointerEx and SetEndOfFile to extend the file size, but how do this by SetFileValidData?
SetFileValidData cannot input a argument large than the size of file. In this case, what is the living meaning of SetFileValidData?
C (thanks [Wayback/Archive] arx)
The documentation for SetEndOfFile explains the difference.
[Wayback/Archive] SetEndOfFile function -> [Wayback/Archive] SetFileValidData function (fileapi.h) – Win32 apps | Microsoft Learn
A (thanks [Wayback/Archive] Harry Johnston)
… you can use SetEndOfFile to make a file very large very quickly, and if you read from the new part of the file you’ll just get zeros. The valid data length increases when you write actual data to the new part of the file.
That’s fine if you just want to reserve space, and will then be writing data to the file sequentially. But if you make the file very large and immediately write data near the end of it, zeros need to be written to the new part of the file, which will take a long time. If you don’t actually need the file to contain zeros, you can use SetFileValidData to skip this step; the new part of the file will then contain random data from previously deleted files.
Addendum:
- The rules for sparse files are different.
- You should not use SetFileValidData on a file that non-privileged users have read access to; this could leak content from deleted files that belonged to other users.
A (thanks [Wayback/Archive] Melonhead)
Please note that SetEndOfFile() doesn’t write any zeros to any allocated sectors on disk, it just allocates the space pointers inside MFT records and then updates the space bitmap of the whole file system. But the OS, or FS, will record the valid/logical file length in its MFT record.
If you enlarge the file, from 1GB to 2GB, then the appended 1GB should be all zeros, but the FS won’t write the zeros to disks, it refers to this file’s valid length to know that the 1GB should be zeros. If you try to read from this enlarged 1GB portion, it will fill zeros directly in RAM and then feedback to your application. But if you write any byte inside this 1GB portion, the FS has to fill with zeros from the original 1GB offset to the current pointer that your application is trying to write to, but not the other bytes from the current location to the tail of the file. Meanwhile, it records the valid/logical length to be from 0 to the current location, the physical size and allocated size is still 2GB.
But, if you use SetFileValidData(), the FS will set the valid length to 2GB directly, and won’t bother to fill any zeros. Whatever location you write to, it just writes, but whatever location you read from, you may read out some garbage data which was previously generated by other applications before the file was extended into that disk space.
A (thanks [Wayback/Archive] robbie fan)
Agree with Harry Johnston’s answer, and from the practice point of view, while SetFileValidData has performance advantage because it does not require writing zeros, it does have security implications because the file might contain data from other deleted files. So a special privilege, SE_MANAGE_VOLUME_NAME, is required, as MSDN mentioned: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365544(v=vs.85).aspx
The reason is that, if the user account of the running program doesn’t have that privilege, using SetFileValidData can expose other user’s deleted data into the view of that particular file, so normal users (non-administrators) are not allowed to do that. Even for privileged users, they still need to take care to use ACL (access control lists) in the file system to protect that file so that it is not shared with non-privileged users.
C (thanks [Wayback/Archive] user2014859 and Harry Johnston)
Referring to flexhex, when use SetEndOfFile to extend a file, the operating system may just appends the file with sparse zeros, especially if length of the appendix is large, but not really allocated space or automatically written zeros. The file system just keeps any sparse holes in mind and logically returns zeros when read requests cover these sparse holes.
…
Windows supports sparse files, but it doesn’t happen automatically, the programmer has to specifically request it. For non-sparse files, SetEndOfFile allocates space but does not zero the content. For sparse files, SetEndOfFile does not allocate space.
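To make the bookkeeping from the answers above concrete, here is a toy model in Python. It is only a sketch: the `ToyFile` class and its method names are made up for illustration and do not call any Windows API. It assumes a non-sparse file and counts how many bytes would need zero-filling, rather than touching a disk. It also assumes the valid data length may be set up to and including the file size, as in the 2GB example above.

```python
from dataclasses import dataclass


@dataclass
class ToyFile:
    """Toy model of one non-sparse file stream (illustrative, not a Windows API)."""

    file_size: int = 0          # end of file, as moved by SetEndOfFile
    valid_data_length: int = 0  # VDL: data below this offset was really written
    bytes_zeroed: int = 0       # zero-fill work the file system had to do so far

    def set_end_of_file(self, new_size: int) -> None:
        # Extending only moves the end of file; nothing is zeroed yet.
        # Truncating below the VDL pulls the VDL down as well.
        self.file_size = new_size
        self.valid_data_length = min(self.valid_data_length, new_size)

    def set_file_valid_data(self, new_vdl: int) -> None:
        # Moves the VDL forward without zeroing: the skipped range keeps
        # whatever was on disk before (assumed upper bound: the file size).
        if not self.valid_data_length < new_vdl <= self.file_size:
            raise ValueError("new VDL must exceed the current VDL and fit in the file")
        self.valid_data_length = new_vdl

    def write(self, offset: int, length: int) -> None:
        end = offset + length
        self.file_size = max(self.file_size, end)
        if offset > self.valid_data_length:
            # The gap between the old VDL and the start of the write is zeroed first.
            self.bytes_zeroed += offset - self.valid_data_length
        self.valid_data_length = max(self.valid_data_length, end)

    def read_returns_zeros(self, offset: int) -> bool:
        # Reads between the VDL and the end of file are answered with zeros.
        return self.valid_data_length <= offset < self.file_size


GIB = 1 << 30

# SetEndOfFile, then write the very last byte: almost 2 GiB gets zeroed.
slow = ToyFile()
slow.set_end_of_file(2 * GIB)
slow.write(2 * GIB - 1, 1)

# Same, but with SetFileValidData in between: nothing gets zeroed.
fast = ToyFile()
fast.set_end_of_file(2 * GIB)
fast.set_file_valid_data(2 * GIB)
fast.write(2 * GIB - 1, 1)

print(slow.bytes_zeroed, fast.bytes_zeroed)  # 2147483647 0
```

The second case is the fast path, and also the dangerous one: in the model nothing between offset 0 and the last byte was ever written, yet all of it now counts as valid data.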
The flexhex domain went down, but the link has been archived as well as the tools it refers to:
- [Wayback/Archive] SetFileValidData function (fileapi.h) – Win32 apps | Microsoft Learn
…
The SetFileValidData function sets the logical end of a file. To set the size of a file, use the SetEndOfFile function. The physical file size is also referred to as the end of the file.
Each file stream has the following properties:
- File size: the size of the data in a file, to the byte.
- Allocation size: the size of the space that is allocated for a file on a disk, which is always an even multiple of the cluster size.
- Valid data length: the length of the data in a file that is actually written, to the byte. This value is always less than or equal to the file size.
Typically, the SetFileValidData function is used by system-level applications on their own private data. Not all file systems use valid data length. Some file systems can track multiple valid data ranges. In general, most applications will never need to call this function.
The SetFileValidData function allows you to avoid filling data with zeros when writing nonsequentially to a file. The function makes the data in the file valid without writing to the file. As a result, although some performance gain may be realized, existing data on disk from previously existing files can inadvertently become available to unintended readers. The following paragraphs provide a more detailed description of this potential security and privacy issue.
A caller must have the SE_MANAGE_VOLUME_NAME privilege enabled when opening a file initially. Applications should call SetFileValidData only on files that restrict access to those entities that have SE_MANAGE_VOLUME_NAME access. The application must ensure that the unwritten ranges of the file are never exposed, or security issues can result as follows.
If SetFileValidData is used on a file, the potential performance gain is obtained by not filling the allocated clusters for the file with zeros. Therefore, reading from the file will return whatever the allocated clusters contain, potentially content from other users. This is not necessarily a security issue at this point, because the caller needs to have SE_MANAGE_VOLUME_NAME privilege for SetFileValidData to succeed, and all data on disk can be read by such users. However, this caller can inadvertently expose this data to other users that cannot acquire the SE_MANAGE_VOLUME_PRIVILEGE privilege if the following holds:
- If the file was not opened with a sharing mode that denies other readers, a nonprivileged user can open it and read the exposed data.
- If the system stops responding before the caller finishes writing up the ValidDataLength supplied in the call, then, on a reboot, such a nonprivileged user can open the file and read exposed content.
If the caller of SetFileValidData opened the file with adequately restrictive access control, the previous conditions would not apply. However, for partially written files extended with SetFileValidData (that is, writing was not completed up to the ValidDataLength supplied in the call) there exists yet another potential privacy or security vulnerability. An administrator could copy the file to a target that is not properly controlled with restrictive ACL permissions, thus inadvertently exposing the extended area’s data to unauthorized reading.
It is for these reasons that SetFileValidData is not recommended for general purpose use, in addition to performance considerations, as discussed below.
For more information about security and access privileges, see Running with Special Privileges and File Security and Access Rights.
You can use the SetFileValidData function to create large files in very specific circumstances so that the performance of subsequent file I/O can be better than other methods. Specifically, if the extended portion of the file is large and will be written to randomly, such as in a database type of application, the time it takes to extend and write to the file will be faster than using SetEndOfFile and writing randomly. In most other situations, there is usually no performance gain to using SetFileValidData, and sometimes there can be a performance penalty.
In Windows 8 and Windows Server 2012, this function is supported by the following technologies.
Technology: Supported
- Server Message Block (SMB) 3.0 protocol: Yes
- SMB 3.0 Transparent Failover (TFO): Yes
- SMB 3.0 with Scale-out File Shares (SO): Yes
- Cluster Shared Volume File System (CsvFS): Yes
- Resilient File System (ReFS): Yes
…
- [Wayback/Archive] SetEndOfFile function (fileapi.h) – Win32 apps | Microsoft Learn
…
The SetEndOfFile function can be used to truncate or extend a file. If the file is extended, the contents of the file between the old end of the file and the new end of the file are not defined.
Each file stream has the following:
- File size: the size of the data in a file, to the byte.
- Allocation size: the size of the space that is allocated for a file on a disk, which is always an even multiple of the cluster size.
- Valid data length: the length of the data in a file that is actually written, to the byte. This value is always less than or equal to the file size.
The SetEndOfFile function sets the file size. Use SetFileValidData to set the valid data length.
If CreateFileMapping is called to create a file mapping object for hFile, UnmapViewOfFile must be called first to unmap all views and call CloseHandle to close the file mapping object before you can call SetEndOfFile.
…
- [Wayback/Archive] Why does my single-byte write take forever? – The Old New Thing
…
on NTFS, extending a file reserves disk space but does not zero out the data. Instead, NTFS keeps track of the “last byte written”, technically known as the valid data length, and only zeroes out up to that point. The data past the valid data length are logically zero but are not physically zero on disk. When you write to a point past the current valid data length, all the bytes between the valid data length and the start of your write need to be zeroed out before the new valid data length can be set to the end of your write operation. (You can manipulate the valid data length directly with the SetFileValidData function, but be very careful since it comes with serious security implications.)
Two solutions were proposed to the customer.
Option 1 is to force the file to be zeroed out immediately after setting the end of file by writing a zero byte to the end. This front-loads the cost so that it doesn’t get imposed on subsequent writes at seemingly random points.
Option 2 is to make the file sparse. Mark the file as sparse with the FSCTL_SET_SPARSE control code, and immediately after setting the end of file, use the FSCTL_SET_ZERO_DATA control code to make the entire file sparse. This logically fills the file with zeroes without committing physical disk space. Anywhere you actually write gets converted from “sparse” to “real”. This does open the possibility that a later write into the middle of the file will encounter a disk-full error, so it’s not a “just do this and you won’t have to worry about anything” solution, and depending on how randomly you convert the file from “sparse” to “real”, the file may end up more fragmented than it would have been if you had “kept it real” the whole time.
- [Wayback/Archive] When you create an object with constraints, you have to make sure everybody who uses the object understands those constraints – The Old New Thing
As the documentation for CreateFile notes, the FILE_FLAG_NO_BUFFERING flag requires that all I/O operations on the file handle be in multiples of the sector size, and that the I/O buffers also be aligned on addresses which are multiples of the sector size.
- [Wayback/Archive] Sparse Files – Win32 apps | Microsoft Learn
…
When sparse file functionality is enabled, the system does not allocate hard disk drive space to a file except in regions where it contains nonzero data. When a write operation is attempted where a large amount of the data in the buffer is zeros, the zeros are not written to the file. Instead, the file system creates an internal list containing the locations of the zeros in the file, and this list is consulted during all read operations. When a read operation is performed in areas of the file where zeros were located, the file system returns the appropriate number of zeros in the buffer allocated for the read operation. In this way, maintenance of the sparse file is transparent to all processes that access it, and is more efficient than compression for this particular scenario.
…
- [Wayback/Archive] FSCTL_SET_SPARSE – Win32 apps | Microsoft Learn
Marks the indicated file as sparse or not sparse. In a sparse file, large ranges of zeros may not require disk allocation. Space for nonzero data will be allocated as needed as the file is written.
…
- [Wayback/Archive] FSCTL_SET_ZERO_DATA – Win32 apps | Microsoft Learn
Fills a specified range of a file with zeros (0). If the file is sparse or compressed, the NTFS file system may deallocate disk space in the file. This sets the range of bytes to zeros (0) without extending the file size.
- [Wayback/Archive] winapi – Allocate file on NTFS without zeroing – Stack Overflow
Q (thanks [Wayback/Archive] basin)
I want to make a tool similar to zerofree for linux. I want to do it by allocating a big file without zeroing it, look for nonzero blocks and rewrite them.
With admin privileges it is possible, uTorrent can do this: http://www.netcheif.com/Articles/uTorrent/html/AppendixA_02_12.html#diskio.no_zero , but it’s closed source.
- [Wayback/Archive] Ubuntu Manpage: zerofree — zero free blocks from ext2/3 file-systems -> [Wayback/Archive] Ubuntu Manpage: zerofree — zero free blocks from ext2, ext3 and ext4 file-systems
- [Wayback/Archive] µTorrent User Manual > Appendix A: The µTorrent Interface > Preferences: Preferences: Advanced
C (thanks [Wayback/Archive] Raymond Chen)
The function you want is SetFileValidData.
A (thanks [Wayback/Archive] basin)
Wrote a tool https://github.com/basinilya/winzerofree . It uses SetFileValidData() as @RaymondChen suggested
[Wayback/Archive] GitHub – basinilya/winzerofree
Wipe the free disk space on Windows. Inspired by zerofree at http://intgat.tigress.co.uk/rmy/uml/sparsify.html Admin required. No need to unmount - uses OS API. Supports XP/2003 or later, NTFS, FAT32
A (thanks [Wayback/Archive] mox)
I am not sure this answers your question (need), but such a tool already exists. You might have a look at fsutil.exe Fsutil command line tool. This tool has a huge potential to discover the internal structures of NTFS files and can also create file of any size (without the need to zeroing it manually). Hope that helps.
[Wayback/Archive] Fsutil file -> [Wayback/Archive] Fsutil | Microsoft Learn + [Wayback/Archive] Fsutil file | Microsoft Learn
fsutil file [setvaliddata] <FileName> <DataLength>
…
fsutil file [setzerodata] offset=<Offset> length=<Length> <FileName>
…
- In NTFS, there are two important concepts of file length: the end-of-file (EOF) marker and the Valid Data Length (VDL). The EOF indicates the actual length of the file. The VDL identifies the length of valid data on disk. Any reads between VDL and EOF automatically return 0 to preserve the C2 object reuse requirement.
- The setvaliddata parameter is only available for administrators because it requires the Perform volume maintenance tasks (SeManageVolumePrivilege) privilege. This feature is only required for advanced multimedia and system area network scenarios. The setvaliddata parameter must be a positive value that is greater than the current VDL, but less than the current file size.
It is useful for programs to set a VDL when:
- Writing raw clusters directly to disk through a hardware channel. This allows the program to inform the file system that this range contains valid data that can be returned to the user.
- Creating large files when performance is an issue. This avoids the time it takes to fill the file with zeroes when the file is created or extended.
…
To set the valid data length to 4096 bytes for a file named Testfile.txt on an NTFS volume, type:
fsutil file setvaliddata c:\testfile.txt 4096
To set a range of a file on an NTFS volume to zeros to empty it, type:
fsutil file setzerodata offset=100 length=150 c:\temp\sample.txt
…
C (thanks [Wayback/Archive] basin)
Ah, there’s another command: fsutil file setvaliddata E:\bigfile 9001 works.
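For scripting this, a minimal Python sketch that wraps the fsutil command line quoted above might look like the following. The function names `build_setvaliddata_command` and `set_valid_data` are made up for illustration. The sketch assumes fsutil is on the PATH and that the script runs from an elevated prompt. Because the current valid data length is not easy to query from Python, it only checks that the requested length is positive and does not exceed the file size.

```python
import os
import subprocess
import sys


def build_setvaliddata_command(path: str, length: int) -> list:
    """Return the argument list for: fsutil file setvaliddata <FileName> <DataLength>."""
    if length <= 0:
        raise ValueError("DataLength must be a positive value")
    return ["fsutil", "file", "setvaliddata", path, str(length)]


def set_valid_data(path: str, length: int) -> None:
    """Run fsutil on Windows; raises CalledProcessError if fsutil reports a failure."""
    if sys.platform != "win32":
        raise OSError("fsutil is only available on Windows")
    if length > os.path.getsize(path):
        raise ValueError("DataLength must not exceed the current file size")
    subprocess.run(build_setvaliddata_command(path, length), check=True)


# Shows the command from the comment above without running it.
print(build_setvaliddata_command(r"E:\bigfile", 9001))
```

Building the argument list separately from running it allows inspecting the command on any platform. The same warning applies as for SetFileValidData itself: everything below the new valid data length becomes readable as-is.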
--jeroen





