If you’ve been using or developing on macOS for some time, you’ve probably heard that macOS uses the Apple File System (APFS). You’ve possibly also heard of its benefits over traditional filesystems, such as copy-on-write, cloning (where unchanged blocks can be shared between multiple files), and, the topic of this post, sparse files. Sparse files are files with blank sections, and Apple’s Virtualization.framework uses them for virtual machine disks instead of the Qcow2 format that QEMU uses. I’ve seen it stated[1] that APFS supports copying sparse files across volumes (which can be on entirely different disks), so I was wondering (and getting annoyed) about why UTM was copying ~70 GB of zeroes. This was somewhat of a worst-case scenario: it was a newly created VM with no actual data on the disk, so the copy could have taken a few seconds if implemented correctly.
As it turns out, there isn’t a whole lot of information out there about this form of copying. Finder (the macOS file manager) does it automatically, but while the Swift FileManager.copyItem(at:to:) method performs cloning (which saves space on the same volume), it doesn’t do sparse copies.
After going down some rabbit holes trying to modify the URLResourceValues of the source and/or destination file (which proved pointless, as the sparseness property is get-only), I stumbled upon a mirror[2] of Apple’s open source code for the copyfile(3) function. Since I found it through GitHub search, I landed directly on sparse_test.c, whose code looked promising, so I opened the man page for copyfile(3) and found that, indeed, it seemed to do what I needed.
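Since the sparseness resource value can only be read, the most you can do from that side is inspect a file. As a minimal sketch (not part of the original experiment), one way to see whether a file is stored sparsely is to compare its length with the space actually allocated for it, using stat(2). The FileFootprint type and the footprint(ofFileAtPath:) function are hypothetical names made up for this example, and the check is only a heuristic, since other filesystem features may also make the allocated size smaller than the length.

```swift
import Foundation

/// Logical length and on-disk allocation of a file, both in bytes.
/// (Hypothetical type for illustration only.)
struct FileFootprint {
    let logicalBytes: Int64
    let allocatedBytes: Int64

    /// Heuristic: less space allocated than the file's length suggests holes.
    var looksSparse: Bool { allocatedBytes < logicalBytes }
}

/// Reads both sizes with stat(2). Returns nil if the file cannot be inspected.
func footprint(ofFileAtPath path: String) -> FileFootprint? {
    var info = stat()
    guard stat(path, &info) == 0 else { return nil }
    // st_blocks is counted in 512-byte units, independent of the
    // filesystem's own block size.
    return FileFootprint(logicalBytes: Int64(info.st_size),
                         allocatedBytes: Int64(info.st_blocks) * 512)
}

// Inspect every path given on the command line.
for path in CommandLine.arguments.dropFirst() {
    if let fp = footprint(ofFileAtPath: path) {
        print("\(path): length \(fp.logicalBytes), allocated \(fp.allocatedBytes), looks sparse: \(fp.looksSparse)")
    } else {
        print("\(path): could not stat")
    }
}
```

Running this on both the source and the destination disk image after a copy should show whether the holes survived: a sparse copy keeps the allocated size well below the length, while a plain copy brings the two close together.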
Building the project and running it showed a test called sparse_recursive, which helped answer a question I had about copyfile: whether it would allow copying sparse files recursively (more specifically, copying sparse files inside directories). It seemed that it did, so with that confirmation I decided to try using copyfile from Swift to copy a directory with a UTM-like structure, with a sparse “disk image” of random data inside it.
Fortunately, it worked! I then also tried cloning the files (so that the non-sparse parts don’t have to be copied when the source and destination are on the same volume). That worked too, so I was able to use a single copyfile call with these four flags: COPYFILE_ALL | COPYFILE_RECURSIVE | COPYFILE_CLONE | COPYFILE_DATA_SPARSE.
Here’s the code I used. The first file creates the data, and the second does the sparse copying/cloning.
MakeSparse.swift:
import Foundation

do {
    let fm = FileManager.default
    // Create the bundle directory; this throws if it already exists.
    let dir = URL(fileURLWithPath: "VM.almostutm")
    try fm.createDirectory(at: dir, withIntermediateDirectories: false)
    let url = dir.appendingPathComponent("disk0.img")
    fm.createFile(atPath: url.path, contents: nil)
    let handle = try FileHandle(forWritingTo: url)
    // 10 MB of random bytes, repeated to form the non-sparse part of the image.
    let random = (0 ..< 10_000_000).map { _ in UInt8.random(in: UInt8.min ... UInt8.max) }
    let d1 = Data((0 ..< 10).map { _ in random }.flatMap { $0 }) // Change to either 10 or 100 based on desired size (100M or 1G)
    let d2 = Data([8,7,6,5,4,3,2,1])
    handle.write(d1)
    // Jump far past the end of the data before writing again, so the
    // range in between is never written.
    handle.seek(toFileOffset: UInt64(3_000_000_000))
    handle.write(d2)
    handle.closeFile()
    let config = "This is _definitely_ the configuration plist. No doubt about it :)"
    try config.write(to: dir.appendingPathComponent("config.plist"), atomically: true, encoding: .utf8)
} catch {
    print("Error: \(error)")
}
The comment in the code marks where to change the size of the generated file: make it smaller if you want to test sparse copying to another disk, or larger if you want to confirm that the near-instantaneous cloning actually happened, rather than just a really fast copy on the local SSD.
SparseCopy.swift:
import Darwin
import Foundation

let url = URL(fileURLWithPath: "VM.almostutm")
let otherUrl = URL(fileURLWithPath: "NewVM.almostutm")
// No copyfile state object is needed here, so pass nil for it.
let result = copyfile(url.path, otherUrl.path, nil, copyfile_flags_t(COPYFILE_ALL | COPYFILE_RECURSIVE | COPYFILE_CLONE | COPYFILE_DATA_SPARSE))
// copyfile(3) returns 0 on success and a negative value on failure.
print("Result: \(result)")
You can prefix NewVM.almostutm with the path to an external volume (/Volumes/.../) in order to test the cross-volume sparse copying.
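For something slightly more reusable than a bare call, here is a rough sketch of the same copy wrapped in a throwing function, so a failure surfaces as a Swift error instead of a bare return code. sparseCopy(from:to:) is a hypothetical name, not an Apple API, and the flag descriptions in the comments are a reading of the copyfile(3) man page rather than something verified exhaustively.

```swift
import Foundation

/// Hypothetical wrapper around copyfile(3); macOS only.
func sparseCopy(from source: URL, to destination: URL) throws {
    // COPYFILE_ALL:         copy the data plus metadata (permissions, ACLs, xattrs).
    // COPYFILE_RECURSIVE:   descend into directories.
    // COPYFILE_CLONE:       try to clone first, falling back to a regular copy.
    // COPYFILE_DATA_SPARSE: try to preserve holes when data has to be copied.
    let flags = copyfile_flags_t(COPYFILE_ALL | COPYFILE_RECURSIVE | COPYFILE_CLONE | COPYFILE_DATA_SPARSE)

    guard copyfile(source.path, destination.path, nil, flags) == 0 else {
        // copyfile reports the reason through errno; fall back to EIO
        // if the value does not map to a known POSIX error code.
        throw POSIXError(POSIXErrorCode(rawValue: errno) ?? .EIO)
    }
}

do {
    try sparseCopy(from: URL(fileURLWithPath: "VM.almostutm"),
                   to: URL(fileURLWithPath: "NewVM.almostutm"))
    print("Copy finished")
} catch {
    print("Copy failed: \(error)")
}
```

One thing to watch for: as far as the man page describes it, COPYFILE_CLONE implies COPYFILE_EXCL, so the call is expected to fail if the destination already exists.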
[1] https://eclecticlight.co/2023/01/02/inside-apfs-from-containers-to-clones/#:~:text=Duplicating%20or%20moving,retain%20their%20format.
[2] Only after writing this post did I think to check if it was the official Apple repository, and it wasn’t, but I’ve linked the official repository.