Hello and welcome to Module of the Week. Inspired by PyMOTW (and similar ideas), this blog series is meant to dig into some modules available when working with Rust. This will primarily cover the Standard Library but you'll probably see some popular Crates show up once in a while as well.

Each issue has a bit of a story (not to the length you'd find on some Recipe sites) to guide new users and make learning a bit more fun.

Enough said, let's get to std::fs!

The module

std::fs is the standard Rust module for interacting with a Filesystem. It handles a lot of the messy work when it comes to all the different ways that the underlying OS likes to handle files. Creating, Reading, Updating, Deleting, it's all there! Not every operation is supported by every OS, but this guide will try to remain as platform agnostic as possible.

Note

There is a lot of overlap with std::path in all things filesystem related, but that will be the next module discussed.

The story

Rudgal The Delirious is the local Ornithologist. Armed with his trusty camera he's been tasked by the local Government to catalog and identify all of the birds in his local area. As he has to identify not only the species but the individual birds he's ended up with quite a few photos!

Having coded in college (he didn't end up with the title The Delirious for nothing), he sits down with his favorite IDE, forgets about tabs and spaces and dives right into std::fs.

Rudgal keeps all of his files in /home/rudgal/birds, so he'll start with that.

const PHOTO_HOME: &str = "/home/rudgal/birds";

Listing all of the files in a directory

As an easy start, Rudgal wants to list all of the photos he's taken. Browsing through the documents, he comes across std::fs::read_dir.

"Aha! An Iterator. A most important tool in Ornithology!"

Rudgal shouts, to no one in particular.

std::fs::read_dir gives you an Iterator (aptly named ReadDir) of DirEntry objects. More specifically it's a io::Result of DirEntry, in case something goes wrong along the way.

"I'll start with the paths of the files."

use std::{fs, io};

let entries = fs::read_dir(PHOTO_HOME)?
    .map(|entry_res| entry_res.map(|entry| entry.path()))
    .collect::<Result<Vec<_>, io::Error>>()?;

println!("{:?}", entries);

Note

If you're not familiar with the ? syntax or Results in general, stay tuned for when we dive into the std::result module!

After running his program, he gets the following output:

["/home/rudgal/birds/winter2010","/home/rudgal/birds/savalirwood", "/home/rudgal/birds/DCIM_0001.jpg", "/home/rudgal/birds/DCIM_0002.jpg", ... "/home/rudgal/birds/DCIM_9999.jpg"]

Now he has a list of everything in the PHOTO_HOME directory. This includes both files and sub-directories.

Filtering files in a directory

Rudgal wants to sort out some of the entries from the previous section. Specifically he just wants to count the photos in the root of PHOTO_HOME. For that he finds his way to std::fs::metadata in the docs.

// See previous section for the definition of "entries"

let just_photos = entries.iter()
    .filter_map(|entry| Some((entry, fs::metadata(entry).ok()?)))
    // Only select files, filtering out any directories
    .filter(|(_, meta)| meta.is_file())
    .collect::<Vec<_>>();

println!("There are {} actual files.", just_photos.len());

Satisfied with his Vec of (path, metadata) pairs, Rudgal gave it a spin.

There are 9999 actual files.

"A perfectly sized collection of fauna!"

Getting the size of a file

Now that he had a list of just the photos in PHOTO_HOME, Rudgal wanted to know how much space all of the files were taking up. Luckily, std::fs::Metadata had something for just such an occasion!

let photo_size:u64 = just_photos.iter()
    .map(|(path, meta)| meta.len())
    .sum();

println!("Total size is {}", photo_size);
Total size is 127190000

"Maybe I should organize some of this."

Back to the docs, Rudgal!

Get all files in a directory

Rudgal quickly realized he had a problem. By filtering for just files from his original list, he's left out a number of directories left over from some previous half-hearted attempts to organize his photos.

There are two approaches that one could take when it comes to walking directories in Rust, so let's take a look at what they'd look like!

Get all files in a directory (Iteratively)

Rudgal wanted to start simple, to prove that he could defeat the mighty Directory beast!

use std::path::{Path, PathBuf};

fn iter_dirs(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut stack = vec![fs::read_dir(dir)?];
    let mut files = vec![];
    // Look out for our future dive into Vectors and their various uses!
    while let Some(dir) = stack.last_mut() {
        // Transpose says: Take that Option<Result> and turn it into a Result<Option>!
        match dir.next().transpose()? {
            None => {
                stack.pop();
            }
            // A Some! But only if it's the kind of Some we want
            Some(dir) if dir.file_type().map_or(false, |t| t.is_dir()) => {
                stack.push(fs::read_dir(dir.path())?);
            }
            Some(file) => files.push(file.path()),
        }
    }
    Ok(files)
}

...

let iter_files = iter_dirs(&PathBuf::from(PHOTO_HOME))?;
println!("The number of files (iteratively) is {}", iter_files.len());

There! It takes a bit of juggling what with the Results and the Options but he got there in the end. Bolstered by his (possibly unfounded) confidence he decided to get fancy with it.

Get all files in a directory (Recursively)

"Recursion! The cause of, and solution to, all of life's problems!"

Said Rudgal the Delirious.

Digging deep into his memory of his Algorithms classes, he made another attempt at digging through all of the directories:

use std::path::{Path, PathBuf};

fn recurse_dirs(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = vec![];
    if dir.is_dir() {
        let mut dir_files = vec![];
        for entry in fs::read_dir(dir)? {
            let entry = entry?;
            let path = entry.path();
            if path.is_dir() {
                files.push(recurse_dirs(&path)?);
            } else {
                dir_files.push(path);
            }
        }
        files.push(dir_files);
    }
    Ok(files.into_iter().flatten().collect())
}

...

let recurse_files = recurse_dirs(&PathBuf::from(PHOTO_HOME))?;
println!("The number of files (recursively) is {}", recurse_files.len());

It's directories all the way down! Let's see what Rudgal got as a result.

The number of files (recursively) is 24038

Much better! Now he can be sure he didn't miss any files. He's a bit of a shutterbug.

Warning

Rust does not (at the time of writing) specifically optimise Recursive operations like other languages might. In a lot of cases you can see up to three times performance improvements just by doing things iteratively.

Choose whichever pattern is appropriate for what is required.

Conclusion

This is the end of the first dive into the world of std::fs. If you'd like to try out the code used in this post, it's posted here.

Next week

Stay tuned for next week when we'll continue exploring std::fs, more specifically focusing on creating directories and moving files around.