Sending callbacks from C
Our application is counting bytes and characters like there's no tomorrow. But imagine you're writing a book, have one file per chapter, and want to count characters across them regularly.
Given a CSV file (Comma Separated Values), we want our application to run the calculation on each file in the list.
An example:
Filename: list.csv
chapter1.md,chapter2.md
Filename: chapter1.md
# Getting started
Filename: chapter2.md
# Wrapping up
Our program's command-line interface gets a new flag:
./count version
./count bytes list.csv [--csv-list]
./count characters list.csv [--csv-list]
For the given files we want to be able to run commands like this:
$ ./count bytes list.csv --csv-list
18 chapter1.md
14 chapter2.md
Adding a CSV module
This time around we'll start by writing the logic in Rust, and then add an FFI wrapper separately, with C types and attributes. The core of our module is a function that takes some string data, splits it on commas, and calls a callback once for each of the separated values:
Filename: src/modules/csv.rs
fn for_each_value(csv: &str, callback: impl Fn(&str)) {
    for value in csv.split(",") {
        callback(value.trim());
    }
}
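Because the callback is bounded by Fn rather than FnMut, a closure that wants to accumulate results has to go through interior mutability. The following is a minimal sketch of that, with for_each_value repeated so it compiles on its own; the collect_values helper is hypothetical and not part of the project:

```rust
use std::cell::RefCell;

// Copied from src/modules/csv.rs so the sketch is self-contained.
fn for_each_value(csv: &str, callback: impl Fn(&str)) {
    for value in csv.split(",") {
        callback(value.trim());
    }
}

// Hypothetical helper: gathers every trimmed value into a Vec.
// The callback is `Fn`, so the Vec is mutated through a RefCell.
fn collect_values(csv: &str) -> Vec<String> {
    let values = RefCell::new(Vec::new());
    for_each_value(csv, |value| values.borrow_mut().push(value.to_string()));
    values.into_inner()
}

fn main() {
    let values = collect_values("chapter1.md, chapter2.md\n");
    assert_eq!(values, vec!["chapter1.md", "chapter2.md"]);
}
```

Note that surrounding whitespace, including the trailing newline of the file, is removed by trim() before the callback sees the value.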
We proceed to add the C interface in a separate module at the beginning of the file:
Filename: src/modules/csv.rs
mod ffi {
    use std::ffi::{c_void, CStr, CString};
    use std::os::raw::c_char;

    #[no_mangle]
    pub extern "C" fn csv_for_each_value(
        csv: *const c_char,
        c_callback: unsafe extern "C" fn(*const c_char, *const c_void),
        context: *const c_void,
    ) {
        let csv = unsafe { CStr::from_ptr(csv) }.to_str().unwrap();
        super::for_each_value(csv, |value| {
            let value = CString::new(value).unwrap();
            unsafe { c_callback(value.as_ptr(), context) };
        });
    }
}

// --snip--
We have separated out the FFI-related type conversions from our logic. Notice that our exported wrapper function has the same name, but with the module name prefixed: csv_for_each_value().
The wrapper takes three parameters:
1. csv: *const c_char
The contents of a CSV file, as a char pointer. Just like earlier, we convert it from *const c_char to CStr to &str.
2. c_callback: unsafe extern "C" fn(*const c_char, *const c_void)
An external callback function that takes two arguments. The first is a char pointer to one value from our CSV, and the second is a c_void pointer. Since C doesn't have closures, a void pointer is a common way to let the caller pass arbitrary data or state through to the callback function.
3. context: *const c_void
The last parameter is a void pointer to the data we want to pass along to the callback.
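To see the three parameters working together without involving a C compiler, here is a sketch that drives the wrapper from Rust. It assumes the same wrapper body as above (with the export attribute left out), and the add_length callback and total_value_bytes helper are hypothetical stand-ins for what the C side will do later:

```rust
use std::cell::Cell;
use std::ffi::{c_void, CStr, CString};
use std::os::raw::c_char;

// Copied from the module above so the sketch compiles on its own.
fn for_each_value(csv: &str, callback: impl Fn(&str)) {
    for value in csv.split(",") {
        callback(value.trim());
    }
}

// Same body as the exported wrapper; the export attribute is left out
// because nothing outside this file links against it.
extern "C" fn csv_for_each_value(
    csv: *const c_char,
    c_callback: unsafe extern "C" fn(*const c_char, *const c_void),
    context: *const c_void,
) {
    let csv = unsafe { CStr::from_ptr(csv) }.to_str().unwrap();
    for_each_value(csv, |value| {
        let value = CString::new(value).unwrap();
        unsafe { c_callback(value.as_ptr(), context) };
    });
}

// Hypothetical callback standing in for a C function: it adds the byte
// length of each value to the counter that `context` points to.
unsafe extern "C" fn add_length(value: *const c_char, context: *const c_void) {
    let total = unsafe { &*(context as *const Cell<usize>) };
    let len = unsafe { CStr::from_ptr(value) }.to_bytes().len();
    total.set(total.get() + len);
}

// Hypothetical helper: sums the byte lengths of all values in `csv`.
fn total_value_bytes(csv: &str) -> usize {
    let csv = CString::new(csv).unwrap();
    let total = Cell::new(0usize);
    csv_for_each_value(
        csv.as_ptr(),
        add_length,
        &total as *const Cell<usize> as *const c_void,
    );
    total.get()
}

fn main() {
    assert_eq!(total_value_bytes("chapter1.md,chapter2.md"), 22);
}
```

The wrapper never looks inside context. Only the caller and the callback agree on what it points to, which is what makes the void pointer a workable substitute for a closure's captured state.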
Upon receiving each value in our closure, we pass it to CString::new(value).unwrap(). CString is to CStr what String is to &str: an owned version of a C string. But why are we creating an owned string when we want to pass along a reference?
Ideally, we would have liked to do the inverse of what we do on the receiving end, going from &str to CStr to *const c_char. But a C string has to be zero-terminated with a \0 at the end of the buffer, and a borrowed &str has no room for one. Rust therefore has to copy the data into a new buffer that it owns.
We then call value.as_ptr() to get a *const c_char pointer to our temporary zero-terminated string. The pointer is only valid while value is alive, so the callback must not hold on to it after returning.
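The difference between the borrowed and the owned representation is easy to check. A small sketch, where to_c_bytes is a hypothetical helper used only for illustration:

```rust
use std::ffi::{CStr, CString};

// Hypothetical helper: returns the bytes a C function would see,
// including the terminating zero byte that CString appends.
fn to_c_bytes(value: &str) -> Vec<u8> {
    CString::new(value).unwrap().into_bytes_with_nul()
}

fn main() {
    let value = "chapter1.md";

    // A &str knows its length and carries no terminator.
    assert_eq!(value.len(), 11);

    // The owned C string is one byte longer and ends in zero.
    let bytes = to_c_bytes(value);
    assert_eq!(bytes.len(), 12);
    assert_eq!(bytes.last(), Some(&0u8));

    // A zero byte inside the data cannot be represented, so it is rejected.
    assert!(CString::new("a\0b").is_err());

    // Keep the CString alive for as long as the raw pointer is in use.
    let owned = CString::new(value).unwrap();
    let ptr = owned.as_ptr();
    let round_trip = unsafe { CStr::from_ptr(ptr) }.to_str().unwrap();
    assert_eq!(round_trip, value);
}
```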
Parsing the new command line argument
In our library's entry point, we need to pull in our new module at the beginning of the file, and extend the argument parsing to look for and validate our new --csv-list flag:
Filename: src/lib.rs
mod modules {
    mod csv;
}

// --snip--

#[repr(C)]
pub struct Arguments {
    command: Command,
    filename: *const c_char,
    file_mode: FileMode,
}

/// cbindgen:prefix-with-name
#[repr(C)]
pub enum FileMode {
    Normal,
    CsvList,
}

// --snip--

pub extern "C" fn parse_args(argc: usize, argv: *const *const c_char) -> Arguments {
    // --snip--

    let file_mode = if let Some(csv_flag) = arguments.get(3).copied() {
        let csv_flag = unsafe { CStr::from_ptr(csv_flag) }.to_str().unwrap();
        match csv_flag {
            "--csv-list" => FileMode::CsvList,
            _ => panic!("CSV flag not recognized: {csv_flag}")
        }
    } else {
        FileMode::Normal
    };

    Arguments { command, filename, file_mode }
}
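The mapping from flag to FileMode needs no raw pointers, so one possible variation is to pull it out into a plain function that can be tested in isolation. This is only a sketch: file_mode_from_flag is a hypothetical name, the derives are added for the assertions, and returning a Result instead of panicking is a design choice the chapter's code does not make:

```rust
// Mirrors the enum from src/lib.rs; Debug and PartialEq are added
// here only so the values can be compared in assertions.
#[derive(Debug, PartialEq)]
#[repr(C)]
pub enum FileMode {
    Normal,
    CsvList,
}

// Hypothetical helper: `flag` is the optional fourth command line
// argument, already converted to a &str by the caller.
fn file_mode_from_flag(flag: Option<&str>) -> Result<FileMode, String> {
    match flag {
        None => Ok(FileMode::Normal),
        Some("--csv-list") => Ok(FileMode::CsvList),
        Some(other) => Err(format!("CSV flag not recognized: {other}")),
    }
}

fn main() {
    assert_eq!(file_mode_from_flag(None), Ok(FileMode::Normal));
    assert_eq!(file_mode_from_flag(Some("--csv-list")), Ok(FileMode::CsvList));
    assert!(file_mode_from_flag(Some("--tsv-list")).is_err());
}
```

With a helper like this, parse_args would only be responsible for the unsafe pointer conversion and for deciding what to do with an error.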
We should rebuild the C bindings every time this new file changes:
Filename: build.rs
// --snip--

fn main() {
    println!("cargo:rerun-if-changed=src/lib.rs");
    println!("cargo:rerun-if-changed=src/modules/csv.rs");
    // --snip--
}

// --snip--
And let's not forget to add it as a dependency of our CMake config:
Filename: CMakeLists.txt
# --snip--
set(
    RUST_LIB_SOURCES
    ${CMAKE_SOURCE_DIR}/build.rs
    ${CMAKE_SOURCE_DIR}/src/lib.rs
    ${CMAKE_SOURCE_DIR}/src/modules/csv.rs
)
# --snip--
Re-wiring main.c
We also have to adapt our entry point to the new realities. First, we change run_command_for_file so that we'll be able to use it as a callback. We flip around the two parameters it takes, and substitute Command for a CommandContext, which is the state we will soon pass around as a void pointer:
Filename: src/main.c
// --snip--
typedef struct CommandContext {
    Command command;
} CommandContext;
void run_command_for_file(const char* filename, const void* ctx_ptr);
// --snip--
void run_command_for_file(const char* filename, const void* ctx_ptr) {
    const CommandContext* ctx = (CommandContext*) ctx_ptr;
    File file = file_read(filename);
    char* str = file_to_string(file);
    const uint64_t result = do_calculation(ctx->command, str);
    print_result(result);
    file_free_string(str);
    file_free(file);
}
// --snip--
We also have to rewrite the main() function to respect our new file_mode property. If we have FileMode_Normal, we just wrap the command in a CommandContext, and call run_command_for_file the same way we always did.
If we have FileMode_CsvList, we read the contents of the CSV file into a string, and pass it on to the Rust-defined csv_for_each_value().
// --snip--
int main(const int argc, const char *argv[]) {
    const Arguments args = parse_args(argc, argv);
    if (args.command == Command_Version) {
        print_version();
        return 0;
    }
    switch (args.file_mode) {
        case FileMode_Normal: {
            CommandContext ctx = { .command = args.command };
            run_command_for_file(args.filename, &ctx);
            break;
        }
        case FileMode_CsvList: {
            char* csv = file_to_string(file_read(args.filename));
            CommandContext ctx = { .command = args.command };
            csv_for_each_value(csv, run_command_for_file, &ctx);
            file_free_string(csv);
            break;
        }
    }
    return 0;
}
// --snip--
Let's test what we've got so far:
$ cmake ..
$ cmake --build .
$ echo "# Getting started" > chapter1.md
$ echo "# Wrapping up" > chapter2.md
$ echo "chapter1.md,chapter2.md" > list.csv
$ ./count characters list.csv --csv-list
18
14
While we do get the count for each of the files, it's not very easy to see which count is for which file.
As a finishing touch, we'll add the filename for each count, if we are in CSV-mode:
// --snip--
typedef struct CommandContext {
    Command command;
    bool print_filename;
} CommandContext;
// --snip--
void print_result_with_filename(uint64_t result, const char* filename);
// --snip--
int main(const int argc, const char *argv[]) {
    // --snip--
    switch (args.file_mode) {
        case FileMode_Normal: {
            CommandContext ctx = { .command = args.command, .print_filename = false };
            // --snip--
        }
        case FileMode_CsvList: {
            char* csv = file_to_string(file_read(args.filename));
            CommandContext ctx = { .command = args.command, .print_filename = true };
            // --snip--
        }
    }
    // --snip--
}
// --snip--
void run_command_for_file(const char* filename, const void* ctx_ptr) {
    // --snip--
    const uint64_t result = do_calculation(ctx->command, str);
    if (ctx->print_filename) {
        print_result_with_filename(result, filename);
    } else {
        print_result(result);
    }
    // --snip--
}
// --snip--
void print_result_with_filename(const uint64_t result, const char* filename) {
    // Count first, then the filename, matching the output shown in this section.
    printf("%llu %s\n", (unsigned long long) result, filename);
}
// --snip--
Our void pointer lets us add a new property to the CommandContext without touching any code on the Rust side.
We also have to add a print_result_with_filename() function, and selectively execute it in run_command_for_file().
A final test is in order:
$ cmake --build .
$ ./count characters list.csv --csv-list
18 chapter1.md
14 chapter2.md
This output is much easier to digest.
In this section we have shown how you can pass control from C to Rust and back again. In the next one, we will see how we can pass heap-allocated data from Rust to C.