Aotokitsuruya
Senior Software Developer

Deep into Magnus to Write Rust Extension for Ruby

This article was translated by AI. If you have any corrections, please let me know.

Recently, due to work-related reasons, I took a moment to revisit Open Policy Agent and discovered that the Cedar Language introduced by AWS is more suitable for implementing a policy mechanism similar to AWS IAM in software applications. Since it is built on Rust, I decided to try writing an extension in Rust so that it can be utilized in Ruby.

Magnus

magnus is currently the go-to choice for writing Ruby extensions in Rust. The gem template that Bundler generates for Rust extensions uses magnus as its foundation.

bundle gem --ext=rust demo

The generated demo/ directory will include not only the lib/ directory that is standard for Ruby gems but also an ext/ directory for storing Rust code. The extent to which Rust is used depends on the functionality, and sometimes, being too reliant on Rust can be a drawback.

The magnus API closely follows CRuby's C API, so it will feel familiar if you have written C extensions before. The default template defines a singleton method on the Demo module that Ruby code can call as Demo.hello("Aotoki").

use magnus::{function, prelude::*, Error, Ruby};

fn hello(subject: String) -> String {
    format!("Hello from Rust, {subject}!")
}

#[magnus::init]
fn init(ruby: &Ruby) -> Result<(), Error> {
    let module = ruby.define_module("Demo")?;
    module.define_singleton_method("hello", function!(hello, 1))?;
    Ok(())
}

Thanks to many convenient designs, writing extensions with Rust is not overly complex. Typically, we only need to use define_method to define the methods we need.

Objects

To enable Ruby to use Rust functionalities, the simplest approach is through define_method or define_singleton_method. However, since Ruby is still an object-oriented language, we generally base our development on objects, which means we need to make Ruby aware of the existence of a specific Rust object.

In Rust, there is no concept of classes; it primarily uses structs and traits implemented for a struct. In this context, we can consider them equivalent to Ruby objects.

magnus utilizes Rust’s macro system to handle a significant amount of Ruby object definition-related implementations and behaviors for us, so we can define a Ruby object in a straightforward manner.

#[magnus::wrap(class = "Demo::User")]
pub struct RUser {
    name: String,
}

// ...
#[magnus::init]
fn init(ruby: &Ruby) -> Result<(), Error> {
    let module = ruby.define_module("Demo")?;
    module.define_class("User", ruby.class_object())?;
    Ok(())
}

First, we use the magnus::wrap macro, which has magnus implement the traits needed to convert the Rust struct into a Ruby Value and associates the struct with the Ruby class named Demo::User.

However, the macro does not create Demo::User on the Ruby side, so we still need to define the Demo module and the User class in the function marked with magnus::init, where the Rust extension is initialized, before later code can use them.

Furthermore, Ruby cannot directly know that RUser in Rust requires memory for a name: String, so we need to define the .new method for Ruby, which corresponds to Demo::User.new.

// ...
impl RUser {
    fn new(name: String) -> Self {
        Self { name }
    }
}

// ...
#[magnus::init]
fn init(ruby: &Ruby) -> Result<(), Error> {
    let module = ruby.define_module("Demo")?;
    // Keep the returned class so methods can be defined on it.
    let class = module.define_class("User", ruby.class_object())?;
    class.define_singleton_method("new", function!(RUser::new, 1))?;
    Ok(())
}

In the example above, we define Demo::User.new, replacing the .new that Ruby provides with our Rust function. The function! macro from magnus automatically converts the Ruby Value into a Rust String and passes it in, so we can use it to initialize the Rust struct.

This is the simpler approach. In Ruby, object creation is a two-step process: .new first calls .allocate to allocate memory and then invokes the object's #initialize method. Because the memory layout of our object is defined in Rust, defining .new ourselves replaces that default sequence, including the .allocate call.

Methods

Once we can successfully use the Demo::User object in Ruby, we need to perform some additional processing to access the data stored in Rust. At this point, we will need to define some methods on the object.

// ...
impl RUser {
    fn new(name: String) -> Self {
        Self { name }
    }

    fn name(&self) -> String {
        self.name.clone()
    }
}

#[magnus::init]
fn init(ruby: &Ruby) -> Result<(), Error> {
    let module = ruby.define_module("Demo")?;
    let class = module.define_class("User", ruby.class_object())?;
    class.define_singleton_method("new", function!(RUser::new, 1))?;
    class.define_method("name", method!(RUser::name, 0))?;
    Ok(())
}

Similar to how we implemented .new, we add a name() method and define the #name method on Demo::User using define_method.

Returning name as a &str would borrow from the struct and run into lifetime issues, so we usually call clone and return an owned copy instead.

The difference between the method! macro and the function! macro is that the former passes the receiver as &self, a reference to the RUser struct. The first parameter must therefore be &self, which mirrors the difference between instance methods and singleton (class) methods in Ruby.
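The receiver distinction can be seen in plain Rust, without magnus. A method declared with &self is an ordinary function whose first argument is the receiver, which is roughly why a macro wrapping it has to supply that receiver. In this minimal sketch, the User struct is a local stand-in for RUser, name_both_ways is a hypothetical helper, and nothing calls into magnus:

```rust
struct User {
    name: String,
}

impl User {
    // Associated function: no receiver, called on the type itself.
    fn new(name: String) -> Self {
        Self { name }
    }

    // Method: the receiver is the first parameter.
    fn name(&self) -> String {
        self.name.clone()
    }
}

// Calls the same method twice: once with method-call syntax and once
// through its path, which makes the hidden first argument explicit.
fn name_both_ways(user: &User) -> (String, String) {
    (user.name(), User::name(user))
}
```

Both elements of the returned tuple are the same string, since user.name() is shorthand for User::name(user).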

These are relatively basic usage scenarios. Since magnus has already handled many implicit processes for us, data types like String can be easily converted. However, in more complex situations, handling may not be as straightforward.

TryConvert & IntoValue

To facilitate communication between Ruby and Rust, magnus has designed two traits to manage this behavior, enabling Ruby Value to be converted into a specific Rust struct or vice versa.

In the current design of CRuby, every Ruby object is referred to through a VALUE, a pointer-sized handle that usually points to the underlying structure, such as a class (RClass) or a string (RString). When we call a Ruby method, we pass VALUEs, and the method itself interprets them.

So why is there no error when we use Rust's String directly while defining objects and methods? Because magnus already implements the TryConvert and IntoValue traits for String.

// TryConvert for String
impl TryConvert for String {
    #[inline]
    fn try_convert(val: Value) -> Result<Self, Error> {
        debug_assert_value!(val);
        RString::try_convert(val)?.to_string()
    }
}

// TryConvert for RString
impl TryConvert for RString {
    fn try_convert(val: Value) -> Result<Self, Error> {
        match Self::from_value(val) {
            Some(i) => Ok(i),
            None => protect(|| {
                debug_assert_value!(val);
                unsafe { Self::from_rb_value_unchecked(rb_str_to_str(val.as_rb_value())) }
            }),
        }
    }
}

At this point, when we use the function! or method! macro, it calls TryConvert::try_convert, converting Value -> RString -> String automatically and producing a value that Rust can use directly.

Conversely, the IntoValue trait handles the reverse. When we need a Value, it will convert from String -> RString -> Value, providing a Ruby string.
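To make the two-hop chain concrete, here is a toy model in plain Rust. Everything in it (the Value enum, the RString newtype, the ConvError type, and both traits) is a hypothetical stand-in written for this sketch. None of it is magnus's real implementation, which operates on raw VALUE handles.

```rust
// Toy stand-ins for illustration only; these are NOT the magnus types.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i64),
}

#[derive(Debug, PartialEq)]
struct ConvError(String);

// A checked handle: a Value that is known to hold a string.
#[derive(Debug, Clone, PartialEq)]
struct RString(String);

trait TryConvert: Sized {
    fn try_convert(val: Value) -> Result<Self, ConvError>;
}

trait IntoValue {
    fn into_value(self) -> Value;
}

// Hop 1: Value -> RString, which fails for non-string values.
impl TryConvert for RString {
    fn try_convert(val: Value) -> Result<Self, ConvError> {
        match val {
            Value::Str(s) => Ok(RString(s)),
            other => Err(ConvError(format!("expected a string, got {other:?}"))),
        }
    }
}

// Hop 2: Value -> RString -> String.
impl TryConvert for String {
    fn try_convert(val: Value) -> Result<Self, ConvError> {
        Ok(RString::try_convert(val)?.0)
    }
}

// Reverse direction: String -> RString -> Value.
impl IntoValue for RString {
    fn into_value(self) -> Value {
        Value::Str(self.0)
    }
}

impl IntoValue for String {
    fn into_value(self) -> Value {
        RString(self).into_value()
    }
}
```

In this model, converting a Value::Str succeeds and round-trips, while a Value::Int is rejected at the first hop, which is the kind of failure the Result in try_convert exists to report.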

Non-Ruby Object Conversion

In some situations, we may not need to wrap every Rust struct into a Ruby object for use. Doing so would require users to perform too many low-level operations, increasing complexity and wasting significant computational resources in conversions between Rust and Ruby.

However, we may still need to directly handle a Rust feature by passing in a Ruby Value. In such cases, the TryConvert and IntoValue traits become very useful.

For example, consider a third-party package that provides a User object.

struct User {
    name: String,
}

impl User {
    fn new(name: String) -> Self {
        Self { name }
    }

    fn to_upper(&self) -> User {
        User {
            name: self.name.to_uppercase(),
        }
    }

    fn name(&self) -> String {
        self.name.clone()
    }
}

When we want to leverage the to_upper method directly, we can encapsulate it in our Rust extension using a wrapper technique.

Rust's orphan rule does not allow implementing a trait from one external crate on a type from another, so we create a struct that wraps the external type to achieve the wrapper effect.
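The restriction at play is Rust's orphan rule: a trait can be implemented for a type only if the trait or the type is defined in the current crate. A minimal sketch using only the standard library, where both Display and Vec<String> are foreign, shows the same newtype workaround. Names and render are hypothetical names chosen for this example.

```rust
use std::fmt;

// `impl fmt::Display for Vec<String>` would be rejected, because both
// the trait and the type come from another crate. A local newtype
// around the foreign type is allowed.
struct Names(Vec<String>);

impl fmt::Display for Names {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0.join(", "))
    }
}

// Wraps the foreign value only for as long as the trait is needed.
fn render(names: Vec<String>) -> String {
    Names(names).to_string()
}
```

The UserWrapper that follows applies the same idea to magnus's TryConvert and IntoValue traits.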

use magnus::{Error, IntoValue, Ruby, TryConvert, Value};

// Newtype around the third-party User so magnus traits can be implemented.
pub struct UserWrapper(User);

impl TryConvert for UserWrapper {
    fn try_convert(value: Value) -> Result<Self, Error> {
        let name = TryConvert::try_convert(value)?;
        Ok(Self(User::new(name)))
    }
}

impl IntoValue for UserWrapper {
    fn into_value_with(self, ruby: &Ruby) -> Value {
        self.0.name().into_value_with(ruby)
    }
}

Since we do not need a tangible User object in Ruby, we won’t use magnus’s macro to wrap it into an object. Instead, we implement the TryConvert and IntoValue traits to describe how we convert into Rust’s internal UserWrapper.

The try_convert implementation for UserWrapper relies on Rust's type inference: because User::new takes a String, the compiler resolves TryConvert::try_convert(value) to the String implementation, so name is a String that can be passed to User::new to build a UserWrapper.
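The same kind of inference can be observed with the standard library alone. In this sketch, str::parse plays the role of TryConvert::try_convert (an analogy chosen for illustration, not a description of magnus internals): the target type is never written out, and the compiler picks it from the parameter type of the function that consumes the result. Account and build_account are hypothetical names.

```rust
use std::num::ParseIntError;

struct Account {
    id: u32,
}

impl Account {
    fn new(id: u32) -> Self {
        Self { id }
    }
}

fn build_account(raw: &str) -> Result<Account, ParseIntError> {
    // No type annotation: `parse` is generic over its output, and the
    // compiler resolves it to u32 because Account::new takes a u32.
    let id = raw.parse()?;
    Ok(Account::new(id))
}
```

Changing the parameter type of Account::new would change which FromStr implementation is selected, without touching the parse call itself.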

When we need to return a value to Ruby, since we only need the transformed name, we can directly use into_value_with in IntoValue to trigger the String’s IntoValue trait to convert the String into a Value for Ruby.

With these encapsulations prepared, defining the actual methods becomes very straightforward.

fn transform(user: UserWrapper) -> UserWrapper {
    UserWrapper(user.0.to_upper())
}

#[magnus::init]
fn init(ruby: &Ruby) -> Result<(), Error> {
    let module = ruby.define_module("Demo")?;
    module.define_singleton_method("transform", function!(transform, 1))?;

    Ok(())
}

Thanks to the TryConvert trait, Rust understands how UserWrapper is derived from a Ruby variable, allowing us to successfully obtain UserWrapper as a parameter.

To return the result of to_upper, we need to extract the original User from UserWrapper, call the method, and then repackage it as a UserWrapper. At this point, Rust can utilize the IntoValue trait to understand how to convert it back into a Ruby Value for returning.
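Because transform itself never touches Ruby, the unwrap, call, and rewrap step can be checked in plain Rust. This sketch repeats a trimmed copy of the User type from earlier (standing in for the third-party package) and leaves out the magnus trait implementations, which are only needed at the Ruby boundary:

```rust
// Stand-in for the third-party type described earlier in the article.
struct User {
    name: String,
}

impl User {
    fn new(name: String) -> Self {
        Self { name }
    }

    fn to_upper(&self) -> User {
        User {
            name: self.name.to_uppercase(),
        }
    }

    fn name(&self) -> String {
        self.name.clone()
    }
}

// Newtype wrapper; the TryConvert and IntoValue impls are omitted here.
struct UserWrapper(User);

fn transform(user: UserWrapper) -> UserWrapper {
    // `user.0` reaches the inner User; the result is wrapped again so the
    // return type is still the one that knows how to convert back to Ruby.
    UserWrapper(user.0.to_upper())
}
```

Keeping the conversion logic in the trait implementations and the business logic in transform means the latter can be unit tested with cargo test, without booting a Ruby VM.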

Through this mechanism, we do not need to implement a full Ruby object to exchange data with Ruby quickly. I applied this technique in cedar-policy-rb to support the following syntax.

entities = CedarPolicy::Entities.new([
    CedarPolicy::Entity.new(
        CedarPolicy::EntityUid.new("User", "1"),
        { role: "admin" },
        [] # Parents' EntityUid
    ),
    {
        uid: { type: "Image", id: "1" },
        attrs: {},
        parents: []
    }
])

Before entering Rust, both forms (Entity objects and plain hashes) are normalized into a standard Ruby Hash with #to_hash and then deserialized into Rust structures with serde_magnus, which Cedar's underlying implementation uses directly.

The advantage here is that originally, Cedar FFI exchanged data in JSON format, which meant we needed to convert Ruby -> JSON -> Rust. By directly using the Serializer, we can achieve direct exchanges from Ruby -> Rust, eliminating an unnecessary JSON processing step and gaining better data processing speed. Although we cannot avoid at least one processing step, it is certainly more efficient than having to process twice.

Magnus also implements a wide variety of other features and methods. However, mastering the above scenarios allows you to easily take advantage of various Rust packages in Ruby.