Aotokitsuruya

Senior Software Developer

Published on July 12, 2023

The structure of RITE - mruby-go

This article was translated by AI. If you have any corrections, please let me know.

This article is part of the mruby-go series.

mruby compiles source code with a compiler (usually mrbc) into a binary file in the mrb format. This file format is called RITE, and a compiled mruby program must be parsed from it before it can be executed.

Structure

The structure of RITE consists of two main parts: the rite_binary_header and several sections. The rite_binary_header records information about the format of the binary file, mruby version, compiler information, and more.

There are four basic types of sections: irep, debug, lv (Local Variable), and footer (which carries no meaningful data). Unless the debug section is explicitly included, a file contains only the other three.

Each section begins with its own section header for identification. Every section header holds an identity (ident) stored as char[4] and the size of the section stored as a uint32. The irep section header additionally carries version information (rite_version).

// example irep section
struct rite_section_irep_header {
  uint8_t section_ident[4];
  uint8_t section_size[4];
  uint8_t rite_version[4];
};

With this understanding, we can read this information using Golang.
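As a first taste, the C struct above could be mirrored in Go roughly as follows. This is only a sketch: the names IrepSectionHeader, ReadIrepSectionHeader, and DecodeIrepSectionHeader are hypothetical, and the sample bytes (including the rite_version value) are hand-written for illustration rather than taken from a real mrb file.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// IrepSectionHeader mirrors rite_section_irep_header field by field.
// The type name is hypothetical; only the layout comes from the C struct.
type IrepSectionHeader struct {
	Identity    [4]byte // section_ident
	Size        uint32  // section_size, decoded as big-endian
	RiteVersion [4]byte // rite_version
}

// ReadIrepSectionHeader decodes the 12-byte header from r.
func ReadIrepSectionHeader(r io.Reader) (*IrepSectionHeader, error) {
	header := &IrepSectionHeader{}
	if err := binary.Read(r, binary.BigEndian, header); err != nil {
		return nil, err
	}
	return header, nil
}

// DecodeIrepSectionHeader is a convenience wrapper for in-memory data.
func DecodeIrepSectionHeader(data []byte) (*IrepSectionHeader, error) {
	return ReadIrepSectionHeader(bytes.NewReader(data))
}

// sampleIrepHeader returns hand-written bytes used only for illustration.
func sampleIrepHeader() []byte {
	return []byte{
		'I', 'R', 'E', 'P', // section_ident
		0x00, 0x00, 0x00, 0x2a, // section_size = 42
		'0', '3', '0', '0', // rite_version (made-up value)
	}
}

func main() {
	header, err := DecodeIrepSectionHeader(sampleIrepHeader())
	if err != nil {
		panic(err)
	}
	fmt.Printf("ident=%s size=%d version=%s\n",
		header.Identity[:], header.Size, header.RiteVersion[:])
}
```

Because every field has a fixed size, binary.Read can fill the whole struct in one call and consumes exactly 12 bytes from the reader.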

BinaryHeader

Reading the rite_binary_header is not difficult. We define a structure called BinaryHeader whose fields all have a fixed size (fixed-size byte arrays and fixed-width integers), so the Read function from the encoding/binary package can populate the values directly.

import (
	"encoding/binary"
	"io"
)

// BinaryHeader holds the fixed-size fields of rite_binary_header.
type BinaryHeader struct {
	Identifier [4]byte
	Version    struct {
		Major [4]byte
		Minor [4]byte
	}
	Size       uint32
	Compiler   struct {
		Name    [4]byte
		Version uint32
	}
}

// ReadHeader decodes the header from r in big-endian byte order.
func ReadHeader(r io.Reader) (*BinaryHeader, error) {
	header := &BinaryHeader{}
	err := binary.Read(r, binary.BigEndian, header)
	if err != nil {
		return nil, err
	}
	return header, nil
}
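To try ReadHeader without a compiled mrb file, one option is a round trip: encode a BinaryHeader with binary.Write and decode it again. The sketch below assumes the struct layout shown above (the field widths should be checked against the dump.h of the mruby version you target), and CheckIdentifier and roundTrip are hypothetical helpers. The expected identifier "RITE" is assumed from the format name.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// BinaryHeader repeats the structure from the article so the example is self-contained.
type BinaryHeader struct {
	Identifier [4]byte
	Version    struct {
		Major [4]byte
		Minor [4]byte
	}
	Size     uint32
	Compiler struct {
		Name    [4]byte
		Version uint32
	}
}

// ReadHeader decodes a BinaryHeader from r in big-endian byte order.
func ReadHeader(r io.Reader) (*BinaryHeader, error) {
	header := &BinaryHeader{}
	if err := binary.Read(r, binary.BigEndian, header); err != nil {
		return nil, err
	}
	return header, nil
}

// CheckIdentifier is a hypothetical helper that rejects headers
// whose identifier is not "RITE" (assumed value).
func CheckIdentifier(h *BinaryHeader) error {
	if string(h.Identifier[:]) != "RITE" {
		return errors.New("not a RITE binary")
	}
	return nil
}

// roundTrip encodes h into a buffer and decodes it again with ReadHeader.
func roundTrip(h BinaryHeader) (*BinaryHeader, error) {
	buf := &bytes.Buffer{}
	if err := binary.Write(buf, binary.BigEndian, h); err != nil {
		return nil, err
	}
	return ReadHeader(buf)
}

func main() {
	sample := BinaryHeader{Identifier: [4]byte{'R', 'I', 'T', 'E'}, Size: 128}
	header, err := roundTrip(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("valid identifier:", CheckIdentifier(header) == nil)
	fmt.Println("binary size field:", header.Size)
	fmt.Println("encoded header bytes:", binary.Size(header))
}
```

Checking the identifier right after reading is a cheap way to fail early on files that are not RITE binaries.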

Sections

Reading the sections is more complex and will be explained in subsequent articles. To tell sections apart, we can introduce a SectionHeader structure that captures the fields common to every section.

import (
	"encoding/binary"
	"errors"
	"io"
)

type SectionHeader struct {
	Identity [4]byte
	Size     uint32
}

func ReadSection(r io.Reader, remain uint32) (*Section, error) {
	header := &SectionHeader{}
	err := binary.Read(r, binary.BigEndian, header)
	if err != nil {
		return nil, err
	}

	isOverSize := header.Size > remain
	if isOverSize {
		return nil, errors.New("section size is larger than binary")
	}

	// ... (parsing of the section body, which produces section, is omitted)

	// binary.Size reports the encoded size of the header (8 bytes).
	sectionHeaderSize := uint32(binary.Size(header))
	if header.Size < sectionHeaderSize {
		return nil, errors.New("section size is smaller than its header")
	}
	noopBuffer := make([]byte, header.Size-sectionHeaderSize)
	// io.ReadFull keeps reading until noopBuffer is full; a single r.Read may return early.
	_, err = io.ReadFull(r, noopBuffer)
	if err != nil {
		return nil, err
	}

	return section, nil
}

The virtual machine does not need the information in BinaryHeader or SectionHeader to run, so the actual implementation discards this data after parsing.

In RITE, the size recorded in a header includes the header itself, so the header size must be subtracted to get the length of the data that follows. The example above uses header.Size - sectionHeaderSize to size the noopBuffer array, which ensures that the io.Reader stops exactly at the beginning of the next section.
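The size arithmetic can be checked with a small experiment. The sketch below walks over two hand-written sections and discards each body with io.CopyN instead of a buffer; SkipSection, ListSections, and the identities "AAAA" and "BBBB" are made up for illustration and are not part of mruby or mruby-go.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

type SectionHeader struct {
	Identity [4]byte
	Size     uint32
}

// SkipSection reads one section header and discards the rest of the section,
// leaving r positioned at the start of the next section.
func SkipSection(r io.Reader) (*SectionHeader, error) {
	header := &SectionHeader{}
	if err := binary.Read(r, binary.BigEndian, header); err != nil {
		return nil, err
	}
	headerSize := uint32(binary.Size(header)) // 8 bytes
	if header.Size < headerSize {
		return nil, errors.New("section size is smaller than its header")
	}
	// Size includes the header, so only Size - headerSize bytes remain.
	if _, err := io.CopyN(io.Discard, r, int64(header.Size-headerSize)); err != nil {
		return nil, err
	}
	return header, nil
}

// ListSections walks data and returns the section identities in order.
func ListSections(data []byte) ([]string, error) {
	r := bytes.NewReader(data)
	var names []string
	for r.Len() > 0 {
		header, err := SkipSection(r)
		if err != nil {
			return names, err
		}
		names = append(names, string(header.Identity[:]))
	}
	return names, nil
}

// sampleSections returns two made-up sections for illustration.
func sampleSections() []byte {
	return []byte{
		'A', 'A', 'A', 'A', 0, 0, 0, 12, 1, 2, 3, 4, // 8-byte header + 4-byte body
		'B', 'B', 'B', 'B', 0, 0, 0, 8, // header only, no body
	}
}

func main() {
	names, err := ListSections(sampleSections())
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

If the header size were not subtracted, the first skip would consume 8 bytes too many and the second header would be decoded from the wrong offset.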

In C, pointers allow an array to be accessed at arbitrary positions. In Golang, however, once the data is wrapped in an io.Reader it can only be read sequentially, so the position where the read cursor stops becomes crucial.
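One way to observe the cursor while debugging is to wrap the reader in a small counter. This is a minimal sketch, assuming a hypothetical offsetReader type and made-up input data; it is not something mruby-go is known to use.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"strings"
)

// offsetReader wraps an io.Reader and counts the bytes consumed so far.
type offsetReader struct {
	r      io.Reader
	offset int64
}

func (o *offsetReader) Read(p []byte) (int, error) {
	n, err := o.r.Read(p)
	o.offset += int64(n)
	return n, err
}

// offsetsAfterReads reads a 4-byte identity and then a uint32 from data,
// returning the cursor position after each step.
func offsetsAfterReads(data string) (int64, int64, error) {
	r := &offsetReader{r: strings.NewReader(data)}

	var ident [4]byte
	if err := binary.Read(r, binary.BigEndian, &ident); err != nil {
		return 0, 0, err
	}
	first := r.offset

	var size uint32
	if err := binary.Read(r, binary.BigEndian, &size); err != nil {
		return first, 0, err
	}
	return first, r.offset, nil
}

func main() {
	first, second, err := offsetsAfterReads("IREP\x00\x00\x00\x10rest of the data")
	if err != nil {
		panic(err)
	}
	fmt.Println("after identity:", first)
	fmt.Println("after size:", second)
}
```

binary.Read consumes exactly as many bytes as the target value needs, so the cursor advances by 4 bytes on each call here, and any mismatch between a declared size and the bytes actually consumed shows up as an unexpected offset.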

mruby-go is a project that aims to implement mruby entirely in Golang. It is expected to enable Golang to run Ruby, allowing for more flexibility in development, such as implementing DSLs or hooks.