Programming the BeagleY-AI C7x256 coprocessors using the cl7x compiler.

Olof Astrand
10 min read · Aug 5, 2024


J722S MCU+ SDK

In this article we try to get a simple C7x DSP program running on the BeagleY-AI board, using the hello world example from the MCU+ SDK.

Download the compiler and emulator

User's guide for the compiler

https://www.ti.com/lit/ug/spruig8j/spruig8j.pdf?ts=1721290837893

Understanding the concerto makefile system

The CCS IDE uses this makefile system.

Emulator

The compiler comes with an emulator (a host emulation library) that allows C7x DSP programs to be built and run on the host PC. This is useful when writing and testing our own programs. These are the flags for building an emulated binary with g++ instead of the cl7x compiler.

CEMULFLAGS = $(COMMON_CFLAGS) -g -I . -I include -fno-strict-aliasing -I$(CGT7X_ROOT)/host_emulation/include/C7524-MMA2_256

LDEMULFLAGS = -fno-strict-aliasing -L$(CGT7X_ROOT)/host_emulation -lC7524-MMA2_256-host-emulation

Install the software described here.

Also install PROCESSOR-SDK-RTOS-J722S

https://www.ti.com/tool/PROCESSOR-SDK-J722S

On BeagleY

$ ls -l /sys/bus/rpmsg/devices
total 0
virtio0.rpmsg_chrdev.-1.10 -> ../../../devices/platform/bus@100000/64800000.dsp/remoteproc/remoteproc0/remoteproc0#vdev0buffer/virtio0/virtio0.rpmsg_chrdev.-1.10
virtio0.rpmsg_chrdev.-1.20 -> ../../../devices/platform/bus@100000/64800000.dsp/remoteproc/remoteproc0/remoteproc0#vdev0buffer/virtio0/virtio0.rpmsg_chrdev.-1.20
virtio1.rpmsg_chrdev.-1.10 -> ../../../devices/platform/bus@100000/65800000.dsp/remoteproc/remoteproc2/remoteproc2#vdev0buffer/virtio1/virtio1.rpmsg_chrdev.-1.10
virtio1.rpmsg_chrdev.-1.20 -> ../../../devices/platform/bus@100000/65800000.dsp/remoteproc/remoteproc2/remoteproc2#vdev0buffer/virtio1/virtio1.rpmsg_chrdev.-1.20
virtio2.rpmsg_chrdev.-1.14 -> ../../../devices/platform/bus@100000/bus@100000:bus@28380000/bus@100000:bus@28380000:r5fss@41000000/41000000.r5f/remoteproc/remoteproc1/remoteproc1#vdev0buffer/virtio2/virtio2.rpmsg_chrdev.-1.14
virtio2.ti.ipc4.ping-pong.-1.13 -> ../../../devices/platform/bus@100000/bus@100000:bus@28380000/bus@100000:bus@28380000:r5fss@41000000/41000000.r5f/remoteproc/remoteproc1/remoteproc1#vdev0buffer/virtio2/virtio2.ti.ipc4.ping-pong.-1.13

$ ls -l /sys/class/rpmsg
total 0
rpmsg_ctrl0 -> ../../devices/platform/bus@100000/64800000.dsp/remoteproc/remoteproc0/remoteproc0#vdev0buffer/virtio0/virtio0.rpmsg_chrdev.-1.20/rpmsg/rpmsg_ctrl0
rpmsg_ctrl1 -> ../../devices/platform/bus@100000/64800000.dsp/remoteproc/remoteproc0/remoteproc0#vdev0buffer/virtio0/virtio0.rpmsg_chrdev.-1.10/rpmsg/rpmsg_ctrl1
rpmsg_ctrl2 -> ../../devices/platform/bus@100000/65800000.dsp/remoteproc/remoteproc2/remoteproc2#vdev0buffer/virtio1/virtio1.rpmsg_chrdev.-1.20/rpmsg/rpmsg_ctrl2
rpmsg_ctrl3 -> ../../devices/platform/bus@100000/65800000.dsp/remoteproc/remoteproc2/remoteproc2#vdev0buffer/virtio1/virtio1.rpmsg_chrdev.-1.10/rpmsg/rpmsg_ctrl3
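
The device names in the /sys/bus/rpmsg/devices listing appear to follow the pattern <parent>.<channel>.<src>.<dst>, where a source of -1 means "any address". That pattern is my reading of the listing and of the rpmsg core, so treat it as an assumption. Under that assumption, here is a small C sketch that splits such a name into its fields. The struct and function names are made up for this example. Because the channel name can itself contain dots (ti.ipc4.ping-pong), the name is parsed from the right.

```c
#include <stdlib.h>
#include <string.h>

/* Fields encoded in an rpmsg device name such as
 * "virtio2.ti.ipc4.ping-pong.-1.13" (assumed: parent.channel.src.dst). */
struct rpmsg_dev_name {
    char parent[32];   /* virtio device, e.g. "virtio2" */
    char channel[64];  /* channel name, may contain dots */
    long src;          /* source address, -1 means "any" */
    long dst;          /* destination endpoint address */
};

/* Returns 0 on success, -1 if the name does not have the assumed shape. */
int parse_rpmsg_dev_name(const char *name, struct rpmsg_dev_name *out)
{
    const char *first = strchr(name, '.');   /* dot after the parent */
    const char *last = strrchr(name, '.');   /* dot before dst */
    const char *prev;                        /* dot before src */
    char *end;
    size_t parent_len, channel_len;

    if (!first || first == last)
        return -1;

    /* Walk back from the last dot to find the one before src. */
    prev = last - 1;
    while (prev > first && *prev != '.')
        prev--;
    if (prev == first)
        return -1;

    parent_len = (size_t)(first - name);
    channel_len = (size_t)(prev - first - 1);
    if (parent_len == 0 || parent_len >= sizeof(out->parent) ||
        channel_len == 0 || channel_len >= sizeof(out->channel))
        return -1;

    /* Both trailing fields must be complete decimal numbers. */
    out->src = strtol(prev + 1, &end, 10);
    if (end == prev + 1 || end != last)
        return -1;
    out->dst = strtol(last + 1, &end, 10);
    if (end == last + 1 || *end != '\0')
        return -1;

    memcpy(out->parent, name, parent_len);
    out->parent[parent_len] = '\0';
    memcpy(out->channel, first + 1, channel_len);
    out->channel[channel_len] = '\0';
    return 0;
}
```

Calling parse_rpmsg_dev_name("virtio2.ti.ipc4.ping-pong.-1.13", &n) fills n with parent virtio2, channel ti.ipc4.ping-pong, source -1 and destination 13.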

This is how the IPC mechanism works

RPMSG and VRING

Install and set up Code Composer Studio (CCS)

If you install a newer CCS (release date: 10 May 2024), also select the Sitara MPUs during installation.

Newer version of CCS

As of the writing of this article, we are still waiting for an example written by TI or the experts at Beagleboard.org. In the meantime we compile the examples.

Build according to docs

J722S MCU+ SDK: Introduction

LLVM (clang): make sure you install it in ${HOME}/ti

After installing all the software in the ${HOME}/ti dir also install the newer version of sysconfig.

SYSCONFIG IDE, configuration, compiler or debugger | TI.com

This is my hack to get it working, but maybe a better solution exists.

cd ~/ti/ccs1271/ccs/utils
cp -r sysconfig_1.20.0/ ~/ti
cd ~/ti/sysconfig_1.21.0/
cp -r nodejs ../sysconfig_1.20.0/
cp -r nw ../sysconfig_1.20.0/
cp package.json ../sysconfig_1.20.0/

This puts SysConfig in the directory where the build expects to find it.

This is my ~/ti directory after installing and running CCS and some other packages.

cd ~/ti ; ls
psdk_rtos_auto_j7_06_02_00_21 ti-processor-sdk-rtos-j722s-evm-09_02_00_05
ccs1271 sysconfig_1.20.0
sysconfig_1.21.0 tirex-localserver-3.7.1
ti-cgt-c7000_4.1.0.LTS tirex-product-tree

Set up venv and install required python libraries

python3 -m venv ti-env
source ti-env/bin/activate
pip3 install pyserial xmodem tqdm

sudo apt install mono-runtime

Export environment variables and build

export CG_TOOL_ROOT=${HOME}/ti/ti-cgt-c7000_4.1.0.LTS
export SDK_INSTALL_PATH=${HOME}/ti/ti-processor-sdk-rtos-j722s-evm-09_02_00_05/mcu_plus_sdk_j722s_09_02_00_59/
make -s -C examples/hello_world/j722s-evm/main-r5fss0-0_freertos/ti-arm-clang/
make -s -C examples/hello_world/j722s-evm/c75ss0-0_freertos/ti-c7000
make -s -C examples/hello_world/j722s-evm/c75ss1-0_freertos/ti-c7000

We should now have hello world firmware for the R5F and C75 cores:

file examples/hello_world/j722s-evm/c75ss0-0_freertos/ti-c7000/hello_world.release.out

hello_world.release.out: ELF 64-bit LSB executable, *unknown arch 0x91* version 1 (SYSV), statically linked, with debug_info, not stripped

file examples/hello_world/j722s-evm/main-r5fss0-0_freertos/ti-arm-clang/hello_world.release.out
examples/hello_world/j722s-evm/main-r5fss0-0_freertos/ti-arm-clang/hello_world.release.out: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, with debug_info, not stripped
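
The "unknown arch 0x91" in the first output is the e_machine field of the ELF header, which this version of file has no name for. As a rough sketch, the same fields can be read straight from the first 20 bytes of the file. The struct and function names below are invented for this example; the offsets (class at byte 4, data encoding at byte 5, e_machine at bytes 18 and 19) are the same for 32-bit and 64-bit ELF.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The ELF identification fields reported by `file` above. */
struct elf_ident {
    int      is_64bit;          /* 1 for ELFCLASS64, 0 for ELFCLASS32 */
    int      is_little_endian;  /* 1 for LSB data encoding */
    uint16_t machine;           /* e_machine, e.g. 0x28 for ARM */
};

/* hdr points at the start of the file, len is how many bytes are valid.
 * Returns 0 on success, -1 if this does not look like an ELF header. */
int read_elf_ident(const uint8_t *hdr, size_t len, struct elf_ident *out)
{
    if (len < 20 || memcmp(hdr, "\x7f" "ELF", 4) != 0)
        return -1;
    if ((hdr[4] != 1 && hdr[4] != 2) || (hdr[5] != 1 && hdr[5] != 2))
        return -1;

    out->is_64bit = (hdr[4] == 2);
    out->is_little_endian = (hdr[5] == 1);

    /* e_machine is a 16-bit field stored in the file's own byte order. */
    if (out->is_little_endian)
        out->machine = (uint16_t)(hdr[18] | (hdr[19] << 8));
    else
        out->machine = (uint16_t)((hdr[18] << 8) | hdr[19]);
    return 0;
}
```

For the two files above this should report a 64-bit image with machine 0x91 for the C7x firmware and a 32-bit image with machine 0x28 (ARM) for the R5F firmware.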

Open device configuration

 make -s -C examples/hello_world/j722s-evm/main-r5fss0-0_freertos/ti-arm-clang/ syscfg-gui

Here is the sysconfig for the c75.

make -s -C examples/hello_world/j722s-evm/c75ss0-0_freertos/ti-c7000/ syscfg-gui

Also compile the ipc_rpmsg_echo firmware

make -s -C examples/drivers/ipc/ipc_rpmsg_echo/j722s-evm/c75ss0-0_freertos/ti-c7000/
IPC echo example

To see how the Linux kernel loads the DSP ELF file, check out ~/ti-processor-sdk-linux-j722s-evm-09_02_00_04/board-support/ti-linux-kernel-6.1.80+gitAUTOINC+1c154b1fe4-ti/drivers/remoteproc/ti_k3_dsp_remoteproc.c.
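
From user space, the generic remoteproc framework exposes a firmware and a state attribute per core under /sys/class/remoteproc. The following is only a sketch of how a new firmware could be selected and started through those attributes. The helper names are made up for this example, the firmware name is whatever file you placed under /lib/firmware, and I have not verified that the kernel on the BeagleY-AI image allows stopping and restarting the DSP this way.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build "/sys/class/remoteproc/remoteproc<index>/<attr>" into buf.
 * Returns 0 on success, -1 if buf is too small. */
int rproc_attr_path(char *buf, size_t size, int index, const char *attr)
{
    int n = snprintf(buf, size, "/sys/class/remoteproc/remoteproc%d/%s",
                     index, attr);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Write a string to a sysfs attribute. Returns 0 on success, -1 on error. */
int sysfs_write(const char *path, const char *value)
{
    size_t len = strlen(value);
    ssize_t written;
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -1;
    written = write(fd, value, len);
    close(fd);
    return (written == (ssize_t)len) ? 0 : -1;
}

/* Stop the core, point it at a firmware file (a name relative to
 * /lib/firmware), and start it again. Needs root. */
int rproc_load(int index, const char *fw_name)
{
    char path[128];

    if (rproc_attr_path(path, sizeof(path), index, "state"))
        return -1;
    /* Ignore the result: stopping fails if the core is already offline. */
    sysfs_write(path, "stop");

    if (rproc_attr_path(path, sizeof(path), index, "firmware") ||
        sysfs_write(path, fw_name))
        return -1;

    if (rproc_attr_path(path, sizeof(path), index, "state"))
        return -1;
    return sysfs_write(path, "start");
}
```

Going by the rpmsg listing earlier, remoteproc0 and remoteproc2 are the two DSPs on this board and remoteproc1 is the R5F, so a call could look like rproc_load(0, "my_c7x_firmware.out"), where the file name is a placeholder.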

Look at the source for the DSP

mcu_plus_sdk_j722s_09_02_00_59/source/kernel/freertos/portable/TI_CGT

Enable kernel debug support

https://www.kernel.org/doc/Documentation/admin-guide/dynamic-debug-howto.rst

Most likely this file,

https://openbeagle.org/beagleboard/linux/-/blob/v6.1.83-ti-arm64-r63/drivers/remoteproc/ti_k3_dsp_remoteproc.c?ref_type=heads

alias ddcmd='echo $* > /proc/dynamic_debug/control'
echo 'file drivers/remoteproc/ti_k3_dsp_remoteproc.c +p' > /sys/kernel/debug/dynamic_debug/control
echo 'file remoteproc_elf_loader.c +p' > /proc/dynamic_debug/control
git clone https://github.com/Grippy98/BeagleY-EdgeAI-Debian-Builds
scp firmware/*
echo "module ti_k3_dsp_remoteproc +p" > /proc/dynamic_debug/control

View the BeagleBoard kernel source

https://openbeagle.org/beagleboard/linux/-/tree/v6.1.83-ti-arm64-r63?ref_type=heads

git clone https://openbeagle.org/beagleboard/linux.git

Dump the resource table of the elf file

The resource table is incorporated into the corresponding base images and is used by remoteproc on the host side to allocate and reserve resources. At this point I am not sure whether the remoteproc loader requires the resource table.

Here are also some interesting recent kernel patches.


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <elf.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <stdint.h>
#include <ctype.h>

#define RESOURCE_TABLE_SECTION ".resource_table"

typedef struct {
    uint32_t ver;
    uint32_t num;
    uint32_t reserved[2];
} RPMessage_RscHdr;

typedef struct {
    uint32_t type;
    uint32_t id;
    uint32_t notifyid;
    uint32_t dfeatures;
    uint32_t gfeatures;
    uint32_t config_len;
    uint8_t status;
    uint8_t num_of_vrings;
    uint8_t reserved[2];
} RPMessage_RscVdev;

typedef struct {
    uint32_t da;
    uint32_t align;
    uint32_t num;
    uint32_t notifyid;
    uint32_t reserved;
} RPMessage_RscVring;

typedef struct {
    uint32_t type;
    uint32_t da;
    uint32_t len;
    uint32_t reserved;
    uint8_t name[32];
} RPMessage_RscTrace;

typedef struct {
    RPMessage_RscHdr base;
    uint32_t offset[2];
    RPMessage_RscVdev vdev;
    RPMessage_RscVring vring0;
    RPMessage_RscVring vring1;
    RPMessage_RscTrace trace;
} RPMessage_ResourceTable;

void hexdump(const void *data, size_t size) {
    const unsigned char *p = (const unsigned char *)data;
    for (size_t i = 0; i < size; i++) {
        if (i % 16 == 0) {
            printf("%04zx: ", i);
        }
        printf("%02x ", p[i]);
        if ((i + 1) % 16 == 0) {
            printf(" ");
            for (size_t j = i - 15; j <= i; j++) {
                printf("%c", isprint(p[j]) ? p[j] : '.');
            }
            printf("\n");
        }
    }
    // Print the ASCII column for a final, partial line
    if (size % 16 != 0) {
        size_t remaining = size % 16;
        // Pad with three spaces per missing byte so the ASCII column lines up
        for (size_t i = 0; i < (16 - remaining); i++) {
            printf("   ");
        }
        printf(" ");
        for (size_t i = size - remaining; i < size; i++) {
            printf("%c", isprint(p[i]) ? p[i] : '.');
        }
        printf("\n");
    }
}
void search_and_print_trace(const void *data, size_t size) {
    const char *p = (const char *)data;
    const char *trace = "trace";
    size_t trace_len = strlen(trace);
    // Avoid unsigned underflow in the loop bound for very small sections
    if (size < trace_len) {
        return;
    }
    for (size_t i = 0; i <= size - trace_len; i++) {
        if (memcmp(p + i, trace, trace_len) == 0) {
            printf("Found 'trace' at offset 0x%zx: ", i);
            // Print up to 32 characters or until a null terminator
            for (size_t j = 0; j < 32 && (i + j) < size && p[i + j] != '\0'; j++) {
                printf("%c", isprint((unsigned char)p[i + j]) ? p[i + j] : '.');
            }
            printf("\n");
        }
    }
}
void dump_resource_table(const RPMessage_ResourceTable *table) {
    printf("Resource Table:\n");
    printf("Header:\n");
    printf(" Version: %u\n", table->base.ver);
    printf(" Num entries: %u\n", table->base.num);

    printf("Offsets:\n");
    printf(" VDEV offset: 0x%x\n", table->offset[0]);
    printf(" Trace offset: 0x%x\n", table->offset[1]);

    printf("VDEV:\n");
    printf(" Type: 0x%x\n", table->vdev.type);
    printf(" ID: 0x%x\n", table->vdev.id);
    printf(" NotifyID: %u\n", table->vdev.notifyid);
    printf(" DFeatures: 0x%x\n", table->vdev.dfeatures);
    printf(" GFeatures: 0x%x\n", table->vdev.gfeatures);
    printf(" Config_len: %u\n", table->vdev.config_len);
    printf(" Status: 0x%x\n", table->vdev.status);
    printf(" Num of VRings: %u\n", table->vdev.num_of_vrings);

    printf("VRING0:\n");
    printf(" DA: 0x%x\n", table->vring0.da);
    printf(" Align: %u\n", table->vring0.align);
    printf(" Num: %u\n", table->vring0.num);
    printf(" NotifyID: %u\n", table->vring0.notifyid);

    printf("VRING1:\n");
    printf(" DA: 0x%x\n", table->vring1.da);
    printf(" Align: %u\n", table->vring1.align);
    printf(" Num: %u\n", table->vring1.num);
    printf(" NotifyID: %u\n", table->vring1.notifyid);

    printf("Trace:\n");
    printf(" Type: 0x%x\n", table->trace.type);
    printf(" DA: 0x%x\n", table->trace.da);
    printf(" Len: 0x%x\n", table->trace.len);
    // The name field is 32 bytes and may not be null-terminated
    printf(" Name: %.32s\n", (const char *)table->trace.name);
}
int main(int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <elf_file>\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) {
        perror("Error opening file");
        return 1;
    }
    off_t file_size = lseek(fd, 0, SEEK_END);
    lseek(fd, 0, SEEK_SET);
    if (file_size < (off_t)sizeof(Elf64_Ehdr)) {
        fprintf(stderr, "File is too small to be an ELF image\n");
        close(fd);
        return 1;
    }
    void *file_data = mmap(NULL, file_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (file_data == MAP_FAILED) {
        perror("Error mapping file");
        close(fd);
        return 1;
    }
    Elf64_Ehdr *ehdr = (Elf64_Ehdr *)file_data;
    // The section walk below assumes a 64-bit ELF, as reported by file for the C7x firmware
    if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr->e_ident[EI_CLASS] != ELFCLASS64) {
        fprintf(stderr, "Not a 64-bit ELF file\n");
        munmap(file_data, file_size);
        close(fd);
        return 1;
    }
    Elf64_Shdr *shdr = (Elf64_Shdr *)((char *)file_data + ehdr->e_shoff);
    char *shstrtab = (char *)file_data + shdr[ehdr->e_shstrndx].sh_offset;
    int found = 0;
    for (int i = 0; i < ehdr->e_shnum; i++) {
        if (strcmp(&shstrtab[shdr[i].sh_name], RESOURCE_TABLE_SECTION) == 0) {
            RPMessage_ResourceTable *table = (RPMessage_ResourceTable *)((char *)file_data + shdr[i].sh_offset);
            dump_resource_table(table);
            printf("\nHexdump of .resource_table section:\n");
            hexdump(table, shdr[i].sh_size);

            printf("\nSearching for 'trace' string:\n");
            search_and_print_trace(table, shdr[i].sh_size);
            found = 1;
            break;
        }
    }
    if (!found) {
        printf("No %s section found\n", RESOURCE_TABLE_SECTION);
    }
    munmap(file_data, file_size);
    close(fd);
    return 0;
}

It seems that the firmware built here does not have a resource table. However, the prebuilt binaries found at https://git.ti.com/cgit/processor-firmware/ti-linux-firmware/tree/ti-ipc/j722s?h=ti-linux-firmware do have one.

./dump_res ipc_echo_test_c7x_1_release_strip.xe71
Resource Table:
Header:
Version: 1
Num entries: 2
Offsets:
VDEV offset: 0x18
Trace offset: 0x5c
VDEV:
Type: 0x3
ID: 0x7
NotifyID: 0
DFeatures: 0x1
GFeatures: 0x0
Config_len: 0
Status: 0x0
Num of VRings: 2
VRING0:
DA: 0xffffffff
Align: 4096
Num: 256
NotifyID: 1
VRING1:
DA: 0xffffffff
Align: 4096
Num: 256
NotifyID: 2
Trace:
Type: 0x2
DA: 0xa3100400
Len: 0x1000
Name: trace:c75ss0_0

Hexdump of .resource_table section:
0000: 01 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 ................
0010: 18 00 00 00 5c 00 00 00 03 00 00 00 07 00 00 00 ....\...........
0020: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
0030: 00 02 00 00 ff ff ff ff 00 10 00 00 00 01 00 00 ................
0040: 01 00 00 00 00 00 00 00 ff ff ff ff 00 10 00 00 ................
0050: 00 01 00 00 02 00 00 00 00 00 00 00 02 00 00 00 ................
0060: 00 04 10 a3 00 10 00 00 00 00 00 00 74 72 61 63 ............trac
0070: 65 3a 63 37 35 73 73 30 5f 30 00 00 00 00 00 00 e:c75ss0_0......
0080: 00 00 00 00 00 00 00 00 00 00 00 00 ............

Searching for 'trace' string:
Found 'trace' at offset 0x6c: trace:c75ss0_0
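
The dump program above hard-codes the layout of this particular table (one vdev entry followed by one trace entry). As a sketch of a more general approach, the table can also be walked through its offset array, reading the type word at each offset. The bytes below are the first seven rows of the hexdump above. The function names are made up for this example, and the type numbering (0 carveout, 1 devmem, 2 trace, 3 vdev) is taken from the kernel's remoteproc header as I remember it, so please double check it against your kernel source.

```c
#include <stddef.h>
#include <stdint.h>

/* First 0x70 bytes of the .resource_table hexdump shown above. */
static const uint8_t sample_table[] = {
    0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x18, 0x00, 0x00, 0x00, 0x5c, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x02, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 0x00, 0x10, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00,
    0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 0x00, 0x10, 0x00, 0x00,
    0x00, 0x01, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
    0x00, 0x04, 0x10, 0xa3, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x74, 0x72, 0x61, 0x63,
};

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Human-readable name for a resource entry type (assumed numbering). */
const char *rsc_type_name(uint32_t type)
{
    switch (type) {
    case 0: return "carveout";
    case 1: return "devmem";
    case 2: return "trace";
    case 3: return "vdev";
    default: return "unknown";
    }
}

/* Walk the header (ver, num, reserved[2]) and the offset array that
 * follows it. Stores up to max offsets and entry types, and returns the
 * number of entries in the table, or -1 if the table is truncated. */
int walk_resource_table(const uint8_t *buf, size_t len,
                        uint32_t *offsets, uint32_t *types, int max)
{
    uint32_t num, i;

    if (len < 16)
        return -1;
    num = rd32le(buf + 4);
    if (num > (len - 16) / 4)
        return -1;

    for (i = 0; i < num; i++) {
        uint32_t off = rd32le(buf + 16 + 4 * i);

        if (off > len || len - off < 4)
            return -1;
        if ((int)i < max) {
            offsets[i] = off;
            types[i] = rd32le(buf + off);
        }
    }
    return (int)num;
}
```

Run on sample_table this finds two entries: a vdev entry (type 3) at offset 0x18 and a trace entry (type 2) at offset 0x5c, which matches the output of the dump program.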

Build a kernel module to read the trace buffer

sudo apt-get update
sudo apt-get install linux-headers-$(uname -r)
sudo apt-get install linux-source

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/io.h>
#include <linux/slab.h>

#define DEFAULT_TRACE_BUFFER_ADDR 0xadd9e000
#define DEFAULT_BYTES_TO_READ 256

static unsigned long trace_buffer_addr = DEFAULT_TRACE_BUFFER_ADDR;
static int bytes_to_read = DEFAULT_BYTES_TO_READ;
module_param(trace_buffer_addr, ulong, 0644);
MODULE_PARM_DESC(trace_buffer_addr, "Physical address of the trace buffer");
module_param(bytes_to_read, int, 0644);
MODULE_PARM_DESC(bytes_to_read, "Number of bytes to read from the trace buffer");

static int __init trace_reader_init(void)
{
    void __iomem *virt_addr;
    u8 *buffer;

    // Reject a zero or negative length passed as a module parameter
    if (bytes_to_read <= 0) {
        pr_err("Trace Reader: bytes_to_read must be positive\n");
        return -EINVAL;
    }
    pr_info("Trace Reader: Initializing\n");
    pr_info("Trace Reader: Reading %d bytes from physical address 0x%lx\n",
            bytes_to_read, trace_buffer_addr);
    // Request and map the memory region
    if (!request_mem_region(trace_buffer_addr, bytes_to_read, "trace_reader")) {
        pr_err("Trace Reader: Unable to request memory region\n");
        return -EBUSY;
    }
    virt_addr = ioremap(trace_buffer_addr, bytes_to_read);
    if (!virt_addr) {
        pr_err("Trace Reader: Unable to map memory\n");
        release_mem_region(trace_buffer_addr, bytes_to_read);
        return -ENOMEM;
    }
    // Allocate a buffer to store the read data
    buffer = kmalloc(bytes_to_read, GFP_KERNEL);
    if (!buffer) {
        pr_err("Trace Reader: Unable to allocate memory for buffer\n");
        iounmap(virt_addr);
        release_mem_region(trace_buffer_addr, bytes_to_read);
        return -ENOMEM;
    }
    // Read the data from the mapped memory
    memcpy_fromio(buffer, virt_addr, bytes_to_read);
    // Print the data as hex with an ASCII column, 16 bytes per line
    pr_info("Trace Reader: First %d bytes of trace buffer:\n", bytes_to_read);
    print_hex_dump(KERN_INFO, "trace: ", DUMP_PREFIX_OFFSET, 16, 1,
                   buffer, bytes_to_read, true);
    // Clean up
    kfree(buffer);
    iounmap(virt_addr);
    release_mem_region(trace_buffer_addr, bytes_to_read);
    return 0;
}

static void __exit trace_reader_exit(void)
{
    pr_info("Trace Reader: Exiting\n");
}

module_init(trace_reader_init);
module_exit(trace_reader_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("A simple module to read and print trace buffer contents");
Create a Makefile in the same directory with the following content:

obj-m += trace_reader.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

Compile the module:

This is not so easy. You might have to download the kernel sources from openbeagle.org: https://openbeagle.org/beagleboard/linux/-/tree/v6.1.83-ti-arm64-r63?ref_type=heads

An easier way would be to use the trace files that remoteproc exposes through debugfs, see https://openbeagle.org/beagleboard/linux/-/blob/v6.1.83-ti-arm64-r63/drivers/remoteproc/remoteproc_debugfs.c

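cd /sys/kernel/debug/remoteproc/<remote_processor_name>
cat <trace_file>

If you did manage to build the module, load it with the address and length of the trace buffer:

sudo insmod trace_reader.ko trace_buffer_addr=0xadd9e000 bytes_to_read=512

Check the kernel log to see the output, then unload the module:

dmesg | tail
sudo rmmod trace_reader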
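
For completeness, the debugfs route can also be used from a program instead of cat. The sketch below assumes that debugfs is mounted at /sys/kernel/debug, that the firmware's resource table declares a trace entry, and that the trace files are named trace0, trace1 and so on, which is how I read remoteproc_debugfs.c. The function names are made up for this example, and the program must run as root.

```c
#include <stdio.h>

/* Build "/sys/kernel/debug/remoteproc/remoteproc<rproc>/trace<trace>".
 * Returns 0 on success, -1 if buf is too small. */
int rproc_trace_path(char *buf, size_t size, int rproc, int trace)
{
    int n = snprintf(buf, size,
                     "/sys/kernel/debug/remoteproc/remoteproc%d/trace%d",
                     rproc, trace);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Copy the contents of a trace file to out.
 * Returns the number of bytes copied, or -1 if the file cannot be opened. */
long dump_trace(const char *path, FILE *out)
{
    char chunk[256];
    size_t n;
    long total = 0;
    FILE *in = fopen(path, "r");

    if (!in)
        return -1;
    while ((n = fread(chunk, 1, sizeof(chunk), in)) > 0) {
        fwrite(chunk, 1, n, out);
        total += (long)n;
    }
    fclose(in);
    return total;
}
```

A caller would build the path with rproc_trace_path(path, sizeof(path), 0, 0) and then pass it to dump_trace(path, stdout). A return value of -1 usually means that the core has no trace entry or that debugfs is not accessible.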

Helpful references

Device tree for the TI J722S evaluation module

Device tree for the BeagleBoard image

Build instructions for vision apps

Edge AI instructions for the AM67A

https://software-dl.ti.com/jacinto7/esd/processor-sdk-linux-am67a/09_02_00/exports/edgeai-docs/devices/AM67A/linux/index.html

We were able to build the hello world and ipc_echo firmware. In the next article we can try to fix the firmware and run it on the BeagleY-AI board. Hopefully we will also soon have a better image with a proper device tree from BeagleBoard.org, so that we can run our firmware.
