Programming the BeagleY-AI C7x256 coprocessors using the cl7x compiler.
In this article we try to get a simple C7x DSP program running on the BeagleY-AI board. It uses the hello world example from TI's MCU+ SDK, which ships inside the Processor SDK RTOS.
Download the compiler and emulator
Users guide for the compiler
https://www.ti.com/lit/ug/spruig8j/spruig8j.pdf?ts=1721290837893
Understanding the concerto makefile system
The CCS IDE uses this makefile system.
Emulator
The compiler comes with an emulator that allows running C7x DSP programs on the host. This is useful when writing and testing our own programs. These are the flags for building an emulated binary with g++ instead of the cl7x compiler.
CEMULFLAGS = $(COMMON_CFLAGS) -g -I . -I include -fno-strict-aliasing -I$(CGT7X_ROOT)/host_emulation/include/C7524-MMA2_256
LDEMULFLAGS = -fno-strict-aliasing -L$(CGT7X_ROOT)/host_emulation -lC7524-MMA2_256-host-emulation
Install the software described here.
Also install PROCESSOR-SDK-RTOS-J722S
https://www.ti.com/tool/PROCESSOR-SDK-J722S
On BeagleY
$ ls -l /sys/bus/rpmsg/devices
total 0
virtio0.rpmsg_chrdev.-1.10 -> ../../../devices/platform/bus@100000/64800000.dsp/remoteproc/remoteproc0/remoteproc0#vdev0buffer/virtio0/virtio0.rpmsg_chrdev.-1.10
virtio0.rpmsg_chrdev.-1.20 -> ../../../devices/platform/bus@100000/64800000.dsp/remoteproc/remoteproc0/remoteproc0#vdev0buffer/virtio0/virtio0.rpmsg_chrdev.-1.20
virtio1.rpmsg_chrdev.-1.10 -> ../../../devices/platform/bus@100000/65800000.dsp/remoteproc/remoteproc2/remoteproc2#vdev0buffer/virtio1/virtio1.rpmsg_chrdev.-1.10
virtio1.rpmsg_chrdev.-1.20 -> ../../../devices/platform/bus@100000/65800000.dsp/remoteproc/remoteproc2/remoteproc2#vdev0buffer/virtio1/virtio1.rpmsg_chrdev.-1.20
virtio2.rpmsg_chrdev.-1.14 -> ../../../devices/platform/bus@100000/bus@100000:bus@28380000/bus@100000:bus@28380000:r5fss@41000000/41000000.r5f/remoteproc/remoteproc1/remoteproc1#vdev0buffer/virtio2/virtio2.rpmsg_chrdev.-1.14
virtio2.ti.ipc4.ping-pong.-1.13 -> ../../../devices/platform/bus@100000/bus@100000:bus@28380000/bus@100000:bus@28380000:r5fss@41000000/41000000.r5f/remoteproc/remoteproc1/remoteproc1#vdev0buffer/virtio2/virtio2.ti.ipc4.ping-pong.-1.13
$ ls -l /sys/class/rpmsg
total 0
rpmsg_ctrl0 -> ../../devices/platform/bus@100000/64800000.dsp/remoteproc/remoteproc0/remoteproc0#vdev0buffer/virtio0/virtio0.rpmsg_chrdev.-1.20/rpmsg/rpmsg_ctrl0
rpmsg_ctrl1 -> ../../devices/platform/bus@100000/64800000.dsp/remoteproc/remoteproc0/remoteproc0#vdev0buffer/virtio0/virtio0.rpmsg_chrdev.-1.10/rpmsg/rpmsg_ctrl1
rpmsg_ctrl2 -> ../../devices/platform/bus@100000/65800000.dsp/remoteproc/remoteproc2/remoteproc2#vdev0buffer/virtio1/virtio1.rpmsg_chrdev.-1.20/rpmsg/rpmsg_ctrl2
rpmsg_ctrl3 -> ../../devices/platform/bus@100000/65800000.dsp/remoteproc/remoteproc2/remoteproc2#vdev0buffer/virtio1/virtio1.rpmsg_chrdev.-1.10/rpmsg/rpmsg_ctrl3
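The device names in the listings above appear to follow the pattern parent.service.src.dst, which matches how the Linux rpmsg core names its devices. Here is a minimal sketch in C that splits such a name into its parts. The struct and function names are made up for illustration, and the reading of -1 as "any source address" is my assumption, not something verified on the board.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parts of an rpmsg device name. */
struct rpmsg_name {
    char parent[32];   /* e.g. "virtio0" */
    char service[64];  /* e.g. "rpmsg_chrdev", may itself contain dots */
    int src;           /* source address, -1 seems to mean "any" */
    int dst;           /* destination address (endpoint on the remote core) */
};

/* Parse "<parent>.<service>.<src>.<dst>". Returns 0 on success, -1 on error.
 * The service name can contain dots, so src and dst are located from the right. */
int parse_rpmsg_name(const char *name, struct rpmsg_name *out)
{
    const char *first = strchr(name, '.');
    const char *last = strrchr(name, '.');
    if (!first || !last || first == last)
        return -1;

    /* Find the dot that separates the service name from src. */
    const char *mid = last - 1;
    while (mid > first && *mid != '.')
        mid--;
    if (mid == first)
        return -1;

    size_t plen = (size_t)(first - name);
    size_t slen = (size_t)(mid - first - 1);
    if (plen == 0 || plen >= sizeof out->parent ||
        slen == 0 || slen >= sizeof out->service)
        return -1;

    memcpy(out->parent, name, plen);
    out->parent[plen] = '\0';
    memcpy(out->service, first + 1, slen);
    out->service[slen] = '\0';

    char *end;
    long src = strtol(mid + 1, &end, 10);
    if (end == mid + 1 || end != last)
        return -1;
    long dst = strtol(last + 1, &end, 10);
    if (end == last + 1 || *end != '\0')
        return -1;

    out->src = (int)src;
    out->dst = (int)dst;
    return 0;
}
```

For example, parsing virtio2.ti.ipc4.ping-pong.-1.13 yields parent virtio2, service ti.ipc4.ping-pong, src -1 and dst 13.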
This is how the IPC mechanism works
Install and set up Code Composer Studio (CCS)
If you install a newer CCS (release date 10 May 2024), also select the Sitara MPUs.
As of this writing, we are still waiting for an example written by TI or the experts at BeagleBoard.org. In the meantime we compile the SDK examples.
Build according to docs
LLVM (clang): make sure you install it in ${HOME}/ti.
After installing all the software in the ${HOME}/ti directory, also install the newer version of sysconfig.
SYSCONFIG IDE, configuration, compiler or debugger | TI.com
This is my hack to get it working, but maybe a better solution exists.
cd ~/ti/ccs1271/ccs/utils
cp -r sysconfig_1.20.0/ ~/ti
cd ~/ti/sysconfig_1.21.0/
cp -r nodejs ../sysconfig_1.20.0/
cp -r nw ../sysconfig_1.20.0/
cp package.json ../sysconfig_1.20.0/
This puts sysconfig in the directory where the build expects it.
This is my ~/ti directory after installing and running CCS and some other packages.
cd ~/ti ; ls
psdk_rtos_auto_j7_06_02_00_21 ti-processor-sdk-rtos-j722s-evm-09_02_00_05
ccs1271 sysconfig_1.20.0
sysconfig_1.21.0 tirex-localserver-3.7.1
ti-cgt-c7000_4.1.0.LTS tirex-product-tree
Set up venv and install required python libraries
python3 -m venv ti-env
source ti-env/bin/activate
pip3 install pyserial xmodem tqdm
sudo apt install mono-runtime
Export environment variables and build
export CG_TOOL_ROOT=${HOME}/ti/ti-cgt-c7000_4.1.0.LTS
export SDK_INSTALL_PATH=${HOME}/ti/ti-processor-sdk-rtos-j722s-evm-09_02_00_05/mcu_plus_sdk_j722s_09_02_00_59/
make -s -C examples/hello_world/j722s-evm/main-r5fss0-0_freertos/ti-arm-clang/
make -s -C examples/hello_world/j722s-evm/c75ss0-0_freertos/ti-c7000
make -s -C examples/hello_world/j722s-evm/c75ss1-0_freertos/ti-c7000
We should now have hello world firmware for the R5 and C75 cores:
file examples/hello_world/j722s-evm/c75ss0-0_freertos/ti-c7000/hello_world.release.out
hello_world.release.out: ELF 64-bit LSB executable, *unknown arch 0x91* version 1 (SYSV), statically linked, with debug_info, not stripped
file examples/hello_world/j722s-evm/main-r5fss0-0_freertos/ti-arm-clang/hello_world.release.out
examples/hello_world/j722s-evm/main-r5fss0-0_freertos/ti-arm-clang/hello_world.release.out: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, with debug_info, not stripped
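The file utility does not know the C7x architecture and only prints the raw machine number 0x91. The same field can be read straight from the ELF header, where e_machine is a 16-bit value at byte offset 18. A minimal sketch, assuming a little-endian file as in the two outputs above (elf_machine is a hypothetical helper name):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the e_machine field of a little-endian ELF image, or -1 if the
 * buffer does not look like one. *is64 is set to 1 for ELF64, 0 for ELF32. */
int elf_machine(const uint8_t *buf, size_t len, int *is64)
{
    /* e_ident is 16 bytes, e_type 2 bytes, then e_machine (2 bytes). */
    if (len < 20 || memcmp(buf, "\177ELF", 4) != 0)
        return -1;
    if (buf[4] != 1 && buf[4] != 2)   /* EI_CLASS: 1 = 32-bit, 2 = 64-bit */
        return -1;
    if (buf[5] != 1)                  /* EI_DATA: 1 = little-endian */
        return -1;

    *is64 = (buf[4] == 2);
    return buf[18] | (buf[19] << 8);
}
```

With the first 20 bytes of each firmware file, this should give 0x91 with is64 set for the C75 image and 40 (ARM) with is64 clear for the R5 image, matching what file printed.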
Make
make -s -C examples/drivers/ipc/ipc_rpmsg_echo/j722s-evm/c75ss0-0_freertos/ti-c7000/
Open device configuration
make -s -C examples/hello_world/j722s-evm/main-r5fss0-0_freertos/ti-arm-clang/ syscfg-gui
Here is the sysconfig for the c75.
make -s -C examples/hello_world/j722s-evm/c75ss0-0_freertos/ti-c7000/ syscfg-gui
Also compile the ipc_rpmsg_echo firmware
make -s -C examples/drivers/ipc/ipc_rpmsg_echo/j722s-evm/c75ss0-0_freertos/ti-c7000/
See ~/ti-processor-sdk-linux-j722s-evm-09_02_00_04/board-support/ti-linux-kernel-6.1.80+gitAUTOINC+1c154b1fe4-ti/drivers/remoteproc/ti_k3_dsp_remoteproc.c for how the Linux kernel loads the DSP ELF file.
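From user space, the remoteproc framework is driven through sysfs: each core has a directory under /sys/class/remoteproc with state and firmware attributes. The following is a minimal sketch of that sequence, not something tested on the BeagleY-AI. It assumes the firmware file has already been copied to /lib/firmware and that the running image allows the core to be stopped. write_attr and load_firmware are hypothetical helper names.

```c
#include <stdio.h>

/* Write a string to <dir>/<attr>. Returns 0 on success, -1 on error. */
int write_attr(const char *dir, const char *attr, const char *value)
{
    char path[512];
    int n = snprintf(path, sizeof path, "%s/%s", dir, attr);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int ok = fputs(value, f) >= 0;
    if (fclose(f) != 0)   /* sysfs reports write errors on flush/close */
        ok = 0;
    return ok ? 0 : -1;
}

/* Stop the core, select a firmware file name (relative to /lib/firmware),
 * and start the core again. */
int load_firmware(const char *rproc_dir, const char *fw_name)
{
    /* Stopping fails when the core is already offline, so ignore the result. */
    (void)write_attr(rproc_dir, "state", "stop");

    if (write_attr(rproc_dir, "firmware", fw_name) != 0)
        return -1;
    return write_attr(rproc_dir, "state", "start");
}
```

Going by the rpmsg listing earlier, 64800000.dsp is remoteproc0 on this image, so a call would look like load_firmware("/sys/class/remoteproc/remoteproc0", "hello_world.release.out"), run as root.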
Look at the source for the DSP
mcu_plus_sdk_j722s_09_02_00_59/source/kernel/freertos/portable/TI_CGT
Enable kernel debug support
https://www.kernel.org/doc/Documentation/admin-guide/dynamic-debug-howto.rst
Most likely the file to enable debug output for is ti_k3_dsp_remoteproc.c:
alias ddcmd='echo $* > /proc/dynamic_debug/control'
echo 'file drivers/remoteproc/ti_k3_dsp_remoteproc.c +p' > /sys/kernel/debug/dynamic_debug/control
echo "file remoteproc_elf_loader.c +p" > /proc/dynamic_debug/control
git clone https://github.com/Grippy98/BeagleY-EdgeAI-Debian-Builds
scp firmware/*
echo "module ti_k3_dsp_remoteproc +p" > /proc/dynamic_debug/control
View the BeagleBoard kernel source
https://openbeagle.org/beagleboard/linux/-/tree/v6.1.83-ti-arm64-r63?ref_type=heads
git clone https://openbeagle.org/beagleboard/linux.git
Dump the resource table of the elf file
The resource table is built into the corresponding firmware image and is used by remoteproc on the host side to allocate and reserve resources. At this point I am not sure whether the remoteproc loader requires the resource table.
Here are also some interesting recent kernel patches.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <ctype.h>
#include <elf.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#define RESOURCE_TABLE_SECTION ".resource_table"

// Resource table header
typedef struct {
    uint32_t ver;
    uint32_t num;
    uint32_t reserved[2];
} RPMessage_RscHdr;

// Virtio device entry
typedef struct {
    uint32_t type;
    uint32_t id;
    uint32_t notifyid;
    uint32_t dfeatures;
    uint32_t gfeatures;
    uint32_t config_len;
    uint8_t status;
    uint8_t num_of_vrings;
    uint8_t reserved[2];
} RPMessage_RscVdev;

// One vring descriptor of the virtio device
typedef struct {
    uint32_t da;
    uint32_t align;
    uint32_t num;
    uint32_t notifyid;
    uint32_t reserved;
} RPMessage_RscVring;

// Trace buffer entry
typedef struct {
    uint32_t type;
    uint32_t da;
    uint32_t len;
    uint32_t reserved;
    uint8_t name[32];
} RPMessage_RscTrace;

// Complete table layout: header, two entry offsets, then the entries
typedef struct {
    RPMessage_RscHdr base;
    uint32_t offset[2];
    RPMessage_RscVdev vdev;
    RPMessage_RscVring vring0;
    RPMessage_RscVring vring1;
    RPMessage_RscTrace trace;
} RPMessage_ResourceTable;
void hexdump(const void *data, size_t size) {
    const unsigned char *p = (const unsigned char *)data;

    for (size_t i = 0; i < size; i++) {
        if (i % 16 == 0) {
            printf("%04zx: ", i);
        }
        printf("%02x ", p[i]);
        // End of a full row: append the ASCII column
        if ((i + 1) % 16 == 0) {
            printf(" ");
            for (size_t j = i - 15; j <= i; j++) {
                printf("%c", isprint(p[j]) ? p[j] : '.');
            }
            printf("\n");
        }
    }

    // Print the ASCII column for a final, partial row
    if (size % 16 != 0) {
        size_t remaining = size % 16;
        // Pad three characters per missing byte so the ASCII column lines up
        for (size_t i = 0; i < (16 - remaining); i++) {
            printf("   ");
        }
        printf(" ");
        for (size_t i = size - remaining; i < size; i++) {
            printf("%c", isprint(p[i]) ? p[i] : '.');
        }
        printf("\n");
    }
}
void search_and_print_trace(const void *data, size_t size) {
    const char *p = (const char *)data;
    const char *trace = "trace";
    size_t trace_len = strlen(trace);

    // Avoid unsigned underflow in the loop bound for very small sections
    if (size < trace_len) {
        return;
    }

    for (size_t i = 0; i <= size - trace_len; i++) {
        if (memcmp(p + i, trace, trace_len) == 0) {
            printf("Found 'trace' at offset 0x%zx: ", i);
            // Print up to 32 characters or until a null terminator
            for (size_t j = 0; j < 32 && (i + j) < size && p[i + j] != '\0'; j++) {
                printf("%c", isprint((unsigned char)p[i + j]) ? p[i + j] : '.');
            }
            printf("\n");
        }
    }
}
void dump_resource_table(const RPMessage_ResourceTable *table) {
    printf("Resource Table:\n");
    printf("Header:\n");
    printf(" Version: %u\n", table->base.ver);
    printf(" Num entries: %u\n", table->base.num);
    printf("Offsets:\n");
    printf(" VDEV offset: 0x%x\n", table->offset[0]);
    printf(" Trace offset: 0x%x\n", table->offset[1]);
    printf("VDEV:\n");
    printf(" Type: 0x%x\n", table->vdev.type);
    printf(" ID: 0x%x\n", table->vdev.id);
    printf(" NotifyID: %u\n", table->vdev.notifyid);
    printf(" DFeatures: 0x%x\n", table->vdev.dfeatures);
    printf(" GFeatures: 0x%x\n", table->vdev.gfeatures);
    printf(" Config_len: %u\n", table->vdev.config_len);
    printf(" Status: 0x%x\n", table->vdev.status);
    printf(" Num of VRings: %u\n", table->vdev.num_of_vrings);
    printf("VRING0:\n");
    printf(" DA: 0x%x\n", table->vring0.da);
    printf(" Align: %u\n", table->vring0.align);
    printf(" Num: %u\n", table->vring0.num);
    printf(" NotifyID: %u\n", table->vring0.notifyid);
    printf("VRING1:\n");
    printf(" DA: 0x%x\n", table->vring1.da);
    printf(" Align: %u\n", table->vring1.align);
    printf(" Num: %u\n", table->vring1.num);
    printf(" NotifyID: %u\n", table->vring1.notifyid);
    printf("Trace:\n");
    printf(" Type: 0x%x\n", table->trace.type);
    printf(" DA: 0x%x\n", table->trace.da);
    printf(" Len: 0x%x\n", table->trace.len);
    // The name field is 32 bytes and may not be null-terminated
    printf(" Name: %.32s\n", (const char *)table->trace.name);
}
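int main(int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <elf_file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) {
        perror("Error opening file");
        return 1;
    }

    // Also catches an lseek() failure, which returns -1
    off_t file_size = lseek(fd, 0, SEEK_END);
    if (file_size < (off_t)sizeof(Elf64_Ehdr)) {
        fprintf(stderr, "File is too small to be a 64-bit ELF file\n");
        close(fd);
        return 1;
    }

    void *file_data = mmap(NULL, (size_t)file_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (file_data == MAP_FAILED) {
        perror("Error mapping file");
        close(fd);
        return 1;
    }

    int status = 1;
    Elf64_Ehdr *ehdr = (Elf64_Ehdr *)file_data;

    // The C7x firmware is ELF64; reject anything else before using the headers
    if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr->e_ident[EI_CLASS] != ELFCLASS64 ||
        ehdr->e_shstrndx >= ehdr->e_shnum ||
        ehdr->e_shoff + (uint64_t)ehdr->e_shnum * sizeof(Elf64_Shdr) > (uint64_t)file_size) {
        fprintf(stderr, "Not a valid 64-bit ELF file\n");
    } else {
        Elf64_Shdr *shdr = (Elf64_Shdr *)((char *)file_data + ehdr->e_shoff);
        char *shstrtab = (char *)file_data + shdr[ehdr->e_shstrndx].sh_offset;

        for (int i = 0; i < ehdr->e_shnum; i++) {
            if (strcmp(&shstrtab[shdr[i].sh_name], RESOURCE_TABLE_SECTION) != 0) {
                continue;
            }
            if (shdr[i].sh_size < sizeof(RPMessage_ResourceTable) ||
                shdr[i].sh_offset + shdr[i].sh_size > (uint64_t)file_size) {
                fprintf(stderr, "Resource table section has an unexpected size\n");
                break;
            }
            RPMessage_ResourceTable *table =
                (RPMessage_ResourceTable *)((char *)file_data + shdr[i].sh_offset);
            dump_resource_table(table);
            printf("\nHexdump of .resource_table section:\n");
            hexdump(table, shdr[i].sh_size);
            printf("\nSearching for 'trace' string:\n");
            search_and_print_trace(table, shdr[i].sh_size);
            status = 0;
            break;
        }
        if (status != 0) {
            fprintf(stderr, "No usable %s section found\n", RESOURCE_TABLE_SECTION);
        }
    }

    munmap(file_data, (size_t)file_size);
    close(fd);
    return status;
}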
It seems that the firmware built here does not have a resource table. However, the prebuilt binaries found here do have one: https://git.ti.com/cgit/processor-firmware/ti-linux-firmware/tree/ti-ipc/j722s?h=ti-linux-firmware
./dump_res ipc_echo_test_c7x_1_release_strip.xe71
Resource Table:
Header:
Version: 1
Num entries: 2
Offsets:
VDEV offset: 0x18
Trace offset: 0x5c
VDEV:
Type: 0x3
ID: 0x7
NotifyID: 0
DFeatures: 0x1
GFeatures: 0x0
Config_len: 0
Status: 0x0
Num of VRings: 2
VRING0:
DA: 0xffffffff
Align: 4096
Num: 256
NotifyID: 1
VRING1:
DA: 0xffffffff
Align: 4096
Num: 256
NotifyID: 2
Trace:
Type: 0x2
DA: 0xa3100400
Len: 0x1000
Name: trace:c75ss0_0
Hexdump of .resource_table section:
0000: 01 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 ................
0010: 18 00 00 00 5c 00 00 00 03 00 00 00 07 00 00 00 ....\...........
0020: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
0030: 00 02 00 00 ff ff ff ff 00 10 00 00 00 01 00 00 ................
0040: 01 00 00 00 00 00 00 00 ff ff ff ff 00 10 00 00 ................
0050: 00 01 00 00 02 00 00 00 00 00 00 00 02 00 00 00 ................
0060: 00 04 10 a3 00 10 00 00 00 00 00 00 74 72 61 63 ............trac
0070: 65 3a 63 37 35 73 73 30 5f 30 00 00 00 00 00 00 e:c75ss0_0......
0080: 00 00 00 00 00 00 00 00 00 00 00 00 ............
Searching for 'trace' string:
Found 'trace' at offset 0x6c: trace:c75ss0_0
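The dump tool above assumes a fixed layout with exactly one vdev and one trace entry. The table itself is more general: after the 16-byte header comes an array of num 32-bit offsets, and each offset points at an entry whose first word is its type. A minimal sketch of walking that array follows. rsc_entry is a hypothetical helper, and the type numbers in the comment are the ones I know from the Linux remoteproc header.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Look up entry idx of a resource table held in tbl[0..size).
 * On success stores the entry's offset and type and returns 0.
 * Types in Linux: 0 = carveout, 1 = devmem, 2 = trace, 3 = vdev. */
int rsc_entry(const uint8_t *tbl, size_t size, uint32_t idx,
              uint32_t *offset_out, uint32_t *type_out)
{
    if (size < 16)
        return -1;

    uint32_t num = rd_le32(tbl + 4);       /* header: ver, num, reserved[2] */
    if (idx >= num || idx >= (size - 16) / 4)
        return -1;

    uint32_t off = rd_le32(tbl + 16 + (size_t)idx * 4);
    if (off > size || size - off < 4)      /* the type word must fit */
        return -1;

    *offset_out = off;
    *type_out = rd_le32(tbl + off);
    return 0;
}
```

Fed with the bytes from the hexdump above, entry 0 comes out at offset 0x18 with type 3 (vdev) and entry 1 at offset 0x5c with type 2 (trace), which agrees with the output of the dump tool.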
Build a kernel module to read the trace buffer
sudo apt-get update
sudo apt-get install linux-headers-$(uname -r)
sudo apt-get install linux-source
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/io.h>
#include <linux/slab.h>

#define DEFAULT_TRACE_BUFFER_ADDR 0xadd9e000
#define DEFAULT_BYTES_TO_READ 256

static unsigned long trace_buffer_addr = DEFAULT_TRACE_BUFFER_ADDR;
static int bytes_to_read = DEFAULT_BYTES_TO_READ;

module_param(trace_buffer_addr, ulong, 0644);
MODULE_PARM_DESC(trace_buffer_addr, "Physical address of the trace buffer");
module_param(bytes_to_read, int, 0644);
MODULE_PARM_DESC(bytes_to_read, "Number of bytes to read from the trace buffer");

static int __init trace_reader_init(void)
{
    void __iomem *virt_addr;
    u8 *buffer;

    // Reject a zero or negative length passed on the insmod command line
    if (bytes_to_read <= 0) {
        pr_err("Trace Reader: bytes_to_read must be positive\n");
        return -EINVAL;
    }

    pr_info("Trace Reader: Initializing\n");
    pr_info("Trace Reader: Reading %d bytes from physical address 0x%lx\n",
            bytes_to_read, trace_buffer_addr);

    // Request and map the memory region
    if (!request_mem_region(trace_buffer_addr, bytes_to_read, "trace_reader")) {
        pr_err("Trace Reader: Unable to request memory region\n");
        return -EBUSY;
    }
    virt_addr = ioremap(trace_buffer_addr, bytes_to_read);
    if (!virt_addr) {
        pr_err("Trace Reader: Unable to map memory\n");
        release_mem_region(trace_buffer_addr, bytes_to_read);
        return -ENOMEM;
    }

    // Allocate a buffer to store the read data
    buffer = kmalloc(bytes_to_read, GFP_KERNEL);
    if (!buffer) {
        pr_err("Trace Reader: Unable to allocate memory for buffer\n");
        iounmap(virt_addr);
        release_mem_region(trace_buffer_addr, bytes_to_read);
        return -ENOMEM;
    }

    // Read the data from the mapped memory
    memcpy_fromio(buffer, virt_addr, bytes_to_read);

    // Print the data as hex with an ASCII column, 16 bytes per row
    pr_info("Trace Reader: First %d bytes of trace buffer:\n", bytes_to_read);
    print_hex_dump(KERN_INFO, "Trace Reader: ", DUMP_PREFIX_OFFSET,
                   16, 1, buffer, bytes_to_read, true);

    // Clean up; the module stays loaded until rmmod but holds no resources
    kfree(buffer);
    iounmap(virt_addr);
    release_mem_region(trace_buffer_addr, bytes_to_read);
    return 0;
}

static void __exit trace_reader_exit(void)
{
    pr_info("Trace Reader: Exiting\n");
}

module_init(trace_reader_init);
module_exit(trace_reader_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("A simple module to read and print trace buffer contents");
Create a Makefile in the same directory with the following content:
obj-m += trace_reader.o
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
Compile the module:
This is not so easy. You might have to download the kernel sources from BeagleBoard.org: https://openbeagle.org/beagleboard/linux/-/tree/v6.1.83-ti-arm64-r63?ref_type=heads
An easier way is to read the trace through debugfs, see https://openbeagle.org/beagleboard/linux/-/blob/v6.1.83-ti-arm64-r63/drivers/remoteproc/remoteproc_debugfs.c
cd /sys/kernel/debug/remoteproc/<remote_processor_name>
cat <trace_file>
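The same read can be done from a small C program, which is handy if the trace should be post-processed or polled. This is only a sketch: copy_trace is a hypothetical helper, and the example path in the note below is my reading of remoteproc_debugfs.c, not something confirmed on the BeagleY-AI image.

```c
#include <stdio.h>

/* Copy the contents of a trace file to out.
 * Returns the number of bytes copied, or -1 on error. */
long copy_trace(const char *path, FILE *out)
{
    FILE *in = fopen(path, "r");
    if (!in)
        return -1;

    char buf[4096];
    long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            fclose(in);
            return -1;
        }
        total += (long)n;
    }

    int failed = ferror(in);
    fclose(in);
    return failed ? -1 : total;
}
```

A call such as copy_trace("/sys/kernel/debug/remoteproc/remoteproc0/trace0", stdout) needs root and a mounted debugfs.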
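If you do build the module, load it and pass the trace buffer address and the number of bytes to read:
sudo insmod trace_reader.ko trace_buffer_addr=0xadd9e000 bytes_to_read=512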
Check the kernel log to see the output:
dmesg | tail
sudo rmmod trace_reader
Helpful references
Device tree for the TI j722 evaluation module.
Device tree for the BeagleBoard image
Build instructions for vision apps
Edge AI instructions for the AM67A
We were able to build the hello world and ipc_echo firmware. In the next article we can try to fix the firmware and run it on the BeagleY-AI board. Hopefully we will also soon have a better image with a proper device tree from BeagleBoard, so that we can run our firmware.