sunfora

Computed goto crossfunction jumps

Prehistory

When I was 16 or something, I was very much into competitive programming.
Codeforces, practicing before the competition, recursive algorithms.

And if you know what I am talking about you probably know that when the solution finally clicks in your mind. You want to finish the implementation as fast as you can.

And I remember one day I was doing some recursive problem but it had this weird duplication of computation.
Where you have one function f and a dual to it g, f sometimes want to do what g does,
and g sometimes wants something specific from f.

Yes, it was pretty bad code and much more tells about how the solution finding has not been finalized yet.
But at this specific moment, I was having very little time to submit the solution.
An IDEA came to my mind?

What if I keep this code as it is. The C++ and C has a GOTO statement.
Why not just make this horrible mind bending jump from one function to the other and then maybe get back?

But when I tried it I have found a very nasty thing. What I thought was a nightmarish tool to use…
Was very much limited to a function scope. I have tried setjmp, longjmp but quickly realised it does not what I want to.

Since that day I had this idea that maybe it is still possible to do. You just have to be more knowledgable.

A twitter thread about GOTO in linux kernel

On Tuesday June 2nd I have seen some tweets discussing why does linux kernel has so many GOTOs.
I published an anecode of my own how I knew about the existance of GOTO before I was even introduced to loops and iterations.
What a dissapointment it was that Python indeed does not have GOTO. I was 14 or something.

Then I made two questions to the Google’s ‘Oracle’ Gemini, why was GOTO even banned in public opinion.
And have found out something very funny: the ALTER statement in COBOL.

Just take a look it is somewhat alike to changing the function pointer in C, but for GOTO.

   IDENTIFICATION DIVISION.
   PROGRAM-ID. ALTER-EXAMPLE.
   
   PROCEDURE DIVISION.
   
   MAIN-LOGIC.
  *    1. Executing this paragraph will initially route to FIRST-TIME
       PERFORM SWITCH-PARA.
       
  *    2. Modify SWITCH-PARA's GO TO target to point to LATER-TIMES
       ALTER SWITCH-PARA TO PROCEED TO LATER-TIMES.
       
  *    3. Subsequent executions will now route directly to LATER-TIMES
       PERFORM SWITCH-PARA.
       STOP RUN.

   SWITCH-PARA.
  *    This paragraph must ONLY contain this one standalone GO TO sentence
       GO TO FIRST-TIME.

   FIRST-TIME.
       DISPLAY "This runs only during initialization.".
       
   LATER-TIMES.
       DISPLAY "This runs on all subsequent passes.".

Basically you have a block named SWITCH-PARA which links to another block of computation FIRST-TIME.
And the interesting part is that the main can alter the execution if SWITCH-PARA by using the ALTER statement which would overwrite
where the SWITCH-PARA goes to (priorly to FIRST-TIME and then to LATER-TIMES).

Computed GOTO’s

I wondered what a nice and weird feature! Can I do something like this in C?! Probably not… Because the labels are not addressable in any way.
But apparently GCC has a feature exactly for that: computed goto’s (LLVM too).

The idea is simple, the code is just a memory with instructions. Thust we can have a pointer to a label. And we can jump to that pointer with indirect jump which is supported by most processor architectures.

jmp rax

And in C it looks like

#include <stdio.h>

int main() {
  void* label_ptr = 0;

  int even = 0;

  goto decide;

  label_1: 
    printf("1!\n");
    goto decide;
  label_2:
    printf("2!\n");
    goto decide;

  decide: 
  if (even) {
    label_ptr = &&label_1;
  } else {
    label_ptr = &&label_2;
  }
  even = 1 - even;
  goto *label_ptr;

  return 0;
}

This porgram would just switch between 1 and 2 forever. Pretty simple.

You can also change it from the outside and it will mirror the ALTER statement from the cobol language.

(as far as I understand from GCC perspective it is pretty legal).

#include <stdio.h>

void* block_a = 0;
void* block_b = 0;
void* block   = 0;

void procedure() { 
  if (block == 0) {
    block_a = &&a;
    block_b = &&b;
  }
  goto *block;
  a:
    printf("A");
    return;
  b:
    printf("B");
    return;
}

int main() {
  // read the a and b blocks
  procedure();
  
  // switch between them now
  int even = 0;

  for (int i = 0; i < 10; ++i) {
    if (even) {
      block = block_a;
    } else {
      block = block_b;
    }
    even = 1 - even;
    procedure();
  }

  return 0;
}

Jumping across functions.

But if we store a pointer to the global,
why not then jump to it right from the main?

This would do exactly what I was thinking to do in the past.
When I was doing that competitive programming problem.

But there is a reason why it is not really possible, nor permitted
even with GCC extension for computed goto. And that reason is: stack.

When function is called in C it reserves the space on a stack for saving the intermediate computation results and variables.

So if you just jump straight into other function you would do some bad damage to the original function’s stack. And the variables.

But there is more: when you hit return in a function, it would add back to the stack the value it supposedly substracted previously when you have entered it. And since it can vary we cannot just blindly asume that if just jump somewhere it will not break our of function stack frame completely.

But if you do it correctly and know what you are doing then you can actually achieve the goal. You have to:

  1. get the pointer to a label in other function
  2. get the pointer to a label in the main function
  3. make it so that the inner function jumps back to the main when it has done its work
  4. do not use any long computation which requires intermediate storage

And here is the listing of the finished program which does exactly that:

#include <stdio.h>

void *inside_ptr;
void *outside_ptr;

void my_function() {
    volatile int force_keep = 0;
    if (force_keep) {
        inside_ptr = &&unused;
    }
    inside_ptr = &&target;
    if (force_keep == 0) {
        return;
    }
target:
    printf("Inside!\n");
    goto *outside_ptr;
unused:
    return;
}

int main() {
    my_function();
    volatile int force_keep = 0;
    if (force_keep) {
        outside_ptr = &&unused;
    }
    outside_ptr = &&escape;
    goto *inside_ptr;
escape:
    printf("Back!\n");
    return 0;
unused:
    return 0;
}

And the assembly listing with -O3 enabled is

	.file	"main.c"
	.intel_syntax noprefix
	.text
	.section	.rodata.str1.1,"aMS",@progbits,1
.LC0:
	.string	"Inside!"
	.text
	.p2align 4
	.globl	my_function
	.type	my_function, @function
my_function:
.LFB23:
	.cfi_startproc
	endbr64
	sub	rsp, 24
	.cfi_def_cfa_offset 32
	mov	DWORD PTR 12[rsp], 0
	mov	eax, DWORD PTR 12[rsp]
	lea	rax, .L2[rip]
	mov	QWORD PTR inside_ptr[rip], rax
	mov	eax, DWORD PTR 12[rsp]
	test	eax, eax
	jne	.L2
	add	rsp, 24
	.cfi_remember_state
	.cfi_def_cfa_offset 8
	ret
	.p2align 4,,10
	.p2align 3
.L2:
	.cfi_restore_state
	endbr64
	lea	rdi, .LC0[rip]
	call	puts@PLT
	jmp	[QWORD PTR outside_ptr[rip]]
	.p2align 4,,10
	.p2align 3
.L3:
	endbr64
	add	rsp, 24
	.cfi_def_cfa_offset 8
	ret
	.cfi_endproc
.LFE23:
	.size	my_function, .-my_function
	.section	.rodata.str1.1
.LC1:
	.string	"Back!"
	.section	.text.startup,"ax",@progbits
	.p2align 4
	.globl	main
	.type	main, @function
main:
.LFB24:
	.cfi_startproc
	endbr64
	sub	rsp, 24
	.cfi_def_cfa_offset 32
	xor	eax, eax
	call	my_function
	mov	DWORD PTR 12[rsp], 0
	mov	eax, DWORD PTR 12[rsp]
	lea	rax, .L10[rip]
	mov	QWORD PTR outside_ptr[rip], rax
	jmp	[QWORD PTR inside_ptr[rip]]
	.p2align 4,,10
	.p2align 3
.L12:
	endbr64
.L11:
	xor	eax, eax
	add	rsp, 24
	.cfi_remember_state
	.cfi_def_cfa_offset 8
	ret
.L10:
	.cfi_restore_state
	endbr64
	lea	rdi, .LC1[rip]
	call	puts@PLT
	jmp	.L11
	.cfi_endproc
.LFE24:
	.size	main, .-main
	.globl	outside_ptr
	.bss
	.align 8
	.type	outside_ptr, @object
	.size	outside_ptr, 8
outside_ptr:
	.zero	8
	.globl	inside_ptr
	.align 8
	.type	inside_ptr, @object
	.size	inside_ptr, 8
inside_ptr:
	.zero	8
	.ident	"GCC: (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0"
	.section	.note.GNU-stack,"",@progbits
	.section	.note.gnu.property,"a"
	.align 8
	.long	1f - 0f
	.long	4f - 1f
	.long	5
0:
	.string	"GNU"
1:
	.align 8
	.long	0xc0000002
	.long	3f - 2f
2:
	.long	0x3
3:
	.align 8
4: