Process Scheduler

reduceOS Process Scheduler

This page was created on September 3rd, 2024. Validate that the kernel has a similar commit history before applying this knowledge.

Process scheduling in an OS is hard, I think everyone can agree. There are lots of part you need to keep up with, and debugging them is a mess.

The scheduler in reduceOS is based off the Misaka kernel scheduler, modified to fit into reduceOS. The scheduler itself uses the round-robin scheduling system, which isn't too great, but isn't too bad. Priority support is on a TODO list.

The Actual Switching

Let's jump straight into the thick of it. This is the main process scheduler in reduceOS:

void process_switchNext() {
    previousProcess = currentProcess;
    updateProcessTimes();

    // Get the next avialable process, discarding anything marked as finished
    do {
        currentProcess = process_getNextReadyProcess();
    } while (currentProcess->flags & PROCESS_FLAG_FINISHED);


    currentProcess->time_in = clock_getTimer();
    currentProcess->time_switch = currentProcess->time_in;



    // Restore paging and task switch context
    spinlock_lock(&switch_lock);
    vmm_switchDirectory(currentProcess->thread.page_directory);
    spinlock_release(&switch_lock);
    
    setKernelStack(currentProcess->image.stack);

    if (currentProcess->flags & PROCESS_FLAG_FINISHED) {
        panic("scheduler", "process_switchNext", "Process is marked finished, we should not have this process");
    }

    // Mark the process as running or started
    __sync_or_and_fetch(&currentProcess->flags, PROCESS_FLAG_STARTED);

    asm volatile ("" ::: "memory"); // this line was here when it worked, too scared to remove.

    load_context(&currentProcess->thread.context);
    __builtin_unreachable(); // You might not see this in reduceOS, because it's not required, but it's to help out ;)
}

You'll notice it's not too bad. This piece of code is in charge of actually switching to the process, but it's not the only piece of the puzzle. Let's break it down though:

It first sets the current process as the previous process and updates its times. This is how Round Robin works, each process gets a time slice on the CPU.
Then, it will try to find the next non-finished process.
The process it found will have its time_in and time_switch updated (time_in is when it was loaded, time_switch is when it switched).
The process has its page directory loaded. This is important because statically linked ELF files demand their own address space. If one file wants 0x80000000, but so does another, you cant just overwrite it. Sidebar: Each process is given a page directory and in that page directory, the physical location of where the ELF file is loaded into the address space it wants. That way, the currently running process can be perfectly fine running.
Spinlocks are used to prevent other parts of the code from interfering. When one part of the code has a spinlock, any other code that wants the spinlock has to wait until its released.
The TSS (read the OSdev wiki) has its kernel stack set to the image's mapped stack (it's just a kmalloc(), basically. see createProcess)
Using the atomic __sync operation, we mark the process as started. This looks scary, but in reality it's just adding PROCESS_FLAG_STARTED to the current process' flags.
Then we load context. This is the tough part!

Contexts

Switching between processes is a matter of loading and saving contexts. This is very similar to how an ISR/interrupt handler does it. When an IRQ is called, the handler will push all the current registers into a specific structure (registers_t) via push instructions. When the handler is done, it will restore them. This is EXACTLY how process switching works.

Of course, the processes themselves don't save their context, that's kernel specific. So how does this work? This:

void process_switchTask(uint8_t reschedule) {
    if (!currentProcess) return; // Scheduler disabled

    if (currentProcess == idleTask) {
        panic("scheduler", "process_switchTask", "Context switch from idleTask triggered from somewhere other than pre-emption source.");
    }

    // If a process got to switchTask but was not marked as running, it must be exiting.
    if (!(currentProcess->flags & PROCESS_FLAG_RUNNING) || (currentProcess == idleTask)) {
        process_switchNext();
        return;
    }

    // Save the FPU registers (TODO: move this to fpu.c)
    asm volatile ("fxsave (%0)" :: "r"(&currentProcess->thread.fp_regs));
    
    if (save_context(&currentProcess->thread.context) == 1) {
        // Returning from a task switch.
        asm volatile ("fxrstor (%0)" :: "r"(&currentProcess->thread.fp_regs));
        return;
    } 

    // If this is a normal yield, we nede to reschedule.
    if (reschedule) {
        makeProcessReady(currentProcess);
    }



    // switch_next() will not return
    process_switchNext();
}

A lot of this stuff is just scheduling stuff. The important part is at the part where it returns from a task switch. It saves context, checks if that save equaled 1, and if so, returned from a task switch.

But wait, if we look at the assembly code for save context, that makes no sense!

save_context:
    movl 4(%esp), %ecx
    leal 8(%esp), %eax
    movl %eax, 0(%ecx)
    movl %ebp, 4(%ecx)
    xorl %eax, %eax
    movl %eax, 8(%ecx)
    movl (%esp), %eax
    movl %eax, 12(%ecx)

    movl %ebx, 16(%ecx)
    movl %edi, 20(%ecx)
    movl %esi, 24(%ecx)
    movl %ebp, 28(%ecx)
    movl %esp, 32(%ecx)

    xor %eax, %eax
    retl

Well, this is complex, so you can't just immediately read it. Let me give some background: This function is passed a pointer to a structure called kthread_context_t, which in turn is loaded into ECX via the first instruction.
The structure has some variables around it loaded that are important, like ESP, EBP, EBX, etc.

The C language (at least the i386 spec) takes return values from EAX. The xor instruction means that that's not going to work! It clears it out, so it will always be 0. That makes no sense!

\But the magic lies here:

load_context:
    # Usermode shenanigans
    mov %gs:0x10, %eax
    cmp %gs:0x0, %eax
    je 1f
    lock andl $0xFFFFfff7, 0x14(%eax)
1:
    movl 4(%esp), %edi
    
    # Restoring from general context, SP/BP
    movl 0(%edi), %esp
    movl 4(%edi), %ebp

    # Restoring from saved[5], note we don't actually do basically any of it.
    movl 16(%edi), %ebx 

    movl $1, %eax                           # This code is to make sure switchTask knows when we're back from a switch
    jmpl *12(%edi)                          # Jump to IP

That last line, where it loads 1 into EAX. Still though, you might think, when the context was saved it would just jump back to that XOR instruction. However, save_context actually saves the line that called it. So, when process_switchTask saves context, that last line that loads 1 into EAX (which can be zeroed for processes) will jump back to that if statement with the return value being 1. Then it knows we're back home!

This switchTask function is called by the PIT every single tick, ONLY IN USERMODE. You can see that here:

    pitTicks++; // Increment tick count
    if (terminalMode == 1) updateTextCursor_vesa(); // To be replaced with some sort of handler/caller list

    // Update the clock
    clock_update();

    // Acknowledge the IRQ, ISR is smart enough to know this was acknowledged
    isrAcknowledge(reg->int_no);

    // If we are in kernel mode we should do whatever we need to do to launch the usermode process
    if (reg->cs == 0x08) return;
    
    process_switchTask(1);

Careful if you're doing this too! The isrAcknowledge statement is a custom one I wrote that will handle if an interrupt is never returned.
An interrupt MUST be acknowledged with the PIC (yes, the PIC, not the PIT, confusing, I know) before another can fire. The way isrAcknowledge works is it checks if the interrupt called was already acknowledged, and if it was, then it ignores it.

This took ~1 week to figure out, unfortunately. The scheduler itself has been in development for ~1 month now, and it's still not even done. fork() is broken as I'm writing this, or at least doesn't play nice with newlib :(

However, I am glad to share anything I've learned from the process scheduler if you need it - ping me in our Discord server, and I'll be glad to answer. I hope this wiki page helps!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Process Scheduler

reduceOS Process Scheduler

The Actual Switching

Contexts

Clone this wiki locally