Xv6 Syscalls

About this lab

In this lab you will add a new system call to xv6. Doing so will help you understand how system calls work and will expose you to some of the internals of the xv6 kernel. You will add more system calls in later labs.

Before you start coding, read Chapter 2 of the xv6 book, and Sections 4.3 and 4.4 of Chapter 4, and related source files:

  • The user-space “stubs” that route system calls into the kernel are in user/usys.S, which is generated by user/usys.pl when you run make. Declarations are in user/user.h.
  • The kernel-space code that routes a system call to the kernel function that implements it is in kernel/syscall.c and kernel/syscall.h.
  • Process-related code is in kernel/proc.h and kernel/proc.c.
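
To make the routing idea concrete before you read kernel/syscall.c, here is a minimal host-side sketch of a dispatcher that looks up a handler in a table indexed by system call number. It is an illustration only, not the xv6 code: every name prefixed with demo_ or DEMO_ is hypothetical, the numbers and return values are made up, and it compiles as an ordinary C file outside xv6.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical system call numbers for this sketch only;
   the real ones are defined in kernel/syscall.h. */
enum { DEMO_SYS_getpid = 1, DEMO_SYS_uptime = 2 };

/* Stand-in for the request that arrives from user space. */
struct demo_request {
  uint64_t num; /* requested system call number */
  uint64_t ret; /* value handed back to the caller */
};

/* Toy handlers that return fixed values. */
static uint64_t demo_sys_getpid(void) { return 42; }
static uint64_t demo_sys_uptime(void) { return 1000; }

/* Table of handlers indexed by system call number; slot 0 stays empty. */
static uint64_t (*demo_syscalls[])(void) = {
  [DEMO_SYS_getpid] = demo_sys_getpid,
  [DEMO_SYS_uptime] = demo_sys_uptime,
};

#define DEMO_NELEM(x) (sizeof(x) / sizeof((x)[0]))

/* Find the handler for req->num, run it, and store its result.
   Unknown numbers produce (uint64_t)-1. */
void demo_syscall(struct demo_request *req)
{
  uint64_t num = req->num;
  if (num > 0 && num < DEMO_NELEM(demo_syscalls) && demo_syscalls[num] != NULL) {
    req->ret = demo_syscalls[num]();
  } else {
    req->ret = (uint64_t)-1;
  }
}
```

Adding a new call in a design like this means choosing a number, writing a handler, and adding one table entry, which is roughly the shape of the work you will do for sysinfo later in this lab.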

Getting the starter code

You can get and extract the starter code with the commands below.

cd cs334
wget --no-check-certificate https://cs334.cs.vassar.edu/labs/xv6-syscalls.tar
tar xvf xv6-syscalls.tar
rm xv6-syscalls.tar
cd xv6-syscalls

Booting xv6

Inside your xv6-syscalls directory, make and run xv6.

make qemu
...
xv6 kernel is booting

hart 2 starting
hart 1 starting
init: starting sh
$

If you type ls at the prompt, you should see output similar to the following:

$ ls
.              1 1 1024
..             1 1 1024
README         2 2 2425
cat            2 3 36120
echo           2 4 34984
forktest       2 5 16912
grep           2 6 39496
init           2 7 35464
kill           2 8 34928
ln             2 9 34752
ls             2 10 38080
mkdir          2 11 35008
rm             2 12 34992
sh             2 13 57696
stressfs       2 14 35864
usertests      2 15 184560
grind          2 16 50816
wc             2 17 37072
zombie         2 18 34344
logstress      2 19 36848
forphan        2 20 35744
dorphan        2 21 35192
console        3 22 0
$

These are the files that mkfs includes in the initial file system; most are programs you can run. You just ran one of them: ls. xv6 has no ps command, but, if you type Ctrl-p, the kernel will print information about each process. If you try it now, you’ll see two lines: one for init, and one for sh.

To quit qemu type: Ctrl-a x (press Ctrl and a at the same time, followed by x).

Using gdb

In many cases, print statements will be sufficient to debug your kernel, but sometimes it is useful to single step through code or get a stack backtrace. The GDB debugger can help.

To use GDB with qemu, the assignment directory contains a .gdbinit file that sets up gdb properly. For gdb to load that file, the top-level .gdbinit file in your home directory must allow it. Here’s a one-line command you can copy to set this up.

cd; echo "set auto-load safe-path /" >> .gdbinit

To help you become familiar with gdb, run make qemu-gdb and then open another terminal window (also in the assignment directory). Once you have two windows open, type the following in the second (gdb) window:

gdb-multiarch

This will run gdb and bring you to the gdb prompt. Set a breakpoint at the syscall function and then continue running xv6 until it hits that breakpoint. Type (don’t copy) the gdb commands shown after each (gdb) prompt below.

(gdb) b syscall
Breakpoint 1 at 0x80001cfe: file kernel/syscall.c, line 133.
(gdb) c
Continuing.
[Switching to Thread 1.3]

Thread 3 hit Breakpoint 1, syscall () at kernel/syscall.c:133
133     {
(gdb) layout src
(gdb) backtrace

The layout command splits the window in two, showing where gdb is in the source code. The backtrace command prints a stack backtrace.

This assignment has several questions to answer, in addition to the code you will write. Put all of your answers to the assignment questions in a file called answers-syscall.txt.

Q1: Looking at the backtrace output, which function called syscall?

Type n a few times to step past struct proc *p = myproc(); Once past this statement, run the gdb command p/x *p, which prints the current process’s proc struct (see kernel/proc.h) in hex.

Q2: What is the value of p->trapframe->a7 and what does that value represent? (Hint: look at user/init.c, the first user program xv6 starts, and its compiled assembly user/init.asm.)

Later in this lab (or in following labs), you may make a programming error that causes the xv6 kernel to panic. For example, replace the statement num = p->trapframe->a7; with num = * (int *) 0; at the beginning of syscall, then run make qemu, and you will see something like the following:

xv6 kernel is booting

hart 2 starting
hart 1 starting
scause=0xd sepc=0x80001d0e stval=0x0
panic: kerneltrap

Quit out of qemu.

To track down the source of a kernel page-fault panic, search for the sepc value printed for the panic you just saw in the file kernel/kernel.asm, which contains the assembly for the compiled kernel.

Q3. Which assembly instruction is the kernel panicking at? Which register corresponds to the variable num?

To inspect the state of the processor and the kernel at the faulting instruction, fire up gdb, and set a breakpoint at the faulting instruction, like this:

(gdb) b *0x80001d0e
Breakpoint 1 at 0x80001cfe: file kernel/syscall.c, line 133.
(gdb) layout asm
(gdb) c

Confirm that the faulting assembly instruction is the same as the one you found above.

Q4. Why does the kernel crash? Hint: look at figure 3-3 in the text; is address 0 mapped in the kernel address space? Is that confirmed by the value in scause above? See RISC-V privileged instructions for a description of scause values.

Note that scause was printed by the kernel panic above, but often you need to look at additional info to track down the problem that caused the panic. For example, to find out which user process was running when the kernel panicked, you can print out the process’s name:

(gdb) p p->name

Q5. What is the name of the binary that was running when the kernel panicked? What is its process id (pid)?

This concludes a brief introduction to tracking down bugs with gdb; it is worth your time to revisit Using the GNU Debugger when tracking down kernel bugs.

Sleep

This exercise familiarizes you with writing a user program on xv6 and with the pause system call.

Implement a user-level sleep program for xv6, along the lines of the UNIX sleep command. Your sleep should pause for a user-specified number of ticks. A tick is a notion of time defined by the xv6 kernel, namely the time between two interrupts from the timer chip. Your solution should be in the file user/sleep.c.

Some hints:

  • Before you start coding, read Chapter 1 of the xv6 book.
  • Put your code in user/sleep.c. Look at some of the other programs in user/ (e.g., user/echo.c, user/grep.c, and user/rm.c) to see how command-line arguments are passed to a program.
  • Add your sleep program to UPROGS in Makefile; once you’ve done that, make qemu will compile your program and you’ll be able to run it from the xv6 shell.
  • If the user forgets to pass an argument, sleep should print an error message.
  • The command-line argument is passed as a string; you can convert it to an integer using atoi (see user/ulib.c). Use the system call pause().
  • See kernel/sysproc.c for the xv6 kernel code that implements the pause() system call (look for sys_pause), user/user.h for the C definition of pause() callable from a user program, and user/usys.S for the assembler code that jumps from user code into the kernel for pause().
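
The argument handling described in the hints can be sketched on an ordinary host system as follows. This is not a drop-in user/sleep.c: demo_pause is a hypothetical stand-in that records the request instead of calling the xv6 pause() system call, demo_sleep_main stands in for main, and the usage message text is made up. The host's stdio.h and stdlib.h are used here in place of user/user.h.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for xv6's pause(): it only records the
   requested tick count so the logic can be exercised on a host. */
static int demo_last_pause = -1;

static int demo_pause(int ticks)
{
  demo_last_pause = ticks;
  return 0;
}

/* Mirrors the shape of a main() for sleep.
   Returns 1 if the tick count is missing, 0 otherwise. */
int demo_sleep_main(int argc, char *argv[])
{
  if (argc < 2) {
    /* The user forgot the argument: report the error and stop. */
    fprintf(stderr, "usage: sleep ticks\n");
    return 1;
  }

  /* The argument arrives as a string; convert it to an integer. */
  int ticks = atoi(argv[1]);
  demo_pause(ticks);
  return 0;
}
```

In a real xv6 program you would include the xv6 headers used by the other programs in user/, call pause() directly, and end the program the same way those programs do; check user/echo.c for the pattern.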

Run the program from the xv6 shell:

make qemu
...
init: starting sh
$ sleep 10
(nothing happens for a little while)
$

Your program should pause when run as shown above. Run make grade in your command line (outside of qemu) to see if you pass the sleep tests.

Note that make grade runs all tests, including the ones for the tasks below. If you want to run the grade tests for one task, type:

./grade-lab-syscall sleep

This will run the grade tests that match “sleep”.

Sysinfo

For this part of the assignment you will add a system call, sysinfo, that collects information about the running system. The system call takes one argument: a pointer to a struct sysinfo (see kernel/sysinfo.h). The kernel should fill out the fields of this struct: the freemem field should be set to the number of bytes of free memory, and the nproc field should be set to the number of processes whose state is not UNUSED. We provide a test program sysinfotest; you pass this assignment if it prints "sysinfotest: OK". Some hints:

  • Add $U/_sysinfotest to UPROGS in the Makefile by removing the comment in that line in the Makefile.
  • Run make qemu and you will see that the compiler cannot compile user/sysinfotest.c because the user-space stubs for the system call don’t exist yet: add a prototype for the system call to user/user.h, a stub to user/usys.pl, and a syscall number to kernel/syscall.h. The Makefile invokes the perl script user/usys.pl, which produces user/usys.S, the actual system call stubs, which use the RISC-V ecall instruction to transition to the kernel. To declare the prototype for sysinfo() in user/user.h you need to pre-declare the existence of struct sysinfo:

    struct sysinfo;
    int sysinfo(struct sysinfo *);
    
  • Once you fix the compilation issues, run sysinfotest; it will fail because you haven’t implemented the system call in the kernel yet. Add a sys_sysinfo() function in kernel/sysproc.c that implements the new system call.

  • sysinfo needs to copy a struct sysinfo back to user space; see sys_fstat() (kernel/sysfile.c) and filestat() (kernel/file.c) for examples of how to do that using copyout().

  • You will have to modify kernel/syscall.c to accept the new syscall.

  • To collect the amount of free memory, add a function to kernel/kalloc.c.

  • To collect the number of processes, add a function to kernel/proc.c.
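
The two counting helpers mentioned in the last hints might look roughly like the host-side sketch below. It assumes that free pages sit on a singly linked list and that processes live in a fixed-size table with a state field; confirm both against kernel/kalloc.c and kernel/proc.c before relying on it. All demo_ and DEMO_ names, the page size constant, and the list of states are assumptions made for illustration. Locking and the copyout() step are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PGSIZE 4096 /* assumed page size in bytes */

/* Assumed free-list node: one node per free page. */
struct demo_run {
  struct demo_run *next;
};

/* Assumed process states; only DEMO_UNUSED matters for counting. */
enum demo_state { DEMO_UNUSED, DEMO_SLEEPING, DEMO_RUNNABLE, DEMO_RUNNING, DEMO_ZOMBIE };

struct demo_proc {
  enum demo_state state;
};

/* The two fields named in the assignment text. */
struct demo_sysinfo {
  uint64_t freemem; /* bytes of free memory */
  uint64_t nproc;   /* processes whose state is not UNUSED */
};

/* Walk the free list and add one page's worth of bytes per node. */
uint64_t demo_count_freemem(const struct demo_run *freelist)
{
  uint64_t bytes = 0;
  for (const struct demo_run *r = freelist; r != NULL; r = r->next)
    bytes += DEMO_PGSIZE;
  return bytes;
}

/* Count table slots that are in any state other than UNUSED. */
uint64_t demo_count_procs(const struct demo_proc *table, size_t n)
{
  uint64_t count = 0;
  for (size_t i = 0; i < n; i++) {
    if (table[i].state != DEMO_UNUSED)
      count++;
  }
  return count;
}

/* Fill a result struct from both helpers. */
void demo_fill_sysinfo(struct demo_sysinfo *out,
                       const struct demo_run *freelist,
                       const struct demo_proc *table, size_t n)
{
  out->freemem = demo_count_freemem(freelist);
  out->nproc = demo_count_procs(table, n);
}
```

In the kernel, shared structures like these are normally protected by locks, so think about which lock each helper should hold while it counts.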

Attack xv6

The xv6 kernel isolates user programs from each other and isolates the kernel from user programs. As you saw in the above assignments, an application cannot directly call a function in the kernel or in another user program; instead, interactions occur only through system calls. However, if there is a bug in the kernel’s implementation of a system call, an attacker may be able to exploit that bug to break the isolation boundaries. To get a sense for how bugs can be exploited, we have introduced a bug into xv6 and your goal is to exploit that bug to steal a secret from another process.

The bug is that the call to memset(mem, 0, sz) in uvmalloc() in kernel/vm.c to clear a newly-allocated page is omitted when compiling this assignment. Similarly, when compiling kernel/kalloc.c for this assignment, the two lines that use memset to put garbage into free pages are omitted. The net effect of omitting these 3 lines is that newly allocated memory retains the contents from its previous use. Thus an application that calls sbrk() to allocate memory may receive pages that have data in them from previous uses. Despite the 3 deleted lines, xv6 mostly works correctly; it even passes most of usertests.
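
The effect of skipping the clearing step can be reproduced with a toy page allocator on a host system. The sketch below is not xv6 code: the pool size, the last-in first-out reuse order, and all demo_ names are assumptions chosen to keep the example short. It shows only that a recycled page still holds whatever its previous owner wrote unless the allocator clears it.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_PGSIZE 4096
#define DEMO_NPAGES 4

/* A tiny pool of pages plus a stack of free page indices. */
static unsigned char demo_pool[DEMO_NPAGES][DEMO_PGSIZE];
static int demo_free_stack[DEMO_NPAGES];
static int demo_nfree = 0;

/* Zero the pool and mark every page free; page 0 ends up on top. */
void demo_init(void)
{
  memset(demo_pool, 0, sizeof(demo_pool));
  demo_nfree = 0;
  for (int i = DEMO_NPAGES - 1; i >= 0; i--)
    demo_free_stack[demo_nfree++] = i;
}

/* Hand out the most recently freed page.
   The page is zeroed only when 'clear' is nonzero. */
unsigned char *demo_alloc(int clear)
{
  if (demo_nfree == 0)
    return NULL;
  unsigned char *page = demo_pool[demo_free_stack[--demo_nfree]];
  if (clear)
    memset(page, 0, DEMO_PGSIZE);
  return page;
}

/* Return a page to the pool without touching its contents. */
void demo_free(unsigned char *page)
{
  int idx = (int)((page - (unsigned char *)demo_pool) / DEMO_PGSIZE);
  demo_free_stack[demo_nfree++] = idx;
}
```

If one caller writes a string into a page, frees it, and a second caller then calls demo_alloc(0), the second caller can read that string back from the same offset. With demo_alloc(1) the page comes back zeroed.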

user/secret.c writes a secret string in its memory and then exits (which frees its memory). Your goal is to add a few lines of code to user/attack.c to find the secret that a previous execution of secret.c wrote to memory, and to print the secret on a line by itself.

Your attack.c must work with unmodified xv6 and unmodified secret.c. You can change anything to help you experiment and debug, but must revert those changes before final testing and submitting.

The secret program takes the secret as an argument. You can test your attack program by running secret with some argument, then running attack, and seeing whether attack prints exactly the argument passed to secret. Here’s a successful run:

$ secret xyzzy
$ attack
xyzzy
$

Depending on exactly how you implement your attack, you may need to run attack a second time in order for it to find the secret. The grader runs attack twice, just in case, and is satisfied if either produces the secret.

From outside xv6, you can use ./grade-lab-syscall attack, or make grade, to see if your attack passes our tests. The secret strings that the tests generate are guaranteed to contain only digits and upper and lower case letters.

As this example shows, bugs that do not directly affect correctness can sometimes be exploited to break security. Careful programming and extensive testing can reduce the number of bugs but can’t guarantee their absence.

Submitting

Don’t forget to put your answers in the file answers-syscall.txt.

Assignment submissions are handled by Gradescope. When you’re ready to submit, run make submit, which will generate syscall-submit.tar. Upload this tar file to Gradescope.

Acknowledgements

Parts of this assignment are adapted from materials used in the 6.S081 course at MIT.