In this lab you will implement the basic kernel facilities required to get a protected user-mode environment (i.e., "process") running. You will enhance the JOS kernel to set up the data structures to keep track of user environments, create a single user environment, load a program image into it, and start it running. You will also make the JOS kernel capable of handling any system calls the user environment makes and handling any other exceptions it causes.
Note: In this lab, the terms environment and process are interchangeable - both refer to an abstraction that allows you to run a program. We introduce the term "environment" instead of the traditional term "process" in order to stress the point that JOS environments and UNIX processes provide different interfaces, and do not provide the same semantics.
Create a local branch called lab4 based on our lab4 branch, origin/lab4, and then fetch the latest version from the course repository:
$ cd ~/cs134/lab
$ git checkout --track origin/lab4
Branch lab4 set up to track remote branch refs/remotes/origin/lab4.
Switched to a new branch "lab4"
$ git pull upstream lab4    # Pulls any changes I have made in the upstream repository
$
You will now need to merge the changes you made in your lab3 branch into the lab4 branch, as follows:
$ git merge lab3
Merge made by the recursive strategy.
...
$
In some cases, Git may not be able to figure out how to merge your changes with the new lab assignment (e.g., if you modified some of the code that is changed in the new lab assignment). In that case, the git merge command will tell you which files are conflicted, and you should first resolve the conflict (by editing the relevant files) and then commit the resulting files with git commit -a.
Important note. If your Pull Request for lab3 has finished being reviewed, then you know that lab3 is complete, and you will never need to merge from lab3 again. However, if it is still being reviewed, then there may be changes required before the review is complete. Those changes will need to be merged into lab4-no-code, which you can do after the Pull Request is complete, by another call to git merge lab3 from lab4-no-code. Then, you would do a git merge lab4-no-code from lab4. You should merge into both branches so that the Pull Request for lab4 does not include the changes from lab3.
At this point, Lab 4 is ready to go. Before making any code changes, do the following:
$ git branch lab4-no-code             # creates a branch prior to adding any Lab 4 code
$ git push -u origin lab4-no-code     # pushes the new branch to the origin
You may also want to take another look at the lab tools guide, as it includes information on debugging user code that becomes relevant in this lab.
In this lab and subsequent labs, do all of the regular exercises described in the lab. You can also do challenge problems. (Some challenge problems are more challenging than others, of course!) Additionally, write up brief answers to any questions posed in the lab and a short (e.g., one or two paragraph) description of what you did to solve each chosen challenge problem. Place the write-up in a file called answers-lab4.txt in the top level of your lab directory before submitting your work. Do not forget to add that file to git.
Now that your kernel has basic exception handling capabilities, you will refine it to provide important operating system primitives that depend on exception handling.
The page fault exception, interrupt vector 14 (T_PGFLT), is a particularly important one that we will exercise heavily throughout this lab and the next. When the processor takes a page fault, it stores the linear (i.e., virtual) address that caused the fault in a special processor control register, CR2. In trap.c we have provided the beginnings of a special function, page_fault_handler(), to handle page fault exceptions.
Exercise 1. Modify trap_dispatch() to dispatch page fault exceptions to page_fault_handler(). You should now be able to get make grade to succeed on the faultread, faultreadkernel, faultwrite, and faultwritekernel tests. If any of them don't work, figure out why and fix them. Remember that you can boot JOS into a particular user program using make run-x or make run-x-nox. For instance, make run-hello-nox runs the hello user program.
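The dispatch pattern that Exercise 1 asks for can be sketched in isolation. This is a minimal user-space model, not the real kernel code: the one-field struct Trapframe, the stub handler body, the page_faults_seen counter, and the HANDLED_PGFLT/UNHANDLED return codes are all assumptions made so the sketch compiles on its own. Only T_PGFLT = 14 and the handler name come from the text above.

```c
#include <stdint.h>

#define T_PGFLT 14              /* page fault vector, as stated above */

/* Simplified stand-in for the kernel's struct Trapframe (assumption). */
struct Trapframe {
    uint32_t tf_trapno;         /* which vector fired */
};

/* Hypothetical return codes so the sketch can be checked in user space. */
enum { UNHANDLED = 0, HANDLED_PGFLT = 1 };

static int page_faults_seen;    /* stub bookkeeping, not part of JOS */

/* Stub standing in for the handler provided in trap.c. */
static void page_fault_handler(struct Trapframe *tf)
{
    (void)tf;
    page_faults_seen++;
}

/* Route a trap to its handler based on the vector number. */
static int trap_dispatch(struct Trapframe *tf)
{
    switch (tf->tf_trapno) {
    case T_PGFLT:
        page_fault_handler(tf);
        return HANDLED_PGFLT;
    default:
        /* The real kernel has its own fallback path for unexpected
         * traps; it is omitted from this sketch. */
        return UNHANDLED;
    }
}
```

A switch on the trap number keeps the dispatcher easy to extend as later exercises add more vectors.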
You will further refine the kernel's page fault handling below, as you implement system calls.
The breakpoint exception, interrupt vector 3 (T_BRKPT), is normally used to allow debuggers to insert breakpoints in a program's code by temporarily replacing the relevant program instruction with the special 1-byte int3 software interrupt instruction. In JOS we will abuse this exception slightly by turning it into a primitive pseudo-system call that any user environment can use to invoke the JOS kernel monitor. This usage is actually somewhat appropriate if we think of the JOS kernel monitor as a primitive debugger. The user-mode implementation of panic() in lib/panic.c, for example, performs an int3 after displaying its panic message.
Exercise 2. Modify trap_dispatch() to make breakpoint exceptions invoke the kernel monitor. You should now be able to get make grade to succeed on the breakpoint test.
Challenge! Modify the JOS kernel monitor so that you can 'continue' execution from the current location (e.g., after the int3, if the kernel monitor was invoked via the breakpoint exception), and so that you can single-step one instruction at a time. You will need to understand certain bits of the EFLAGS register in order to implement single-stepping.
Optional: If you're feeling really adventurous, find some x86 disassembler source code - e.g., by ripping it out of QEMU, or out of GNU binutils, or just write it yourself - and extend the JOS kernel monitor to be able to disassemble and display instructions as you are stepping through them. Combined with the symbol table loading from lab 1, this is the stuff of which real kernel debuggers are made.
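For the single-stepping part of the challenge, the relevant EFLAGS bit is the trap flag, TF (bit 8, mask 0x100). When TF is set in the flags that are restored on return to the environment, the processor raises a debug exception (vector 1) after executing one instruction. The sketch below only shows the bit manipulation. The helper names monitor_enable_step and monitor_disable_step are hypothetical, and the one-field struct stands in for the real trap frame.

```c
#include <stdint.h>

#define FL_TF 0x00000100u   /* EFLAGS trap flag (bit 8) */

/* Only the field this sketch needs; the real struct Trapframe has more. */
struct Trapframe {
    uint32_t tf_eflags;
};

/* 'step': resume with TF set, so the CPU traps again after one
 * instruction. (Hypothetical helper name.) */
static void monitor_enable_step(struct Trapframe *tf)
{
    tf->tf_eflags |= FL_TF;
}

/* 'continue': resume with TF cleared, so execution runs freely.
 * (Hypothetical helper name.) */
static void monitor_disable_step(struct Trapframe *tf)
{
    tf->tf_eflags &= ~FL_TF;
}
```

For stepping to be useful, the debug exception would also need to be routed back into the monitor, in the same way as the breakpoint exception.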
Questions
Depending on how you initialized the breakpoint entry in the IDT (i.e., your call to SETGATE from trap_init), the breakpoint test case will generate either a breakpoint exception or a general protection fault. Why? How do you need to set it up in order to get the breakpoint exception to work as specified above, and what incorrect setup would cause it to trigger a general protection fault?
User processes ask the kernel to do things for them by invoking system calls. When the user process invokes a system call, the processor enters kernel mode, the processor and the kernel cooperate to save the user process's state, the kernel executes appropriate code in order to carry out the system call, and then resumes the user process. The exact details of how the user process gets the kernel's attention and how it specifies which call it wants to execute vary from system to system.
In the JOS kernel, we will use the int instruction, which causes a processor interrupt. In particular, we will use int $0x30 as the system call interrupt. We have defined the constant T_SYSCALL to 48 (0x30) for you. You will have to set up the interrupt descriptor to allow user processes to cause that interrupt. Note that interrupt 0x30 cannot be generated by hardware, so there is no ambiguity caused by allowing user code to generate it.
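What "allowing user processes to cause that interrupt" means in hardware terms: for a software int instruction, the processor compares the current privilege level (CPL) with the descriptor privilege level (DPL) of the gate, and delivers a general protection fault (vector 13) instead of the requested vector when the CPL is numerically greater than the gate's DPL. The function below is a toy model of that single check. Its name, software_int_delivers, is hypothetical, and it ignores every other condition the processor evaluates.

```c
#define T_GPFLT   13    /* general protection fault vector */
#define T_SYSCALL 48    /* 0x30, as defined above */

/* Vector actually delivered when code running at privilege level `cpl`
 * executes `int vector` through a gate whose DPL is `gate_dpl`.
 * Lower numbers mean more privilege: 0 is kernel, 3 is user. */
static int software_int_delivers(int vector, int cpl, int gate_dpl)
{
    return (cpl <= gate_dpl) ? vector : T_GPFLT;
}
```

With a DPL of 3 on the gate, user code at CPL 3 reaches the handler for vector 48. With a DPL of 0, the same instruction is reported as vector 13.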
The application will pass the system call number and the system call arguments in registers. This way, the kernel won't need to grub around in the user environment's stack or instruction stream. The system call number will go in %eax, and the arguments (up to five of them) will go in %edx, %ecx, %ebx, %edi, and %esi, respectively. The kernel passes the return value back in %eax. The assembly code to invoke a system call has been written for you, in syscall() in lib/syscall.c. You should read through it and make sure you understand what is going on.
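The register convention just described can be modelled in plain C. In this sketch, struct PushRegs mirrors the saved-register layout in the kernel headers as recalled from memory, so check it against inc/trap.h. struct SyscallRequest, decode_syscall, and store_syscall_result are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

/* Saved general-purpose registers. Field names follow the kernel's
 * struct PushRegs as recalled; verify against inc/trap.h. */
struct PushRegs {
    uint32_t reg_edi, reg_esi, reg_ebp, reg_oesp;
    uint32_t reg_ebx, reg_edx, reg_ecx, reg_eax;
};

/* Hypothetical helper type: a decoded system call request. */
struct SyscallRequest {
    uint32_t num;       /* system call number */
    uint32_t arg[5];    /* up to five arguments */
};

/* Pull the call number and arguments out of the saved registers,
 * in the order %eax; %edx, %ecx, %ebx, %edi, %esi. */
static struct SyscallRequest decode_syscall(const struct PushRegs *r)
{
    struct SyscallRequest req;
    req.num    = r->reg_eax;
    req.arg[0] = r->reg_edx;
    req.arg[1] = r->reg_ecx;
    req.arg[2] = r->reg_ebx;
    req.arg[3] = r->reg_edi;
    req.arg[4] = r->reg_esi;
    return req;
}

/* The return value travels back to user space in %eax. */
static void store_syscall_result(struct PushRegs *r, int32_t ret)
{
    r->reg_eax = (uint32_t)ret;
}
```

Because the saved %eax is overwritten with the result, the user-side stub sees the return value once the trap frame is restored.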
Exercise 3. Add a handler in the kernel for interrupt vector T_SYSCALL. You will have to edit kern/trapentry.S and kern/trap.c's trap_init(). You also need to change trap_dispatch() to handle the system call interrupt by calling syscall() (defined in kern/syscall.c) with the appropriate arguments, and then arranging for the return value to be passed back to the user process in %eax. Finally, you need to implement syscall() in kern/syscall.c. Make sure syscall() returns -E_INVAL if the system call number is invalid. You should read and understand lib/syscall.c (especially the inline assembly routine) in order to confirm your understanding of the system call interface. Handle all the system calls listed in inc/syscall.h by invoking the corresponding kernel function for each call.
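One possible shape for the dispatcher in Exercise 3 is a switch over the call number that falls through to -E_INVAL. The sketch below is a stand-alone model under several assumptions: the call numbers are what inc/syscall.h is believed to list at this stage, E_INVAL is taken to be 3, the stub_* functions replace the real kernel routines, and the function is named kern_syscall rather than syscall only to avoid clashing with the host C library.

```c
#include <stdint.h>

#define E_INVAL 3   /* assumed value; see inc/error.h */

/* Assumed contents of inc/syscall.h at this point in the course. */
enum { SYS_cputs = 0, SYS_cgetc, SYS_getenvid, SYS_env_destroy, NSYSCALLS };

/* Stubs standing in for the real kernel-side implementations. */
static int32_t stub_cputs(uint32_t s, uint32_t len) { (void)s; (void)len; return 0; }
static int32_t stub_cgetc(void) { return 0; }
static int32_t stub_getenvid(void) { return 0x1000; }
static int32_t stub_env_destroy(uint32_t envid) { (void)envid; return 0; }

/* Dispatch on the call number; unknown numbers yield -E_INVAL. */
static int32_t kern_syscall(uint32_t num, uint32_t a1, uint32_t a2,
                            uint32_t a3, uint32_t a4, uint32_t a5)
{
    (void)a3; (void)a4; (void)a5;   /* unused by these four calls */

    switch (num) {
    case SYS_cputs:       return stub_cputs(a1, a2);
    case SYS_cgetc:       return stub_cgetc();
    case SYS_getenvid:    return stub_getenvid();
    case SYS_env_destroy: return stub_env_destroy(a1);
    default:              return -E_INVAL;
    }
}
```

Putting the error return in the default case means that any number outside the known set, including NSYSCALLS itself, is rejected without a separate range check.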
Run the user/hello program under your kernel (make run-hello, or make run-hello-nox). It should print "hello, world" on the console and then cause a page fault in user mode. If this does not happen, it probably means your system call handler isn't quite right. You should also now be able to get make grade to succeed on the testbss test.
Challenge! Implement system calls using the sysenter and sysexit instructions instead of using int 0x30 and iret. The sysenter/sysexit instructions were designed by Intel to be faster than int/iret. They do this by using registers instead of the stack and by making assumptions about how the segmentation registers are used. The exact details of these instructions can be found in Volume 2B of the Intel reference manuals.
The easiest way to add support for these instructions in JOS is to add a sysenter_handler in kern/trapentry.S that saves enough information about the user environment to return to it, sets up the kernel environment, pushes the arguments to syscall() and calls syscall() directly. Once syscall() returns, set everything up for and execute the sysexit instruction.
You will also need to add code to kern/init.c to set up the necessary model specific registers (MSRs). Section 6.1.2 in Volume 2 of the AMD Architecture Programmer's Manual and the reference on SYSENTER in Volume 2B of the Intel reference manuals give good descriptions of the relevant MSRs. You can find an implementation of wrmsr to add to inc/x86.h for writing to these MSRs here.
Finally, lib/syscall.c must be changed to support making a system call with sysenter. Here is a possible register layout for the sysenter instruction:
eax                - syscall number
edx, ecx, ebx, edi - arg1, arg2, arg3, arg4
esi                - return pc
ebp                - return esp
esp                - trashed by sysenter
GCC's inline assembler will automatically save registers that you tell it to load values directly into. Don't forget to either save (push) and restore (pop) other registers that you clobber, or tell the inline assembler that you're clobbering them. The inline assembler doesn't support saving %ebp, so you will need to add code to save and restore it yourself. The return address can be put into %esi by using an instruction like leal after_sysenter_label, %%esi.
Note that this only supports 4 arguments, so you will need to leave the old method of doing system calls around to support 5 argument system calls. Furthermore, because this fast path doesn't update the current environment's trap frame, it won't be suitable for some of the system calls we add in later labs.
You may have to revisit your code once we enable asynchronous interrupts in the next lab. Specifically, you'll need to enable interrupts when returning to the user process, which sysexit doesn't do for you.
A user program starts running at the top of lib/entry.S. After some setup, this code calls libmain(), in lib/libmain.c. You should modify libmain() to initialize the global pointer thisenv to point at this environment's struct Env in the envs[] array. (Note that lib/entry.S has already defined envs to point at the UENVS mapping you set up in Part A.) Hint: look in inc/env.h and use sys_getenvid.
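The hint amounts to turning an environment ID into an index into envs[]. The sketch below models that lookup in user space. The ENVX macro and the NENV value follow inc/env.h as recalled, so verify them there. The local envs array stands in for the UENVS mapping, sys_getenvid is a stub driven by the hypothetical variable fake_getenvid, and init_thisenv is a hypothetical name for the one line that would go into libmain().

```c
#include <stdint.h>

/* As recalled from inc/env.h; verify against your tree. */
#define LOG2NENV 10
#define NENV (1 << LOG2NENV)
#define ENVX(envid) ((envid) & (NENV - 1))  /* low bits index envs[] */

typedef int32_t envid_t;

/* Only the field this sketch needs; the real struct Env has more. */
struct Env {
    envid_t env_id;
};

static struct Env envs[NENV];       /* stands in for the UENVS mapping */
static const struct Env *thisenv;

static envid_t fake_getenvid = 0x1000;                       /* test hook */
static envid_t sys_getenvid(void) { return fake_getenvid; }  /* stub */

/* Hypothetical helper: what libmain() would do before calling umain. */
static void init_thisenv(void)
{
    thisenv = &envs[ENVX(sys_getenvid())];
}
```

With an environment ID of 0x1000, the low ten bits are zero, so thisenv ends up pointing at envs[0].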
libmain() then calls umain, which, in the case of the hello program, is in user/hello.c. Note that after printing "hello, world", it tries to access thisenv->env_id. This is why it faulted earlier. Now that you've initialized thisenv properly, it should not fault. If it still faults, you probably haven't mapped the UENVS area user-readable (back in Part A in pmap.c; this is the first time we've actually used the UENVS area).
Exercise 4. Add the required code to the user library, then boot your kernel. You should see user/hello print "hello, world" and then print "i am environment 00001000". user/hello then attempts to "exit" by calling sys_env_destroy() (see lib/libmain.c and lib/exit.c). Since the kernel currently only supports one user environment, it should report that it has destroyed the only environment and then drop into the kernel monitor. You should be able to get make grade to succeed on the hello test.
Memory protection is a crucial feature of an operating system, ensuring that bugs in one program cannot corrupt other programs or corrupt the operating system itself.
Operating systems usually rely on hardware support to implement memory protection. The OS keeps the hardware informed about which virtual addresses are valid and which are not. When a program tries to access an invalid address or one for which it has no permissions, the processor stops the program at the instruction causing the fault and then traps into the kernel with information about the attempted operation. If the fault is fixable, the kernel can fix it and let the program continue running. If the fault is not fixable, then the program cannot continue, since it will never get past the instruction causing the fault.
As an example of a fixable fault, consider an automatically extended stack. In many systems the kernel initially allocates a single stack page, and then if a program faults accessing pages further down the stack, the kernel will allocate those pages automatically and let the program continue. By doing this, the kernel only allocates as much stack memory as the program needs, but the program can work under the illusion that it has an arbitrarily large stack.
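System calls present an interesting problem for memory protection. Most system call interfaces let user programs pass pointers to the kernel. These pointers point at user buffers to be read or written. The kernel then dereferences these pointers while carrying out the system call. There are two problems with this:
A user-supplied pointer may be invalid, so dereferencing it could make the kernel itself take a page fault.
A user-supplied pointer may refer to memory that the kernel can read or write but that the user program is not permitted to access.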
For both of these reasons the kernel must be extremely careful when handling pointers presented by user programs.
You will now solve these two problems with a single mechanism that scrutinizes all pointers passed from userspace into the kernel. When a program passes the kernel a pointer, the kernel will check that the address is in the user part of the address space, and that the page table would allow the memory operation.
Thus, the kernel will never suffer a page fault due to dereferencing a user-supplied pointer. If the kernel does page fault, it should panic and terminate.
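The check described above can be modelled page by page. This is a toy, user-space sketch, not the kernel's user_mem_check. The real function takes an environment and a pointer and consults that environment's page table, whereas here toy_pte_lookup is a hypothetical lookup over two made-up pages. The ULIM value and E_FAULT = 6 are assumptions, so check inc/memlayout.h and inc/error.h.

```c
#include <stdint.h>

#define PGSIZE  4096u
#define PTE_P   0x001u      /* present */
#define PTE_W   0x002u      /* writable */
#define PTE_U   0x004u      /* user-accessible */
#define ULIM    0xef800000u /* assumed top of user-accessible space */
#define E_FAULT 6           /* assumed value; see inc/error.h */

static uint32_t user_mem_check_addr;    /* first offending address */

/* Hypothetical page table: a read-only user page at 0x00800000 and a
 * writable user page at 0x00801000. Everything else is unmapped. */
static uint32_t toy_pte_lookup(uint32_t va)
{
    uint32_t page = va & ~(PGSIZE - 1);
    if (page == 0x00800000u) return PTE_P | PTE_U;
    if (page == 0x00801000u) return PTE_P | PTE_U | PTE_W;
    return 0;
}

/* Check that [va, va+len) lies below ULIM and that every page in it is
 * mapped with at least `perm | PTE_P`. Returns 0 or -E_FAULT. */
static int toy_user_mem_check(uint32_t va, uint32_t len, uint32_t perm)
{
    uint64_t end = (uint64_t)va + len;  /* 64-bit avoids wraparound */
    uint64_t addr = va;

    perm |= PTE_P;
    while (addr < end) {
        if (addr >= ULIM || (toy_pte_lookup((uint32_t)addr) & perm) != perm) {
            user_mem_check_addr = (uint32_t)addr;
            return -E_FAULT;
        }
        /* Advance to the start of the next page. */
        addr = (addr & ~(uint64_t)(PGSIZE - 1)) + PGSIZE;
    }
    return 0;
}
```

The first iteration tests va itself rather than the rounded-down page address, so a failure in the first page reports the exact address the caller passed. Later iterations report the start of the offending page.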
Exercise 5. Change kern/trap.c to panic if a page fault happens in kernel mode. Hint: to determine whether a fault happened in user mode or in kernel mode, check the low bits of the tf_cs.
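The low two bits of a code segment selector hold the privilege level at which the interrupted code was running: 0 for the kernel and 3 for user code. A minimal sketch of the test the hint refers to follows. The selector values GD_KT and GD_UT are as recalled from inc/memlayout.h, and trap_from_kernel is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

#define GD_KT 0x08  /* kernel text selector (as recalled) */
#define GD_UT 0x18  /* user text selector (as recalled) */

/* True if the saved code segment shows the trap came from ring 0. */
static bool trap_from_kernel(uint16_t tf_cs)
{
    return (tf_cs & 3) == 0;
}
```

User environments run with the selector GD_UT | 3, so the masked value is 3 for them and 0 for kernel code.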
Read user_mem_assert in kern/pmap.c and implement user_mem_check in that same file.
Change kern/syscall.c to sanity check arguments to system calls.
Boot your kernel, running user/buggyhello. The environment should be destroyed, and the kernel should not panic. You should see:
[00001000] user_mem_check assertion failure for va 00000001
[00001000] free env 00001000
Destroyed the only environment - nothing more to do!
Finally, change debuginfo_eip in kern/kdebug.c to call user_mem_check on usd, stabs, and stabstr. If you now run user/breakpoint, you should be able to run backtrace from the kernel monitor and see the backtrace traverse into lib/libmain.c before the kernel panics with a page fault. What causes this page fault? You don't need to fix it, but you should understand why it happens.
Note that the same mechanism you just implemented also works for malicious user applications (such as user/evilhello).
Exercise 6. Boot your kernel, running user/evilhello. The environment should be destroyed, and the kernel should not panic. You should see:
[00000000] new env 00001000
...
[00001000] user_mem_check assertion failure for va f010000c
[00001000] free env 00001000
This completes the lab. In the lab directory, commit your changes with git commit and type make handin to get instructions for submitting your code.
See our page on GitHub and Pull Requests for detailed information on pull requests and submitting your code.