- 1 Boot
- 2 Monitor
- 3 Memory Management
- 4 Environment
- 5 Trap
- 6 Time Tick
- 7 Multiple Processor
- 8 Thread
- 9 Concurrency
- 10 File System
- 11 Shell
- 12 Network
start pa: 0xffff0
start pa: 0x7c00
file: boot/Makefile
specify the entry point of the bootloader
ld -e start -Ttext 0x7C00
file: boot/boot.S function: start
1. switch to 32-bit protected mode
2. jump into bootmain
file: boot/bootmain.c function: bootmain
1. load the ELF file, including the ELF header & program headers
2. jump into the kernel entry
start pa: 0x00100000 va: 0xF0100000
file: kernel/kernel.ld
specify the entry point of the kernel
ENTRY(_start)
SECTIONS {
. = 0xF0100000;
.text : AT(0x100000) {
...
}
}
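The kernel is linked at va 0xF0100000 but loaded at pa 0x00100000, so kernel addresses differ from physical ones by a constant offset. A minimal sketch of that conversion follows; the KERNBASE value is inferred from the two addresses above, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from the addresses above: 0xF0100000 - 0x00100000. */
#define KERNBASE 0xF0000000u

/* Kernel virtual address -> physical address. */
static uint32_t kaddr_to_pa(uint32_t va)
{
    assert(va >= KERNBASE);               /* only valid for kernel addresses */
    return va - KERNBASE;
}

/* Physical address -> kernel virtual address. */
static uint32_t pa_to_kaddr(uint32_t pa)
{
    assert(pa <= UINT32_MAX - KERNBASE);  /* result must fit in 32 bits */
    return pa + KERNBASE;
}
```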
file: kernel/entry.S function: _start
- load entry_pgdir into cr3 and turn on paging
- initialize kernel stack(8 pages) in data segment
- call init
file: kernel/init.c function: init
- initialize bss segment (optional)
- initialize console devices, including CGA, keyboard and serial port (cons_init)
- initialize memory (mem_init)
- initialize task (env_init)
- initialize trap (trap_init)
- initialize multiprocessor (mp_init)
- initialize interrupt controller (lapic_init & pic_init)
- initialize clock (time_init)
- initialize PCI bus (pci_init)
- lock kernel before waking up non-boot CPUs (boot_aps)
- start file system server (fs serve)
- start network server (net serve)
- start init process (user initsh)
file: kernel/monitor.c
function: mon_backtrace
Display the function call backtrace
- read & load ebp
- get eip from ebp
- resolve eip through the STAB debug info (debuginfo_eip):
a. source (kernel/user)
b. file
c. function
d. line
          +---------------+
          |     arg n     |
          +---------------+
          |     arg 2     |
          +---------------+
          |     arg 1     |
          +---------------+
          |   ret %eip    |  caller function's stack frame
          +===============+
          |  saved %ebp   |  callee function's stack frame
%ebp -->  +---------------+  <-- where Prologue & Epilogue happen
          |     local     |
          |  variables,   |
%esp -->  |     etc.      |
          +---------------+
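The frame layout above is what lets a backtrace follow saved %ebp values from callee to caller. The sketch below walks such a chain over a simulated stack; it is not the mon_backtrace code. The word-indexed memory model, the name walk_frames and the "saved ebp of 0 ends the chain" rule are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * `stack` models memory as 32-bit words and `ebp` is a word index into it.
 * stack[ebp] holds the caller's saved ebp, stack[ebp + 1] the return eip.
 * A saved ebp of 0 marks the outermost frame (assumption of this sketch).
 * Returns the number of return addresses written to `eips`.
 */
static size_t walk_frames(const uint32_t *stack, size_t nwords,
                          uint32_t ebp, uint32_t *eips, size_t max)
{
    size_t n = 0;

    while (ebp != 0 && n < max) {
        assert((size_t)ebp + 1 < nwords);  /* stay inside the model */
        eips[n++] = stack[ebp + 1];        /* ret %eip sits above saved %ebp */
        ebp = stack[ebp];                  /* follow the link to the caller */
    }
    return n;
}
```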
file: kernel/pmap.c function: mem_init
- detect available physical memory size (i386_detect_memory)
- allocate kern_pgdir by simplified allocator (boot_alloc)
- setup user page tables (UVPT, read only)
- allocate page array
- allocate env array
- initialize page array
- map page array to user (UPAGES, read only)
- map env array to user (UENVS, read only)
- map [KERNBASE, 2^32) to kernel
- map kernel stack array to kernel (KSTACKTOP)
- load kern_pgdir into cr3 register
- enable paging by loading cr0 register
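All the mappings above rely on x86 two-level paging, where a 32-bit virtual address splits into a 10-bit page directory index, a 10-bit page table index and a 12-bit offset. A small sketch of that split (the helper names are illustrative, not taken from the source):

```c
#include <assert.h>
#include <stdint.h>

/* 10-bit directory index, 10-bit table index, 12-bit page offset. */
static uint32_t pdx(uint32_t va)   { return (va >> 22) & 0x3FFu; }
static uint32_t ptx(uint32_t va)   { return (va >> 12) & 0x3FFu; }
static uint32_t pgoff(uint32_t va) { return va & 0xFFFu; }

/* Rebuild a virtual address from its three parts. */
static uint32_t pgaddr(uint32_t d, uint32_t t, uint32_t o)
{
    assert(d < 1024 && t < 1024 && o < 4096);
    return (d << 22) | (t << 12) | o;
}
```

For the kernel link address 0xF0100000 this gives directory index 960, table index 256 and offset 0.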
file: kernel/pmap.c
function: page_alloc
Allocates a physical page
function: page_free
Return a page to the free list
function: page_lookup
Return the page mapped at specific virtual address
function: page_insert
Map the physical page at a specific virtual address; handles the following cases:
1. map new page
2. remap same page for permission modification
3. remap va, and remove previous page
function: page_remove
Unmap the physical page at specific virtual address
1. look up physical page by va
2. decrease page ref count
3. invalidate TLB entry
function: user_mem_check
Check that an environment is allowed to access the range of memory with specific permission
function: user_mem_phy_addr
translate a user virtual address into a physical address
function: mmio_map_region
Reserve size bytes in the MMIO region and map [pa,pa+size) at this location with PTE_PCD & PTE_PWT bit (cache-disable and write-through)
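page_alloc and page_free work on a free list, and page_remove frees a page once its reference count drops to zero. A minimal sketch of that bookkeeping, assuming a singly linked free list threaded through a small page array; the struct fields, the pool size and page_decref are assumptions of this example rather than the kernel's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES 4   /* tiny pool for illustration */

struct PageInfo {
    struct PageInfo *pp_link;  /* next page on the free list */
    unsigned pp_ref;           /* number of mappings to this page */
};

static struct PageInfo pages[NPAGES];
static struct PageInfo *page_free_list;

/* Put every page on the free list. */
static void page_init(void)
{
    page_free_list = NULL;
    for (size_t i = 0; i < NPAGES; i++) {
        pages[i].pp_ref = 0;
        pages[i].pp_link = page_free_list;
        page_free_list = &pages[i];
    }
}

/* Pop the head of the free list, or return NULL when out of memory. */
static struct PageInfo *page_alloc(void)
{
    struct PageInfo *pp = page_free_list;

    if (pp == NULL)
        return NULL;
    page_free_list = pp->pp_link;
    pp->pp_link = NULL;
    return pp;
}

/* Return an unreferenced page to the free list. */
static void page_free(struct PageInfo *pp)
{
    assert(pp->pp_ref == 0 && pp->pp_link == NULL);
    pp->pp_link = page_free_list;
    page_free_list = pp;
}

/* Drop one reference; free the page when the count reaches zero. */
static void page_decref(struct PageInfo *pp)
{
    assert(pp->pp_ref > 0);
    if (--pp->pp_ref == 0)
        page_free(pp);
}
```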
file: lib/malloc.c
based on sbrk()
function: malloc
adopt first-fit / COW
1. find suitable block for allocate
2. if found: split into smaller block
3. if can't: extend the heap
function: free
free specific area
1. verify that the address is valid
2. get the memory block
3. merge with the previous & next blocks when they are free
4. if no block follows, shrink the heap
function: calloc
allocate area and the memory is set to zero
function: realloc
change the size of the memory block pointed to by ptr to size bytes
1. if ptr is null, allocate size
2. if original size greater than size, split the block
3. if original size less than size:
a. try to merge with the next block
b. or allocate new block & copy & free original one
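The first-fit search, the split on allocation and the merge on free can be sketched without real memory by tracking blocks as (offset, size) records over a fixed-size heap. This is an illustration under stated assumptions, not lib/malloc.c: the sbrk()-based growing and shrinking is left out, and ff_alloc, ff_free and the constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_SIZE  256
#define MAX_BLOCKS 16

struct block { size_t off, size; int used; };

/* Blocks are kept sorted by offset; initially one free block spans the heap. */
static struct block blocks[MAX_BLOCKS] = { { 0, HEAP_SIZE, 0 } };
static size_t nblocks = 1;

/* First fit: take the first free block that is large enough, split the rest off. */
static long ff_alloc(size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < nblocks; i++) {
        if (blocks[i].used || blocks[i].size < n)
            continue;
        if (blocks[i].size > n && nblocks < MAX_BLOCKS) {
            /* shift later records right and insert the free remainder */
            for (size_t j = nblocks; j > i + 1; j--)
                blocks[j] = blocks[j - 1];
            blocks[i + 1] = (struct block){ blocks[i].off + n,
                                            blocks[i].size - n, 0 };
            blocks[i].size = n;
            nblocks++;
        }
        blocks[i].used = 1;
        return (long)blocks[i].off;
    }
    return -1;   /* the real allocator would extend the heap here */
}

/* Absorb record i + 1 into record i. */
static void merge_with_next(size_t i)
{
    blocks[i].size += blocks[i + 1].size;
    for (size_t j = i + 1; j + 1 < nblocks; j++)
        blocks[j] = blocks[j + 1];
    nblocks--;
}

/* Free a block and merge it with free neighbours. */
static void ff_free(long off)
{
    for (size_t i = 0; i < nblocks; i++) {
        if (blocks[i].off != (size_t)off || !blocks[i].used)
            continue;
        blocks[i].used = 0;
        if (i + 1 < nblocks && !blocks[i + 1].used)
            merge_with_next(i);
        if (i > 0 && !blocks[i - 1].used)
            merge_with_next(i - 1);
        return;
    }
    assert(!"invalid address passed to ff_free");
}
```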
file: kernel/vm.c
create virtual memory area
function: add_vma
append a virtual memory area in env
function: copy_vma
copy a virtual memory area from parent env
file: kernel/env.c
function: env_init
Initialize all of the Env structures in the envs array and add them to the env_free_list
function: env_init_percpu
Configure the global segmentation hardware with separate segments for privilege level 0 (kernel) and privilege level 3 (user). Clear the local segmentation hardware.
function: env_setup_vm
Initialize the kernel virtual memory layout for environment e
1. allocate one page as env_pgdir
2. copy kern_pgdir as env_pgdir
3. map UVPT in env_pgdir
function: region_alloc
Allocate len bytes of physical memory for environment env and map it at va
function: load_icode
Parse an ELF binary image and load its contents into the user address space
1. switch env_pgdir
2. load ELF image
3. set tf_eip
4. initial stack by mapping one page
5. switch back to kern_pgdir
function: env_alloc
Allocates and initializes a new environment
- call env_setup_vm
- initialize Trapframe register
function: env_create
Allocate an environment with env_alloc and call load_icode to load an ELF binary into it
function: env_run
Start a given environment running in user mode
- set current env as suspending
- set target env as running
- switch env_pgdir
- unlock kernel before returning to user
- call env_pop_tf
function: env_pop_tf
Restores the register values in the Trapframe with the 'iret' instruction
function: env_destroy
Free specific environment
- set env status as ENV_DYING when running on other core
- in next schedule, call env_free
- schedule
function: env_free
Free the env and all memory it uses
- switch page dir
- remove pages and page tables of user land
- free the page dir
- set env free and add it into free list
file: kernel/sched.c
function: sched_yield
Choose a user environment to run and run it
- select a suspending env by specific strategy and run it
a. Round-Robin strategy
b. Least-Run-Time strategy
- if there is no other runnable task, run current task
- if no task, then halt this CPU (sched_halt)
function: sched_halt
Halt this CPU when there is nothing to do
- set current env as NULL
- switch to kern_pgdir
- set CPU status as halted
- unlock kernel
- clean stack & restore interrupt & halt this CPU (until next interrupt comes)
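The Round-Robin strategy in sched_yield amounts to a circular scan that starts just after the current env. A minimal sketch of the selection step, assuming a plain status array; rr_pick is a hypothetical helper, and ENV_FREE and ENV_RUNNING are assumed status names alongside the ENV_RUNNABLE and ENV_NOT_RUNNABLE used elsewhere in these notes.

```c
#include <assert.h>
#include <stddef.h>

enum env_status { ENV_FREE, ENV_RUNNABLE, ENV_RUNNING, ENV_NOT_RUNNABLE };

/*
 * Scan circularly, starting just after `cur`, for an ENV_RUNNABLE env.
 * Falls back to `cur` if it is still ENV_RUNNING.
 * Returns -1 when nothing can run and the CPU should halt.
 * Pass cur == -1 when the CPU has no current env.
 */
static int rr_pick(const enum env_status *st, int n, int cur)
{
    assert(n > 0 && cur >= -1 && cur < n);
    for (int k = 1; k <= n; k++) {
        int i = (cur + k) % n;
        if (st[i] == ENV_RUNNABLE)
            return i;
    }
    if (cur >= 0 && st[cur] == ENV_RUNNING)
        return cur;
    return -1;
}
```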
start va: 0x800020
file: user/user.ld
specify the entry point of user programs
ENTRY(_start)
SECTIONS {
. = 0x800020;
}
file: lib/entry.S function: _start
enter C program
- detect stack, if no arguments, then push 0
- call libmain
file: lib/libmain.c function: libmain
- set page fault handler (handle user stack page)
- call 'main' entry of user program
- exit by self
file: lib/fork.c
function: fork
User-level fork with copy-on-write.
- generate new blank env (sys_exofork)
- duplicate pages in user space (duppage)
- set page fault handler for child
- set env as ENV_RUNNABLE
function: sfork
User-level fork with shared-memory.
- generate new blank env (sys_exofork)
- share pages with child
- duplicate pages in user stack area (duppage)
- set page fault handler for child
- set env as ENV_RUNNABLE
function: duppage
duplicate page in copy-on-write strategy
- share PTE_SHARE pages directly
- adopt copy-on-write strategy on pages set with PTE_W or PTE_COW
- share read-only pages directly
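The three duppage rules reduce to a function from the parent's page permissions to the permissions used for both mappings afterwards. A sketch under assumptions: PTE_P, PTE_W and PTE_U are the x86 bits, while the values chosen here for PTE_SHARE and PTE_COW (software-available PTE bits) and the helper name duppage_perm are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_P     0x001u  /* present (x86) */
#define PTE_W     0x002u  /* writable (x86) */
#define PTE_U     0x004u  /* user accessible (x86) */
#define PTE_SHARE 0x400u  /* assumed value: software-available bit */
#define PTE_COW   0x800u  /* assumed value: software-available bit */

/* Permission to install in parent and child for a page mapped with `perm`. */
static uint32_t duppage_perm(uint32_t perm)
{
    assert(perm & PTE_P);
    if (perm & PTE_SHARE)
        return perm;                       /* shared as-is, still writable */
    if (perm & (PTE_W | PTE_COW))
        return (perm & ~PTE_W) | PTE_COW;  /* write-protect, mark copy-on-write */
    return perm;                           /* read-only pages are simply shared */
}
```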
file: lib/spawn.c function: spawnl
taking command-line arguments array directly on the stack
function: spawn
Spawn a child process from a program image loaded from the file system
- open user program image in read-only way
- create child env
- initialize stack (init_stack)
- read ELF program segments and map them (map_segment)
- copy shared pages
- set child eip & esp
- set child ENV_RUNNABLE
function: init_stack
Set up the initial stack page for the new child process
- calculate total size of arguments
- copy argv to temp page
- map page(and COW zero page) to child's stack
file: kernel/trap.c
function: trap_init
set trap, soft interrupt, interrupt handler, and initialize TSS for each CPU, load tss selector & idt
- set handler in idt
- load tss & idt in each CPU
file: kernel/trapentry.S
function: TRAPHANDLER_NOEC TRAPHANDLER
It pushes a trap number onto the stack, then jumps to alltraps
save the user running state
cross rings case:
+--------------------+ KSTACKTOP
| 0x00000 | old SS | " - 4
| old ESP | " - 8
| old EFLAGS | " - 12
| 0x00000 | old CS | " - 16
| old EIP | " - 20
| error code | " - 24
+--------------------+ <---- ESP
| trap num | " - 28
+--------------------+
non-cross rings case:
+--------------------+ <---- old ESP
| old EFLAGS | " - 4
| 0x00000 | old CS | " - 8
| old EIP | " - 12
+--------------------+
function: alltraps
generate struct Trapframe and call trap with argument Trapframe
+--------------------+ KSTACKTOP
| 0x00000 | old SS | " - 4
| old ESP | " - 8
| old EFLAGS | " - 12
| 0x00000 | old CS | " - 16
| old EIP | " - 20
| error code | " - 24
+--------------------+
| trap num | " - 28
+--------------------+
| old DS |
| old ES |
| general registers |
+--------------------+ <---- ESP
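Read from the final ESP upwards, the diagram above is the memory layout of the Trapframe passed to trap. The struct below is a sketch that mirrors that order; the field names follow a JOS-style convention and are assumptions, not copied from the source. check_layout only spells out the offsets the diagram implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct PushRegs {                 /* lowest address first, as left by 'pushal' */
    uint32_t reg_edi, reg_esi, reg_ebp, reg_oesp;
    uint32_t reg_ebx, reg_edx, reg_ecx, reg_eax;
};

struct Trapframe {
    struct PushRegs tf_regs;      /* general registers */
    uint16_t tf_es, tf_padding1;  /* old ES */
    uint16_t tf_ds, tf_padding2;  /* old DS */
    uint32_t tf_trapno;           /* pushed by the TRAPHANDLER stub */
    uint32_t tf_err;              /* error code (or 0 from the NOEC stub) */
    uint32_t tf_eip;              /* pushed by hardware from here on */
    uint16_t tf_cs, tf_padding3;
    uint32_t tf_eflags;
    uint32_t tf_esp;              /* only present when crossing rings */
    uint16_t tf_ss, tf_padding4;
};

/* The trap number sits right above the registers and the two segment words. */
static void check_layout(void)
{
    assert(offsetof(struct Trapframe, tf_trapno) == sizeof(struct PushRegs) + 8);
    assert(sizeof(struct Trapframe) == 17 * sizeof(uint32_t));
}
```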
file: kernel/trap.c
function: trap
handle the exception/interrupt
- lock kernel if this CPU is halted before
- lock kernel if this task comes from user land
- dispatch based on the type of trap (trap_dispatch)
- schedule
function: trap_dispatch
dispatch based on trap num
- page fault
- breakpoint & debug
- system call
- timer
- spurious
- keyboard
- serial port
file: kernel/trap.c function: page_fault_handler
handle a page fault in the kernel
- get the faulting va
- check that the fault came from user mode
- prepare User Trapframe in stack
- set eip as _pgfault_upcall and return to user (env_run)
file: lib/pfentry.S function: _pgfault_upcall
call user handler and restore original field
- call _pgfault_handler (means pgfault)
- simulate 'iret' and return to trap point
already in exception stack case:
+-----------------------+
| Exception Stack n |
+-----------------------+ <--- esp
| trap-time eip |
+-----------------------+
| Exception Stack (n+1) |
+-----------------------+
first time trap into exception stack case:
+-------------------+            +-------------------+
|   Regular Stack   |            | Exception Stack 1 |
+-------------------+ <--- esp   +-------------------+
|   trap-time eip   |
+-------------------+
file: lib/fork.c function: pgfault
handle a page fault in user space
- check the faulting access:
a. it is a write operation
b. the page is copy-on-write
- allocate new page
- copy original page to new one
- remap this page
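The pgfault steps can be sketched on a toy mapping: check that the fault is a write to a copy-on-write page, copy the contents into a fresh page, then remap it writable. FEC_WR is the write bit of the x86 page-fault error code. The PTE_COW value, struct mapping and the two helpers are assumptions of this example; the real handler does the same work through the page system calls.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PGSIZE  4096u
#define FEC_WR  0x2u     /* x86 page-fault error code: access was a write */
#define PTE_P   0x001u
#define PTE_W   0x002u
#define PTE_COW 0x800u   /* assumed value: software-available bit */

/* Only a write to a copy-on-write page is handled. */
static int is_cow_write(uint32_t err, uint32_t pte)
{
    return (err & FEC_WR) && (pte & PTE_COW);
}

/* Toy stand-in for a page table entry: backing page plus permissions. */
struct mapping { uint8_t *page; uint32_t perm; };

/* Copy into a private page and remap it writable without PTE_COW. */
static void cow_resolve(struct mapping *m, uint8_t *fresh)
{
    assert(m->perm & PTE_COW);
    memcpy(fresh, m->page, PGSIZE);
    m->page = fresh;
    m->perm = (m->perm & ~PTE_COW) | PTE_W;
}
```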
file: lib/pgfault.c function: set_pgfault_handler
set page fault handler entry
- allocate exception stack
- set _pgfault_upcall entry
- set page fault handler for user
file: lib/syscall.c function: syscall
generic system call in user, use 'int $T_SYSCALL'
file: kernel/trap.c function: trap_dispatch
parse trap num and call the relevant handler in kernel
file: kernel/syscall.c function: syscall
dispatches to the correct kernel function
- sys_cputs: print string
- sys_cgetc: read a character from the system console
- sys_getenvid: returns the current environment's envid
- sys_env_destroy: env_destroy
- sys_yield: schedule
- sys_exofork: fork
- sys_env_set_status: set the status of a specified environment (ENV_RUNNABLE or ENV_NOT_RUNNABLE)
- sys_env_set_trapframe: set env's eip & esp (enable interrupts, set IOPL as 0)
- sys_env_name: set env's binary name (for debug)
- sys_page_alloc: allocate a page of memory and map it at 'va' with permission
- sys_page_map: map the page of memory at 'src va' in src env's address space at 'dst va' in dst env's address space with permission 'perm'
- sys_page_unmap: unmap the page of memory at 'va' in the address space of 'env'
- sys_env_set_pgfault_upcall: set the page fault upcall
- sys_add_vma: add virtual memory area (by spawn)
- sys_copy_vma: copy VMAs of source env to destination env (by fork)
- sys_ipc_try_send: try to send 'value' to the target env 'envid' (IPC)
- sys_ipc_recv: block until a value is ready (IPC)
- sys_time_msec: get the current time, unit: millisecond (Time Tick)
- sys_debug_info: get info about CPU & memory
- sys_chdir: switch working dir of current env
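Dispatching on the system call number is commonly written as a table of function pointers (a switch statement works equally well). The sketch below shows only the shape of such a dispatcher: the numbering, the two-argument signature, the stub bodies and the E_INVAL value are all assumptions, not kernel/syscall.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define E_INVAL 3   /* assumed error code value */

enum { SYS_getenvid, SYS_time_msec, NSYSCALLS };   /* hypothetical numbering */

typedef int32_t (*syscall_fn)(uint32_t a1, uint32_t a2);

static uint32_t ticks_ms = 250;   /* stand-in for the kernel time counter */

static int32_t sys_getenvid(uint32_t a1, uint32_t a2)
{
    (void)a1; (void)a2;
    return 0x1000;                /* stub envid */
}

static int32_t sys_time_msec(uint32_t a1, uint32_t a2)
{
    (void)a1; (void)a2;
    return (int32_t)ticks_ms;
}

static const syscall_fn syscall_table[NSYSCALLS] = {
    [SYS_getenvid]  = sys_getenvid,
    [SYS_time_msec] = sys_time_msec,
};

/* Reject unknown numbers, otherwise call the matching handler. */
static int32_t syscall_dispatch(uint32_t no, uint32_t a1, uint32_t a2)
{
    if (no >= NSYSCALLS)
        return -E_INVAL;
    assert(syscall_table[no] != NULL);   /* every slot must be filled */
    return syscall_table[no](a1, a2);
}
```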
file: kernel/time.c function: time_init
clear the time counter
file: kernel/trap.c function: trap_dispatch
- when receives IRQ_TIMER:
- increase time tick (if this CPU is boot one)
- acknowledge interrupt
- schedule
file: kernel/mpconfig.c function: mp_init
get the local APIC base address from the BIOS (Basic Input Output System) or the EBDA (Extended BIOS Data Area)
file: kernel/lapic.c function: lapic_init
map LAPIC base address and initialize the local APIC hardware
file: kernel/init.c function: boot_aps
start the non-boot processors(AP)
- copy mpentry_start to AP entry point
- figure out per-core kernel stack
- send startup IPI(Inter-Processor Interrupts) to boot up (lapic_startap)
- wait until this AP is started
start pa: 0x7000
#define MPENTRY_PADDR 0x7000
code = KADDR(MPENTRY_PADDR);
lapic_startap(c->cpu_id, PADDR(code));
file: kernel/mpentry.S function: mpentry_start
- turn on protected mode
- turn on paging (load entry_pgdir)
- switch to the per-cpu stack
- call mp_main
file: kernel/init.c function: mp_main
- load kern_pgdir
- initialize local APIC
- load GDT and segment descriptors
- initialize and load the per-CPU TSS and IDT
- set CPU status
- lock kernel to make sure only one CPU can enter the scheduler at a time
- schedule
file: lib/thread.c
function: thread_init
initialize thread-queue
function: thread_create
- allocate thread context, thread id, stack
- set thread esp & eip(thread_entry), entry & arg
- push this thread into thread-queue
- return thread id
function: thread_entry
- call tc_entry with tc_arg
- exit by thread_halt
function: thread_halt
- pop one element from kill_queue, call hook(tc_onhalt, which set by thread_onhalt) and free resources
- push self into kill_queue
- thread yield
- if no other threads, then the whole env exit
function: thread_yield
- pop one element from thread-queue
- if no other thread, then return
- store jump point(toynix_setjmp), and push self into thread-queue
- load thread context and yield(toynix_longjmp)
function: toynix_setjmp
store current context and return 0
function: toynix_longjmp
load context and return arg2
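The contract described for toynix_setjmp and toynix_longjmp (store returns 0, load makes the same call return the second argument) matches the standard C setjmp/longjmp pair. The demo below uses the standard library as a stand-in, not the toynix versions, and records what each return from setjmp yields; run_demo and record are illustrative names.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf saved;   /* the stored context */
static int trace[4];    /* value seen at each return from setjmp */
static int ntrace;

static void record(int v)
{
    assert(ntrace < 4);
    trace[ntrace++] = v;
}

/* Returns how many times setjmp returned. */
static int run_demo(void)
{
    ntrace = 0;
    switch (setjmp(saved)) {
    case 0:                 /* direct call: context stored, result is 0 */
        record(0);
        longjmp(saved, 5);  /* load context: setjmp returns again, now with 5 */
    case 5:
        record(5);
        return ntrace;
    default:
        return -1;
    }
}
```

thread_yield uses the same pattern: the first return stores the jump point, and a later load resumes the thread right after that call.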
function: thread_wait
- sleep until either:
a. the wait condition is no longer met
b. the thread has already been woken up
- clear the wakeup bit
function: thread_wakeup
- find all target threads
- set wakeup bit
file: kernel/spinlock.c
function: spin_lock
acquire the lock
- check whether the lock is already held by this CPU
- exchange lock status
- record info
function: spin_unlock
release the lock
- check whether this CPU holds the lock; if not, print debug info
- exchange lock status
file: include/x86.h function: xchg
'xchg' instruction is atomic and x86 CPUs will not reorder loads/stores across 'lock' instructions
asm volatile("lock; xchgl %0, %1"
: "+m" (*addr), "=a" (result)
: "1" (newval)
: "cc");
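The same exchange-based locking can be expressed portably with C11 atomics, which is convenient for experimenting outside the kernel. This is a sketch, not kernel/spinlock.c: atomic_exchange stands in for the inline xchg, and the struct layout and spin_trylock are assumptions. The owner tracking and debug info mentioned above are omitted.

```c
#include <assert.h>
#include <stdatomic.h>

struct spinlock { atomic_uint locked; };   /* 0 = free, 1 = held */

/* One atomic exchange: nonzero result means we took the lock. */
static int spin_trylock(struct spinlock *lk)
{
    return atomic_exchange(&lk->locked, 1u) == 0;
}

/* Keep exchanging until the previous value was 0. */
static void spin_lock(struct spinlock *lk)
{
    while (!spin_trylock(lk))
        ;
}

/* Exchange back to 0; releasing a lock nobody holds is a bug. */
static void spin_unlock(struct spinlock *lk)
{
    unsigned prev = atomic_exchange(&lk->locked, 0u);

    assert(prev == 1);
    (void)prev;
}
```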
file: lib/ipc.c
User space
function: ipc_recv
receive a value via IPC and return it, call sys_ipc_recv
function: ipc_send
Send 'val' to 'env', call sys_ipc_try_send
file: kernel/syscall.c
Kernel space
function: sys_ipc_recv
block the calling env until data is ready
function: sys_ipc_try_send
try to send 'value' to the target env
- find target env by env id
- pass data page
- make the target env runnable again
file: lib/itc.c
function: sys_init
initialize sems & mboxes, and insert into free list
function: sys_sem_new
- get free sem from list
- set counter number
- return sem id
function: sys_sem_free
insert back to free list
function: sys_arch_sem_wait
- if counter > 0, decrement the counter and return
- if counter = 0, wait until either:
a. the counter changes (thread_wait)
b. the time-out expires
function: sys_sem_signal
- increment the counter
- if any threads are waiting, wake them up (thread_wakeup)
function: sys_mbox_new
- get a free mailbox
- allocate semaphores for queued message and free message
function: sys_mbox_free
- free semaphores
- insert mailbox back to free list
function: sys_mbox_post
- wait sem of free msg
- put msg into mailbox slot
- update nextq of mailbox
- post sem of queued msg
function: sys_arch_mbox_fetch
- wait sem of queued msg
- get msg from mailbox slot
- update head of mailbox
- post sem of free msg
file: kernel/init.c function: ENV_CREATE
create file system task, and this env can access IO ports
file: fs/serv.c function: umain
- initialize opentab (serve_init)
- set block cache handler, initialize super block & bitmap (fs_init)
- trigger fs server (serve)
function: serve
fs server
- wait fs request
- dispatch proper handler
- respond to the request
file: fs/block_cache.c
function: bc_pgfault
Fault any disk block that is read in to memory by loading it from disk
- find block num by page fault va
- allocate a page
- read ide device 4 sectors as 1 block(page)
- clear dirty bit of page by remapping, because we just read from disk and write to memory
- check the block was allocated in bitmap
function: flush_block
Flush the contents of the block containing VA out to disk
- if the block is not cached or is not dirty, then skip
- write back to disk
- remap this block for cleaning dirty bit
file: fs/fs.c
function: alloc_block
Search the bitmap for a free block and allocate it
- search free block in bitmap
- get it and set it occupied
- update bitmap
function: free_block
Mark a block free in the bitmap
- reset free bit
- update bitmap
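A free-block bitmap like the one used by alloc_block and free_block can be sketched in a few lines. Assumptions of this example: a set bit means "free", block 0 is never handed out, the bitmap lives in a plain array, and the step of writing the changed bitmap back to disk is omitted.

```c
#include <assert.h>
#include <stdint.h>

#define NBLOCKS 64u

/* One bit per block; a set bit means the block is free (assumed polarity). */
static uint32_t bitmap[NBLOCKS / 32];

static int block_is_free(uint32_t b)
{
    assert(b < NBLOCKS);
    return (bitmap[b / 32] >> (b % 32)) & 1u;
}

/* Mark block b free. */
static void free_block(uint32_t b)
{
    assert(b != 0 && b < NBLOCKS);   /* block 0 is never handed out here */
    bitmap[b / 32] |= 1u << (b % 32);
}

/* Find the first free block, mark it in use, return its number or -1. */
static int alloc_block(void)
{
    for (uint32_t b = 1; b < NBLOCKS; b++) {
        if (block_is_free(b)) {
            bitmap[b / 32] &= ~(1u << (b % 32));
            return (int)b;
        }
    }
    return -1;
}
```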
function: file_get_block
Set *blk to the address in memory where the filebno'th block of file 'f' would be mapped
- find the block num
- if not allocated yet, then allocate one and map
- return va of this block
function: file_free_block
Remove a block from file
function: file_block_walk
Find the disk block number slot for the 'filebno'th block in file 'f'
- find block from direct array
- find block from indirect array
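file_block_walk first looks in the direct array and then in the indirect block. A sketch of that lookup follows; NDIRECT, NINDIRECT, the struct fields and block_slot are assumptions chosen for illustration, and the indirect block is passed in as an in-memory array instead of being read through the block cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NDIRECT   10u     /* assumed number of direct slots */
#define NINDIRECT 1024u   /* assumed: one 4096-byte block of 4-byte entries */

struct File {
    uint32_t f_direct[NDIRECT];  /* block numbers of the first blocks */
    uint32_t f_indirect;         /* block number of the indirect block, 0 = none */
};

/*
 * Return the slot holding the disk block number of the filebno'th block,
 * or NULL if filebno is out of range or the indirect block is missing.
 * `indirect_blk` is the in-memory content of the indirect block.
 */
static uint32_t *block_slot(struct File *f, uint32_t *indirect_blk,
                            uint32_t filebno)
{
    if (filebno < NDIRECT)
        return &f->f_direct[filebno];
    if (filebno < NDIRECT + NINDIRECT) {
        if (f->f_indirect == 0)
            return NULL;
        assert(indirect_blk != NULL);
        return &indirect_blk[filebno - NDIRECT];
    }
    return NULL;
}
```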
       Regular env            FS env
    +---------------+   +---------------+
    |      read     |   |   file_read   |
    |   (lib/fd.c)  |   |   (fs/fs.c)   |
....|.......|.......|...|.......^.......|.........................
    |       v       |   |       |       | RPC mechanism
    |  devfile_read |   |  serve_read   | (Remote Procedure Call)
    |  (lib/file.c) |   |  (fs/serv.c)  |
    |       |       |   |       ^       |
    |       v       |   |       |       |
    |     fsipc     |   |     serve     | User Space
    |  (lib/file.c) |   |  (fs/serv.c)  |
    |       |       |   |       ^       |
    |       v       |   |       |       |
    |    ipc_send   |   |   ipc_recv    |
    |       |       |   |       ^       |
    +-------|-------+   +-------|-------+
            |   Kernel Space    |
            +-------------------+
              +----------------------+
              |   struct OpenFile    |
              |  (max 1024 at once)  |
              +----------------------+
                         |
                         |
         +---------------+-------------------+
         |                                   |
         |                                   |
         V                                   V
+-----------------+     +-----------------------------------------+
|   struct File   |     |       struct Fd (file descriptor)       |
|  file meta-data |     | associate with process (max 32 per env) |
+-----------------+     +-----------------------------------------+
file: lib/file.c function: fsipc
Send an inter-environment request to the file server, and wait for a reply
file: lib/file.c function: open
- allocate struct fd
- send fd to fs server (fsipc)
- return fd num
file: lib/fd.c function: read write
- find fd by fd num
- find device by device-id of fd
- call read/write function of device(devfile_read/devfile_write)
file: lib/fd.c function: close
- find fd by fd num
- call close function of device(devfile_flush)
- free fd
file: fs/serv.c function: serve_open
- find a free open file, and return file-id(openfile_alloc)
- if the file does not exist, create it
- open struct File from disk (call file_open)
- initialize struct Fd
- send back struct Fd
function: serve_read serve_write
- find struct open file by file-id
- call file_read/file_write
- update offset
function: serve_flush
- find struct open file by file-id
- flush file (call flush_block)
function: file_read/file_write
- get specific block
- copy data from/to block cache by fs
file: lib/pipe.c
function: pipe
open pipe read/write sides
- allocate struct fd0 & fd1
- map pipe data (twice)
- return fd0 & fd1 num
function: devpipe_read
- if pipe is empty (read pos == write pos):
a. if no writers, then return
b. or yield
- copy from pipe buffer
- update read pos
function: devpipe_write
- if pipe is full (write pos >= read pos + sizeof(pipe buf)):
a. if no readers, then return
b. or yield
- copy to pipe buffer
- update write pos
function: devpipe_close
- free fd
- free pipe data
function: pipeisclosed
- make sure detecting in the same timer interrupt
- if pageref(fd) equals pageref(pipe data), then the writer/reader has already closed
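The empty and full conditions above come from keeping two ever-increasing positions and indexing the buffer modulo its size. A non-blocking sketch of that ring buffer follows; where the real devpipe_read and devpipe_write would yield, these helpers stop and return the byte count. The buffer size and all names are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIPEBUFSIZ 32u   /* assumed buffer size */

struct Pipe {
    uint32_t p_rpos;             /* total bytes read so far */
    uint32_t p_wpos;             /* total bytes written so far */
    uint8_t  p_buf[PIPEBUFSIZ];
};

static int pipe_empty(const struct Pipe *p)
{
    return p->p_rpos == p->p_wpos;
}

static int pipe_full(const struct Pipe *p)
{
    return p->p_wpos >= p->p_rpos + PIPEBUFSIZ;
}

/* Copy in until the buffer is full; returns the number of bytes written. */
static size_t pipe_write(struct Pipe *p, const uint8_t *src, size_t n)
{
    size_t i = 0;

    while (i < n && !pipe_full(p))
        p->p_buf[p->p_wpos++ % PIPEBUFSIZ] = src[i++];
    return i;
}

/* Copy out until the buffer is empty; returns the number of bytes read. */
static size_t pipe_read(struct Pipe *p, uint8_t *dst, size_t n)
{
    size_t i = 0;

    while (i < n && !pipe_empty(p))
        dst[i++] = p->p_buf[p->p_rpos++ % PIPEBUFSIZ];
    assert(p->p_rpos <= p->p_wpos);   /* never read past the writer */
    return i;
}
```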
file: lib/console.c
function: opencons
allocate fd, and return fd num
function: devcons_read
call sys_cgetc, get only one character from console
function: devcons_write
call sys_cputs, put a string to console
function: devcons_close
free fd
file: user/initsh.c
- make sure close fd0
- open console as 0
- duplicate 0 to 1 (standard input/output)
- spawn sh
file: user/sh.c
- read command line
- if the command is 'cd', change the working directory
- fork child
- child run command
- parent wait child done
function: runcmd
- parse shell command
- 'w': argument
- '<': input redirection, open file and dup to 0
- '>': output redirection, open file and dup to 1
- '|': pipe, fork child, transfer parent output to child input
- '&': run background, no need to wait this child process done
- ';': separate command, need to wait this child process done
- spawn child
- close all file descriptors
- wait child
- exit
- core network server
- input env
- output env
- timer env
file: lib/nsipc.c function: nsipc
send an IP request to the network server, and wait for a reply
file: kernel/pci.c function: pci_init
function: pci_scan_bus
read configuration and irq line
function: pci_attach pci_attach_match
when the device class & vendor info matches, call attachfn to initialize the device
file: kernel/e1000.c function: pci_e1000_attach
initialize e1000 net card
function: pci_func_enable
allocate resource to this PCI device
file: kernel/e1000.c function: e1000_put_tx_desc
wait until Descriptor Done, then configure DMA registers, fill descriptor and update e1000_tdt pointer
function: e1000_get_rx_desc
wait until Descriptor Done, then copy data, clean status and update e1000_rdt pointer
file: lib/syscall.c function: sys_tx_pkt
- use user_mem_assert to check whether the buffer comes from user space
- if it is the last descriptor, then set the E1000_TXD_CMD_EOP flag
- call e1000_put_tx_desc, if busy then yield
function: sys_rx_pkt
- use user_mem_assert to check whether the buffer comes from user space
- call e1000_get_rx_desc
file: net/output.c
- ipc receive from network server
- call sys_tx_pkt
file: net/input.c
When you IPC a page to the network server, it will be reading from it for a while, so don't immediately receive another packet into the same physical page
- allocate nsipcbuf
- call sys_rx_pkt
- ipc send to network server
file: net/serv.c
- setup timer env (as thread's time-scheduler)
- setup input env
- setup output env
- setup thread to run tmain
function: tmain
- TCP/IP initialize
- call serve
function: serve
- wake up all threads
- receive requests
- if it's timer request, then schedule other threads and reset timer
- create thread to deal with other type of request (serve_thread)
- thread yield
function: serve_thread
- if it's request from ns(net-server) client (such as accept, bind, shutdown, close):
- transmit to lwip stack
- send back return-value by ipc_send
- if comes from input env, call jif_input
file: user/httpd.c
- Receive message from client
- Parse url & version of request
- open url
- if the file does not exist or is a directory, send back 404
- send header
- send file size
- send content type (e.g. text/html)
- send header fin
- send data of file
Request:
GET www.url.com/index.html V1.0
Response:
HTTP/1.0 200 OK
Server: jhttpd/0.1
Content-Length: 123
Content-Type: text/html
"Content of File"