1 include/asm-* Interface

Architectures can implement these interfaces in any way they want (macros, inline functions, inline asm), but preferably fast. :-)

These include files also contain architecture-specific type definitions and system call parameters - see section [here].

1.1 <asm/atomic.h>

Atomic operations that C can't guarantee us. Useful for resource counting etc.

typedef int atomic_t;

static __inline__ void atomic_add(atomic_t i, atomic_t *v);
static __inline__ void atomic_sub(atomic_t i, atomic_t *v);
static __inline__ void atomic_inc(atomic_t *v);
static __inline__ void atomic_dec(atomic_t *v);
static __inline__ int atomic_dec_and_test(atomic_t *v);
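
As an illustration of the resource-counting use case, the following user-space sketch marks an object as released when its last reference is dropped. It is not the kernel implementation: the GCC/Clang __atomic builtins stand in for the architecture's inline asm, and struct buf, buf_get() and buf_put() are hypothetical names introduced here.

```c
/* User-space sketch; __atomic builtins stand in for the arch's inline asm. */
typedef int atomic_t;

static inline void atomic_inc(atomic_t *v)
{
    __atomic_add_fetch(v, 1, __ATOMIC_SEQ_CST);
}

/* Returns nonzero if the counter reached zero. */
static inline int atomic_dec_and_test(atomic_t *v)
{
    return __atomic_sub_fetch(v, 1, __ATOMIC_SEQ_CST) == 0;
}

/* Hypothetical reference-counted object. */
struct buf {
    atomic_t refs;
    int released;      /* set once the last reference is dropped */
};

static void buf_get(struct buf *b)
{
    atomic_inc(&b->refs);
}

static void buf_put(struct buf *b)
{
    if (atomic_dec_and_test(&b->refs))
        b->released = 1;   /* a real user would free the object here */
}
```

Only the caller that drops the count to zero sees a nonzero result from atomic_dec_and_test(), so the release path runs exactly once.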

1.2 <asm/bitops.h>


int set_bit(int nr, void * addr);
int clear_bit(int nr, void * addr);
int change_bit(int nr, void * addr);

Atomic bit operations (return old bit value).

int test_bit(int nr, void * addr);
int find_first_zero_bit(void * addr, unsigned size);
int find_next_zero_bit (void * addr, int size, int offset);

More bit operations (don't need to be atomic).

unsigned long ffz(unsigned long word);

Find First Zero in word. Undefined if no zero exists, so code should check against ~0UL first.
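
The check against ~0UL can be sketched as follows. This is a portable loop, not the instruction sequence an architecture would actually use, and alloc_slot() is a hypothetical helper that treats one word as a bitmap of free slots.

```c
/* Portable sketch of ffz(): index of the least significant zero bit.
 * The result is undefined for word == ~0UL, so callers check first. */
static unsigned long ffz(unsigned long word)
{
    unsigned long bit = 0;

    while (word & 1UL) {
        word >>= 1;
        bit++;
    }
    return bit;
}

/* Hypothetical helper: allocate the first free slot in a one-word bitmap.
 * Returns the slot number, or -1 if all slots are taken. */
static int alloc_slot(unsigned long *map)
{
    unsigned long slot;

    if (*map == ~0UL)          /* no zero bit: ffz() would be undefined */
        return -1;
    slot = ffz(*map);
    *map |= 1UL << slot;
    return (int)slot;
}
```

Note that alloc_slot() is not atomic; it only shows where the ~0UL check belongs.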

1.3 <asm/byteorder.h>

Device driver support:

#define __LITTLE_ENDIAN 1234
#define __BIG_ENDIAN 4321

This include file needs to specify exactly one of the above two definitions...


...and exactly one of these two definitions.

#define ntohl(x) ...
#define ntohs(x) ...
#define htonl(x) ...
#define htons(x) ...

Convert longs and shorts from host to network byte order, and vice versa.
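
On a little-endian host the conversions amount to a byte swap; on a big-endian host they are the identity. The sketch below shows this in portable C. The sketch_ prefix, the swab helpers and the run-time endianness probe are assumptions made for illustration; a real header would pick one variant at compile time and would typically use inline asm.

```c
#include <stdint.h>

/* Swap the four bytes of a 32-bit value. */
static uint32_t swab32(uint32_t x)
{
    return  (x >> 24) |
           ((x >>  8) & 0x0000ff00u) |
           ((x <<  8) & 0x00ff0000u) |
            (x << 24);
}

/* Swap the two bytes of a 16-bit value. */
static uint16_t swab16(uint16_t x)
{
    return (uint16_t)((x >> 8) | (x << 8));
}

/* Run-time check used only by this sketch; a real header would
 * select the variant at compile time from its endianness define. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

static uint32_t sketch_htonl(uint32_t x)
{
    return host_is_little_endian() ? swab32(x) : x;
}

static uint16_t sketch_htons(uint16_t x)
{
    return host_is_little_endian() ? swab16(x) : x;
}
```

Since a byte swap is its own inverse, ntohl() and ntohs() can be defined as the same operations.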

1.4 <asm/bugs.h>

This is included by init/main.c to check for architecture-dependent bugs.


static void check_bugs(void);

Called at kernel initialization; should check for processor bugs and set global variables associated with them.

1.5 <asm/checksum.h>

Compute several TCP/IP checksums. For Linux/L4, can probably be copied from <asm-i386/checksum.h>.

1.6 <asm/delay.h>

extern __inline__ void __delay(int loops);

Busy-loop for the given number of iterations (loops).

extern __inline__ void udelay(unsigned long usecs);

Loop for usecs microseconds using the global variable unsigned long loops_per_sec (calibrated at kernel initialization).

extern __inline__ unsigned long muldiv(unsigned long a, 
        unsigned long b, unsigned long c);

Compute a*b/c.
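
The point of a dedicated routine is presumably that the product a*b may not fit into a machine word even when the final quotient does. A minimal sketch, assuming a double-width intermediate is acceptable: uint32_t stands in for the i386's 32-bit unsigned long, and usecs_to_loops() is a hypothetical helper showing how udelay() might derive its loop count.

```c
#include <stdint.h>

/* Sketch: compute a*b/c without losing the high bits of a*b.
 * An architecture might use a widening multiply instruction instead. */
static uint32_t muldiv(uint32_t a, uint32_t b, uint32_t c)
{
    return (uint32_t)((uint64_t)a * b / c);
}

/* Hypothetical helper: number of delay-loop iterations for usecs. */
static uint32_t usecs_to_loops(uint32_t usecs, uint32_t loops_per_sec)
{
    return muldiv(usecs, loops_per_sec, 1000000u);
}
```

With a = 3000000000, b = 4 and c = 6, the product overflows 32 bits, but muldiv() still yields 2000000000.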

1.7 <asm/dma.h>

DMA controller manipulation via ports. For Linux/L4, can probably be copied from <asm-i386/dma.h>.

1.8 <asm/floppy.h>

i386-specific floppy device driver support. For Linux/L4, can probably be copied from <asm-i386/floppy.h>.

1.9 <asm/io.h>

Port manipulation (inb(), outb() and the like). For Linux/L4, can probably be copied from <asm-i386/io.h>.

1.10 <asm/irq.h>


void disable_irq(unsigned int);
void enable_irq(unsigned int);

Mask/unmask the specified IRQ.

1.11 <asm/mmu_context.h>

inline void get_mmu_context(struct task_struct *p);

Get a new MMU context (ASN, address space number). (The x86's don't know about contexts.)

1.12 <asm/page.h>

#define PAGE_SHIFT      12
#define PAGE_SIZE       (1UL << PAGE_SHIFT)
#define PAGE_MASK       (~(PAGE_SIZE-1))
typedef unsigned long pte_t;
typedef unsigned long pmd_t;
typedef unsigned long pgd_t;
typedef unsigned long pgprot_t;

Typedefs for hardware data structures: page table entry, page mid-level directory entry, page directory entry, page protection flags.

#define pte_val(x)      (x)
#define pmd_val(x)      (x)
#define pgd_val(x)      (x)
#define pgprot_val(x)   (x)

Access functions for the above types. (Going through these makes it possible to implement type checks.)

#define __pte(x)        (x)
#define __pgd(x)        (x)
#define __pgprot(x)     (x)

These macros cast a value to a given type.

#define PAGE_ALIGN(addr)        (((addr)+PAGE_SIZE-1)&PAGE_MASK)

Aligns an address to the next page boundary (addresses that are already page-aligned are left unchanged).

#define PAGE_OFFSET             0
#define MAP_NR(addr)            (((unsigned long)(addr)) >> PAGE_SHIFT)

This handles subscripting the memory map array mem_map[] (defined in <linux/mm.h>).
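
A few worked values may help when checking a port's definitions. The sketch repeats the i386 macros from above; pages_needed() is a hypothetical helper added for illustration.

```c
#define PAGE_SHIFT      12
#define PAGE_SIZE       (1UL << PAGE_SHIFT)
#define PAGE_MASK       (~(PAGE_SIZE-1))
#define PAGE_ALIGN(addr)        (((addr)+PAGE_SIZE-1)&PAGE_MASK)
#define MAP_NR(addr)            (((unsigned long)(addr)) >> PAGE_SHIFT)

/* Hypothetical helper: number of pages needed to hold len bytes. */
static unsigned long pages_needed(unsigned long len)
{
    return PAGE_ALIGN(len) >> PAGE_SHIFT;
}
```

For example, PAGE_ALIGN(0x1001) is 0x2000, PAGE_ALIGN(0x1000) stays 0x1000, and MAP_NR(0x3fff) is 3.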

1.13 <asm/param.h>

#ifndef HZ
#define HZ 100
#endif

#define EXEC_PAGESIZE   4096

#ifndef NGROUPS
#define NGROUPS         32
#endif

#ifndef NOGROUP
#define NOGROUP         (-1)
#endif

#define MAXHOSTNAMELEN  64      /* max length of hostname */

The obvious stuff.

1.14 <asm/pgtable.h>

static inline void invalidate(void);
static inline void invalidate_all(void);
static inline void invalidate_mm(struct mm_struct *mm);
static inline void invalidate_page(struct vm_area_struct *vma,
        unsigned long addr);
static inline void invalidate_range(struct mm_struct *mm,
        unsigned long start, unsigned long end);

Flush TLB entries.

/* PMD_SHIFT determines the size of the area a second-level page table can map */
#define PMD_SHIFT       22
#define PMD_SIZE        (1UL << PMD_SHIFT)
#define PMD_MASK        (~(PMD_SIZE-1))

/* PGDIR_SHIFT determines what a third-level page table entry can map */
#define PGDIR_SHIFT     22
#define PGDIR_SIZE      (1UL << PGDIR_SHIFT)
#define PGDIR_MASK      (~(PGDIR_SIZE-1))

/*
 * entries per page directory level: the i386 is two-level, so
 * we don't really have any PMD directory physically.
 */
#define PTRS_PER_PTE    1024
#define PTRS_PER_PMD    1
#define PTRS_PER_PGD    1024

The above definitions define the sizes of page directories, mid-level page directories and page tables.
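
To see how these constants carve up a 32-bit virtual address on the i386, the sketch below splits an address into its page directory index, page table index and page offset. The helper names are hypothetical, and uint32_t stands in for the i386's 32-bit address so that the sketch behaves the same on a 64-bit host.

```c
#include <stdint.h>

#define PAGE_SHIFT      12
#define PGDIR_SHIFT     22
#define PTRS_PER_PTE    1024
#define PTRS_PER_PGD    1024

/* Hypothetical helpers; uint32_t stands in for the i386's 32-bit address. */
static unsigned pgd_index(uint32_t addr)
{
    return (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
}

static unsigned pte_index(uint32_t addr)
{
    return (addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

static unsigned page_offset(uint32_t addr)
{
    return addr & ((1u << PAGE_SHIFT) - 1);
}
```

For the address 0xC0101234 this gives directory index 768, table index 257 and offset 0x234. Because PTRS_PER_PMD is 1, the mid-level directory contributes no index bits.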

#define VMALLOC_OFFSET  (8*1024*1024)
#define VMALLOC_START ((high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1))
#define VMALLOC_VMADDR(x) (TASK_SIZE + (unsigned long)(x))

VMALLOC_START is the start of the kernel's virtual vmalloc() area - in this case, at the next 8MB boundary after the end of physical memory. The VMALLOC_VMADDR() macro computes the virtual (linear) address of a vmalloc()'d region (TASK_SIZE points to the beginning of the kernel's data segment (0xC0000000 in the monolithic kernel)).
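
The rounding in VMALLOC_START can be tried out with a few memory sizes. The sketch below uses the same expression but takes high_memory as a parameter; vmalloc_start() is a hypothetical name used only here.

```c
#define VMALLOC_OFFSET  (8UL*1024*1024)

/* Same expression as VMALLOC_START, with high_memory as a parameter
 * so that the sketch can be tried with different memory sizes. */
static unsigned long vmalloc_start(unsigned long high_memory)
{
    return (high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET - 1);
}
```

With 16MB of physical memory the area starts at 24MB; with 20MB it also starts at 24MB. The result is always 8MB-aligned and lies strictly above high_memory, leaving a gap of at most 8MB.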

#define PAGE_NONE       ... /* page mapped, but none of r/w/x */
#define PAGE_SHARED     ... /* r/w shared page */
#define PAGE_COPY       ... /* r/o mapped copy-on-write shared page */
#define PAGE_READONLY   ... /* r/o page */
#define PAGE_KERNEL     ... /* kernel-only page */

Different page attributes. These must be mapped to bit vectors that can be stored in page table entries.

        /* xwr */
#define __P000  PAGE_NONE
#define __P001  PAGE_READONLY
#define __P010  PAGE_COPY
#define __P011  PAGE_COPY
#define __P100  PAGE_READONLY
#define __P101  PAGE_READONLY
#define __P110  PAGE_COPY
#define __P111  PAGE_COPY

#define __S000  PAGE_NONE
#define __S001  PAGE_READONLY
#define __S010  PAGE_SHARED
#define __S011  PAGE_SHARED
#define __S100  PAGE_READONLY
#define __S101  PAGE_READONLY
#define __S110  PAGE_SHARED
#define __S111  PAGE_SHARED

Page protections (r/w/x) for private and shared cases mapped to page attributes. The i386 can't do page protection for execute, and considers them the same as read. Also, write permissions imply read permissions. This is the closest we can get.
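
One way to consume these sixteen macros is a lookup table indexed by the requested protection bits. The following sketch assumes that arrangement; the enum values are stand-ins for the architecture-specific PAGE_* bit vectors, and the P_* index constants are hypothetical names.

```c
/* Stand-in values; the real PAGE_* attributes are architecture-specific
 * bit vectors, elided ("...") in the definitions above. */
enum page_attr { PAGE_NONE, PAGE_SHARED, PAGE_COPY, PAGE_READONLY };

/* Index bits in the order of the macro names (xwr); shared adds 8. */
#define P_READ    1
#define P_WRITE   2
#define P_EXEC    4
#define P_SHARED  8

static const enum page_attr protection_map[16] = {
    /* __P000..__P111: private mappings */
    PAGE_NONE,     PAGE_READONLY, PAGE_COPY,   PAGE_COPY,
    PAGE_READONLY, PAGE_READONLY, PAGE_COPY,   PAGE_COPY,
    /* __S000..__S111: shared mappings */
    PAGE_NONE,     PAGE_READONLY, PAGE_SHARED, PAGE_SHARED,
    PAGE_READONLY, PAGE_READONLY, PAGE_SHARED, PAGE_SHARED,
};
```

A private writable mapping (index P_WRITE) yields PAGE_COPY, whereas the same request on a shared mapping (index P_SHARED | P_WRITE) yields PAGE_SHARED. An execute-only request maps to PAGE_READONLY in both cases.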

#define BAD_PAGETABLE ...
#define BAD_PAGE ...
#define ZERO_PAGE ...

BAD_PAGETABLE is used when we need a bogus page-table (a table pointing to bogus pages), while BAD_PAGE is used for a bogus page (a shared scratch page). These are used when Linux is out of memory.

ZERO_PAGE is a global shared page that is always zero: used for zero-mapped memory areas etc.

/* number of bits that fit into a memory pointer */
#define BITS_PER_PTR                    (8*sizeof(unsigned long))

/* to align the pointer to a pointer address */
#define PTR_MASK                        (~(sizeof(void*)-1))

/* sizeof(void*)==1<<SIZEOF_PTR_LOG2 */
/* 64-bit machines, beware!  SRB. */
#define SIZEOF_PTR_LOG2                 2

Used for pointer arithmetic.

#define PAGE_PTR(address) ...

This macro computes an index into a page table.

inline void SET_PAGE_DIR(struct task_struct * tsk, pgd_t * pgdir);

Set the task's page directory to pgdir. If the task is the current task, set the current hardware page directory pointer to pgdir.

extern unsigned long high_memory;

End of physical memory.

extern inline int pte_none(pte_t pte);
extern inline int pte_present(pte_t pte);       
extern inline void pte_clear(pte_t *ptep);

Page table entry manipulation: check whether a page table entry is in use, or the page is present, and a routine to clear a page table entry.

extern inline int pmd_none(pmd_t pmd);
extern inline int pmd_bad(pmd_t pmd);
extern inline int pmd_present(pmd_t pmd);
extern inline void pmd_clear(pmd_t * pmdp);

The same routines for mid-level page directories, with an additional routine that checks whether a page directory entry is invalid.

extern inline int pgd_none(pgd_t pgd);
extern inline int pgd_bad(pgd_t pgd);
extern inline int pgd_present(pgd_t pgd);
extern inline void pgd_clear(pgd_t * pgdp);

The same routines for (3rd-level) page directories.

extern inline int pte_read(pte_t pte);
extern inline int pte_write(pte_t pte);
extern inline int pte_exec(pte_t pte);
extern inline int pte_dirty(pte_t pte);
extern inline int pte_young(pte_t pte);

These routines check whether some specific flag is set on a page. They only need to return defined results if the page table entry is valid (i.e., the page is present). "Young" means the page has been accessed.

extern inline pte_t pte_wrprotect(pte_t pte);
extern inline pte_t pte_rdprotect(pte_t pte);
extern inline pte_t pte_exprotect(pte_t pte);
extern inline pte_t pte_mkclean(pte_t pte);
extern inline pte_t pte_mkold(pte_t pte);
extern inline pte_t pte_uncow(pte_t pte);
extern inline pte_t pte_mkwrite(pte_t pte);
extern inline pte_t pte_mkread(pte_t pte);
extern inline pte_t pte_mkexec(pte_t pte);
extern inline pte_t pte_mkdirty(pte_t pte);
extern inline pte_t pte_mkyoung(pte_t pte);
extern inline pte_t pte_modify(pte_t pte, pgprot_t newprot);

These routines manipulate page table entries: they take a valid page table entry and return the modified entry. Pages can be write protected, read protected, protected from being executed, made "clean" and "dirty" (i.e., the dirty flag is cleared/set), made "old" and "young" (i.e., have their accessed bit cleared/set), made writable, readable and executable, and have their protection flags reset to some value.

extern inline pte_t mk_pte(unsigned long page, pgprot_t pgprot);
extern inline unsigned long pte_page(pte_t pte);
extern inline unsigned long pmd_page(pmd_t pmd);

Conversion functions: convert a page address and protection flags to a page table entry, and convert a page table entry or mid-level directory entry to the page it refers to.

extern inline pgd_t * pgd_offset(struct task_struct * tsk, unsigned
                                 long address);
extern inline pmd_t * pmd_offset(pgd_t * dir, unsigned long address);
extern inline pte_t * pte_offset(pmd_t * dir, unsigned long address);

These routines return entries belonging to specific virtual addresses from page directories and page tables.

extern inline void pte_free_kernel(pte_t * pte);
extern inline pte_t * pte_alloc_kernel(pmd_t * pmd, unsigned long address);
extern inline void pmd_free_kernel(pmd_t * pmd);
extern inline pmd_t * pmd_alloc_kernel(pgd_t * pgd, unsigned long address);
extern inline void pte_free(pte_t * pte);
extern inline pte_t * pte_alloc(pmd_t * pmd, unsigned long address);
extern inline void pmd_free(pmd_t * pmd);
extern inline pmd_t * pmd_alloc(pgd_t * pgd, unsigned long address);
extern inline void pgd_free(pgd_t * pgd);
extern inline pgd_t * pgd_alloc(void);

Allocate and free page tables, and return a pointer to the entry in the new table that belongs to address. The xxx_kernel() versions are used to allocate a kernel page table - this turns on ASN (address space number) bits if any, and marks the page tables MAP_PAGE_RESERVED in mem_map[].

extern pgd_t swapper_pg_dir[1024];

A kernel page directory used whenever the current task's page directory is invalid.

extern inline void update_mmu_cache(struct vm_area_struct * vma,
        unsigned long address, pte_t pte);

Update the MMU after manipulating the page tables. On the i386 hardware, this is unnecessary because it doesn't have any external MMU info.

#define SWP_TYPE(entry) (((entry) >> 1) & 0x7f)
#define SWP_OFFSET(entry) ((entry) >> 8)
#define SWP_ENTRY(type,offset) (((type) << 1) | ((offset) << 8))

Extract information about a swapped page from a page table entry, and construct a page table entry from swap info.
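
A round trip through the i386 macros above shows the encoding: the swap type occupies bits 1-7 and the offset starts at bit 8. Bit 0 stays clear; on the i386 that is the page-present bit, so a swap entry is not mistaken for a present page. swp_roundtrip_ok() is a hypothetical helper used only for this check.

```c
#define SWP_TYPE(entry) (((entry) >> 1) & 0x7f)
#define SWP_OFFSET(entry) ((entry) >> 8)
#define SWP_ENTRY(type,offset) (((type) << 1) | ((offset) << 8))

/* Hypothetical check: encode (type, offset), then decode both again.
 * Valid for type <= 0x7f and offsets that fit above bit 8. */
static int swp_roundtrip_ok(unsigned long type, unsigned long offset)
{
    unsigned long entry = SWP_ENTRY(type, offset);

    return SWP_TYPE(entry) == type &&
           SWP_OFFSET(entry) == offset &&
           (entry & 1UL) == 0;      /* present bit stays clear */
}
```

For example, SWP_ENTRY(3, 0x1234) is 0x123406.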

1.15 <asm/processor.h>

#define TASK_SIZE ...

Size of user virtual memory area. On the i386, this should be 0xC0000000.

extern int EISA_bus;
extern int MCA_bus;

Bus types.

extern int wp_works_ok;

Should be set to 0 if supervisor can access write-protected pages.

This include file should also declare all other system setup/hardware bug flags tested for in <asm/bugs.h>.

#define EISA_bus__is_a_macro
#define MCA_bus__is_a_macro
#define wp_works_ok__is_a_macro

If any of the respective variables has been hard-coded using a macro, the corresponding definition should be included to avoid problems when building the kernel's symbol table (kernel/ksyms.c).

struct thread_struct {...};

State associated with a thread, i.e., register values, etc.

#define INIT_TSS {...}

A thread_struct used as an initializer for the kernel's first task.

#define INIT_MMAP {...}

A vm_area_struct used as an initializer for init_mmap, which seems to be used nowhere (go to linux/sched.c and see for yourself...).

#define alloc_kernel_stack()    ...
#define free_kernel_stack(page) ...

Allocate/free a page used as a kernel stack for a task.

static inline void start_thread(struct pt_regs * regs, 
        unsigned long eip, unsigned long esp);

Start a newly created thread: set up regs using eip and esp. Also set up segment registers in regs if the architecture is segmented.

extern inline unsigned long thread_saved_pc(struct thread_struct *t);

Return saved PC of a blocked thread. This routine can assume the thread has been blocked by a call to schedule(), and can look up the PC using the thread's stack and the SP in t.

1.16 <asm/ptrace.h>

struct pt_regs {...};

This struct defines the way the registers are stored on the stack during a system call or exception, i.e., system call handlers will find this structure on the stack. That means that this structure needs to be laid out such that system call handlers also find their parameters at the places they expect them (at the bottom of the stack).

#define user_mode(regs) ...

Returns true if the process is currently in user mode (and not in system mode).

#define instruction_pointer(regs) ...

Return the PC as an lvalue.

void show_regs(struct pt_regs * regs);

printk() processor state.

1.17 <asm/segment.h>

This file defines an interface for accessing user memory (copyin/copyout).

static inline unsigned char get_user_byte(const char * addr);
#define get_fs_byte(addr) get_user_byte((char *)(addr))

static inline unsigned short get_user_word(const short *addr);
#define get_fs_word(addr) get_user_word((short *)(addr))

static inline unsigned long get_user_long(const int *addr);
#define get_fs_long(addr) get_user_long((int *)(addr))

static inline void put_user_byte(char val,char *addr);
#define put_fs_byte(x,addr) put_user_byte((x),(char *)(addr))

static inline void put_user_word(short val,short * addr);
#define put_fs_word(x,addr) put_user_word((x),(short *)(addr))

static inline void put_user_long(unsigned long val,int * addr);
#define put_fs_long(x,addr) put_user_long((x),(int *)(addr))

static inline void memcpy_tofs(void * to, const void * from, unsigned long n);

static inline void memcpy_fromfs(void * to, const void * from, unsigned long n);

Copyin/copyout routines.

#define KERNEL_DS ...
#define USER_DS ...

Descriptors for kernel and user data segments.

static inline unsigned long get_fs(void);
static inline unsigned long get_ds(void);
static inline void set_fs(unsigned long val);

The get_ routines return the user/kernel data segment descriptors. set_fs() can be used to temporarily map the user space to kernel space (so that device drivers using copyin/copyout can operate on kernel buffers); set_fs() only needs to accept KERNEL_DS and USER_DS.
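
The temporary remapping follows a save/switch/restore pattern, sketched below with user-space stand-ins so that it compiles outside the kernel. The descriptor values, the current_fs variable, fill_buffer() and fill_kernel_buffer() are all assumptions of this sketch; only get_fs(), get_ds() and set_fs() are part of the interface.

```c
/* User-space stand-ins so the idiom can be compiled outside the kernel.
 * The descriptor values are placeholders; the real ones are elided above. */
#define KERNEL_DS 1UL
#define USER_DS   2UL

static unsigned long current_fs = USER_DS;

static inline unsigned long get_fs(void) { return current_fs; }
static inline unsigned long get_ds(void) { return KERNEL_DS; }
static inline void set_fs(unsigned long val) { current_fs = val; }

/* Hypothetical driver routine that copies via the user-access interface. */
static int fill_buffer(char *buf, unsigned long n)
{
    /* memcpy_tofs(buf, data, n) would go here; the stub only
     * reports whether the kernel segment was active during the call. */
    (void)buf;
    (void)n;
    return get_fs() == KERNEL_DS;
}

/* Call fill_buffer() on a kernel buffer, then restore the old segment. */
static int fill_kernel_buffer(char *kbuf, unsigned long n)
{
    unsigned long old_fs = get_fs();
    int ok;

    set_fs(get_ds());          /* user accesses now reach kernel space */
    ok = fill_buffer(kbuf, n);
    set_fs(old_fs);            /* always restore before returning */
    return ok;
}
```

Saving and restoring the previous value, rather than unconditionally setting USER_DS afterwards, keeps the pattern correct when such calls nest.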

1.18 <asm/smp.h>, <asm/smp_lock.h>


Doesn't need to define anything if __SMP__ is not #defined, but could this perhaps be used to do the necessary locking when using external scheduling and kernel-internal multi-threading?

1.19 <asm/string.h>

extern inline void * memcpy(void * to, const void * from, size_t n);
extern inline void * memset(void * s, char c, size_t count);
extern inline size_t strlen(const char * s);

These are the minimally required routines. <asm-i386/string.h> defines many more; this file can probably be copied for Linux/L4.

1.20 <asm/system.h>

#define switch_to(prev,next) ...

Switch the context to task next, making the task "current" (current = next). This also switches the user address space to next's address space.

#define nop() ...
#define sti() ...
#define cli() ...
#define iret() ...
#define mb() ...
#define xchg(ptr,x) ...

extern inline unsigned long xchg_u8(char * m, unsigned long val);
extern inline unsigned long xchg_u16(short * m, unsigned long val);
extern inline unsigned long xchg_u32(int * m, unsigned long val);
extern inline unsigned long xchg_u64(volatile long * m, unsigned long val);

Some machine instructions available to i386 device drivers. mb() is a memory barrier instruction; it prevents the optimizer from moving memory references across it.

extern inline int tas(char * m);

Atomic test-and-set instruction.

#define save_flags(x)
#define restore_flags(x)

Save the processor flags in a longword, and restore them from it.

void disable_hlt(void);
void enable_hlt(void);

A non-mandatory interface available to device drivers.

1.21 <asm/unistd.h>

This include file #defines numbers for all system calls.

In addition, it defines:


1.22 ABI Definitions

Several include files merely define constants and structures pertaining to the architecture's ABI (application binary interface). For Linux/L4, these can probably be copied from <asm-i386/*>.

1.22.1 <asm/a.out.h>


struct exec {...};

a.out exec structure for this architecture.

#define N_TRSIZE(a)     ((a).a_trsize)
#define N_DRSIZE(a)     ((a).a_drsize)
#define N_SYMSIZE(a)    ((a).a_syms)

Macros for accessing struct execs.

1.22.2 <asm/elf.h>

#define ELF_NGREG ...
typedef ... elf_greg_t;
typedef elf_greg_t elf_gregset_t[ELF_NGREG];

ELF general-purpose register and register set types.

typedef ... elf_fpregset_t;

ELF floating-point register set type.

1.22.3 <asm/errno.h>

Error numbers.

1.22.4 <asm/fcntl.h>

Several fcntl() opcodes.

1.22.5 <asm/ioctl.h>, <asm/ioctls.h>

Several ioctl() macros.

1.22.6 <asm/mman.h>

Several mmap() opcodes.

1.22.7 <asm/posix_types.h>

Several POSIX type definitions.

1.22.8 <asm/resource.h>

Resource limits for this architecture.

1.22.9 <asm/shmparam.h>

Shared memory definitions.

1.22.10 <asm/signal.h>

Signal handler types, signal numbers, struct sigaction, sigaction() options.

1.22.11 <asm/sigcontext.h>

struct sigcontext {...};

Signal context structure, passed as the third argument to signal handlers.

1.22.12 <asm/socket.h>, <asm/sockios.h>

Socket setsockopt() options and ioctl() opcodes.

1.22.13 <asm/stat.h>

struct old_stat {...};
struct new_stat {...};


1.22.14 <asm/statfs.h>

struct statfs {...};

1.22.15 <asm/termios.h>, <asm/termbits.h>

struct termios, struct termio, and friends.

1.22.16 <asm/types.h>

Several type definitions.

1.22.17 <asm/user.h>

Core file format.

Michael Hohmuth
March 21, 1996