Keyboard

Character and Key Codes

Interaction with the PC keyboard involves several different codes, which are discussed below.

ASCII Code and CP 437

The "American Standard Code for Information Interchange" is the classic character-encoding standard for representing letters, digits and special characters as 7-bit numbers between 0 and 127. The original IBM PC extended this encoding to 8 bits with a standard called "code page 437", introducing umlauts, accented characters, Greek letters, and line-drawing symbols. A small excerpt from the ASCII character table (CP 437 is a superset of ASCII, so the same codes apply there):

Character   ASCII code
(           40
0           48
1           49
2           50
A           65
B           66
a           97

Characters and strings are classically stored in ASCII encoding.
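
As a small illustration of the table above, the following sketch checks a few of the listed codes at compile time. The helper names `ascii_code` and `digit_value` are made up for this example, and the code assumes a toolchain with an ASCII-compatible execution character set, which is the usual case on PCs but is not guaranteed by the C++ standard.

```cpp
// Hypothetical helper: returns the numeric code of a character.
// Assumes an ASCII-compatible execution character set.
constexpr int ascii_code(char c) { return static_cast<unsigned char>(c); }

// Hypothetical helper: converts a digit character '0'..'9' to its value
// by subtracting the code of '0' (48 in ASCII).
constexpr int digit_value(char c) { return ascii_code(c) - 48; }

// Values taken from the excerpt of the character table.
static_assert(ascii_code('(') == 40, "'(' has code 40");
static_assert(ascii_code('A') == 65, "'A' has code 65");
static_assert(ascii_code('a') == 97, "'a' has code 97");
static_assert(digit_value('2') == 2, "'2' (code 50) maps to the value 2");
```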

Scancode

The scan code assigns a unique number to each key on the PC keyboard. This also makes it possible to identify keys that do not correspond to printable characters, such as the cursor keys. Scan codes do not distinguish between uppercase and lowercase letters, since both are typed with the same key.

Key           Scan code
A             30
S             31
D             32
Cursor up     72
Cursor down   80

Throughout the PC's history, keyboards with differing numbers and meanings of keys have been built. In particular, the "function" and "special" keys can have different scan codes on different models. Since PC keyboards usually have only slightly more than 100 keys, 7 bits suffice to represent the scan code.

Make and Break Codes

Programs must be able to determine not only which of the "normal" keys has been pressed, but also whether the Shift key, the Control key or the ALT key was held down at the same time. Therefore, instead of a simple scan code, the keyboard sends one or more so-called make codes for each press and one or more break codes for each release of a key. If a key is held down for longer than a certain period of time, the make code(s) are repeated. For most keys, the make code is identical to the scan code, and the break code is the scan code with the most-significant bit set (i.e., scan code + 128). However, for historical reasons, some keys generate multiple make and break codes even when pressed and released only once. The keyboard driver (starting with task #3: Keyboard::prologue()) must map the make and break codes of the pressed keys to ASCII codes the application can work with.
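
The relationship between make code, break code and scan code for the common single-byte case can be sketched as follows. The names `KeyEvent` and `decode` are hypothetical and not part of the template code. The sketch deliberately ignores the keys that send multiple codes.

```cpp
#include <cstdint>

// Hypothetical result type for this example.
struct KeyEvent {
    std::uint8_t scancode;  // 7-bit key number
    bool released;          // true for a break code, false for a make code
};

// Splits a single-byte make/break code into scan code and press/release flag.
// Bit 7 (value 128) marks a break code; the lower 7 bits are the scan code.
constexpr KeyEvent decode(std::uint8_t code) {
    return KeyEvent{static_cast<std::uint8_t>(code & 0x7f), (code & 0x80) != 0};
}
```

For example, pressing "A" yields the code 30, and releasing it yields 30 + 128 = 158; both decode to scan code 30.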

Note: Since the interpretation of the make and break codes is rather tedious, boring and not very instructive, we have taken the decoding off your hands in the Keyboard_Controller template code. However, it may be that our implementation does not correctly recognize all characters on your keyboard, especially German umlauts. In this case, you will either have to live with a few wrong characters or adjust the decoding tables accordingly.

Key-press Sequence

When a key is pressed on a PC keyboard, two crossing lines of the scan matrix are connected within the keyboard, closing an electric circuit. The keyboard processor (classically an Intel 8048) determines the position of the pressed key and derives the scan code from it. This code is sent via a serial interface to the keyboard controller (Intel 8042 on the PC/AT) on the PC mainboard.

This keyboard controller communicates with the keyboard via an input port and an output port (for commands to the keyboard). The controller has registers that can be read and written to via the system bus using in and out instructions.

Port   Read (R) / write (W)   Register           Meaning
0x60   R                      Output buffer      Make/break codes from the keyboard
0x60   W                      Input buffer       Commands for the keyboard (e.g. set LEDs)
0x64   W                      Control register   Commands for the keyboard controller (e.g. reboot)
0x64   R                      Status register    Keyboard-controller state (e.g. output buffer full?)

Whenever the keyboard controller writes a byte to its output buffer, it signals this by raising an interrupt request. The processor must react by reading the byte from the output buffer. The outb bit (bit 0) in the status register then changes to 0, indicating that the output buffer is empty again. Only then can new characters be accepted from the keyboard. If the keyboard is used in polling mode, the outb bit can be used to check whether there actually is a character in the output buffer of the keyboard controller. In the opposite direction, it is always necessary to wait until the input buffer of the keyboard controller is empty (inpb, bit 1, cleared) before a new character is written.
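
Polling mode can be sketched roughly as below. Since port I/O cannot be run outside the kernel, the example uses a made-up `FakeController` in place of ports 0x64 and 0x60. `poll_byte` and the `max_polls` limit are likewise assumptions of this sketch. In the real driver, the two read functions would be in instructions on the respective ports.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for the controller. The queue models bytes that
// arrive one after another in the (single-byte) output buffer.
struct FakeController {
    std::deque<std::uint8_t> pending;

    // Status register (port 0x64): only bit 0 (outb) is modeled here.
    std::uint8_t read_status() const { return pending.empty() ? 0x00 : 0x01; }

    // Output buffer (port 0x60): reading removes the byte, which clears outb.
    std::uint8_t read_data() {
        assert(!pending.empty());
        std::uint8_t byte = pending.front();
        pending.pop_front();
        return byte;
    }
};

constexpr std::uint8_t OUTB = 0x01;  // output buffer full

// Polls the status register up to max_polls times. Returns the byte read
// from the output buffer, or -1 if no byte became available.
template <typename Controller>
int poll_byte(Controller& ctrl, int max_polls) {
    for (int i = 0; i < max_polls; ++i) {
        if (ctrl.read_status() & OUTB) {
            return ctrl.read_data();
        }
    }
    return -1;
}
```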

On PS/2 PCs, the mouse is also connected to the keyboard controller. This means that codes from both the keyboard and the mouse end up in the output buffer. To distinguish the source of the byte, bit 5 (auxb) is available in the status register (1 = mouse, 0 = keyboard).

Bit   Mask   Name   Meaning
0     0x01   outb   Set to 1 when a character is ready to be read from the output buffer of the keyboard controller
1     0x02   inpb   Set to 1 as long as the keyboard controller has not yet fetched a character written by the CPU
5     0x20   auxb   Source of the value in the output buffer (0 = keyboard, 1 = mouse)
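
One way to interpret a status byte using these three bits is sketched below. The names `StatusMask`, `Source`, `pending_source` and `can_write` are chosen for this example only. The masks themselves are taken from the table.

```cpp
#include <cstdint>

// Bit masks of the status register, as listed in the table.
enum StatusMask : std::uint8_t { OUTB = 0x01, INPB = 0x02, AUXB = 0x20 };

// Hypothetical classification of what is waiting in the output buffer.
enum class Source { None, Keyboard, Mouse };

// auxb is only meaningful while outb is set, so outb is checked first.
constexpr Source pending_source(std::uint8_t status) {
    return !(status & OUTB) ? Source::None
         : (status & AUXB)  ? Source::Mouse
                            : Source::Keyboard;
}

// The CPU may write a byte only while the input buffer is empty (inpb cleared).
constexpr bool can_write(std::uint8_t status) { return (status & INPB) == 0; }
```

A driver that handles only the keyboard could, for instance, read and discard bytes classified as `Source::Mouse` so that they do not block the output buffer.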

Configuring the Keyboard

The keyboard can be configured by sending command codes via the input buffer. About 20 commands are supported in total, but we use only two of them: one to drive the LEDs and one to set the repetition rate and delay. Both expect an additional data byte after the command byte.

Before sending a byte to the keyboard, you should make sure that the input buffer is empty (status register, inpb), then write the byte (command code or user data) to the data port. Then you should wait for a response from the keyboard controller (outb) and check whether the output buffer contains the acknowledgment code 0xfa (ACK). Formally, only then may the next byte be sent. Note that an ACK is returned after each byte, i.e. one after sending the command code and another one after sending the user data.

Note that while a clean solution would wait for an ACK after every command code, it is non-trivial to achieve a fully standard-conforming implementation. Correctly waiting for an ACK is difficult, as it may be interwoven with or squashed by subsequent key presses. Therefore it is okay to simply ignore the acknowledgment byte.
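
The formal sequence described above (wait for inpb to clear, write, wait for outb, check for ACK) might look like the following sketch. `FakeController`, `send_byte`, `send_command` and the `max_polls` timeout are hypothetical; the simulated controller accepts every byte immediately and always answers with ACK, so the interleaving with key presses mentioned above is not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint8_t OUTB = 0x01, INPB = 0x02, ACK = 0xfa;

// Hypothetical simulated controller standing in for ports 0x60 and 0x64.
struct FakeController {
    std::vector<std::uint8_t> received;  // bytes "sent to the keyboard"
    bool ack_pending = false;

    std::uint8_t read_status() const { return ack_pending ? OUTB : 0; }
    std::uint8_t read_data() { ack_pending = false; return ACK; }
    void write_data(std::uint8_t byte) { received.push_back(byte); ack_pending = true; }
};

// Sends one byte and waits for its ACK. Returns false on timeout or
// if the reply is something other than ACK.
template <typename Controller>
bool send_byte(Controller& ctrl, std::uint8_t byte, int max_polls) {
    assert(max_polls > 0);
    int polls = 0;
    while (ctrl.read_status() & INPB) {      // wait until the input buffer is empty
        if (++polls >= max_polls) return false;
    }
    ctrl.write_data(byte);
    polls = 0;
    while (!(ctrl.read_status() & OUTB)) {   // wait for the reply byte
        if (++polls >= max_polls) return false;
    }
    return ctrl.read_data() == ACK;
}

// A command consists of the command code followed by one data byte;
// each of the two bytes is acknowledged separately.
template <typename Controller>
bool send_command(Controller& ctrl, std::uint8_t command, std::uint8_t data,
                  int max_polls = 1000) {
    return send_byte(ctrl, command, max_polls) && send_byte(ctrl, data, max_polls);
}
```

A simplified variant that ignores the acknowledgment, as permitted above, would keep only the first wait loop and the write.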

Command code   Name        Description
0xed           set_led     Configure the keyboard LEDs. After the command code, an additional data byte must be sent to the keyboard; its structure is described below in a separate table.
0xf3           set_speed   Set repeat rate and delay according to a data byte following the command byte; its structure is described below in a separate table.

The following table shows the structure of the data byte of set_led for setting the keyboard LEDs. MSB means most significant bit (i.e. 0x80 in hexadecimal notation), LSB least significant bit (i.e. 0x01).

Bit       7 (MSB)    6          5          4          3          2           1          0 (LSB)
Meaning   Always 0   Always 0   Always 0   Always 0   Always 0   Caps Lock   Num Lock   Scroll Lock
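
Assembling the set_led data byte from the three LED states could be done as in this sketch. The enum `Led` and the function `led_byte` are illustrative names, not part of the template code.

```cpp
#include <cstdint>

// Bit masks for the three LEDs, following the table above.
enum Led : std::uint8_t { SCROLL_LOCK = 0x01, NUM_LOCK = 0x02, CAPS_LOCK = 0x04 };

// Builds the data byte for set_led (0xed). Bits 3 to 7 always stay 0.
constexpr std::uint8_t led_byte(bool caps, bool num, bool scroll) {
    return static_cast<std::uint8_t>((caps ? CAPS_LOCK : 0) |
                                     (num ? NUM_LOCK : 0) |
                                     (scroll ? SCROLL_LOCK : 0));
}
```

Switching on only Caps Lock, for example, corresponds to the data byte 0x04.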

The following two tables describe the data byte of set_speed. The repetition rate is specified by bits 0–4, the delay by bits 5 and 6.

Bits 0–4 (hex)   Repeat rate (characters per second)
0x00             30
0x02             25
0x04             20
0x08             15
0x0c             10
0x10             7
0x14             5

Bits 5 and 6 (hex)   Delay (in seconds)
0x00                 0.25
0x01                 0.5
0x02                 0.75
0x03                 1.0
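
Combining the two fields into the set_speed data byte can be sketched as follows. The function name `speed_byte` is an assumption of this example. Its arguments are the raw field values from the two tables, not rates or seconds.

```cpp
#include <cstdint>

// Builds the data byte for set_speed (0xf3).
// rate:  value for bits 0-4 (e.g. 0x04 for 20 characters per second)
// delay: value for bits 5-6 (e.g. 0x01 for 0.5 seconds)
// Out-of-range bits are masked off, so bit 7 always stays 0.
constexpr std::uint8_t speed_byte(std::uint8_t rate, std::uint8_t delay) {
    return static_cast<std::uint8_t>(((delay & 0x03) << 5) | (rate & 0x1f));
}
```

For instance, 20 characters per second with a delay of 0.5 seconds gives (0x01 << 5) | 0x04 = 0x24.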




Further Information and References