Hello there,

I would like to introduce pledge(2), a system call that restricts the operations a program is allowed to perform. At the time of writing, it is supported only on OpenBSD.

I am learning about OpenBSD kernel internals and development, and I would like to share some tips on how to approach the subject.

I used the following materials while learning about BSD kernel internals:

  • the book “The design and implementation of the BSD operating system” by Kirk McKusick
  • the OpenBSD source code
  • the man pages and a few presentations and papers on OpenBSD
  • questions asked on the mailing lists and in the OpenBSD Facebook group.

What is pledge(2)?

“pledge” refers to “a solemn promise or undertaking”.

In the OpenBSD context:

Calling pledge(2) in a program means the program promises the kernel that it will only use the resources it has declared in advance.

For example, if a user-space program promises the kernel that it will only use the I/O family of calls, it cannot call anything from other families such as network or process. If it tries to make such a call without having declared it, the kernel will abort() the process.

How does it make a program more secure?

By limiting the operations of a program. For example:

  • we write a program named “abc” that only needs stdio to print something to stdout
  • so we add a pledge that restricts it to stdio
  • later, a malicious user finds a vulnerability in our program that can be exploited to get a shell
  • exploiting our program to open a shell results in the kernel killing the process with SIGABRT, which cannot be caught or ignored, and a log entry is generated in dmesg
  • this happens because opening a shell from the current program needs operations from other families: fork(2) belongs to “proc”, execution belongs to “exec”, and network activity needs “inet”. None of these were promised to the kernel, so calling APIs from those families is forbidden and leads to abort()
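
The scenario above can be sketched as a toy model in Python. This is not how the kernel implements pledge(2); the REQUIRED_PROMISE table, the PledgeViolation exception and the helper functions are hypothetical names made up for illustration, and the table is a heavily simplified assumption about which family each operation belongs to.

```python
# Toy model of pledge(2) enforcement; NOT the kernel's real logic.
# The operation-to-promise table is a hypothetical, simplified mapping.
REQUIRED_PROMISE = {
    "printf": "stdio",
    "fork": "proc",
    "execve": "exec",
    "socket": "inet",
}


class PledgeViolation(Exception):
    """Stands in for the kernel killing the process with SIGABRT."""


def check(promises, operation):
    """Raise PledgeViolation unless `operation` is covered by `promises`."""
    needed = REQUIRED_PROMISE[operation]
    if needed not in promises.split():
        raise PledgeViolation(
            "%s needs '%s', which was not promised" % (operation, needed))


def try_operations(promises, operations):
    """Run operations in order and report the first one that would be fatal."""
    done = []
    for op in operations:
        try:
            check(promises, op)
        except PledgeViolation:
            return done, op
        done.append(op)
    return done, None


# The "abc" program promised only stdio; the exploit then tries to spawn a shell.
done, killed_at = try_operations("stdio", ["printf", "fork", "execve"])
print(done, killed_at)
```

In this model the process gets as far as printing and is stopped at the first call outside its promises, which is fork.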

After discussing this with the developers, I learned that pledge(2) is not a system call filter, so it is not used to restrict individual system calls. Instead, pledge(2) works on promise families such as stdio, dns, inet and proc, not directly on system calls such as read, write or fork.

pledge("read", NULL): incorrect way of using pledge(2)
pledge("stdio inet", NULL): correct way of using pledge(2)

The developers also mentioned that pledge(2) takes a behavioral approach rather than a 1:1 mapping onto system calls.

On 11 December 2017, Theo de Raadt said:

List: openbsd-tech
Subject: pledge execpromises
From: Theo de Raadt 
Date: 2017-12-11 21:20:51
Message-ID: 6735.1513027251 () cvs ! openbsd ! org

This will probably be committed in the next day or so.

The 2nd argument of pledge() becomes execpromises, which is what
will gets activated after execve.

There is also a small new feature called "error", which causes
violating system calls to return -1 with ENOSYS rather than killing
the process. This must be used with EXTREME CAUTION because libraries
and programs are full of unchecked system calls. If you carry on past
one of these failures, your program is in uncharted territory and
risks of exploitation become high.

"error" is being introduced for a different reason: The pre-exec
process's expectation of what the post-exec process will do might
mismatch, so "error" allows things like starting an editor which has
no network access or maybe other restrictions in the future...

Previously the prototype was:

#include <unistd.h>
int pledge(const char *promises, const char *paths[]);

and now it is:

#include <unistd.h>
int pledge(const char *promises, const char *execpromises);

As of OpenBSD 6.2 stable, and at the time of writing, developers are still using pledge(const char *promises, const char *paths[]), so this post focuses on that form.
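
To make the email above more concrete, here is a minimal Python sketch of the two ideas it describes: execpromises becoming active after execve, and the "error" promise turning a fatal violation into a failed call with ENOSYS. ToyProcess and its methods are hypothetical names invented for this sketch, and the model ignores everything else the kernel does.

```python
import errno


class ToyProcess:
    """Hypothetical stand-in for a process; it only tracks promise sets."""

    def __init__(self, promises, execpromises):
        self.promises = set(promises.split())
        self.execpromises = set(execpromises.split())

    def execve(self):
        # Per the email, execpromises are what get activated after execve.
        self.promises = self.execpromises
        self.execpromises = set()

    def on_violation(self):
        # With "error", the violating call returns -1 with ENOSYS;
        # without it, the process is killed.
        if "error" in self.promises:
            return (-1, errno.ENOSYS)
        return "SIGABRT"


p = ToyProcess("stdio proc exec", "stdio rpath error")
before = p.on_violation()   # parent did not promise "error"
p.execve()
after = p.on_violation()    # new image runs under the execpromises
print(before, sorted(p.promises), after)
```

The parent keeps the strict behaviour, while the program started through execve runs with the restricted set and sees errors instead of being killed, which is the editor-without-network use case the email mentions.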

How to use pledge() in a program?

Let’s take a simple hello world example:

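#include <err.h>	/* err() */
#include <stdio.h>
#include <unistd.h>	/* pledge() */

int
main(void)
{
    /* Promise to use only the stdio family from here on. */
    if (pledge("stdio", NULL) == -1)
        err(1, "pledge");
    printf("Pledged\n");
    return 0;
}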

In the above example, the program pledges that it will only use stdio operations.

If the program then tries to open a network socket with socket(2), or to perform any other operation outside stdio such as fork(2), the kernel will kill it with the SIGABRT signal.

Let’s take another example:

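#include <err.h>	/* err() */
#include <stdio.h>
#include <unistd.h>	/* pledge() */

int
main(void)
{
    /* An empty promise string leaves only _exit(2) available. */
    if (pledge("", NULL) == -1)
        err(1, "pledge");
    printf("Pledged\n");	/* not covered by any promise */
    return 0;
}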

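In the above code snippet, the first parameter of pledge(2) is an empty string. According to the OpenBSD man page, a promises value of "" restricts the process to the _exit(2) system call, so the printf() call gets the program killed, as the following session shows: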

# cat sampe.c
#include <unistd.h>
#include <stdio.h>
int
main() {
    if(pledge("stdio",NULL) == -1) {
        err(1,"pledge");
    }
    printf("Pledged\n");
    return 0;
}
# ./testing
Pledged
#
# vim sampe.c
# gcc -o testing_reduced sampe.c
# cat sampe.c
#include <unistd.h>
#include <stdio.h>
int
main() {
    if(pledge("",NULL) == -1) {
        err(1,"pledge");
    }
    printf("Pledged\n");
    return 0;
}
# ./testing_reduced
Abort trap (core dumped)
#

Introduction on the working of pledge(2) - kernel internals

This part was a little difficult to understand at first. I am very thankful to the OpenBSD developers Marc Espie, Benny Löfgren, Bob Beck, Stuart Henderson and Otto Moerbeek for giving their precious time and answering my questions about the kernel internals of pledge(2).

pledge("stdio", NULL); or pledge("stdio inet proc route dns", NULL)

The promises string is split into separate words, for example “stdio”, or “stdio”, “inet”, “proc”, “route” and “dns”. Each word is then looked up in the pledgereq[] array, and if it is found, its flags are returned.

The pledgereq[] array is as follows:

static const struct {
	char *name;
	uint64_t flags;
} pledgereq[] = {
	{ "audio",		PLEDGE_AUDIO },
	{ "bpf",		PLEDGE_BPF },
	{ "chown",		PLEDGE_CHOWN | PLEDGE_CHOWNUID },
	{ "cpath",		PLEDGE_CPATH },
	{ "disklabel",		PLEDGE_DISKLABEL },
	{ "dns",		PLEDGE_DNS },
	{ "dpath",		PLEDGE_DPATH },
	{ "drm",		PLEDGE_DRM },
	{ "error",		PLEDGE_ERROR },
	{ "exec",		PLEDGE_EXEC },
	{ "fattr",		PLEDGE_FATTR | PLEDGE_CHOWN },
	{ "flock",		PLEDGE_FLOCK },
	{ "getpw",		PLEDGE_GETPW },
	{ "id",			PLEDGE_ID },
	{ "inet",		PLEDGE_INET },
	{ "mcast",		PLEDGE_MCAST },
	{ "pf",			PLEDGE_PF },
	{ "proc",		PLEDGE_PROC },
	{ "prot_exec",		PLEDGE_PROTEXEC },
	{ "ps",			PLEDGE_PS },
	{ "recvfd",		PLEDGE_RECVFD },
	{ "route",		PLEDGE_ROUTE },
	{ "rpath",		PLEDGE_RPATH },
	{ "sendfd",		PLEDGE_SENDFD },
	{ "settime",		PLEDGE_SETTIME },
	{ "stdio",		PLEDGE_STDIO },
	{ "tape",		PLEDGE_TAPE },
	{ "tmppath",		PLEDGE_TMPPATH },
	{ "tty",		PLEDGE_TTY },
	{ "unix",		PLEDGE_UNIX },
	{ "unveil",		PLEDGE_UNVEIL },
	{ "vminfo",		PLEDGE_VMINFO },
	{ "vmm",		PLEDGE_VMM },
	{ "wpath",		PLEDGE_WPATH },
	{ "wroute",		PLEDGE_WROUTE },
};

The pledgereq array associates every promise with a macro; for example, stdio maps to PLEDGE_STDIO. These macros expand to specific hex pledge values; PLEDGE_STDIO, for instance, expands to 0x0000000000000008ULL.

The other macros and their expansions are listed below:

#include <sys/cdefs.h>

/*
 * pledge(2) requests
 */
#define PLEDGE_ALWAYS	0xffffffffffffffffULL
#define PLEDGE_RPATH	0x0000000000000001ULL	/* allow open for read */
#define PLEDGE_WPATH	0x0000000000000002ULL	/* allow open for write */
#define PLEDGE_CPATH	0x0000000000000004ULL	/* allow creat, mkdir, unlink etc */
#define PLEDGE_STDIO	0x0000000000000008ULL	/* operate on own pid */
#define PLEDGE_TMPPATH	0x0000000000000010ULL	/* for mk*temp() */
#define PLEDGE_DNS	0x0000000000000020ULL	/* DNS services */
#define PLEDGE_INET	0x0000000000000040ULL	/* AF_INET/AF_INET6 sockets */
#define PLEDGE_FLOCK	0x0000000000000080ULL	/* file locking */
#define PLEDGE_UNIX	0x0000000000000100ULL	/* AF_UNIX sockets */
#define PLEDGE_ID	0x0000000000000200ULL	/* allow setuid, setgid, etc */
#define PLEDGE_TAPE	0x0000000000000400ULL	/* Tape ioctl */
#define PLEDGE_GETPW	0x0000000000000800ULL	/* YP enables if ypbind.lock */
#define PLEDGE_PROC	0x0000000000001000ULL	/* fork, waitpid, etc */
#define PLEDGE_SETTIME	0x0000000000002000ULL	/* able to set/adj time/freq */
#define PLEDGE_FATTR	0x0000000000004000ULL	/* allow explicit file st_* mods */
#define PLEDGE_PROTEXEC	0x0000000000008000ULL	/* allow use of PROT_EXEC */
#define PLEDGE_TTY	0x0000000000010000ULL	/* tty setting */
#define PLEDGE_SENDFD	0x0000000000020000ULL	/* AF_UNIX CMSG fd sending */
#define PLEDGE_RECVFD	0x0000000000040000ULL	/* AF_UNIX CMSG fd receiving */
#define PLEDGE_EXEC	0x0000000000080000ULL	/* execve, child is free of pledge */
#define PLEDGE_ROUTE	0x0000000000100000ULL	/* routing lookups */
#define PLEDGE_MCAST	0x0000000000200000ULL	/* multicast joins */
#define PLEDGE_VMINFO	0x0000000000400000ULL	/* vminfo listings */
#define PLEDGE_PS	0x0000000000800000ULL	/* ps listings */
#define PLEDGE_DISKLABEL 0x0000000002000000ULL	/* disklabels */
#define PLEDGE_PF	0x0000000004000000ULL	/* pf ioctls */
#define PLEDGE_AUDIO	0x0000000008000000ULL	/* audio ioctls */
#define PLEDGE_DPATH	0x0000000010000000ULL	/* mknod & mkfifo */
#define PLEDGE_DRM	0x0000000020000000ULL	/* drm ioctls */
#define PLEDGE_VMM	0x0000000040000000ULL	/* vmm ioctls */
#define PLEDGE_CHOWN	0x0000000080000000ULL	/* chown(2) family */
#define PLEDGE_CHOWNUID	0x0000000100000000ULL	/* allow owner/group changes */
#define PLEDGE_BPF	0x0000000200000000ULL	/* bpf ioctl */
#define PLEDGE_ERROR	0x0000000400000000ULL	/* ENOSYS instead of kill */
#define PLEDGE_WROUTE	0x0000000800000000ULL	/* interface address ioctls */
#define PLEDGE_UNVEIL	0x0000001000000000ULL	/* allow unveil() */

/*
 * Bits outside PLEDGE_USERSET are used by the kernel itself
 * to track program behaviours which have been observed.
 */
#define PLEDGE_USERSET	0x0fffffffffffffffULL
#define PLEDGE_STATLIE	0x4000000000000000ULL
#define PLEDGE_YPACTIVE	0x8000000000000000ULL	/* YP use detected and allowed */
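
Each of the values above is a single bit, and PLEDGE_USERSET is a mask that separates the bits a program can request from the ones the kernel keeps for itself. The short Python sketch below checks this using a few constants copied from the header; bit_index and user_settable are hypothetical helper names used only for this illustration.

```python
# A few values copied from the header listing above.
PLEDGE_STDIO    = 0x0000000000000008
PLEDGE_UNVEIL   = 0x0000001000000000
PLEDGE_USERSET  = 0x0fffffffffffffff
PLEDGE_STATLIE  = 0x4000000000000000
PLEDGE_YPACTIVE = 0x8000000000000000


def bit_index(flag):
    """Return the position of the single set bit in `flag`."""
    if flag == 0 or (flag & (flag - 1)) != 0:
        raise ValueError("expected exactly one bit set")
    return flag.bit_length() - 1


def user_settable(flag):
    """True if every bit of `flag` lies inside PLEDGE_USERSET."""
    return (flag & PLEDGE_USERSET) == flag


for name, value in [("PLEDGE_STDIO", PLEDGE_STDIO),
                    ("PLEDGE_UNVEIL", PLEDGE_UNVEIL),
                    ("PLEDGE_STATLIE", PLEDGE_STATLIE),
                    ("PLEDGE_YPACTIVE", PLEDGE_YPACTIVE)]:
    print(name, "bit", bit_index(value), "user settable:", user_settable(value))
```

PLEDGE_STDIO is bit 3 and PLEDGE_UNVEIL is bit 36, both inside the user set, while PLEDGE_STATLIE (bit 62) and PLEDGE_YPACTIVE (bit 63) fall outside it.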

The PLEDGE_* values for all the promises passed to pledge(2) from user space are then combined with a bitwise OR ("|"). The following pseudocode illustrates the idea:

flags = 0
for name in ["stdio", "inet", "proc", "dns", "route"]:
    flags |= pledgereq[name]    # OR in the PLEDGE_* value of each promise

ps_pledge = flags

The following Python script imitates how the kernel calculates the pledge bits (the pledge value). It is only for demonstration and for understanding the concept better:

test@openbsd:~$ cat pledge_python.py
import sys

PLEDGE_ALWAYS    =  0xffffffffffffffff  #/* pledge always */
PLEDGE_RPATH     =  0x0000000000000001  #/* allow open for read */
PLEDGE_WPATH     =  0x0000000000000002  #/* allow open for write */
PLEDGE_CPATH     =  0x0000000000000004  #/* allow creat, mkdir, unlink etc */
PLEDGE_STDIO     =  0x0000000000000008  #/* operate on own pid */
PLEDGE_TMPPATH   =  0x0000000000000010  #/* for mk*temp() */
PLEDGE_DNS       =  0x0000000000000020  # /* DNS services */
PLEDGE_INET      =  0x0000000000000040  # /* AF_INET/AF_INET6 sockets */
PLEDGE_FLOCK     =  0x0000000000000080  # /* file locking */
PLEDGE_UNIX      =  0x0000000000000100  # /* AF_UNIX sockets */
PLEDGE_ID        =  0x0000000000000200  # /* allow setuid, setgid, etc */
PLEDGE_TAPE      =  0x0000000000000400  # /* Tape ioctl */
PLEDGE_GETPW     =  0x0000000000000800  # /* YP enables if ypbind.lock */
PLEDGE_PROC      =  0x0000000000001000  # /* fork, waitpid, etc */
PLEDGE_SETTIME   =  0x0000000000002000  # /* able to set/adj time/freq */
PLEDGE_FATTR     =  0x0000000000004000  # /* allow explicit file st_* mods */
PLEDGE_PROTEXEC  =  0x0000000000008000  # /* allow use of PROT_EXEC */
PLEDGE_TTY       =  0x0000000000010000  # /* tty setting */
PLEDGE_SENDFD    =  0x0000000000020000  # /* AF_UNIX CMSG fd sending */
PLEDGE_RECVFD    =  0x0000000000040000  # /* AF_UNIX CMSG fd receiving */
PLEDGE_EXEC      =  0x0000000000080000  # /* execve, child is free of pledge */
PLEDGE_ROUTE     =  0x0000000000100000  # /* routing lookups */
PLEDGE_MCAST     =  0x0000000000200000  # /* multicast joins */
PLEDGE_VMINFO    =  0x0000000000400000  # /* vminfo listings */
PLEDGE_PS        =  0x0000000000800000  # /* ps listings */
PLEDGE_DISKLABEL =  0x0000000002000000  #/* disklabels */
PLEDGE_PF        =  0x0000000004000000  # /* pf ioctls */
PLEDGE_AUDIO     =  0x0000000008000000  # /* audio ioctls */
PLEDGE_DPATH     =  0x0000000010000000  # /* mknod & mkfifo */
PLEDGE_DRM       =  0x0000000020000000  # /* drm ioctls */
PLEDGE_VMM       =  0x0000000040000000  # /* vmm ioctls */
PLEDGE_CHOWN     =  0x0000000080000000  # /* chown(2) family */
PLEDGE_CHOWNUID  =  0x0000000100000000  # /* allow owner/group changes */
PLEDGE_BPF       =  0x0000000200000000  # /* bpf ioctl */
PLEDGE_ERROR     =  0x0000000400000000  # /* ENOSYS instead of kill */

pledgereq = {   "audio"     :  PLEDGE_AUDIO,
                "bpf"       :  PLEDGE_BPF,
                "chown"     :  PLEDGE_CHOWN | PLEDGE_CHOWNUID,
                "cpath"     :  PLEDGE_CPATH,
                "disklabel" :  PLEDGE_DISKLABEL,
                "dns"       :  PLEDGE_DNS,
                "dpath"     :  PLEDGE_DPATH,
                "drm"       :  PLEDGE_DRM,
                "exec"      :  PLEDGE_EXEC,
                "fattr"     :  PLEDGE_FATTR | PLEDGE_CHOWN,
                "flock"     :  PLEDGE_FLOCK,
                "getpw"     :  PLEDGE_GETPW,
                "id"        :  PLEDGE_ID,
                "inet"      :  PLEDGE_INET,
                "mcast"     :  PLEDGE_MCAST,
                "pf"        :  PLEDGE_PF,
                "proc"      :  PLEDGE_PROC,
                "prot_exec" :  PLEDGE_PROTEXEC,
                "ps"        :  PLEDGE_PS,
                "recvfd"    :  PLEDGE_RECVFD,
                "route"     :  PLEDGE_ROUTE,
                "rpath"     :  PLEDGE_RPATH,
                "sendfd"    :  PLEDGE_SENDFD,
                "settime"   :  PLEDGE_SETTIME,
                "stdio"     :  PLEDGE_STDIO,
                "tape"      :  PLEDGE_TAPE,
                "tmppath"   :  PLEDGE_TMPPATH,
                "tty"       :  PLEDGE_TTY,
                "unix"      :  PLEDGE_UNIX,
                "vminfo"    :  PLEDGE_VMINFO,
                "vmm"       :  PLEDGE_VMM,
                "wpath"     :  PLEDGE_WPATH,
            }

def sys_pledge(promises, path):
    """Toy model: OR together the flags of every promise in the string."""
    flags = 0
    # Demo simplification: treat an empty promise string as an immediate abort.
    if len(promises) == 0:
        print("ABRT (SIGABRT)")
        sys.exit(1)
    for perm in promises.split():
        try:
            flags |= pledgereq[perm]
        except KeyError as e:
            # The promise name is not in the pledgereq table.
            print(str(e) + ": Undefined promise(s) made")
            sys.exit(1)
    return flags

if __name__ == '__main__':

    # The second argument (paths) is unused in this demonstration.
    pledge_bits = sys_pledge(sys.argv[1], None)

    print("pledge_bits :" + hex(pledge_bits))

Output:

test@openbsd:~$ python pledge_python.py "stdio"
pledge_bits :0x8
test@openbsd:~$
test@openbsd:~$ python pledge_python.py "stdio inet proc route dns"
pledge_bits :0x101068
test@openbsd:~$
test@openbsd:~$ python pledge_python.py "stdio abcd"
'abcd': Undefined promise(s) made
test@openbsd:~$
test@openbsd:~$ python pledge_python.py ""
ABRT (SIGABRT)
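
As a cross-check of the 0x101068 result above, the following Python sketch goes in both directions: it ORs the five values together and then decodes the combined value back into promise names. PLEDGE_BITS, encode and decode are hypothetical names for this illustration; the hex values are the ones from the header listing.

```python
# Values for the five promises used in the example run above.
PLEDGE_BITS = {
    "stdio": 0x0000000000000008,
    "dns":   0x0000000000000020,
    "inet":  0x0000000000000040,
    "proc":  0x0000000000001000,
    "route": 0x0000000000100000,
}


def encode(promises):
    """OR together the bit of every space-separated promise name."""
    flags = 0
    for name in promises.split():
        flags |= PLEDGE_BITS[name]
    return flags


def decode(flags):
    """Return the sorted promise names whose bit is set in `flags`."""
    return sorted(name for name, bit in PLEDGE_BITS.items() if flags & bit)


bits = encode("stdio inet proc route dns")
print(hex(bits))      # 0x8 | 0x40 | 0x1000 | 0x100000 | 0x20
print(decode(bits))
```

Because every promise owns a distinct bit, the order of the words in the promise string does not matter, and repeating a word changes nothing.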

There are many more features and internals still to cover, which I will discuss in a future post.

In the meantime, I encourage everyone to read the user-space and kernel code of pledge(2), especially sys/kern/kern_pledge.c, to get a better understanding of the internals.

One interesting property of pledge(2) is that it checks that a process never increases its pledge flags once it has been pledged. Promises can only be reduced.
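
The reduce-only rule can be expressed as a single bit test. The sketch below is a simplified model, not the kernel code: repledge is a hypothetical function name, and it assumes that a request containing any bit outside the current set is refused with EPERM, the error the pledge(2) man page documents for an attempt to increase permissions.

```python
import errno

PLEDGE_STDIO = 0x0000000000000008
PLEDGE_INET  = 0x0000000000000040


def repledge(current, requested):
    """Model a second pledge call on an already pledged process.

    Returns (new_flags, 0) on success, or (current, errno.EPERM) if the
    request contains any bit that is not already in `current`.
    """
    if requested & ~current:
        return current, errno.EPERM
    return requested, 0


start = PLEDGE_STDIO | PLEDGE_INET
narrowed, err1 = repledge(start, PLEDGE_STDIO)   # dropping "inet" is allowed
widened, err2 = repledge(narrowed, start)        # asking for "inet" again is not
print(hex(narrowed), err1)
print(hex(widened), err2 == errno.EPERM)
```

Narrowing from "stdio inet" to "stdio" succeeds, and the later attempt to get "inet" back is refused while the flags stay at 0x8.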

Finally!!

If something is missing or incorrect, please feel free to update.