| RUMPUSER(3) | Library Functions Manual | RUMPUSER(3) |
rumpuser — rump
kernel hypercall interface
rump User Library (librumpuser, -lrumpuser)
#include <rump/rumpuser.h>
The rumpuser hypercall interfaces allow a
rump kernel to access host resources. A hypervisor implementation must
implement the routines described in this document to allow a rump kernel to
run on the host. The implementation included in
NetBSD is for POSIX-like hosts (*BSD, Linux, etc.).
This document is divided into sections based on the functionality group of
each hypercall.
Since the hypercall interface is a C function interface, both the rump kernel and the hypervisor must conform to the same ABI. The interface itself attempts to assume as little as possible about the type systems: for example, off_t is passed as int64_t and enums are passed as ints. It is recommended that the hypervisor convert these to the native types before starting to process the hypercall, for example by assigning the ints back to enums.
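The recommended conversion from int back to a native enum might look like the following minimal sketch. The enum and function names are hypothetical stand-ins chosen for illustration; the real enumerations are defined in <rump/rumpuser.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical native enum; the real one comes from <rump/rumpuser.h>. */
enum sketch_rwlock {
	SKETCH_RW_READER,
	SKETCH_RW_WRITER
};

/*
 * Convert the int received over the hypercall boundary into the native
 * enum before doing any real work. Unknown values are rejected.
 */
int
sketch_rwlock_from_int(int raw, enum sketch_rwlock *out)
{
	assert(out != NULL);

	switch (raw) {
	case SKETCH_RW_READER:
	case SKETCH_RW_WRITER:
		*out = (enum sketch_rwlock)raw;
		return 0;
	default:
		return EINVAL;
	}
}
```

Doing the conversion once at the entry point lets the rest of the hypercall use the enum type, so the compiler can warn about unhandled cases.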
A hypercall is always entered with the calling thread scheduled in the rump kernel. In case the hypercall intends to block while waiting for an event, the hypervisor must first release the rump kernel scheduling context. In other words, the rump kernel context is a resource and holding on to it while waiting for a rump kernel event/resource may lead to a deadlock. Even when there is no possibility of deadlock in the strict sense of the term, holding on to the rump kernel context while performing a slow hypercall such as reading a device will prevent other threads (including the clock interrupt) from using that rump kernel context.
Releasing the context is done by calling the hyp_backend_unschedule() upcall, which the hypervisor received from the rump kernel as a parameter for rumpuser_init().
Before a hypercall returns back to the rump kernel, the returning thread
must carry a rump kernel context. In case the hypercall unscheduled itself,
it must reschedule itself by calling
hyp_backend_schedule().
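The release and reacquire pattern described above can be sketched as follows. This is only an illustration: struct sketch_hyperup and every sketch_ function are hypothetical, and the upcalls are simplified to take no arguments, which is not the real signature of the members of struct rump_hyperup.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for the upcall table received in
 * rumpuser_init(). The real upcalls take arguments, omitted here.
 */
struct sketch_hyperup {
	void (*unschedule)(void);	/* stands in for hyp_backend_unschedule() */
	void (*schedule)(void);		/* stands in for hyp_backend_schedule() */
};

static struct sketch_hyperup sketch_hyp;

/* Remember the upcalls handed over at initialization time. */
void
sketch_init(const struct sketch_hyperup *hyp)
{
	assert(hyp != NULL && hyp->unschedule != NULL && hyp->schedule != NULL);
	sketch_hyp = *hyp;
}

/* Run a possibly blocking host operation without holding the context. */
int
sketch_blocking_call(int (*op)(void *), void *arg)
{
	int rv;

	sketch_hyp.unschedule();	/* release the context before blocking */
	rv = op(arg);			/* may block on the host */
	sketch_hyp.schedule();		/* reacquire it before returning */
	return rv;
}

/* Example upcalls and operation that only record state, for illustration. */
static int sketch_has_context = 1;
static void sketch_unsched(void) { sketch_has_context = 0; }
static void sketch_sched(void) { sketch_has_context = 1; }
static int
sketch_slow_op(void *arg)
{
	*(int *)arg = sketch_has_context;	/* observe state while "blocked" */
	return 0;
}
```

While sketch_slow_op() runs, the recorded state shows that the context is not held, and it is held again by the time sketch_blocking_call() returns.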
int
rumpuser_init(int version, struct rump_hyperup *hyp)
Initialize the hypervisor.
int
rumpuser_malloc(size_t len, int alignment, void **memp)
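On a POSIX-like host, an allocator with this shape could be built on posix_memalign(3), as in the minimal sketch below. The sketch_ names are hypothetical, and the code assumes that alignment is either zero or a power of two; it is not presented as the implementation shipped with NetBSD.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate len bytes aligned to alignment; returns 0 or an errno value. */
int
sketch_malloc(size_t len, int alignment, void **memp)
{
	void *mem = NULL;
	int rv;

	assert(memp != NULL);

	/* posix_memalign() needs at least pointer-sized alignment. */
	if (alignment < (int)sizeof(void *))
		alignment = (int)sizeof(void *);

	rv = posix_memalign(&mem, (size_t)alignment, len);
	if (rv != 0)
		return rv;	/* already an errno-style value */

	*memp = mem;
	return 0;
}

/* Release memory obtained from sketch_malloc(). */
void
sketch_free(void *mem, size_t len)
{
	(void)len;	/* free(3) does not need the length */
	free(mem);
}
```

The length passed to the free routine is unused here, but a hypervisor with a size-tracking allocator could rely on it.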
void
rumpuser_free(void *mem, size_t len)
The len argument is always equal to the amount the caller requested from the rumpuser_malloc() which returned mem.
int
rumpuser_open(const char *name, int mode, int *fdp)
Open name for I/O and associate a file descriptor with it. Notably, no mapping between name and the host's file system namespace is required. For example, it is possible to associate the file descriptor with device I/O registers for special values of name.
RUMPUSER_OPEN_RDONLY
RUMPUSER_OPEN_WRONLY
RUMPUSER_OPEN_RDWR
RUMPUSER_OPEN_CREATE
RUMPUSER_OPEN_EXCL: combined with RUMPUSER_OPEN_CREATE, flag an error if name already exists
RUMPUSER_OPEN_BIO
int
rumpuser_close(int fd)
Close a previously opened file descriptor.
int
rumpuser_getfileinfo(const char *name, uint64_t *size, int *type)
The namespace of name is the same as for rumpuser_open().
If size is non-NULL, size of the file is returned here.
If type is non-NULL, type of the file is returned here. The options are RUMPUSER_FT_DIR, RUMPUSER_FT_REG, RUMPUSER_FT_BLK, RUMPUSER_FT_CHR, or RUMPUSER_FT_OTHER for directory, regular file, block device, character device or unknown, respectively.
void
rumpuser_bio(int fd, int op, void *data, size_t dlen, int64_t off, rump_biodone_fn biodone, void *donearg)
Initiate block I/O and return immediately.
The file descriptor fd must have been opened with RUMPUSER_OPEN_BIO.
In op, transfer data from the file descriptor with RUMPUSER_BIO_READ and transfer data to the file descriptor with RUMPUSER_BIO_WRITE. Unless RUMPUSER_BIO_SYNC is specified, the hypervisor may cache a write instead of committing it to permanent storage.
int
rumpuser_iovread(int fd, struct rumpuser_iovec *ruiov, size_t iovlen, int64_t off, size_t *retv)
int
rumpuser_iovwrite(int fd, struct rumpuser_iovec *ruiov, size_t iovlen, int64_t off, size_t *retv)
These routines perform scatter-gather I/O
which is not block I/O by nature and therefore cannot be handled by
rumpuser_bio().
struct rumpuser_iovec {
	void *iov_base;
	size_t iov_len;
};
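A read into such a vector could be sketched as below. The sketch uses a portable loop over read(2) and pread(2) rather than a single vectored system call, so it does not provide the atomicity a vectored call would. The struct, the function, and in particular the numeric value chosen for SKETCH_IOV_NOSEEK are hypothetical stand-ins; the real definitions live in <rump/rumpuser.h>.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-ins for the definitions in <rump/rumpuser.h>. */
struct sketch_iovec {
	void *iov_base;
	size_t iov_len;
};
#define SKETCH_IOV_NOSEEK (-1)	/* assumed sentinel value */

/* Fill the buffers in order; *retv receives the total bytes read. */
int
sketch_iovread(int fd, struct sketch_iovec *ruiov, size_t iovlen,
    int64_t off, size_t *retv)
{
	size_t total = 0;
	size_t i;

	assert(ruiov != NULL && retv != NULL);

	for (i = 0; i < iovlen; i++) {
		ssize_t n;

		if (off == SKETCH_IOV_NOSEEK) {
			/* Use and advance the descriptor's own offset. */
			n = read(fd, ruiov[i].iov_base, ruiov[i].iov_len);
		} else {
			/* Explicit offset; descriptor offset is untouched. */
			n = pread(fd, ruiov[i].iov_base, ruiov[i].iov_len,
			    (off_t)(off + (int64_t)total));
		}
		if (n == -1)
			return errno;
		total += (size_t)n;
		if ((size_t)n < ruiov[i].iov_len)
			break;	/* short read or end of file */
	}

	*retv = total;
	return 0;
}
```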
The offset off is either an explicit offset or RUMPUSER_IOV_NOSEEK. The latter denotes that no attempt to change the underlying object's offset should be made. Using both types of offsets on a single instance of fd results in undefined behavior.
int
rumpuser_syncfd(int fd, int flags, uint64_t start, uint64_t len)
Synchronizes fd with respect to backing storage. The other arguments are:
RUMPUSER_SYNCFD_READ
RUMPUSER_SYNCFD_WRITE
The following additional parameters may be passed in flags:
RUMPUSER_SYNCFD_BARRIER
RUMPUSER_SYNCFD_SYNC
The hypervisor should support two clocks, one for wall time and one for monotonically increasing time, the latter of which may be based on some arbitrary time (e.g. system boot time). If this is not possible, the hypervisor must make a reasonable effort to retain semantics.
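On a POSIX-like host the two clocks map naturally onto CLOCK_REALTIME and CLOCK_MONOTONIC. The following minimal sketch shows one way to read them; the enum and function names are hypothetical stand-ins for the identifiers in <rump/rumpuser.h>.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the clock selector enum. */
enum sketch_clock {
	SKETCH_CLOCK_RELWALL,	/* wall time */
	SKETCH_CLOCK_ABSMONO	/* monotonic time */
};

/* Read the selected clock; returns 0 or an errno value. */
int
sketch_clock_gettime(int enum_clock, int64_t *sec, long *nsec)
{
	struct timespec ts;
	clockid_t clk;

	assert(sec != NULL && nsec != NULL);

	switch ((enum sketch_clock)enum_clock) {
	case SKETCH_CLOCK_RELWALL:
		clk = CLOCK_REALTIME;
		break;
	case SKETCH_CLOCK_ABSMONO:
		clk = CLOCK_MONOTONIC;
		break;
	default:
		return EINVAL;
	}

	if (clock_gettime(clk, &ts) == -1)
		return errno;

	/* Pass the result back in the fixed-width types of the interface. */
	*sec = (int64_t)ts.tv_sec;
	*nsec = ts.tv_nsec;
	return 0;
}
```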
int
rumpuser_clock_gettime(int enum_rumpclock, int64_t *sec, long *nsec)
In case of RUMPUSER_CLOCK_RELWALL the wall time should be returned. In case of RUMPUSER_CLOCK_ABSMONO the time of a monotonic clock should be returned.
int
rumpuser_clock_sleep(int enum_rumpclock, int64_t sec, long nsec)
In case of RUMPUSER_CLOCK_RELWALL, the sleep should last at least as long as specified. In case of RUMPUSER_CLOCK_ABSMONO, the sleep should last until the hypervisor monotonic clock hits the specified absolute time.
int
rumpuser_getparam(const char *name, void *buf, size_t buflen)
Retrieve a configuration parameter from the hypervisor. It is up to the hypervisor to decide how the parameters can be set.
The parameters include RUMPUSER_PARAM_NCPU, which specifies the number of virtual CPUs bootstrapped by the rump kernel, and RUMPUSER_PARAM_HOSTNAME, which returns a preferably unique instance name for the rump kernel.
void
rumpuser_exit(int value)
Terminate the rump kernel with exit value
value. If value is
RUMPUSER_PANIC the hypervisor should attempt to
provide something akin to a core dump.
Console output is divided into two routines: a per-character one and printf-like one. The former is used e.g. by the rump kernel's internal printf routine. The latter can be used for direct debug prints e.g. very early on in the rump kernel's bootstrap or when using the in-kernel routine causes too much skew in the debug print results (the hypercall runs outside of the rump kernel and therefore does not cause any locking or scheduling events inside the rump kernel).
void
rumpuser_putchar(int ch)
Output ch on the console.
void
rumpuser_dprintf(const char *fmt, ...)
Do output based on printf-like parameters.
A rump kernel should be able to send signals to client programs
because some standard interfaces include signal delivery in their
specifications. Examples of these interfaces include
setitimer(2) and
write(2). The
rumpuser_kill()
function advises the hypercall implementation to raise a signal for the
process containing the rump kernel.
int
rumpuser_kill(int64_t pid, int sig)
The value RUMPUSER_PID_SELF may also be specified for pid to indicate no hint. This value will be removed in a future version of the hypercall interface.
A rump kernel will ignore the return value of this hypercall. The only implication of not implementing rumpuser_kill() is that some application programs may not experience expected behavior for standard interfaces.
As an aside, the rump_sp(7) protocol provides equivalent functionality for remote clients.
int
rumpuser_getrandom(void *buf, size_t buflen, int flags, size_t *retp)
The valid flags are RUMPUSER_RANDOM_HARD (return true randomness instead of something from a PRNG) and RUMPUSER_RANDOM_NOWAIT (do not block in case the requested amount of bytes is not available).
int
rumpuser_thread_create(void *(*fun)(void *), void *arg, const char *thrname, int mustjoin, int priority, int cpuidx, void **cookie)
Create a schedulable host thread context. The rump kernel will call this interface when it creates a kernel thread. The scheduling policy for the new thread is defined by the hypervisor. In case the hypervisor wants to optimize the scheduling of the threads, it can perform heuristics on the thrname, priority and cpuidx parameters.
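If mustjoin is set, the thread will be waited for by rumpuser_thread_join() when the thread exits, and the value returned in cookie will be passed to rumpuser_thread_join().
void
rumpuser_thread_exit(void)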
Called when a thread created with
rumpuser_thread_create()
exits.
int
rumpuser_thread_join(void *cookie)
Wait for a joinable thread to exit.
The cookie matches the value from
rumpuser_thread_create().
void
rumpuser_curlwpop(int enum_rumplwpop, struct lwp *l)
Manipulate the hypervisor's thread context database. The possible operations are create, destroy, and set as specified by enum_rumplwpop:
RUMPUSER_LWP_CREATE
RUMPUSER_LWP_DESTROY
RUMPUSER_LWP_SET
RUMPUSER_LWP_CLEAR: clear the context previously set by RUMPUSER_LWP_SET. The value passed in l is the current thread and is never NULL.
struct lwp *
rumpuser_curlwp(void)
Retrieve the rump kernel thread context
associated with the current host thread, as set by
rumpuser_curlwpop().
This routine may be called when a context is not set and the routine must
return NULL in that case. This interface is expected
to be called very often. Any optimizations pertaining to the execution speed
of this routine should be done in
rumpuser_curlwpop().
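One way to keep the lookup cheap is to store the current context in thread-local storage, so that retrieving it is a single load and all bookkeeping happens in the manipulation routine. The sketch below assumes a C11 compiler with _Thread_local support; the enum and the sketch_ functions are hypothetical stand-ins, not the names from <rump/rumpuser.h>.

```c
#include <assert.h>
#include <stddef.h>

struct lwp;	/* opaque to the hypervisor */

/* Hypothetical stand-in for the operation selector enum. */
enum sketch_lwpop {
	SKETCH_LWP_CREATE,
	SKETCH_LWP_DESTROY,
	SKETCH_LWP_SET,
	SKETCH_LWP_CLEAR
};

/* Per host thread: the rump kernel thread context, or NULL if none. */
static _Thread_local struct lwp *sketch_current;

/* All bookkeeping is done here so that the lookup stays trivial. */
void
sketch_curlwpop(int op, struct lwp *l)
{
	switch ((enum sketch_lwpop)op) {
	case SKETCH_LWP_CREATE:
	case SKETCH_LWP_DESTROY:
		break;	/* nothing to track in this sketch */
	case SKETCH_LWP_SET:
		assert(l != NULL && sketch_current == NULL);
		sketch_current = l;
		break;
	case SKETCH_LWP_CLEAR:
		assert(sketch_current == l);
		sketch_current = NULL;
		break;
	}
}

/* Hot path: a single thread-local load. */
struct lwp *
sketch_curlwp(void)
{
	return sketch_current;
}
```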
void
rumpuser_seterrno(int errno)
Set an errno value in the calling thread's TLS. Note: this is used only if rump kernel clients make rump system calls.
The locking interfaces have standard semantics, so we will not discuss each one in detail. The data types struct rumpuser_mtx, struct rumpuser_rw and struct rumpuser_cv used by these interfaces are opaque to the rump kernel, i.e. the hypervisor has complete freedom over them.
Most of these interfaces will (and must) relinquish the rump kernel CPU context in case they block (or intend to block). The exceptions are the "nowrap" variants of the interfaces which may not relinquish rump kernel context.
void
rumpuser_mutex_init(struct rumpuser_mtx **mtxp, int flags)
void
rumpuser_mutex_enter(struct rumpuser_mtx *mtx)
void
rumpuser_mutex_enter_nowrap(struct rumpuser_mtx *mtx)
int
rumpuser_mutex_tryenter(struct rumpuser_mtx *mtx)
void
rumpuser_mutex_exit(struct rumpuser_mtx *mtx)
void
rumpuser_mutex_destroy(struct rumpuser_mtx *mtx)
void
rumpuser_mutex_owner(struct rumpuser_mtx *mtx, struct lwp **lp)
Mutexes provide mutually exclusive locking. The flags, of which at least one must be given, are as follows:
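RUMPUSER_MTX_SPIN: create a spin mutex. Locking this type of mutex must not relinquish the rump kernel context even when rumpuser_mutex_enter() is used.
RUMPUSER_MTX_KMUTEX: the mutex must track the rump kernel thread that owns it. If this flag is not specified, rumpuser_mutex_owner() will never be called for that particular mutex.
void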
rumpuser_rw_init(struct rumpuser_rw **rwp)
void
rumpuser_rw_enter(int enum_rumprwlock, struct rumpuser_rw *rw)
int
rumpuser_rw_tryenter(int enum_rumprwlock, struct rumpuser_rw *rw)
int
rumpuser_rw_tryupgrade(struct rumpuser_rw *rw)
void
rumpuser_rw_downgrade(struct rumpuser_rw *rw)
void
rumpuser_rw_exit(struct rumpuser_rw *rw)
void
rumpuser_rw_destroy(struct rumpuser_rw *rw)
void
rumpuser_rw_held(int enum_rumprwlock, struct rumpuser_rw *rw, int *heldp)
Read/write locks provide either shared or exclusive locking. The
possible values for enum_rumprwlock are
RUMPUSER_RW_READER and
RUMPUSER_RW_WRITER. Upgrading means trying to
migrate from an already owned shared lock to an exclusive lock and
downgrading means migrating from an already owned exclusive lock to a shared
lock.
void
rumpuser_cv_init(struct rumpuser_cv **cvp)
void
rumpuser_cv_destroy(struct rumpuser_cv *cv)
void
rumpuser_cv_wait(struct rumpuser_cv *cv, struct rumpuser_mtx *mtx)
void
rumpuser_cv_wait_nowrap(struct rumpuser_cv *cv, struct rumpuser_mtx *mtx)
int
rumpuser_cv_timedwait(struct rumpuser_cv *cv, struct rumpuser_mtx *mtx, int64_t sec, int64_t nsec)
void
rumpuser_cv_signal(struct rumpuser_cv *cv)
void
rumpuser_cv_broadcast(struct rumpuser_cv *cv)
void
rumpuser_cv_has_waiters(struct rumpuser_cv *cv, int *waitersp)
Condition variables wait for an event. The
mtx interlock eliminates a race between checking the
predicate and sleeping on the condition variable; the mutex should be
released for the duration of the sleep in the normal atomic manner. The
timedwait variant takes a specifier indicating a relative sleep duration
after which the routine will return with ETIMEDOUT.
If a timedwait is signaled before the timeout expires, the routine will
return 0.
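Host primitives such as POSIX pthread_cond_timedwait() expect an absolute deadline, so a hypervisor built on them has to convert the relative duration first. A minimal sketch of that arithmetic follows; sketch_deadline() is a hypothetical helper, and it assumes the caller passes a non-negative duration with nsec below one second.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000L

/*
 * Add a relative duration (sec, nsec) to the current time "now" and
 * store the normalized absolute deadline in *deadline.
 */
void
sketch_deadline(const struct timespec *now, int64_t sec, int64_t nsec,
    struct timespec *deadline)
{
	assert(now != NULL && deadline != NULL);
	assert(sec >= 0 && nsec >= 0 && nsec < SKETCH_NSEC_PER_SEC);

	deadline->tv_sec = now->tv_sec + (time_t)sec;
	deadline->tv_nsec = now->tv_nsec + (long)nsec;

	/* Carry at most one second; both inputs are below one second. */
	if (deadline->tv_nsec >= SKETCH_NSEC_PER_SEC) {
		deadline->tv_sec += 1;
		deadline->tv_nsec -= SKETCH_NSEC_PER_SEC;
	}
}
```

The value of now must come from the same clock that the host condition variable uses for its timeouts, otherwise the deadline is meaningless.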
The order in which the hypervisor reacquires the rump kernel
context and interlock mutex before returning into the rump kernel is as
follows. In case the interlock mutex was initialized with both
RUMPUSER_MTX_SPIN and
RUMPUSER_MTX_KMUTEX, the rump kernel context is
scheduled before the mutex is reacquired. In case of a purely
RUMPUSER_MTX_SPIN mutex, the mutex is acquired
first. In the final case the order is implementation-defined.
All routines which return an integer return an errno value. The hypervisor must translate the value to the native errno namespace used by the rump kernel. Routines which do not return an integer may never fail.
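On a host whose errno numbering differs from that of the rump kernel, the translation could be a small lookup such as the sketch below. It assumes that the rump kernel uses NetBSD errno numbers and that the host agrees with NetBSD on values 1 to 34 apart from the EDEADLK and EAGAIN pair, which holds for Linux. The function name, the handful of codes covered, and the fallback to 22 (EINVAL) are choices made for this illustration only.

```c
#include <assert.h>
#include <errno.h>

/* Translate a host errno value into the NetBSD errno namespace. */
int
sketch_errno_to_netbsd(int host_error)
{
	assert(host_error >= 0);

	switch (host_error) {
	case 0:
		return 0;	/* success is the same everywhere */
	case EDEADLK:
		return 11;	/* NetBSD EDEADLK */
	case EAGAIN:
		return 35;	/* NetBSD EAGAIN */
	case ETIMEDOUT:
		return 60;	/* NetBSD ETIMEDOUT */
	case ENAMETOOLONG:
		return 63;	/* NetBSD ENAMETOOLONG */
	case ENOSYS:
		return 78;	/* NetBSD ENOSYS */
	default:
		break;
	}

	/* Assumed shared range: traditional low-numbered errno values. */
	if (host_error <= 34)
		return host_error;

	/* Anything this sketch does not know is reported as EINVAL. */
	return 22;
}
```

A complete table would cover every errno value the host can produce; the switch on symbolic host constants keeps the mapping correct regardless of the host's own numbering.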
Antti Kantee, Flexible Operating System Internals: The Design and Implementation of the Anykernel and Rump Kernels, Aalto University Doctoral Dissertations, 2012, Section 2.3.2: The Hypercall Interface.
For a list of all known implementations of the
rumpuser interface, see
https://github.com/rumpkernel/wiki/wiki/Platforms.
The rump kernel hypercall API was first introduced in NetBSD 5.0. The API described above first appeared in NetBSD 7.0.
| July 15, 2023 | NetBSD 11.0 |