The `/dev/user' special file
(see section Special File Names in gawk)
provides access to the current user's real and effective user and group id
numbers, and if available, the user's supplementary group set.
However, since these are numbers, they do not provide very useful
information to the average user. There needs to be some way to find the
user information associated with the user and group numbers. This
section presents a suite of functions for retrieving information from the
user database. See section Reading the Group Database
for a similar suite that retrieves information from the group database.
The POSIX standard does not define the file where user information is
kept. Instead, it provides the <pwd.h> header file
and several C language subroutines for obtaining user information.
The primary function is getpwent, for "get password entry."
The "password" comes from the original user database file,
`/etc/passwd', which kept user information, along with the
encrypted passwords (hence the name).
While an awk program could simply read `/etc/passwd' directly
(the format is well known), this file may not contain complete
information about the system's set of users, because of the way password
files are handled on networked systems.
To be sure of producing a readable, complete version of the user
database, it is necessary to write a small C program that calls getpwent.
getpwent is defined to return a pointer to a struct passwd.
Each time it is called, it returns the next entry in the database.
When there are no more entries, it returns NULL, the null pointer.
When this happens, the C program should call endpwent to close the
database.
Here is pwcat, a C program that "cats" the password database.
/*
 * pwcat.c
 *
 * Generate a printable version of the password database
 *
 * Arnold Robbins
 * arnold@gnu.ai.mit.edu
 * May 1993
 * Public Domain
 */

#include <stdio.h>
#include <stdlib.h>     /* for exit() */
#include <pwd.h>

int
main(int argc, char **argv)
{
    struct passwd *p;

    /* getpwent() returns NULL when there are no more entries */
    while ((p = getpwent()) != NULL)
        printf("%s:%s:%ld:%ld:%s:%s:%s\n",
            p->pw_name, p->pw_passwd, (long) p->pw_uid,
            (long) p->pw_gid, p->pw_gecos, p->pw_dir, p->pw_shell);

    endpwent();     /* close the database */
    exit(0);
}
If you don't understand C, don't worry about it.
The output from pwcat is the user database, in the traditional
`/etc/passwd' format of colon-separated fields. The fields are:
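Login name
    The user's login name.
Encrypted password
    The user's encrypted password. This may not be available on some systems.
User-ID
    The user's numeric user-id number.
Group-ID
    The user's numeric group-id number.
Full name
    The user's full name, and perhaps other information associated with the user.
Home directory
    The user's login, or "home" directory (familiar to shell programmers as $HOME).
Login shell
    The program that is run when the user logs in. This is usually a shell.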
Here are a few lines representative of pwcat's output.
$ pwcat
-| root:3Ov02d5VaUPB6:0:1:Operator:/:/bin/sh
-| nobody:*:65534:65534::/:
-| daemon:*:1:1::/:
-| sys:*:2:2::/:/bin/csh
-| bin:*:3:3::/bin:
-| arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh
-| miriam:yxaay:112:10:Miriam Robbins:/home/miriam:/bin/sh
-| andy:abcca2:113:10:Andy Jacobs:/home/andy:/bin/sh
...
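Because every record uses the same colon-separated layout, a caller can take one apart with split. The following is a minimal sketch, not part of the library: it hard-codes one of the sample lines, and the variable names line, n, and f are chosen only for illustration.

```awk
BEGIN {
    # One record in the format that pwcat prints (copied from the sample output)
    line = "arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh"

    # split returns the number of fields; a complete record has seven
    n = split(line, f, ":")
    if (n != 7) {
        print "malformed record: " line
        exit 1
    }

    # f[1] = login name, f[3] = uid, f[4] = gid, f[6] = home directory
    printf "%s has uid %d, gid %d, and home directory %s\n", f[1], f[3], f[4], f[6]
}
```

Run on its own, this prints `arnold has uid 2076, gid 10, and home directory /home/arnold`.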
With that introduction, here is a group of functions for getting user information. There are several functions here, corresponding to the C functions of the same name.
# passwd.awk --- access password file information
# Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
# May 1993

BEGIN {
    # tailor this to suit your system
    _pw_awklib = "/usr/local/libexec/awk/"
}

function _pw_init(    oldfs, oldrs, olddol0, pwcat)
{
    if (_pw_inited)
        return
    oldfs = FS
    oldrs = RS
    olddol0 = $0
    FS = ":"
    RS = "\n"
    pwcat = _pw_awklib "pwcat"
    while ((pwcat | getline) > 0) {
        _pw_byname[$1] = $0
        _pw_byuid[$3] = $0
        _pw_bycount[++_pw_total] = $0
    }
    close(pwcat)
    _pw_count = 0
    _pw_inited = 1
    FS = oldfs
    RS = oldrs
    $0 = olddol0
}
The BEGIN rule sets a private variable to the directory where
pwcat is stored. Since it is used to help out an awk library
routine, we have chosen to put it in `/usr/local/libexec/awk'.
You might want it to be in a different directory on your system.
The function _pw_init keeps three copies of the user information
in three associative arrays. The arrays are indexed by user name
(_pw_byname), by user-id number (_pw_byuid), and by order of
occurrence (_pw_bycount).
The variable _pw_inited is used for efficiency; _pw_init only
needs to be called once.
Since this function uses getline to read information from
pwcat, it first saves the values of FS, RS, and
$0. Doing so is necessary, since these functions could be called
from anywhere within a user's program, and the user may have his or her
own values for FS and RS.
The main part of the function uses a loop to read database lines, split
the line into fields, and then store the line into each array as necessary.
When the loop is done, _pw_init cleans up by closing the pipeline,
setting _pw_inited to one, and restoring FS, RS, and
$0. The use of _pw_count will be explained below.
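The save-and-restore step can be seen in isolation in the following sketch. It is an illustration only: the function demo_first_field is hypothetical, and echo stands in for pwcat so that the example runs without the helper program installed.

```awk
# Hypothetical helper: return the first colon-separated field of the
# first line printed by cmd, leaving the caller's FS, RS, and $0 intact.
function demo_first_field(cmd,    oldfs, oldrs, olddol0, result)
{
    # Save the caller's state before the plain getline overwrites $0
    oldfs = FS
    oldrs = RS
    olddol0 = $0

    FS = ":"
    RS = "\n"
    if ((cmd | getline) > 0)
        result = $1
    close(cmd)

    # Restore FS and RS first, so that $0 is re-split with the caller's FS
    FS = oldfs
    RS = oldrs
    $0 = olddol0
    return result
}

BEGIN {
    $0 = "one two three"
    x = demo_first_field("echo a:b:c")
    print x, NF, $2      # prints: a 3 two
}
```

The caller's record still has three whitespace-separated fields after the call, even though the function read a line with a different field separator.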
function getpwnam(name)
{
    _pw_init()
    if (name in _pw_byname)
        return _pw_byname[name]
    return ""
}
The getpwnam function takes a user name as a string argument. If that
user is in the database, it returns the appropriate line. Otherwise it
returns the null string.
function getpwuid(uid)
{
    _pw_init()
    if (uid in _pw_byuid)
        return _pw_byuid[uid]
    return ""
}
Similarly,
the getpwuid function takes a user-id number argument. If that
user number is in the database, it returns the appropriate line. Otherwise it
returns the null string.
function getpwent()
{
    _pw_init()
    if (_pw_count < _pw_total)
        return _pw_bycount[++_pw_count]
    return ""
}
The getpwent function simply steps through the database, one entry at
a time. It uses _pw_count to track its current position in the
_pw_bycount array.
function endpwent()
{
    _pw_count = 0
}
The endpwent function resets _pw_count to zero, so that
subsequent calls to getpwent will start over again.
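The scan-and-reset behavior can be tried without pwcat using the sketch below. It is a stand-in under stated assumptions, not the library itself: every name beginning with demo_ or _demo_ is hypothetical, and demo_init loads three of the sample records directly instead of running pwcat.

```awk
# Hypothetical stand-in for _pw_init: loads three sample records
function demo_init()
{
    if (_demo_inited)
        return
    _demo_bycount[1] = "root:3Ov02d5VaUPB6:0:1:Operator:/:/bin/sh"
    _demo_bycount[2] = "daemon:*:1:1::/:"
    _demo_bycount[3] = "arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh"
    _demo_total = 3
    _demo_count = 0
    _demo_inited = 1
}

# Same logic as getpwent: hand out the next entry, or "" at the end
function demo_getpwent()
{
    demo_init()
    if (_demo_count < _demo_total)
        return _demo_bycount[++_demo_count]
    return ""
}

# Same logic as endpwent: move the cursor back to the start
function demo_endpwent()
{
    _demo_count = 0
}

BEGIN {
    # First pass: collect every login name
    while ((entry = demo_getpwent()) != "") {
        split(entry, f, ":")
        names = names " " f[1]
    }
    print "first pass:" names

    # Without a reset, the cursor stays at the end of the database
    print "after end: [" demo_getpwent() "]"

    # After a reset, the scan starts over from the first entry
    demo_endpwent()
    split(demo_getpwent(), f, ":")
    print "after reset: " f[1]
}
```

This prints `first pass: root daemon arnold`, then `after end: []`, then `after reset: root`. With the real library loaded, the same loop would be written as `while ((entry = getpwent()) != "")`.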
A conscious design decision in this suite is that each subroutine calls
_pw_init to initialize the database arrays. The overhead of running
a separate process to generate the user database, and the I/O to scan it,
will only be incurred if the user's main program actually calls one of these
functions. If this library file is loaded along with a user's program, but
none of the routines are ever called, then there is no extra run-time overhead.
(The alternative would be to move the body of _pw_init into a
BEGIN rule, which would always run pwcat. This simplifies the
code but runs an extra process that may never be needed.)
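The effect of this decision can be sketched with a counter in place of the expensive step. The names below (_demo_init, _demo_loads, _demo_table, demo_uid) are hypothetical and exist only for this illustration; incrementing _demo_loads stands in for running pwcat.

```awk
# Hypothetical initializer: does its work at most once
function _demo_init()
{
    if (_demo_inited)
        return
    _demo_loads++               # stands in for the costly pwcat run
    _demo_table["root"] = 0
    _demo_inited = 1
}

# Hypothetical lookup: initializes on demand, as the library routines do
function demo_uid(name)
{
    _demo_init()
    if (name in _demo_table)
        return _demo_table[name]
    return -1
}

BEGIN {
    # Nothing has been loaded yet, because no routine has been called
    print "loads before any call:", _demo_loads + 0

    demo_uid("root")
    demo_uid("nobody")

    # Two calls, but the load step ran only once
    print "loads after two calls:", _demo_loads
}
```

The first line of output reports 0 loads and the second reports 1, which is the behavior the design aims for: no cost until the first call, and no repeated cost afterward.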
In turn, calling _pw_init is not too expensive, since the
_pw_inited variable keeps the program from reading the data more than
once. If you are worried about squeezing every last cycle out of your
awk program, the check of _pw_inited could be moved out of
_pw_init and duplicated in all the other functions. In practice,
this is not necessary, since most awk programs are I/O bound, and it
would clutter up the code.
The id program in section Printing Out User Information
uses these functions.