blob: 1214e70f7ca7cd45f56a89e0c15b7734991e7ad9 [file] [log] [blame]
Rob Landleyaaffef42006-01-22 01:44:29 +00001<!--#include file="header.html" -->
2
3<h2>Rob's notes on programming busybox.</h2>
4
5<ul>
6 <li><a href="#goals">What are the goals of busybox?</a></li>
7 <li><a href="#design">What is the design of busybox?</a></li>
8 <li><a href="#source">How is the source code organized?</a></li>
9 <ul>
10 <li><a href="#source_applets">The applet directories.</a></li>
11 <li><a href="#source_libbb">The busybox shared library (libbb)</a></li>
12 </ul>
13 <li><a href="#adding">Adding an applet to busybox</a></li>
14 <li><a href="#standards">What standards does busybox adhere to?</a></li>
Rob Landleyb1b3cee2006-01-29 06:29:01 +000015 <li><a href="#tips">Tips and tricks.</a></li>
16 <ul>
17 <li><a href="#tips_encrypted_passwords">Encrypted Passwords</a></li>
18 <li><a href="#tips_vfork">Fork and vfork</a></li>
Rob Landleyc29a0f32006-02-12 00:45:39 +000019 <li><a href="#tips_short_read">Short reads and writes</a></li>
Rob Landley4a8d9ef2006-04-10 17:54:23 +000020 <li><a href="#tips_memory">Memory used by relocatable code, PIC, and static linking.</a></li>
Rob Landleyb1b3cee2006-01-29 06:29:01 +000021 </ul>
Rob Landleyd1e38c02006-02-16 03:21:44 +000022 <li><a href="#who">Who are the BusyBox developers?</a></li>
Rob Landleyaaffef42006-01-22 01:44:29 +000023</ul>
24
25<h2><b><a name="goals" />What are the goals of busybox?</b></h2>
26
27<p>Busybox aims to be the smallest and simplest correct implementation of the
28standard Linux command line tools. First and foremost, this means the
29smallest executable size we can manage. We also want to have the simplest
30and cleanest implementation we can manage, be <a href="#standards">standards
31compliant</a>, minimize run-time memory usage (heap and stack), run fast, and
32take over the world.</p>
33
34<h2><b><a name="design" />What is the design of busybox?</b></h2>
35
36<p>Busybox is like a swiss army knife: one thing with many functions.
37The busybox executable can act like many different programs depending on
38the name used to invoke it. Normal practice is to create a bunch of symlinks
39pointing to the busybox binary, each of which triggers a different busybox
40function. (See <a href="FAQ.html#getting_started">getting started</a> in the
41FAQ for more information on usage, and <a href="BusyBox.html">the
42busybox documentation</a> for a list of symlink names and what they do.)
43
44<p>The "one binary to rule them all" approach is primarily for size reasons: a
45single multi-purpose executable is smaller then many small files could be.
46This way busybox only has one set of ELF headers, it can easily share code
47between different apps even when statically linked, it has better packing
48efficiency by avoding gaps between files or compression dictionary resets,
49and so on.</p>
50
51<p>Work is underway on new options such as "make standalone" to build separate
52binaries for each applet, and a "libbb.so" to make the busybox common code
53available as a shared library. Neither is ready yet at the time of this
54writing.</p>
55
56<a name="source" />
57
58<h2><a name="source_applets" /><b>The applet directories</b></h2>
59
60<p>The directory "applets" contains the busybox startup code (applets.c and
61busybox.c), and several subdirectories containing the code for the individual
62applets.</p>
63
64<p>Busybox execution starts with the main() function in applets/busybox.c,
65which sets the global variable bb_applet_name to argv[0] and calls
66run_applet_by_name() in applets/applets.c. That uses the applets[] array
67(defined in include/busybox.h and filled out in include/applets.h) to
68transfer control to the appropriate APPLET_main() function (such as
69cat_main() or sed_main()). The individual applet takes it from there.</p>
70
71<p>This is why calling busybox under a different name triggers different
72functionality: main() looks up argv[0] in applets[] to get a function pointer
73to APPLET_main().</p>
74
75<p>Busybox applets may also be invoked through the multiplexor applet
76"busybox" (see busybox_main() in applets/busybox.c), and through the
77standalone shell (grep for STANDALONE_SHELL in applets/shell/*.c).
78See <a href="FAQ.html#getting_started">getting started</a> in the
79FAQ for more information on these alternate usage mechanisms, which are
80just different ways to reach the relevant APPLET_main() function.</p>
81
82<p>The applet subdirectories (archival, console-tools, coreutils,
83debianutils, e2fsprogs, editors, findutils, init, loginutils, miscutils,
84modutils, networking, procps, shell, sysklogd, and util-linux) correspond
85to the configuration sub-menus in menuconfig. Each subdirectory contains the
86code to implement the applets in that sub-menu, as well as a Config.in
87file defining that configuration sub-menu (with dependencies and help text
88for each applet), and the makefile segment (Makefile.in) for that
89subdirectory.</p>
90
91<p>The run-time --help is stored in usage_messages[], which is initialized at
92the start of applets/applets.c and gets its help text from usage.h. During the
93build this help text is also used to generate the BusyBox documentation (in
94html, txt, and man page formats) in the docs directory. See
95<a href="#adding">adding an applet to busybox</a> for more
96information.</p>
97
98<h2><a name="source_libbb" /><b>libbb</b></h2>
99
100<p>Most non-setup code shared between busybox applets lives in the libbb
101directory. It's a mess that evolved over the years without much auditing
102or cleanup. For anybody looking for a great project to break into busybox
103development with, documenting libbb would be both incredibly useful and good
104experience.</p>
105
106<p>Common themes in libbb include allocation functions that test
107for failure and abort the program with an error message so the caller doesn't
108have to test the return value (xmalloc(), xstrdup(), etc), wrapped versions
109of open(), close(), read(), and write() that test for their own failures
110and/or retry automatically, linked list management functions (llist.c),
111command line argument parsing (getopt_ulflags.c), and a whole lot more.</p>
112
113<h2><a name="adding" /><b>Adding an applet to busybox</b></h2>
114
115<p>To add a new applet to busybox, first pick a name for the applet and
116a corresponding CONFIG_NAME. Then do this:</p>
117
118<ul>
119<li>Figure out where in the busybox source tree your applet best fits,
120and put your source code there. Be sure to use APPLET_main() instead
121of main(), where APPLET is the name of your applet.</li>
122
123<li>Add your applet to the relevant Config.in file (which file you add
124it to determines where it shows up in "make menuconfig"). This uses
125the same general format as the linux kernel's configuration system.</li>
126
127<li>Add your applet to the relevant Makefile.in file (in the same
128directory as the Config.in you chose), using the existing entries as a
129template and the same CONFIG symbol as you used for Config.in. (Don't
130forget "needlibm" or "needcrypt" if your applet needs libm or
131libcrypt.)</li>
132
133<li>Add your applet to "include/applets.h", using one of the existing
134entries as a template. (Note: this is in alphabetical order. Applets
135are found via binary search, and if you add an applet out of order it
136won't work.)</li>
137
138<li>Add your applet's runtime help text to "include/usage.h". You need
139at least appname_trivial_usage (the minimal help text, always included
140in the busybox binary when this applet is enabled) and appname_full_usage
141(extra help text included in the busybox binary with
142CONFIG_FEATURE_VERBOSE_USAGE is enabled), or it won't compile.
143The other two help entry types (appname_example_usage and
144appname_notes_usage) are optional. They don't take up space in the binary,
145but instead show up in the generated documentation (BusyBox.html,
146BusyBox.txt, and the man page BusyBox.1).</li>
147
148<li>Run menuconfig, switch your applet on, compile, test, and fix the
149bugs. Be sure to try both "allyesconfig" and "allnoconfig" (and
150"allbareconfig" if relevant).</li>
151
152</ul>
153
154<h2><a name="standards" />What standards does busybox adhere to?</a></h2>
155
156<p>The standard we're paying attention to is the "Shell and Utilities"
157portion of the <a href=http://www.opengroup.org/onlinepubs/009695399/>Open
158Group Base Standards</a> (also known as the Single Unix Specification version
1593 or SUSv3). Note that paying attention isn't necessarily the same thing as
160following it.</p>
161
162<p>SUSv3 doesn't even mention things like init, mount, tar, or losetup, nor
163commonly used options like echo's '-e' and '-n', or sed's '-i'. Busybox is
164driven by what real users actually need, not the fact the standard believes
165we should implement ed or sccs. For size reasons, we're unlikely to include
166much internationalization support beyond UTF-8, and on top of all that, our
167configuration menu lets developers chop out features to produce smaller but
168very non-standard utilities.</p>
169
170<p>Also, Busybox is aimed primarily at Linux. Unix standards are interesting
171because Linux tries to adhere to them, but portability to dozens of platforms
172is only interesting in terms of offering a restricted feature set that works
173everywhere, not growing dozens of platform-specific extensions. Busybox
174should be portable to all hardware platforms Linux supports, and any other
175similar operating systems that are easy to do and won't require much
176maintenance.</p>
177
178<p>In practice, standards compliance tends to be a clean-up step once an
179applet is otherwise finished. When polishing and testing a busybox applet,
180we ensure we have at least the option of full standards compliance, or else
181document where we (intentionally) fall short.</p>
182
Rob Landleyb1b3cee2006-01-29 06:29:01 +0000183<h2><a name="tips" />Programming tips and tricks.</a></h2>
184
185<p>Various things busybox uses that aren't particularly well documented
186elsewhere.</p>
187
188<h2><a name="tips_encrypted_passwords">Encrypted Passwords</a></h2>
189
190<p>Password fields in /etc/passwd and /etc/shadow are in a special format.
191If the first character isn't '$', then it's an old DES style password. If
192the first character is '$' then the password is actually three fields
193separated by '$' characters:</p>
194<pre>
195 <b>$type$salt$encrypted_password</b>
196</pre>
197
198<p>The "type" indicates which encryption algorithm to use: 1 for MD5 and 2 for SHA1.</p>
199
200<p>The "salt" is a bunch of ramdom characters (generally 8) the encryption
201algorithm uses to perturb the password in a known and reproducible way (such
202as by appending the random data to the unencrypted password, or combining
203them with exclusive or). Salt is randomly generated when setting a password,
204and then the same salt value is re-used when checking the password. (Salt is
205thus stored unencrypted.)</p>
206
207<p>The advantage of using salt is that the same cleartext password encrypted
208with a different salt value produces a different encrypted value.
209If each encrypted password uses a different salt value, an attacker is forced
210to do the cryptographic math all over again for each password they want to
211check. Without salt, they could simply produce a big dictionary of commonly
212used passwords ahead of time, and look up each password in a stolen password
213file to see if it's a known value. (Even if there are billions of possible
214passwords in the dictionary, checking each one is just a binary search against
215a file only a few gigabytes long.) With salt they can't even tell if two
216different users share the same password without guessing what that password
217is and decrypting it. They also can't precompute the attack dictionary for
218a specific password until they know what the salt value is.</p>
219
220<p>The third field is the encrypted password (plus the salt). For md5 this
221is 22 bytes.</p>
222
223<p>The busybox function to handle all this is pw_encrypt(clear, salt) in
224"libbb/pw_encrypt.c". The first argument is the clear text password to be
225encrypted, and the second is a string in "$type$salt$password" format, from
226which the "type" and "salt" fields will be extracted to produce an encrypted
227value. (Only the first two fields are needed, the third $ is equivalent to
228the end of the string.) The return value is an encrypted password in
229/etc/passwd format, with all three $ separated fields. It's stored in
230a static buffer, 128 bytes long.</p>
231
232<p>So when checking an existing password, if pw_encrypt(text,
233old_encrypted_password) returns a string that compares identical to
234old_encrypted_password, you've got the right password. When setting a new
235password, generate a random 8 character salt string, put it in the right
236format with sprintf(buffer, "$%c$%s", type, salt), and feed buffer as the
237second argument to pw_encrypt(text,buffer).</p>
238
239<h2><a name="tips_vfork">Fork and vfork</a></h2>
240
Rob Landleyb2183772006-02-23 19:59:34 +0000241<p>On systems that haven't got a Memory Management Unit, fork() is unreasonably
242expensive to implement (and sometimes even impossible), so a less capable
243function called vfork() is used instead. (Using vfork() on a system with an
244MMU is like pounding a nail with a wrench. Not the best tool for the job, but
245it works.)</p>
246
Rob Landley03628c82006-01-29 06:45:38 +0000247<p>Busybox hides the difference between fork() and vfork() in
248libbb/bb_fork_exec.c. If you ever want to fork and exec, use bb_fork_exec()
249(which returns a pid and takes the same arguments as execve(), although in
250this case envp can be NULL) and don't worry about it. This description is
251here in case you want to know why that does what it does.</p>
252
Rob Landleyb2183772006-02-23 19:59:34 +0000253<p>Implementing fork() depends on having a Memory Management Unit. With an
254MMU then you can simply set up a second set of page tables and share the
255physical memory via copy-on-write. So a fork() followed quickly by exec()
256only copies a few pages of the parent's memory, just the ones it changes
257before freeing them.</p>
Rob Landleyb1b3cee2006-01-29 06:29:01 +0000258
Rob Landleyb2183772006-02-23 19:59:34 +0000259<p>With a very primitive MMU (using a base pointer plus length instead of page
260tables, which can provide virtual addresses and protect processes from each
261other, but no copy on write) you can still implement fork. But it's
262unreasonably expensive, because you have to copy all the parent process's
263memory into the new process (which could easily be several megabytes per fork).
264And you have to do this even though that memory gets freed again as soon as the
265exec happens. (This is not just slow and a waste of space but causes memory
266usage spikes that can easily cause the system to run out of memory.)</p>
Rob Landleyb1b3cee2006-01-29 06:29:01 +0000267
Rob Landleyb2183772006-02-23 19:59:34 +0000268<p>Without even a primitive MMU, you have no virtual addresses. Every process
269can reach out and touch any other process's memory, because all pointers are to
270physical addresses with no protection. Even if you copy a process's memory to
271new physical addresses, all of its pointers point to the old objects in the
272old process. (Searching through the new copy's memory for pointers and
273redirect them to the new locations is not an easy problem.)</p>
274
275<p>So with a primitive or missing MMU, fork() is just not a good idea.</p>
Rob Landleyb1b3cee2006-01-29 06:29:01 +0000276
277<p>In theory, vfork() is just a fork() that writeably shares the heap and stack
278rather than copying it (so what one process writes the other one sees). In
279practice, vfork() has to suspend the parent process until the child does exec,
280at which point the parent wakes up and resumes by returning from the call to
281vfork(). All modern kernel/libc combinations implement vfork() to put the
282parent to sleep until the child does its exec. There's just no other way to
Rob Landleyb2183772006-02-23 19:59:34 +0000283make it work: the parent has to know the child has done its exec() or exit()
284before it's safe to return from the function it's in, so it has to block
285until that happens. In fact without suspending the parent there's no way to
286even store separate copies of the return value (the pid) from the vfork() call
Rob Landleyb1b3cee2006-01-29 06:29:01 +0000287itself: both assignments write into the same memory location.</p>
288
289<p>One way to understand (and in fact implement) vfork() is this: imagine
290the parent does a setjmp and then continues on (pretending to be the child)
291until the exec() comes around, then the _exec_ does the actual fork, and the
292parent does a longjmp back to the original vfork call and continues on from
293there. (It thus becomes obvious why the child can't return, or modify
294local variables it doesn't want the parent to see changed when it resumes.)
295
296<p>Note a common mistake: the need for vfork doesn't mean you can't have two
297processes running at the same time. It means you can't have two processes
298sharing the same memory without stomping all over each other. As soon as
299the child calls exec(), the parent resumes.</p>
300
Rob Landley03628c82006-01-29 06:45:38 +0000301<p>If the child's attempt to call exec() fails, the child should call _exit()
302rather than a normal exit(). This avoids any atexit() code that might confuse
303the parent. (The parent should never call _exit(), only a vforked child that
304failed to exec.)</p>
305
Rob Landleyb1b3cee2006-01-29 06:29:01 +0000306<p>(Now in theory, a nommu system could just copy the _stack_ when it forks
307(which presumably is much shorter than the heap), and leave the heap shared.
Rob Landleyb2183772006-02-23 19:59:34 +0000308Even with no MMU at all
Rob Landleyb1b3cee2006-01-29 06:29:01 +0000309In practice, you've just wound up in a multi-threaded situation and you can't
310do a malloc() or free() on your heap without freeing the other process's memory
311(and if you don't have the proper locking for being threaded, corrupting the
312heap if both of you try to do it at the same time and wind up stomping on
313each other while traversing the free memory lists). The thing about vfork is
314that it's a big red flag warning "there be dragons here" rather than
315something subtle and thus even more dangerous.)</p>
316
Rob Landleyc29a0f32006-02-12 00:45:39 +0000317<h2><a name="tips_sort_read">Short reads and writes</a></h2>
318
319<p>Busybox has special functions, bb_full_read() and bb_full_write(), to
320check that all the data we asked for got read or written. Is this a real
321world consideration? Try the following:</p>
322
323<pre>while true; do echo hello; sleep 1; done | tee out.txt</pre>
324
325<p>If tee is implemented with bb_full_read(), tee doesn't display output
326in real time but blocks until its entire input buffer (generally a couple
327kilobytes) is read, then displays it all at once. In that case, we _want_
328the short read, for user interface reasons. (Note that read() should never
329return 0 unless it has hit the end of input, and an attempt to write 0
330bytes should be ignored by the OS.)</p>
331
332<p>As for short writes, play around with two processes piping data to each
333other on the command line (cat bigfile | gzip > out.gz) and suspend and
334resume a few times (ctrl-z to suspend, "fg" to resume). The writer can
335experience short writes, which are especially dangerous because if you don't
336notice them you'll discard data. They can also happen when a system is under
337load and a fast process is piping to a slower one. (Such as an xterm waiting
338on x11 when the scheduler decides X is being a CPU hog with all that
339text console scrolling...)</p>
340
341<p>So will data always be read from the far end of a pipe at the
342same chunk sizes it was written in? Nope. Don't rely on that. For one
343counterexample, see <a href="http://www.faqs.org/rfcs/rfc896.html">rfc 896</p>
344for Nagle's algorithm</a>, which waits a fraction of a second or so before
345sending out small amounts of data through a TCP/IP connection in case more
346data comes in that can be merged into the same packet. (In case you were
347wondering why action games that use TCP/IP set TCP_NODELAY to lower the latency
348on their their sockets, now you know.)</p>
349
Rob Landley4a8d9ef2006-04-10 17:54:23 +0000350<h2><a name="tips_memory">Memory used by relocatable code, PIC, and static linking.</a></h2>
351
352<p>The downside of standard dynamic linking is that it results in self-modifying
353code. Although each executable's pages are mmaped() into a process's address
354space from the executable file and are thus naturally shared between processes
355out of the page cache, the library loader (ld-linux.so.2 or ld-uClibc.so.0)
356writes to these pages to supply addresses for relocatable symbols. This
357dirties the pages, triggering copy-on-write allocation of new memory for each
358processes's dirtied pages.</p>
359
360<p>One solution to this is Position Independent Code (PIC), a way of linking
361a file so all the relocations are grouped together. This dirties fewer
362pages (often just a single page) for each process's relocations. The down
363side is this results in larger executables, which take up more space on disk
364(and a correspondingly larger space in memory). But when many copies of the
365same program are running, PIC dynamic linking trades a larger disk footprint
366for a smaller memory footprint, by sharing more pages.</p>
367
368<p>A third solution is static linking. A statically linked program has no
369relocations, and thus the entire executable is shared between all running
370instances. This tends to have a significantly larger disk footprint, but
371on a system with only one or two executables, shared libraries aren't much
372of a win anyway.</p>
373
374<p>You can tell the glibc linker to display debugging information about its
375relocations with the environment variable "LD_DEBUG". Try
376"LD_DEBUG=help /bin/true" for a list of commands. Learning to interperet
377"LD_DEBUG=statistics cat /proc/self/statm" could be interesting.</p>
378
Rob Landleyd1e38c02006-02-16 03:21:44 +0000379<h2><a name="who">Who are the BusyBox developers?</a></h2>
380
381<p>The following login accounts currently exist on busybox.net. (I.E. these
382people can commit <a href="http://busybox.net/downloads/patches">patches</a>
Rob Landley27cd85b2006-02-17 02:38:00 +0000383into subversion for the BusyBox, uClibc, and buildroot projects.)</p>
384
385<pre>
386aldot :Bernhard Fischer
387andersen :Erik Andersen <- uClibc and BuildRoot maintainer.
388bug1 :Glenn McGrath
389davidm :David McCullough
390gkajmowi :Garrett Kajmowicz <- uClibc++ maintainer
391jbglaw :Jan-Benedict Glaw
392jocke :Joakim Tjernlund
393landley :Rob Landley <- BusyBox maintainer
394lethal :Paul Mundt
395mjn3 :Manuel Novoa III
396osuadmin :osuadmin
397pgf :Paul Fox
398pkj :Peter Kjellerstedt
399prpplague :David Anders
400psm :Peter S. Mazinger
401russ :Russ Dill
402sandman :Robert Griebl
403sjhill :Steven J. Hill
404solar :Ned Ludd
405timr :Tim Riker
406tobiasa :Tobias Anderberg
407vapier :Mike Frysinger
Rob Landley27cd85b2006-02-17 02:38:00 +0000408</pre>
409
410<p>The following accounts used to exist on busybox.net, but don't anymore so
411I can't ask /etc/passwd for their names. (If anybody would like to make
412a stab at it...)</p>
Rob Landleyd1e38c02006-02-16 03:21:44 +0000413
414<pre>
415aaronl
Rob Landleyd1e38c02006-02-16 03:21:44 +0000416beppu
Rob Landleyd1e38c02006-02-16 03:21:44 +0000417dwhedon
418erik : Also Erik Andersen?
419gfeldman
420jimg
421kraai
Rob Landleyd1e38c02006-02-16 03:21:44 +0000422markw
423miles
Rob Landleyd1e38c02006-02-16 03:21:44 +0000424proski
425rjune
Rob Landleyd1e38c02006-02-16 03:21:44 +0000426tausq
Rob Landley4a8d9ef2006-04-10 17:54:23 +0000427vodz :Vladimir N. Oleynik
Rob Landleyd1e38c02006-02-16 03:21:44 +0000428</pre>
429
430
Rob Landleyaaffef42006-01-22 01:44:29 +0000431<br>
432<br>
433<br>
434
435<!--#include file="footer.html" -->