Blame - docs/busybox.net/programming.html - codeaurora/busybox

blob: 99fdaacb7d58a77662b7821496328f0e67a23128 [file] [log] [blame]

Rob Landley	aaffef4	2006-01-22 01:44:29 +0000	[diff] [blame]	1	<!--#include file="header.html" -->
				2
				3	<h2>Rob's notes on programming busybox.</h2>
				4
				5	<ul>
				6	<li><a href="#goals">What are the goals of busybox?</a></li>
				7	<li><a href="#design">What is the design of busybox?</a></li>
				8	<li><a href="#source">How is the source code organized?</a></li>
				9	<ul>
				10	<li><a href="#source_applets">The applet directories.</a></li>
				11	<li><a href="#source_libbb">The busybox shared library (libbb)</a></li>
				12	</ul>
				13	<li><a href="#adding">Adding an applet to busybox</a></li>
				14	<li><a href="#standards">What standards does busybox adhere to?</a></li>
Rob Landley	b1b3cee	2006-01-29 06:29:01 +0000	[diff] [blame]	15	<li><a href="#tips">Tips and tricks.</a></li>
				16	<ul>
				17	<li><a href="#tips_encrypted_passwords">Encrypted Passwords</a></li>
				18	<li><a href="#tips_vfork">Fork and vfork</a></li>
Rob Landley	c29a0f3	2006-02-12 00:45:39 +0000	[diff] [blame]	19	<li><a href="#tips_short_read">Short reads and writes</a></li>
Rob Landley	b1b3cee	2006-01-29 06:29:01 +0000	[diff] [blame]	20	</ul>
Rob Landley	aaffef4	2006-01-22 01:44:29 +0000	[diff] [blame]	21	</ul>
				22
				23	<h2><b><a name="goals" />What are the goals of busybox?</b></h2>
				24
				25	<p>Busybox aims to be the smallest and simplest correct implementation of the
				26	standard Linux command line tools. First and foremost, this means the
				27	smallest executable size we can manage. We also want to have the simplest
				28	and cleanest implementation we can manage, be <a href="#standards">standards
				29	compliant</a>, minimize run-time memory usage (heap and stack), run fast, and
				30	take over the world.</p>
				31
				32	<h2><b><a name="design" />What is the design of busybox?</b></h2>
				33
				34	<p>Busybox is like a swiss army knife: one thing with many functions.
				35	The busybox executable can act like many different programs depending on
				36	the name used to invoke it. Normal practice is to create a bunch of symlinks
				37	pointing to the busybox binary, each of which triggers a different busybox
				38	function. (See <a href="FAQ.html#getting_started">getting started</a> in the
				39	FAQ for more information on usage, and <a href="BusyBox.html">the
				40	busybox documentation</a> for a list of symlink names and what they do.)
				41
				42	<p>The "one binary to rule them all" approach is primarily for size reasons: a
				43	single multi-purpose executable is smaller then many small files could be.
				44	This way busybox only has one set of ELF headers, it can easily share code
				45	between different apps even when statically linked, it has better packing
				46	efficiency by avoding gaps between files or compression dictionary resets,
				47	and so on.</p>
				48
				49	<p>Work is underway on new options such as "make standalone" to build separate
				50	binaries for each applet, and a "libbb.so" to make the busybox common code
				51	available as a shared library. Neither is ready yet at the time of this
				52	writing.</p>
				53
				54	<a name="source" />
				55
				56	<h2><a name="source_applets" /><b>The applet directories</b></h2>
				57
				58	<p>The directory "applets" contains the busybox startup code (applets.c and
				59	busybox.c), and several subdirectories containing the code for the individual
				60	applets.</p>
				61
				62	<p>Busybox execution starts with the main() function in applets/busybox.c,
				63	which sets the global variable bb_applet_name to argv[0] and calls
				64	run_applet_by_name() in applets/applets.c. That uses the applets[] array
				65	(defined in include/busybox.h and filled out in include/applets.h) to
				66	transfer control to the appropriate APPLET_main() function (such as
				67	cat_main() or sed_main()). The individual applet takes it from there.</p>
				68
				69	<p>This is why calling busybox under a different name triggers different
				70	functionality: main() looks up argv[0] in applets[] to get a function pointer
				71	to APPLET_main().</p>
				72
				73	<p>Busybox applets may also be invoked through the multiplexor applet
				74	"busybox" (see busybox_main() in applets/busybox.c), and through the
				75	standalone shell (grep for STANDALONE_SHELL in applets/shell/*.c).
				76	See <a href="FAQ.html#getting_started">getting started</a> in the
				77	FAQ for more information on these alternate usage mechanisms, which are
				78	just different ways to reach the relevant APPLET_main() function.</p>
				79
				80	<p>The applet subdirectories (archival, console-tools, coreutils,
				81	debianutils, e2fsprogs, editors, findutils, init, loginutils, miscutils,
				82	modutils, networking, procps, shell, sysklogd, and util-linux) correspond
				83	to the configuration sub-menus in menuconfig. Each subdirectory contains the
				84	code to implement the applets in that sub-menu, as well as a Config.in
				85	file defining that configuration sub-menu (with dependencies and help text
				86	for each applet), and the makefile segment (Makefile.in) for that
				87	subdirectory.</p>
				88
				89	<p>The run-time --help is stored in usage_messages[], which is initialized at
				90	the start of applets/applets.c and gets its help text from usage.h. During the
				91	build this help text is also used to generate the BusyBox documentation (in
				92	html, txt, and man page formats) in the docs directory. See
				93	<a href="#adding">adding an applet to busybox</a> for more
				94	information.</p>
				95
				96	<h2><a name="source_libbb" /><b>libbb</b></h2>
				97
				98	<p>Most non-setup code shared between busybox applets lives in the libbb
				99	directory. It's a mess that evolved over the years without much auditing
				100	or cleanup. For anybody looking for a great project to break into busybox
				101	development with, documenting libbb would be both incredibly useful and good
				102	experience.</p>
				103
				104	<p>Common themes in libbb include allocation functions that test
				105	for failure and abort the program with an error message so the caller doesn't
				106	have to test the return value (xmalloc(), xstrdup(), etc), wrapped versions
				107	of open(), close(), read(), and write() that test for their own failures
				108	and/or retry automatically, linked list management functions (llist.c),
				109	command line argument parsing (getopt_ulflags.c), and a whole lot more.</p>
				110
				111	<h2><a name="adding" /><b>Adding an applet to busybox</b></h2>
				112
				113	<p>To add a new applet to busybox, first pick a name for the applet and
				114	a corresponding CONFIG_NAME. Then do this:</p>
				115
				116	<ul>
				117	<li>Figure out where in the busybox source tree your applet best fits,
				118	and put your source code there. Be sure to use APPLET_main() instead
				119	of main(), where APPLET is the name of your applet.</li>
				120
				121	<li>Add your applet to the relevant Config.in file (which file you add
				122	it to determines where it shows up in "make menuconfig"). This uses
				123	the same general format as the linux kernel's configuration system.</li>
				124
				125	<li>Add your applet to the relevant Makefile.in file (in the same
				126	directory as the Config.in you chose), using the existing entries as a
				127	template and the same CONFIG symbol as you used for Config.in. (Don't
				128	forget "needlibm" or "needcrypt" if your applet needs libm or
				129	libcrypt.)</li>
				130
				131	<li>Add your applet to "include/applets.h", using one of the existing
				132	entries as a template. (Note: this is in alphabetical order. Applets
				133	are found via binary search, and if you add an applet out of order it
				134	won't work.)</li>
				135
				136	<li>Add your applet's runtime help text to "include/usage.h". You need
				137	at least appname_trivial_usage (the minimal help text, always included
				138	in the busybox binary when this applet is enabled) and appname_full_usage
				139	(extra help text included in the busybox binary with
				140	CONFIG_FEATURE_VERBOSE_USAGE is enabled), or it won't compile.
				141	The other two help entry types (appname_example_usage and
				142	appname_notes_usage) are optional. They don't take up space in the binary,
				143	but instead show up in the generated documentation (BusyBox.html,
				144	BusyBox.txt, and the man page BusyBox.1).</li>
				145
				146	<li>Run menuconfig, switch your applet on, compile, test, and fix the
				147	bugs. Be sure to try both "allyesconfig" and "allnoconfig" (and
				148	"allbareconfig" if relevant).</li>
				149
				150	</ul>
				151
				152	<h2><a name="standards" />What standards does busybox adhere to?</a></h2>
				153
				154	<p>The standard we're paying attention to is the "Shell and Utilities"
				155	portion of the <a href=http://www.opengroup.org/onlinepubs/009695399/>Open
				156	Group Base Standards</a> (also known as the Single Unix Specification version
				157	3 or SUSv3). Note that paying attention isn't necessarily the same thing as
				158	following it.</p>
				159
				160	<p>SUSv3 doesn't even mention things like init, mount, tar, or losetup, nor
				161	commonly used options like echo's '-e' and '-n', or sed's '-i'. Busybox is
				162	driven by what real users actually need, not the fact the standard believes
				163	we should implement ed or sccs. For size reasons, we're unlikely to include
				164	much internationalization support beyond UTF-8, and on top of all that, our
				165	configuration menu lets developers chop out features to produce smaller but
				166	very non-standard utilities.</p>
				167
				168	<p>Also, Busybox is aimed primarily at Linux. Unix standards are interesting
				169	because Linux tries to adhere to them, but portability to dozens of platforms
				170	is only interesting in terms of offering a restricted feature set that works
				171	everywhere, not growing dozens of platform-specific extensions. Busybox
				172	should be portable to all hardware platforms Linux supports, and any other
				173	similar operating systems that are easy to do and won't require much
				174	maintenance.</p>
				175
				176	<p>In practice, standards compliance tends to be a clean-up step once an
				177	applet is otherwise finished. When polishing and testing a busybox applet,
				178	we ensure we have at least the option of full standards compliance, or else
				179	document where we (intentionally) fall short.</p>
				180
Rob Landley	b1b3cee	2006-01-29 06:29:01 +0000	[diff] [blame]	181	<h2><a name="tips" />Programming tips and tricks.</a></h2>
				182
				183	<p>Various things busybox uses that aren't particularly well documented
				184	elsewhere.</p>
				185
				186	<h2><a name="tips_encrypted_passwords">Encrypted Passwords</a></h2>
				187
				188	<p>Password fields in /etc/passwd and /etc/shadow are in a special format.
				189	If the first character isn't '$', then it's an old DES style password. If
				190	the first character is '$' then the password is actually three fields
				191	separated by '$' characters:</p>
				192	<pre>
				193	<b>$type$salt$encrypted_password</b>
				194	</pre>
				195
				196	<p>The "type" indicates which encryption algorithm to use: 1 for MD5 and 2 for SHA1.</p>
				197
				198	<p>The "salt" is a bunch of ramdom characters (generally 8) the encryption
				199	algorithm uses to perturb the password in a known and reproducible way (such
				200	as by appending the random data to the unencrypted password, or combining
				201	them with exclusive or). Salt is randomly generated when setting a password,
				202	and then the same salt value is re-used when checking the password. (Salt is
				203	thus stored unencrypted.)</p>
				204
				205	<p>The advantage of using salt is that the same cleartext password encrypted
				206	with a different salt value produces a different encrypted value.
				207	If each encrypted password uses a different salt value, an attacker is forced
				208	to do the cryptographic math all over again for each password they want to
				209	check. Without salt, they could simply produce a big dictionary of commonly
				210	used passwords ahead of time, and look up each password in a stolen password
				211	file to see if it's a known value. (Even if there are billions of possible
				212	passwords in the dictionary, checking each one is just a binary search against
				213	a file only a few gigabytes long.) With salt they can't even tell if two
				214	different users share the same password without guessing what that password
				215	is and decrypting it. They also can't precompute the attack dictionary for
				216	a specific password until they know what the salt value is.</p>
				217
				218	<p>The third field is the encrypted password (plus the salt). For md5 this
				219	is 22 bytes.</p>
				220
				221	<p>The busybox function to handle all this is pw_encrypt(clear, salt) in
				222	"libbb/pw_encrypt.c". The first argument is the clear text password to be
				223	encrypted, and the second is a string in "$type$salt$password" format, from
				224	which the "type" and "salt" fields will be extracted to produce an encrypted
				225	value. (Only the first two fields are needed, the third $ is equivalent to
				226	the end of the string.) The return value is an encrypted password in
				227	/etc/passwd format, with all three $ separated fields. It's stored in
				228	a static buffer, 128 bytes long.</p>
				229
				230	<p>So when checking an existing password, if pw_encrypt(text,
				231	old_encrypted_password) returns a string that compares identical to
				232	old_encrypted_password, you've got the right password. When setting a new
				233	password, generate a random 8 character salt string, put it in the right
				234	format with sprintf(buffer, "$%c$%s", type, salt), and feed buffer as the
				235	second argument to pw_encrypt(text,buffer).</p>
				236
				237	<h2><a name="tips_vfork">Fork and vfork</a></h2>
				238
Rob Landley	03628c8	2006-01-29 06:45:38 +0000	[diff] [blame]	239	<p>Busybox hides the difference between fork() and vfork() in
				240	libbb/bb_fork_exec.c. If you ever want to fork and exec, use bb_fork_exec()
				241	(which returns a pid and takes the same arguments as execve(), although in
				242	this case envp can be NULL) and don't worry about it. This description is
				243	here in case you want to know why that does what it does.</p>
				244
Rob Landley	b1b3cee	2006-01-29 06:29:01 +0000	[diff] [blame]	245	<p>On systems that haven't got a Memory Management Unit, fork() is unreasonably
				246	expensive to implement, so a less capable function called vfork() is used
				247	instead.</p>
				248
				249	<p>The reason vfork() exists is that if you haven't got an MMU then you can't
				250	simply set up a second set of page tables and share the physical memory via
				251	copy-on-write, which is what fork() normally does. This means that actually
				252	forking has to copy all the parent's memory (which could easily be tens of
				253	megabytes). And you have to do this even though that memory gets freed again
				254	as soon as the exec happens, so it's probably all a big waste of time.</p>
				255
				256	<p>This is not only slow and a waste of space, it also causes totally
				257	unnecessary memory usage spikes based on how big the _parent_ process is (not
				258	the child), and these spikes are quite likely to trigger an out of memory
				259	condition on small systems (which is where nommu is common anyway). So
				260	although you _can_ emulate a real fork on a nommu system, you really don't
				261	want to.</p>
				262
				263	<p>In theory, vfork() is just a fork() that writeably shares the heap and stack
				264	rather than copying it (so what one process writes the other one sees). In
				265	practice, vfork() has to suspend the parent process until the child does exec,
				266	at which point the parent wakes up and resumes by returning from the call to
				267	vfork(). All modern kernel/libc combinations implement vfork() to put the
				268	parent to sleep until the child does its exec. There's just no other way to
				269	make it work: they're sharing the same stack, so if either one returns from its
				270	function it stomps on the callstack so that when the other process returns,
				271	hilarity ensues. In fact without suspending the parent there's no way to even
				272	store separate copies of the return value (the pid) from the vfork() call
				273	itself: both assignments write into the same memory location.</p>
				274
				275	<p>One way to understand (and in fact implement) vfork() is this: imagine
				276	the parent does a setjmp and then continues on (pretending to be the child)
				277	until the exec() comes around, then the _exec_ does the actual fork, and the
				278	parent does a longjmp back to the original vfork call and continues on from
				279	there. (It thus becomes obvious why the child can't return, or modify
				280	local variables it doesn't want the parent to see changed when it resumes.)
				281
				282	<p>Note a common mistake: the need for vfork doesn't mean you can't have two
				283	processes running at the same time. It means you can't have two processes
				284	sharing the same memory without stomping all over each other. As soon as
				285	the child calls exec(), the parent resumes.</p>
				286
Rob Landley	03628c8	2006-01-29 06:45:38 +0000	[diff] [blame]	287	<p>If the child's attempt to call exec() fails, the child should call _exit()
				288	rather than a normal exit(). This avoids any atexit() code that might confuse
				289	the parent. (The parent should never call _exit(), only a vforked child that
				290	failed to exec.)</p>
				291
Rob Landley	b1b3cee	2006-01-29 06:29:01 +0000	[diff] [blame]	292	<p>(Now in theory, a nommu system could just copy the _stack_ when it forks
				293	(which presumably is much shorter than the heap), and leave the heap shared.
				294	In practice, you've just wound up in a multi-threaded situation and you can't
				295	do a malloc() or free() on your heap without freeing the other process's memory
				296	(and if you don't have the proper locking for being threaded, corrupting the
				297	heap if both of you try to do it at the same time and wind up stomping on
				298	each other while traversing the free memory lists). The thing about vfork is
				299	that it's a big red flag warning "there be dragons here" rather than
				300	something subtle and thus even more dangerous.)</p>
				301
Rob Landley	c29a0f3	2006-02-12 00:45:39 +0000	[diff] [blame]	302	<h2><a name="tips_sort_read">Short reads and writes</a></h2>
				303
				304	<p>Busybox has special functions, bb_full_read() and bb_full_write(), to
				305	check that all the data we asked for got read or written. Is this a real
				306	world consideration? Try the following:</p>
				307
				308	<pre>while true; do echo hello; sleep 1; done \| tee out.txt</pre>
				309
				310	<p>If tee is implemented with bb_full_read(), tee doesn't display output
				311	in real time but blocks until its entire input buffer (generally a couple
				312	kilobytes) is read, then displays it all at once. In that case, we _want_
				313	the short read, for user interface reasons. (Note that read() should never
				314	return 0 unless it has hit the end of input, and an attempt to write 0
				315	bytes should be ignored by the OS.)</p>
				316
				317	<p>As for short writes, play around with two processes piping data to each
				318	other on the command line (cat bigfile \| gzip > out.gz) and suspend and
				319	resume a few times (ctrl-z to suspend, "fg" to resume). The writer can
				320	experience short writes, which are especially dangerous because if you don't
				321	notice them you'll discard data. They can also happen when a system is under
				322	load and a fast process is piping to a slower one. (Such as an xterm waiting
				323	on x11 when the scheduler decides X is being a CPU hog with all that
				324	text console scrolling...)</p>
				325
				326	<p>So will data always be read from the far end of a pipe at the
				327	same chunk sizes it was written in? Nope. Don't rely on that. For one
				328	counterexample, see <a href="http://www.faqs.org/rfcs/rfc896.html">rfc 896</p>
				329	for Nagle's algorithm</a>, which waits a fraction of a second or so before
				330	sending out small amounts of data through a TCP/IP connection in case more
				331	data comes in that can be merged into the same packet. (In case you were
				332	wondering why action games that use TCP/IP set TCP_NODELAY to lower the latency
				333	on their their sockets, now you know.)</p>
				334
Rob Landley	aaffef4	2006-01-22 01:44:29 +0000	[diff] [blame]	335	<br>
				336	<br>
				337	<br>
				338
				339	<!--#include file="footer.html" -->