Wednesday, November 23, 2016

Statistical Mistakes

http://www.cs.cornell.edu/~asampson/blog/statsmistakes.html


Computer scientists in systemsy fields, myself included, aren’t great at using statistics. Maybe it’s because there are so many other potential problems with empirical evaluations that solid statistical reasoning doesn’t seem that important. Other subfields, like HCI and machine learning, have much higher standards for data analysis. Let’s learn from their example.
Here are three kinds of avoidable statistics mistakes that I notice in published papers.

No Statistics at All

The most common blunder is not using statistics at all when your paper clearly uses statistical data. If your paper uses the phrase “we report the average time over 20 runs of the algorithm,” for example, you should probably use statistics.
Here are two easy things that every paper should do when it deals with performance data or anything else that can randomly vary:
First, plot the error bars. In every figure that represents an average, compute the standard error of the mean or just the plain old standard deviation and add little whiskers to each bar. Explain what the error bars mean in the caption.
[Figure: three example bar charts with error bars. (a) Just noise. (b) Meaningful results. (c) Who knows???]
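Computing those whiskers takes only a couple of lines. Here is a minimal sketch (my own illustration, not from this post, with made-up timing numbers) using NumPy and matplotlib:
# A minimal sketch: bar chart with standard-error whiskers over made-up timings.
import numpy as np
import matplotlib.pyplot as plt

runs = {
    "baseline":   np.array([12.1, 11.8, 12.6, 12.3, 11.9]),  # hypothetical seconds
    "our system": np.array([10.2, 10.9, 10.4, 10.7, 10.1]),
}
names = list(runs)
means = [runs[n].mean() for n in names]
# Standard error of the mean: sample standard deviation (ddof=1) over sqrt(N).
sems = [runs[n].std(ddof=1) / np.sqrt(len(runs[n])) for n in names]

plt.bar(names, means, yerr=sems, capsize=4)
plt.ylabel("running time (s)")
plt.title("Error bars show the standard error of the mean")
plt.savefig("errorbars.png")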
Second, do a simple statistical test. If you ever say “our system’s average running time is X seconds, which is less than the baseline running time of Y seconds,” you need to show that the difference is statistically significant. Statistical significance tells the reader that the difference you found was more than just “in the noise.”
For most CS papers I read, a really basic test will work: Student’s t-test checks that two averages that look different actually are different. The process is easy. Collect some N samples from the two conditions, compute the mean X̄ and the standard deviation s for each, and plug them into this formula:
t = \frac{\overline{X}_1 - \overline{X}_2}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}}
then plug that t into the cumulative distribution function of the t-distribution to get a p-value. If your p-value is below a threshold α that you chose ahead of time (0.05 or 0.01, say), then you have a statistically significant difference. Your favorite numerical library probably already has an implementation that does all the work for you.
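As a concrete illustration, here is a minimal sketch with hypothetical timing samples; SciPy’s unequal-variance t-test computes the same statistic as the formula above, plus the p-value:
# A minimal sketch with hypothetical timing samples (not from the post).
from scipy import stats

ours     = [10.2, 10.9, 10.4, 10.7, 10.1, 10.5]
baseline = [12.1, 11.8, 12.6, 12.3, 11.9, 12.4]

t, p = stats.ttest_ind(ours, baseline, equal_var=False)
alpha = 0.05  # chosen before looking at the data
print(f"t = {t:.2f}, p = {p:.4f}")
if p < alpha:
    print("statistically significant difference")
else:
    print("no conclusion: we failed to reject the null hypothesis")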
If you’ve taken even an intro stats course, you know all this already! But you might be surprised to learn how many computer scientists don’t. Program committees don’t require that papers use solid statistics, so the literature is full of statistics-free but otherwise-good papers, so standards remain low, and Prof. Ouroboros keeps drawing figures without error bars. Other fields are moving beyond the p-value, and CS isn’t even there yet.

Failure to Reject = Confirmation

When you do use a statistical test in a paper, you need to interpret its results correctly. When your test produces a p-value, here are the correct interpretations:
  • If p < α: The difference between our average running time and the baseline’s average running time is statistically significant. Pedantically, we reject the null hypothesis that says that the averages might be the same.
  • Otherwise, if p ≥ α: We conclude nothing at all. Pedantically, we fail to reject that null hypothesis.
It’s tempting to think, when p ≥ α, that you’ve found the opposite thing from the p < α case: that you get to conclude that there is no statistically significant difference between the two averages. Don’t do that!
Simple statistical tests like the t-test only tell you when averages are different; they can’t tell you when they’re the same. When they fail to find a difference, there are two possible explanations: either there is no difference or you haven’t collected enough data yet. So when a test fails, it could be your fault: if you had run a slightly larger experiment with a slightly larger N, the test might have successfully found the difference. It’s always wrong to conclude that the difference does not exist.
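To see why, here is a minimal simulation sketch (made-up numbers, my own illustration): the same true 0.2-second speedup is usually invisible to the t-test at N = 5 samples per condition, but is almost always detected at N = 1000.
# A minimal sketch: the same true difference, tested at two sample sizes.
import numpy as np
from scipy import stats

rng = np.random.default_rng()
for n in (5, 1000):
    ours     = rng.normal(loc=9.8,  scale=1.0, size=n)   # true mean 9.8 s
    baseline = rng.normal(loc=10.0, scale=1.0, size=n)   # true mean 10.0 s
    _, p = stats.ttest_ind(ours, baseline, equal_var=False)
    print(f"N = {n:4d}: p = {p:.3f}")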
If you want to claim that two means are equal, you’ll need to use a different test where the null hypothesis says that they differ by at least a certain amount. For example, an appropriate one-tailed t-test will do.
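One standard recipe built from such one-tailed tests is the “two one-sided tests” (TOST) equivalence procedure. The sketch below is my own illustration, not from this post; delta is a hypothetical margin for how large a difference you are still willing to call “the same.”
# A TOST equivalence sketch: two one-sided Welch t-tests against a margin delta.
import numpy as np
from scipy import stats

def equivalent(a, b, delta, alpha=0.05):
    a, b = np.asarray(a, float), np.asarray(b, float)
    d = a.mean() - b.mean()
    va, vb = a.var(ddof=1) / len(a), b.var(ddof=1) / len(b)
    se = np.sqrt(va + vb)
    # Welch-Satterthwaite degrees of freedom.
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    p_lower = stats.t.sf((d + delta) / se, df)   # H0: the difference is <= -delta
    p_upper = stats.t.cdf((d - delta) / se, df)  # H0: the difference is >= +delta
    # Only if both one-sided nulls are rejected do we call the means equivalent.
    return max(p_lower, p_upper) < alpha

print(equivalent([10.1, 10.3, 10.2, 10.4], [10.2, 10.3, 10.1, 10.5], delta=0.5))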

The Multiple Comparisons Problem

In most ordinary evaluation sections, it’s probably enough to use only a handful of statistical tests to draw one or two bottom-line conclusions. But you might find yourself automatically running an unbounded number of comparisons. Perhaps you have n benchmarks, and you want to compare the running time on each one to a corresponding baseline with a separate statistical test. Or maybe your system works in a feedback loop: it tries one strategy, performs a statistical test to check whether the strategy worked, and starts over with a new strategy otherwise.
Repeated statistical tests can get you into trouble. The problem is that every statistical test has a probability of lying to you. The probability that any single test is wrong is small, but if you do lots of tests, the probability amplifies quickly.
For example, say you choose α = 0.05 and run one t-test. When the test succeeds—when it finds a significant difference—it’s telling you that there’s at most an α chance that the difference arose from random chance. In 95 out of 100 parallel universes, your paper found a difference that actually exists. I’d take that bet.
Now, say you run a series of n tests in the scope of one paper. Then every test has an α chance of going wrong. The chance that your paper has more than k errors in it is given by the binomial distribution:
1 - \sum_{i=0}^{k} {n \choose i} \alpha^i (1-\alpha)^{n-i}
which grows exponentially with the number of tests, n. If you use just 10 tests with α = 0.05, for example, your chance of having one test go wrong grows to 40%. If you do 100, the probability is above 99%. At that point, it’s a near certainty that your paper is misreporting some result.
(To compute these probabilities yourself, set k = 0 so you get the chance of at least one error. Then the CDF above simplifies down to 1 − (1 − α)^n.)
This pitfall is called the multiple comparisons problem. If you really need to run lots of tests, all is not lost: there are standard ways to compensate for the increased chance of error. The simplest is the Bonferroni correction, where you reduce your per-test α to α/n to preserve an overall α chance of going wrong.
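For a quick back-of-the-envelope check (my own sketch, not from the post), you can print both the family-wise error rate and the Bonferroni-corrected threshold:
# Family-wise error rate 1 - (1 - alpha)^n and Bonferroni per-test threshold.
alpha = 0.05
for n in (1, 10, 100):
    p_any = 1 - (1 - alpha) ** n
    print(f"n = {n:3d}: P(at least one bogus result) = {p_any:.1%}, "
          f"Bonferroni per-test alpha = {alpha / n:.4f}")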

Thursday, August 18, 2016

Using Drupal's autocomplete on a custom form element

http://www.jochenhebbrecht.be/site/2011-01-10/drupal/using-drupals-autocomplete-a-custom-form-element

Did you ever create a custom form? A place where you couldn't use Drupal's Form API?
Imagine the form below, where you want to add an autocomplete feature to the nickname field. The moment you start typing a nickname, a list of matching usernames appears under the textfield.
<form action="/foo">
   ...
   <input type="text" value="Nickname" />
   <input type="submit" value="Submit" />
</form>



Step 1: adjust the HTML code

  • Start from a normal text field
<input id="txt-nickname" type="text" />
  • Add the autocomplete attributes to the text field
<input id="txt-nickname" type="text" class="form-text form-autocomplete text"
       autocomplete="OFF" />
  • Add a hidden field that holds the path of the autocomplete callback. Put this field immediately after the textfield
<input class="autocomplete" type="hidden" id="txt-nickname-autocomplete"
       value="/##path_to_autocomplete##/autocomplete" disabled="disabled" />



Step 2: adjust the Javascript code

  • You need the following JS files to have a fully working version:
    drupal_add_js("misc/autocomplete.js");
    drupal_add_js("misc/ahah.js");
  • If the HTML (see above) is added by AJAX, Drupal's autocomplete will not discover the autocomplete fields. Therefore, execute the following code after the AJAX call has loaded
    Drupal.attachBehaviors($("##CONTEXT##"));

    ##CONTEXT##: a reference to the div which holds the input


Step 3: create an autocomplete PHP function which returns a JSON object

/**
 * Searches all auction users and returns a JSON object of all users
 * @param $search the value to be searched
 * @return JSON object with all users
 */
function autocomplete_users($search = '') {
  // TODO: implement the real search on $search here.

  // DEBUG: hard-coded example results.
  $users['admin {uid:1}'] = 'admin';
  $users['jochen {uid:2}'] = 'jochen';

  return drupal_json($users);
}



Step 4: the result

Saturday, August 6, 2016

A x64 OS #1: UEFI


http://kazlauskas.me/entries/x64-uefi-os-1.html
As a part of the OS project for the university there has been a request to also write up the experiences and challenges encountered. This is the first post of the series on writing an x64 operating system when booting straight from UEFI. Please keep in mind that these posts are written by a not-even-hobbyist and the content in these posts should be taken with a grain of salt.

Kernel and UEFI

I’ve decided to write a kernel targeting x64 as a UEFI application. There are a number of appeals to writing a kernel as a UEFI application as opposed to writing a multiboot kernel. Namely:
  1. For the x86 family of processors, you avoid the work necessary to upgrade from real mode to protected mode and then from protected mode to long mode, which is more commonly known as 64-bit mode. As a UEFI application your kernel gets a fully working x64 environment from the get-go;
  2. Unlike BIOS, UEFI is a well documented firmware. Most of the interfaces provided by BIOS are de facto and you’re lucky if they work at all, while most of those provided by UEFI are de jure and usually just work;
  3. UEFI is extensible, whereas BIOS is not really;
  4. Finally, UEFI is the modern technology that is here to stay, while BIOS is a 40-year-old piece of technology on death row. Learning about soon-to-be-dead technology is a waste of effort.
Despite my strong attachment to the Rust community and Rust’s perfect suitability for kernels [1], I’ll be writing the kernel in C. Mostly because it is unlikely people inside the university will be familiar with Rust, but also because GNU-EFI is a C library and I cannot be bothered to bind it. I’d surely be writing it in Rust were I more serious about the project.

Toolchain

As it turns out, developing an x64 kernel on an x64 host greatly simplifies setting up the build toolchain. I’ll be using:
  • clang to compile the C code (no cross-compiler is necessary [2]!);
  • the gnu-efi library to interact with the UEFI firmware;
  • the qemu emulator to run my kernel; and
  • OVMF as the UEFI firmware.

The UEFI “Hello, world!”

The following snippet of code is all the code you need to print something on the screen as a UEFI application:
// main.c
#include <efi.h>
#include <efilib.h>
#include <efiprot.h>

EFI_STATUS
efi_main (EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
{
    InitializeLib(ImageHandle, SystemTable);
    Print(L"Hello, world from x64!");
    for(;;) __asm__("hlt");
}
However, compiling this code correctly is not as trivial. The following three commands are necessary to produce a working UEFI application:
clang -I/usr/include/efi -I/usr/include/efi/x86_64 -I/usr/include/efi/protocol -fno-stack-protector -fpic -fshort-wchar -mno-red-zone -DHAVE_USE_MS_ABI -c -o src/main.o src/main.c
ld -nostdlib -znocombreloc -T /usr/lib/elf_x86_64_efi.lds -shared -Bsymbolic -L /usr/lib /usr/lib/crt0-efi-x86_64.o src/main.o -o huehuehuehuehue.so -lefi -lgnuefi
objcopy -j .text -j .sdata -j .data -j .dynamic -j .dynsym  -j .rel -j .rela -j .reloc --target=efi-app-x86_64 huehuehuehuehue.so huehuehuehuehue.efi
The clang command is pretty self-explanatory: we tell the compiler where to look for the EFI headers and what to compile into an object file. Probably the most non-trivial option here is the -DHAVE_USE_MS_ABI one – x64 UEFI uses the Windows x64 calling convention, and not the regular C one, thus all arguments in calls to UEFI functions must be passed in a different way than is usually done in C code. Historically this conversion was done by the uefi_call_wrapper wrapper, but clang supports the calling convention natively, and we tell that fact to the gnu-efi library with this option [3].
Then, I manually link my object file and the UEFI-specific C runtime into a shared library using a custom linker script provided by the gnu-efi library. The result is an ELF library about 250KB in size. However, UEFI expects its applications in the PE executable format, so we must convert our library into the desired format with the objcopy command. At this point a huehuehuehuehue.efi file should be produced and the majority of UEFI firmwares should be able to run it.
In practice, I’ve automated these steps, along with a considerably complex sequence of building image files I’ve stolen from OSDev’s tutorial on creating images, into a Makefile. Feel free to copy it in parts or in whole for your own use cases.

UEFI boot and runtime services

A UEFI application has two distinct stages over its lifetime: a stage where the so-called boot services are available, and a stage after these boot services are disabled. A UEFI application will be launched by the UEFI firmware and both boot and runtime services will be available to the application. Most notably, boot services provide APIs for loading other UEFI applications (e.g. implementing bootloaders), handling (allocating and deallocating) memory and using protocols (speaking to other active UEFI applications).
Once the kernel is done with using boot services it calls ExitBootServices, which is a method provided by… a boot service. Past that point only runtime services are available and you cannot ever return to a state where boot services are available except by resetting the system. Managing UEFI variables, the system clock and resetting the system are pretty much the only things you can do with the runtime services.
For my kernel, I will use the graphics output protocol to set up the video frame buffer, exit the boot services and, finally, shut down the machine before reaching the hlt instruction. The following piece of code implements the described sequence. I left some code out; you can see it in full at Gitlab, for example the definition of init_graphics.
EFI_STATUS
efi_main (EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
{
    EFI_STATUS status;
    InitializeLib(ImageHandle, SystemTable);

    // Initialize graphics
    EFI_GRAPHICS_OUTPUT_PROTOCOL *graphics;
    EFI_GUID graphics_proto = EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID;
    status = SystemTable->BootServices->LocateProtocol(&graphics_proto, NULL, (void **)&graphics);
    if(status != EFI_SUCCESS) return status;
    status = init_graphics(graphics);
    if(status != EFI_SUCCESS) return status;

    // Figure out the memory map (should be identity mapping)
    boot_state.memory_map = LibMemoryMap(&boot_state.memory_map_size,
                                         &boot_state.map_key,
                                         &boot_state.descriptor_size,
                                         &boot_state.descriptor_version);
    // Exit the boot services...
    SystemTable->BootServices->ExitBootServices(ImageHandle, boot_state.map_key);
    // and set up the memory map we just found.
    SystemTable->RuntimeServices->SetVirtualAddressMap(boot_state.memory_map_size,
                                                       boot_state.descriptor_size,
                                                       boot_state.descriptor_version,
                                                       boot_state.memory_map);
    // Once we’re done we power off the machine.
    SystemTable->RuntimeServices->ResetSystem(EfiResetShutdown, EFI_SUCCESS, 0, NULL);
    for(;;) __asm__("hlt");
}
Note that some protocols can either be attached to your own EFI_HANDLE or to some other EFI_HANDLE (i.e. the protocol is provided by another UEFI application). The graphics output protocol I’m using here is an example of a protocol attached to another EFI_HANDLE, therefore we use the LocateProtocol boot service to find it. In the off chance the protocol is attached to the application’s own EFI_HANDLE, the HandleProtocol method should be used instead:
EFI_LOADED_IMAGE *loaded_image = NULL;
EFI_GUID loaded_image_protocol = LOADED_IMAGE_PROTOCOL;
EFI_STATUS status = SystemTable->BootServices->HandleProtocol(ImageHandle, &loaded_image_protocol, (void **)&loaded_image);

Next steps

At this point I have a bare bones frame for my awesome kernel called “huehuehuehuehue”. From this point onwards the development of the kernel should not differ much from the traditional development of any other x64 kernel. Next, I’ll be implementing software and hardware interrupts; expect a post on that.

Friday, July 29, 2016

"Reverse Engineering for Beginners" free book


http://beginners.re/#lite

"Reverse Engineering for Beginners" free book

Also known as RE4B. Written by Dennis Yurichev (yurichev.com).

Praise for the book

Also, this book is used at least in:
I've also heard about:
If you know about others, please drop me a note!

Download PDF files

Download the English version: A4 (for browsing or printing), A5 (for ebook readers)
Download the Russian version: A4 (for browsing or printing), A5 (for ebook readers)

PGP Signatures

For those who want to be sure the PDF files have been compiled by me, it's possible to check the PGP signatures, which are: RE4B-EN-A5.pdf.sig, RE4B-EN.pdf.sig, RE4B-RU-A5.pdf.sig, RE4B-RU.pdf.sig.
My PGP public keys are here: http://yurichev.com/pgp.html.

Contents

Topics discussed: x86/x64, ARM/ARM64, MIPS, Java/JVM.
Topics touched: Oracle RDBMS, Itanium, copy-protection dongles, LD_PRELOAD, stack overflow, ELF, win32 PE file format, x86-64, critical sections, syscalls, TLS, position-independent code (PIC), profile-guided optimization, C++ STL, OpenMP, win32 SEH.

Call for translators!

You may want to help me with translating this work into languages other than English and Russian.
Just send me any piece of translated text (no matter how short) and I'll put it into my LaTeX source code.
Korean, Chinese and Persian languages are reserved by publishers.
I do the English and Russian versions myself, but my English is still horrible, so I'm very grateful for any notes about grammar, etc. Even my Russian is flawed, so I'm grateful for notes about the Russian text as well!
So do not hesitate to contact me: dennis(a)yurichev.com

Donors

Those who supported me during the time when I wrote a significant part of the book:
2 * Oleg Vygovsky (50+100 UAH), Daniel Bilar ($50), James Truscott ($4.5), Luis Rocha ($63), Joris van de Vis ($127), Richard S Shultz ($20), Jang Minchang ($20), Shade Atlas (5 AUD), Yao Xiao ($10), Pawel Szczur (40 CHF), Justin Simms ($20), Shawn the R0ck ($27), Ki Chan Ahn ($50), Triop AB (100 SEK), Ange Albertini (€10+50), Sergey Lukianov (300 RUR), Ludvig Gislason (200 SEK), Gérard Labadie (€40), Sergey Volchkov (10 AUD), Vankayala Vigneswararao ($50), Philippe Teuwen ($4), Martin Haeberli ($10), Victor Cazacov (€5), Tobias Sturzenegger (10 CHF), Sonny Thai ($15), Bayna AlZaabi ($75), Redfive B.V. (€25), Joona Oskari Heikkilä (€5), Marshall Bishop ($50), Nicolas Werner (€12), Jeremy Brown ($100), Alexandre Borges ($25), Vladimir Dikovski (€50), Jiarui Hong (100.00 SEK), Jim Di (500 RUR), Tan Vincent ($30), Sri Harsha Kandrakota (10 AUD), Pillay Harish (10 SGD), Timur Valiev (230 RUR), Carlos Garcia Prado (€10), Salikov Alexander (500 RUR), Oliver Whitehouse (30 GBP), Katy Moe ($14), Maxim Dyakonov ($3), Sebastian Aguilera (€20), Hans-Martin Münch (€15), Jarle Thorsen (100 NOK), Vitaly Osipov ($100), Yuri Romanov (1000 RUR), Aliaksandr Autayeu (€10), Tudor Azoitei ($40), Z0vsky (€10), Yu Dai ($10).
Thanks a lot to every donor!

As seen on...

... hacker news, reddit, habrahabr.ru, and a Russian-speaking RE forum. There are some parts translated into Chinese.
The book is also listed on the Goodreads website.

mini-FAQ

Q: I clicked on a hyperlink inside the PDF document; how do I get back?
A: (Adobe Acrobat Reader) Alt + LeftArrow

Q: May I print this book? Use it for teaching?
A: Of course, that's why the book is licensed under Creative Commons terms (CC BY-SA 4.0). Someone may also want to build their own version of the book; read here about it.

Q: Why is this book free? You've done a great job. This is suspicious, as with many other free things.
A: In my own experience, authors of technical literature do this mostly for self-advertisement purposes. It's not possible to earn any decent money from such work.

Q: I have a question...
A: Send it to me by email (dennis(a)yurichev.com).

Supplementary materials

All exercises have been moved to a standalone website: challenges.re.

Be involved!

Feel free to send me corrections. It's even possible to submit patches to the book's source code (LaTeX) on GitHub, BitBucket, or SourceForge!
Do you have any suggestions about what else should be added to the book?
Write me an email: dennis(a)yurichev.com

News

See ChangeLog

Stay tuned!

My current plans for this book: Objective-C, Visual Basic, anti-debugging tricks, Windows NT kernel debugger, .NET, Oracle RDBMS.
Here are also my blog and facebook. Web 2.0 hater? Subscribe to my mailing list to receive updates of this book by email.

About Korean publication

In January 2015, the Acorn publishing company (www.acornpub.co.kr) in South Korea did a huge amount of work translating and publishing my book (in its August 2014 state) in Korean.
It is now available at their website.
The translator is Byungho Min (@tais9).
The cover picture was done by my artist friend Andy Nechaevsky: facebook/andydinka.
They are also the Korean translation copyright holder.
So if you want to have a "real" book in Korean on your shelf and/or want to support my work, you can buy it now.

Tuesday, July 19, 2016

OpenStreetMap Installation with 120GB

http://thinkonbytes.blogspot.co.id/2016/07/your-openstreetmap-server-in-120gb.html

Why code review beats testing: evidence from decades of programming research

https://kev.inburke.com/kevin/the-best-ways-to-find-bugs-in-your-code/

tl;dr If you want to ship high quality code, you should invest in more than one of formal code review, design inspection, testing, and quality assurance. Testing catches fewer bugs per hour than human inspection of code, but it might be catching different types of bugs.
Everyone wants to find bugs in their programs. But which methods are the most effective for finding bugs? I found this remarkable chart in chapter 20 of Steve McConnell’s Code Complete. Summarized here, the chart shows the typical percentage of bugs found using each bug detection technique. The range shows the high and low percentage across software organizations, with the center dot representing the average percentage of bugs found using that technique.
[Chart: typical percentage of defects found by each technique, on a 10-100% scale: regression test, informal code reviews, unit test, new function (component) test, integration test, low-volume beta test (< 10 users), informal design reviews, personal desk checking of code, system test, formal design inspections, formal code inspections, modeling or prototyping, high-volume beta test (> 1000 users).]
As McConnell notes, “The most interesting facts … are that the modal rates don’t rise above 75% for any one technique, and that the techniques average about 40 percent.” Especially noteworthy is the poor performance of testing compared to formal design review (human inspection). There are three pages of fascinating discussion that follow; I’ll try to summarize.

What does this mean for me?

No one approach to bug detection is adequate. Capers Jones – the researcher behind two of the papers McConnell cites – divides bug detection into four categories: formal design inspection, formal code inspection, formal quality assurance, and formal testing. The best bug detection rate, if you are only using one of the above four categories, is 68%. The average detection rate if you are using all four is 99%.

But a less-effective method may be worth it, if it’s cheap enough

It’s well known that bugs found early on in program development are cheaper to fix; costs increase as you have to push changes to more users, someone else has to dig through code that they didn’t write to find the bug, etc. So while a high-volume beta test is highly effective at finding bugs, it may be more expensive to implement this (and you may develop a reputation for releasing highly buggy software).
Shull et al (2002) estimate that non-severe defects take approximately 14 hours of debugging effort after release, but only 7.4 hours before release, meaning that non-critical bugs are twice as expensive to fix after release, on average. However, the multiplier becomes much, much larger for severe bugs: according to several estimates severe bugs are 100 times more expensive to fix after shipping than they are to fix before shipping.
More generally, inspections are a cheaper method of finding bugs than testing; according to Basili and Selby (1987), code reading detected 80 percent more faults per hour than testing, even when testing programmers on code that contained zero comments. This went against the intuition of the professional programmers, which was that structural testing would be the most efficient method.

How did the researchers measure efficiency?

In each case the efficiency was calculated by taking the number of bug reports found through a specific bug detection technique, and then dividing by the total number of reported bugs.
The researchers conducted a number of different experiments to try and measure numbers accurately. Here are some examples:
  • Giving programmers a program with 15 known bugs, telling them to use a variety of techniques to find bugs, and observing how many they find (no one found more than 9, and the average was 5)
  • Giving programmers a specification, measuring how long it took them to write the code, and how many bugs existed in the code base
  • Formal inspections of processes at companies that produce millions of lines of code, like NASA.

In our company we have low bug rates, thanks to our adherence to software philosophy X.

Your favorite software philosophy probably advocates using several different methods listed above to help detect bugs. Pivotal Labs is known for pair programming, which some people rave about, and some people hate. Pair programming means that with little effort they’re getting informal code review, informal design review and personal desk-checking of code. Combine that with any kind of testing and they are going to catch a lot of bugs.
Before any code at Google gets checked in, one owner of the code base must review and approve the change (formal code review and design inspection). Google also enforces unit tests, as well as a suite of automated tests, fuzz tests, and end to end tests. In addition, everything gets dogfooded internally before a release to the public (high-volume beta test).
It’s likely that any company with a reputation for shipping high-quality code will have systems in place to test at least three of the categories mentioned above for software quality.

Conclusions

If you want to ship high quality code, you should invest in more than one of formal code review, design inspection, testing, and quality assurance. Testing catches fewer bugs per hour than human inspection of code, but it might be catching different types of bugs. You should definitely try to measure where you are finding bugs in your code and the percentage of bugs you are catching before release – 85% is poor, and 99% is exceptional.

Appendix

The chart was created using the jQuery plotting library flot. Here’s the raw Javascript, and the CoffeeScript that generated the graph.
References

Computers in Spaceflight: The NASA Experience

http://history.nasa.gov/computers/Ch4-5.html


- Chapter Four -
- Computers in the Space Shuttle Avionics System -
 
Developing software for the space shuttle
 
 
[108] During 1973 and 1974 the first requirements began to be specified for what has become one of the most interesting software systems ever designed. It was obvious from the very beginning that developing the Shuttle's software would be a complicated job. Even though NASA engineers estimated the size of the flight software to be smaller than that on Apollo, the ubiquitous functions of the Shuttle computers meant that no one group of engineers and no one company could do the software on its own. This increased the size of the task because of the communication necessary between the working groups. It also increased the complexity of a spacecraft already made complex by flight requirements and redundancy. Besides these realities, no one could foresee the final form that the software for this pioneering vehicle would take, even after years of development work had elapsed, since there continued to be both minor and major changes. NASA and its contractors made over 2,000 requirements changes between 1975 and the first flight in 1981 [80]. As a result, about $200 million was spent on software, as opposed to an initial estimate of $20 million. Even so, NASA lessened the difficulties by making several early decisions that were crucial for the program's success. NASA separated the software contract from the hardware contract, closely managed the contractors and their methods, chose a high-level language, and maintained conceptual integrity.
 
NASA awarded IBM Corporation the first independent Shuttle software contract on March 10, 1973. IBM and Rockwell International had worked together during the period of competition for the orbiter contract [81]. Rockwell bid on the entire aerospacecraft, intending to subcontract the computer hardware and software to IBM. But to Rockwell's dismay, NASA decided to separate the software contract from the orbiter contract. As a result, Rockwell still subcontracted with IBM for the computers, but IBM had a separate software contract monitored closely by the Spacecraft Software Division of the Johnson Space Center. There are several reasons why this division of labor occurred. Since software does not weigh anything in and of itself, it is used to overcome hardware problems that would require extra systems and components (such as a mechanical control system) [82]. Thus software is in many ways the most critical component of the Shuttle, as it ties the other components together. Its importance to the overall program alone justified a separate contract, since it made the contractor directly accountable to NASA. Moreover, during the operations phase, software underwent the most changes, the hardware being essentially fixed [83]. As time went on, Rockwell's responsibilities as [109] prime hardware contractor were phased out, and the shuttles were turned over to an operations group. In late 1983, Lockheed Corporation, not Rockwell, won the competition for the operations contract. By keeping the software contract separate, NASA could develop the code on a continuing basis. There is a considerable difference between changing maintenance mechanics on an existing hardware system and changing software companies on a not yet perfect system because to date the relationships between components in software are much harder to define than those in hardware. Personnel experienced with a specific software system are the best people to maintain it. Lastly, Christopher Kraft of Johnson Space Center and George Low of NASA Headquarters, both highly influential in the manned spacecraft program during the early 1970's, felt that Johnson had the software management expertise to handle the contract directly [84].
 
One of the lessons learned from monitoring Draper Laboratory in the Apollo era was that by having the software development at a remote site (like Cambridge), the synergism of informally exchanged ideas is lost; sometimes it took 3 to 4 weeks for new concepts to filter over85. IBM had a building and several hundred personnel near Johnson because of its Mission Control Center contracts. When IBM won the Shuttle contract, it simply increased its local force.
 
The closeness of IBM to Johnson Space Center also facilitated the ability of NASA to manage the project. The first chief of the Shuttle's software, Richard Parten, observed that the experience of NASA managers made a significant contribution to the success of the programming effort86. Although IBM was a giant in the data processing industry, a pioneer in real time systems, and capable of putting very bright people on a project, the company had little direct experience with avionics software. As a consequence, Rockwell had to supply a lot of information relating to flight control. Conversely, even though Rockwell projects used computers, software development on the scale needed for the Shuttle was outside its experience. NASA Shuttle managers provided the initial requirements for the software and facilitated the exchange of information between the principal contractors. This situation was similar to that during the 1960s when NASA had the best rendezvous calculations people in the world and had to contribute that expertise to IBM during the Gemini software development. Furthermore, the lessons of Apollo inspired the NASA managers to push IBM for quality at every point87.
 
The choice of a high level language for doing the majority of the coding was important because, as Parten noted, with all the changes, "we'd still be trying to get the thing off the ground if we'd used assembly language"88. Programs written in high level languages are far easier to modify. Most of the operating system software, which is rarely changed, is in assembler, but all applications software and some of the interfaces and redundancy management code is in HAL/S89.
 
[110] Although the decision to program in a high-level language meant that a large amount of support software and development tools had to be written, the high-level language nonetheless proved advantageous, especially since it has specific statements created for real-time programming.
 
 
Defining the Shuttle Software
 
 
In the end, the success of the Shuttle's software development was due to the conceptual integrity established by using rigorously maintained requirements documents. The requirements phase is the beginning of the software life cycle, when the actual functions, goals, and user interfaces of the eventual software are determined in full detail. If a team of a thousand workers was asked to set software requirements, chaos would result [90]. On the other hand, if few do the requirements but many can alter them later, then chaos would reign again. The strategy of using a few minds to create the software architecture and interfaces and then ensuring that their ideas and theirs alone are implemented, is termed "maintaining conceptual integrity," which is well explained in Frederick P. Brooks' The Mythical Man-Month [91]. As for other possible solutions, Parten says, "the only right answer is the one you pick and make to work" [92].
 
Shuttle requirements documents were arranged in three Levels: A, B, and C, the first two written by Johnson Space Center engineers. John R. Garman prepared the Level A document, which is comprised of a comprehensive description of the operating system, applications programs, keyboards, displays, and other components of the software system and its interfaces. William Sullivan wrote the guidance, navigation and control requirements, and John Aaron, the system management and payload specifications of Level B. They were assisted by James Broadfoot and Robert Ernull [93]. Level B requirements are different in that they are more detailed in terms of what functions are executed when and what parameters are needed [94]. The Level Bs also define what information is to be kept in COMPOOLS, which are HAL/S structures for maintaining common data accessed by different tasks [95]. The Level C requirements were more of a design document, forming a set with Level B requirements, since each end item at Level C must be traceable to a Level B requirement [96]. Rockwell International was responsible for the development of the Level C requirements as, technically, this is where the contractors take over from the customer, NASA, in developing the software.
 
Early in the program, however, Draper Laboratory had significant influence on the software and hardware systems for the Shuttle. Draper was retained as a consultant by NASA and contributed two [111] key items to the software development process. The first was a document that "taught" NASA and other contractors how to write requirements for software, how to develop test plans, and how to use functional flow diagrams, among other tools [97]. It seems ironic that Draper was instructing NASA and IBM on such things considering its difficulties in the mid-1960s with the development of the Apollo flight software. It was likely those difficult experiences that helped motivate the MIT engineers to seriously study software techniques and to become, within a very short time, one of the leading centers of software engineering theory. The Draper tutorial included the concept of highly modular software, software that could be "plugged into" the main circuits of the Shuttle. This concept, an application of the idea of interchangeable parts to software, is used in many software systems today, one example being the UNIX*** operating system developed at Bell Laboratories in the 1970s, under which single function software tools can be combined to perform a large variety of functions.
 
Draper's second contribution was the actual writing of some early Level C requirements as a model98. This version of the Level C documents contained the same components as in the later versions delivered by Rockwell to IBM for coding. Rockwell's editions, however, were much more detailed and complete, reflecting their practical, rather than theoretical purpose and have been an irritation for IBM. IBM and NASA managers suspect that Rockwell, miffed when the software contract was taken away from them, may have delivered incredibly precise and detailed specifications to the software contractor. These include descriptions of flight events for each major portion of the software, a structure chart of tasks to be done by the software during that major segment, a functional data flowchart, and, for each module, its name, calculations, and operations to be performed, and input and output lists of parameters, the latter already named and accompanied by a short definition, source, precision, and what units each are in. This is why one NASA manager said that "you can't see the forest for the trees" in Level C, oriented as it is to the production of individual modules99. One IBM engineer claimed that Rockwell went "way too far" in the Level C documents, that they told IBM too much about how to do things rather than just what to do100. He further claimed that the early portion of the Shuttle development was "underengineered" and that Rockwell and Draper included some requirements that were not passed on by NASA. Parten, though, said that all requirements documents were subject to regular review by joint teams from NASA and Rockwell101.
 
The impression one gains from documents and interviews is that both Rockwell and IBM fell victim to the "not invented here" [112] syndrome: If we didn't do it, it wasn't done right. For example, Rockwell delivered the ascent requirements, and IBM coded them to the letter, thereby exceeding the available memory by two and a third times and demonstrating that the requirements for ascent were excessive. Rockwell, in return, argued for 2 years about the nature of the operating system, calling for a strict time-sliced system, which allocates predefined periods of time for the execution of each task and then suspends tasks unfinished in that time period and moves on to the next one. The system thus cycles through all scheduled tasks in a fixed period of time, working on each in turn. Rockwell's original proposal was for a 40-millisecond cycle with synchronization points at the end of each [102]. IBM, at NASA's urging, countered with a priority-interrupt-driven system similar to the one on Apollo. Rockwell, experienced with time-slice systems, fought this from 1973 to 1975, convinced it would never work [103].
 
The requirements specifications for the Shuttle eventually contained in their three levels what is in both the specification and design stage of the software life cycle. In this sense, they represent a fairly complete picture of the software at an early date. This level of detail at least permitted NASA and its contractors to have a starting point in the development process. IBM constantly points to the number of changes and alterations as a continuing problem, partially ameliorated by implementing the most mature requirements first104. Without the attempt to provide detail at an early date, IBM would not have had any mature requirements when the time came to code. Even now, requirements are being changed to reflect the actual software, so they continue to be in a process of maturation. But early development of specifications became the means by which NASA could enforce conceptual integrity in the shuttle software.
 
 
Architecture of the Primary Avionics Software System
 
 
The Primary Avionics Software System, or PASS, is the software that runs in all the Shuttle's four primary computers. PASS is divided into two parts: system software and applications software. The system software is the Flight Computer Operating System (FCOS), the user interface programming, and the system control programs, whereas the applications software is divided into guidance, navigation and control, orbiter systems management, payload and checkout programs. Further divisions are explained in Box 4-3.
 

[113] Box 4-3: Structure of PASS Applications Software
 
The PASS guidance and navigation software is divided into major functions, dictated by mission phases, the most obvious of which are preflight, ascent, on-orbit, and descent. The requirements state that these major functions be called OPS, or operational sequences. (e.g., OPS-1 is ascent; OPS-3, descent.) Within the OPS are major modes. In OPS-1, the first-stage burn, second-stage burn, first orbital insertion burn, second orbital insertion burn, and the initial on-orbit coast are major modes; transition between major modes is automatic. Since the total mission software exceeds the capacity of the memory, OPS transitions are normally initiated by the crew and require the OPS to be loaded from the MMU. This caused considerable management concern over the preservation of data, such as the state vector, needed in more than one OPS105. NASA's solution is to keep common data in a major function base, which resides in memory continuously and is not overlaid by new OPS being read into the computers.
 
Within each OPS, there are special functions (SPECs) and display functions (DISPs). These are available to the crew as a supplement to the functions being performed by the current OPS. For example, the descent software incorporates a SPEC display showing the horizontal situation as a supplement to the OPS display showing the vertical situation. This SPEC is obviously not available in the on-orbit OPS. A DISP for the on-orbit OPS may show fuel cell output levels, fuel reserves in the orbital maneuvering system, and other such information. SPECs usually contain items that can be selected by the crew for execution. DISPs are just what their name means, displays and not action items. Since SPECs and DISPs have lower priority than OPS, when a big OPS is in memory they have to be kept on the tape and rolled in when requested106. The actual format of the SPECs, DISPs, OPS displays, and the software that interprets crew entries on the keyboard is in the user interface portion of the system software.



The most critical part of the system software is the FCOS. NASA, Rockwell, and IBM solved most of the grand conceptual problems, such as the nature of the operating system and the redundancy management scheme, by 1975. The first task was to convert the FCOS from the proposed 40-millisecond loop operating system to a priority-driven [113] system107. Priority interrupt systems are superior to time-slice systems because they degrade gracefully when overloaded108. In a time-slice system, if the tasks scheduled in the current cycle get bogged down by excessive I/O operations, they tend to slow down the total time of execution of processes. IBM's version of the FCOS actually has cycles, but they are similar to the ones in the Skylab system described in the previous chapter. The minor cycle is the high-frequency cycle; tasks within it are scheduled every 40 milliseconds. Typical tasks in this cycle are those related to active flight control in the atmosphere. The major cycle is 960 milliseconds, and many monitoring and system management tasks are scheduled at that frequency109. If a process is still running when its time to.....
 
 
 

[114] Figure 4-6. A block diagram of the Shuttle flight computer software architecture. (From NASA, Data Processing System Workbook)
 
.....restart comes up due to excessive I/O or because it was interrupted, it cancels its next cycle and finishes up110. If a higher priority process is called when another process is running, then the current process is interrupted and a program status word (PSW) containing such items as the address of the next instruction to be executed is stored until the interruption is satisfied. The last instruction of an interrupt is to restore the old PSW as the current PSW so that the interrupted process can continue111. The ability to cancel processes and to interrupt them asynchronously provides flexibility that a strict time-slice system does not.
 
A key requirement of the FCOS is to handle the real-time statements in the HAL/S language. The most important of these are SCHEDULE, which establishes and controls the frequency of execution of processes; TERMINATE and CANCEL, which are the opposite of SCHEDULE; and WAIT, which conditionally suspends execution112. The method of implementing these statements is controlled [115] by a separate interface control document113. SCHEDULE is generally programmed at the beginning of each operational sequence to set up which tasks are to be done in that software segment and how often they are to be done. The syntax of SCHEDULE permits the programmer to assign a frequency and priority to each task. TERMINATE and CANCEL are used at the end of software phases or to stop an unneeded process while others continue. For example, after the solid rocket boosters burn out and separate, tasks monitoring them can cease while tasks monitoring the main engines continue to run. WAIT, although handy, is avoided by IBM because of the possibility of the software being "hung up" while waiting for the I/O or other condition required to continue the process114. This is called a race condition or "deadly embrace" and is the bane of all shared resource computer operating systems.
 
The FCOS and displays occupy 35K of memory at all times115. Add the major function base and other resident items, and about 60K of the 106K of core remains available for the applications programs. Of the required applications programs, ascent and descent proved the most troublesome. Fully 75% of the software effort went into those two programs116. After the first attempts at preparing the ascent software resulted in a 140K load, serious code reduction began. By 1978, IBM reduced the size of the ascent program to 116K, but NASA Headquarters demanded it be further knocked down to 80K117. The lowest it ever got was 98,840 words (including the system software), but its size has since crept back up to nearly the full capacity of the memory. IBM accomplished the reduction by moving functions that could wait until later operational sequences118. The actual figures for the test flight series programs are in Table 4-1119. The total size of the flight test software was 500,000 words of code. Producing it and modifying it for later missions required the development of a complete production facility.
 
[116] TABLE 4-1: Sizes of Software Loads in PASS
NAME                         K WORDS
Preflight initialization     72.4
Preflight checkout           81.4
Ascent and abort             105.2
On-orbit                     83.1
On-orbit checkout            80.3
On-orbit system management   84.1
Entry                        101.1
Mass memory utility          70.1
Note: Payload and rendezvous software was added later during the operations phase.
 
 
 
 
Implementing PASS
 
 
NASA planned that PASS would be a continuing development process. After the first flight programs were produced, new functions needed to be added and adapted to changing payload and mission requirements. For instance, over 50% of PASS modules changed during the first 12 flights in response to requested enhancements120. To do this work, NASA established a Software Development Laboratory at Johnson Space Center in 1972 to prepare for the implementation of the Shuttle programs and to make the software tools needed for efficient coding and maintenance. The Laboratory evolved into the Software Production Facility (SPF) in which the software development is carried on in the operations era. Both the facilities were equipped and managed by NASA but used largely by contractors.
 
The concept of a facility dedicated to the production of onboard software surfaced in a Rand Corporation memo in early 1970121. The memo summarized a study of software requirements for Air Force space missions during the decade of the 1970s. One reason for a government-owned and operated software factory was that it would be easier to establish and maintain security. Most modules developed for [117] the Shuttle, such as the general flight control software and memory displays, would be unclassified. However, Department of Defense (DoD) payloads require system management and payload management software, plus occasional special maneuvering modules. These were expected to be classified. Also, if the software maintenance contract moved from the original prime contractor to some different operations contractor, it would be considerably simpler to accomplish the transfer if the software library and development computers were government owned and on government property. Lastly, having such close control over existing software and new development would eliminate some of the problems in communication, verification, and maintenance encountered in the three previous manned programs.
 
Developing the SPF turned out to be as large a task as developing the flight software itself. During the mid-1970s, IBM had as many people doing software for the development lab as they had working on PASS122. The ultimate purpose of the facility is to provide a programming team with sufficient tools to prepare a software load for a flight. This software load is what is put on to the MMU tape that is flown on the spacecraft. In the operations era of the 1980s, over 1,000 compiled modules are available. These are fully tested, and often previously used, versions of tasks such as main engine throttling, memory modification, and screen displays that rarely change from flight to flight. New, mission-specific modules for payloads or rendezvous maneuvers are developed and tested using the SPF's programming tools, which themselves represent more than a million lines of code123. The selection of existing modules and the new modules are then combined into a flight load that is subject to further testing. NASA achieved the goal of having such an efficient software production system through an 8-year development process when the SPF was still the Laboratory.
 
In 1972, NASA studied what sort of equipment would be required for the facility to function properly. Large mainframe computers compatible with the AP-101 instruction set were a must. Five IBM 360/75 computers, released from Apollo support functions, were available124. These were the development machines until January of 1982125. Another requirement was for actual flight equipment on which to test developed modules. Three AP-101 computers with associated display electronics units connected to the 360s with a flight equipment interface device (FEID) especially developed for the purpose. Other needed components, such as a 6-degree-of-freedom flight simulator, were implemented in software126. The resulting group of equipment is capable of testing the flight software by interpreting instructions, simulating functions, and running it in the actual flight hardware127.
 
In the late 1970s, NASA realized that more powerful computers were needed as the transition was made from development to operations. The 360s filled up, so NASA considered the Shuttle Mission [118] Simulator (SMS), the Shuttle Avionics Instrumentation Lab (SAIL), and the Shuttle Data Processing Center's computers as supplementary development sites, but this idea was rejected because they were all too busy doing their primary functions128. In 1981, the Facility added two new IBM 3033N computers, each with 16 million bytes of primary memory. The SPF then consisted of those mainframes, the three AP-101 computers and the interface devices for each, 20 magnetic tape drives, six line printers, 66 million bytes of drum memory, 23.4 billion bytes of disk memory, and 105 terminals129. NASA accomplished rehosting the development software to the 3033s from the 360s during the last quarter of 1981. Even this very large computer center was not enough. Plans at the time projected on-line primary memory to grow to 100 million bytes130, disk storage to 160 billion bytes131, and two more interface units, display units, and AP-101s to handle the growing DOD business132. Additionally, terminals connected directly to the SPF are in Cambridge, Massachusetts, and at Goddard Space Flight Center, Marshall Space Flight Center, Kennedy Space Center, and Rockwell International in Downey, California133.
 
Future plans for the SPF included incorporating backup system software development, then done at Rockwell, and introducing more automation. NASA managers who experienced both Apollo and the Shuttle realize that the operations software preparation is not enough to keep the brightest minds sufficiently occupied. Only a new project can do that. Therefore, the challenge facing NASA is to automate the SPF, use more existing modules, and free people to work on other tasks. Unfortunately, the Shuttle software still has bugs, some of which are no fault of the flight software developers, but rather because all the tools used in the SPF are not yet mature. One example is the compiler for HAL/S. Just days before the STS-7 flight, in June, 1983, an IBM employee discovered that the latest release of the compiler had a bug in it. A quick check revealed that over 200 flight modules had been modified and recompiled using it. All of those had to be checked for errors before the flight could go. Such problems will continue until the basic flight modules and development tools are no longer constantly subject to change. In the meantime, the accuracy of the Shuttle software is dependent on the stringent testing program conducted by IBM and NASA before each flight.
 
 
Verification and Change Management of the Shuttle Software
 
 
IBM established a separate line organization for the verification of the Shuttle software. IBM's overall Shuttle manager has two managers reporting to him, one for design and development, and one for verification and field operations. The verification group has just [119] less than half the members of the development group and uses 35% of the software budget134. There are no managerial or personnel ties to the development group, so the test team can adopt an "adversary relationship" with the development team. The verifiers simply assume that the software is untested when received135. In addition, the test team can also attempt to prove that the requirements documents are wrong in cases where the software becomes unworkable. This enables them to act as the "conscience" of the entire project136.
 
IBM began planning for the software verification while the requirements were being completed. By starting verification activity as the software took shape, the test group could plan its strategy and begin to write its own books. The verification documentation consists of test specifications and test procedures including the actual inputs to be used and the outputs expected, even to the detail of showing the content of the CRT screens at various points in the test137. The software for the first flight had to survive 1,020 of these tests138. Future flight loads could reuse many of the test cases, but the preparation of new ones is a continuing activity to adjust to changes in the software and payloads, each of which must be handled in an orderly manner.
 
Suggestions for changes to improve the system are unusually welcome. Anyone (astronaut, flight trainer, IBM programmer, or NASA manager) can submit a change request139. NASA and IBM were processing such requests at the rate of 20 per week in 1981140. Even as late as 1983 IBM kept 30 to 40 people on requirements analysis, or the evaluation of requests for enhancements141. NASA has a corresponding change evaluation board. Early in the program, it was chaired by Howard W. Tindall, the Apollo software manager, who by then was head of the Data Systems and Analysis Directorate. This turned out to be a mistake, as he had conflicting interests142, so the change control board moved to the Shuttle program office. Due to the careful review of changes, it takes an average of 2 years for a new requirement to be implemented, tested, and fielded143. Generally, requests for extra functions that would push out current software due to memory restrictions are turned down144.
 



Box 4-4: How IBM Verifies the Shuttle Flight Software
 
The Shuttle software verification process actually begins before the test group gets the software, in the sense that the development organization conducts internal code reviews and unit tests of individual modules and then integration tests of groups of modules as they are assembled into a software load. There are two levels of code inspection, or "eyeballing" the software looking for logic errors. One level of inspection is by the coders themselves and their peer reviewers. The second level is done by the outside verification team. This activity resulted in over 50% of the discrepancy reports (failures of the software to meet the specification) filed against the software, a percentage similar to the Apollo experience and one that reinforces the value of inspections145. When the software is assembled, it is subject to the First Article Configuration Inspection (FACI), where it is reviewed as a complete unit for the first time. It then passes to the outside verification group.
 
Because of the nature of the software as it is delivered, the verification team concentrates on proving that it meets the customer's requirements and that it functions at an acceptable level of performance. Consistent with the concept that the software is assumed untested, the verification group can go into as much detail as time and cost allow. Primarily, the test group concentrates on single software loads, such as ascent, on-orbit, and so forth146. To facilitate this, it is divided into teams that specialize in the operating system and detailed, or functional, verification; teams that work on guidance, navigation, and control; and teams that certify system performance. These groups have access to the software in the SPF, which thus doubles as a site for both development and testing. Using tools available in the SPF, the verification teams can use the real flight computers for their tests (the preferred method). The testers can freeze the execution of software on those machines in order to check intermediate results, alter memory, and even get a log of what commands resulted in response to what inputs147.
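This freeze-inspect-alter capability is, in today's vocabulary, an interactive debugger wrapped around the flight computers. The toy harness below suggests the flavor; the instruction format, memory model, and values are invented for illustration and do not model the AP-101 or the actual SPF tools.

# Toy analogue of the SPF capabilities described above: step a program, freeze
# it at a chosen point, inspect and patch "memory", and log every command and
# response. Purely illustrative; not a model of the real flight computers.

class TestHarness:
    def __init__(self, program, memory):
        self.program = program      # list of ("add", (name, delta)) instructions
        self.memory = dict(memory)  # variable name -> value
        self.pc = 0                 # program counter
        self.log = []               # (command, response) pairs

    def step(self):
        op, (name, delta) = self.program[self.pc]
        if op == "add":
            self.memory[name] += delta
        self.pc += 1
        self.log.append(("step", dict(self.memory)))

    def run_until(self, breakpoint_pc):
        # Freeze execution at a chosen instruction to check intermediate results.
        while self.pc < breakpoint_pc:
            self.step()

    def peek(self, name):
        self.log.append(("peek " + name, self.memory[name]))
        return self.memory[name]

    def poke(self, name, value):
        # Alter memory while the program is frozen.
        self.memory[name] = value
        self.log.append(("poke " + name, value))

program = [("add", ("altitude", 100)), ("add", ("altitude", 250))]
h = TestHarness(program, {"altitude": 0})
h.run_until(1)              # freeze after the first instruction
print(h.peek("altitude"))   # 100
h.poke("altitude", 90)      # inject a test value
h.step()                    # resume one instruction
print(h.peek("altitude"))   # 340 (90 + 250)
print(len(h.log), "logged commands and responses")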
 
After the verification group has passed the software, it is given an official Configuration Inspection and turned over to NASA. At that point NASA assumes configuration control, and any changes must be approved through Agency channels. Even though NASA then has the software, IBM is not finished with it148.
 
The software is usually installed in the SAIL for prelaunch, ascent, and abort simulations, the Flight Simulation Lab (FSL) in Downey for orbit, de-orbit, and entry simulations, and the SMS for crew training. Although these installations are not part of the preplanned verification process, the discrepancies noted by the users of the software in the roughly 6 months before launch help complete the testing in a real environment. Due to the nature of real-time computer systems, however, the software can never be fully certified, and both IBM and NASA are aware of this149. There are simply too many interfaces and too many opportunities for asynchronous input and output.
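A back-of-envelope count, not taken from the source, shows why exhaustive certification of asynchronous behavior is hopeless: the number of distinct orderings of even a few independent event streams is a multinomial coefficient that grows explosively.

# Illustrative calculation only: distinct interleavings of `streams`
# independent event streams of n events each is (streams*n)! / (n!)**streams.

from math import factorial

def interleavings(streams, events_per_stream):
    n = events_per_stream
    return factorial(streams * n) // factorial(n) ** streams

for k in (2, 3, 4):
    print(k, "streams of 5 events each:", interleavings(k, 5))
# prints 252, then 756756, then 11732745024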



Discrepancy reports cause changes in software to make it match the requirements. Early in the program, the software found its way into the simulators with less verification because the simulators depend on software just to be turned on. At that time, the majority of the discrepancy reports were from the field installations. Later, the majority turned up in the SPF150. All discrepancy reports are formally disposed of, either by appropriate fixes to the software or by waiver. Richard Parten said, "Sometimes it is better to put in an 'OPS Note' or waiver than to fix (the software). We are dependent on smart pilots"151. If a discrepancy is noted too close to a flight, or if it would require too much expense to fix, it can be waived as long as there is no immediate danger to crew safety. Each Flight Data File carried on board lists the most important current exceptions of which the crew must be aware. By STS-7 in June of 1983, over 200 pages of such exceptions and their descriptions existed152. Some will never be fixed, but the majority were addressed during the Shuttle launch hiatus following the 51L accident in January 1986.
 
So, despite the well-planned and well-manned verification effort, software bugs exist. Part of the reason is the complexity of the real-time system, and part is because, as one IBM manager said, "we didn't do it up front enough," the "it" being thinking through the program logic and verification schemes153. Aware that effort expended at the early part of a project on quality would be much cheaper and simpler than trying to put quality in toward the end, IBM and NASA tried to do much more at the beginning of the Shuttle software development than in any previous effort, but it still was not enough to ensure perfection.




Box 4-5: The Nature of the Backup Flight System
 
The Backup Flight System consists of a single computer and a software load that contains sufficient functions to handle ascent to orbit, selected aborts during ascent, and descent from orbit to landing site. In the interest of avoiding a generic software failure, NASA kept its development separate from PASS. An engineering directorate, not the on-board software division, managed the software contract for the backup, won by Rockwell154.
 
The major functional difference between PASS and the backup is that the latter uses a time-slice operating system rather than the asynchronous priority-driven system of PASS155. This is consistent with Rockwell's opinion on how that system was to be designed. Ironically, since the backup must listen in on PASS operations so as to be ready for instant takeover, PASS had to be modified to make it more synchronous156. Sixty engineers were still working on the Backup Flight System software as late as 1983157.
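For readers unfamiliar with the distinction, the sketch below contrasts the two scheduling styles in miniature: a time-slice executive gives every task a fixed turn in a repeating cycle, while a priority-driven executive always runs the highest-priority task with work pending. The task names, priorities, and periods are invented for illustration and bear no relation to the actual PASS or Backup Flight System designs.

# Toy contrast of the two scheduling styles; illustrative assumptions only.
# Each task is (name, priority, period): it requests one unit of work every
# `period` ticks.

def time_slice(tasks, ticks):
    # Fixed rotation: every task gets its slot in turn, regardless of priority.
    return [tasks[t % len(tasks)][0] for t in range(ticks)]

def priority_driven(tasks, ticks):
    # At every tick, run the highest-priority task with work pending.
    pending = {name: 0 for name, _, _ in tasks}
    schedule = []
    for t in range(ticks):
        for name, _, period in tasks:
            if t % period == 0:
                pending[name] += 1        # the task becomes ready again
        ready = [(prio, name) for name, prio, _ in tasks if pending[name] > 0]
        if ready:
            _, chosen = max(ready)
            pending[chosen] -= 1
            schedule.append(chosen)
        else:
            schedule.append("idle")
    return schedule

# Hypothetical tasks: (name, priority, period in ticks).
tasks = [("guidance", 3, 2), ("display", 1, 3), ("telemetry", 2, 4)]
print(time_slice(tasks, 8))       # equal turns: guidance, display, telemetry, ...
print(priority_driven(tasks, 8))  # guidance runs whenever ready; display waits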
 

