Wednesday, November 23, 2016

Statistical Mistakes

http://www.cs.cornell.edu/~asampson/blog/statsmistakes.html


Computer scientists in systemsy fields, myself included, aren’t great at using statistics. Maybe it’s because there are so many other potential problems with empirical evaluations that solid statistical reasoning doesn’t seem that important. Other subfields, like HCI and machine learning, have much higher standards for data analysis. Let’s learn from their example.
Here are three kinds of avoidable statistics mistakes that I notice in published papers.

No Statistics at All

The most common blunder is not using statistics at all when your paper clearly uses statistical data. If your paper uses the phrase “we report the average time over 20 runs of the algorithm,” for example, you should probably use statistics.
Here are two easy things that every paper should do when it deals with performance data or anything else that can randomly vary:
First, plot the error bars. In every figure that represents an average, compute the standard error of the mean or just the plain old standard deviation and add little whiskers to each bar. Explain what the error bars mean in the caption.
[Figure: three example bar charts with error bars. (a) Just noise. (b) Meaningful results. (c) Who knows???]
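Computing those whiskers takes only a couple of lines. Here is a minimal sketch (my own illustration, not from this post, with made-up timing numbers) using NumPy and matplotlib:
# A minimal sketch: bar chart with standard-error whiskers over made-up timings.
import numpy as np
import matplotlib.pyplot as plt

runs = {
    "baseline":   np.array([12.1, 11.8, 12.6, 12.3, 11.9]),  # hypothetical seconds
    "our system": np.array([10.2, 10.9, 10.4, 10.7, 10.1]),
}
names = list(runs)
means = [runs[n].mean() for n in names]
# Standard error of the mean: sample standard deviation (ddof=1) over sqrt(N).
sems = [runs[n].std(ddof=1) / np.sqrt(len(runs[n])) for n in names]

plt.bar(names, means, yerr=sems, capsize=4)
plt.ylabel("running time (s)")
plt.title("Error bars show the standard error of the mean")
plt.savefig("errorbars.png")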
Second, do a simple statistical test. If you ever say “our system’s average running time is X seconds, which is less than the baseline running time of Y seconds,” you need to show that the difference is statistically significant. Statistical significance tells the reader that the difference you found was more than just “in the noise.”
For most CS papers I read, a really basic test will work: Student’s t-test checks that two averages that look different actually are different. The process is easy. Collect some N samples from the two conditions, compute the mean X̄ and the standard deviation s for each, and plug them into this formula:
t = \frac{\overline{X}_1 - \overline{X}_2}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}}
then plug that t into the cumulative distribution function of the t-distribution to get a p-value. If your p-value is below a threshold α that you chose ahead of time (0.05 or 0.01, say), then you have a statistically significant difference. Your favorite numerical library probably already has an implementation that does all the work for you.
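As a concrete illustration, here is a minimal sketch with hypothetical timing samples; SciPy’s unequal-variance t-test computes the same statistic as the formula above, plus the p-value:
# A minimal sketch with hypothetical timing samples (not from the post).
from scipy import stats

ours     = [10.2, 10.9, 10.4, 10.7, 10.1, 10.5]
baseline = [12.1, 11.8, 12.6, 12.3, 11.9, 12.4]

t, p = stats.ttest_ind(ours, baseline, equal_var=False)
alpha = 0.05  # chosen before looking at the data
print(f"t = {t:.2f}, p = {p:.4f}")
if p < alpha:
    print("statistically significant difference")
else:
    print("no conclusion: we failed to reject the null hypothesis")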
If you’ve taken even an intro stats course, you know all this already! But you might be surprised to learn how many computer scientists don’t. Program committees don’t require that papers use solid statistics, so the literature is full of statistics-free but otherwise-good papers, so standards remain low, and Prof. Ouroboros keeps drawing figures without error bars. Other fields are moving beyond the p-value, and CS isn’t even there yet.

Failure to Reject = Confirmation

When you do use a statistical test in a paper, you need to interpret its results correctly. When your test produces a p-value, here are the correct interpretations:
  • If p < α: The difference between our average running time and the baseline’s average running time is statistically significant. Pedantically, we reject the null hypothesis that says that the averages might be the same.
  • Otherwise, if p ≥ α: We conclude nothing at all. Pedantically, we fail to reject that null hypothesis.
It’s tempting to think, when p ≥ α, that you’ve found the opposite thing from the p < α case: that you get to conclude that there is no statistically significant difference between the two averages. Don’t do that!
Simple statistical tests like the t-test only tell you when averages are different; they can’t tell you when they’re the same. When they fail to find a difference, there are two possible explanations: either there is no difference or you haven’t collected enough data yet. So when a test fails, it could be your fault: if you had run a slightly larger experiment with a slightly larger N, the test might have successfully found the difference. It’s always wrong to conclude that the difference does not exist.
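To see why, here is a minimal simulation sketch (made-up numbers, my own illustration): the same true 0.2-second speedup is usually invisible to the t-test at N = 5 samples per condition, but is almost always detected at N = 1000.
# A minimal sketch: the same true difference, tested at two sample sizes.
import numpy as np
from scipy import stats

rng = np.random.default_rng()
for n in (5, 1000):
    ours     = rng.normal(loc=9.8,  scale=1.0, size=n)   # true mean 9.8 s
    baseline = rng.normal(loc=10.0, scale=1.0, size=n)   # true mean 10.0 s
    _, p = stats.ttest_ind(ours, baseline, equal_var=False)
    print(f"N = {n:4d}: p = {p:.3f}")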
If you want to claim that two means are equal, you’ll need to use a different test where the null hypothesis says that they differ by at least a certain amount. For example, an appropriate one-tailed t-test will do.
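One standard recipe built from such one-tailed tests is the “two one-sided tests” (TOST) equivalence procedure. The sketch below is my own illustration, not from this post; delta is a hypothetical margin for how large a difference you are still willing to call “the same.”
# A TOST equivalence sketch: two one-sided Welch t-tests against a margin delta.
import numpy as np
from scipy import stats

def equivalent(a, b, delta, alpha=0.05):
    a, b = np.asarray(a, float), np.asarray(b, float)
    d = a.mean() - b.mean()
    va, vb = a.var(ddof=1) / len(a), b.var(ddof=1) / len(b)
    se = np.sqrt(va + vb)
    # Welch-Satterthwaite degrees of freedom.
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    p_lower = stats.t.sf((d + delta) / se, df)   # H0: the difference is <= -delta
    p_upper = stats.t.cdf((d - delta) / se, df)  # H0: the difference is >= +delta
    # Only if both one-sided nulls are rejected do we call the means equivalent.
    return max(p_lower, p_upper) < alpha

print(equivalent([10.1, 10.3, 10.2, 10.4], [10.2, 10.3, 10.1, 10.5], delta=0.5))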

The Multiple Comparisons Problem

In most ordinary evaluation sections, it’s probably enough to use only a handful of statistical tests to draw one or two bottom-line conclusions. But you might find yourself automatically running an unbounded number of comparisons. Perhaps you have n benchmarks, and you want to compare the running time on each one to a corresponding baseline with a separate statistical test. Or maybe your system works in a feedback loop: it tries one strategy, performs a statistical test to check whether the strategy worked, and starts over with a new strategy otherwise.
Repeated statistical tests can get you into trouble. The problem is that every statistical test has a probability of lying to you. The probability that any single test is wrong is small, but if you do lots of tests, the probability amplifies quickly.
For example, say you choose α = 0.05 and run one t-test. When the test succeeds—when it finds a significant difference—it’s telling you that there’s at most an α chance that the difference arose from random chance. In 95 out of 100 parallel universes, your paper found a difference that actually exists. I’d take that bet.
Now, say you run a series of n tests in the scope of one paper. Then every test has an α chance of going wrong. The chance that your paper has more than k errors in it is given by the binomial distribution:
1 - \sum_{i=0}^{k} {n \choose i} \alpha^i (1-\alpha)^{n-i}
which grows exponentially with the number of tests, n. If you use just 10 tests with α = 0.05, for example, your chance of having one test go wrong grows to 40%. If you do 100, the probability is above 99%. At that point, it’s a near certainty that your paper is misreporting some result.
(To compute these probabilities yourself, set k = 0 so you get the chance of at least one error. Then the CDF above simplifies down to 1 − (1 − α)^n.)
This pitfall is called the multiple comparisons problem. If you really need to run lots of tests, all is not lost: there are standard ways to compensate for the increased chance of error. The simplest is the Bonferroni correction, where you reduce your per-test α to α/n to preserve an overall α chance of going wrong.
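For a quick back-of-the-envelope check (my own sketch, not from the post), you can print both the family-wise error rate and the Bonferroni-corrected threshold:
# Family-wise error rate 1 - (1 - alpha)^n and Bonferroni per-test threshold.
alpha = 0.05
for n in (1, 10, 100):
    p_any = 1 - (1 - alpha) ** n
    print(f"n = {n:3d}: P(at least one bogus result) = {p_any:.1%}, "
          f"Bonferroni per-test alpha = {alpha / n:.4f}")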

Thursday, August 18, 2016

Using Drupal's autocomplete on a custom form element

http://www.jochenhebbrecht.be/site/2011-01-10/drupal/using-drupals-autocomplete-a-custom-form-element

Did you ever create a custom form? A place where you couldn't use Drupal's Form API?
Imagine the form below, where you want to add an autocomplete feature to the nickname field. The moment you start typing a nickname, a list of matching usernames appears under the textfield.
<form action="/foo">
   ...
   <input type="text" value="Nickname" />
   <input type="submit" value="Submit" />
</form>



Step 1: adjust the HTML code

  • Start from a normal text field
<input id="txt-nickname" type="text" />
  • Add the autocomplete attributes to the text field
<input id="txt-nickname" type="text" class="form-text form-autocomplete text"
       autocomplete="OFF" />
  • Add a hidden field that holds the path of the autocomplete callback. Put this field immediately after the textfield
<input class="autocomplete" type="hidden" id="txt-nickname-autocomplete"
       value="/##path_to_autocomplete##/autocomplete" disabled="disabled" />



Step 2: adjust the Javascript code

  • You need the following JS files to have a fully working version:
    drupal_add_js("misc/autocomplete.js");
    drupal_add_js("misc/ahah.js");
  • If the HTML (see above) is added by AJAX, Drupal's autocomplete will not discover the autocomplete fields. Therefore, execute the following code after the AJAX call has loaded
    Drupal.attachBehaviors($("##CONTEXT##"));

    ##CONTEXT##: a reference to the div which holds the input


Step 3: create an autocomplete PHP function which returns a JSON object

/**
 * Searches all auction users and returns a JSON object of all users
 * @param $search the value to be searched
 * @return JSON object with all users
 */
function autocomplete_users($search = '') {
  // TODO: implement the real search on $search here.

  // DEBUG: hard-coded example results.
  $users['admin {uid:1}'] = 'admin';
  $users['jochen {uid:2}'] = 'jochen';

  return drupal_json($users);
}



Step 4: the result

Saturday, August 6, 2016

A x64 OS #1: UEFI


http://kazlauskas.me/entries/x64-uefi-os-1.html
As a part of the OS project for the university there has been a request to also write up the experiences and challenges encountered. This is the first post of the series on writing an x64 operating system when booting straight from UEFI. Please keep in mind that these posts are written by a not-even-hobbyist and the content in these posts should be taken with a grain of salt.

Kernel and UEFI

I’ve decided to write a kernel targeting x64 as a UEFI application. There are a number of appeals to writing a kernel as a UEFI application as opposed to writing a multiboot kernel. Namely:
  1. For the x86 family of processors, you avoid the work necessary to upgrade from real mode to protected mode and then from protected mode to long mode, which is more commonly known as 64-bit mode. As a UEFI application your kernel gets a fully working x64 environment from the get-go;
  2. Unlike BIOS, UEFI is a well documented firmware. Most of the interfaces provided by BIOS are de facto and you’re lucky if they work at all, while most of those provided by UEFI are de jure and usually just work;
  3. UEFI is extensible, whereas BIOS is not really;
  4. Finally, UEFI is the modern technology that is here to stay, while BIOS is a 40-year-old piece of technology on death row. Learning about soon-to-be-dead technology is a waste of effort.
Despite my strong attachment to the Rust community and Rust’s perfect suitability for kernels [1], I’ll be writing the kernel in C. Mostly because it is unlikely people inside the university will be familiar with Rust, but also because GNU-EFI is a C library and I cannot be bothered to bind it. I’d surely be writing it in Rust were I more serious about the project.

Toolchain

As it turns out, developing an x64 kernel on an x64 host greatly simplifies setting up the build toolchain. I’ll be using:
  • clang to compile the C code (no cross-compiler is necessary [2]!);
  • the gnu-efi library to interact with the UEFI firmware;
  • the qemu emulator to run my kernel; and
  • OVMF as the UEFI firmware.

The UEFI “Hello, world!”

The following snippet of code is all the code you need to print something on the screen as a UEFI application:
// main.c
#include <efi.h>
#include <efilib.h>
#include <efiprot.h>

EFI_STATUS
efi_main (EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
{
    InitializeLib(ImageHandle, SystemTable);
    Print(L"Hello, world from x64!");
    for(;;) __asm__("hlt");
}
However, compiling this code correctly is not as trivial. The following three commands are necessary to produce a working UEFI application:
clang -I/usr/include/efi -I/usr/include/efi/x86_64 -I/usr/include/efi/protocol -fno-stack-protector -fpic -fshort-wchar -mno-red-zone -DHAVE_USE_MS_ABI -c -o src/main.o src/main.c
ld -nostdlib -znocombreloc -T /usr/lib/elf_x86_64_efi.lds -shared -Bsymbolic -L /usr/lib /usr/lib/crt0-efi-x86_64.o src/main.o -o huehuehuehuehue.so -lefi -lgnuefi
objcopy -j .text -j .sdata -j .data -j .dynamic -j .dynsym  -j .rel -j .rela -j .reloc --target=efi-app-x86_64 huehuehuehuehue.so huehuehuehuehue.efi
The clang command is pretty self-explanatory: we tell the compiler where to look for the EFI headers and what to compile into an object file. Probably the most non-trivial option here is the -DHAVE_USE_MS_ABI one – x64 UEFI uses the Windows x64 calling convention, and not the regular C one, thus all arguments in calls to UEFI functions must be passed in a different way than is usually done in C code. Historically this conversion was done by the uefi_call_wrapper wrapper, but clang supports the calling convention natively, and we tell that fact to the gnu-efi library with this option [3].
Then, I manually link my object file and the UEFI-specific C runtime into a shared library using a custom linker script provided by the gnu-efi library. The result is an ELF library about 250KB in size. However, UEFI expects its applications in the PE executable format, so we must convert our library into the desired format with the objcopy command. At this point a huehuehuehuehue.efi file should be produced and the majority of UEFI firmwares should be able to run it.
In practice, I’ve automated these steps, along with a considerably complex sequence of building image files I’ve stolen from OSDev’s tutorial on creating images, into a Makefile. Feel free to copy it in parts or in whole for your own use cases.

UEFI boot and runtime services

A UEFI application has two distinct stages over its lifetime: a stage where the so-called boot services are available, and a stage after these boot services are disabled. A UEFI application will be launched by the UEFI firmware and both boot and runtime services will be available to the application. Most notably, boot services provide APIs for loading other UEFI applications (e.g. implementing bootloaders), handling (allocating and deallocating) memory and using protocols (speaking to other active UEFI applications).
Once the kernel is done with using boot services it calls ExitBootServices, which is a method provided by… a boot service. Past that point only runtime services are available and you cannot ever return to a state where boot services are available except by resetting the system. Managing UEFI variables, the system clock and resetting the system are pretty much the only things you can do with the runtime services.
For my kernel, I will use the graphics output protocol to set up the video frame buffer, exit the boot services and, finally, shut down the machine before reaching the hlt instruction. The following piece of code implements the described sequence. I left some code out; you can see it in full at Gitlab, for example the definition of init_graphics.
EFI_STATUS
efi_main (EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
{
    EFI_STATUS status;
    InitializeLib(ImageHandle, SystemTable);

    // Initialize graphics
    EFI_GRAPHICS_OUTPUT_PROTOCOL *graphics;
    EFI_GUID graphics_proto = EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID;
    status = SystemTable->BootServices->LocateProtocol(&graphics_proto, NULL, (void **)&graphics);
    if(status != EFI_SUCCESS) return status;
    status = init_graphics(graphics);
    if(status != EFI_SUCCESS) return status;

    // Figure out the memory map (should be identity mapping)
    boot_state.memory_map = LibMemoryMap(&boot_state.memory_map_size,
                                         &boot_state.map_key,
                                         &boot_state.descriptor_size,
                                         &boot_state.descriptor_version);
    // Exit the boot services...
    SystemTable->BootServices->ExitBootServices(ImageHandle, boot_state.map_key);
    // and set up the memory map we just found.
    SystemTable->RuntimeServices->SetVirtualAddressMap(boot_state.memory_map_size,
                                                       boot_state.descriptor_size,
                                                       boot_state.descriptor_version,
                                                       boot_state.memory_map);
    // Once we’re done we power off the machine.
    SystemTable->RuntimeServices->ResetSystem(EfiResetShutdown, EFI_SUCCESS, 0, NULL);
    for(;;) __asm__("hlt");
}
Note that some protocols can either be attached to your own EFI_HANDLE or to some other EFI_HANDLE (i.e. the protocol is provided by another UEFI application). The graphics output protocol I’m using here is an example of a protocol attached to another EFI_HANDLE, therefore we use the LocateProtocol boot service to find it. In the off chance the protocol is attached to the application’s own EFI_HANDLE, the HandleProtocol method should be used instead:
EFI_LOADED_IMAGE *loaded_image = NULL;
EFI_GUID loaded_image_protocol = LOADED_IMAGE_PROTOCOL;
EFI_STATUS status = SystemTable->BootServices->HandleProtocol(ImageHandle, &loaded_image_protocol, (void **)&loaded_image);

Next steps

At this point I have a bare bones frame for my awesome kernel called “huehuehuehuehue”. From this point onwards the development of the kernel should not differ much from the traditional development of any other x64 kernel. Next, I’ll be implementing software and hardware interrupts; expect a post on that.

Friday, July 29, 2016

"Reverse Engineering for Beginners" free book


http://beginners.re/#lite

"Reverse Engineering for Beginners" free book

Also known as RE4B. Written by Dennis Yurichev (yurichev.com).

Praise for the book

Also, this book is used at least in:
I've also heard about:
If you know about others, please drop me a note!

Download PDF files

Download the English version: A4 (for browsing or printing), A5 (for ebook readers)
Download the Russian version: A4 (for browsing or printing), A5 (for ebook readers)

PGP Signatures

For those who want to be sure the PDF files have been compiled by me, it's possible to check the PGP signatures, which are: RE4B-EN-A5.pdf.sig, RE4B-EN.pdf.sig, RE4B-RU-A5.pdf.sig, RE4B-RU.pdf.sig.
My PGP public keys are here: http://yurichev.com/pgp.html.

Contents

Topics discussed: x86/x64, ARM/ARM64, MIPS, Java/JVM.
Topics touched: Oracle RDBMS, Itanium, copy-protection dongles, LD_PRELOAD, stack overflow, ELF, win32 PE file format, x86-64, critical sections, syscalls, TLS, position-independent code (PIC), profile-guided optimization, C++ STL, OpenMP, win32 SEH.

Call for translators!

You may want to help me with translating this work into languages other than English and Russian.
Just send me any piece of translated text (no matter how short) and I'll put it into my LaTeX source code.
Korean, Chinese and Persian languages are reserved by publishers.
I do the English and Russian versions myself, but my English is still horrible, so I'm very grateful for any notes about grammar, etc. Even my Russian is flawed, so I'm grateful for notes about the Russian text as well!
So do not hesitate to contact me: dennis(a)yurichev.com

Donors

Those who supported me during the time when I wrote a significant part of the book:
2 * Oleg Vygovsky (50+100 UAH), Daniel Bilar ($50), James Truscott ($4.5), Luis Rocha ($63), Joris van de Vis ($127), Richard S Shultz ($20), Jang Minchang ($20), Shade Atlas (5 AUD), Yao Xiao ($10), Pawel Szczur (40 CHF), Justin Simms ($20), Shawn the R0ck ($27), Ki Chan Ahn ($50), Triop AB (100 SEK), Ange Albertini (€10+50), Sergey Lukianov (300 RUR), Ludvig Gislason (200 SEK), Gérard Labadie (€40), Sergey Volchkov (10 AUD), Vankayala Vigneswararao ($50), Philippe Teuwen ($4), Martin Haeberli ($10), Victor Cazacov (€5), Tobias Sturzenegger (10 CHF), Sonny Thai ($15), Bayna AlZaabi ($75), Redfive B.V. (€25), Joona Oskari Heikkilä (€5), Marshall Bishop ($50), Nicolas Werner (€12), Jeremy Brown ($100), Alexandre Borges ($25), Vladimir Dikovski (€50), Jiarui Hong (100.00 SEK), Jim Di (500 RUR), Tan Vincent ($30), Sri Harsha Kandrakota (10 AUD), Pillay Harish (10 SGD), Timur Valiev (230 RUR), Carlos Garcia Prado (€10), Salikov Alexander (500 RUR), Oliver Whitehouse (30 GBP), Katy Moe ($14), Maxim Dyakonov ($3), Sebastian Aguilera (€20), Hans-Martin Münch (€15), Jarle Thorsen (100 NOK), Vitaly Osipov ($100), Yuri Romanov (1000 RUR), Aliaksandr Autayeu (€10), Tudor Azoitei ($40), Z0vsky (€10), Yu Dai ($10).
Thanks a lot to every donor!

As seen on...

... hacker news, reddit, habrahabr.ru, and a Russian-speaking RE forum. There are some parts translated into Chinese.
The book is also listed on the Goodreads website.

mini-FAQ

Q: I clicked on a hyperlink inside the PDF document; how do I get back?
A: (Adobe Acrobat Reader) Alt + LeftArrow

Q: May I print this book? Use it for teaching?
A: Of course, that's why the book is licensed under Creative Commons terms (CC BY-SA 4.0). Someone may also want to build their own version of the book; read here about it.

Q: Why is this book free? You've done a great job. This is suspicious, as with many other free things.
A: In my own experience, authors of technical literature do this mostly for self-advertisement purposes. It's not possible to earn any decent money from such work.

Q: I have a question...
A: Send it to me by email (dennis(a)yurichev.com).

Supplementary materials

All exercises have been moved to a standalone website: challenges.re.

Be involved!

Feel free to send me corrections. It's even possible to submit patches to the book's source code (LaTeX) on GitHub, BitBucket, or SourceForge!
Do you have any suggestions about what else should be added to the book?
Write me an email: dennis(a)yurichev.com

News

See ChangeLog

Stay tuned!

My current plans for this book: Objective-C, Visual Basic, anti-debugging tricks, Windows NT kernel debugger, .NET, Oracle RDBMS.
Here are also my blog and facebook. Web 2.0 hater? Subscribe to my mailing list to receive updates of this book by email.

About Korean publication

In January 2015, the Acorn publishing company (www.acornpub.co.kr) in South Korea did a huge amount of work translating and publishing my book (in its August 2014 state) in Korean.
It is now available at their website.
The translator is Byungho Min (@tais9).
The cover picture was done by my artist friend Andy Nechaevsky: facebook/andydinka.
They are also the Korean translation copyright holder.
So if you want to have a "real" book in Korean on your shelf and/or want to support my work, you can buy it now.

Tuesday, July 19, 2016

OpenStreetMap Installation with 120GB

http://thinkonbytes.blogspot.co.id/2016/07/your-openstreetmap-server-in-120gb.html

Why code review beats testing: evidence from decades of programming research

https://kev.inburke.com/kevin/the-best-ways-to-find-bugs-in-your-code/

tl;dr If you want to ship high quality code, you should invest in more than one of formal code review, design inspection, testing, and quality assurance. Testing catches fewer bugs per hour than human inspection of code, but it might be catching different types of bugs.
Everyone wants to find bugs in their programs. But which methods are the most effective for finding bugs? I found this remarkable chart in chapter 20 of Steve McConnell’s Code Complete. Summarized here, the chart shows the typical percentage of bugs found using each bug detection technique. The range shows the high and low percentage across software organizations, with the center dot representing the average percentage of bugs found using that technique.
[Chart: typical percentage of defects found by each technique, on a 10-100% scale: regression test, informal code reviews, unit test, new function (component) test, integration test, low-volume beta test (< 10 users), informal design reviews, personal desk checking of code, system test, formal design inspections, formal code inspections, modeling or prototyping, high-volume beta test (> 1000 users).]
As McConnell notes, “The most interesting facts … are that the modal rates don’t rise above 75% for any one technique, and that the techniques average about 40 percent.” Especially noteworthy is the poor performance of testing compared to formal design review (human inspection). There are three pages of fascinating discussion that follow; I’ll try to summarize.

What does this mean for me?

No one approach to bug detection is adequate. Capers Jones – the researcher behind two of the papers McConnell cites – divides bug detection into four categories: formal design inspection, formal code inspection, formal quality assurance, and formal testing. The best bug detection rate, if you are only using one of the above four categories, is 68%. The average detection rate if you are using all four is 99%.

But a less-effective method may be worth it, if it’s cheap enough

It’s well known that bugs found early on in program development are cheaper to fix; costs increase as you have to push changes to more users, someone else has to dig through code that they didn’t write to find the bug, etc. So while a high-volume beta test is highly effective at finding bugs, it may be more expensive to implement this (and you may develop a reputation for releasing highly buggy software).
Shull et al (2002) estimate that non-severe defects take approximately 14 hours of debugging effort after release, but only 7.4 hours before release, meaning that non-critical bugs are twice as expensive to fix after release, on average. However, the multiplier becomes much, much larger for severe bugs: according to several estimates severe bugs are 100 times more expensive to fix after shipping than they are to fix before shipping.
More generally, inspections are a cheaper method of finding bugs than testing; according to Basili and Selby (1987), code reading detected 80 percent more faults per hour than testing, even when testing programmers on code that contained zero comments. This went against the intuition of the professional programmers, which was that structural testing would be the most efficient method.

How did the researchers measure efficiency?

In each case the efficiency was calculated by taking the number of bug reports found through a specific bug detection technique, and then dividing by the total number of reported bugs.
The researchers conducted a number of different experiments to try and measure numbers accurately. Here are some examples:
  • Giving programmers a program with 15 known bugs, telling them to use a variety of techniques to find bugs, and observing how many they find (no one found more than 9, and the average was 5)
  • Giving programmers a specification, measuring how long it took them to write the code, and how many bugs existed in the code base
  • Formal inspections of processes at companies that produce millions of lines of code, like NASA.

In our company we have low bug rates, thanks to our adherence to software philosophy X.

Your favorite software philosophy probably advocates using several different methods listed above to help detect bugs. Pivotal Labs is known for pair programming, which some people rave about, and some people hate. Pair programming means that with little effort they’re getting informal code review, informal design review and personal desk-checking of code. Combine that with any kind of testing and they are going to catch a lot of bugs.
Before any code at Google gets checked in, one owner of the code base must review and approve the change (formal code review and design inspection). Google also enforces unit tests, as well as a suite of automated tests, fuzz tests, and end to end tests. In addition, everything gets dogfooded internally before a release to the public (high-volume beta test).
It’s likely that any company with a reputation for shipping high-quality code will have systems in place to test at least three of the categories mentioned above for software quality.

Conclusions

If you want to ship high quality code, you should invest in more than one of formal code review, design inspection, testing, and quality assurance. Testing catches fewer bugs per hour than human inspection of code, but it might be catching different types of bugs. You should definitely try to measure where you are finding bugs in your code and the percentage of bugs you are catching before release – 85% is poor, and 99% is exceptional.

Appendix

The chart was created using the jQuery plotting library flot. Here’s the raw Javascript, and the CoffeeScript that generated the graph.
References

Computers in Spaceflight: The NASA Experience

http://history.nasa.gov/computers/Ch4-5.html


- Chapter Four -
- Computers in the Space Shuttle Avionics System -
 
Developing software for the space shuttle
 
 
[108] During 1973 and 1974 the first requirements began to be specified for what has become one of the most interesting software systems ever designed. It was obvious from the very beginning that developing the Shuttle's software would be a complicated job. Even though NASA engineers estimated the size of the flight software to be smaller than that on Apollo, the ubiquitous functions of the Shuttle computers meant that no one group of engineers and no one company could do the software on its own. This increased the size of the task because of the communication necessary between the working groups. It also increased the complexity of a spacecraft already made complex by flight requirements and redundancy. Besides these realities, no one could foresee the final form that the software for this pioneering vehicle would take, even after years of development work had elapsed, since there continued to be both minor and major changes. NASA and its contractors made over 2,000 requirements changes between 1975 and the first flight in 1981 [80]. As a result, about $200 million was spent on software, as opposed to an initial estimate of $20 million. Even so, NASA lessened the difficulties by making several early decisions that were crucial for the program's success. NASA separated the software contract from the hardware contract, closely managed the contractors and their methods, chose a high-level language, and maintained conceptual integrity.
 
NASA awarded IBM Corporation the first independent Shuttle software contract on March 10, 1973. IBM and Rockwell International had worked together during the period of competition for the orbiter contract [81]. Rockwell bid on the entire aerospacecraft, intending to subcontract the computer hardware and software to IBM. But to Rockwell's dismay, NASA decided to separate the software contract from the orbiter contract. As a result, Rockwell still subcontracted with IBM for the computers, but IBM had a separate software contract monitored closely by the Spacecraft Software Division of the Johnson Space Center. There are several reasons why this division of labor occurred. Since software does not weigh anything in and of itself, it is used to overcome hardware problems that would require extra systems and components (such as a mechanical control system) [82]. Thus software is in many ways the most critical component of the Shuttle, as it ties the other components together. Its importance to the overall program alone justified a separate contract, since it made the contractor directly accountable to NASA. Moreover, during the operations phase, software underwent the most changes, the hardware being essentially fixed [83]. As time went on, Rockwell's responsibilities as [109] prime hardware contractor were phased out, and the shuttles were turned over to an operations group. In late 1983, Lockheed Corporation, not Rockwell, won the competition for the operations contract. By keeping the software contract separate, NASA could develop the code on a continuing basis. There is a considerable difference between changing maintenance mechanics on an existing hardware system and changing software companies on a not yet perfect system because to date the relationships between components in software are much harder to define than those in hardware. Personnel experienced with a specific software system are the best people to maintain it. Lastly, Christopher Kraft of Johnson Space Center and George Low of NASA Headquarters, both highly influential in the manned spacecraft program during the early 1970's, felt that Johnson had the software management expertise to handle the contract directly [84].
 
One of the lessons learned from monitoring Draper Laboratory in the Apollo era was that by having the software development at a remote site (like Cambridge), the synergism of informally exchanged ideas is lost; sometimes it took 3 to 4 weeks for new concepts to filter over85. IBM had a building and several hundred personnel near Johnson because of its Mission Control Center contracts. When IBM won the Shuttle contract, it simply increased its local force.
 
The closeness of IBM to Johnson Space Center also facilitated the ability of NASA to manage the project. The first chief of the Shuttle's software, Richard Parten, observed that the experience of NASA managers made a significant contribution to the success of the programming effort86. Although IBM was a giant in the data processing industry, a pioneer in real time systems, and capable of putting very bright people on a project, the company had little direct experience with avionics software. As a consequence, Rockwell had to supply a lot of information relating to flight control. Conversely, even though Rockwell projects used computers, software development on the scale needed for the Shuttle was outside its experience. NASA Shuttle managers provided the initial requirements for the software and facilitated the exchange of information between the principal contractors. This situation was similar to that during the 1960s when NASA had the best rendezvous calculations people in the world and had to contribute that expertise to IBM during the Gemini software development. Furthermore, the lessons of Apollo inspired the NASA managers to push IBM for quality at every point87.
 
The choice of a high level language for doing the majority of the coding was important because, as Parten noted, with all the changes, "we'd still be trying to get the thing off the ground if we'd used assembly language"88. Programs written in high level languages are far easier to modify. Most of the operating system software, which is rarely changed, is in assembler, but all applications software and some of the interfaces and redundancy management code is in HAL/S89.
 
[110] Although the decision to program in a high-level language meant that a large amount of support software and development tools had to be written, the high-level language nonetheless proved advantageous, especially since it has specific statements created for real-time programming.
 
 
Defining the Shuttle Software
 
 
In the end, the success of the Shuttle's software development was due to the conceptual integrity established by using rigorously maintained requirements documents. The requirements phase is the beginning of the software life cycle, when the actual functions, goals, and user interfaces of the eventual software are determined in full detail. If a team of a thousand workers was asked to set software requirements, chaos would result [90]. On the other hand, if few do the requirements but many can alter them later, then chaos would reign again. The strategy of using a few minds to create the software architecture and interfaces and then ensuring that their ideas and theirs alone are implemented, is termed "maintaining conceptual integrity," which is well explained in Frederick P. Brooks' The Mythical Man-Month [91]. As for other possible solutions, Parten says, "the only right answer is the one you pick and make to work" [92].
 
Shuttle requirements documents were arranged in three Levels: A, B, and C, the first two written by Johnson Space Center engineers. John R. Garman prepared the Level A document, which is comprised of a comprehensive description of the operating system, applications programs, keyboards, displays, and other components of the software system and its interfaces. William Sullivan wrote the guidance, navigation and control requirements, and John Aaron, the system management and payload specifications of Level B. They were assisted by James Broadfoot and Robert Ernull [93]. Level B requirements are different in that they are more detailed in terms of what functions are executed when and what parameters are needed [94]. The Level Bs also define what information is to be kept in COMPOOLS, which are HAL/S structures for maintaining common data accessed by different tasks [95]. The Level C requirements were more of a design document, forming a set with Level B requirements, since each end item at Level C must be traceable to a Level B requirement [96]. Rockwell International was responsible for the development of the Level C requirements as, technically, this is where the contractors take over from the customer, NASA, in developing the software.
 
Early in the program, however, Draper Laboratory had significant influence on the software and hardware systems for the Shuttle. Draper was retained as a consultant by NASA and contributed two [111] key items to the software development process. The first was a document that "taught" NASA and other contractors how to write requirements for software, how to develop test plans, and how to use functional flow diagrams, among other tools [97]. It seems ironic that Draper was instructing NASA and IBM on such things considering its difficulties in the mid-1960s with the development of the Apollo flight software. It was likely those difficult experiences that helped motivate the MIT engineers to seriously study software techniques and to become, within a very short time, one of the leading centers of software engineering theory. The Draper tutorial included the concept of highly modular software, software that could be "plugged into" the main circuits of the Shuttle. This concept, an application of the idea of interchangeable parts to software, is used in many software systems today, one example being the UNIX*** operating system developed at Bell Laboratories in the 1970s, under which single function software tools can be combined to perform a large variety of functions.
 
Draper's second contribution was the actual writing of some early Level C requirements as a model98. This version of the Level C documents contained the same components as in the later versions delivered by Rockwell to IBM for coding. Rockwell's editions, however, were much more detailed and complete, reflecting their practical, rather than theoretical purpose and have been an irritation for IBM. IBM and NASA managers suspect that Rockwell, miffed when the software contract was taken away from them, may have delivered incredibly precise and detailed specifications to the software contractor. These include descriptions of flight events for each major portion of the software, a structure chart of tasks to be done by the software during that major segment, a functional data flowchart, and, for each module, its name, calculations, and operations to be performed, and input and output lists of parameters, the latter already named and accompanied by a short definition, source, precision, and what units each are in. This is why one NASA manager said that "you can't see the forest for the trees" in Level C, oriented as it is to the production of individual modules99. One IBM engineer claimed that Rockwell went "way too far" in the Level C documents, that they told IBM too much about how to do things rather than just what to do100. He further claimed that the early portion of the Shuttle development was "underengineered" and that Rockwell and Draper included some requirements that were not passed on by NASA. Parten, though, said that all requirements documents were subject to regular review by joint teams from NASA and Rockwell101.
 
The impression one gains from documents and interviews is that both Rockwell and IBM fell victim to the "not invented here" [112] syndrome: If we didn't do it, it wasn't done right. For example, Rockwell delivered the ascent requirements, and IBM coded them to the letter, thereby exceeding the available memory by two and a third times and demonstrating that the requirements for ascent were excessive. Rockwell, in return, argued for 2 years about the nature of the operating system, calling for a strict time-sliced system, which allocates predefined periods of time for the execution of each task and then suspends tasks unfinished in that time period and moves on to the next one. The system thus cycles through all scheduled tasks in a fixed period of time, working on each in turn. Rockwell's original proposal was for a 40-millisecond cycle with synchronization points at the end of each [102]. IBM, at NASA's urging, countered with a priority-interrupt-driven system similar to the one on Apollo. Rockwell, experienced with time-slice systems, fought this from 1973 to 1975, convinced it would never work [103].
 
The requirements specifications for the Shuttle eventually contained in their three levels what is in both the specification and design stage of the software life cycle. In this sense, they represent a fairly complete picture of the software at an early date. This level of detail at least permitted NASA and its contractors to have a starting point in the development process. IBM constantly points to the number of changes and alterations as a continuing problem, partially ameliorated by implementing the most mature requirements first104. Without the attempt to provide detail at an early date, IBM would not have had any mature requirements when the time came to code. Even now, requirements are being changed to reflect the actual software, so they continue to be in a process of maturation. But early development of specifications became the means by which NASA could enforce conceptual integrity in the shuttle software.
 
 
Architecture of the Primary Avionics Software System
 
 
The Primary Avionics Software System, or PASS, is the software that runs in all the Shuttle's four primary computers. PASS is divided into two parts: system software and applications software. The system software is the Flight Computer Operating System (FCOS), the user interface programming, and the system control programs, whereas the applications software is divided into guidance, navigation and control, orbiter systems management, payload and checkout programs. Further divisions are explained in Box 4-3.
 

[113] Box 4-3: Structure of PASS Applications Software
 
The PASS guidance and navigation software is divided into major functions, dictated by mission phases, the most obvious of which are preflight, ascent, on-orbit, and descent. The requirements state that these major functions be called OPS, or operational sequences. (e.g., OPS-1 is ascent; OPS-3, descent.) Within the OPS are major modes. In OPS-1, the first-stage burn, second-stage burn, first orbital insertion burn, second orbital insertion burn, and the initial on-orbit coast are major modes; transition between major modes is automatic. Since the total mission software exceeds the capacity of the memory, OPS transitions are normally initiated by the crew and require the OPS to be loaded from the MMU. This caused considerable management concern over the preservation of data, such as the state vector, needed in more than one OPS105. NASA's solution is to keep common data in a major function base, which resides in memory continuously and is not overlaid by new OPS being read into the computers.
 
Within each OPS, there are special functions (SPECs) and display functions (DISPs). These are available to the crew as a supplement to the functions being performed by the current OPS. For example, the descent software incorporates a SPEC display showing the horizontal situation as a supplement to the OPS display showing the vertical situation. This SPEC is obviously not available in the on-orbit OPS. A DISP for the on-orbit OPS may show fuel cell output levels, fuel reserves in the orbital maneuvering system, and other such information. SPECs usually contain items that can be selected by the crew for execution. DISPs are just what their name means, displays and not action items. Since SPECs and DISPs have lower priority than OPS, when a big OPS is in memory they have to be kept on the tape and rolled in when requested106. The actual format of the SPECs, DISPs, OPS displays, and the software that interprets crew entries on the keyboard is in the user interface portion of the system software.



The most critical part of the system software is the FCOS. NASA, Rockwell, and IBM solved most of the grand conceptual problems, such as the nature of the operating system and the redundancy management scheme, by 1975. The first task was to convert the FCOS from the proposed 40-millisecond loop operating system to a priority-driven [113] system107. Priority interrupt systems are superior to time-slice systems because they degrade gracefully when overloaded108. In a time-slice system, if the tasks scheduled in the current cycle get bogged down by excessive I/O operations, they tend to slow down the total time of execution of processes. IBM's version of the FCOS actually has cycles, but they are similar to the ones in the Skylab system described in the previous chapter. The minor cycle is the high-frequency cycle; tasks within it are scheduled every 40 milliseconds. Typical tasks in this cycle are those related to active flight control in the atmosphere. The major cycle is 960 milliseconds, and many monitoring and system management tasks are scheduled at that frequency109. If a process is still running when its time to.....
 
 
 

[114] Figure 4-6. A block diagram of the Shuttle flight computer software architecture. (From NASA, Data Processing System Workbook)
 
.....restart comes up due to excessive I/O or because it was interrupted, it cancels its next cycle and finishes up110. If a higher priority process is called when another process is running, then the current process is interrupted and a program status word (PSW) containing such items as the address of the next instruction to be executed is stored until the interruption is satisfied. The last instruction of an interrupt is to restore the old PSW as the current PSW so that the interrupted process can continue111. The ability to cancel processes and to interrupt them asynchronously provides flexibility that a strict time-slice system does not.
 
A key requirement of the FCOS is to handle the real-time statements in the HAL/S language. The most important of these are SCHEDULE, which establishes and controls the frequency of execution of processes; TERMINATE and CANCEL, which are the opposite of SCHEDULE; and WAIT, which conditionally suspends execution112. The method of implementing these statements is controlled [115] by a separate interface control document113. SCHEDULE is generally programmed at the beginning of each operational sequence to set up which tasks are to be done in that software segment and how often they are to be done. The syntax of SCHEDULE permits the programmer to assign a frequency and priority to each task. TERMINATE and CANCEL are used at the end of software phases or to stop an unneeded process while others continue. For example, after the solid rocket boosters burn out and separate, tasks monitoring them can cease while tasks monitoring the main engines continue to run. WAIT, although handy, is avoided by IBM because of the possibility of the software being "hung up" while waiting for the I/O or other condition required to continue the process114. This is called a race condition or "deadly embrace" and is the bane of all shared resource computer operating systems.
 
The FCOS and displays occupy 35K of memory at all times115. Add the major function base and other resident items, and about 60K of the 106K of core remains available for the applications programs. Of the required applications programs, ascent and descent proved the most troublesome. Fully 75% of the software effort went into those two programs116. After the first attempts at preparing the ascent software resulted in a 140K load, serious code reduction began. By 1978, IBM reduced the size of the ascent program to 116K, but NASA Headquarters demanded it be further knocked down to 80K117. The lowest it ever got was 98,840 words (including the system software), but its size has since crept back up to nearly the full capacity of the memory. IBM accomplished the reduction by moving functions that could wait until later operational sequences118. The actual figures for the test flight series programs are in Table 4-1119. The total size of the flight test software was 500,000 words of code. Producing it and modifying it for later missions required the development of a complete production facility.
 
[116] TABLE 4-1: Sizes of Software Loads in PASS
NAME                         K WORDS
Preflight initialization     72.4
Preflight checkout           81.4
Ascent and abort             105.2
On-orbit                     83.1
On-orbit checkout            80.3
On-orbit system management   84.1
Entry                        101.1
Mass memory utility          70.1
Note: Payload and rendezvous software was added later during the operations phase.
 
 
 
 
Implementing PASS
 
 
NASA planned that PASS would be a continuing development process. After the first flight programs were produced, new functions needed to be added and adapted to changing payload and mission requirements. For instance, over 50% of PASS modules changed during the first 12 flights in response to requested enhancements120. To do this work, NASA established a Software Development Laboratory at Johnson Space Center in 1972 to prepare for the implementation of the Shuttle programs and to make the software tools needed for efficient coding and maintenance. The Laboratory evolved into the Software Production Facility (SPF) in which the software development is carried on in the operations era. Both the facilities were equipped and managed by NASA but used largely by contractors.
 
The concept of a facility dedicated to the production of onboard software surfaced in a Rand Corporation memo in early 1970121. The memo summarized a study of software requirements for Air Force space missions during the decade of the 1970s. One reason for a government-owned and operated software factory was that it would be easier to establish and maintain security. Most modules developed for [117] the Shuttle, such as the general flight control software and memory displays, would be unclassified. However, Department of Defense (DoD) payloads require system management and payload management software, plus occasional special maneuvering modules. These were expected to be classified. Also, if the software maintenance contract moved from the original prime contractor to some different operations contractor, it would be considerably simpler to accomplish the transfer if the software library and development computers were government owned and on government property. Lastly, having such close control over existing software and new development would eliminate some of the problems in communication, verification, and maintenance encountered in the three previous manned programs.
 
Developing the SPF turned out to be as large a task as developing the flight software itself. During the mid-1970s, IBM had as many people doing software for the development lab as they had working on PASS122. The ultimate purpose of the facility is to provide a programming team with sufficient tools to prepare a software load for a flight. This software load is what is put on to the MMU tape that is flown on the spacecraft. In the operations era of the 1980s, over 1,000 compiled modules are available. These are fully tested, and often previously used, versions of tasks such as main engine throttling, memory modification, and screen displays that rarely change from flight to flight. New, mission-specific modules for payloads or rendezvous maneuvers are developed and tested using the SPF's programming tools, which themselves represent more than a million lines of code123. The selection of existing modules and the new modules are then combined into a flight load that is subject to further testing. NASA achieved the goal of having such an efficient software production system through an 8-year development process when the SPF was still the Laboratory.
 
In 1972, NASA studied what sort of equipment would be required for the facility to function properly. Large mainframe computers compatible with the AP-101 instruction set were a must. Five IBM 360/75 computers, released from Apollo support functions, were available124. These were the development machines until January of 1982125. Another requirement was for actual flight equipment on which to test developed modules. Three AP-101 computers with associated display electronics units connected to the 360s with a flight equipment interface device (FEID) especially developed for the purpose. Other needed components, such as a 6-degree-of-freedom flight simulator, were implemented in software126. The resulting group of equipment is capable of testing the flight software by interpreting instructions, simulating functions, and running it in the actual flight hardware127.
 
In the late 1970s, NASA realized that more powerful computers were needed as the transition was made from development to operations. The 360s filled up, so NASA considered the Shuttle Mission [118] Simulator (SMS), the Shuttle Avionics Instrumentation Lab (SAIL), and the Shuttle Data Processing Center's computers as supplementary development sites, but this idea was rejected because they were all too busy doing their primary functions128. In 1981, the Facility added two new IBM 3033N computers, each with 16 million bytes of primary memory. The SPF then consisted of those mainframes, the three AP-101 computers and the interface devices for each, 20 magnetic tape drives, six line printers, 66 million bytes of drum memory, 23.4 billion bytes of disk memory, and 105 terminals129. NASA accomplished rehosting the development software to the 3033s from the 360s during the last quarter of 1981. Even this very large computer center was not enough. Plans at the time projected on-line primary memory to grow to 100 million bytes130, disk storage to 160 billion bytes131, and two more interface units, display units, and AP-101s to handle the growing DOD business132. Additionally, terminals connected directly to the SPF are in Cambridge, Massachusetts, and at Goddard Space Flight Center, Marshall Space Flight Center, Kennedy Space Center, and Rockwell International in Downey, California133.
 
Future plans for the SPF included incorporating backup system software development, then done at Rockwell, and introducing more automation. NASA managers who experienced both Apollo and the Shuttle realize that the operations software preparation is not enough to keep the brightest minds sufficiently occupied. Only a new project can do that. Therefore, the challenge facing NASA is to automate the SPF, use more existing modules, and free people to work on other tasks. Unfortunately, the Shuttle software still has bugs, some of which are no fault of the flight software developers, but rather because all the tools used in the SPF are not yet mature. One example is the compiler for HAL/S. Just days before the STS-7 flight, in June, 1983, an IBM employee discovered that the latest release of the compiler had a bug in it. A quick check revealed that over 200 flight modules had been modified and recompiled using it. All of those had to be checked for errors before the flight could go. Such problems will continue until the basic flight modules and development tools are no longer constantly subject to change. In the meantime, the accuracy of the Shuttle software is dependent on the stringent testing program conducted by IBM and NASA before each flight.
 
 
Verification and Change Management of the Shuttle Software
 
 
IBM established a separate line organization for the verification of the Shuttle software. IBM's overall Shuttle manager has two managers reporting to him, one for design and development, and one for verification and field operations. The verification group has just [119] less than half the members of the development group and uses 35% of the software budget134. There are no managerial or personnel ties to the development group, so the test team can adopt an "adversary relationship" with the development team. The verifiers simply assume that the software is untested when received135. In addition, the test team can also attempt to prove that the requirements documents are wrong in cases where the software becomes unworkable. This enables them to act as the "conscience" of the entire project136.
 
IBM began planning for the software verification while the requirements were being completed. By starting verification activity as the software took shape, the test group could plan its strategy and begin to write its own books. The verification documentation consists of test specifications and test procedures including the actual inputs to be used and the outputs expected, even to the detail of showing the content of the CRT screens at various points in the test137. The software for the first flight had to survive 1,020 of these tests138. Future flight loads could reuse many of the test cases, but the preparation of new ones is a continuing activity to adjust to changes in the software and payloads, each of which must be handled in an orderly manner.
 
Suggestions for changes to improve the system are unusually welcome. Anyone (astronaut, flight trainer, IBM programmer, or NASA manager) can submit a change request139. NASA and IBM were processing such requests at the rate of 20 per week in 1981140. Even as late as 1983 IBM kept 30 to 40 people on requirements analysis, or the evaluation of requests for enhancements141. NASA has a corresponding change evaluation board. Early in the program, it was chaired by Howard W. Tindall, the Apollo software manager, who by then was head of the Data Systems and Analysis Directorate. This turned out to be a mistake, as he had conflicting interests142, so the change control board moved to the Shuttle program office. Due to the careful review of changes, it takes an average of 2 years for a new requirement to be implemented, tested, and fielded143. Generally, requests for extra functions that would push out current software due to memory restrictions are turned down144.
 



Box 4-4: How IBM Verifies the Shuttle Flight Software
 
The Shuttle software verification process actually begins before the test group gets the software, in the sense that the development organization conducts internal code reviews and unit tests of individual modules and then integration tests of groups of modules as they are assembled into a software load. There are two levels of code inspection, or "eyeballing" the software looking for logic errors. One level of inspection is by the coders themselves and their peer reviewers. The second level is done by the outside verification team. This activity resulted in over 50% of the discrepancy reports (failures of the software to meet the specification) filed against the software, a percentage similar to the Apollo experience and one that reinforces the value of inspections145. When the software is assembled, it is subject to the First Article Configuration Inspection (FACI), where it is reviewed as a complete unit for the first time. It then passes to the outside verification group.
 
Because of the nature of the software as it is delivered, the verification team concentrates on proving that it meets the customer's requirements and that it functions at an acceptable level of performance. Consistent with the concept that the software is assumed untested, the verification group can go into as much detail as time and cost allow. Primarily, the test group concentrates on single software loads, such as ascent, on-orbit, and so forth146. To facilitate this, it is divided into teams that specialize in the operating system and detailed, or functional, verification; teams that work on guidance, navigation, and control; and teams that certify system performance. These groups have access to the software in the SPF, which thus doubles as a site for both development and testing. Using tools available in the SPF, the verification teams can use the real flight computers for their tests (the preferred method). The testers can freeze the execution of software on those machines in order to check intermediate results, alter memory, and even get a log of what commands resulted in response to what inputs147.
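This freeze-inspect-alter capability is, in today's vocabulary, an interactive debugger wrapped around the flight computers. The toy harness below suggests the flavor; the instruction format, memory model, and values are invented for illustration and do not model the AP-101 or the actual SPF tools.

# Toy analogue of the SPF capabilities described above: step a program, freeze
# it at a chosen point, inspect and patch "memory", and log every command and
# response. Purely illustrative; not a model of the real flight computers.

class TestHarness:
    def __init__(self, program, memory):
        self.program = program      # list of ("add", (name, delta)) instructions
        self.memory = dict(memory)  # variable name -> value
        self.pc = 0                 # program counter
        self.log = []               # (command, response) pairs

    def step(self):
        op, (name, delta) = self.program[self.pc]
        if op == "add":
            self.memory[name] += delta
        self.pc += 1
        self.log.append(("step", dict(self.memory)))

    def run_until(self, breakpoint_pc):
        # Freeze execution at a chosen instruction to check intermediate results.
        while self.pc < breakpoint_pc:
            self.step()

    def peek(self, name):
        self.log.append(("peek " + name, self.memory[name]))
        return self.memory[name]

    def poke(self, name, value):
        # Alter memory while the program is frozen.
        self.memory[name] = value
        self.log.append(("poke " + name, value))

program = [("add", ("altitude", 100)), ("add", ("altitude", 250))]
h = TestHarness(program, {"altitude": 0})
h.run_until(1)              # freeze after the first instruction
print(h.peek("altitude"))   # 100
h.poke("altitude", 90)      # inject a test value
h.step()                    # resume one instruction
print(h.peek("altitude"))   # 340 (90 + 250)
print(len(h.log), "logged commands and responses")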
 
After the verification group has passed the software, it is given an official Configuration Inspection and turned over to NASA. At that point NASA assumes configuration control, and any changes must be approved through Agency channels. Even though NASA then has the software, IBM is not finished with it148.
 
The software is usually installed in the SAIL for prelaunch, ascent, and abort simulations, the Flight Simulation Lab (FSL) in Downey for orbit, de-orbit, and entry simulations, and the SMS for crew training. Although these installations are not part of the preplanned verification process, the discrepancies noted by the users of the software in the roughly 6 months before launch help complete the testing in a real environment. Due to the nature of real-time computer systems, however, the software can never be fully certified, and both IBM and NASA are aware of this149. There are simply too many interfaces and too many opportunities for asynchronous input and output.
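A back-of-envelope count, not taken from the source, shows why exhaustive certification of asynchronous behavior is hopeless: the number of distinct orderings of even a few independent event streams is a multinomial coefficient that grows explosively.

# Illustrative calculation only: distinct interleavings of `streams`
# independent event streams of n events each is (streams*n)! / (n!)**streams.

from math import factorial

def interleavings(streams, events_per_stream):
    n = events_per_stream
    return factorial(streams * n) // factorial(n) ** streams

for k in (2, 3, 4):
    print(k, "streams of 5 events each:", interleavings(k, 5))
# prints 252, then 756756, then 11732745024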



Discrepancy reports cause changes in software to make it match the requirements. Early in the program, the software found its way into the simulators with less verification because the simulators depend on software just to be turned on. At that time, the majority of the discrepancy reports were from the field installations. Later, the majority turned up in the SPF150. All discrepancy reports are formally disposed of, either by appropriate fixes to the software or by waiver. Richard Parten said, "Sometimes it is better to put in an 'OPS Note' or waiver than to fix (the software). We are dependent on smart pilots"151. If a discrepancy is noted too close to a flight, or if it would require too much expense to fix, it can be waived as long as there is no immediate danger to crew safety. Each Flight Data File carried on board lists the most important current exceptions of which the crew must be aware. By STS-7 in June of 1983, over 200 pages of such exceptions and their descriptions existed152. Some will never be fixed, but the majority were addressed during the Shuttle launch hiatus following the 51L accident in January 1986.
 
So, despite the well-planned and well-manned verification effort, software bugs exist. Part of the reason is the complexity of the real-time system, and part is because, as one IBM manager said, "we didn't do it up front enough," the "it" being thinking through the program logic and verification schemes153. Aware that effort expended at the early part of a project on quality would be much cheaper and simpler than trying to put quality in toward the end, IBM and NASA tried to do much more at the beginning of the Shuttle software development than in any previous effort, but it still was not enough to ensure perfection.




Box 4-5: The Nature of the Backup Flight System
 
The Backup Flight System consists of a single computer and a software load that contains sufficient functions to handle ascent to orbit, selected aborts during ascent, and descent from orbit to landing site. In the interest of avoiding a generic software failure, NASA kept its development separate from PASS. An engineering directorate, not the on-board software division, managed the software contract for the backup, won by Rockwell154.
 
The major functional difference between PASS and the backup is that the latter uses a time-slice operating system rather than the asynchronous priority-driven system of PASS155. This is consistent with Rockwell's opinion on how that system was to be designed. Ironically, since the backup must listen in on PASS operations so as to be ready for instant takeover, PASS had to be modified to make it more synchronous156. Sixty engineers were still working on the Backup Flight System software as late as 1983157.
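For readers unfamiliar with the distinction, the sketch below contrasts the two scheduling styles in miniature: a time-slice executive gives every task a fixed turn in a repeating cycle, while a priority-driven executive always runs the highest-priority task with work pending. The task names, priorities, and periods are invented for illustration and bear no relation to the actual PASS or Backup Flight System designs.

# Toy contrast of the two scheduling styles; illustrative assumptions only.
# Each task is (name, priority, period): it requests one unit of work every
# `period` ticks.

def time_slice(tasks, ticks):
    # Fixed rotation: every task gets its slot in turn, regardless of priority.
    return [tasks[t % len(tasks)][0] for t in range(ticks)]

def priority_driven(tasks, ticks):
    # At every tick, run the highest-priority task with work pending.
    pending = {name: 0 for name, _, _ in tasks}
    schedule = []
    for t in range(ticks):
        for name, _, period in tasks:
            if t % period == 0:
                pending[name] += 1        # the task becomes ready again
        ready = [(prio, name) for name, prio, _ in tasks if pending[name] > 0]
        if ready:
            _, chosen = max(ready)
            pending[chosen] -= 1
            schedule.append(chosen)
        else:
            schedule.append("idle")
    return schedule

# Hypothetical tasks: (name, priority, period in ticks).
tasks = [("guidance", 3, 2), ("display", 1, 3), ("telemetry", 2, 4)]
print(time_slice(tasks, 8))       # equal turns: guidance, display, telemetry, ...
print(priority_driven(tasks, 8))  # guidance runs whenever ready; display waits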
 

