Why is it that the de facto standard for the smallest addressable unit of memory (the byte) is 8 bits?

The smallest unit of information is actually a bit (0 or 1), though on most hardware the smallest *addressable* unit is the byte - you can't point at an individual bit in memory, only at the byte that contains it. A bit is commonly used to store true/false answers (say, a boolean flag in a database) because it's the most compact option. Aside from yes/no type data, a single bit isn't very useful on its own.
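
If you're curious what that looks like in practice, here's a rough Python sketch (just an illustration): since you can't address a single bit, you read or flip one by masking and shifting inside a byte.

    # One byte's worth of on/off switches, all starting off.
    flags = 0b00000000

    def set_bit(value, position):
        """Turn on the bit at the given position (0 = least significant)."""
        return value | (1 << position)

    def get_bit(value, position):
        """Return True if the bit at the given position is on."""
        return (value >> position) & 1 == 1

    flags = set_bit(flags, 3)   # flags is now 0b00001000
    print(get_bit(flags, 3))    # True
    print(get_bit(flags, 0))    # False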

Most programmers just work with data types rather than raw bits (unless you are working directly with hardware). You work with things like "int(eger)", "float", "decimal", "char(acter)" and "string". Some languages like JavaScript don't even make you declare what the type is - you just use "var(iables)" and the language sorts it out.
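
As a rough illustration in Python (exact sizes vary by platform and language, so treat these as typical rather than guaranteed), each of those types just boils down to some fixed number of bytes underneath:

    import struct

    # Typical sizes of a few common data types, in bytes.
    print(struct.calcsize("i"))  # 4 -> a 32-bit integer takes 4 bytes
    print(struct.calcsize("d"))  # 8 -> a 64-bit float takes 8 bytes
    print(struct.calcsize("c"))  # 1 -> a single character byte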

The common sizes (8, 16, 32, 64 bits) are powers of 2 because that lines up neatly with how the hardware addresses and aligns memory. Each bit is either on or off, so n bits give you 2^n possible patterns - a byte runs from "00000000" to "11111111", which is 256 combinations, enough to cover a full character set. You can think of computer memory as a huge bank of light switches that are either on or off (it's actually an electric signal measured against a threshold, but on/off is easier to picture).
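
A quick Python sketch shows that pattern - n on/off switches give you 2^n possible combinations:

    # Each extra bit doubles the number of distinct patterns.
    for n in (1, 4, 8, 16, 32):
        print(f"{n:>2} bits -> {2**n:,} possible values")
    # 8 bits -> 256 values, which is why one byte can hold any of 256 characters.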

0's & 1's can get tedious when working with things like character codes so there is a system called hexadecimal that represents 4 bits (using 0 to 9 and then letters A to F). Hex 6 is equal to 0110 and the decimal number six. Or if you wanted a byte (8 bits) to be all 1's you use 0xFF (hex notation).

Hex is nice because instead of trying to make sense of long runs of 0's and 1's you get a little more variety - 16 different symbols isn't horrible to look at. Jumping up to 32 symbols would be harder to read and remember, so it never caught on the same way.

This chart shows the relationship between decimal numbers, hexadecimal and binary representations:

    Decimal   Hex   Binary
          0     0     0000
          1     1     0001
          2     2     0010
          3     3     0011
          4     4     0100
          5     5     0101
          6     6     0110
          7     7     0111
          8     8     1000
          9     9     1001
         10     A     1010
         11     B     1011
         12     C     1100
         13     D     1101
         14     E     1110
         15     F     1111

The earliest computer programs were written almost exclusively in raw binary (machine language), which is basically unreadable to humans. There was some NASA incident where a bug caused a huge mishap (I can't recall the specifics). After that, the development of higher-level programming languages abstracted the 0/1 stuff away into something closer to readable English. Stuff like:

   IF x < 0 THEN print "Hi" ELSE print "Bye"

Programs like this are compiled into binary (what you would know as an exe file or a dll). There are more advanced concepts like functions, pointers, etc. that are beyond the scope of your question - but literally everything is built on top of this 0/1 foundation, which is kind of wild.
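
If you want to peek at that idea from a high-level language, here's a rough Python sketch using the standard dis module - it shows Python bytecode rather than real machine code, but it's the same principle of readable text being turned into numeric instructions:

    import dis

    def greet(x):
        # Same shape as the IF/THEN example above.
        if x < 0:
            print("Hi")
        else:
            print("Bye")

    # Dump the lower-level instructions the readable source compiles down to.
    dis.dis(greet)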

/r/askscience Thread