Bitcoin Paper explained

This article is a beginner friendly explanation of the bitcoin paper published at bitcoin.org/bitcoin.pdf titled "Bitcoin: A Peer-to-Peer Electronic Cash System" by Satoshi Nakamoto.

Before studying the Bitcoin paper itself, understanding the following concepts is necessary and we'll cover these first:

  1. Cryptographic hashes
  2. Digital signatures
  3. Merkle trees
  4. Proof of work

Apart from Merkle trees, none of these can be skipped. Merkle trees too are necessary in understanding how blockchains minimise the amount of memory they need but can be ommitted without taking away too much from one's understanding of the blockchain.

Cryptographic hashes

You may find it easier to square a number than calculating the square root and so do computers. Here, doing something (squaring) is easier than reversing it (square-rooting). Cryptographic hash functions are just functions with a finite range, which are sloooooooww to reverse - so slow that even the most powerful computers would have to work centuries or more (maybe even the lifetime of the universe) to reverse it. Because they take so long to reverse, we just assume they are irreversible in most of the cases.
So you've got a function f(x) where you can throw a value x, and get out a value y in the range [0, r] for some maximum value r . But you cannot go back to x from y. You can also not find another value which also gives the output y.

Lingo Dingo

Finding the input for a given output is called preimaging. Hash functions are preimage resistant.
It's also hard to find collisions , aka two inputs which give the same output.

I'll just show you a hash function. This one is called sha-256 and it is very secure.
I open my linux terminal and issue :

echo "Hi there" | sha256sum

which feeds the string "Hi there" to a hash function. The output is

c641344867e9806fadfd219f25b62b97c94db0eed04a1d79e93676533cfb782b

You can now give this digest (just another word for output) to someone and ask them to find the string which gave this output ("hi there"). It will take them quite a while, and their computer will have to work hard to guess the answer (unless they're lucky), because there is no way to go back to "Hi there" from this output digest.

You can use these hashing functions to announce you predicted something without telling people exactly what you predicted. Say you predict, "my friend will get scolded by an old lady 3 hours 2 minutes from now", and you want to show your friend your prediction powers. If you just tell your friend, your friend will try to avoid it, and may succeed.

So you go to your terminal and type :

echo "You will get scolded by an old lady 3 hours 2 minutes from now" | sha256sum

and get an output 508bc2c080852faa689c0d98c6fbd1ae101e26c0ebd9ee46d691e634a2b54e43. You say to your friend, "Hi, I have made a prediction and hashed it here. I will reveal the prediction when it has already happened". Now, the said incident happens and your friend gets scolded by some old lady. You then reveal your prediction, and which your friend confirms hashes to the sha256 hash that you gave him 3 hours 2 min ago.

Another use, more grounded in a practical example would be if you have come up with a groundbreaking scientific idea, and you don't want to tell anyone or reveal to the world just yet what it is because you are not sure. And yet, you want to announce "hey this idea is mine", without precisely telling anyone the idea of course.
You can instead say "Hey, the idea which hashes to 408bc2c080852faa689c0d98c6fbd1ae101e26c0ebd9ee46d691e634a2b54e43 is mine". Later, when the time is ripe, you have proof that you indeed came up with the idea all that while back, when you published the sha256 hash for it.

In both these examples, announcing hashes gives proof that you know something without quite revealing it. Later, when you actually reveal it, people know that you knew it all that while back (This idea is called timestamping, because you "timestamp" the idea to the time you revealed its hash).

Digital Signatures

Digital signatures are like actual signatures, but digital. You sign them using a private key (which only you have), and there is a public key out there (which anyone can get) that people can use to verify that the signature is actually yours. Although I could get into the maths of it, it will save us some brain space to abstract this idea to what has been said in the sentence before. If you're interested you could work on truly understanding how digital signatures work, but the abstract notion presented above should be enough for understanding the paper. Take note that you can sign anything and everything - as long as that thing can be represented as a sequence of bits.

Merkle trees

When someone gives you two large chunks of data, it can take long to find the difference between the two chunks. Every data is a sequence of bits, so when you are given, say,two 4GB files, you'll have two sequences of bits, each of which has 4000000000 bytes in it, or 16000000000 bits. So you start checking as follows : "Okay the first bit is 0 in both, the next is also 0 in both, the next is a 1, the next is again a 0 in both ... " and realise soon that it will take you very long to find the first mismatch. You can divide your problem, however
and solve it in much faster times. Merkle trees provide a way.

                       H7
                     /    \
                   /          \
                 /              \
               /                  \
           H5                       H6
         /    \                   /    \
        /      \                /       \
       /         \             /         \
       H1         H2          H3          H4
       o          o           o           o

Above, you have divided your data into 4 pieces, where each piece is a small size (you could make more pieces until you get a size that satisfies you. You hash all of the pieces. Then you hash two hashes into a single hash. And you proceed until you get just a single hash (H7) at the top.
Now, let's change the last piece (the one that hashes to H4). Changing this will change the hash H4, which will change H6 and finally, the top hash H7.

If you didn't know which chunk changed, you'll still see that H7 has changed, which means at least one of H5 or H6 changed. This makes us notice that H6 changed but H5 is the same. And just in one step, we know that everything below H5 is still the same. This doesn't seem like a big deal with 4 chunks. But if you had 5000 chunks, in just one step you would have eliminated 2500 chunks from the ones you need to search.
You proceed in the same way with H6 and notice that between H3 and H4, H4 has changed. And in 2 steps you have identified the part that changed (rather than 4 steps to search 4 chunks).

Merkle trees are also used in trust scenarios to give proof. For example, Bob gives Sam the top hash H7 of some data he has, and Sam trusts Bob. Now, Bob goes away, and Mallory tells Sam about H6, H5 and H4, as well as the last piece of data. Although Sam doesn't trust Mallory, because he trusts Bob and he knows that no one can easily find pre-image for a given hash or find collisions, Sam trusts that Bob had this exact piece of data in mind that Mallory gave him just now.

Proof of work

Only way someone can find preimage for a hash is by guessing. Guessing is easy for the hash of, say 'a', and quite hard if you hash something like 'my turquoise grandma ate a black apple', or 'jjsdfuJDFJJkljlkk332;32u9;ds'.

Someone has to do more work to find the preimage of the hash of 'jsjdaljfoO' than of 'a'. If someone finds the pre-image, you got proof that they have done 'work'.

In a bunch of computers all connected together and sending messages to each other, we can avoid spam by having all of them prove that they did work for each message they send. For example, we can pretend that a sha-256 hash, which is just a sequence of 256 bits, is a number. And now we can ask computers to find a value which hashes to a number less than a threshold t for every message they send. The more someone spams, the more work they will need to do, since each message requires some work to be done.

Putting it all together

Now buckle up because we will dive straight into the paper.

The abstract.

A peer-to-peer version of electronic cash would mean that there is no central authority like a bank managing the exchange of money. Digital signatures solve part of the problem, but there's still a problem called the double spend problem.

Double spend problem is where Bob copies his coin, and then spends the copied coin twice. How do we solve this ? That's what the bitcoin paper solves (hint hint : It's solved using proof of work).

The network puts transactions in a chain (the first link in the chain came before the one that came after and so on). The chain is hash-based (we'll see what this means) and uses proof of work. The longest chain comes from most 'work' done, and also stores the sequence of transactions witnessed. As long as honest nodes (aka computers) control most of the network, they will win out on the 'proof of work'. Messages are broadcast to everyone to the best of a computer's ability. If a computer leaves the network, they can join anytime later and accept the longest chain to see what happened when they were gone.

Note : Don't worry if you don't understand much of the above. It will be explained piece by piece in paragraphs below which simplify the rest of the paper for you.

Introduction

Most commerce online was (and is) through central payment authorities like banks. If Bob pays Alice, he has to trust his Bank X. But the bank can do shady things like removing the balance from Bob's account but send it to someone else (who will complain if they get free money), the bank might freeze Bob's account and refuse to let Bob do anything with his money, or interfere in some other unwanted way.
What if people could just pay each other one to one, like they can with paper cash, without the bank ever finding out or controlling who paid who. That's what peer-to-peer online transactions can do for us. These transactions should be irreversible, like cash (people shouldn't be able to 'take back' their money using some computer trickery, they'll have to ask back for it like with cash). It should also not be double-spendable (you shouldn't be able to copy your coins and spend them again).

The paper solves this double spending problem by putting transactions in a chronological order, and having the majority of the network agree on what this order is using proof of work. Now, if someone double spends, their first use of the coin is taken as valid, but the rest are not.

Transactions

What is a bitcoin ? It is a chain of digital signatures. For example, here is a coin that Samir owns.

A coin appears in Mohan's wallet
Mohan sends the coin to Gina  <sign : Mohan>
Gina sends the coin to Rina   <sign : Gina>
Rina sends the coin to Samir  <sign : Rina>

Samir owns this 'coin', because the coin which originally appeared mysteriously in Mohan's wallet was passed on to Samir through a series of gives and takes. The first line is different and talks of 'a coin', but this illusion too will disappear as we eventually learn of proof of work- there is no such coin. There is an abstract notion of this coin, but in truth the coin is just a series of transactions. In between these series of transactions, some special sorts of transactions happen, for example the above 'A coin appears in Mohan's wallet'. We'll learn what these special transactions are and how they work when we move to proof of work. They increase the total bitcoins in the world.

There is a flaw in this version of list of transactions. Bob can copy the list, in one list he can add the line where he pays Alice, and in other where he pays himself. Now, he sends Alice the list where he pays her, and everyone else the list where he pays himself. This is an issue, because Alice immediately gives Bob her favorite game in exchange for the coin, and later finds out that she was tricked when she talks to other people who show her their lists.

Timestamp server

Solution is to somehow make everyone agree on the order of transactions. If everyone thinks Bob paid himself, Alice would too, and not give him the game. If everyone thinks Bob paid Alice, then a succcessful transaction occurs and Alice can happily give her game to Bob in exchange for a shiny bitcoin. Either way, having everyone agree on what happened (and in what order) can solve our problem. Let's go find a solution. First, we need a way of finding what order transactions happened in. We group transactions into "blocks" which also contain some other stuff. Now, if block B includes block A, then when B was created, A existed already. Similarly, if block B includes hash of block A, then when B was created, A already existed. We take the latter approach to save space, since hashes are small and a fixed size (sha-256 is 256 bits). We can think of this as chaining the blocks B and A, with B coming after A. A->B. Hence "blockchain".

Proof of work

We have started chaining blocks together, but say some bad people start writing false things on their copy of the chain, and then tell the rest of the network that their chain is right, who is stopping them. That's what Bob did with Alice when he stole (and not bought) her game. Chain conflicts arise, and we have to find a way of resolving them to make everyone agree on what happened.
Bad guys will also try to create lots of fake accounts, and swarm the honest nodes (called a "sybil" attack). What do we do ?

We will require each block to be hashed to a hash which has an integer value less than threshold t. For this, there will be an arbitrary value called 'nonce' in the block, which we can alter until the block hash is less than t. All the computers in the network try to find the value of
the nonce which gives a desirable hash, and only then is the block added to the chain. If the majority computer power is controlled by honest computers, then they will win with greater probability in finding this nonce, and therefore only 'non-malicious' blocks will be added to the chain. If there are two conflicting chains of very different sizes, the longer one is accepted as right, because a greater majority of computing power has likely gone in it. If the conflicting chains are about the same size, work on the longer one, but keep the smaller one too because it might outgrow the current longer one. The smaller chain is rejected if it grows more than 6 blocks smaller (arbitrary choice, the more the better).
Even now, someone can create bots, but the bots do not get much power because the attacker will also need to buy computing power for each bot. An attacker has limited computing power, and splitting it into multiple bots will not help him. Now, when chain conflicts arise, we have a way of solving them. Thus only one chain remains victorious, giving computers a way to agree on a history of transactions. If Bob tries to spend a coin twice, one of the spendings will happen before another, and only the one that happens first will be taken as valid.

(I've ommitted the network section as it is very self explanatory, and good starting point if you want to read original research papers. I encourage you to read it from the original paper)

Incentive

The person who comes up with the correct proof of work (nonce) for a block gets extra bitcoin in reward out of nowhere. That's how "coins are minted". Or in bitcoin terms, mined. Mining is just finding the correct nonce for a block. Recall that there was a mysterious 'A coin appears in Mohan's wallet' line in an example above. That coin appeared in Mohan's wallet because he was the person who found the correct nonce for some block.

Reclaiming disk space + Simplified payment verification

We'll use Merkle trees here to save space. Instead of a block containing a bunch of transactions, it will just contain the root hash of the merkle tree of those transactions.

If someone wants to verify that a transaction exists in a block, they can request just the necessary hashes alongwith the transaction from the nodes that store entire blocks. The person who trusts the root hash will have to trust these preimages.

Combining and splitting value

There is a slight change in the way transactions actually work. A transaction takes any number of inputs, each of which is a transaction too, and two outputs, one for payment, and one for the amount of change that the person sends to himself or herself. The input transactions must have a total amount greater than the sum of the two outputs, payment + change. This allows an easier way to avoid someone overspending than going through the entire history ourselves.

Privacy

Even though every transaction happens publicly, your name is not in there. Just publickey1 sent so and so money to publickey2 from the transactions x,y,z.

Calculations

This part considers the probability of Bob successfully cheating Alice and stealing her money, when both Bob and Alice try to be smart. Alice knows that it is better to wait for a while until other blocks come up, and then give Bob her game instead of right away. Bob will still try to work on a fake, longer blockchain and make others believe he never paid Alice the coin. But for that, he will have to calculate a nonce for all of his fake blocks himself, and outgrow the chain used by the rest of the network. How hard is it ?

This section proves using probability theory that the probability that Bob will ever be able to fool the network decreases with the number of blocks added to the chain. If you want to be > 99.9% sure that Bob paid you, here is the number of blocks you should wait for, in proportion to the amount of computing power Bob controls.

Bob's power in the chainWait for at least these many blocks
10%5
15%8
20%11
35%41
40%89
45%340

If Bob controls greater than 50% of the power, he can always fool the network successfully. Bitcoin relies on the honest nodes controlling the majority of the computing power, otherwise the so-called 51% attacks become possible.

Wrapping up

Hopefully the explanation above provided a more layman friendly version of the bitcoin paper. A good way to check if you have understood the paper is, to see if you can explain :

  1. The problem the paper solves, and why this problem matters
  2. How the paper solves the problem
  3. Where the paper falls short (this is usually the direction in which other papers can be written in the future)

I strongly encourage you to go to the original paper and try to read it yourself now.