
The Two Generals' Problem (二人の将軍問題)

Posted at 2021-12-06

XX 2021

I was tired... I was exhausted... I’d just got off work and I didn't have the energy to cook anything. So I went to my phone, opened up a certain food-delivery app that’s popular in Japan, and I ordered pizza.
Now, I know that food-delivery company’s employment practices are questionable, and there are more ethical ways to get dinner delivered.
But I was tired, and I was hungry.

But that was the "Night of the Multiple Orders", when a bug in that app meant that some people around Japan ended up with identical food orders being delivered two or three times, and others got nothing at all.
And I nearly got caught up in the chaos!

To explain what happened, I need to tell you a story about two generals.

The Two Generals' Problem is a classic of computer science, and it goes like this:

Picture a valley.
In the middle of the valley is a heavily fortified castle.
On the edges of the valley are two armies.

tgp_01.png

The generals of these armies know that the only way they can win a battle and overwhelm the castle is if they both attack at the same time.
A single army isn't going to make it. They need the combined strength from both sides of the valley to win. The only way they can communicate is by sending messengers on a risky path through the valley, and General A won’t know what the right time is until everyone’s already in position.

How can those two generals coordinate to make sure they attack at the same time?
This is a magical computer-science-land problem, by the way, so reasonable suggestions like “semaphore” or “telescopes” don’t apply.

On the surface the problem seems trivial.
General A could just send a message to General B with a proposed time, say, 8 o'clock. But the messenger has to pass through the valley, and if they’re spotted, they’re not going to make it to the other side to deliver the message. So how does General A know that General B received the message? The messenger might not have made it. If that happens, A will attack, B won’t, and they’ll lose.
So maybe they arrange it so General B has to send an acknowledgement back, and General A will only attack if that acknowledgement arrives.
But that now runs into the same problem: how does B know that A has received the acknowledgement?
If it doesn’t get through, A won’t attack, B will, and they’ll lose.
So, General A could send an acknowledgement for the acknowledgement. But how do they know that message has gotten through? Well, General B could send an acknowledgement for the acknowledgement for the acknowledgement, and so on, and so on, and so on.

tgp_02.png

This problem is unsolvable.
I know, it feels like there should be some hacky workaround like sending 200 messengers, and sure, that would probably work in the real world. But this is magical information-theory computer-science land.
Under these strict rules, there is never a guarantee, there is no certainty, there is no arrangement that can be made, there is no way, that the two generals, the two computers sending data, can agree that the message has definitely been received and acknowledged.
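
To make that regress concrete, here is a small Python sketch. It assumes a naive rule: each general attacks only if every message they expected has arrived. The function name `decide` and its parameters are made up for illustration. It's a toy, not a proof, but it shows that however many acknowledgements you plan, losing the last one splits the generals.

```python
def decide(n_messages, lost=None):
    """Return (a_attacks, b_attacks) for a plan of n_messages alternating messages.

    Message 0 goes A -> B, message 1 goes B -> A, and so on. Each message is
    a reply to the previous one, so if message `lost` is intercepted, nothing
    after it is ever sent. Assumed rule: a general attacks only if every
    message they expected to receive actually arrived.
    """
    delivered = n_messages if lost is None else lost
    # B receives the even-numbered messages, A the odd-numbered ones.
    a_expected, b_expected = n_messages // 2, (n_messages + 1) // 2
    a_received, b_received = delivered // 2, (delivered + 1) // 2
    return a_received == a_expected, b_received == b_expected


for n in range(1, 6):
    a_attacks, b_attacks = decide(n, lost=n - 1)  # only the final message is lost
    print(f"{n} planned message(s): A attacks={a_attacks}, B attacks={b_attacks}")
```

In every row exactly one general attacks: whoever sent the final message can't tell that it never arrived. Adding one more acknowledgement just moves the problem to the other side of the valley.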

Anyway. I was ordering food and I put my order together, I tapped ‘pay’, input my password and I got the little Apple Pay progress bar, and the little tick. And then I got a message from the app saying that there had been a problem, and my order had failed to go through.
Would I like to try again?
And I was about to... I was about to hit ‘pay’ again, but then something, just in the back of my head, said: HANG ON!
There was that little tick saying payment had worked, and I’m enough of a computer nerd to go "I’m not sure I believe that failed". So I checked the ‘order history’ page. It took a few tries to load, but when it finally did, there was my order.

Processing

It had gone through, but the acknowledgement hadn’t come back. Or, rather, something had gone wrong on the app’s servers, and the logic they’d written thought it had failed when it hadn’t. So I sat tight, I hoped that my food would arrive, and I figured that the engineers were probably having a very bad day. They really were, lol.

Because I wasn't the only one. People all over Tokyo ordering via the app were going to the payment screen, hitting the button and getting "try again". And a lot of them did. Again, and again, and again.

They were General A, and the app’s server was General B, and they were part of a real-life, complicated version of the Two Generals' Problem.
Imagine all the customers as General A, sending message after message to General B. B received the messages and dutifully took the money from the credit card every time, but something had happened that stopped the confirmation message getting through.

According to the flood of angry reports on Twitter, sometimes the restaurant would realize the problem and just send one order. Sometimes the restaurant wouldn’t realize, and three different drivers would arrive with three identical orders. Sometimes no food would arrive at all. The app’s customer service line was swamped.

To be clear: this was not the sort of thing that is one engineer’s fault. When something goes this drastically wrong, there have been many poor decisions made over a long period of time. A single human error is never a root cause.

So what else could the app team have done?
How can you solve the Two Generals' Problem in the real world?

Well, first, maybe no-one should have been able to place two identical orders on the same credit card, for the same restaurant, within a few minutes of each other. That seems like the sort of thing there should have been a check for?
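
A check like that could look roughly like the sketch below. Everything in it is an assumption for illustration: the class name `DuplicateOrderGuard`, the five-minute window standing in for "a few minutes", and the idea that an order is identified by card, restaurant and items. I have no idea what the real app's data looks like.

```python
import time

DUPLICATE_WINDOW_SECONDS = 5 * 60  # assumed value for "a few minutes"


class DuplicateOrderGuard:
    """Flags an order that matches a recent one on card, restaurant and items."""

    def __init__(self, window=DUPLICATE_WINDOW_SECONDS):
        self.window = window
        self._last_seen = {}  # (card, restaurant, items) -> time of last accepted order

    def is_duplicate(self, card_id, restaurant_id, items, now=None):
        now = time.time() if now is None else now
        fingerprint = (card_id, restaurant_id, tuple(sorted(items)))
        previous = self._last_seen.get(fingerprint)
        if previous is not None and now - previous < self.window:
            return True  # same order seen too recently; don't record it again
        self._last_seen[fingerprint] = now
        return False


guard = DuplicateOrderGuard()
print(guard.is_duplicate("card-1", "pizza-place", ["margherita"], now=0))
print(guard.is_duplicate("card-1", "pizza-place", ["margherita"], now=60))
print(guard.is_duplicate("card-1", "pizza-place", ["margherita"], now=600))
```

The first and third calls are accepted and the second is flagged. This is only a heuristic, though: somebody might really want the same pizza twice, so it probably makes more sense to ask "are you sure?" than to reject the order outright.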

But I think the real solution is an “idempotency token”, or an “idempotency key”.
This is a unique value generated on the app, or on the web site.
It’s a shopping cart ID, basically, and it’s sent along with the order.

It's not just for shopping carts, though: the idempotency token could be attached to instructions to delete the oldest log file, or send a text message, or anything that you only want to happen once. The server stores the idempotency key to keep track of the request, and if another request arrives with the same key attached, the server knows it’s already dealt with that request. So it doesn’t fulfill it again; instead it knows that the reply didn’t get through, and it just sends back a copy of that first acknowledgement.
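
Here is a minimal sketch of that server-side behaviour in Python. The `OrderServer` class, its `place_order` method and the shape of the response are all hypothetical, and the in-memory dictionary stands in for what would need to be durable storage in a real service.

```python
import uuid


class OrderServer:
    """Toy server that fulfils each idempotency key at most once."""

    def __init__(self):
        self._responses = {}  # idempotency key -> first acknowledgement sent
        self.charges = []     # stand-in for real card payments

    def place_order(self, idempotency_key, card_id, amount):
        if idempotency_key in self._responses:
            # Already handled: the earlier reply must have been lost,
            # so resend a copy of it instead of charging again.
            return self._responses[idempotency_key]
        self.charges.append((card_id, amount))
        response = {"order_id": len(self.charges), "status": "accepted"}
        self._responses[idempotency_key] = response
        return response


server = OrderServer()
key = str(uuid.uuid4())  # generated once by the app, reused on every retry
first = server.place_order(key, "card-1", 1800)
retry = server.place_order(key, "card-1", 1800)
print(first == retry, len(server.charges))
```

The retry gets back the same acknowledgement as the first request, and the card is charged once. A real implementation would also have to make the "check the key, then store it" step atomic, otherwise two retries arriving at the same moment could both slip through.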

Now, that still won’t help if none of the messengers get through, if the connection completely fails, but for real-world problems, humans will notice that. Idempotence means that you can request the same thing multiple times and it’ll only ever happen once.
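
From the app's side, the important detail is that the key is created once per order and reused on every retry. The sketch below assumes a hypothetical `FlakyServer` that loses a fixed number of replies, and a made-up `order_with_retries` helper; neither reflects how the real app works.

```python
import uuid


class FlakyServer:
    """Hypothetical server that processes requests but loses some replies."""

    def __init__(self, replies_to_drop):
        self.replies_to_drop = replies_to_drop
        self.orders = {}       # idempotency key -> order
        self.charge_count = 0  # how many times the card was charged

    def place_order(self, key, order):
        if key not in self.orders:
            self.orders[key] = order
            self.charge_count += 1
        if self.replies_to_drop > 0:
            # The order went through, but the messenger never made it back.
            self.replies_to_drop -= 1
            raise TimeoutError("acknowledgement lost")
        return {"status": "accepted", "order": self.orders[key]}


def order_with_retries(server, order, max_attempts=5):
    key = str(uuid.uuid4())  # one key per order, NOT one per attempt
    for attempt in range(1, max_attempts + 1):
        try:
            return server.place_order(key, order), attempt
        except TimeoutError:
            continue  # safe to retry: the server deduplicates on the key
    raise RuntimeError("gave up; a human should check the order history")


server = FlakyServer(replies_to_drop=3)
reply, attempts = order_with_retries(server, "one pizza")
print(attempts, server.charge_count)
```

With three lost replies, the fourth attempt succeeds and the charge count is still 1. If every attempt fails, the helper gives up and raises, which is the point where a human has to notice.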

That’s the way to work around the Two Generals' Problem in the real world.

I was lucky. I placed one order, I was charged for one order, and one order of food arrived half an hour later.

Next time, I’ll just cook for myself.

Reference (Japanese): https://ja.wikipedia.org/wiki/%E4%BA%8C%E4%BA%BA%E3%81%AE%E5%B0%86%E8%BB%8D%E5%95%8F%E9%A1%8C
https://aws.amazon.com/jp/builders-flash/202104/serverless-idempotency/?awsf.filter-name=*all
