In my previous post, I talked about the two most important new functionalities in the 0.2 version that's currently in testing. It's particularly the second of those two, the defence against privacy degradation via snooping on utxos, that's important, and most in need of thought, analysis and explanation.
The more I think about it, the more it seems that this can only be properly understood in a wider context, so here I'll attempt to answer these questions:
This is how the conversation went pre-0.2, and it will be mostly unchanged; note that this is happening simultaneously for several Makers (and one Taker):
| Taker | | Maker |
|---|---|---|
| I'd like to fill your offer number 0 (which allows any amount between 0.5 and 5btc) to do a coinjoin, I want amount 1btc [fill] | >>> | |
| | <<< | OK, here's my ECDH pubkey [pubkey] |
| OK, here's mine [auth] ** | >>> | |
| | FROM NOW ON THE CONVERSATION IS E2E ENCRYPTED | |
| | <<< | Here's a list of my utxos for your transaction [ioauth] ** |
| (Builds transaction after getting utxos from all Makers.) OK, here's the transaction, please sign it [tx] | >>> | |
| | <<< | Here's my signature(s) [sig] |
| (Gathers all the signatures, adds his own signatures.) Broadcasts transaction to Bitcoin network | | |
** - this misses some details that aren't vital for this discussion.
The words in [] are the names of the messages, some of which are referred to in the discussion below.
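As a rough sketch only, the message order above can be written down as data, which makes it easy to see how little a snooper has to send before the utxo list comes back. The names `Step`, `CONVERSATION` and `messages_seen_before_abort` are made up for this illustration and are not part of the Joinmarket code.

```python
from typing import NamedTuple


class Step(NamedTuple):
    sender: str   # "Taker" or "Maker"
    message: str  # message name as used in the conversation above


# Order of messages between the Taker and ONE Maker; the Taker runs the
# same sequence with several Makers at the same time.
CONVERSATION = [
    Step("Taker", "fill"),
    Step("Maker", "pubkey"),
    Step("Taker", "auth"),
    Step("Maker", "ioauth"),
    Step("Taker", "tx"),
    Step("Maker", "sig"),
]


def messages_seen_before_abort(abort_after: str) -> list[str]:
    """Messages exchanged if the conversation stops right after `abort_after`."""
    names = [step.message for step in CONVERSATION]
    return names[: names.index(abort_after) + 1]


# A party that walks away after ioauth has sent only fill and auth.
seen = messages_seen_before_abort("ioauth")
sent_by_taker = [s.message for s in CONVERSATION[: len(seen)] if s.sender == "Taker"]
```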
We'll focus on what is most likely the most frequent goal: if an attacker already knows the source of the inputs, he wants to identify the coinjoin output of the initiator (Taker) of the transaction (figuring out the change output is always easy). Here's a simple diagram of a basic Joinmarket coinjoin transaction, nabbed from here:
Remember, the aspect of CoinJoin that Joinmarket leverages, as illustrated in the "coinjoin outputs" above, is that those equal sized outputs are indistinguishable. The idea is that, even if you know the ownership of the inputs (utxo0,..4 above) (which you might not), it doesn't help you figure out the ownership of utxos 5,6 and 7.
Now, the attacker doesn't just look at one transaction - he looks at all the Joinmarket transactions, of which this is just one. Suppose the initiator (Taker) of the transaction spent utxo1, creating utxo5 and utxo8 (fees are ignored in this example). That would mean that utxo6 and utxo7 belong to the other participants - the "Makers" in Joinmarket. The attacker can prove this as follows:
He sends a Maker a fill - after a few handshake negotiation messages, the Maker will return an ioauth message, which lists the utxos that the Maker proposes to include in the new transaction. Now the attacker has no intention of going through with it; he just wanted that list of utxos. Depending on several factors (we'll get to these below), this list (it looks like txid:n, txid:n .. ) might include one of the outputs as above, for example utxo6. If so, he crosses off utxo6 on his diagram above; he knows that one wasn't the one that belonged to the Taker.
fill message has to specify the amount of coins that you want to join). Mostly he'll try to query with the maximum amount, so as to get the list of all utxos, but different Maker bots have a variety of weird ways of specifying their offers; sometimes they offer all coins in one mixdepth, sometimes it's a complex mixture. He might find that he needs to do multiple queries to the same bot to get the maximum possible snooping effect, and in some cases it might not be enough. Second, he cannot be too slow in doing these queries, for the important reason that the utxos he's looking for might get used in new transactions, thus getting consumed, and will no longer be offered in ioauth messages from that Maker. At that point he has permanently lost the opportunity to identify the Maker as the holder of that utxo.
So let's paint the positive picture for the attacker, in Joinmarket pre-0.2. He gets to make these queries as often as he likes, subject only really to the limitations of the messaging channel (the IRC server). That means, very often, if he feels like it. His bots can be restarted as often as he likes, he can connect over Tor - there is no way to flag his identity and ban him, for exactly the same reason that Joinmarket users appreciate having an anonymised messaging service. Most Joinmarket users strongly favour using Tor for this same, good reason. Note, further, that his actions up to the point of dropping the transaction are exactly the actions of an honest participant. So he does tons of request->utxo collection pings on all the Makers he wants, perhaps a few for each, to get a good chance of hoovering up within his net, the utxos that he is looking for (I have thus far emphasised the coinjoin outputs, they're critical, but of course he is also collecting utxos that will appear as inputs in the next transactions - often they will be the same, plus previous change). If he manages to hoover the right ones up before they get consumed in a new transaction, he's done his job - he knows, say, that utxo6 belongs to maker A and utxo7 belongs to maker B and that by elimination, utxo5 was the output of the Taker.
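The elimination step at the end of that description can be sketched in a few lines. This is only an illustration of the reasoning, not attacker tooling: the function name, the Maker nicks and the extra `utxo11` are invented for the example, and the utxo labels follow the diagram's numbering.

```python
def eliminate_maker_outputs(coinjoin_outputs, snooped):
    """Attribute coinjoin outputs to Makers and return whatever is left over.

    coinjoin_outputs: the equal-sized outputs of the target transaction.
    snooped: dict mapping a Maker nick to the set of utxos it revealed
             in ioauth messages collected by the attacker.
    """
    attributed = {}
    for maker, utxos in snooped.items():
        for utxo in utxos:
            if utxo in coinjoin_outputs:
                attributed[utxo] = maker
    # Outputs that no Maker ever offered are candidates for the Taker's output.
    remaining = [u for u in coinjoin_outputs if u not in attributed]
    return attributed, remaining


outputs = ["utxo5", "utxo6", "utxo7"]
snooped = {
    "makerA": {"utxo6", "utxo11"},  # utxo11 is unrelated to this transaction
    "makerB": {"utxo7"},
}
attributed, remaining = eliminate_maker_outputs(outputs, snooped)
```

With both Maker outputs collected, `remaining` holds only utxo5. If either Maker's output had been consumed before the attacker queried, two candidates would remain and the elimination would fail.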
If you've got the basic gist from that description (the exact details are fiddly and in some sense beside the point), then this quote of mine from the original discussion may resonate:
The fundamental issue is that makers announce their utxos to anyone who (pretends to) start a transaction with them; whether this can be avoided without creating hideous DOS issues I don't know.
So, given that using any kind of identities is not an acceptable solution, the only way we can stop this is by taking a combination of two approaches:
We'll take these in reverse order, and then see the overall effect on an attacker.
The mechanism in 0.2.0 has been covered in depth in these posts. It was also mentioned that there are other possibilities, so let's treat the high level concept of "utxos are limited". Now, the Taker (who may be an attacker) does not, in the conversation sequence shown in the preamble, have to provide a utxo (actually he did for another reason but that is off topic so ignored), so this is a new field in the message; instead of
!fill offer-id amount
we switch to
!fill offer-id amount commitment=(type byte,hash(pubkey2)) nick-signature
The significance of nick-signature is discussed in the previous blog post and so ignored here. The commitment is a SHA256 hash of the public key corresponding to a pay-to-pubkey-hash (P2PKH), i.e. standard, Bitcoin utxo (so technically it's a commitment to a Bitcoin keypair, which is a bit more restrictive than a utxo, although hopefully most people are not re-using addresses too much!), derived from PoDLE as previously described. The commitment is prepended with a type byte, in this case "P", for future compatibility with other commitment types.
The utxo committed to is required to have certain key properties as described in the section Adding commitments to slow down snoopers. If the commitment doesn't pass the test of non-reuse, the Maker drops the Taker right after the fill message. If the commitment is new, but on opening of the commitment - which in the new 0.2 protocol comes in the auth message:
!auth commitment-opening=(utxo,pubkey,pubkey2,schnorr-sig) nick-signature
- it is found to not pass those tests, then again the Taker is dropped - before the ioauth message is sent, thus the Taker, if an attacker, does not achieve his goal of utxo list gathering.
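To make the two checkpoints concrete, here is a minimal sketch of the Maker-side logic, under several loud assumptions: the class and function names are hypothetical, the commitment string format is illustrative, the PoDLE/Schnorr verification is reduced to a `podle_ok` flag that the caller must supply, and the age and amount thresholds simply reuse the defaults mentioned in this post. It is not the Joinmarket implementation.

```python
import hashlib


def make_commitment(pubkey2: bytes) -> str:
    """Type byte "P" followed by the hex SHA256 of pubkey2 (illustrative format)."""
    return "P" + hashlib.sha256(pubkey2).hexdigest()


class MakerCommitmentCheck:
    """Hypothetical Maker-side bookkeeping for Taker commitments."""

    def __init__(self):
        self.used = set()  # commitments already seen

    def on_fill(self, commitment: str) -> bool:
        """Non-reuse test: drop the Taker if the commitment was seen before."""
        if commitment in self.used:
            return False
        self.used.add(commitment)
        return True

    def on_auth(self, commitment, pubkey2, utxo_confirmations, utxo_amount,
                reference_amount, min_age=5, amt_percent=20, podle_ok=True):
        """Checks on the opened commitment, run before any ioauth is sent.

        Amounts are integers (satoshis). `reference_amount` is whatever
        amount the size rule is measured against; `podle_ok` stands in for
        the real PoDLE/Schnorr verification, which is NOT implemented here.
        """
        if make_commitment(pubkey2) != commitment:
            return False  # opening does not match what was committed to
        if not podle_ok:
            return False
        if utxo_confirmations < min_age:
            return False  # utxo too young
        if utxo_amount * 100 < amt_percent * reference_amount:
            return False  # utxo too small
        return True


checker = MakerCommitmentCheck()
pk2 = b"\x02" + b"\x11" * 32  # dummy bytes, not a real curve point
c = make_commitment(pk2)
first_fill = checker.on_fill(c)   # new commitment: accepted
second_fill = checker.on_fill(c)  # same commitment again: rejected
```

The point of the ordering is that every failure path returns before the Maker reveals anything, so a rejected snooper leaves without a utxo list.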
How many utxos, and of what type does an attacker need? Let's imagine some numbers (these represent a bigger, more active Joinmarket than today, but only by a smallish factor):
Number of makers in the pit \(N_m = 100\), number of makers in one transaction \(N_t = 3\) (note that's 4 counterparties including the Taker in a typical Joinmarket transaction, about right). That means there are 3 Maker coinjoin outputs in a transaction that the attacker is trying to "place". Number of transactions per hour \(\alpha = 20\). Number of queries to one maker needed by the attacker to retrieve the intended utxo, if indeed it can be retrieved: \(Q = 2\). Success probability after \(Q\) queries: \(p = 0.8\). In the crudest model you'd consider the attacker needing to query every maker in the pit after every transaction occurs, which would need \(\beta = \alpha N_m Q = 4000\) separate commitments per hour. Given taker_utxo_retries = 3 (see previous blog post), that would mean on the order of 1000 utxos (\(4000/3 \approx 1333\)) needed in each hour, all satisfying the age requirements. Perhaps more importantly, their size has to be at least taker_utxo_amtpercent percent (default 20%) of, typically, the largest available size in the offers - so even if a maker just did a transaction of only 1 btc, if his offer is 10 btc, the attacker's commitment still needs to bind to a utxo of at least 2 btc in order to pass the auth check in that case.
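For anyone who wants to play with these guesstimates, the arithmetic of the crude model is spelled out below. The variable names mirror the symbols and config options above; `min_commit_size` is a helper invented for this sketch, and all the input numbers are the made-up ones from this post.

```python
import math

N_m = 100    # makers in the pit
Q = 2        # queries per maker needed to retrieve the intended utxo
alpha = 20   # Joinmarket transactions per hour

# Crudest model: query every maker in the pit after every transaction.
beta = alpha * N_m * Q  # commitments needed per hour

taker_utxo_retries = 3      # commitments allowed per utxo
taker_utxo_amtpercent = 20  # minimum utxo size, as a percentage

# Each utxo yields taker_utxo_retries commitments, so round up.
utxos_needed = math.ceil(beta / taker_utxo_retries)


def min_commit_size(reference_btc: float) -> float:
    """Smallest utxo (in btc) that passes the size rule for a given amount."""
    return reference_btc * taker_utxo_amtpercent / 100


print(beta, utxos_needed, min_commit_size(10))
```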
Now, here's a big reason that this is, currently, unrealistic: Most Makers today reannounce their offers immediately after doing a transaction (they basically make adjustments because the number of coins on offer is changing etc.). That gives the attacker a big headstart: he can focus on querying those reannouncements, and if he's successful, stop there (for that utxo). In the worst case you could imagine only needing \(\beta = \alpha N_t Q = 120\) instead of 4000. The truth may be between extremes, accounting for e.g. bot restarts and a host of other factors.
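The two extremes can be compared with the same toy arithmetic. The function name is made up for this sketch, and the last line adds an assumption that is not in the model above: that the attacker's success against each Maker is independent, so that placing all \(N_t\) Maker outputs of one transaction (which is what elimination needs) succeeds with probability \(p^{N_t}\).

```python
def commitments_per_hour(tx_per_hour, makers_queried_per_tx, queries_per_maker):
    """beta = alpha * (makers queried per transaction) * Q."""
    return tx_per_hour * makers_queried_per_tx * queries_per_maker


alpha, N_m, N_t, Q, p = 20, 100, 3, 2, 0.8

crude = commitments_per_hour(alpha, N_m, Q)     # query the whole pit
targeted = commitments_per_hour(alpha, N_t, Q)  # query only reannouncing makers
saving_factor = crude / targeted                # equals N_m / N_t

# ASSUMPTION: independent per-maker success; all N_t maker outputs must be
# placed before the Taker's output falls out by elimination.
p_all_placed = p ** N_t

print(crude, targeted, round(saving_factor, 1), round(p_all_placed, 3))
```

Under these numbers, immediate reannouncement cuts the attacker's commitment bill by a factor of roughly 33, which is why it matters so much.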
One could easily imagine that, to fulfil commitment requirements, the attacker will need to generate utxos holding maybe even in the hundreds of BTC (remember: 20% restriction on large fills), refreshing them into new utxos by doing transactions every hour (the 1 hour figure is particularly relevant since by default we have chosen taker_utxo_age to be 5 blocks). They don't have to spend them, but they have to keep cycling them in transactions (and an amusing detail is: they can be tracked by the honest Joinmarket participants - that'd make for a curious role reversal..).
But, does the attacker need to have fresh utxos satisfying these criteria every hour, even under these circumstances?
The above guesstimate scenario is so full of wild assumptions as to be borderline ridiculous, but we have to start somewhere. We find an important dynamic going on: the attacker has to try to keep up with pit activity. If he does the ~ 4000 queries as outlined above in one hour, and then takes a rest for a few hours, he is failing in his goal (if his goal is to identify the ownership of all outputs, anyway): this is a job that can't be done retroactively. Once a utxo is generated in transaction 1, and then used later in transaction 2, it will never again appear in any Maker's offer, so it cannot be identified using this attack any more. So we see there is a race going on between the attacker and the honest pit - the faster the real Joinmarket transactions are taking place, the faster the attacker has to query Makers in order to catch utxos between transactions. In the simplest possible mathematical terms, \(\beta \propto \alpha\) - there's a linear proportionality between real transaction rate and commitment requirement for snoopers, whatever the right values for the other variables are. And since utxos of certain size and age are fundamentally scarce, this is good news. In the limit, one can imagine an optimistic future with even 100 Joinmarket transactions per individual block, making the attacker's job here nigh on impossible.
It seems entirely possible to build a fairly sophisticated mathematical model of the dynamics of this race; but we'll leave that for another time.
Apart from the rate-limiting element, the other crucial element is how reliably and frequently a request for utxos results in receiving the utxos that the attacker is looking for. This would address the values \(Q\) and \(p\) in the above. We want to increase the former and decrease the latter as much as possible. I see several ways to address this from within the algorithms and patterns of behaviour of the Makers. To quote Chris Belcher in the same discussion as above:
So maybe part of the solution is a polyculture of maker algorithms?
Let's make a list again!
In summary, the measures described in the previous 2 sections, in some reasonable combination, may greatly reduce the privacy loss we are seeing from an unlimited attacker-querier environment; although there's no reason to think such attacks would stop entirely. The first of the two (utxo rate limiting) is now basically complete and running on testnet. The latter is not really (although just running yg-basic is a good start).
But what I think is more significant is, as mentioned in the subsection "The Race", the more Joinmarket scales up in size, the more difficult it gets to make these attacks work. In the extreme, if Joinmarket scaled up by a factor of 10 to say 700 Makers and 50 transactions per hour (completely made up numbers, like all the rest!), the attack might become almost completely ineffective.
Adam Gibson (23)