Racing against snoopers in Joinmarket 0.2

Posted by: Adam Gibson | in Joinmarket | 1 year, 1 month ago | 8 comments

In my previous post, I talked about the two most important new functionalities in the 0.2 version that's currently in testing. It's particularly the second of those two, the defence against privacy degradation via snooping on utxos, that's important, and most in need of thought, analysis and explanation.

The more I think about it, the more it seems that this can only be properly understood in a wider context, so I'll, here, attempt to answer these questions:

  • What is an attacker trying to achieve?
  • How does he achieve it now (if indeed he does)?
  • How does the new functionality in 0.2.0 make it more difficult to achieve?
  • What other measures can or should be taken to make it harder?

Preamble - a reminder of the basic Joinmarket conversation

This is how the conversation went pre-0.2, and it will be mostly unchanged; note that this is happening simultaneously for several Makers (and one Taker):

Taker Maker
I'd like to fill your offer number 0 (which allows any amount between 0.5 and 5btc) to do a coinjoin, I want amount 1btc [fill] >>>
<<< OK, here's my ECDH pubkey [pubkey]
OK, here's mine [auth] ** >>>
FROM NOW ON THE CONVERSATION IS E2E ENCRYPTED
<<< Here's a list of my utxos for your transaction [ioauth]**

(Builds transaction after getting utxos from all Makers).

OK, here's the transaction, please sign it [tx]

>>>
<<< Here's my signature(s) [sig]

(gathers all the signatures)(adds his own signatures)

Broadcasts transaction to Bitcoin network

** - this misses some details that aren't vital for this discussion.

The words in [] are the names of the messages, some of which are referred to in the discussion below.

What is the attacker trying to achieve?

We'll focus on the, most likely, frequent goal: if an attacker already knows the source of the inputs, he wants to identify the coinjoin output of the initiator (Taker) of the transaction (figuring out the change output is always easy). Here's a simple diagram of a basic Joinmarket coinjoin transaction, nabbed from here :

jmtx1

Remember, the aspect of CoinJoin that Joinmarket leverages, as illustrated in the "coinjoin outputs" above, is that those equal sized outputs are indistinguishable. The idea is that, even if you know the ownership of the inputs (utxo0,..4 above) (which you might not), it doesn't help you figure out the ownership of utxos 5,6 and 7.

Now, the attacker doesn't just look at one transaction - he looks at all the Joinmarket transactions, of which this is just one. Suppose the initiator (Taker) of the transaction spent utxo1 to utxo5 and utxo8 (fees are ignored in this example). That would mean that utxo6 and utxo7 belong to the other participants - the "Makers" in joinmarket. The attacker can prove this as follows:

  • Query the makers in the joinmarket pit using fill- after a few handshake negotiation messages, the maker will return an ioauth message, which lists the utxos that the Maker proposes to include in the new transaction. Now the attacker has no intention of going through with it; he just wanted that list of utxos. Depending on stuff (we'll get to this below), this list (it looks like txid:n, txid:n .. ) might include one of the outputs as above, for example utxo6. If so, he crosses off utxo6 on his diagram above; he knows that one wasn't the one that belonged to the Taker.
  • Who should he query? If the pit is full it might be quite a lot of work to query all of them (say 70-100) all the time. Before 0.2.0, though, he can do this, and repeatedly, and quickly. But he can be more intelligent: focus his efforts on Makers that either (a)have just joined (remember: bots have ephemeral names, a newly arrived bot may simply be an old one with a new name) or (b)have newly reannounced their orders, something that they sometimes do after a transaction.
  • How often should he query? This turns out to be a really important question. First, doing only one query means he has to decide what amount to query with (the fill message has to specify the amount of coins that you want to join). Mostly he'll try to query with the maximum amount, so as to get the list of all utxos, but different Maker bots have a variety of weird ways of specifying their offers; sometimes they offer all coins in one mixdepth, sometimes it's a complex mixture. He might find that he needs to do multiple queries to the same bot to get the maximum possible snooping effect, and in some cases it might not be enough. Second, he cannot be too slow in doing these queries, for the important reason that the utxos he's looking for might get used in new transactions, thus getting consumed, and will no more be offered in ioauth messages from that Taker. At that point he has permanently lost the opportunity to identify the Maker as the holder of that utxo.

How does he achieve it now (if indeed he does)?

So let's paint the positive picture for the attacker, in Joinmarket pre-0.2. He gets to make these queries as often as he likes, subject only really to the limitations of the messaging channel (the IRC server). That means, very often, if he feels like it. His bots can be restarted as often as he likes, he can connect over Tor - there is no way to flag his identity and ban him, for exactly the same reason that Joinmarket users appreciate having an anonymised messaging service. Most Joinmarket users strongly favour using Tor for this same, good reason. Note, further, that his actions up to the point of dropping the transaction are exactly the actions of an honest participant. So he does tons of request->utxo collection pings on all the Makers he wants, perhaps a few for each, to get a good chance of hoovering up within his net, the utxos that he is looking for (I have thus far emphasised the coinjoin outputs, they're critical, but of course he is also collecting utxos that will appear as inputs in the next transactions - often they will be the same, plus previous change). If he manages to hoover the right ones up before they get consumed in a new transaction, he's done his job - he knows, say, that utxo6 belongs to maker A and utxo7 belongs to maker B and that by elimination, utxo5 was the output of the Taker.

If you've got the basic gist from that description (the exact details are fiddly and in some sense beside the point), then this quote of mine from the original discussion may resonate:

The fundamental issue is that makers announce their utxos to anyone who (pretends to) start a transaction with them; whether this can be avoided without creating hideous DOS issues I don't know.

So, given that using any kind of identities is not an acceptable solution, the only way we can stop this is by taking a combination of two approaches:

  1. Make the ability to gather utxos in a transaction request as limited, or unreliable as possible (yield generator coin selection algorithms)
  2. Make the ability to gather utxos rate limited by enforcing some kind of scarcity (taker commitment schemes)

We'll take these in reverse order, and then see the overall effect on an attacker.

How does the new functionality in 0.2.0 make it more difficult to achieve?

The mechanism in 0.2.0 has been covered in depth in these posts. It was also mentioned that there are other possibilities, so let's treat the high level concept of "utxos are limited". Now, the Taker (who may be an attacker) does not, in the conversation sequence shown in the preamble, have to provide a utxo (actually he did for another reason but that is off topic so ignored), so this is a new field in the message; instead of

!fill offer-id amount

we switch to

!fill offer-id amount commitment=(type byte,hash(pubkey2)) nick-signature

The significance of nick-signature is discussed in the previous blog post and so ignored here. The commitment is a SHA256 hash of the public key corresponding to a pay-to-pubkey-hash (P2PKH), i.e. standard, Bitcoin utxo (so technically it's a commitment to a Bitcoin keypair, which is a bit more restrictive than a utxo, although hopefully most people are not re-using addresses too much!), derived from PoDLE as previously described. The commitment is prepended with a type byte, in this case "P", for future compatibility with other commitment types.

The utxo committed to is required to have certain key properties as described in the section Adding commitments to slow down snoopers . If the commitment doesn't pass the test of non-reuse, the Maker drops the Taker right after the fill message. If the commitment is new, but on opening of the commitment - which in the new 0.2 protocol comes in the auth message:

!auth commitment-opening=(utxo,pubkey,pubkey2,schnorr-sig) nick-signature

- it is found to not pass those tests, then again the Taker is dropped - before the ioauth message is sent, thus the Taker, if an attacker, does not achieve his goal of utxo list gathering.

Quantitative guesstimates

How many utxos, and of what type does an attacker need? Let's imagine some numbers (these represent a bigger, more active Joinmarket than today, but only by a smallish factor):

Number of makers in the pit \(N_m = 100\), number of makers in one transaction \(N_t = 3\) (note that's 4 counterparties including Taker in a typical Joinmarket transaction, about right). That means there are 3 coinjoin outputs in a transaction that the attacker is trying to "place". Number of transactions per hour \(\alpha = 20 \). Number of queries to one maker needed by the attacker to retrieve the intended utxo, if indeed it can be retrieved: \(Q = 2\). Success probability after \(Q\) queries: \(p = 0.8\). In the crudest model you'd consider the attacker needing to query every maker in the pit after every transaction occurs, which would need: \(\beta = \alpha N_m Q= 4000\) separate commitments per hour. Given taker_utxo_retries = 3 (see previous blog post), that would mean ~ 1000 utxos needed, satisfying age requirements in each hour, and perhaps more importantly, their size has to be at least taker_utxo_amtpercent percent (default 20%) of the largest available size in the offers, typically - so even if a maker just did a transaction of only 1 btc, if his offer is 10 btc, your attacker utxo commitment still needs to bind to a commitment of 2 btc all the same, in order to pass the auth check in that case.

Now, here's a big reason that this is, currently, unrealistic: Most Makers today reannounce their offers immediately after doing a transaction (they basically make adjustments because the number of coins on offer is changing etc.). That gives the attacker a big headstart: he can focus on querying those reannouncements, and if he's successful, stop there (for that utxo). In the worst case you could imagine only needing \(\beta = \alpha N_t Q = 120\) instead of 4000. The truth may be between extremes, accounting for e.g. bot restarts and a host of other factors.

One could easily imagine, that to fulfil commitment requirements, the attacker will need to generate utxos holding maybe even in the hundreds of BTC (remember: 20% restriction on large fills), refreshing them into new utxos by doing transactions every hour (the 1 hour figure is particularly relevant since by default we have chosen taker_utxo_age to be 5 blocks). They don't have to spend them, but they have to keep cycling them in transactions (and an amusing detail is: they can be tracked by the honest Joinmarket participants - that'd make for a curious role reversal..).

But, does the attacker need to have fresh utxos satisfying these criteria every hour, even under these circumstances?

The race

The above guesstimate scenario is so full of wild assumptions as to be borderline ridiculous, but we have to start somewhere. We find an important dynamic going on: the attacker has to try to keep up with pit activity. If he does the ~ 4000 queries as outlined above in one hour, and then takes a rest for a few hours, he is failing in his goal (if his goal is to identify the ownership of all outputs, anyway): this is a job that can't be done retroactively. Once a utxo is generated in transaction 1, and then used later in transaction 2, it will never again appear in any Maker's offer, so it cannot be identified using this attack any more. So we see there is a race going on between the attacker and the honest pit - the faster the real Joinmarket transactions are taking place, the faster the attacker has to query Makers in order to catch utxos between transactions. In the simplest possible mathematical terms, \(\beta \propto \alpha\) - there's a linear proportionality between real transaction rate and commitment requirement for snoopers, no matter what are the right values for other variables. And since utxos of certain size and age are fundamentally scarce, this is good news. In the limit, one can imagine an optimistic future with even 100 joinmarket transactions per individual block, making the attacker's job here nigh on impossible.

It seems entirely possible to build a fairly sophisticated mathematical model of the dynamics of this race; but we'll leave that for another time.

What other measures can or should be taken to make it harder?

Apart from the rate-limiting element, the other crucial element is how reliably and frequently does a request for utxos result in receiving the utxos that the attacker is looking for? This would address the values \(Q\) and \(p\) in the above. We want to increase the former and decrease the latter as much as possible. I see several ways to address this from within the algorithms, and patterns of behaviour of the Makers. To quote Chris Belcher in the same discussion as above:

So maybe part of the solution is a polyculture of maker algorithms ?

Let's make a list again!

  1. Introduce randomization into the utxos that a Maker selects for a particular transaction - will tend to increase the number of requests to find that utxo.
  2. Prefer algorithms that restrict the offers to one mixdepth; as was the case for the first, simplest one coded: yield-generator-basic. This is maybe the least fun idea but an extremely important one in my opinion: limit and isolate the utxos on offer at any particular time, and in particular if it so happens that that set does not include the utxo the attacker is looking for, the game is temporarily won - but it's complex because at some point in the future, that utxo is going to be re-announced as available for joining usually - what's clear is it makes the attacker's job much harder if this doesn't happen quickly. The tension we're trying to address is that the Makers are trying to maximize their marketability (e.g. offering a variety of different cost rates for different utxos), while at the same time preserving privacy, but if we are more aggressively "privacy preserving" like this, it makes the whole system much more viable ... slight tragedy of the commons effect.
  3. Restart bots frequently - this simple measure forces the attacker to make more queries, although it is helpful if offer announcement does not make it immediately obvious that it's the same bot as before!
  4. Don't make the algorithm of the maker obvious directly from it's public pit offer announcements; that makes the attacker's job easier since he can fine tune his fill requests based on knowing the code/algorithm in advance.
  5. Prefer algorithms that don't require re-announcement of offers after transaction completion, as this is a very useful marker for an attacker as to who to query.
  6. More exotic ideas - cross-maker utxo consumption e.g. maker does "real" Joinmarket transaction, then switches role to Taker, does another join with the other makers quickly using up the utxo (to be honest I find this idea, which I just made up, entirely confusing and maybe nonsensical, but just throwing it out there)
  7. More, no doubt...

Conclusion

In summary, the measures described in the previous 2 sections, in some reasonable combination, may greatly reduce the privacy loss we are seeing from an unlimited attacker-querier environment; although there's no reason to think such attacks would stop entirely.  The first of the two (utxo rate limiting) is now basically complete and running on testnet. The latter is not really (although just running yg-basic is a good start).

But what I think is more significant is, as mentioned in the subsection "The Race", the more Joinmarket scales up in size, the more difficult it gets to make these attacks work. In the extreme, if Joinmarket scaled up by a factor of 10 to say 700 Makers and 50 transactions per hour (completely made up numbers, like all the rest!), the attack might become almost completely ineffective.

Current rating: 5

Comments

Don 1 year, 1 month ago

Great explanation thx, much clearer now. Keep at this it's gonna work

Link | Reply
Current rating: 5

Adam Gibson 1 year, 1 month ago

Thanks! (and for the comment, now I know they work :)

Link | Reply
Currently unrated

Anonymous 1 year, 1 month ago

It is a great post. But one way for a snooper to find Maker utxos is missing. The snooper can do a complete JoinMarket transaction. They know which outputs are theirs, so they know which outputs belong to Makers. Makers will use those outputs as utxo inputs for future transactions.

So the snooper learns Maker utxos plus they get new utxos of their own at the same time. They have to pay fees to the Makers of course.

I do not know if this is a viable way for a snooper to operate or not.

Another way to make a snooper's job more difficult is to have Makers randomly announce their offers often, and use randomized high/low limits. So if the actual limit for a Maker is 0.1-0.5 BTC they would announce 0.134685-0.496235 BTC and then 0.182591-0.466720 BTC and so on at random times. That would make it harder to know when a Maker has done a transaction. Also Makers should not announce right after they do a transaction, since a snooper can see a CoinJoin transaction has been broadcast to the network recently.

Makers should also randomize their fees slightly to prevent correlation when they restart under a new nickname.

Link | Reply
Current rating: 5

Adam Gibson 1 year, 1 month ago

> It is a great post. But one way for a snooper to find Maker utxos is missing. The snooper can do a complete JoinMarket transaction. They know which outputs are theirs, so they know which outputs belong to Makers. Makers will use those outputs as utxo inputs for future transactions.

Yes, absolutely. Consider two Sybil attack cases: the attacker chooses to be a Maker, or he chooses to be a Taker. The Malte paper (http://weis2016.econinfosec.org/wp-content/uploads/sites/2/2016/05/WEIS_2016_paper_58.pdf) covers the Maker Sybil scenario in some depth, it is an important attack on Joinmarket privacy but I think a somewhat less serious one than this snooping attack, which is currently really zero cost.
It requires, ideally, the Sybils to be all of the Makers for the transaction.

The problem with the Sybil attack consisting of actually being a Maker is that it doesn't scale well; the more honest Makers, the less effective it is. The more separate attackers even (imagine two different attackers), the less effective it is. And it can be strongly ameliorated by choosing Makers in a price insensitive, e.g. random way. The paper linked above focuses on being among the cheapest liquidity providers which is a restrictive case.

We discussed that attack scenario from day 1 (and we discussed this one for more than a year) - I think the former is not a strong argument against Joinmarket being viable, it just illustrates that it's limited.

If you choose to be a Taker you must do this a lot to be really effective, it's quite costly, and again it scales badly - the more other Takers there are, the more activity you'll have to do yourself; at least you are paying the Makers, and you'll encourage more of them - making your job ever harder perhaps. But this is probably not really the point, the attack described in the post is a better way to do it.

Link | Reply
Currently unrated

Adam Gibson 1 year, 1 month ago

> Another way to make a snooper's job more difficult is to have Makers randomly announce their offers often, and use randomized high/low limits. So if the actual limit for a Maker is 0.1-0.5 BTC they would announce 0.134685-0.496235 BTC and then 0.182591-0.466720 BTC and so on at random times. That would make it harder to know when a Maker has done a transaction. Also Makers should not announce right after they do a transaction, since a snooper can see a CoinJoin transaction has been broadcast to the network recently.

> Makers should also randomize their fees slightly to prevent correlation when they restart under a new nickname.

I have been thinking about exactly these same points the last couple of days! Thanks for your input.

One thing I am not so sure about is how far the randomization goes because the Makers obviously want to advertise something very close to their actual maximums. I note that the attacker (right now) is choosing to fill at amounts slightly less than the advertised maximums.
But, overall, good points, thank you.

Link | Reply
Currently unrated

Anonymous 1 year, 1 month ago

Also Makers should not broadcast utxos to the blacklist if the JoinMarket transaction was broadcast to the network and confirmed. Obviously a utxo that has 1 confirmation or more cannot be reused so there is no reason to broadcast it. Only utxos that are part of failed transactions need to be broadcast.

The Maker should wait to see the utxo spent and confirmed before broadcasting. This is also true because a snooper could complete a JoinMarket conversation but never broadcast the transaction. So Makers should broadcast to the blacklist if they do not see the utxo spent on the network after some time (f.e. one hour).

Link | Reply
Current rating: 5

Adam Gibson 1 year, 1 month ago

I'm not sure if you were aware, but what is blacklisted is a commitment which doesn't reveal the actual utxo (see the podle post for details). So there is nothing lost by simply broadcasting the commitment to other makers and having them blacklist it as soon as it gets "used up". As you say, after a transaction negotiation happens, you don't know whether it'll be broadcast or not ( mean the transaction, not the commitment); if the Taker is a snooper and doesn't intend to broadcast it, you can't really wait and see because during that time he could choose to re-use the commitment, defeating our purpose.

Link | Reply
Currently unrated

Don 1 year, 1 month ago

Not sure if you have seen this but just in case


http://bitfury.com/content/5-white-papers-research/bitfury_whitepaper_shared_send_untangling_in_bitcoin_8_24_2016.pdf

Link | Reply
Currently unrated

New Comment

required

required (not published)

optional