NEAR: Randomness in Blockchain Protocols
We published a short paper today describing the randomness beacon used in the NEAR protocol. In this accompanying blog post, we discuss why randomness matters, why it is hard, and how it is addressed in other protocols.
Many modern blockchain protocols, including NEAR, rely on a source of randomness to select participants who perform certain actions in the protocol. If malicious actors can influence this source of randomness, they can increase their chances of being selected and potentially compromise the security of the protocol.
Distributed randomness is also a key component of many distributed applications built on blockchains. For example, a smart contract that accepts bets from participants and pays out double the bet with 49% probability (and nothing with 51% probability) relies on being able to obtain an unbiased random number. If malicious actors can influence or predict that random number, they can increase their chance of getting paid and drain the contract.
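To make this concrete, here is a quick back-of-the-envelope calculation (a minimal sketch with an assumed one-unit bet): with honest randomness the bettor loses money in expectation, but even a two-point bias flips the expected value in their favor.

```python
# Expected value of a one-unit bet that pays double the bet with probability p_win.
def expected_value(p_win: float, bet: float = 1.0) -> float:
    return p_win * 2 * bet - bet

print(expected_value(0.49))  # fair randomness: ~ -0.02, the contract keeps an edge
print(expected_value(0.51))  # biased randomness: ~ +0.02, the bettor drains the contract over time
```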
When we design a distributed randomness algorithm, we want it to have three properties:
It has to be unbiased. In other words, no participant can influence the output of the random number generator in any way.
It needs to be unpredictable. In other words, no actor can predict what number will be generated (or reason about any of its properties) before it is generated.
The protocol needs to tolerate a certain percentage of participants going offline or trying to deliberately stop the protocol.
In this article, we'll cover the basics of distributed randomness beacons and discuss why simple approaches don't work. We will then introduce the approaches used by DFinity, Ethereum Serenity, and NEAR, and discuss their advantages and disadvantages.
RANDAO is a very simple and therefore very common approach to randomness. The general idea is that the participants of the network first all privately choose a pseudo-random number and submit a commitment to it, then all agree on some set of commitments using a consensus algorithm, then all reveal their chosen numbers, reach consensus on the revealed numbers, and take the XOR of the revealed numbers as the output of the protocol.
RANDAO is unpredictable and has the same liveness as the underlying consensus protocol, but it is biasable. Specifically, once others start revealing their numbers, a malicious actor can observe the network and choose to reveal or withhold their own number based on the XOR of the numbers observed so far. This gives a single malicious actor one bit of influence over the output, and malicious actors controlling multiple participants have as many bits of influence as the number of participants they control.
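A minimal Python sketch of the commit-reveal flow and of the last-revealer bias is below. The SHA-256 commitments, the attacker's preference predicate, and the omission of the consensus steps are simplifications for illustration, not the exact construction used by any particular chain.

```python
import hashlib
import secrets
from functools import reduce

def commit(value: bytes) -> bytes:
    # The commitment to a privately chosen value is simply its hash here.
    return hashlib.sha256(value).digest()

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Commit phase: each of the 100 participants picks a random value and publishes a commitment.
values = [secrets.token_bytes(32) for _ in range(100)]
commitments = [commit(v) for v in values]
# (In the real protocol the set of commitments is agreed upon via a consensus algorithm here.)

# Reveal phase: values are published and checked against the commitments,
# and the output of the protocol is the XOR of all revealed values.
assert all(commit(v) == c for v, c in zip(values, commitments))
honest_output = reduce(xor_bytes, values)

# The bias: a participant who reveals last can compute the output both with and
# without their value and withhold it if the alternative suits them better.
def attacker_prefers(output: bytes) -> bool:
    # Stand-in predicate, e.g. "this output selects me as the next block producer".
    return output[0] % 2 == 0

output_if_withheld = reduce(xor_bytes, values[:-1])
chosen = honest_output if attacker_prefers(honest_output) else output_if_withheld
print(chosen.hex())
```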
One way to make RANDAO unbiasable is to make the output not just the XOR, but the result of a function that takes longer to compute than the time allotted for revealing the numbers. If computing the final output takes longer than the reveal phase, a malicious actor cannot predict the effect of revealing or withholding their number, and therefore cannot influence the outcome.
While we want the function that computes the randomness to take a long time for the participants who generate the random number, we don't want the users of the random number to have to redo that same expensive computation. They need some way to quickly verify that the random number was generated correctly without recomputing the expensive function.
A function that takes a long time to compute, is fast to verify, and produces a unique output for every input is called a Verifiable Delay Function (VDF for short), and it turns out to be very complicated to design. There have been multiple recent breakthroughs, this one and this one, and Ethereum currently plans to use RANDAO with a VDF as its randomness beacon. Besides being unpredictable and unbiasable, this approach has the added advantage of remaining live even if only two participants are online (assuming the underlying consensus protocol is live with so few participants).
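As an illustration, here is a toy repeated-squaring delay function of the kind used in recent VDF constructions. This is only a sketch: the parameters are insecure, and a real VDF also outputs a succinct proof so that verification is fast, whereas here a verifier would have to redo all the squarings.

```python
# Delay function: y = x^(2^T) mod N, computed by T sequential squarings.
# The squarings cannot be parallelized, which is what makes this a "delay".
def vdf_eval(x: int, T: int, N: int) -> int:
    y = x % N
    for _ in range(T):
        y = (y * y) % N
    return y

# Toy parameters. NOT secure: the factorization of N is known and the modulus is tiny.
# A real deployment uses an RSA-sized modulus of unknown factorization and a T
# calibrated so that even dedicated hardware needs longer than the reveal phase.
N = 1000000007 * 998244353
T = 1_000_000
print(vdf_eval(0xBEEF, T, N))
```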
The biggest challenge with this approach is that the VDF needs to be configured such that even actors with very expensive dedicated hardware cannot finish computing the VDF before the reveal phase is over, ideally with some meaningful safety margin, e.g. 10x. The figure below shows an attack by an actor with a specialized ASIC that allows them to run the VDF faster than the time allotted to reveal the RANDAO commitments. Such an actor can compute the final output both with and without their share, and choose whether or not to reveal based on those outputs.
For the VDF families linked above, a dedicated ASIC can be 100+ times faster than conventional hardware. So if the reveal phase lasts 10 seconds, a VDF computed on such an ASIC must take more than 100 seconds to provide a 10x safety margin, and the same VDF computed on conventional hardware would then take 100 × 100 seconds = ~3 hours.
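The arithmetic from the paragraph above, spelled out as a small sketch (the numbers are the assumed ones from the example, not measured figures):

```python
reveal_phase_s = 10    # duration of the RANDAO reveal phase
safety_margin = 10     # desired safety margin
asic_speedup = 100     # assumed speedup of a dedicated ASIC over commodity hardware

asic_time_s = reveal_phase_s * safety_margin       # the VDF must take >= 100 s on the ASIC
commodity_time_s = asic_time_s * asic_speedup      # => 10,000 s on commodity hardware
print(commodity_time_s / 3600)                     # ~2.8 hours
```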
The way the Ethereum Foundation plans to solve this problem is to design its own ASIC and make it publicly available for free. Once this happens, all other protocols can take advantage of the technology, but until then, the RANDAO+VDF approach is not feasible for protocols that cannot invest in designing their own ASICs.
Another approach to randomness, pioneered by DFinity, is to use so-called threshold BLS signatures. (Interestingly, DFinity hired Ben Lynn, the "L" in BLS.)
A BLS signature is a construction that allows multiple parties to create a single signature on a message, often used to save space and bandwidth by not requiring multiple signatures to be sent around. A common use of BLS signatures in blockchains is signing blocks in BFT protocols. Say 100 participants create blocks, and a block is considered final once 67 of them sign it. They can all submit their parts of the BLS signature, use some consensus algorithm to agree on 67 of them, and then aggregate those into a single BLS signature. Any 67 parts would suffice to create an aggregate signature, but the resulting signature differs depending on which 67 signatures were aggregated.
It turns out that if the private keys the participants use are generated in a particular way, then no matter which 67 (or more, but not fewer) signatures are aggregated, the resulting multisignature is the same. This can be used as a source of randomness: the participants first agree on some message they will sign (it could be the output of RANDAO, or just the hash of the last block, as long as it is different each time and agreed upon), and collectively create a multisignature on it. The output is unpredictable until 67 participants provide their parts, but even before the first part is provided, the output is already predetermined and cannot be influenced by any single participant.
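The snippet below is not actual BLS; it is a minimal Shamir-style sketch over a prime field that illustrates the key property being relied on: any 67 of the 100 shares reconstruct exactly the same value, so the output is fixed as soon as the shares exist, yet unknown until enough of them are combined. In real threshold BLS the same interpolation happens on partial signatures "in the exponent" rather than on raw field elements.

```python
import random

P = 2**127 - 1  # a Mersenne prime, used as the field modulus for this sketch

def make_shares(secret: int, threshold: int, n: int) -> list[tuple[int, int]]:
    # A random polynomial of degree threshold-1 with the secret as the constant term.
    coeffs = [secret] + [random.randrange(P) for _ in range(threshold - 1)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares: list[tuple[int, int]]) -> int:
    # Lagrange interpolation at x = 0 recovers the constant term.
    total = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

secret = random.randrange(P)
shares = make_shares(secret, threshold=67, n=100)

# Any two different subsets of 67 shares yield the same, predetermined value.
assert reconstruct(random.sample(shares, 67)) == reconstruct(random.sample(shares, 67)) == secret
```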
This approach to randomness is completely unbiasable and unpredictable, and it remains live as long as ⅔ of the participants are online (and can be configured for any threshold). While ⅓ of the participants going offline or misbehaving can stall the algorithm, it takes at least ⅔ of the participants cooperating to influence the output.
However, there is a caveat: as mentioned above, the private keys for this scheme need to be generated in a particular way. The key generation process, known as Distributed Key Generation, or DKG for short, is complex and an area of active research. In a recent talk, DFinity presented a very sophisticated approach that involves zk-SNARKs, a very complex and not yet time-tested cryptographic construction, as one of the steps. Research on threshold signatures in general, and DKGs in particular, is not yet at a state where it can easily be applied in practice.
The NEAR approach is influenced by another algorithm called RandShare. RandShare is an unbiasable and unpredictable protocol that can tolerate up to ⅓ of the participants being malicious. It is relatively slow, and the linked paper also describes two ways to speed it up, called RandHound and RandHerd, but unlike RandShare itself, RandHound and RandHerd are relatively complex, while we want the protocol to be very simple.
Besides the large number of messages exchanged (together the participants exchange O(n^3) messages), RandShare has the general problem that while ⅓ is a meaningful threshold for liveness in practice, it is low for the ability to influence the output, for several reasons:
The benefit of influencing the output can greatly outweigh the benefit of stalling the randomness.
If a participant controls more than ⅓ of the RandShare participants and uses that to influence the output, it leaves no trace, so a malicious actor can do it without ever being detected. Stalling consensus, by contrast, is naturally visible.
It is not impossible for someone to control ⅓ of the hash power or stake, and given (1) and (2) above, someone with such control is unlikely to try to stall the randomness, but can and likely will try to influence it.
The NEAR approach is described in the paper we published today. It is unbiasable and unpredictable, and tolerates up to ⅓ of the participants being malicious before liveness is affected, i.e. if someone controls ⅓ or more of the participants, they can stall the protocol.
However, unlike RandShare, it tolerates up to ⅔ of the actors being malicious before they can influence the output, which is a significantly better threshold for practical applications.
The core idea of the protocol is as follows (for simplicity, assume there are exactly 100 participants; a code sketch of the full flow follows the steps below):
Each participant comes up with their own part of the output, splits it into 67 parts, erasure-codes it to obtain 100 shares such that any 67 of them are enough to reconstruct it, assigns each of the 100 shares to one of the participants, encrypts each share with the public key of the participant it is assigned to, and then publishes all the encoded shares.
The participants use some consensus algorithm (such as Tendermint) to agree on such share sets from 67 participants.
Once consensus is reached, each participant takes, from each of the 67 published sets, the encoded share that is encrypted with their public key, decrypts all such shares, and publishes all the decrypted shares at once.
Once at least 67 participants have completed step (3), all the agreed-upon sets can be fully decoded and reconstructed, and the final number is obtained as the XOR of the initial parts the participants came up with in step (1).
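Here is a minimal sketch of that flow under the same simplifying assumptions (100 participants, threshold 67). It is not the actual NEAR implementation: the split-into-67-parts-and-erasure-code step is replaced by an equivalent 67-of-100 polynomial sharing, and the per-participant encryption, the consensus step, and all validation of misbehavior are elided.

```python
import random
from functools import reduce

P = 2**127 - 1      # prime field modulus, as in the sketch above
N, T = 100, 67      # 100 participants, any 67 shares reconstruct a value

def make_shares(secret: int) -> list[tuple[int, int]]:
    # 67-of-100 sharing: a random degree-66 polynomial with the secret as constant term.
    coeffs = [secret] + [random.randrange(P) for _ in range(T - 1)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, N + 1)]

def reconstruct(shares: list[tuple[int, int]]) -> int:
    # Lagrange interpolation at x = 0 recovers the constant term.
    total = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

# Step 1: every participant comes up with their part of the output and publishes shares of it.
# (Encrypting share i to participant i's public key is elided in this sketch.)
values = [random.randrange(P) for _ in range(N)]
published = [make_shares(v) for v in values]

# Step 2: consensus selects 67 of the published share sets (here: simply the first 67).
agreed = published[:T]

# Step 3: participants decrypt and publish their shares; here we just take any 67 shares
# of each agreed set, as if only 67 participants completed this step.
revealed = [random.sample(shares, T) for shares in agreed]

# Step 4: reconstruct each agreed value and XOR them to obtain the beacon output.
output = reduce(lambda a, b: a ^ b, (reconstruct(shares) for shares in revealed))
print(hex(output))
```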
The intuition for why the protocol is unbiasable and unpredictable is similar to that of RandShare and of the threshold signatures: the final output is decided as soon as consensus is reached, but is not known to anyone until ⅔ of the participants decrypt the shares encrypted with their public keys.
Handling all the corner cases and possible malicious behaviors makes it somewhat more involved (for instance, the participants need to handle the case in which someone creates an invalid erasure code in step (1)), but overall the whole protocol is very simple: the entire paper that describes it, with all the proofs, the relevant cryptographic primitives, and references, is only seven pages long. If you want to read a more formal description of the algorithm, or an analysis of its liveness and resistance to influence, be sure to check it out.
A similar idea of leveraging erasure codes is already used in NEAR's existing infrastructure: the block producers of a particular epoch create so-called chunks that contain all the transactions for a particular shard and distribute erasure-coded versions of those chunks to other block producers, with merkle proofs to ensure data availability.
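For context, here is a minimal sketch of a merkle inclusion proof of the kind used to show that an individual erasure-coded part belongs to a committed chunk (the hash function and the proof layout are generic simplifications, not NEAR's actual encoding):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])           # duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves: list[bytes], index: int) -> list[bytes]:
    proof, level = [], [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append(level[index ^ 1])        # the sibling of the current node
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(leaf: bytes, index: int, proof: list[bytes], root: bytes) -> bool:
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

# Example: 100 erasure-coded parts of a chunk; prove that part 42 belongs to the commitment.
parts = [f"part-{i}".encode() for i in range(100)]
root = merkle_root(parts)
assert verify(parts[42], 42, merkle_proof(parts, 42), root)
```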