Building a medianizer

Nice to meet you, and thank you for all your advice for Compound.

As an alternative approach to the oracle, how about holding off on liquidation for a minute when the price reaches a level that would trigger liquidation?

Thank you, Nik, for chiming in and sharing your thoughts. You bring up good points that will certainly be taken into account while developing better oracle infrastructure for Compound.

The immediate goal I have is to get the medianizer built so we can begin to discuss what feeds should be added. In tandem, I am working on getting more exchanges to adopt what Coinbase and Okex have done for signing prices. I don't foresee Compound adding an oracle aggregate (Chainlink, Band, etc.) in the near future.

What I know for certain is that whatever is built will not be perfect, and that is okay. The main goal is to improve on the current single-source system. I would like to avoid getting caught up in trying to build the perfect oracle solution and instead focus on using resources efficiently.

3 Likes

The core issue here, though, is that a medianizer always weighs each input equally, actually increasing the risk of data manipulation attacks by assuming each source is of equal quality at all times. This is an issue no matter what collection of sources is used:

  • Weighing raw exchange data equally with decentralized price feeds that aggregate data from a multitude of exchanges is an issue as it overweighs a select few exchanges. This lowers the cost of manipulation by reducing the number of exchanges a malicious actor has to manipulate to affect the final median value, especially if volume/liquidity consolidates away from those select few exchanges.

  • Weighing raw data from different exchanges equally presents an issue because every exchange has a different level of liquidity and volume (a different cost of manipulation). This overweighs exchanges with less volume/liquidity and underweighs those with higher volume/liquidity. Additionally, if volume consolidates to a small number of exchanges, then a malicious actor only has to manipulate the low volume/liquidity exchanges to affect the median value.

  • Weighing different price feed oracle solutions equally is an issue because each price feed generates its data in different ways. Some price feeds fetch raw data from a predefined selection of exchanges, weighing each equally and thus not generating proper market coverage, while other price feeds fetch from multiple data aggregation firms with full-time data quality teams and monitoring tools, generating proper market coverage by weighing each exchange by its real volume/liquidity. Additionally, some price feeds make the mistake of pulling from both raw exchanges and data aggregators, weighing each equally and thus overweighing a select few exchanges.
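The equal-weighting problem in these bullets can be made concrete with a toy example (all prices and sources are invented): a simple median over one aggregated feed and two thin exchanges lets an attacker who controls only the thin venues outvote the feed that represents most of the market's volume.

```python
import statistics

# Hypothetical snapshot: one aggregated feed covering most of the market,
# plus two thin exchanges. A simple median weighs all three equally.
aggregated_feed = 100.0  # volume-weighted price across many venues
thin_exchange_a = 100.2
thin_exchange_b = 99.8

honest_median = statistics.median([aggregated_feed, thin_exchange_a, thin_exchange_b])

# An attacker only needs to move the two thin venues: the aggregated feed,
# despite representing most of the market's volume, is outvoted 2-to-1.
attacked_median = statistics.median([aggregated_feed, 80.0, 79.5])

print(honest_median)    # 100.0
print(attacked_median)  # 80.0
```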

Essentially, such a medianizer design is vulnerable to these types of issues regardless of what inputs are used. As a result, I do not think this medianizer design will be tamper-resistant enough against data manipulation to properly secure a $5.6Bn market, particularly when there already exist decentralized price feed solutions that can be integrated into Compound today, like Chainlink, which provides proper market coverage.

We have already seen from the DAI Liquidation event that you don't need to move the entire market to manipulate a price oracle that operates without proper market coverage. Even a highly liquid exchange like Coinbase can deviate from the market-wide price, so if the medianizer is overweighing a select few exchanges, then only those exchanges need to be manipulated (not the whole market) to affect the oracle. While lower-liquidity assets are more vulnerable to the issues I've listed, the lack of market coverage is actually an issue for all assets on the Compound protocol.

2 Likes

You present five issues:

  1. The median of a set of numbers is much safer than a mean. If we have a handful of exchanges reporting data, but one or two are reporting bad data or are down (an unlikely event), we don't have to worry because the other exchanges will still create a safe median. If we were using a mean, then you would be right.

  2. I am planning on advocating to add more exchanges and on-chain exchanges, and not to add aggregate sources like Chainlink, because I agree this would be a problem. Although, it would be a small problem in the grand scheme of things.

  3. Yes, some exchanges have more liquidity than others. In the big picture, all the top 20 exchanges have plenty of liquidity to produce a real market. The bigger concern I have is about uptime/downtime, but if we have 10-plus sources and one or two go down, we don't need to worry.

  4. For now, I am not considering adding any oracle solutions, so this is not a concern of mine.

  5. I already cited the DAI Liquidation event in November as a failure of the current system. It is well documented that if we had been sourcing prices from multiple exchanges, this wouldn't have happened, or at least not at the same scale.
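The robustness claim in point 1 can be sketched with a toy example (all numbers invented): with ten reporters and one bad value, the median barely moves, while a mean is dragged far off.

```python
import statistics

# Ten hypothetical reporters; one is down or compromised and reports garbage.
reports = [100.1, 99.9, 100.0, 100.2, 99.8, 100.1, 99.9, 100.0, 100.3, 5.0]

print(statistics.median(reports))  # 100.0: the single outlier is ignored
print(statistics.mean(reports))    # ~90.5: the outlier drags the mean down
```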

1 Like

On point 1, would your plan to add more high-volume exchanges require them to adopt the open oracle and sign their price feed data?

1 Like

I am not talking about taking a simple mean, but about how it is more optimal to create a reference price where sources are weighted differently according to their difference in quality (like what proven price feed solutions already do). If a source went down and its previous data became stale, it could be given a weight of zero. If we look into how professional data aggregation firms generate their indices, you'll see that they don't take a simple median or a simple mean, because those do not generate proper market coverage; they generate a volume-weighted average price.
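A sketch of what such an aggregation might look like (the function name, the staleness window, and all numbers are my own assumptions, not any vendor's actual methodology): a volume-weighted average price that drops stale sources to zero weight.

```python
# Volume-weighted average price (VWAP) over a set of quotes, where any quote
# older than max_age seconds is discarded (i.e. its weight drops to zero).
def vwap(quotes, now, max_age=60):
    # quotes: list of (price, volume, timestamp) tuples
    fresh = [(p, v) for p, v, ts in quotes if now - ts <= max_age]
    total_volume = sum(v for _, v in fresh)
    if total_volume == 0:
        raise ValueError("no fresh quotes")
    return sum(p * v for p, v in fresh) / total_volume

now = 1_000_000
quotes = [
    (100.0, 900, now - 10),    # high-volume venue dominates the result
    (101.0, 100, now - 5),     # low-volume venue contributes proportionally
    (50.0,  500, now - 3600),  # stale quote: filtered out entirely
]
print(vwap(quotes, now))  # (100*900 + 101*100) / 1000 = 100.1
```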

Taking the median from multiple exchanges means you are more protected from downtime, yes, but it does not mean you are safe from manipulation. This medianizer cannot produce safe values if volume/liquidity consolidates to a few exchanges, especially if it moves to new exchanges that aren't being tracked. While the exchanges added to the medianizer may provide enough market coverage initially, there is no guarantee it will stay that way going into the future.

When taking a properly weighted average, adding more exchanges means you track more of the market, but when taking a median value, it means a malicious actor only has to manipulate 51% of the exchanges to affect the final value, even if 80%+ of the volume consolidates to one or two exchanges (like how many DeFi tokens are vastly more liquid on Uniswap). Adding more exchanges does not add tamper-resistance against manipulation in this current design, particularly as Compound scales up and adds more collateral to stay competitive and grow in TVL.
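The count-versus-volume point can be illustrated with invented numbers: an unweighted median over five venues is fully controlled by manipulating any three of them, even when one untouched venue carries the vast majority of real volume.

```python
import statistics

# Five tracked venues; the first carries ~80% of real volume (think Uniswap
# for a long-tail DeFi token), but an unweighted median counts each equally.
# Volumes are shown only for context; the median ignores them entirely.
prices  = [10.00, 10.02, 9.98, 10.01, 9.99]
volumes = [8000,  500,   500,  500,   500]

print(statistics.median(prices))  # 10.0

# Manipulating any 3 of the 5 venues (a majority by count, a small minority
# by volume) fully controls the reported median.
attacked = [10.00, 7.00, 9.98, 7.10, 7.05]
print(statistics.median(attacked))  # 7.1
```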

Downtime is really not the issue here, data manipulation attacks are.

I did not get the impression from others in this thread that only values from exchanges would be used, but price feeds as well. It is up to governance to decide if this is the case or not, but regardless of what sources are used, a simple median is not enough in my opinion. Some questions to consider:

  • Who will be responsible for ensuring the medianizer has proper market coverage at all times (automated systems and alerts for this)?
  • When volume consolidates to a few exchanges, particularly exchanges not being tracked, what is the process for updating the collection of sources to keep market coverage (will this be fast enough during extreme market events)?
  • How will we convince more exchanges to sign their data (we seem to have hit a stall on this front)?
  • Why should the Compound community spend precious resources (COMP) building a new price feed solution that cannot adequately guarantee market coverage, when there already exist price feed solutions that have proven their ability to generate and maintain market coverage for other large-scale money markets (like how Aave and Cream use Chainlink feeds without issue)?
1 Like

Thanks for weighing in @nikkunkel. You bring up many good points. I share your concern regarding mixing different data types as well as your concern with us being able to actually get enough exchanges to commit to signing data in the first place.

Data aggregators ensure full market coverage as they are monitored by a variety of full-time data experts who can quickly respond to volume shifts across exchanges, data irregularities, etc. Combining many high-quality aggregators ensures there is no single source of truth, and allows for the utilization of multiple data aggregation methodologies, which increases the overall tamper-resistance of the system.

Pulling directly from exchange APIs, on the other hand, leaves these data-centric concerns in the hands of Compound governance, which may be slow/unable to react to certain market conditions. It will also present challenges when it comes to adding more collateral types as they will need to be individually assessed, voted in, and continually monitored for proper market coverage.

Personally, I don't think forking an existing oracle project is the way to go, as it would be too expensive and laborious to keep updated, and we wouldn't benefit from the shared cost model of simply using a feed from an existing oracle project.

This all leads to me supporting option #2 (using an aggregated oracle service). The simplest solution is almost always the correct one.

2 Likes

Correct me if I'm wrong, but I think we kinda all agree that relying on one source is not the ideal oracle implementation. The current Compound open oracle uses Coinbase (reporter 0xfceadafab14d46e20144f48824d0c09b1a03f2bc) for price data, which must be within ~20% of the Uniswap price as another layer of sanity checking.
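The ~20% Uniswap anchor check described above amounts to a simple bounds test. A minimal sketch (the helper name and tolerance constant are illustrative, not the actual contract logic):

```python
# Accept a reporter price only if it lies within a tolerance band around the
# anchor (Uniswap) price. The ~20% band follows the post above.
ANCHOR_TOLERANCE = 0.20

def within_anchor(reporter_price, anchor_price, tolerance=ANCHOR_TOLERANCE):
    lower = anchor_price * (1 - tolerance)
    upper = anchor_price * (1 + tolerance)
    return lower <= reporter_price <= upper

print(within_anchor(100.0, 110.0))  # True: 100 is within [88, 132]
print(within_anchor(100.0, 130.0))  # False: 100 is below [104, 156]
```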

Given that the open oracle already has standard off-chain signing and OpenOraclePriceData already supports posting prices from multiple reporters to the store, modifying the smart contract to compute a median from multiple sources should not be hard. We can start with a weighted median and see how it goes.

So here's what I think makes sense:

  1. We write a new oracle medianizer that allows many prices from many reporters to be submitted within one transaction. Valid data points will be recorded on-chain.
  2. Governance should be able to add more reporters and adjust the weight of each reporter. The medianizer will use weighted median as the asset price for Comptroller calculation.
  3. On-chain price data (such as KP3R's Uniquote, Chainlink, or Band) can also become a "reporter". In that case, no signatures are needed, but rather the smart contracts need to implement a certain TBD interface.
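The weighted median in step 2 above can be sketched as follows (the reporter names, weights, and data shapes are assumptions for illustration, not the proposed contract): sort reports by price and return the first price at which cumulative weight reaches half the total, so a governance-set weight of zero effectively removes a reporter.

```python
# Weighted median over (price, weight) reports, as a plain-Python sketch of
# what the on-chain medianizer would compute for the Comptroller.
def weighted_median(reports):
    # Drop zero-weight reporters, then sort remaining reports by price.
    reports = sorted((r for r in reports if r[1] > 0), key=lambda r: r[0])
    total = sum(w for _, w in reports)
    cumulative = 0
    for price, weight in reports:
        cumulative += weight
        # Return the price where cumulative weight first reaches half the total.
        if cumulative * 2 >= total:
            return price
    raise ValueError("no reports with positive weight")

reports = [
    (101.0, 1),  # thin centralized exchange
    (100.0, 4),  # aggregated on-chain feed, weighted higher by governance
    (99.5,  1),  # another thin venue
]
print(weighted_median(reports))  # 100.0
```

With equal weights this reduces to an ordinary median, so governance can start simple and adjust weights per reporter later.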

Now the question of who will decide what a "reporter" is, whether a primary source, an aggregator, an oracle solution, an anon guy, etc., is pretty irrelevant to the implementation, I think.

Each solution has pros and cons, as discussed by several of you, but that can be decided by COMP governance later. My personal opinion is that the combination of all reporter types, while it makes it hard to reason about where the exact price data value comes from, is probably the hardest for an attacker to manipulate in practice. And the Compound market does not really need the exact, most correct price; it just needs to be within a certain threshold to make sure loans are liquidated before becoming insolvent.

Let me know what you guys think. We gotta start somewhere and if many of you think this is the right start, I can start the implementation right away!

2 Likes

I very much agree with your suggestions.

Especially the following part

"Compound market does not really need the exact, most correct price; it just needs to be within a certain threshold to make sure loans are liquidated before becoming insolvent."

All we need to do is prevent liquidation!

Compromising on data quality is what got us in trouble in the first place, and I fear this mindset is a slippery slope that will present more issues as Compound continues to scale and add new collateral.

There just isn't a great reason for a multi-billion-dollar protocol to compromise on data quality/accuracy, especially when we are seeing other money markets like Aave operating without issue because they utilize feeds that don't compromise.

1 Like

@TennisBowling let's try and get some snapshots up?

This sounds like a great start! I know the community is looking forward to seeing your/Band's proof of concept on Wednesday's governance call.

As the medianizer slowly comes to life, I'll start a new thread about the next steps. Generally speaking, going from just using Coinbase to using Coinbase and Okex will be a huge improvement, but I have higher hopes than that.

I'll start a separate forum thread after Wednesday's call, if it goes well, to begin the discussion.

5 Likes

Looking forward to discussing this in the next thread!

UPDATE: 2/10/2020

The Band Protocol team (@sorawit) presented their proof of concept for the medianizer on today's governance call. Overall, the POC received a warm reception.

Next steps:

  • Make the POC into an efficient contract ready for audit.

  • Run some simulations to analyze gas costs.

  • Make the contract upgradable.

  • Research integrating oracles (like Uniswap) directly into the medianizer to save on gas.

Additionally, be on the lookout for a new forum post beginning the discussion of reporters/price feeds.

5 Likes

Thank you for the update.
Could you link the Band Protocol team's statement?

1 Like

I don't have a written statement to post, and unfortunately, the last governance call does not have a recording I can link due to technical issues.

I can link Band's GitHub for the medianizer.

3 Likes

Although I'm nowhere near technical enough to actively build this, I just have to say I listened to the Community Call with @getty and the Band Protocol representatives, and this is a great effort. You did a great job presenting it! I'm absolutely sold on this being required to keep growing the Compound protocol. Thanks for doing this!

2 Likes