We’re happy to announce our alpha release, aka Ari! Here is an update on how we arrived at Ari and what it is.
What happened after Devcon 4?
Back in October 2018, while we were running our implementation of Minimal Viable Plasma (MVP) at Devcon 4, we gained valuable insights into the operation and performance of our services. We even had our first production incident during the conference, which we mitigated to keep the services live. One of the key elements of software delivery is getting performance and operational information from the deployed code and using it to make data-driven decisions that ensure the services we are building are resilient for users. This continued iterative process is essential for ensuring the safety of, and trust in, our network, which is paramount to us. After all, this is a network that is transacting real-world assets which have value.
What was the incident at Devcon 4?
When we deployed the blockchain services that power Plasma Dog, we also deployed Quest, a service to view transactions on the network in real time. The Watcher stores network transactions in a relational database. Quest connects to the Watcher to get the transactions that have been verified. We had configured the client-side library of Quest to poll the Watcher every 15 seconds to get an updated list of the transactions on the network. This means that if a user left a browser tab open, that tab would keep making API calls, each fetching two hundred of the latest transactions. When we had several thousand users on Quest continually requesting updates to the transaction data, we quickly exhausted the resources available to us on the database. The CPU on the database pegged at 100% and the database refused further connections. The Watcher could not connect to the database and failed, causing the Quest service to fail too. Fortunately, because we have a separate Childchain and Watcher architecture, the blockchain continued to be operational.
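To get a feel for why fixed-interval polling overwhelmed the database, here is a rough back-of-the-envelope sketch. The 15-second interval and 200 transactions per request come from the description above. The figure of 3,000 open tabs is an assumption standing in for "several thousand users", and `polling_load` is a hypothetical helper, not part of Quest or the Watcher.

```python
def polling_load(open_tabs: int, poll_interval_s: float, rows_per_request: int) -> tuple[float, float]:
    """Return (requests per second, rows read per second) for fixed-interval polling."""
    requests_per_s = open_tabs / poll_interval_s
    rows_per_s = requests_per_s * rows_per_request
    return requests_per_s, rows_per_s


# 3,000 open tabs is an assumed figure; the real number of tabs is not known.
reqs, rows = polling_load(open_tabs=3000, poll_interval_s=15, rows_per_request=200)
print(f"{reqs:.0f} requests/s, {rows:.0f} rows/s")  # prints "200 requests/s, 40000 rows/s"
```

Under these assumptions the load grows linearly with the number of open tabs, whether or not anyone is looking at them, which matches the failure pattern described above.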
This graph is the CPU usage of the database that the Watcher uses. Shortly after it hits 100% utilization, you can see at 2:45 where the database refuses connections in an attempt to shed load, which it does successfully (causing the failure of the Watcher). At 2:52 we restarted the Watcher, only for it to fail again 5 minutes later due to the volume of requests. During the period between 3:00 and 3:20, we determined that the lack of CPU resources was causing this failure condition and re-provisioned the database to handle the load we were experiencing. At 3:24 services were operational again.
We continued to monitor how the services were performing and at 4:25 we could see an increase in the CPU load again. From the Watcher’s monitoring data we could see that several requests per second were coming from an API call associated with Quest’s transaction polling API. We deployed a hot fix on Quest to change that behavior, with the trade-off being that users would have to refresh the page manually if they wanted to view the updated transaction data. After we deployed that, at 4:43, the database CPU load started to reduce.
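The hot fix described above removed automatic polling altogether. A common middle ground, shown here only as a sketch and not as what was deployed, is to keep polling but back off exponentially when requests fail, with a little random jitter so that clients do not all retry at the same moment. The function name, the 300-second cap, and the 10% jitter are all assumptions made for illustration.

```python
import random


def next_poll_delay(base_s: float, consecutive_failures: int, cap_s: float,
                    jitter: float = 0.1, rng=random) -> float:
    """Seconds to wait before the next poll.

    The delay doubles with each consecutive failure, is capped at cap_s,
    and is then stretched by a random factor in [1, 1 + jitter].
    """
    backoff = min(cap_s, base_s * 2 ** consecutive_failures)
    return backoff * rng.uniform(1.0, 1.0 + jitter)


# Hypothetical values: 15 s base interval (as in Quest), 300 s cap (assumed).
for failures in range(6):
    print(failures, round(next_poll_delay(15, failures, 300, jitter=0), 1))
```

With jitter disabled, the delays for 0 to 5 consecutive failures are 15, 30, 60, 120, 240, and 300 seconds, so a struggling backend sees progressively fewer requests instead of a constant stream.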
In the weeks following Devcon 4, we redesigned our service deployments and began implementing Plasma MoreVP. Some of the highlights from this time are:
- We launched our Childchain and Watcher services on the public Ethereum test network Rinkeby.
- We discovered techniques that reduce our gas usage on the root chain (including not writing empty blocks to the root chain and modifying the gas price selection mechanism).
- The service deployment was redesigned to be more tolerant to failure conditions that we identified during and after Devcon 4.
- Continuous integration and deployment of the blockchain services was implemented to provide faster signal to developers of an error condition.
- The public APIs were redesigned to be consistent across all of the services in the OmiseGO ecosystem, which helps with eWallet integration.
- We found a whole range of bugs, including a serious one where blocks weren’t being written to the root chain.
- The Plasma smart contracts, Childchain, and Watcher were engineered to implement Plasma MoreVP.
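One of the gas-saving techniques in the list above, not writing empty blocks to the root chain, can be illustrated with a minimal sketch. The `Block` type and `blocks_to_submit` function are hypothetical and exist only for this example; they do not reflect how the Childchain is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical child chain block: a number and its list of transactions."""
    number: int
    transactions: list[str] = field(default_factory=list)


def blocks_to_submit(blocks: list[Block]) -> list[Block]:
    """Keep only blocks that contain transactions.

    Every submission to the root chain costs gas, so skipping empty
    blocks avoids paying for submissions that carry no information.
    """
    return [block for block in blocks if block.transactions]


pending = [Block(1000, ["tx_a", "tx_b"]), Block(2000), Block(3000, ["tx_c"])]
print([block.number for block in blocks_to_submit(pending)])  # prints [1000, 3000]
```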
We’ve switched from calling these releases the internal and external testnet to alpha and beta, respectively. We’ve been using this terminology internally for some time and we’ll be using it publicly from now on as well. We’re building software and we’d like to follow the software release life cycle naming convention, which offers more clarity on where we are in our software development. Alpha denotes software that is complete enough for internal testing, which is typically carried out by people other than the software engineers who wrote it. Testers can include individuals within the same organization or community that developed the software. Beta is the second phase of software testing, in which we’ll be opening access to everyone.
What is the alpha release (Ari)?
The alpha release, which we’ve nicknamed Ari (see name explanation below), concludes the current stage in our development cycle. We feel our builds, deployments, and smart contract development are in a good enough place that we’re ready to start using our Plasma MoreVP implementation on a public Ethereum network with users. This allows us, in tandem with partners, to test the integration of third-party apps with our Plasma MoreVP network. Hoard will be using our test network Ari at ETHDenver!
To celebrate ETHDenver and the launch of Plasma MoreVP, Hoard has developed a limited edition Bufficorn Skin for the hackathon. Thank you to Hoard for pioneering this program so far, and a shout-out to the Burner Wallet team! Check Hoard’s blog post for more details on the More Viable PlasmaDog on Rinkeby and how players at ETHDenver can authenticate their Plasma Dog session with their NFT Burner Wallet to unlock the playable Bufficorn skin.
Getting to this point has taken a lot of hard work and dedication to overcome the issues we’ve discovered while building. Here are some of the challenges that we’ve found and addressed along the way in 2019 alone:
- In certain conditions, exits from the Plasma chain were unchallengeable. This meant that anybody could deposit to the Plasma MoreVP contract, perform an exit, and the entire chain would become invalid. This would have resulted in a denial of service (DoS) condition, taking down the whole network for everyone.
- A race condition on the Watcher start-up was discovered where unchallenged exits were processed before other modules of the system started. This meant that for users running their own Watcher, it would appear the chain is faulty (has a byzantine condition) — which would prevent normal operation, including transactions.
- We found a bug where an attacker could drain all the ETH from the deployed Plasma MoreVP contracts.
- Protection against re-orgs on Ethereum has been added to the Plasma MoreVP contracts.
- We implemented a flexible deployment mechanism for deploying services in production or locally on a developer laptop.
- Support was added for other components of the exit game, including in-flight exits and piggybacking.
We’re really excited to be at this stage of the development and look forward to more load on our Plasma MoreVP services as part of the OmiseGO Developer Program (ODP). The ODP is an ongoing program of early testers and integrators who will be given first access to our products, documentation and tooling.
Why Ari? 🚋 🚋 🚋
The naming convention for test Ethereum networks is to use train stations. Rinkeby and Ropsten are stations in Sweden, Kovan is a station in Singapore. When we turned up the environment that was used for Devcon 4, we were huddled in a hotel room as we deployed the services and ERC20 tokens that would be needed by Hoard to operate Plasma Dog. During this, we realized we didn’t have a name for the network. We checked on a map and just around the corner from the conference center was Vyšehrad station. So that’s how we named that one.
We wanted to continue the naming convention with our alpha release and decided it was time to put Bangkok on the map. Ari is: 1) Easy to say for non-native Thai speakers, and 2) Can also mean “hospitable”, “accommodating” or “gentle”. Please be gentle with our test net. 😄
We’re going to continue the same iterative process that we’ve been building with to date. All software has bugs and we will not rush releases or perform any activity that has the potential to compromise user safety. The next version (beta) of our network will be a publicly available release that anybody can use! We’ll use the beta phase to observe real-world usage and look for bugs or flaws that haven’t been discovered in the alpha phase. We’ll deploy the network to mainnet when, and only when, the code has been thoroughly tested and audited and we are confident that it’s a safe place to put real money.
We’re currently planning a video workshop pre-event and a face-to-face workshop ̶h̶a̶c̶k̶a̶t̶h̶o̶n̶ during EDCON HACK, which takes place this April in Sydney, Australia. We will have a bug bounty in place to catch more bugs and security vulnerabilities as a part of the process of getting ready for mainnet. We will develop more tooling around the interaction with services and the status of the systems we’ve deployed — eWallet Plasma integration. Stay tuned for more details.
Want to get involved?
If you would like to get even more involved with the next stages of development and building, we’re hiring!
Stay updated. Connect with us.