The Volume of Ethereum (ETH)

Yağmur Bali
Mar 25, 2020

This regression project is my second project at Istanbul Data Science Academy.

Cryptocurrencies are becoming very popular. So, what is a cryptocurrency? A cryptocurrency is a digital or virtual currency secured by cryptography, a method of protecting information and communications so that only the intended recipient can view their contents.

According to my research, predictions for the stock market or traditional currencies are more common than predictions for cryptocurrencies. So I tried it for a cryptocurrency, and I chose ETH rather than Bitcoin (BTC) because ETH has the second-highest volume among cryptocurrencies, after Bitcoin.

First of all, I scraped the data from this website using BeautifulSoup. After importing the necessary tools, we are ready to scrape. You can scrape any website you choose with the code below:

import bs4
import requests

url = 'website_url'
def get_page_contents(url):
    # Download the page and parse its HTML
    page = requests.get(url)
    return bs4.BeautifulSoup(page.text, 'html.parser')
soup = get_page_contents(url)

For this project, I scraped Ethereum historical data from 7 August 2015 to 13 March 2020. The data has ‘Close’, ‘High’, ‘Low’, ‘Open’, ‘Volume’ and ‘Market Cap’ columns. Let’s look at what these columns mean.
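Scraped cells arrive as text, so they need converting to dates and numbers before any modelling. Here is a minimal sketch of that step, assuming a row looks like the made-up sample below (a date such as 'Mar 13, 2020' and numbers with thousands separators). The column order, the `parse_row` helper and the values are illustrative assumptions, not the actual page layout or real quotes:

```python
from datetime import datetime

# Assumed column order and a made-up row; the real page may differ.
COLUMNS = ["Date", "Open", "High", "Low", "Close", "Volume", "Market Cap"]
raw_row = ["Mar 13, 2020", "100.00", "120.00", "90.00", "110.00",
           "1,234,567,890", "12,345,678,900"]

def parse_row(cells):
    """Convert one row of scraped strings into typed values."""
    record = dict(zip(COLUMNS, cells))
    record["Date"] = datetime.strptime(record["Date"], "%b %d, %Y").date()
    for name in COLUMNS[1:]:
        # Drop thousands separators and currency signs before converting.
        record[name] = float(record[name].replace(",", "").replace("$", ""))
    return record

row = parse_row(raw_row)
print(row["Date"], row["Close"], row["Volume"])
```

Applying `parse_row` to every scraped row gives a list of records that can be loaded into a table for analysis.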

  • Close Price: The last price of the currency for that day.
  • High Price: The highest price of the currency for that day.
  • Low Price: The lowest price of the currency for that day.
  • Open Price: The first price of the currency for that day.
  • Volume: The amount of the currency traded during that day.
  • Market Capitalization (Market Cap): The current price of one coin multiplied by the total number of coins in circulation. Rises and falls in the price change the market capitalization, and therefore the currency's market value, but this does not by itself indicate an ITM (In the Money) or OTM (Out of the Money) position.
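The market cap definition is simple arithmetic, and a short sketch may make it concrete. The `market_cap` helper and the numbers are illustrative assumptions, not real market data:

```python
def market_cap(price, circulating_supply):
    """Market capitalization = price per coin * coins in circulation."""
    return price * circulating_supply

# Illustrative numbers only, not real market data.
price = 100.0          # price of one coin
supply = 72_000_000    # coins in circulation
cap = market_cap(price, supply)
print(f"Market cap: {cap:,.0f}")

# With the supply unchanged, a higher price means a proportionally higher market cap.
higher_cap = market_cap(110.0, supply)
print(f"Market cap after the price rise: {higher_cap:,.0f}")
```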

For background, Ethereum was proposed at the end of 2013 by Vitalik Buterin, a cryptocurrency researcher and programmer. The system went live on 30 July 2015, with 72 million coins minted.

According to the graph above, the volume of ETH has been increasing. The market cap of ETH peaked in 2018, decreased until 2019, and then started to increase again.

I took volume as the dependent variable for an OLS (Ordinary Least Squares) model, which gave an R² of 0.66 with the year, close price and market cap as features. After looking at the correlations between the variables, I added the low price to the same features and R² rose to 0.70. That’s better.
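The comparison above, fitting OLS on a set of features and then refitting with one more, can be sketched as follows. This is not the project's code: it uses NumPy's least-squares solver and synthetic stand-in data instead of the ETH dataset, and `fit_ols` and `r_squared` are hypothetical helper names:

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r_squared(X, y, coef):
    """R² = 1 - SS_res / SS_tot."""
    A = np.column_stack([np.ones(len(X)), X])
    residuals = y - A @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (NOT the ETH dataset).
rng = np.random.default_rng(0)
n = 500
base = rng.normal(size=(n, 3))    # stand-ins for year, close price, market cap
extra = rng.normal(size=(n, 1))   # stand-in for low price
y = base @ np.array([1.0, 0.5, -0.3]) + 0.8 * extra[:, 0] + rng.normal(size=n)

r2_base = r_squared(base, y, fit_ols(base, y))
X_full = np.hstack([base, extra])
r2_full = r_squared(X_full, y, fit_ols(X_full, y))
print(f"R² with 3 features: {r2_base:.2f}")
print(f"R² with 4 features: {r2_full:.2f}")
```

One caveat: R² measured on the same data used for fitting cannot go down when a feature is added, so a score on held-out data is a more reliable guide to whether the extra feature helps.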

Finally, I compared several types of model (Linear, Ridge, Polynomial, etc.) to decide on one. The R² results for each model are:

  • Linear Regression: 0.70
  • Ridge Regression: 0.66 (the data I used requires no regularization)
  • Degree 2 Polynomial Regression: 0.69
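To illustrate how ridge relates to plain linear regression, here is a rough sketch using the closed-form ridge solution in NumPy. It runs on synthetic stand-in data, not the ETH dataset, and the `fit_ridge` helper, the `alpha` values and the 300/100 split are assumptions made for illustration:

```python
import numpy as np

def fit_ridge(X, y, alpha):
    """Closed-form ridge on centered data; the intercept is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    weights = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    intercept = y_mean - x_mean @ weights
    return intercept, weights

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (NOT the ETH dataset).
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))
y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(size=400)
X_train, X_val = X[:300], X[300:]
y_train, y_val = y[:300], y[300:]

scores, norms = {}, {}
for alpha in (0.0, 1.0, 100.0):   # alpha = 0 is plain linear regression
    intercept, weights = fit_ridge(X_train, y_train, alpha)
    scores[alpha] = r2_score(y_val, intercept + X_val @ weights)
    norms[alpha] = np.linalg.norm(weights)
    print(f"alpha={alpha}: validation R² = {scores[alpha]:.3f}, "
          f"weight norm = {norms[alpha]:.3f}")
```

A larger `alpha` shrinks the weights toward zero. When the unregularized fit is not overfitting, that shrinkage tends to lower the score rather than raise it, which is one way to read a ridge result that falls below the linear one.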

I chose Linear Regression, which is the best model for this data. When I then applied it to the test set, which is 20% of all the data, I recorded an R² of 0.69. This is close to the earlier score and supports the choice of model.
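A hold-out check of this kind can be sketched as below. The data are synthetic stand-ins, the helper names are hypothetical, and the random shuffle is an assumption, since the post does not say how the 20% was selected:

```python
import numpy as np

def train_test_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out `test_fraction` of them for testing."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

def fit_linear(X, y):
    """Least-squares coefficients; coef[0] is the intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(X, coef):
    return coef[0] + X @ coef[1:]

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (NOT the ETH dataset).
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 4))
y = X @ np.array([1.5, -0.7, 0.4, 0.9]) + rng.normal(size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_fraction=0.2)
coef = fit_linear(X_train, y_train)          # fit on the 80% only
r2_train = r2_score(y_train, predict(X_train, coef))
r2_test = r2_score(y_test, predict(X_test, coef))
print(f"train R² = {r2_train:.2f}, test R² = {r2_test:.2f}")
```

A test score close to the training score suggests the model is not overfitting. For time-ordered data such as daily prices, holding out the most recent 20% instead of a random 20% is another option worth considering.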

To sum up, the linear model can explain some of the variance, but many other factors can affect the volume of ETH, such as technological progress, economic problems and political issues.

You can see the code for the project here.