Having recently joined 1010data as a new employee I wanted to understand the capabilities of the platform. To get started, I took some data for a test drive to learn the features of the Trillion Row Spreadsheet℠. With a variety of public datasets available, there was no shortage of Big Data to experiment with. Having come from the financial services industry, I immediately gravitated to a relatively small set of stock trading data going back to 1950. With minimal query writing effort and virtually no processing time I was able to put together a comprehensive picture of trading volumes by day over the course of the last 63 years! I was impressed by the 1010data user interface, as I felt like it practically wrote the query for me. The most rewarding part was using two of the 1010data proprietary “g-functions”, which made arranging each trading day across the 63 years and ranking every trading day by year even easier than the equivalent VLOOKUP in Excel.
With over 60 years of information and thousands of days I wanted to see if there was a pattern around the busiest trading day of the year. I thought it best to standardize the calendar on one page so that I could represent the regular 252 trading days per year. The following animation, created using Excel with 1010data output, describes the trading volume by day of year between 1950 and 2013 for each year specifically. As you view the animation, the darker the green, the higher the trading volume. The light gray rectangles represent non-applicable trading days for the given month. Any orange rectangle signifies the single highest day of trading volume for the given year.
There are just over 16,000 trading days between 1950 and 2013 in relatively small 1010data table of Yahoo.com equities (22 million rows of transaction details by ticker symbol and date). October quickly popped as the month that is typically busiest each year, and Wednesday has the most volume during any given week day. During an average month the 13th and 14th trading days also stuck out as the busiest days of the month. Like any analyst, I started to wonder, “Why is October so busy?”. I am confident that the busiest month should be March, August, or October, as they all have 31 days and minimal market holidays. Both March and August have 31 days with very few holidays (sometimes Good Friday occurs in March), but October still has a lot more trading volume even when adjusting for historic trading calendars. I did some more research on the topic and the best advice I received was from a fellow 1010data employee who suggested that it could be driven by mutual funds because most funds operate on a fiscal calendar that ends on October 31st. Perhaps those reading have some more ideas on why October consistently has so much trading volume?
Here is a sample of the code that I created in 1010data by just thinking through the steps to organize my desired output:
<note type="base">Applied to table: pub.fin.equity.equity</note>
<willbe name="MonthTrade" value="month(date)"/>
<willbe name="YearTrade" value="year(date)" format="type:nocommas;dec:0"/>
<sort col="date" dir="down"/>
<sort col="date" dir="up"/>
<tabu label="Tabulation on Yahoo Finance Equity Historical Prices" breaks="YearTrade,MonthTrade,date">
<tcol source="vol" fun="sum" label="Sum of`Trade`Volume"/>
<sort col="date" dir="up"/>
<willbe name="TradeDay" value="g_cumcnt(MonthTrade;;date)"/>
<willbe name="YearRank" value="g_rank(YearTrade;;;t0)"/>
If it went by too quick in the video, here's the summary trading volume by month of the year:
And, here is the distribution of trading volume by day of the week: