Perfectly Random machine learning and stuff

Set up a Spark cluster on AWS EMR

AWS provides an easy way to run a Spark cluster. Let’s use it to analyze the publicly available IRS 990 data from 2011 to present. This data is already available on S3, which makes it a good candidate for learning Spark. This Medium post describes the ... Read more

Always on top in macOS Sierra

Afloat is software that lets some Mac application windows stay on top of other windows even when they are not in focus. Hence the name Always on Top. This is a standard feature for all windows in Ubuntu, but on a Mac we need to use a third-p... Read more

Install the xml2 R package on macOS

Installing the xml2 R package often fails because of missing or incompatible libraries. In this post, I describe why this problem occurs and provide two solutions. What goes wrong? The xml2 R package depends on libxml2. When you i... Read more

Sublime-style multiple cursors in Jupyter

Jupyter Notebooks Jupyter Notebooks are great for visualizing and sharing results with others. Learning the keyboard shortcuts greatly improves productivity when using Jupyter Notebooks. By default, you can see the keyboard shortcuts hel... Read more

A new look using Trio

I decided to switch the theme of my Jekyll blog from Pixyll to Trio. Pixyll is a great theme and I learned a lot from using it. I decided it was time for a change. I wrote Trio as a hobby and to learn how Jekyll works in detail. The content of th... Read more

Time is TRUE, Female is FALSE

Introduction R has two reserved words denoting logical constants, namely TRUE and FALSE. These are case-sensitive literals, and you cannot create a variable or a function named TRUE or FALSE. > TRUE <- 1 Error in TRUE <- 1 : invalid (do_s... Read more