Big Data Analytics for Healthcare – Experience

One of the cute course slides by Prof. Jimeng Sun

I completed Big Data Analytics for Healthcare (CSE 6250) as part of Spring 2019 of my OMSCS program – my last semester! It was ranked as the hardest course in the program as per OMSCentral, a befitting finale to my 3.5 year long OMSCS journey.

Contrary to what some might think, I did not take the course because I am a masochist. Rather, the course seemed to cover a lot of ground that is not explored by any other course in the program. In particular, I was interested in learning more about:

  • Hadoop, Pig, Hive, MapReduce
  • Spark (RDDs, Spark SQL, MLlib, GraphX etc.)
  • CNNs, RNNs
  • Scala

For me, the last item was one of the major considerations. I consider myself a programming language geek and wouldn’t miss a chance to play around with a new language. I must say I was in for a treat: I got to write a significant amount of Scala. The sample code and templates provided for the projects made good use of some of the nice parts of Scala, such as type inference, case classes and pattern matching. All this, along with its functional nature and full-fledged REPL, made Scala a delight to work with.

The course was pretty well designed overall. The well maintained self-paced labs available for almost all topics were really helpful in getting up to speed. Check out their Scala lab to get a feel – link. The course videos were well planned. The TAs were pretty responsive in Piazza. All the good stuff.

The final project was the highlight of the course. It was a group project, worth 40% of the overall grade. Teams were allowed to pick from a variety of interesting topics. The project milestones were structured much like pitching a data-science project in a corporate setting (say, to your CEO): propose it, execute it, and then produce a report and presentation summarizing the work.

I was fortunate to get a really good team of four. We chose the domain NLP for Healthcare. After exploring a variety of suggested ideas in the domain, we settled on the following topic: Hierarchical Ensembles of Heterogeneous Models for Prediction of Medical Codes from Clinical Text. We worked well together and did a good amount of research and implementation. We were able to try out various ensemble ML models combining SVMs, DNNs etc. as part of the project.

While there were homeworks to keep you busy on most weekends, I felt that the notoriety of the course in terms of difficulty was unfounded. If you go in with an open mind, excited about a fast paced journey across the various big-data tools/technologies being used for data-science in the industry, you won’t be disappointed.

  • Difficulty: 4/5
  • Rating: 5/5

In related news, I got my Georgia Tech Masters degree in my email a couple of weeks back! It was truly a moment of happiness. I still need to figure out what to do with all the extra time now that the course is over 🙂

My Georgia Tech Masters degree 🙂

Graduate Algorithms – Experience

I completed the Graduate Algorithms course (CS 8803) as part of my OMSCS program during Fall 2018. It was an interesting and really informative course.

The main topics covered in the course include dynamic programming; divide and conquer, including FFT; randomized algorithms, including RSA cryptosystem and hashing using Bloom filters;  graph algorithms; max-flow algorithms; linear programming; and NP-completeness.

Course website

Of these, I had not studied FFTs and max-flow algorithms during my undergrad. Also, though I had studied basic DP (dynamic programming) in my undergrad (knapsack, matrix chain multiplication etc.) and had prepared some more for tech interviews, I had never had rigorous formal training in DP before. What I really liked was that the course had enough coursework (homeworks and exams) focused on building that DP intuition.
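
To give a flavour of the kind of DP intuition meant here, below is a minimal Python sketch of the classic 0/1 knapsack problem. It is only an illustration, not something taken from the course material; the function name and the sample numbers are made up, and it assumes integer weights.

```python
def knapsack(values, weights, capacity):
    """Return the best total value for a 0/1 knapsack with integer weights."""
    # best[c] = best value achievable with total weight at most c
    best = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        # Walk capacities downwards so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


# Hypothetical items: (value, weight) = (60, 10), (100, 20), (120, 30)
print(knapsack([60, 100, 120], [10, 20, 30], 50))  # prints 220
```

The whole trick is in defining the subproblem (`best[c]`) and the order in which the table is filled, which is the part the homeworks drilled.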

The course text was Algorithms by Dasgupta, Papadimitriou and Vazirani. It is a really good, concise introduction to most advanced algorithm topics. It is especially good as a college textbook because you can realistically expect students to read it from cover to cover, unlike, say, Introduction to Algorithms by Cormen et al., which is better suited as a long-term reference manual. My one gripe with the course was that it did not cover the last chapter from Dasgupta, on Quantum Computing.

The course grading breakdown was as follows:

  • Homeworks: 5%
  • Project: 10%
  • Midterms: (two) 25% each
  • Final exam: 35%

The midterms and final exam were closed-book 3-hour exams. The bulk of the grade (85%) came from these. There were 8 homeworks, spaced roughly once a week, which added up to just 5%. These usually involved 4-5 questions which required you to write your solution in pseudo-code (only in the case of DP) or in plain English. One might think that doing the homeworks is just not worth the effort. However, I cannot overemphasize how important and helpful they were for the exams.

Another thing I am ever grateful to the homeworks for is introducing me to LaTeX. Until then, though I had heard that LaTeX is pretty good for writing technical documents, I never really had a good use-case or forcing function to make me learn it. The HWs usually involved formulas, big-O complexities, pseudo-code, matrices etc. This gave me a really good opportunity to learn LaTeX. It’s amazing!

While we are on the topic of LaTeX, if it is something that interests you, may I suggest the Emacs plugin for LaTeX? I used the Spacemacs latex layer. Among other things, it lets you preview the rendered doc in Emacs itself (SPC m p p)!

The TAs and the Professor were prompt in answering queries on Piazza and during office hours. The grading turn-around times were also really fast. Overall, I really enjoyed the course. My ratings:

  • Difficulty: 4/5
  • Rating: 4.5/5

Machine Learning for Trading – Experience

I completed the Machine Learning for Trading (CS 7646-O01) course during the Summer of 2018. This was a fun and light course.

The course was divided into 3 mini-courses:

The first part of the course was mainly about getting familiar with Numpy and Pandas. The key take-away was how small differences (optimizations) in how you handle large arrays using Numpy can lead to substantial performance gains.
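
As a rough illustration of that take-away, the sketch below computes daily returns twice: once with an explicit Python loop and once as a single vectorized NumPy expression. The function names and the synthetic price series are made up for this example, and the timings will vary by machine, so treat the printed numbers as indicative only.

```python
import timeit

import numpy as np


def daily_returns_loop(prices):
    """Daily returns computed one element at a time in Python."""
    returns = np.zeros(len(prices))
    for i in range(1, len(prices)):
        returns[i] = prices[i] / prices[i - 1] - 1.0
    return returns


def daily_returns_vectorized(prices):
    """The same computation as one array expression."""
    returns = np.zeros(len(prices))
    returns[1:] = prices[1:] / prices[:-1] - 1.0
    return returns


# Synthetic, strictly positive price series (made-up data, not market data).
rng = np.random.default_rng(0)
prices = 100.0 * np.cumprod(1.0 + rng.normal(0.0, 0.01, 100_000))

loop_s = timeit.timeit(lambda: daily_returns_loop(prices), number=3)
vec_s = timeit.timeit(lambda: daily_returns_vectorized(prices), number=3)
print(f"loop: {loop_s:.4f}s  vectorized: {vec_s:.4f}s")
```

Both functions return the same array; the vectorized one simply pushes the loop down into NumPy's compiled code.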

For me, the second part was the most enlightening. It introduced the various concepts of the stock market. This was something I had always wanted to learn, and in fact I had been picking it up on an ad-hoc, need-to-know basis from Wikipedia, Investopedia etc. However, I had always wanted a more formal and comprehensive introduction to these concepts. Some of the ones discussed include:

  1. Going long or short with shares
  2. Capital Asset Pricing Model (CAPM)
  3. Efficient Markets Hypothesis
  4. The Fundamental Law of active portfolio management
  5. Arbitrage Pricing Theory
  6. Technical analysis (Bollinger Bands, Sharpe ratio etc.)
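
Two of the items above are easy to make concrete. The sketch below computes an annualized Sharpe ratio and Bollinger Bands from a list of daily prices using only the standard library. It is an illustration under stated assumptions rather than the course's code: the function names are made up, the sample standard deviation is used, and the 252 trading days, 20-day window and 2-standard-deviation band width are conventional defaults chosen here.

```python
import math
import statistics


def daily_returns(prices):
    """Fractional change from each day to the next."""
    return [today / yesterday - 1.0 for yesterday, today in zip(prices, prices[1:])]


def sharpe_ratio(prices, daily_risk_free=0.0, trading_days=252):
    """Annualized Sharpe ratio: mean excess daily return over its std deviation."""
    excess = [r - daily_risk_free for r in daily_returns(prices)]
    return math.sqrt(trading_days) * statistics.mean(excess) / statistics.stdev(excess)


def bollinger_bands(prices, window=20, width=2.0):
    """Return (lower, middle, upper) for every day that has a full window."""
    bands = []
    for end in range(window, len(prices) + 1):
        recent = prices[end - window:end]
        middle = statistics.mean(recent)
        spread = width * statistics.stdev(recent)
        bands.append((middle - spread, middle, middle + spread))
    return bands


# Made-up prices, just to show the call shapes.
sample = [100.0, 101.0, 103.0, 102.0, 106.0]
print(sharpe_ratio(sample))
print(bollinger_bands(sample, window=3))
```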

The third part was about using different ML techniques (supervised learning – regression, decision trees; reinforcement learning – Q-learning etc.) to predict future stock prices based on past data. I was familiar with most of these techniques from previous AI/ML courses I had taken. However, using them with time-series data in this setup was something new.

There were five (4 proper and 1 intro) projects in this course and 2 closed-book exams. The professor also recommended some good/interesting external resources for students to understand the stock market better. The movie The Big Short was even part of the syllabus. The course was well-paced and well organized. My ratings:

  • Difficulty: 2.5/5
  • Rating: 4/5

Intro to High Performance Computing – Experience

This Spring, I completed the Intro to High Performance Computing course (CSE 6220) as part of OMSCS. It is one of the hardest (4.5/5) and also highest rated (4.8/5) courses in the program as per OMS Central. Based on my experience, I concur with both ratings.

At a high level, the course covers the algorithmic aspects of maximizing the performance of your code. This includes things like parallelizing your code across all processors or across multiple machines, exploiting the memory hierarchy to your advantage etc. The other ‘high performance’ course in the program – High Performance Computer Architectures (CS 6290), in contrast, discusses maximizing performance more at a processor architecture level.

Prof. Vuduc requires special mention. He has put a lot of effort in making the course videos easy-to-understand and interesting. His hilarious antics make you laugh even while discussing the most complex topics. He is also very active in Piazza and participates in the office hours regularly.

There were 5 hands-on projects in total, all in C/C++, with one due every two weeks. These were really the time-sinks. Interestingly, these were also the most fun part of the course in my experience. These involved implementing the algorithms taught in the lectures, making everything you learn more ‘real’.

Key concepts

At a broad level, these were the key concepts I learned from the course:

  1. Shared memory model (aka dynamic multithreading model)
    1. Concepts of work and span (link) in analyzing parallel programs.
    2. Introduction to OpenMP library.
  2. Distributed memory models
    1. Parallel computing across network using message passing.
    2. The Alpha-Beta model (aka latency & inverse-bandwidth model) for analyzing distributed parallel programs.
    3. Introduction to OpenMPI library.
  3. Two level memory model
    1. I/O aware algorithms that can exploit the cache and main memory structures.
    2. Cache oblivious algorithms that still achieve optimal performance without being aware of the cache/memory structures.

The Alpha-Beta model for measuring the cost of distributed computing
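
As a small worked example of the Alpha-Beta model, the sketch below charges each message a fixed latency alpha plus a per-word cost beta, and uses that to estimate a tree-style broadcast in which the number of processes holding the message doubles every round. The function names and the sample alpha/beta values are hypothetical, chosen only for illustration.

```python
import math


def message_time(n_words, alpha, beta):
    """Time to send one message of n_words: fixed latency plus per-word cost."""
    return alpha + beta * n_words


def tree_broadcast_time(n_words, n_procs, alpha, beta):
    """Broadcast where holders double each round: ceil(log2 P) full-size sends."""
    rounds = math.ceil(math.log2(n_procs)) if n_procs > 1 else 0
    return rounds * message_time(n_words, alpha, beta)


# Hypothetical network: 1 microsecond latency, 1 nanosecond per word.
alpha, beta = 1e-6, 1e-9
print(message_time(1_000_000, alpha, beta))
print(tree_broadcast_time(1_000_000, 64, alpha, beta))
```

For small messages the alpha term dominates, so fewer, larger messages win; for large messages the beta term dominates and the total volume sent is what matters.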

For a more detailed overview, I would recommend going over the Course Syllabus (PDF).

Overall, I really enjoyed the course and am happy that I decided to take it though it was not directly related to my specialization (Interactive Intelligence).

Artificial Intelligence – Experience

I recently completed the Artificial Intelligence course (CS 6601) as part of OMSCS Fall 2017. The course gives a good overview of the different key areas within AI. Having taken Knowledge Based AI (CS 7637), AI for Robotics (CS 8803-001), Machine Learning (CS 7641) and Reinforcement Learning (CS 8803-003) before, I must say that the AI course syllabus had significant overlap with these courses in many areas (which is expected). However, I felt the course was still worthwhile since Prof. Thad taught these topics from his own perspective, which made me look at them in a different light. Prof. Thad also tried his best to make the course content interesting and humorous, which I really appreciated.

Course Outline

  1. Game Playing – Iterative Deepening, Minimax trees, Alpha Beta Pruning etc.
  2. Search – Uniform Cost Search, Bidirectional UCS, A*, Bidirectional A* etc.
  3. Simulated Annealing – Hill Climbing, Random restarts, Simulated Annealing, Genetic Algorithms etc.
  4. Constraint Satisfaction – Node, Arc and Path consistency, Backtracking, forward checking etc.
  5. Probability – Bayes Rule, Bayes Nets basics, Dependence etc.
  6. Bayes Nets – Conditional Independence, Confounding cause, Explaining Away, D Separation, Gibbs Sampling, Monty Hall Problem etc.
  7. Machine Learning – kNN, Expectation Maximization, Decision Trees, Random forests, Boosting, Neural nets etc.
  8. Pattern Recognition through Time – Dynamic Time Warping, Sakoe Chiba bounds, Hidden Markov Models, Viterbi Trellis etc.
  9. Logic and Planning – Propositional Logic, Classic planning, Situation Calculus etc.
  10. Planning under Uncertainty – Markov Decision Processes (MDPs), Value iteration, Policy iteration, POMDPs etc.

The course used the classic AI textbook – Artificial Intelligence: A Modern Approach (3rd Edition) by Stuart Russell and Peter Norvig. Some chapters (such as Logic and Planning) were taught by Peter Norvig himself, whereas a few others were taught by Sebastian Thrun. There is no arguing that the course was taught by the industry's best.

The iconic cover of Artificial Intelligence: A Modern Approach

There were 6 assignments (roughly one every other week), which required a proper understanding of the course material and a decent amount of coding (in Python). There was also an open-book midterm and final exam. Even though these were open book, they involved a significant amount of work (researching and rereading the text, on-paper calculations etc.). Overall, completing these forces one to really understand the concepts, which I really liked.

Summary Stats

  1. Average time spent per week – approx. 20 hours (including whole weekends on assignment due weeks)
  2. Difficulty (out of 5) – 4.25 (which is what I would rate ML too, and these two would top my list)
  3. Rating – 4/5

Introduction to Information Security – Experience

I did the Introduction to Information Security course (CS6035) as part of the OMSCS Summer 2017 semester.

The course was a good overview of various aspects of Information Security. It broadly covered topics like system security, network security, web security, cryptography, different types of malware etc. The course was lighter in terms of work load compared to the other subjects I’ve taken so far. I really liked the projects which were thoughtfully designed to give the students hands-on experience in each of these topics.

The four projects that we had to do were:

  1.  Exploiting a buffer overflow in a given piece of vulnerable code. This required brushing up on C basics, understanding how process memory allocation works internally and some playing around with gdb.
  2.  Analyzing provided malware samples using Cuckoo, an automatic malware analyzer and sandbox, to identify behaviors such as registry updates, keyboard and mouse sniffing, remote access, privilege escalation etc.
  3. Understanding and implementing the RSA algorithm in Python, identifying the weakness in using short keys (64 bit) and decrypting an RSA-encrypted message by exploiting this weakness.
  4. Exploiting vulnerabilities in a target (sample) website using Cross-Site Request Forgery (XSRF), Cross Site Scripting (XSS) and SQL injection.
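
To show why short RSA keys are weak (the third project above), here is a toy Python sketch: if the modulus is small enough to factor, the private exponent falls out immediately. This is not the project's solution. The helper names are made up, the key is the usual textbook-sized example (n = 61 × 53), and trial division is used only because the numbers are tiny; a 64-bit modulus would call for a faster factoring method.

```python
def trial_division(n):
    """Return a factor pair (p, q) of n; only feasible for small n."""
    if n % 2 == 0:
        return 2, n // 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 2
    raise ValueError("n is prime")


def recover_private_exponent(n, e):
    """Factor the public modulus, then invert e modulo phi(n)."""
    p, q = trial_division(n)
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse, needs Python 3.8+


# Toy public key and a message encrypted with it.
n, e = 3233, 17
message = 65
ciphertext = pow(message, e, n)

# The "attacker" only sees (n, e) and the ciphertext.
d = recover_private_exponent(n, e)
print(d, pow(ciphertext, d, n))  # prints 2753 65
```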

Apart from the projects, there were 10 quizzes to be completed, one per week throughout the course. The vulnerabilities discussed in the course are fairly easy to introduce into a codebase if you are not aware of them. Unfortunately, they are still pretty common, many years after they were first discovered.

Hence, no matter the type of software development one is into (mobile, web, DB, relatively low level languages like C, embedded device programming, bare metal etc.), these exploits and their counter-measures are a must-know.

 

Reinforcement Learning – Experience

I completed the Reinforcement Learning course (link) as part of the OMSCS Spring 2017 semester. It was one of the most rewarding courses I have taken as part of the program to date.

The course was taught by professors Charles Isbell and Michael Littman, the same professors who had taught the Machine Learning course previously (blog link). The course was really challenging considering the closely packed, research-oriented homeworks and projects as well as the math/theory-heavy course material. We had

  • 6 home works which involved implementing different RL algorithms to solve given problems
  • 3 projects out of which two were the reproduction of experiment results from prominent RL research papers and one was solving an RL problem using OpenAI Gym

Summarizing my key learnings from RL below:

  • Reinforcement learning helps you train an AI agent to maximise some form of reward without prior understanding of the environment, i.e. model-free.
  • E.g.: Pacman. Here the agent (or player) can roam around the space using possible actions (left, right, up, down). When it consumes one of the small orbs, it gets points (+ve reward). When it eats the big orbs and then eats the enemy players, it gets more points. However, if it’s eaten by one of the enemy players, it loses a life (-ve reward). If you let an RL agent play Pacman for some time, it will start out playing randomly, but eventually figure out the rules of the game and can potentially play better than a human player. All this without us injecting any domain knowledge (rules of the game, winning strategies etc.) beforehand! (crazy right?)
  • Most RL research assumes all processes can be represented using MDPs (Markov Decision Processes). These are processes where the entire past can be represented using the current state of the agent.
  • Learned about different RL algorithms such as:
    • Value Iteration
    • Policy Iteration
    • Q-learning
    • TD-Lambda etc.
  • Generalization using function approximation – This seemed to me to be one of the most promising sections of RL. It can effectively take RL outside the confines of Grid world and into the big and continuous state spaces of the real world.
    • For one of our projects, we used DQN (Deep Q-Networks), one of the latest efforts in generalization using deep neural networks, published by DeepMind – a Google company.
  • Reward Shaping – a mechanism to accelerate the learnings of the agents and help them get to their goals faster.
  • POMDPs (Partially Observable MDPs) – These are closer to the processes which we see in real life. We don’t get to know fully which state we are in. We have to work with a set of ‘belief states’, or probability distributions over the possible states we might be in.
  • Game Theory – I found this to be the most fun part of the course. It deals with stochastic games where multiple agents try to maximise their collective/competing rewards. This is again closer to the situations which we face in real-life. Topics include:
    • Prisoners Dilemma
    • Nash Equilibrium
    • Folk theorem and sub-game perfect equilibrium
    • Tit-for-Tat, Grim trigger, Pavlov etc. game strategies
    • Coordinated equilibria, using side payments (Coco-Q) etc.
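
To make the model-free idea and the Q-learning bullet above concrete, here is a minimal tabular Q-learning sketch in Python. It is an illustration, not code from the course homeworks: the five-cell corridor environment, the constants and the function names are all made up. The agent is never told the rules; it only sees states and rewards, and still learns that walking right reaches the goal.

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)    # index 0 = move left, index 1 = move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration


def step(state, action):
    """Environment dynamics: walls at both ends, reward 1 on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy choice, with random tie-breaking.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning update: bootstrap from the best action in the next state.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
    return q


q = train()
policy = ["right" if q[s][1] > q[s][0] else "left" for s in range(N_STATES - 1)]
print(policy)  # prints ['right', 'right', 'right', 'right']
```

The table `q` stands in for the value function; function approximation (as in DQN) replaces that table with a learned model when the state space is too big to enumerate.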

The course content was a bit too theoretical in some chapters (e.g.: AAA – Advanced Algorithm Analysis). I found the lectures by David Silver of DeepMind to be a good supplementary course for building the required intuition – link.

One of the really exciting moments in this course was when Prof. Richard Sutton, considered by many as the father of Reinforcement Learning, and the author of the primary textbook for RL (of our course and elsewhere) ‘Reinforcement Learning: An Introduction’ (second edition draft available from author’s website – link) appeared for one of our office hours as a special guest.

Prof. Richard Sutton along with our TAs during an office hour

I found all the TAs for this course really knowledgeable and helpful. All the office hours were really useful and fun-filled at the same time. One of our TAs, Miguel Morales, has been featured on the OMSCS website recently – link.

In conclusion, this course has been one helluva ride that I enjoyed throughout! 🙂