Machine Learning: The Journey from “hello, world” to Self-Sufficient Computers

Coding Club, IIT Guwahati
May 18, 2020

Hey there, machine learning enthusiast! Ready to start your long, excruciating, yet wonderful journey into the world of Machine Learning? Let me guide you through it! Allow me to introduce myself: I’m your trip planner for the day, and I’ll be assisting you on this journey so that you don’t wander off alone in the darkness.

So I hope you have your bags ready to pack and an open mind to begin with. To start off, you’ll need some PREREQUISITES: basic Python knowledge, inferential statistics, some Linear Algebra and, of course, a clear and receptive mind :)

Without further wait, let’s begin the trip by PACKING OUR BAGS.

NumPy and Pandas: the most important tools to pack

First, we’ll put NumPy and Pandas in our bags. They’re like cash in hand: without them you cannot start your journey! In the real world, a large amount of data exists, but in raw form (missing values, categorical variables, etc.). It must be processed before it is ready to use. NumPy and Pandas help with this Data Preprocessing!

Pandas is used to load data from many kinds of sources (CSV files, spreadsheets, databases, and so on), explore the dataset, and create new features, while NumPy is used for performing mathematical operations on the dataset. NumPy also lets us work with multi-dimensional arrays and apply complex mathematical functions to our data.

These are indispensable tools for beginning the exciting journey of Machine Learning!
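
To make the packing a little more concrete, here is a minimal sketch of what preprocessing with these two libraries might look like. The tiny table, its column names (age, city, income) and the choice to fill the gap with the mean are all made up for illustration; your real dataset and cleaning strategy will differ.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: one missing age and a categorical "city" column.
raw = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, 41.0],
    "city": ["Delhi", "Guwahati", "Delhi", "Mumbai"],
    "income": [30000.0, 42000.0, 58000.0, 61000.0],
})

clean = raw.copy()

# Pandas: fill the missing age with the mean of the known ages.
clean["age"] = clean["age"].fillna(clean["age"].mean())

# Pandas: turn the categorical column into one 0/1 column per city.
clean = pd.get_dummies(clean, columns=["city"])

# NumPy: apply a mathematical function to create a new feature.
clean["log_income"] = np.log(clean["income"])

# NumPy: a plain 2-D array of floats, ready for further maths.
features = clean.astype(float).to_numpy()
print(clean)
print(features.shape)
```

Filling with the mean is only one of several reasonable options; dropping the row or using the median are equally common, and which one suits you depends on the data.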

Packed your bags already? Now you’ll surely want to CAPTURE the picturesque moments you encounter on the expedition.

Data Visualisation: The perfect bridge between the eyes and the math

I’ve got the perfect tools for you here: Matplotlib and Seaborn, flexible Python libraries built for Data Visualization. They give you valuable insights into the data. As you work with complex datasets, you’ll see why visualizing trends and correlations in the data forms an integral part of predictive modeling.
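
As a quick snapshot of the idea, here is a small sketch that plots a trend with Matplotlib and backs it up with a correlation number from NumPy. The data (study hours vs. marks) is randomly generated and purely hypothetical, and the output file name is just an example. Seaborn can draw similar plots with less code, but this sketch sticks to Matplotlib alone.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so this also runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: marks rise with study hours, plus some random noise.
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, size=50)
marks = 5 * hours + rng.normal(0, 5, size=50)

# A scatter plot makes the upward trend visible at a glance.
fig, ax = plt.subplots()
ax.scatter(hours, marks)
ax.set_xlabel("Hours studied")
ax.set_ylabel("Marks")
ax.set_title("Hours studied vs. marks (made-up data)")
fig.savefig("hours_vs_marks.png")

# The correlation coefficient puts a number on what the picture shows.
corr = np.corrcoef(hours, marks)[0, 1]
print(f"correlation: {corr:.2f}")
```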

But wait a second, the world outside is just too big! There are many beautiful and scenic locations you’d want to travel to. You’ve packed your bags, hired a guide but you haven’t decided yet where you want to travel in this world!

I’ve got a solution for this too: you can collect data from various sources, such as websites, using WEB SCRAPING with packages like BeautifulSoup, Selenium, Scrapy, and many more.
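
Here is a rough sketch of what scraping with BeautifulSoup can look like. To keep it self-contained, it parses a small made-up HTML snippet instead of downloading a real page; the place names, links and the "place" class are hypothetical. On a real site you would first fetch the page and check that the site allows scraping.

```python
from bs4 import BeautifulSoup

# A made-up page, standing in for HTML you would normally download.
html = """
<html><body>
  <h1>Places to visit</h1>
  <ul>
    <li class="place"><a href="/kaziranga">Kaziranga</a></li>
    <li class="place"><a href="/majuli">Majuli</a></li>
  </ul>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Collect the name and link of every list item marked as a place.
places = []
for item in soup.find_all("li", class_="place"):
    link = item.find("a")
    places.append((link.get_text(strip=True), link["href"]))

print(places)
```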

Now that you’ve reached your destination, captured perfect moments, and delved really deep into the beautiful locations, the journey sure must have been very hectic but totally worth it! But it’ll be truly fruitful only if you return safely back to your home.

The data held by companies can be huge, running up to hundreds of millions of data points. There are numerous Machine Learning models out there, and you need to be very careful about which one to apply. To give you a hint of the uncertainty, some models can take a very long time to train on data of this size, and even then you may be unsure of the result! So you need to know the algorithmic “math” behind the models. That knowledge will surely help you apply good models to your dataset, but this is an art best mastered through practice.

The scikit-learn library lets you apply predictive models in the simplest way possible. It offers a whole slew of tools for regression, classification, model selection, pre-processing, clustering, and much more.

Now, as your personal tour planner, I’d like to collect reviews of your trip so that I can provide a better experience next time.
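
To give a feel for how little code this takes, here is a minimal sketch that trains a classifier on the small Iris dataset that ships with scikit-learn. The choice of a k-nearest-neighbours model, the 25% test split and the value n_neighbors=5 are assumptions made for illustration, not recommendations for your data.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small built-in dataset: flower measurements and their species.
X, y = load_iris(return_X_y=True)

# Hold back a quarter of the data to check the model on unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Fit the model on the training part, then predict the held-back part.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

print("test accuracy:", model.score(X_test, y_test))
```

Most scikit-learn models follow this same fit-then-predict pattern, so swapping in a different model usually means changing only the import and the line that creates it.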

Evaluation Metrics: A way of measuring the accuracy of the models

Similarly, there are many EVALUATION METRICS to determine how good your applied model is! These metrics judge the model on different criteria, so you can always find scope to improve it.
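
Here is a small sketch of a few common classification metrics from scikit-learn. The true labels and predictions below are made up by hand so that the numbers are easy to check: the model gets 6 of the 8 answers right.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical labels: 1 = "liked the trip", 0 = "did not like it".
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Accuracy: the share of all predictions that were correct.
accuracy = accuracy_score(y_true, y_pred)

# Precision: of everything predicted as 1, how much really was 1.
precision = precision_score(y_true, y_pred)

# Recall: of everything that really was 1, how much the model found.
recall = recall_score(y_true, y_pred)

# F1: a single score that balances precision and recall.
f1 = f1_score(y_true, y_pred)

# Rows are the true classes, columns are the predicted classes.
matrix = confusion_matrix(y_true, y_pred)

print(accuracy, precision, recall, f1)
print(matrix)
```

Which metric matters most depends on the problem; accuracy alone can be misleading when one class is much rarer than the other.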

Congratulations on your first solo trip!

Planning another one already? Maybe this time you’d want to include your friends too.

Once you master the basic syntax and explore the basics of the libraries, you can begin working on your own projects or explore open-source projects made by other people. An excellent place for this is GitHub. It simplifies the process of working with other people and makes it easy to collaborate on projects.

After many routine trips, you probably want something more exciting and advanced like a trek, right?

For that, you’ll have to jump into the world of Deep Learning and Neural Networks. Deep learning is a branch of Machine Learning that attempts to model high-level abstractions in the data through the use of neural networks (loosely inspired by how our brain works!). You can build image recognition systems, sound alteration tools, chatbots, and all the cool stuff you’ve been thinking about since your childhood!

I can guide you through all that too, but I think you are pretty exhausted from your enriching trip into the unknown.

Some useful resources I’ve compiled for your journey:

1. Python basics:

https://www.coursera.org/learn/python-for-applied-data-science?specialization=ibm-data-science-professional-certificate

2. Probability and Inferential Statistics:

https://www.coursera.org/learn/probability-intro

https://www.coursera.org/learn/inferential-statistics-intro

3. Pandas (30 videos):

https://www.youtube.com/watch?v=yzIMircGU5I&list=PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y&index=1

4. NumPy:

https://www.w3schools.com/python/numpy_intro.asp

5. Matplotlib and Seaborn:

https://www.youtube.com/watch?v=yZTBMMdPOww

https://www.youtube.com/watch?v=GcXcSZ0gQps

https://caciitg.in/projects/data_visualization.html

6. Web Scraping:

https://youtu.be/ng2o98k983k

I’d also recommend going through the Kaggle micro-courses on the above topics, as they let you start from something practical and get a better view of the whole picture.

Happy exploring and hoping that you liked all that I’d planned out for you! Until next time, Ciao!

An article by Kartik Bansal and Maneshwar Singh. (Coding Club, IIT Guwahati)
