Image for post
Photo by Markus Spiske on Unsplash

As you may remember, in Model Deployment with Flask/Part-1 we covered the definitions of a Machine Learning pipeline and model deployment, then created the files needed for model deployment.

In this story, we will take the steps needed to make our model run on the web. First, we will check whether our model works properly, then upload the necessary files to GitHub, and finally create an app on Heroku.

Step-1: Testing our model locally

Before uploading the files to GitHub and then to Heroku, it is better to check our model locally. We can do this by running the python command.

Anaconda Prompt to…

A Machine Learning (ML) pipeline can be defined as a machine learning workflow divided into reusable parts that together form an end-to-end design. Although the number of steps may vary, a Machine Learning pipeline usually consists of six steps that require continuous monitoring.
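To make the idea of "reusable parts" concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the three step functions (`clean`, `add_centered_feature`, `fit_slope`), the `run_pipeline` helper, and the toy data are all hypothetical and are not the six steps of the pipeline described in this story.

```python
from functools import reduce
from statistics import mean


def clean(rows):
    """Hypothetical step: drop records whose feature value is missing."""
    return [r for r in rows if r["x"] is not None]


def add_centered_feature(rows):
    """Hypothetical step: add x minus the mean of x as a new feature."""
    avg = mean(r["x"] for r in rows)
    return [{**r, "x_centered": r["x"] - avg} for r in rows]


def fit_slope(rows):
    """Hypothetical step: fit a one-parameter model (least-squares slope)."""
    num = sum(r["x_centered"] * r["y"] for r in rows)
    den = sum(r["x_centered"] ** 2 for r in rows)
    return {"slope": num / den}


def run_pipeline(data, steps):
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda acc, step: step(acc), steps, data)


# Toy data (assumed for illustration only); one record has a missing value.
raw = [
    {"x": 1.0, "y": 2.0},
    {"x": None, "y": 5.0},
    {"x": 2.0, "y": 4.0},
    {"x": 3.0, "y": 6.0},
]

model = run_pipeline(raw, [clean, add_centered_feature, fit_slope])
print(model)  # {'slope': 2.0}
```

Because each step is an ordinary function that takes data in and passes data out, a step can be swapped, reordered, or reused in another workflow without touching the others.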

Image for post

As seen from the picture above, the final step, after building a machine learning model, is to make the model available in production; in other words, model deployment. Model deployment means making your trained ML model available to end-users or other systems.

There are different methods for model deployment; however, this two-part story will cover only model deployment using…

Image for post

There is a famous English adage, “A picture is worth a thousand words”, meaning that a single image can sometimes explain complex or multiple ideas far better than words. In other words, data visualization is a great medium for communicating with our audience.

Data visualization is useful not only for presenting the insights gained but also for exploring the data at hand. Understanding the structure of the data, detecting outliers, identifying trends and clusters, choosing a model to apply, evaluating the model output, and finally presenting results can all be done effectively via data visualization. …

Dealing with Missing Data: Practical Imputation Methods

Image for post

According to various sources, data scientists spend 80% of their time cleaning data rather than creating insights. Here are some reports that support this claim.

According to the CrowdFlower 2015 report, 66.7% of the participants stated that cleaning and organizing data is one of their most time-consuming tasks, although the report gives no figure for the time actually spent on it.

In the 2016 report from the same organization, the answer to the question “What data scientists spend the most time doing?” was data cleaning, which ranked first at 60%; in 2017 the figure was 51%.

Mehmet Simsek

MBA/Data Scientist
