Machine Learning - Decision Trees and Random Forests for Classification and Regression - Python Example using Real Data
I've written the following code as an example of applying Decision Trees and Random Forests to classification and regression in Python, using a downloadable CSV with real data about US housing prices and property characteristics.
The Use Case
You want to predict the price of a property using two supervised machine learning algorithms: Decision Trees and Random Forests. The principle is simple: you have a set of historical data, which you split into training data and test data. Each algorithm learns from the training data, and its accuracy is then measured on the test data to see which one performs better on this sample.
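The split-train-compare idea can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the housing CSV: the `area` and `rooms` features, the 80/20 split and the use of R² as the score are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the housing data (assumption): price depends
# on area and number of rooms, plus some random noise.
rng = np.random.default_rng(0)
area = rng.uniform(50, 300, size=200)
rooms = rng.integers(1, 6, size=200)
price = 1000 * area + 5000 * rooms + rng.normal(0, 10000, size=200)
X = np.column_stack([area, rooms])

# Hold out 20% of the rows as test data; the rest is training data.
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=0
)

models = {
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model on the training rows and score it on the unseen test rows.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # R^2 for regressors

print(scores)
```

Whichever model scores higher on the test rows is the better fit for this particular sample; a different dataset or split can change the ranking.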
CSV File
The file can be downloaded from my repository here:
https://github.com/Markuspg1/machine-learning1
Loading CSV File Function
Here's a Python function that will help you load any CSV file going forward.
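A minimal sketch of such a loader, assuming pandas is installed. The function name `load_csv` and its error handling are illustrative choices, not necessarily what the repository contains.

```python
from pathlib import Path

import pandas as pd


def load_csv(path, **read_csv_kwargs):
    """Read a CSV file into a DataFrame, failing early with a clear message."""
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"CSV file not found: {csv_path}")
    # Extra keyword arguments (sep, encoding, ...) are passed to pandas.
    df = pd.read_csv(csv_path, **read_csv_kwargs)
    if df.empty:
        raise ValueError(f"CSV file has no data rows: {csv_path}")
    return df
```

Checking the path before reading gives a clearer error than a traceback from deep inside the parser, which helps when the same function is reused across projects.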
You can test the function with the following code; we use it now to load the CSV into our df variable.
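A quick way to confirm the load worked is to check the row count, the column names and the number of missing values. The sketch below assumes df is a pandas DataFrame; the helper name `summarize_frame`, the sample columns and the commented-out file name are hypothetical.

```python
import pandas as pd


def summarize_frame(df):
    """Return basic facts that show whether a load worked as expected."""
    return {
        "rows": len(df),
        "columns": list(df.columns),
        "missing_values": int(df.isna().sum().sum()),
    }


# With the real file this would be something like:
# df = load_csv("housing.csv")  # hypothetical file name
# Here a tiny hand-made frame stands in for the loaded data.
df = pd.DataFrame(
    {"price": [100000, 250000, None], "sqft": [900, 1800, 1200]}
)

summary = summarize_frame(df)
print(summary)
print(df.head())  # first rows, to eyeball the values
```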
Data Analysis
Once you've loaded the CSV file and confirmed the load was successful, we continue by analyzing the data: we find the columns that correlate most strongly with the result (the price) and use those columns for training. Using all columns is also an option, but the algorithms tend to be more accurate when the training columns correlate strongly with the target, because that gives them more useful signal to base their decisions on while learning.
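One way to rank columns by their correlation with the target is sketched below. It assumes pandas and uses the absolute Pearson correlation; the helper name `top_correlated_columns` and the sample column names (`sqft`, `bedrooms`, `lot_id`, `city`) are hypothetical, not taken from the housing file.

```python
import pandas as pd


def top_correlated_columns(df, target, n=3):
    """Rank numeric columns by absolute Pearson correlation with the target."""
    numeric = df.select_dtypes("number")  # text columns are ignored
    corr = numeric.corr()[target].drop(target)
    return corr.abs().sort_values(ascending=False).head(n)


# Small hand-made sample standing in for the housing data.
sample = pd.DataFrame(
    {
        "price": [100, 150, 200, 250, 300],
        "sqft": [800, 1200, 1500, 2100, 2400],
        "bedrooms": [2, 3, 3, 4, 4],
        "lot_id": [3, 1, 4, 2, 5],
        "city": ["A", "B", "A", "C", "B"],
    }
)

top = top_correlated_columns(sample, target="price", n=2)
print(top)

# Keep only the strongest columns as training features.
features = sample[list(top.index)]
```

Taking the absolute value matters because a strong negative correlation is as informative as a strong positive one.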
Part 2 Coming Soon