In this tutorial, you will learn how to install XGBoost on Mac OS Sierra for the Python programming language.

XGBoost is a scalable and flexible gradient boosting library. Machine learning practitioners, including many Kaggle competitors, use it extensively to build state-of-the-art data science solutions.

Once you have installed the XGBoost library, you can use it directly from Anaconda. To learn how to install Anaconda, use the link below.

https://ampersandacademy.com/tutorials/python-data-science/python-data-science-tool-anaconda-spyder-installation-on-mac-and-windows

This tutorial gives you instructions on how to build and install the XGBoost package from scratch on Mac OS Sierra. It consists of two steps:

Step1:

First, build the shared library (libxgboost.so) from the C++ code.

Step2:

Then install the Python language packages.

Required Software.

1. Python (download and install Python from the python.org website)

Step1: Build the Shared Library

XGBoost supports multi-threading, which you can enable if you need it. However, building with multi-threading support is a time-consuming process: installing the compiler it requires took me 71 minutes. So the choice is yours: build without multi-threading or with multi-threading.

Without multi-threading

Run the two commands below in the terminal to build the shared library.

git clone --recursive https://github.com/dmlc/xgboost
cd xgboost; cp make/minimum.mk ./config.mk; make -j4

That's all. The shared library is built, and you can continue with Step2.

With multi-threading

By default, clang on OS X does not come with OpenMP support. Use the commands below to build XGBoost with OpenMP enabled.

brew install gcc --without-multilib

Installing gcc can take a while (about 30 minutes according to the official documentation). For me, it took 71 minutes on a 2015 MacBook Air, so please be patient until the installation completes.
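
Because the compiler names used in the later config.mk step depend on which gcc version Homebrew installed, it can help to check which versioned binaries are on your PATH. The following is a minimal sketch and not part of the official instructions; the function name find_versioned_gcc and the range of version numbers it tries are assumptions made for illustration.

```python
import shutil


def find_versioned_gcc(versions=range(14, 4, -1), which=shutil.which):
    """Return the first (gcc-N, g++-N) pair found on PATH, or None.

    `versions` is an assumed range of major versions to probe, newest first.
    `which` is injectable so the lookup can be replaced in tests.
    """
    for version in versions:
        cc, cxx = f"gcc-{version}", f"g++-{version}"
        # Both the C and the C++ compiler must be present for the build.
        if which(cc) and which(cxx):
            return cc, cxx
    return None


if __name__ == "__main__":
    found = find_versioned_gcc()
    if found is None:
        print("No versioned gcc/g++ pair found on PATH")
    else:
        print("Use these compiler names:", found[0], found[1])
```

If the script reports a pair other than gcc-7 and g++-7, use the reported names wherever this tutorial writes gcc-7 and g++-7.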

Now, clone the repository

git clone --recursive https://github.com/dmlc/xgboost

Next, copy the config file.

cd xgboost; cp make/config.mk ./config.mk

Now open the config.mk file using the command below, or open it with any text editor.

vi config.mk

Uncomment the lines near the top of the file:

export CC = gcc

export CXX = g++

Change them to the following, where the -7 suffix should match the gcc version that Homebrew installed:

export CC = gcc-7

export CXX = g++-7
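
If you would rather not edit the file by hand, the same change can be scripted. This is a rough sketch under the assumption that the two lines look like "export CC = ..." and "export CXX = ...", optionally commented out with a leading "#"; the helper name set_compilers is hypothetical.

```python
import re


def set_compilers(config_text, cc="gcc-7", cxx="g++-7"):
    """Uncomment and rewrite the CC and CXX export lines in a makefile-style text."""
    # Match an optionally commented "export CC = ..." line (CXX does not match this).
    cc_line = re.compile(r"^[ \t]*#?[ \t]*export[ \t]+CC[ \t]*=.*$", re.MULTILINE)
    # Match an optionally commented "export CXX = ..." line.
    cxx_line = re.compile(r"^[ \t]*#?[ \t]*export[ \t]+CXX[ \t]*=.*$", re.MULTILINE)
    # Callables are used as replacements so compiler names are inserted literally.
    config_text = cc_line.sub(lambda match: f"export CC = {cc}", config_text)
    config_text = cxx_line.sub(lambda match: f"export CXX = {cxx}", config_text)
    return config_text


if __name__ == "__main__":
    sample = "# export CC = gcc\n# export CXX = g++\n"
    print(set_compilers(sample))
```

To apply it to the real file, read config.mk into a string, pass it through set_compilers, and write the result back. Keep a backup copy of the file first.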

Save the file. You also need to make the same change in the file xgboost/Makefile. Open it using vi or any text editor.

vi Makefile

Change the corresponding lines in the Makefile to the following.

export CC = gcc-7

export CXX = g++-7

Save the file. Since you changed the Makefile, run a cleaning step before building.

make clean_all && make -j4
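
Before moving on to the Python package, you may want to confirm that the build actually produced the shared library. The sketch below assumes the library ends up in a lib directory inside the cloned repository and is named libxgboost.so or libxgboost.dylib; both the location and the helper name find_shared_library are assumptions, so adjust them if your build places the file elsewhere.

```python
from pathlib import Path

# Candidate file names; which one exists may depend on the XGBoost version (assumption).
LIBRARY_NAMES = ("libxgboost.so", "libxgboost.dylib")


def find_shared_library(repo_root):
    """Return the path of the built shared library under <repo_root>/lib, or None."""
    lib_dir = Path(repo_root) / "lib"
    for name in LIBRARY_NAMES:
        candidate = lib_dir / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Run this from inside the cloned xgboost directory.
    library = find_shared_library(".")
    print("Shared library not found" if library is None else f"Found {library}")
```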

Step2: Python Package Installation

The Python package is located in the python-package directory.

Install it system-wide, which requires root permission.

cd python-package; sudo python setup.py install

That's all. You have now successfully installed the XGBoost library.

Import XGBoost as shown below in Anaconda or Python.

import xgboost as xgb
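
If you have several Python installations (for example the python.org one and Anaconda), it is worth checking which copy of the package the current interpreter would pick up. This is a small sketch using only the standard library; the function name locate_package is made up for illustration.

```python
import importlib.util


def locate_package(name):
    """Return the file a top-level package would be imported from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    origin = locate_package("xgboost")
    if origin is None:
        print("xgboost is not visible to this Python interpreter")
    else:
        print("xgboost would be loaded from", origin)
```

If the reported path is not the location you expect, the interpreter you are running is probably not the one you installed the package into.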


For XGBoost regression, use the class below.

xgboost.XGBRegressor()

For XGBoost classification, use the class below.

xgboost.XGBClassifier()
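
To see one of these classes in action, here is a minimal sketch that fits XGBClassifier on a tiny made-up dataset. The data, the parameter values and the function name demo_classifier are illustrative assumptions; the example also assumes NumPy and scikit-learn are installed, which the scikit-learn style classes rely on. If anything is missing, the function returns None instead of failing.

```python
def demo_classifier():
    """Fit XGBClassifier on toy data and return its predictions, or None if dependencies are missing."""
    try:
        import numpy as np
        from xgboost import XGBClassifier

        # Small model; the parameter values are arbitrary choices for the demo.
        model = XGBClassifier(n_estimators=10, max_depth=2)
    except ImportError:
        return None

    # Toy data: the label simply equals the first feature (20 rows, 2 columns).
    features = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]] * 5)
    labels = np.array([0, 0, 1, 1] * 5)

    model.fit(features, labels)
    return model.predict(features)
```

Calling demo_classifier() returns one predicted label per input row. XGBRegressor follows the same fit and predict pattern with numeric targets instead of class labels.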

 

Sometimes, you will encounter an error like the one below while using XGBoost on Anaconda.

from xgboost import XGBClassifier
ImportError: cannot import name 'XGBClassifier' from xgboost/xgboost.py

Execute the command below in the terminal to resolve the error.

conda install -c conda-forge xgboost=0.6a2
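
After reinstalling, you can check whether the class is now exposed by the package without running your whole script. The helper below is a hedged sketch; the name exposes is hypothetical, and it only tells you whether the attribute exists, not why it might be missing.

```python
import importlib


def exposes(module_name, attribute):
    """Return True if the module can be imported and has the given attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or failed to import.
        return False
    return hasattr(module, attribute)
```

For example, exposes("xgboost", "XGBClassifier") should return True once the installation works. If it still returns False, one thing worth checking (an assumption, not something this tutorial has verified) is whether a file of your own named xgboost.py is being imported instead of the library.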

Features of XGBoost:

1. Flexible

Supports regression, classification, ranking and user-defined objectives. 

2. Portable

Runs on Windows, Linux and OS X, as well as various cloud platforms.

3. Multiple Languages

Supports multiple languages including C++, Python, R, Java, Scala, Julia.

4. Battle-tested

Wins many data science and machine learning challenges. Used in production by multiple companies. 

5. Distributed on Cloud

Supports distributed training on multiple machines, including AWS, GCE, Azure, and Yarn clusters. Can be integrated with Flink, Spark and other cloud dataflow systems.

6. Performance

A well-optimized backend system delivers the best performance with limited resources. The distributed version solves problems beyond billions of examples with the same code.

Conclusion:

In this tutorial, you learned how to install the XGBoost library on Mac OS Sierra for the Python programming language. This library is widely used by Kaggle competition winners. So try it and enjoy machine learning using the XGBoost library.