Top 20 Python Machine Learning Open Source Projects

We examine the top Python machine learning open source projects on GitHub, in terms of both contributors and commits, and identify the most popular and most active ones.
By Bhavya Geethika Peddibhotla. 

We analyze the top 20 Python machine learning projects on GitHub and find that scikit-learn, Pylearn2 and NuPIC are the most actively contributed-to projects. Explore these popular projects on GitHub!

Fig. 1: Python Machine learning projects on GitHub, with color corresponding to commits/contributors. Bob, Iepy, Nilearn, and NuPIC have the highest such value. 

  1. scikit-learn, 18845 commits, 404 contributors, 
    www.github.com/scikit-learn/scikit-learn 
    scikit-learn is a Python module for machine learning built on top of SciPy. It features various classification, regression and clustering algorithms including support vector machines, logistic regression, naive Bayes, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
  2. Pylearn2, 7027 commits, 117 contributors, 
    www.github.com/lisa-lab/pylearn2 
    Pylearn2 is a library designed to make machine learning research easy. It is based on Theano.
  3. NuPIC, 4392 commits, 60 contributors, 
    www.github.com/numenta/nupic 
    The Numenta Platform for Intelligent Computing (NuPIC) is a machine intelligence platform that implements the HTM learning algorithms. HTM is a detailed computational theory of the neocortex. At the core of HTM are time-based continuous learning algorithms that store and recall spatial and temporal patterns. NuPIC is suited to a variety of problems, particularly anomaly detection and prediction of streaming data sources.
  4. Nilearn, 2742 commits, 28 contributors, 
    www.github.com/nilearn/nilearn 
    Nilearn is a Python module for fast and easy statistical learning on NeuroImaging data. It leverages the scikit-learn Python toolbox for multivariate statistics with applications such as predictive modeling, classification, decoding, or connectivity analysis.
  5. PyBrain, 969 commits, 27 contributors, 
    www.github.com/pybrain/pybrain 
    PyBrain is short for Python-Based Reinforcement Learning, Artificial Intelligence and Neural Network Library. Its goal is to offer flexible, easy-to-use yet still powerful algorithms for Machine Learning Tasks and a variety of predefined environments to test and compare your algorithms.
  6. Pattern, 943 commits, 20 contributors, 
    www.github.com/clips/pattern 
    Pattern is a web mining module for Python. It has tools for Data Mining, Natural Language Processing, Network Analysis and Machine Learning. It supports vector space models, clustering, and classification using KNN, SVM, and Perceptron.
  7. Fuel, 497 commits, 12 contributors, 
    www.github.com/mila-udem/fuel 
    Fuel provides your machine learning models with the data they need to learn. It has interfaces to common datasets such as MNIST and CIFAR-10 (image datasets) and Google's One Billion Words (text). It gives you the ability to iterate over your data in a variety of ways, such as in minibatches with shuffled or sequential examples.
  8. Bob, 5080 commits, 11 contributors, 
    www.github.com/idiap/bob 
    Bob is a free signal-processing and machine learning toolbox. The toolbox is written in a mix of Python and C++ and is designed both to be efficient and to reduce development time. It is composed of a reasonably large number of packages that implement tools for image, audio & video processing, machine learning and pattern recognition.
  9. skdata, 441 commits, 10 contributors, 
    www.github.com/jaberg/skdata 
    Skdata is a library of data sets for machine learning and statistics. This module provides standardized Python access to toy problems as well as popular computer vision and natural language processing data sets.
  10. MILK, 687 commits, 9 contributors, 
    www.github.com/luispedro/milk 
    Milk is a machine learning toolkit in Python. Its focus is on supervised classification with several classifiers available: SVMs, k-NN, random forests, decision trees. It also performs feature selection. These classifiers can be combined in many ways to form different classification systems. For unsupervised learning, milk supports k-means clustering and affinity propagation.
  11. IEPY, 1758 commits, 9 contributors, 
    www.github.com/machinalis/iepy 
    IEPY is an open source tool for Information Extraction focused on Relation Extraction. 
    It's aimed at users needing to perform Information Extraction on a large dataset and at scientists wanting to experiment with new IE algorithms.
  12. Quepy, 131 commits, 9 contributors, 
    www.github.com/machinalis/quepy 
    Quepy is a Python framework to transform natural language questions into queries in a database query language. It can be easily customized to different kinds of questions in natural language and database queries. So, with little coding you can build your own system for natural language access to your database. 
    Currently Quepy provides support for the SPARQL and MQL query languages, with plans to extend it to other database query languages.
  13. Hebel, 244 commits, 5 contributors, 
    www.github.com/hannes-brt/hebel 
    Hebel is a library for deep learning with neural networks in Python using GPU acceleration with CUDA through PyCUDA. It implements the most important types of neural network models and offers a variety of different activation functions and training methods such as momentum, Nesterov momentum, dropout, and early stopping.
  14. mlxtend, 135 commits, 5 contributors, 
    www.github.com/rasbt/mlxtend 
    It is a library consisting of useful tools and extensions for day-to-day data science tasks.
  15. nolearn, 192 commits, 4 contributors, 
    www.github.com/dnouri/nolearn 
    This package contains a number of utility modules that are helpful with machine learning tasks. Most of the modules work together with scikit-learn, others are more generally useful.
  16. Ramp, 179 commits, 4 contributors, 
    www.github.com/kvh/ramp 
    Ramp is a Python library for rapid prototyping of machine learning solutions. It's a light-weight pandas-based machine learning framework pluggable with existing Python machine learning and statistics tools (scikit-learn, rpy2, etc.). Ramp provides a simple, declarative syntax for exploring features, algorithms and transformations quickly and efficiently.
  17. Feature Forge, 219 commits, 3 contributors, 
    www.github.com/machinalis/featureforge 
    A set of tools for creating and testing machine learning features, with a scikit-learn compatible API. 
    This library provides a set of tools that can be useful in many machine learning applications (classification, clustering, regression, etc.), and particularly helpful if you use scikit-learn (although this can work if you have a different algorithm).
  18. REP, 50 commits, 3 contributors, 
    www.github.com/yandex/rep 
    REP is an environment for conducting data-driven research in a consistent and reproducible way. It has a unified classifier wrapper for a variety of implementations such as TMVA, Sklearn, XGBoost, and uBoost. It can train classifiers in parallel on a cluster. It supports interactive plots.
  19. Python Machine Learning Samples, 15 commits, 3 contributors, 
    www.github.com/awslabs/machine-learning-samples 
    A collection of sample applications built using Amazon Machine Learning.
  20. Python-ELM, 17 commits, 1 contributor, 
    www.github.com/dclambert/Python-ELM 
    This is an implementation of the Extreme Learning Machine in Python, based on scikit-learn.
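
The commits/contributors value that Fig. 1 uses for coloring can be recomputed from the counts listed above. The following is a minimal sketch using only the Python standard library; the `PROJECTS` list and the `commits_per_contributor` helper are illustrative names introduced here, not part of the original analysis, and the figures are simply copied from this post.

```python
# (project, commits, contributors) as listed in this post.
# PROJECTS and commits_per_contributor are hypothetical names for illustration.
PROJECTS = [
    ("scikit-learn", 18845, 404),
    ("Pylearn2", 7027, 117),
    ("NuPIC", 4392, 60),
    ("Nilearn", 2742, 28),
    ("PyBrain", 969, 27),
    ("Pattern", 943, 20),
    ("Fuel", 497, 12),
    ("Bob", 5080, 11),
    ("skdata", 441, 10),
    ("MILK", 687, 9),
    ("IEPY", 1758, 9),
    ("Quepy", 131, 9),
    ("Hebel", 244, 5),
    ("mlxtend", 135, 5),
    ("nolearn", 192, 4),
    ("Ramp", 179, 4),
    ("Feature Forge", 219, 3),
    ("REP", 50, 3),
    ("Python Machine Learning Samples", 15, 3),
    ("Python-ELM", 17, 1),
]


def commits_per_contributor(projects):
    """Return (name, ratio) pairs sorted by commits per contributor, highest first."""
    ratios = [
        (name, commits / contributors)
        for name, commits, contributors in projects
    ]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Print the five projects with the highest ratio.
    for name, ratio in commits_per_contributor(PROJECTS)[:5]:
        print(f"{name:<15} {ratio:7.1f}")
```

Sorting by the raw commit or contributor count instead only requires changing the sort key, which makes it easy to compare the "most active" and "most popular" orderings side by side.
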
This post used some content from www.pansop.com/1039/ 
