Top 20 Python Machine Learning Open Source Projects

We examine the top Python machine learning open source projects on GitHub, in terms of both contributors and commits, and identify the most popular and most active ones.
By Bhavya Geethika Peddibhotla. 

We analyze the top 20 Python machine learning projects on GitHub and find that scikit-learn, Pylearn2 and NuPIC are the most actively contributed projects. Explore these popular projects on GitHub!

Fig. 1: Python machine learning projects on GitHub, with color corresponding to the ratio of commits to contributors. Bob, IEPY, Nilearn, and NuPIC have the highest ratios.
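The ratio behind the coloring in Fig. 1 is commits divided by contributors. Here is a minimal sketch of that arithmetic, using counts copied from six of the entries in this post; the function name `commits_per_contributor` is a hypothetical helper chosen for illustration, not something from the original analysis.

```python
# Commit and contributor counts copied from six entries in this post
# (a subset chosen for illustration, not the full list of 20).
projects = {
    "scikit-learn": (18845, 404),
    "Pylearn2": (7027, 117),
    "NuPIC": (4392, 60),
    "Nilearn": (2742, 28),
    "Bob": (5080, 11),
    "IEPY": (1758, 9),
}


def commits_per_contributor(commits, contributors):
    """Return the average number of commits per contributor."""
    return commits / contributors


# Compute the ratio for each project.
ratios = {
    name: commits_per_contributor(commits, contributors)
    for name, (commits, contributors) in projects.items()
}

# Rank projects from highest to lowest ratio.
ranked = sorted(ratios.items(), key=lambda item: item[1], reverse=True)

for name, ratio in ranked:
    print(f"{name:<14}{ratio:8.1f}")
```

Among these six, Bob ranks first at roughly 460 commits per contributor, followed by IEPY, Nilearn, and NuPIC.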

  1. scikit-learn, 18845 commits, 404 contributors, 
    www.github.com/scikit-learn/scikit-learn 
    scikit-learn is a Python module for machine learning built on top of SciPy. It features various classification, regression and clustering algorithms including support vector machines, logistic regression, naive Bayes, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
  2. Pylearn2, 7027 commits, 117 contributors, 
    www.github.com/lisa-lab/pylearn2 
    Pylearn2 is a library designed to make machine learning research easy. It is based on Theano.
  3. NuPIC, 4392 commits, 60 contributors, 
    www.github.com/numenta/nupic 
    The Numenta Platform for Intelligent Computing (NuPIC) is a machine intelligence platform that implements the HTM learning algorithms. HTM is a detailed computational theory of the neocortex. At the core of HTM are time-based continuous learning algorithms that store and recall spatial and temporal patterns. NuPIC is suited to a variety of problems, particularly anomaly detection and prediction of streaming data sources.
  4. Nilearn, 2742 commits, 28 contributors, 
    www.github.com/nilearn/nilearn 
    Nilearn is a Python module for fast and easy statistical learning on NeuroImaging data. It leverages the scikit-learn Python toolbox for multivariate statistics with applications such as predictive modeling, classification, decoding, or connectivity analysis.
  5. PyBrain, 969 commits, 27 contributors, 
    www.github.com/pybrain/pybrain 
    PyBrain is short for Python-Based Reinforcement Learning, Artificial Intelligence and Neural Network Library. Its goal is to offer flexible, easy-to-use yet still powerful algorithms for Machine Learning Tasks and a variety of predefined environments to test and compare your algorithms.
  6. Pattern, 943 commits, 20 contributors, 
    www.github.com/clips/pattern 
    Pattern is a web mining module for Python. It has tools for Data Mining, Natural Language Processing, Network Analysis and Machine Learning. It supports a vector space model, clustering, and classification using KNN, SVM, and Perceptron.
  7. Fuel, 497 commits, 12 contributors, 
    www.github.com/mila-udem/fuel 
    Fuel provides your machine learning models with the data they need to learn. It has interfaces to common datasets such as MNIST and CIFAR-10 (image datasets) and Google's One Billion Words (text). It gives you the ability to iterate over your data in a variety of ways, such as in minibatches with shuffled or sequential examples.
  8. Bob, 5080 commits, 11 contributors, 
    www.github.com/idiap/bob 
    Bob is a free signal-processing and machine learning toolbox. The toolbox is written in a mix of Python and C++ and is designed to be efficient while reducing development time. It is composed of a reasonably large number of packages that implement tools for image, audio and video processing, machine learning and pattern recognition.
  9. skdata, 441 commits, 10 contributors, 
    www.github.com/jaberg/skdata 
    Skdata is a library of data sets for machine learning and statistics. This module provides standardized Python access to toy problems as well as popular computer vision and natural language processing data sets.
  10. MILK, 687 commits, 9 contributors, 
    www.github.com/luispedro/milk 
    Milk is a machine learning toolkit in Python. Its focus is on supervised classification with several classifiers available: SVMs, k-NN, random forests, decision trees. It also performs feature selection. These classifiers can be combined in many ways to form different classification systems. For unsupervised learning, Milk supports k-means clustering and affinity propagation.
  11. IEPY, 1758 commits, 9 contributors, 
    www.github.com/machinalis/iepy 
    IEPY is an open source tool for Information Extraction focused on Relation Extraction.
    It's aimed at users needing to perform Information Extraction on a large dataset and at scientists wanting to experiment with new IE algorithms.
  12. Quepy, 131 commits, 9 contributors, 
    www.github.com/machinalis/quepy 
    Quepy is a Python framework to transform natural language questions into queries in a database query language. It can be easily customized to different kinds of questions in natural language and database queries. So, with little coding you can build your own system for natural language access to your database.
    Currently Quepy provides support for the SPARQL and MQL query languages, with plans to extend it to other database query languages.
  13. Hebel, 244 commits, 5 contributors, 
    www.github.com/hannes-brt/hebel 
    Hebel is a library for deep learning with neural networks in Python using GPU acceleration with CUDA through PyCUDA. It implements the most important types of neural network models and offers a variety of different activation functions and training methods such as momentum, Nesterov momentum, dropout, and early stopping.
  14. mlxtend, 135 commits, 5 contributors, 
    www.github.com/rasbt/mlxtend 
    It is a library consisting of useful tools and extensions for day-to-day data science tasks.
  15. nolearn, 192 commits, 4 contributors, 
    www.github.com/dnouri/nolearn 
    This package contains a number of utility modules that are helpful with machine learning tasks. Most of the modules work together with scikit-learn; others are more generally useful.
  16. Ramp, 179 commits, 4 contributors, 
    www.github.com/kvh/ramp 
    Ramp is a Python library for rapid prototyping of machine learning solutions. It's a lightweight, pandas-based machine learning framework that plugs into existing Python machine learning and statistics tools (scikit-learn, rpy2, etc.). Ramp provides a simple, declarative syntax for exploring features, algorithms and transformations quickly and efficiently.
  17. Feature Forge, 219 commits, 3 contributors, 
    www.github.com/machinalis/featureforge 
    A set of tools for creating and testing machine learning features, with a scikit-learn compatible API. 
    This library provides a set of tools that can be useful in many machine learning applications (classification, clustering, regression, etc.), and particularly helpful if you use scikit-learn (although this can work if you have a different algorithm).
  18. REP, 50 commits, 3 contributors, 
    www.github.com/yandex/rep 
    REP is an environment for conducting data-driven research in a consistent and reproducible way. It has a unified classifier wrapper for a variety of implementations like TMVA, Sklearn, XGBoost, and uBoost. It can train classifiers in parallel on a cluster. It supports interactive plots.
  19. Python Machine Learning Samples, 15 commits, 3 contributors, 
    www.github.com/awslabs/machine-learning-samples 
    A collection of sample applications built using Amazon Machine Learning.
  20. Python-ELM, 17 commits, 1 contributor, 
    www.github.com/dclambert/Python-ELM 
    This is an implementation of the Extreme Learning Machine in Python, based on scikit-learn.
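
Some of the behavior described in the list above is easy to picture in code. As one illustration, the Fuel entry mentions iterating over data in minibatches with shuffled or sequential examples. The sketch below shows that idea in plain Python. It does not use Fuel's API; `iterate_minibatches` and its parameters are hypothetical names chosen for illustration.

```python
import random


def iterate_minibatches(examples, batch_size, shuffle=False, seed=None):
    """Yield lists of at most `batch_size` examples.

    Hypothetical helper for illustration only; this is not Fuel's interface.
    """
    indices = list(range(len(examples)))
    if shuffle:
        # Shuffle a copy of the index order so the input itself is untouched.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch_indices = indices[start:start + batch_size]
        yield [examples[i] for i in batch_indices]


data = list(range(10))

# Sequential order: batches follow the original ordering of the data.
sequential = list(iterate_minibatches(data, batch_size=4))

# Shuffled order: the same examples, visited in a random order.
shuffled = list(iterate_minibatches(data, batch_size=4, shuffle=True, seed=0))

print(sequential)
print(shuffled)
```

Both modes visit every example exactly once per pass; only the order differs, and the last batch may be smaller when the dataset size is not a multiple of the batch size.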
This post used some content from www.pansop.com/1039/ 
