Scikit-learn Crash Course – Machine Learning Library for Python



Scikit-learn is a free software machine learning library for the Python programming language. Learn how to use it in this crash course.

✏️ Course created by Vincent D. Warmerdam.

⭐️ Course Contents ⭐️
⌨️ (0:00:00) introduction
⌨️ (0:03:08) introducing scikit-learn
⌨️ (0:34:36) preprocessing
⌨️ (0:53:36) metrics
⌨️ (1:24:49) meta-estimators
⌨️ (1:45:34) human-learn
⌨️ (2:06:17) wrap-up

⭐️ Code ⭐️
💻 Full code: https://github.com/koaning/calm-notebooks
💻 Notebook per section:
🖥 introducing scikit-learn: https://github.com/koaning/calm-notebooks/blob/master/scikit-learn/scikit-learn.ipynb
🖥 preprocessing: https://github.com/koaning/calm-notebooks/blob/master/scikit-prep/scikit-prepare.ipynb
🖥 metrics: https://github.com/koaning/calm-notebooks/blob/master/scikit-metrics/scikit-metrics.ipynb
🖥 meta estimators: https://github.com/koaning/calm-notebooks/blob/master/scikit-meta/scikit-meta.ipynb
🖥 human-learn: https://github.com/koaning/calm-notebooks/blob/master/human-learn/human-learn.ipynb

⭐️ Other Resources ⭐️
🔗 https://calmcode.io
🔗 scikit-learn docs: https://sklearn.org/index.html
🔗 spaCy course: https://www.youtube.com/watch?v=WnGPv6HnBok&list=PLBmcuObd5An559HbDr_alBnwVsGq-7uTF&ab_channel=Explosion
🔗 PyData Youtube channel: https://www.youtube.com/user/PyDataTV
🔗 algorithm whiteboard: https://www.youtube.com/watch?v=Czto6GzJah8&list=PL75e0qA87dlG-za8eLI6t0_Pbxafk-cxb&index=32&ab_channel=Rasa

Learn to code for free and get a developer job: https://www.freecodecamp.org

Read hundreds of articles on programming: https://freecodecamp.org/news

And subscribe for new videos on technology every day: https://youtube.com/subscription_center?add_user=freecodecamp

This Post Has 48 Comments

  1. freeCodeCamp.org

    Message from the creator:
    I hope you've all enjoyed this series of videos. It was fun to collaborate with freeCodeCamp!

    If you're interested in more content from me feel free to check out calmcode. Also, I'd like to give a shoutout to my employer, Rasa! We're using scikit-learn (and a whole bunch of other tools) to build open-source chatbot technology for python. If that sounds interesting, definitely check out https://rasa.com/docs/rasa/.

  2. Tanmay Bhardwaj

    Does Vincent have his own channel? I just love his teaching style!!

  3. Kuan

    A racist algorithm vs a political algorithm?

  4. Neet Shah

    Thank you soo much Vincent and FreeCodeCamp. This is the most useful video on YouTube and it's free!

  5. Anand Srikumar

    If i get a high paying job, i will donate at least 5000 rupees to freecodecamp

  6. ADI

    I loved the "racist algorithm" concern you raised. I guess most of us would have ignored it while drowning in fancy algorithms.

  7. kevin dandrade

    The section on Metrics gets confusing for me. Any easy to understand books I can read for understanding metrics?

  8. I’m Da Dood

    Just completed the first part of the lecture. I have been using scikit for a couple of months! Dudeee! This is an eye opener!

  9. Alberto G

    Very good teacher. Thanks for the content I learned a lot.

  10. elghark

    Min 56:56. You said that 196 cases out of 80000 means there are a lot more "fraud cases" (class 1) than "non-fraud cases" (class 0). Why? Isn't it the contrary?
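
    Taking the numbers in this comment at face value (196 fraud cases among 80000 rows), a quick check of the arithmetic suggests fraud is the rare class, which matches the commenter's reading. A minimal sketch; the variable names are illustrative and the counts are simply those quoted above:

```python
# Counts as quoted in the comment above (not re-derived from the dataset).
n_total = 80_000
n_fraud = 196
n_non_fraud = n_total - n_fraud

fraud_rate = n_fraud / n_total       # fraction of rows labelled fraud (class 1)
imbalance = n_non_fraud / n_fraud    # non-fraud rows per fraud row

print(f"fraud rate: {fraud_rate:.3%}")
print(f"non-fraud rows per fraud row: {imbalance:.0f}")
```

    With a class this rare, a model that always predicts class 0 would already be right on all but 196 rows, so accuracy alone says very little here.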

  11. rajat sharma

    sorry…but i totally lost it from metrics onwards…it was too heavy to understand…did not understand even the purpose of the lecture let alone the code…

  12. Madhavaraj

    💯💯💯💯💯💯💯💯💯💯💯💯💯😍😍😍♥️🙏🤝🤝🤝🤝🤝🤝🤝🤝🤝🤝👍

  13. ghaziekid gaming

    im stuck at the first step,
    from sklearn.datasets import load_boston
    ModuleNotFoundError: No module named 'sklearn'
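
    This error usually means scikit-learn is not installed in the Python environment that is running the code. The import name is `sklearn`, but the package is published as `scikit-learn`. A small sketch for checking this from Python; `install_hint` is a hypothetical helper written for this note, not part of any library:

```python
import importlib.util


def install_hint(module_name: str = "sklearn", package_name: str = "scikit-learn") -> str:
    """Report whether a module can be imported and, if not, suggest an install command."""
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        return f"{module_name} not found; try: python -m pip install {package_name}"
    return f"{module_name} is available"


print(install_hint())
```

    Running the suggested command with the same interpreter that runs the notebook (hence `python -m pip`) avoids installing into a different environment by accident.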

  14. Rick Ellis

    This is the way everything should be taught!

    I love that you present concepts in a structured and systematic way, speaking slowly and clearly, using as few words as possible…

    – starting with the concept and talking through drawing a logical diagram (which is so important for developing abstract thinking in terms of high level concepts, which is how we think when we are experienced in something).

    – then writing clean, concise code to implement each part of the concept

    – showing plots that directly demonstrate the effects of the entire iteration

    Too many tutorials make the mistake of talking too much. A lot of videos also either assume too much or too little about the viewer's knowledge.

    This seems to confidently strike the nail on the head!

    Thanks!

  15. Rick Ellis

    Can I ask you how you are able to draw on the screen? I understand you are probably using a Stylus pen over some touch screen surface, which mirrors your display, but what software are you using for that?

  16. 123asdf

    Is there a way for KNN to skip the closest nearest neighbor?
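
    One way to approach this, sketched under assumptions: ask for one extra neighbor and drop the first column of the result. `NearestNeighbors.kneighbors` returns neighbors sorted by increasing distance, so column 0 holds the closest one. The toy data, the query point, and `k` below are made up for illustration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Toy one-dimensional training points (illustrative only).
X = np.array([[0.0], [1.0], [3.0], [7.0]])
k = 2  # number of neighbors to keep after skipping the closest one

# Request k + 1 neighbors so that k remain once the closest is dropped.
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
distances, indices = nn.kneighbors(np.array([[0.9]]))

# Column 0 is the closest neighbor; keep the rest.
skipped_indices = indices[:, 1:]
skipped_distances = distances[:, 1:]

print(skipped_indices)
```

    A prediction could then be built by hand from the targets at `skipped_indices`, for example by averaging them for regression.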

  17. icuclc

    Hey man, thanks for this video – it's taught me a lot! Really appreciate the thoroughness as it's really helping me nail down these concepts.

    I do have to question something though: what's so wrong with the original dataset? I understand the racism concerns, but I think it should serve as a reminder of how silly our history actually is, and how far we've come.

    It also highlights another very important topic in machine learning: Just because a feature is in a data set, doesn't mean that it needs to be part of the analytics or modeling.

    Moreover, part of AI explainability is to identify what features you were given to work with, and which ones you decided to choose for your model. Therefore, having a feature like this is important so that we all learn to critically analyze the information we've been given, and help us understand which features should be used for modeling.

    I'm commenting on here, as opposed to elsewhere, because it seems like this was something you found bad – and were very passionate about in the video. My personal view is that it should have been kept in the data set from scikit-learn; those of us who are serious about this stuff know how to "drop columns", and be fully transparent in how we built our models – including the features used.

    Therefore, there is absolutely nothing to hide.

    Anyone who builds a model, tells you how well it performs, but can't explain anything other than draw you a picture of how a neural network works should never be trusted. AI and Machine Learning is science. Science must be transparent, repeatable and reproducible.

    It should also serve as a reminder of history. Those who don't know about history tend to repeat it. We need these reminders from that context.

  18. Burak Senel

    This is by far the most beginner friendly introduction to sk-learn I've seen

  19. PW

    Kudos! Excellent training.

  20. kh al

    00:19 I did not understand why the prediction diagram changed after changing the k value from 5 to 1. KNN is a classification algorithm, but here it was used like a regression.

  21. Before I use up a ton of time for nothing, I want to know if Scikit-learn is capable of Deep Q learning because that's what I've been trying to do

  22. Where are the datasets for the sklearn metric tutorial (credit card dataset, etc)? Thank you!

  23. Mugumya Vicent

    thanks my co name — vicent, you inspire me to do machine learning

  24. Saptarshi Sanyal

    At 48 minutes the explanation for polynomial scaling was not clear, the plot for standard scaler and polynomial scaler was shown as same. Then what was the improvement?

    Further, at 1.09hr, the syntax for python code inside plt.plot, could anybody pls explain??

    At the 57th minute you said there are way more fraud cases than non-fraud cases. Is that correct? Because there are 80000 samples and only 196 fraud cases.

  25. Gisle Berge

    Great introduction to ML, educational and well explained to the core… 🙂

  26. TwixWyd

    50:00 count vectorizer is a really good preprocessor for that too in my opinion

  27. José Cazarin

    For the Titanic example: 76% of the women survived, whereas just 16% of the men survived, that would have been a really good classifier to start with

  28. TwixWyd

    I was rewatching the course to make my basics better , there were actually a lot of details man!!!

  29. Louis Liu

    Could you please explain why the min of recall and precision is lower than both? Could not find appendix.
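
    One possible explanation, assuming the scores shown in the video are averaged over cross-validation folds: the minimum is taken inside each fold and then averaged, and the average of per-fold minima can fall below both averaged metrics. The fold scores below are hypothetical numbers chosen to make the effect visible.

```python
from statistics import mean

# Hypothetical per-fold scores (two folds), chosen for illustration only.
precision_per_fold = [0.9, 0.5]
recall_per_fold = [0.5, 0.9]

# The custom metric is evaluated inside each fold first...
min_per_fold = [min(p, r) for p, r in zip(precision_per_fold, recall_per_fold)]

# ...and only then averaged across folds, just like precision and recall.
mean_precision = mean(precision_per_fold)
mean_recall = mean(recall_per_fold)
mean_of_min = mean(min_per_fold)

print(mean_precision, mean_recall, mean_of_min)
```

    Here both averaged metrics come out near 0.7 while the averaged minimum is 0.5, because a different metric is the weaker one in each fold.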

  30. BeastBoyGaming

    i don't know what u said about load boston being deprecated and removed cus it's very much still there and so is the 'B' variable used in the algorithm lol. im using the new versions of scikit that you get from anaconda

    – a black person lol

  31. Miles

    it's insane how good this video is
