In this video I explain how to scale Python pandas workloads to millions of records using libraries such as Dask and Modin. I also show that when your dataset fits in main memory, plain pandas is much faster than Dask or Modin; those libraries are better suited to distributed computing scenarios.
If you like this kind of content, please subscribe to the channel here: https://www.youtube.com/c/RitheshSree...
If you would like to support me financially (this is entirely optional and voluntary), you can buy me a coffee here: https://www.buymeacoffee.com/rithesh
Relevant Links:
https://www.datarevenue.com/en-blog/p...
https://modin.readthedocs.io/en/stable/
https://docs.dask.org/en/stable/dataf...