PC-53293 | PyCon Canada 2018

Numpy To PyTorch

by Shagun Sodhani

data science pytorch numpy best practices

Numpy is the de facto choice for array-based operations, while PyTorch is largely used as a deep learning framework. At their core, both provide a powerful N-dimensional tensor. This talk would focus on the similarities and differences between the two and on how we can use PyTorch to augment Numpy.

Numpy is the de facto choice for performing array-based operations, while PyTorch is largely used as 'a deep learning framework for fast, flexible experimentation'. Even though the two descriptions sound different, both libraries provide access to a powerful N-dimensional array (or, as we say in PyTorch, a tensor). PyTorch supports tensor computations (similar to Numpy) with strong GPU acceleration. In some sense, PyTorch can be used as a replacement for Numpy to harness the power of GPUs, even if your use case is not a machine learning one. The cost of converting a Numpy ndarray to a torch tensor is negligible, as the two share the same storage. Unfortunately, PyTorch cannot be used as a drop-in replacement for Numpy, though the PyTorch developers 'expect to get closer and closer to NumPy’s API where appropriate'.
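The shared-storage point can be illustrated with a minimal sketch. It assumes a CPU tensor and uses `torch.from_numpy`, `Tensor.numpy()` and `torch.tensor`; the variable names are illustrative and are not taken from the talk's notebook.

```python
import numpy as np
import torch

# A small ndarray to convert.
arr = np.arange(6, dtype=np.float32).reshape(2, 3)

# from_numpy does not copy: the tensor is a view over the ndarray's memory.
shared = torch.from_numpy(arr)

# torch.tensor always copies, so this tensor is independent of arr.
copied = torch.tensor(arr)

# Writing through the ndarray is visible in the shared tensor only.
arr[0, 0] = 42.0

# Going back with .numpy() (CPU tensors only) shares memory as well.
back = shared.numpy()
shared[1, 2] = -1.0

print(shared[0, 0].item(), copied[0, 0].item())  # shared sees 42.0, the copy does not
print(back[1, 2], arr[1, 2])                     # both see the write made via the tensor
```

Because the memory is shared, in-place edits on either side are visible on the other; use the copying constructor when independent data is needed.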

Content:

One slide about 'what is Numpy'
A few slides about 'what is ndarray' and the 'operations supported' by ndarray.
- These slides would not go into the internals and are presented to keep the presentation self-contained.
A few slides about 'what is PyTorch'
- These slides would provide brief background information about PyTorch.
A few slides about 'what is torch.Tensor'
Some slides about syntactic similarities and differences between the two APIs
- This would be a very important part of the presentation.
- Here we would spend time on subtle differences between the two APIs (e.g. Numpy uses axis while PyTorch uses dim) and on gotchas to look out for.
- We would talk about operations that are not yet supported in PyTorch.
- A small pointer about the performance of PyTorch tensors and Numpy ndarrays.
- I am expecting a few questions from the audience during this part
Why do we want PyTorch
One slide about how to convert Numpy ndarray to torch Tensor and vice-versa
Some slides about moving tensors to GPU
- How to move tensor
- Things to take care of, e.g. do not move data between devices after every operation. If your data lies on the GPU, try to do as many operations there as possible.
How to use other numerical computation libraries with PyTorch
Where to go from here
Conclusion
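As an illustration of the axis/dim item in the outline, here is a small sketch of two naming and return-value differences. It is an assumption about the kind of gotcha the talk covers, not a copy of its slides, and newer PyTorch releases may accept some Numpy-style aliases.

```python
import numpy as np
import torch

a = np.arange(6, dtype=np.float32).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
t = torch.from_numpy(a)

# Same reduction, different keyword: axis in Numpy, dim in PyTorch.
np_sum = a.sum(axis=0)
torch_sum = t.sum(dim=0)

# The "keep the reduced dimension" flag is also spelled differently.
np_kept = a.sum(axis=1, keepdims=True)
torch_kept = t.sum(dim=1, keepdim=True)

# Gotcha: ndarray.max(axis=...) returns only the values, while
# Tensor.max(dim=...) returns a (values, indices) pair.
np_max = a.max(axis=1)
values, indices = t.max(dim=1)

print(np_sum, torch_sum)
print(np_kept.shape, tuple(torch_kept.shape))
print(np_max, values, indices)
```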

The slides and notebook are available here. I would use Colab notebooks to demo the code. Note that I am deliberately avoiding the perspective of how to train neural networks using PyTorch; I want to focus on the interplay between PyTorch and Numpy.
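The outline's advice on moving tensors to the GPU (move once, do as much work there as possible, move back once) can be sketched as follows. This is a minimal example that falls back to the CPU when no GPU is available; the covariance computation and all variable names are illustrative assumptions, not material from the talk.

```python
import numpy as np
import torch

# Pick the GPU if one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Start from Numpy data (1000 samples, 100 features).
rng = np.random.default_rng(0)
x_np = rng.random((1000, 100)).astype(np.float32)

# Move the data to the device once.
x = torch.from_numpy(x_np).to(device)

# Chain several operations on the device without round trips to the host.
centered = x - x.mean(dim=0, keepdim=True)
cov = centered.t() @ centered / (x.shape[0] - 1)

# Move only the final result back and convert it to an ndarray.
result = cov.cpu().numpy()
print(result.shape)
```

Calling `.cpu()` before `.numpy()` is required for GPU tensors and is harmless for tensors that already live on the CPU.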

About the Author

Hi, I am Shagun Sodhani. I am a researcher and MSc student at the Montreal Institute for Learning Algorithms. I graduated in Computer Science from IIT Roorkee in 2015 and previously worked for 2 years as a Machine Learning developer with the Data Science team at Adobe Systems. Research takes most of my time, though I regularly attend tech talks and meetups around the city. You may check out my projects, talks and resume.

I love Computer Science in general and Machine Learning in particular. I regularly read and summarise papers from these domains. I am also interested in Maths and Economics. I always look forward to meeting interesting people and learning new stuff, both in person and remotely. If you want to talk, feel free to drop me a line.

Author website: https://shagunsodhani.in/
