In this episode, I focus on TensorFlow 2.0, the most popular open source library for Machine Learning. I explain in detail how it differs from TensorFlow 1.x, and how you can start using it on AWS: Deep Learning AMIs, Deep Learning Containers, and Amazon SageMaker.
⭐️⭐️⭐️ Don't forget to subscribe to be notified of future episodes ⭐️⭐️⭐️
* TensorFlow on AWS: https://aws.amazon.com/tensorflow
* My SageMaker notebook: https://gitlab.com/juliensimon/dlnotebooks/tree/master/keras/07-keras-fmnist-tf20
* A notebook giving you an overview of TensorFlow 2.0, by François Chollet, the creator of Keras: https://colab.research.google.com/drive/1UCJt8EYjlzCs1H1d1X0iDGYJsHKwu-NO
This podcast is also available in video: https://www.youtube.com/watch?v=Kqd7__Yllr0
For more content, follow me at https://medium.com/@julsimon and at https://twitter.com/julsimon.
speaker 0: 0:00
Hi everybody, this is Julien from AWS, and welcome to episode two of my podcast. It's a bit of a special episode today: I'm going to focus on TensorFlow 2.0 and how to run it on AWS. The reason I'm doing this is that TensorFlow 2.0 is now available on all our compute platforms, so you can easily run it on EC2, on the container services, and on SageMaker, so it's a good opportunity to cover all the bases. First, I'm going to give you a little bit of background information on TensorFlow. Then I'll explain why TensorFlow 2.0 is a real step forward and how it's different from TensorFlow 1.x. And then I'll show you how to get started with TensorFlow 2.0 on EC2, containers, and SageMaker. Let's get to work. As you probably know, TensorFlow is an open source library for machine learning and deep learning. The main API is in Python, and there's additional support for languages like Java, for example. It came out a little more than four years ago, and the first version, TensorFlow 1.x, has been extremely successful. I recently read a research report from an analyst company called Nucleus, and they tell us that TensorFlow is used in 74% of deep learning research projects, with PyTorch a distant second at 43%. So TensorFlow is really the number one library out there. Of course, over time a lot of features have been added to TensorFlow, and TensorFlow 2.0 came out at the end of September. So how is it different from TensorFlow 1.x? I think it's time for the whiteboard. TensorFlow 1.x uses a programming model called symbolic mode. Let me explain. Whiteboard, please. Here we go. Let's say we're trying to compute A multiplied by B, plus C. And of course, please bear in mind these are not integers or floating-point numbers: these are matrices, because when you're working with machine learning or deep learning, you're working with matrices.
Multi-dimensional arrays, to be precise, and the fancy word for those is, you guessed it, tensors. That's why the library is called TensorFlow. Anyway, let's not get bogged down in vocabulary. When you're working with symbolic programming, you first define the execution graph. So we would need two variables, A and B, a multiplication operator, a third variable, C, and an addition operator, and we would combine them in a graph, just like this: feed A and B into the multiplication operator, then feed the result of that, along with C, into the addition operator, and that gives us our result. If we were writing symbolic code, it would look something like this. We would define three named variables A, B, and C, storing no data at this point: they're just names for data that we'll provide later on. Then we would define a new variable D, which is the multiplication of A and B. Again, at this point this is just a definition; this is symbolic programming, so no actual processing is performed. Then E would be D plus C, and that builds the execution graph you see over there. Once the graph is fully defined, we compile it using a library function, and this gives us an actual function, let's call it F, that we can then apply to real values for A, B, and C. So we would invoke F, passing values for A, B, and C, and that would give us our result, let's call it Y. We can clearly see here why this programming model is called "define, then run": first we define the graph, and then we run the graph on the data we provide. So what's the problem with this?
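To make the define-then-run idea concrete, here's a tiny sketch in plain Python. To be clear, this is not actual TensorFlow 1.x code, just an illustration of the two-stage model: we first build a little "graph" of named variables and operators, and only later run it on real values.

```python
# Toy illustration of define-then-run (symbolic) programming.
# Stage 1: define the graph. Each node is a callable that will
# look up or compute its value only when the graph is finally run.

def variable(name):
    # A named placeholder: no data yet, just a name to resolve later.
    return lambda values: values[name]

def multiply(x, y):
    return lambda values: x(values) * y(values)

def add(x, y):
    return lambda values: x(values) + y(values)

# Define E = (A * B) + C. Nothing is computed at this point.
a, b, c = variable("a"), variable("b"), variable("c")
d = multiply(a, b)
e = add(d, c)

# Stage 2: "compile" the graph into a function, then run it on data.
f = e
y = f({"a": 2, "b": 3, "c": 4})
print(y)  # 10
```

A real symbolic framework would do much more at the compile step (operator fusion, memory planning), but the two-stage structure is the same.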
Well, the problem is that as you compile the graph, it's transformed into an internal representation, something that's highly optimized, optimized for speed and for memory consumption, and it probably looks nothing like your initial graph anymore. So that makes it really, really difficult to debug and inspect the code and to understand why it's not working the way you want it to. And I guess this contributes to the black-box problem around deep learning. For example, if we look at the graph again, we can see that D is actually pretty useless. Sure, we need D to store the intermediate result, but that's the only thing it does, and then we only use D on the next line to compute E. So you could say, well, maybe the memory we allocate for D can be reused by E: there's no reason to have memory allocated for both D and E. So a memory optimization would be to reuse the memory allocated for D for the E tensor, and save memory. This is just a very basic example, but this is the kind of thing graph compilation will do. The benefit is that you end up using less memory, and of course you run the graph faster, so you train faster. So that's symbolic programming: fast and efficient, but difficult to debug and difficult to understand. That was TensorFlow 1.x. Now let's talk about TensorFlow 2.0. In practice, the main difference between TensorFlow 2.0 and TensorFlow 1.x is that we can now shift from symbolic mode to imperative mode, which TensorFlow calls eager mode. So let's see what this does. Whiteboard, please. Let's look at the same calculation. The good news is, you already know what imperative mode is, because imperative mode is just writing and running code the way we've been doing it forever.
Writing a line of code at a time and running a line of code at a time. This is called "define by run". There are no two stages here: we just run the code, and it builds and runs the graph line by line. Here I'm using NumPy as an example, but it could be Java or C++; again, imperative mode is what you already know. So using NumPy, we would create three variables A, B, and C, three NumPy arrays with actual data, so the data is provided right there. Then we would create additional NumPy arrays, one for the multiplication, so D is A multiplied by B, and then E is D plus C, and of course we get a result. Now, if we look at what's happening, it's really running line by line. Every time we run a line, we create a new NumPy object, and all of them exist in memory. So A, B, C, D, and E are all inspectable, and this makes the code easier to understand and easier to debug. You know exactly what each line does; there's nothing happening magically. What you see is what you run and what you debug. That's the main difference: it's easier to understand, a more natural, more friendly way of writing code. Now, of course, the downside is that it's slower, because we have fewer opportunities, or possibly no opportunities, to optimize and do all the crazy stuff we can do on graphs. So that's eager mode, as TensorFlow calls it. Now, the good news is you actually get symbolic mode as well. You can start by writing your code in the imperative fashion, which is great for experimentation, debugging, et cetera, and then you can easily transform it, compile it, into the symbolic form and get the increased speed and the optimizations that go with it. So you don't have to pick: you can have your cake and eat it, or as we say in France, you can have cheese and dessert, which is nice.
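Here's what this looks like in TensorFlow 2.0 itself, a minimal sketch assuming TensorFlow 2.0 or later is installed: the eager lines run immediately and every intermediate tensor is inspectable, and `tf.function` compiles the same computation into a graph without rewriting it.

```python
import tensorflow as tf  # assumes TensorFlow 2.0 or later

# Eager mode: each line executes immediately, and every
# intermediate result is a concrete tensor you can print.
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[5.0, 6.0], [7.0, 8.0]])
c = tf.constant([[1.0, 1.0], [1.0, 1.0]])

d = tf.matmul(a, b)  # runs right away; d is inspectable
e = d + c            # runs right away; e is inspectable
print(e.numpy())

# The same computation compiled to a graph with tf.function,
# recovering symbolic-mode speed from the same imperative code.
@tf.function
def f(a, b, c):
    return tf.matmul(a, b) + c

print(f(a, b, c).numpy())
```

Both calls produce the same values; the decorated version is traced once into a graph and optimized on subsequent calls.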
So that's the biggest difference with TensorFlow 2.0. The other one I want to mention is that the Keras API, which used to be a separate library running on top of TensorFlow, is now fully integrated with TensorFlow, and I guess it's now the preferred API. You can use Keras at a very high level, or you can customize it heavily, more than you could in the past. So here too, you get more opportunities to experiment quickly, as well as to optimize and write custom code: custom training loops, custom layers, et cetera. Those two things, eager mode and full Keras integration, are really, really cool features. Okay, so now let's look at how you can run TensorFlow 2.0 on AWS. The analyst report I mentioned earlier also tells us that 85% of cloud-based TensorFlow workloads run on AWS. That's a nice number, and I guess it gives us a responsibility to make sure TensorFlow runs nicely on AWS. So let's take a look at the different ways you can do that. The first one is to run it on an EC2 instance, and to make that simple, we've built the Deep Learning AMIs. If you've never heard of AMIs, it means Amazon Machine Image, and it's basically the binary file that's used to create virtual machines on Amazon EC2. And yes, it's pronounced A-M-Is, not "armies". Don't get me started. Anyway, if you go to the AWS Marketplace, you'll find different AMIs already packaged. The one you want if you want TensorFlow 2.0 is version 26 or later. At the time of recording this is the latest version, but don't go and pick something older, because you'd miss TensorFlow 2.0. These are available for Amazon Linux 2 or Ubuntu 18, whatever suits you. So you can just select the AMI and launch an instance. You don't need me to show you how to launch an instance, which I've already done.
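To illustrate the Keras integration I mentioned above, here's a minimal tf.keras sketch, assuming TensorFlow 2.0 or later. It's in the spirit of the Fashion-MNIST notebook linked in the description, not a copy of it, and the layer sizes are just illustrative.

```python
import tensorflow as tf  # assumes TensorFlow 2.0 or later

# A minimal tf.keras classifier for 28x28 grayscale images with
# 10 classes (Fashion-MNIST-sized inputs), built with the
# high-level API now bundled inside TensorFlow.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(x_train, y_train, epochs=10) would train it; when you
# need more control, TF 2.0 Keras also lets you drop down to
# custom layers and custom training loops with tf.GradientTape.
```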
I've launched a g4 instance here, so I can SSH to my instance, and I can see the different environments that are available there. This is all managed by Conda, the package manager for Python. Let's select the TensorFlow 2 / Python 3.6 environment and activate it. And now, if I run Python 3, import TensorFlow, and look at the version: all right, it is TensorFlow 2.0, and you know what to do next. So this is one way of doing it: just fire up the latest version of the Deep Learning AMI, and it comes with TensorFlow 2.0 preinstalled. And of course, we update those AMIs very regularly, so you'll also get the future versions. Okay, so another way you can use TensorFlow 2.0 is with the Deep Learning Containers. The Deep Learning Containers are AWS-maintained containers that package the deep learning libraries that are available on the Deep Learning AMI. So we have MXNet versions, PyTorch versions, and TensorFlow versions, and there are separate containers for training and prediction. For the sake of simplicity, I'm going to keep working on this same instance, but of course this would work exactly the same on one of our container services, ECS or EKS, or on just any EC2 instance with Docker preinstalled. Okay, so the first step is to log in to Amazon ECR, the Docker registry service for AWS. All the images are stored in a specific account, and make sure you provide the right region. Next, you can just pull the image, and you'll find the list of image names in the Deep Learning Containers documentation. I already did that, because it's not really interesting to watch Docker images being pulled. So now my image is available, and I can easily run it, just like that, with docker run. And again, if I run Python and import TensorFlow, I should see that this is the proper version. Yes:
version 2.0. So, nothing fancy, just containers, but unless you really, really enjoy maintaining your own containers, why not give these a try? They might just save you some time. And of course, they come with optimized versions: we actually have a dedicated team working on optimizing TensorFlow on AWS, so this is not a vanilla version you're getting here, this is actually a pretty fast build. So how do you use TensorFlow 2.0 on SageMaker? Well, just like you used the previous versions. Nothing new to learn. This is a very simple notebook with a simple TensorFlow 2.0 script, which I'll put on GitLab, and of course you'll get all the information for that. So how do you use it? Well, remember that when you're training a TensorFlow script on SageMaker, you use the sagemaker.tensorflow.TensorFlow estimator. Basically, it takes your script as the first parameter, then your infrastructure requirements, so how many instances do you want, what instance type do you want, hyperparameters, et cetera, and the framework version. So there's a parameter actually called framework_version where you say, hey, I want to use TensorFlow 1.15, or I want to use something else. And now, as of yesterday, actually, according to the GitHub repository for the SageMaker SDK, you can say: hey, give me framework version 2.0.0, and that's it. In case you're wondering, you need SageMaker SDK 1.49 or later, so make sure you update your SDK to the latest version. This was pushed yesterday, but if you have 1.49 or later, you can now just say, please give me framework version 2.0.0, and that's it. Nothing fancy. And for the record, this hasn't officially been announced, not sure why, but hey, the code is out there, so the feature is available for all of you. And then deploying is exactly the same as well.
You would call .deploy() on your estimator to get a model deployed, and you'd be able to predict. So from a SageMaker perspective, the only difference is to use the new framework version. Well, I think that's it, I think that's what I wanted to show you today. So remember, three ways you can use TensorFlow 2.0: the Deep Learning AMI, and make sure you use version 26 or up; the Deep Learning Containers; and SageMaker, and make sure you use SDK 1.49 or up. Well, that's it for this episode. I hope you learned a few things. Merry Christmas and happy holidays to all of you out there, and I'll see you soon. Maybe we'll have a New Year's episode, who knows, anything's possible. It's AWS, it's machine learning, it's totally crazy. See you next time, and until then, keep rocking.
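Putting the SageMaker pieces together, here's a hedged sketch using the SageMaker Python SDK 1.x API (1.49.0 or later, as mentioned in the episode). The script name, role lookup, instance types, S3 path, and hyperparameters below are placeholders for illustration, not values from the episode, and the job only runs with valid AWS credentials and permissions.

```python
# Sketch: launch a TensorFlow 2.0 training job on SageMaker and
# deploy the resulting model. Placeholder values throughout.
import sagemaker
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(
    entry_point="fmnist.py",              # your TF 2.0 training script (placeholder name)
    role=sagemaker.get_execution_role(),  # IAM role for the training job
    train_instance_count=1,               # infrastructure requirements
    train_instance_type="ml.p3.2xlarge",
    framework_version="2.0.0",            # the new bit: requires SDK 1.49.0+
    py_version="py3",
    script_mode=True,
    hyperparameters={"epochs": 10},       # illustrative hyperparameter
)

# Train on data already uploaded to S3 (placeholder path).
estimator.fit("s3://my-bucket/fmnist")

# Deploying works exactly as with previous framework versions.
predictor = estimator.deploy(initial_instance_count=1,
                             instance_type="ml.m5.xlarge")
```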