Anastasiia Tymoshchuk - Can we deploy yet?

00:00

People from Germany might know you as the organizer of the PyBerlin meetups

00:12

and because you've been around PyCon DE a lot as well. And of course we've seen you at other

00:18

EuroPythons as well. Today you want to talk about how to make your code production ready.

00:24

Yes, exactly. That's a super interesting topic because when you do stuff in production like

00:31

now things can break all the time. So how to make sure that that works is interesting for us.

00:37

Are you ready to do your screen share? Yes, I am. I hope you have a nice talk.

00:42

So actually I'm going to talk about production ready code and whether our code is ready to be

00:48

deployed somewhere or not. A few words about myself. Thank you for a nice introduction.

00:55

I'm connecting from Berlin, and I've been a PyBerlin organizer for a year and a half already.

01:03

I have 10 years in software development and 7 years in Python. Actually there is a

01:10

very interesting story which I will quickly share before we start. I'm here in Python

01:16

because of the community, most of the time at conferences, and I'm super happy that once,

01:24

when I was a C++ developer, our customer said: now you are a Python developer,

01:31

you have to learn Python. I didn't understand how great it was because back then I was in

01:37

Ukraine and there was not much of a Python community scene at all. There was like C++, Java and

01:46

different meetups but not Python at all. And then when I moved to Berlin I realized that there is

01:52

so much to learn and to share with the community, and everyone is so supportive. So I'm super happy that

01:58

I'm here right now and thank you for all the support. I know it's remote. I don't see anyone

02:04

here right now but I can feel the support. Thank you so much. You're all invited to the next

02:09

PyBerlin meetup, and if you're willing to speak for us, feel free to drop us a message or

02:16

maybe just join the PyBerlin channel. So what exactly does production mean to you? What do you mean

02:24

by production-ready code? What's the difference between production-ready and non-production-ready code?

02:30

Actually I read the book Release It! It's great, and my favorite quote from this book is:

02:40

you work so hard on your project. It looks like all the features are actually complete and most

02:46

even have tests. You can breathe a sigh of relief. You're done. Or are you? There is no checklist

02:56

of what to go through to tell whether your code is ready for production or not. Can you deploy already to your

03:04

customers? So I prepared some checklists for you, and before I start I would like to emphasize

03:13

that the only difference between production and non-production code is that there is a customer

03:18

behind the computer or behind the device who is trying to run your code. The difference is that

03:26

we are developers. We know how our code is supposed to work. We know where to click, we know what

03:31

to do on the website or the application or whatever we are building but the customer doesn't know.

03:37

The only difference is that code could break, and it will break. We have to make sure that we know how

03:45

to investigate, how to find the problem and fix it, how to provide a solution to our customers,

03:51

and here are the checkpoints, what to check before going to production. So first of all,

03:59

I will talk about exception handling. There was also a very good talk yesterday about exception

04:05

handling and I will also share some tips about that and then how actually not to become

04:13

a detective to find the problems in your code, how to make your logging meaningful,

04:20

then how to set up your CI/CD pipeline, for which I prepared some beautiful examples, and then how to secure

04:29

your Docker images. After that I will show some links that you can follow later on.

04:37

So let's start with exceptions. We don't really want to show our customers a 500 error

04:45

because you can see this error and then you know that something went wrong on the

04:50

server side, but as a customer you cannot follow up on that. You can contact the administrator

04:57

of the website usually there is an email to contact and then what to do with that. We don't

05:02

really want to share this information with the customer. We want the customer to see some

05:08

proper exception handling: if something went wrong, then they would know what to do, how to follow up

05:13

on that. We are usually trying to handle exceptions in the manner that we are either silencing them

05:22

or maybe just using the try/except block to catch the exception and then printing something on

05:28

our side, on the server side, to see. So what's the proper way of handling exceptions?

05:38

We can try to catch Exception and BaseException. Catching Exception, according to the

05:44

official recommendation, is not a really good idea because it's

05:49

too broad, and catching BaseException is even dangerous, so it's highly discouraged to

05:55

catch BaseException. So guess what will be printed here: if you are trying to get some

06:04

input from the terminal, you run this code, and you're trying to catch Exception and

06:10

BaseException. Then you interrupt your script, and what would be printed next?

06:16

You will get the BaseException message printed on the screen, saying that a KeyboardInterrupt happened.
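
The slide itself is not part of this transcript, so the following is only a minimal sketch of the behavior described. As an assumption, the Ctrl+C during input() is simulated by raising KeyboardInterrupt directly, and the function name is hypothetical.

```python
def which_clause_catches(exc_type):
    """Raise exc_type and report which except clause handled it."""
    try:
        # Stand-in for input() being interrupted with Ctrl+C (an assumption).
        raise exc_type("simulated")
    except Exception:
        return "except Exception"
    except BaseException:
        return "except BaseException"


for exc_type in (ValueError, KeyboardInterrupt):
    print(f"{exc_type.__name__}: handled by {which_clause_catches(exc_type)}")
```

An ordinary error such as ValueError stops at the first clause, while KeyboardInterrupt skips it and is only caught by the BaseException clause.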

06:23

Why is it happening? Let's take a look into the hierarchy of exceptions. It's way longer,

06:28

I just cropped the top section. So basically BaseException covers more than Exception. That's

06:36

why it's not recommended to try to catch BaseException, because you might catch

06:43

more than just the exception which you're trying to catch. You will also get KeyboardInterrupt,

06:48

SystemExit and GeneratorExit. So in this case we interrupted our script and then we got

06:54

KeyboardInterrupt. In this case, if you silence the exception, then what will happen next?

07:02

Catching BaseException is highly discouraged. So how to actually handle exceptions? Let's try to

07:08

take a look at this example. We're trying to catch an exception of the Exception class and then we will

07:15

print the exception on the screen, and then we need to change the message of the exception to have

07:22

something different, like my custom message, and then we will see on the screen: during

07:27

handling of the above exception, another exception occurred. That's not the ideal way because you

07:33

were trying to handle an exception and then you failed while handling the exception. That could happen

07:39

also if you're trying to do some actions which might throw an exception in the handler, but this

07:48

is highly, highly discouraged. So what can we do if we really need to change the exception

07:56

message? We can use from. On the last line you can see that we are trying to raise the exception with

08:02

the custom message, so it's going to be a different exception, but we're raising it from

08:08

the exception which we already got. So on the screen we will see: the above exception was the

08:15

direct cause of the following exception, and then you can see the correct traceback, you will see

08:20

the previous exception and the correct exception and then you will see your awesome printed message.
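
The code on the slide is not in the transcript. A minimal sketch of raising with from could look like this; ConfigError, parse_port and the message text are hypothetical names chosen only for illustration.

```python
class ConfigError(Exception):
    """Hypothetical custom exception, used only for this illustration."""


def parse_port(raw):
    """Convert raw to an int, re-raising failures with a custom message."""
    try:
        return int(raw)
    except ValueError as exc:
        # "from exc" stores the original error in __cause__, so the traceback
        # reads "The above exception was the direct cause of the following
        # exception" instead of "During handling of the above exception,
        # another exception occurred".
        raise ConfigError(f"my custom message: invalid port {raw!r}") from exc


try:
    parse_port("eighty")
except ConfigError as err:
    print(f"caught: {err}")
    print(f"caused by: {err.__cause__!r}")
```

Keeping the original exception as the cause preserves both tracebacks, which is what makes the failure easier to investigate later.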

08:30

The best way to handle exceptions is to be more specific about the exceptions. For example you can

08:36

create your custom exceptions with a custom message and then you can raise your custom

08:41

exceptions whenever it's needed. You can try to catch your custom exceptions or specific exceptions

08:47

to be more specific, not to use the broad version of just handling Exception, and then

08:54

you can still print them on the screen. Last year at EuroPython in Basel I saw a talk

09:03

by Mario Corchero, Exceptional Exceptions. It was really, really good. There are more tips to learn

09:10

and you can see the video from PyCon US here; highly recommended to watch it and to go through

09:16

all of the tips. The next step is logging. How to make your logging meaningful? Let's take a look

09:25

at what logs are. In the Twelve-Factor App, which was super popular a few years ago, everyone was

09:32

talking about the Twelve-Factor App. Nowadays I hear some talks about it, some people mention it,

09:39

but not that much anymore, but actually they have really good content, so feel free to go to

09:46

the Twelve-Factor App and check the logs section. So they're saying that we have to treat logs as

09:53

event streams, and logs provide visibility into the behavior of a running application.

10:03

So let's take a look at the basics of logs. What should a log include? I did some research and then

10:10

I found out that the main logging attributes are when, where, what, who and the outcome itself. So

10:21

when would be just the timestamp when the log entry happened, then where did it happen?

10:29

Maybe a path to the file, then what actually happened, whether it was an exception or maybe just information.

10:37

Then who did this action: if there is a user, then maybe some kind of ID information, but no

10:45

customer information at all. I'm sure that you're following all the GDPR principles and you're

10:52

not logging any customer information, and then it's also important to have an outcome, the message itself.
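
As a rough illustration of these attributes, here is a sketch using only the standard library logging module. The logger name, the user_id field and the message are assumptions made up for this example, not taken from the talk.

```python
import logging
import sys


def build_logger(stream=sys.stderr):
    """Return a logger whose lines carry when, where, what, who and outcome."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s "            # when: timestamp of the log entry
        "%(name)s:%(lineno)d "    # where: logger name and line number
        "%(levelname)s "          # what: exception, info, ...
        "user_id=%(user_id)s "    # who: an opaque ID, no customer data
        "%(message)s"             # outcome: the message itself
    ))
    logger = logging.getLogger("talk_demo")  # hypothetical logger name
    logger.handlers.clear()  # avoid duplicate lines if called more than once
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


logger = build_logger()
# With this format string, every call has to pass user_id through "extra".
logger.info("order confirmed", extra={"user_id": "u-1042"})
```

The user ID here is an opaque identifier rather than a name or email address, in line with the point about not logging customer information.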

10:59

So how can we improve our logs? Usually we have this type of logs: we are using the standard

11:07

logger and then we are putting everything into one log message. So I added here some important

11:13

information about the conference name, the talk name and also some random key ID. After printing this

11:22

we will see something like this. Then we have a log level, we have where this log message

11:28

happens, we can also add the timestamp super easily and then we have the entire message. This is

11:34

not super useful if we're trying to find out what actually happened on our system. So how can we

11:40

improve that and how can we make the log message more parsable for the DevOps team? If we have a

11:49

DevOps team, or we are our own DevOps, and we're trying to build some dashboards on top of the log messages.

11:57

In this case we can't really check by the conference name, because for example if we split

12:05

the log message by commas, then we will still get some random stuff if for example in the talk name

12:14

we have a comma inside of the talk name. If we try to separate by, I don't know, some other ID or

12:24

maybe we will have ID friendly then we need to make sure that everyone uses the same structure of

12:31

the logs, the same quality of the logs, which is quite hard to establish. So what I actually tried

12:40

in my teams: we use structlog for that, and there are also great talks about structlog

12:48

and there are lots of examples in the official documentation. I will show you some examples which I

12:55

used myself. So it's super easy you just get the logger instance and then you try to use the same

13:05

we are trying to use the same information from the previous log so we will have also key ID

13:11

conference name and talk name but the structure in here is a key value pair. So what will we get

13:19

actually? We will have the log message and then keywords. We can easily parse by the name of the

13:28

keyword and then we can get the value to build a dashboard on that which will show us way more

13:36

than just going through millions of logs from just our servers. So yes definitely recommended to

13:46

use structlog, and let's take a look at the features of it. For example here we can bind

13:58

some important keys to every log entry. In this case for example I have the entire big file

14:08

with all the log entries, like info, exception, whatever we need to log, and then I need to have

14:16

conference name, talk name and key ID in every log entry. Obviously I would not write

14:24

just copy-paste the entire line into every log message. It's possible to use just

14:32

bind and then in every log message we will have the same keywords already there so we just don't

14:38

need to specify them over and over again through all the log messages. Then we will get the same

14:47

exception handling and then we will see our beautiful log message. For the development we can have

14:56

this highlighting on, we can use Colorama for that, and for servers we just don't need that. Also we

15:06

can use JSON format for logs; then we will see event as the message of the log, then we will see the

15:14

EuroPython custom log level. I will show that in a minute and then the name of the logger and the

15:21

timestamp and many more. So let me show you some code. I hope you can see my screen with

15:34

PyCharm. This is the basic idea. Then let's take a look at the processors. So what do we have in

15:53

here? We have some password in the log message which we wanted to print but actually we don't want

16:01

to have any customer information in here. Because of that we can use a super custom processor. We

16:08

can write the processor ourselves and then after that we will have the password censored. We can

16:14

also use tracing with structlog. We can, for example, pass the trace ID from one service to the

16:24

other one and then just add it to the log message which would show us how this actually

16:32

ever transferred from one message to the other one. And also many more features you can

16:39

add so many processors and features into this one. But we are running out of time. So I want to

16:46

show you a bit more of the different features. I wrote everything in the blog post. I will share

16:53

links later on so you can follow all the ideas later on. And this is a really good talk about

17:07

structlog with more examples. Effective CI/CD. I prepared also some examples for you and I

17:14

will share them after this talk. Continuous integration and why do we need it. It will provide you

17:23

the test coverage, reliability, fault isolation, transparency, code quality, faster development,

17:29

and code review improvements. You can automate everything. There are different examples of how to

17:36

integrate your checks into the CI. And I will show you some super tiny example with the GitHub

17:42

Actions, which is super easy to set up. You just go to Actions and then you click on the

17:49

specific CI which is already proposed and then you can edit it easily. I have chosen the Python

17:55

application setup. Here is the link to the test setup. You can check it out afterwards and I will show

18:04

you how it looks at the moment. So basically we have GitHub workflows and then we can have our

18:12

beautiful workflow. I added everything in here. The only difference between those different

18:18

entries is this continue-on-error. If you want to make sure that a developer cannot really merge

18:25

the code before the CI is passing, you have to delete this line, continue-on-error.

18:34

But if you don't really care, you just want to see how it looks on the CI then you can just

18:40

continue on error and run all of the checks before they're fixed. But then we can't really make sure

18:45

that CI is actually passing. So in here we have the CI part and then here are all the jobs.

18:55

Basically this one is the only one which is mandatory for my code and all the other ones are not

19:01

really passing because I didn't really care to fix them, just to show you the difference.

19:10

Let's check Pylint. Pylint is not passing because there is something which is missing here.

19:18

Also isort is not passing, but actually my code was already merged because I didn't make that

19:23

mandatory. So make sure that those steps which are important for you to check are mandatory.

19:28

And there is also a really nice tool for documentation coverage, which is called interrogate.

19:36

You can check that a bit later. I don't have any documentation right now.

19:42

So it's obviously failing because the actual result is zero. You can use it like this,

19:49

interrogate and then just specify what you want to check, whether there is enough documentation.

19:59

Then the Dockerfile. Just a couple of words about the Dockerfile, how to secure your Dockerfile

20:06

and Docker images. I also prepared a lot of examples which you can check later on after the

20:12

talk. So the basic advice is not to use the root user. Then use trusted, well-known images,

20:22

just check on Docker Hub if the image has enough downloads and it's an official source.

20:31

Then use COPY instead of ADD. Although they have super similar functionality,

20:40

ADD can have extra functionality which you will not expect. So make sure that you're using COPY.

20:46

And then try to lint your Dockerfile. So you can use that on the CI. It's super easy and simple.

20:53

I use one tool which is called Hadolint. It's a Dockerfile linter. It will lint your code

21:03

super easily. It will tell you what's wrong in the code. This is just an example for them. It's super easy

21:10

to set up on the CI. So in this Dockerfile I made a mistake. I did pip install and then

21:20

just the package name. I didn't pin the dependencies and because of that my linter advised me to

21:27

change that. Once I changed it, it was passing. Also this linter will let you know whenever

21:33

you're using ADD instead of COPY. So you don't need to worry about that as well. And the last

21:41

but not least you can also check vulnerabilities in your code. I didn't try it myself. But I will

21:50

definitely try and I saw that there is already an interesting tool which is sort of for free and

21:58

I'm going to give it a try. And a few more hints on how to use a non-root user in Dockerfiles.

22:10

You need to create a user. You need to create a group. And why do we actually do that all?

22:16

Because we would like to follow the principle of least privilege here, and that means that we

22:22

should give access only to the resources needed to perform the required functions. So if they need to

22:30

perform just a function with a non-root user, obviously we will not give them root access.
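
To make two of the rules above concrete (COPY instead of ADD, and a non-root user), here is a toy checker. It is not how Hadolint works and it covers far less; the function name, the sample Dockerfiles and the simplification that every build stage starts as root are all assumptions for illustration.

```python
def check_dockerfile(text):
    """Return warnings for two rules from the talk: COPY over ADD, non-root USER."""
    warnings = []
    current_user = None  # simplification: assume root until a USER line appears
    for number, line in enumerate(text.splitlines(), start=1):
        parts = line.strip().split(None, 1)
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        instruction = parts[0].upper()
        argument = parts[1] if len(parts) > 1 else ""
        if instruction == "FROM":
            current_user = None  # a new build stage resets the user
        elif instruction == "ADD":
            warnings.append(f"line {number}: prefer COPY over ADD")
        elif instruction == "USER":
            current_user = argument.split(":", 1)[0].strip()
    if current_user in (None, "root", "0"):
        warnings.append("final stage runs as root: add a non-root USER")
    return warnings


# Hypothetical Dockerfiles, written only for this example.
RISKY = """\
FROM python:3.12-slim
ADD app.py /app/app.py
CMD ["python", "/app/app.py"]
"""

SAFER = """\
FROM python:3.12-slim
RUN groupadd --system app && useradd --system --gid app app
COPY app.py /app/app.py
USER app
CMD ["python", "/app/app.py"]
"""

for name, content in (("RISKY", RISKY), ("SAFER", SAFER)):
    print(name, check_dockerfile(content))
```

The second sample creates a group and a user and switches to that user before the command runs, which is the least-privilege setup described above.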

22:37

And I have some book recommendations for further reading. And also you can find all

22:47

the links here in my personal blog. So thank you for listening and I believe we don't have so much

22:57

time for questions. No, actually we're very good on time at the moment but I don't see many questions

23:05

at the moment because in the chat nobody has asked any. I think you just gave them an overview

23:12

so that they will have the time to think about this. I really like the Docker tips that you gave

23:18

and there was also another Docker talk yesterday. So if somebody is into Docker then they should

23:24

look at the other talks as well. So it was a really good talk. Highly recommended to check it out.

23:30

Well, yes. So the only thing that's left for me is to give you a round of applause.


Description

The speaker begins by explaining that when working on projects in production, things can break easily. As a result, it's essential to make sure that your code is production-ready before deploying it. They share their experience as an organizer of PyBerlin meetups and how they've seen people struggle with this issue. The speaker then discusses the importance of testing and ensuring that code is reliable and scalable before deployment. They also emphasize the need to keep in mind the potential for things to break in production and to plan accordingly. The speaker concludes by inviting viewers to the next PyBerlin meetup and encouraging them to share their experiences with the community.