A Year in Flink - Flink SQL Live Coding

Created with glancer

Hi, welcome everyone to this A Year in Flink special edition. As I said, I'm doing live coding, but first of all I want to take a brief look at what happened over the past year.

If we look at the website visitors of flink.apache.org, we see that from 2016 to now there has been a steady stream of increasing users. If you look at it closely, you will see a quite steady stream up until, let's say, the second half of 2018, and then it ramps up a bit. So there seem to be quite a lot of users who look into Flink, who work with Flink, and who are using our website as a useful resource.

Then, when we look at what happened on the development side of things: just in September last year, Flink 1.9.0 was released. That's a bit outside this last year's scope, but since then a lot has happened. There have been two major releases: Flink 1.10 came out in February, and Flink 1.11 came out in July. So, roughly speaking, there is a major Flink release every five months. And Flink developers have not only been focusing on Flink, or the core Flink distribution, but also released the Stateful Functions library into the open. In April this year, the Stateful Functions API became part of the Apache Software Foundation, and the first release was created then. Stateful Functions is actually on a bit of a faster pace, delivering more up-to-date, more recent features.

But wait, I collected those statistics manually, by hand. Why would I? It would be better to automate this, right? And this is actually what I want to do for the rest of the talk. So let's look at something that gives us the statistics in a better way, so that I don't have to collect the data manually. And for that I chose to have a look at Flink SQL, because its focus is on logic, not on the implementation. It is useful for batch processing as well as stream processing, it aims at maximizing developer speed and the autonomy of the user, and thus it should be useful for getting quick results, exploring your data, and at some point also getting into full-fledged streaming applications.

So what is Flink SQL, in case you haven't dug into it yet? SQL on its own is a declarative API. On the left-hand side you see a statement, and if you give Flink that statement, it will basically translate it into a job. A job, in Flink terms, is a composition of operators that exchange data: you start reading from something, then you do transformations, and eventually you have something that you put on the output. That's the streaming graph that the SQL query will be translated into.
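As a rough illustration of that translation (a minimal sketch, not from the talk — the table name, columns, and the datagen connector are my assumptions, and I'm assuming Flink 1.11-era APIs), you can ask a TableEnvironment to explain the plan a statement translates into:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ExplainSketch {
    public static void main(String[] args) {
        // A streaming TableEnvironment (the Blink planner is the default in 1.11).
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // A throwaway source table so the statement has something to read from.
        tEnv.executeSql(
                "CREATE TABLE transactions (t_type STRING, amount DOUBLE) "
                        + "WITH ('connector' = 'datagen')");

        // Flink translates the declarative statement into a plan: the graph of
        // source -> transformation -> sink operators that would run as a job.
        System.out.println(tEnv.explainSql(
                "SELECT t_type, amount FROM transactions WHERE amount > 100"));
    }
}
```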
So it defines a program; it's not a database, it's just a way of defining a program. It can also be used to get materialized views, and to get the maintenance on them for free. Take this example: there is an input table of transactions, and we want a Flink job that does some aggregation. The job goes over the transactions table, counts the transactions, and groups them by type.
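That aggregation could look like the sketch below (again with my assumed names and the same 1.11-style setup as above, embedded in Java; the talk's actual tables may differ). The key point is that the result is a continuously updated table, not a one-off answer:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class TransactionCounts {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Stand-in for the real transactions table, which in practice might
        // be backed by Kafka or another connector.
        tEnv.executeSql(
                "CREATE TABLE transactions (t_type STRING, amount DOUBLE) "
                        + "WITH ('connector' = 'datagen')");

        // A continuous query: the grouped counts are updated as new
        // transactions arrive -- the "maintained materialized view" idea.
        tEnv.executeSql(
                "SELECT t_type, COUNT(*) AS cnt FROM transactions GROUP BY t_type")
                .print(); // prints the changelog of the updating result
    }
}
```

Incidentally, this sketch is also an instance of the "embed SQL in your Java code" style that comes up next.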
You then get a counter that you can use to serve dashboards. That's a common use case for Flink SQL as well: you get your real-time dashboards updated by a running Flink job. And what is there at the moment to work with Flink SQL? Well, you can write your Java or Scala code and just embed SQL in there, or use the Table API, and then