SiliconANGLE Report: Reporters' Notebook with Adrian Cockcroft | AWS re:Invent 2022
(soft techno upbeat music)

>> Hi there. Welcome back to Las Vegas. This is Dave Vellante with Paul Gillin. re:Invent day one and a half. We started last night, Monday, theCUBE after dark. Now we're going wall to wall. Today was of course the big keynote, Adam Selipsky, kind of taking the baton now. You know, last year when he did his keynote he was very new. He was sort of still getting his feet wet and finding his groove. Settling in a little bit more this year, learning a lot more, getting deeper into the tech, but of course sharing the love with other leaders like Peter DeSantis. Tomorrow's going to be Swami in the keynote. Adrian Cockcroft is here. Formerly with AWS, former Netflix cloud architect, currently an analyst. You've got your own firm now. You're out there. Great to see you again. Thanks for coming on theCUBE.

>> Yeah, thanks.

>> We heard you at Supercloud, you gave some really good insights there back in August. So now, as an outsider, you come in, and obviously you've got to be impressed with the size and the ecosystem and the energy, of course. What were your thoughts on what you've seen so far, today's keynotes, last night Peter DeSantis, what stood out to you?

>> Yeah, I think it's great to be back at re:Invent again. We're pretty much back to where we were before the pandemic sort of shut it down. This is almost as big as the largest one that we had before, and everyone's turned up. It just feels like we're back, so that's really good to see. And it's a slightly different style. I think there was more sort of video production happening in this keynote, more storytelling. I'm not sure it all stitched together very well. Some of the stories were like, how does that follow that? So there were a few things there, and there were spelling mistakes on the slides, you know, ELT instead of ETL, and they spelled ZFS wrong, and so on. It just seemed like, I'm not quite sure, maybe a few things were rushed at the last minute.

>> Not really AWS-like, was it? It kind of reminds me of the Patriots, Paul, you know, Bill Belichick's teams fumbling all over the place.

>> That's right. That's right.

>> Part of it may be, I mean, sort of the marketing. They have a leader in marketing right now, but they're going to have a CMO. So maybe it's the lack of a single-threaded leader for this thing; everything's being shared around a bit more. But, I mean, it's all fixable, and it's minor. This is minor stuff. I'm just sort of looking at it and going, there's a few things that looked like they were not quite as good as they could have been in the way it was put together, right?

>> But I mean, you're taking, you know, a year of not doing re:Invent. Being isolated. We've certainly seen it with theCUBE. It's like, okay, it's not like riding a bike. You've got to kind of relearn the muscle memories. It's more like golf than riding a bicycle.

>> Well, I've done AWS keynotes myself, and they are pretty much scrambled. It looks nice, but there's a lot of scrambling leading up to when it actually goes, right? And sometimes you can see a little of the edges of that, and sometimes it's much more polished. But overall it's pretty good. I think Peter DeSantis' keynote yesterday had a lot of really good meat in it. There were some nice presentations and some great announcements.
>> And today, I thought I was a little disappointed with some of it. I thought there could have been more. I think the way Andy Jassy did it, he crammed more announcements into his keynote, and Adam seems to be taking a bit more of a measured approach. There were a few things he picked up on, and I'm expecting more to be spread throughout the rest of the day.

>> This was more poetic, right? He took the universe as the analogy for data, the ocean for security, right? The Antarctic was sort of...

>> Yeah. It looked pretty.

>> Yeah.

>> But I'm not sure, we're not really here to watch nature videos.

>> As analysts and journalists, you're like, come on.

>> Yeah.

>> Give us the meat.

>> That was kind of the thing, yeah.

>> AWS, re:Invent, has always been a shock-and-awe approach: 100, 150 announcements. And that kind of pressure seems to be off them now. Their position at the top of the market seems to be unshakeable; there's no clear competition creeping up behind them. So how does that affect the messaging, do you think, that AWS brings to market when it doesn't really have to prove that it's a leader anymore? It can go after maybe more of the niche markets, or fix the stuff that's a little broken, more fine-tuning than grandiose statements.

>> I think so. AWS for a long time was so far out in front that they basically said, "We don't think about the competition, we listen to the customers." And that was always the statement, and it works as long as you're always in the lead, right? Because you are introducing the new idea to the customer; nobody else got there first. So that was the case. But in a few areas they aren't leading, right? You could argue they're not necessarily leading in machine learning, they're not leading in sustainability, and they don't want to talk about some of these areas, and--

>> Database, I mean, arguably.

>> They're pretty strong there. But in the areas where you're behind, it's like, they know how to play offense, but when you're playing defense it's a different game, and it's hard to be good at both. And I'm not sure that they're really used to following somebody into a market and making a success of that. So it's a little harder. Do you see what I mean?

>> I've got an opinion on this. So when I say database, David Floyer, two years ago, predicted AWS is going to have to converge somehow. They have no choice. And they sort of touched on that today, right? Eliminating ETL, that's one thing. But Aurora to Redshift.

>> Yeah.

>> You know, end to end. I'm not sure it's totally, fully end to end.

>> That is an excellent piece of work, because there's a lot of work that it eliminates. There are clear pain points. But then you've got the competing thing, like MongoDB, which is, just one database keeps it simple.

>> Snowflake.

>> Or with Snowflake, maybe. You've got all these 20 different things you're trying to integrate at AWS, but it's kind of like you have a bag of Lego bricks. It's my favorite analogy, right? You want a toy for Christmas, say a Formula One racing car, since that seems to be the theme, right?

>> Okay.

>> Do you want the fully built model that you can play with right now, or do you want the Lego version that you have to spend three days building, right? And AWS is the Lego Technic thing.
You have to spend some time building it, but once you've built it, you can evolve it, and those are still good bricks years later. Whereas the prebuilt one is probably broken and gathering dust, right? So there's something about having an evolvable architecture which is harder to get into, but more durable in the long term. And so AWS tends to play the long game in many ways, and that's one of the elements that they do well. But it makes it hard to consume for enterprise buyers that are used to getting it with a bow on top, "here's the solution," you know?

>> And Paul, that was always Andy Jassy's answer when we would ask him, you know, with all these primitives, are you going to make it simpler? He'd say the primitives give us the advantage to turn on a dime in the marketplace. And that's true.

>> Yeah. So you're saying, you take all these things together and you wrap it up, and you put a Snowflake on top, and now you've got a simple thing, or a Mongo, or Mongo Atlas, or whatever. So you've got these layered platforms now which are making it simpler to consume, but now you're stuck in that ecosystem. So it's like, what layer of abstraction do you want to tie yourself to, right?

>> Databricks coming at it from more of an open source approach, but it's similar.

>> We're seeing Amazon move more into vertical markets. They spotlighted what Goldman Sachs is doing on their platform. They've got a variety of platforms that are supposedly custom built for vertical markets. How successful do you see that play being? Is this something that customers are looking for, a fully integrated Amazon solution?

>> I think so. If you look at, you know, MongoDB or DataStax or Elastic, the ones with the specific solution, where the people that really are developing the core technology have an open source equivalent version that AWS is running, usually AWS maybe has a price advantage, or there's some data integration in there, or it's somehow easier to integrate, but it's not stopping those companies from growing. And what it's doing is endorsing that platform. So if you look at the collection of databases that have been around over the last few years, now you've got basically Elastic, Mongo, and Cassandra, you know, DataStax, being endorsed by the cloud vendors. These are winners. They're going to be around for a very long time; you can build yourself on that architecture. But what happened to Couchbase and a few of the other ones that don't really fit? How are you going to make it if you're now becoming an also-ran, because you didn't get cloned by the cloud vendor? The customers are asking, is that a safe place to be, right?

>> But don't they want to encourage those partners, though, in the name of building the marketplace ecosystem?

>> Yeah.

>> This is huge.

>> Certainly the platform encourages people to do more, and there's always room around the edge. But the mainstream customers, the ones really spending the good money, are looking for something that's got a long-term life to it, right? They're looking for a long commitment to that technology, and that it's going to be invested in and grow. And the fact that the cloud providers, particularly AWS, are adopting some of these technologies means that is a very long-term commitment.
You can bet your future architecture on that for a decade, probably.

>> So they have to pick winners.

>> Yeah, it's sort of picking winners. And then if you're the open source company that's now got AWS turning up, you have to leverage it and use that as a way to grow the market. And I think Mongo have done an excellent job of that. I mean, they're top-level sponsors of re:Invent, and they're out there messaging that and doing a good job of showing people how to layer on top of AWS and make it a win-win for both sides.

>> So ever since we've been in the business, you hear the narrative that hardware's going to die. It's just commodity, and there's some truth to that. But hardware's actually driving good gross margins for the Ciscos of the world. Storage companies have always made good margins. Servers maybe not so much, 'cause Intel sucked all the margin out of it. But let's face it, AWS makes most of its money, we know, on compute. It's got 25-plus percent operating margins, depending on the seasonality. What do you think happens long term at the infrastructure layer? Okay, commodity cloud, you know, we talk about supercloud. Do you think that for AWS and the other cloud vendors the infrastructure as a service gets commoditized, so they have to go upmarket, or do you see that continuing? I mean, history would say there are still good margins in hardware. What are your thoughts on that?

>> It's not commoditizing, it's becoming more specific. We've got all these accelerators and custom chips now, and this almost goes back. I mean, I was with Sun Microsystems 20, 30 years ago, and we developed our own chips, and HP developed their own chips, and SGI had MIPS, right? The architectures were all squabbling over who had the best processor chips, and it took years to get chips that worked. Now, if you make a chip and it doesn't work immediately, you screwed up somewhere, right? The technology of building these immensely complicated, powerful chips has become commoditized. So the cost of building a custom chip is now getting to the point where Apple and Amazon can do it: your Apple laptop has got full custom chips, your phone, your iPhone, whatever. You've got Google making custom chips, and we've got Nvidia now getting into CPUs as well as GPUs. So we're seeing that the ability to build a custom chip is something everyone is leveraging, and the cost of doing it is coming down to where startups are doing it. So we're going to see much more innovation, I think. Intel and AMD, you know, they've got the compatibility legacy, but the most powerful, most interesting new things, I think, are going to be custom. And we're seeing that with Graviton3, in particular the 3E that was announced last night, with like 30, 40 percent, whatever it was, more performance for HPC workloads. And the HPC market is going to have to deal with cloud. I mean, they are starting to. I was at Supercomputing a few weeks ago, and they are tiptoeing around the edge of cloud, but those supercomputers are water cooled. They are monsters. I mean, you go around Supercomputing and there are plumbing vendors on the show floor.

>> Of course. Yeah.

>> Right? And they're highly concentrated systems, and that's really the only difference: is it water cooled or air cooled? The rest of the technology stack is pretty much off-the-shelf stuff with a few software tweaks.
>> Your point about, you know, the chips and what AWS is doing. The Annapurna acquisition.

>> Yeah.

>> They're on a dramatically different curve now. I think it comes down to, again, David Floyer's premise: it really comes down to volume. The Arm wafer volumes are 10x those of x86, and volume always wins in the economics of semis.

>> That kind of got us there. But now there's also RISC-V coming along, because licensing is becoming one of the bottlenecks. If the cost of building a chip is really low, then it comes down to licensing costs, and do you want to pay the Arm license? RISC-V is an open source instruction set that some people are starting to use for things. So your disk controller may have a RISC-V core in it, for example, nowadays, those kinds of things. So I think that's the dynamic that's playing out. There's a lot of innovation in hardware to come in the next few years. There's a thing called CXL, Compute Express Link, which is going to be really interesting. I think that's probably two years out before we start seeing it for real. But it lets you glue together an entire rack in a very flexible way. And that's the entire industry coming together around a single standard, the whole industry except for Amazon, in fact, just about.

>> Well, but maybe, I think eventually they'll get there. Won't they do system-on-a-chip, CXL?

>> I have no idea, I have no knowledge about whether they're going to do anything with CXL.

>> Presuming, and I'm not trying to tap anything confidential, it just makes sense that they would do a system on chip. It makes sense that they would do something like CXL. Why not adopt the standard, if it's going to lower the cost?

>> Yeah. And so that was one of the things on the chip side. The other thing is the low-latency networking with the Elastic Fabric Adapter, EFA, and the extensions to that that were announced last night. They doubled the throughput, so you get twice the capacity on the Nitro chip. And then the other thing, this is a bit technical, is this scalable datagram protocol that they've got, which basically says, if I want to send a packet from one machine to another machine, instead of sending it over one wire, I send it over 16 wires in parallel, and I will just flood the network with all the packets, and they can arrive in any order. This is why it isn't done normally: TCP is in-order, the packets come in the order they're supposed to, but this is fully flooding them around, with its own fast retry, and then they get reassembled at the other end. And they're not just using this for HPC workloads now. They've turned it on for TCP, without any change to your application. If you are trying to move a large piece of data between two machines, and you're just pushing it down a single connection, it takes it from five gigabits per second to 25 gigabits per second. A five x speedup, with a protocol tweak that's run by the Nitro. This is super interesting.

>> Probably relevant to all that AI/ML stuff that's going on.

>> Well, the AI/ML stuff is leveraging it underneath, but this is for everybody. You're just copying data around, and hey, it's going to get there five times faster if you're pushing a big enough chunk of data around. So this is turning on gradually as Nitro v5 comes out, and you have to enable it at the instance level. But it's a super interesting announcement from last night.
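To make that multipath idea concrete, here's a toy sketch in Python. To be clear, this is not AWS's implementation; the real protocol (SRD, Scalable Reliable Datagram, which underpins EFA) runs in the Nitro card with its own retry and congestion control. The sketch only illustrates the core mechanism described above: number the packets, spray them across many paths so they arrive out of order, and reassemble by sequence number at the receiver.

```python
import random

def send_multipath(message: bytes, chunk: int = 4):
    """Split a message into numbered packets and 'spray' them across paths.

    Toy model of the multipath idea: because packets travel over many
    parallel paths with different latencies, arrival order is scrambled,
    so every packet carries a sequence number for reassembly.
    """
    packets = [(seq, message[i:i + chunk])
               for seq, i in enumerate(range(0, len(message), chunk))]
    random.shuffle(packets)  # stand-in for out-of-order multipath arrival
    return packets

def receive_multipath(packets):
    """Reassemble by sequence number, regardless of arrival order."""
    return b"".join(data for _, data in sorted(packets))

msg = b"a large piece of data moving between two machines"
assert receive_multipath(send_multipath(msg)) == msg
print("reassembled out-of-order packets correctly")
```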
>> So the bottom-line bumper sticker on commoditization is what?

>> I don't think it commoditizes. I mean, the APIs are the commodity level: you're Arm compatible, you're Intel x86 compatible, or maybe RISC-V compatible in the cloud one day. And the software ecosystem is super portable across those, as we're seeing with Apple moving from Intel to Arm. It's really not an issue, right? The software and the tooling are all there to do that. But underneath that, we're going to see an arms race between the top providers as they all try to develop faster chips for doing more specific things. We've got Trainium for training; they announced that instance last year with 800 gigabits going out of a single instance, and this year they doubled it. So 1.6 terabits out of a single machine, right? That's insane, right? But what you're doing is putting together hundreds or thousands of those to solve the big machine learning training problems, these enormous clusters that are being formed for doing these massive problems. And there is a market now for these incredibly large supercomputer clusters built for doing AI. That's all bandwidth limited.

>> And you think about the timeframe from design to tape-out.

>> Yeah.

>> It's just getting compressed.

>> It is.

>> And costs are going the other way.

>> The tooling is all there. Yeah.

>> Fantastic. Adrian, always a pleasure to have you on. Thanks so much.

>> Yeah.

>> Really appreciate it.

>> Yeah, thank you.

>> Thank you, Paul.

>> Cheers. All right, keep it right there, everybody. Don't forget, go to thecube.net, you'll see all these videos. Go to siliconangle.com, we've got features with Adam Selipsky, we've got my breaking analysis, and we have another feature with MongoDB's Dev Ittycheria, Ali Ghodsi, as well as Frank Slootman tomorrow. So check that out. Keep it right there. You're watching theCUBE, the leader in enterprise and emerging tech, right back.

(soft techno upbeat music)
API Gateways, Ingress, and Service Mesh | Mirantis Launchpad 2020
>> Thank you, everyone, for joining. I'm here today to talk about ingress controllers, API gateways, and service mesh on Kubernetes, three very hot topics that are also frequently confusing. I'm Richard Li, founder and CEO of Ambassador Labs, formerly known as Datawire. We sponsor a number of popular open source projects that are part of the Cloud Native Computing Foundation, including Telepresence and Ambassador, which is a Kubernetes-native API gateway, and most of what I'm going to talk about today is related to our work around Ambassador.

I want to start by talking about application architecture and workflow on Kubernetes, and how applications being built on Kubernetes really differ from how they used to be built. The traditional architecture is the very famous monolith: one central piece of software, one giant thing that you build, deploy, and run. The value of a monolith is that it's really simple, and the architecture really reflects the workflow. With the monolith, you have a very centralized development process. You tend not to release too frequently, because you have all these different development teams working on different features, and you decide in advance when you're going to release that particular piece of software; everyone works toward that release train. And you have specialized teams: a development team with all your developers, a QA team, a release team, an operations team. That's your typical development organization and workflow with a monolithic application.

As organizations shift to microservices, they adopt a very different development paradigm. It's a decentralized paradigm where you have lots of different independent teams simultaneously working on different parts of the application, and those application components are shipped as independent services. So you really have a continuous release cycle, because instead of synchronizing all your teams around one release vehicle, you have so many different release vehicles that each team is able to ship as soon as it's ready. We call this full cycle development, because each team is responsible not just for the coding of its microservice, but also for the testing, release, and operations of that service.

This is a huge change, particularly for workflow, and there are a lot of implications. I have a diagram here that visualizes the difference in organization. With the monolith, everyone works on the monolith. With microservices, the yellow folks work on the yellow microservice, the purple folks work on the purple microservice, and maybe just one person works on the orange microservice, and so forth. So there's a lot more diversity around your teams and your microservices, and it lets you adjust the granularity of your development to your specific business need.

So how do users actually access your microservices? Well, with the monolith it's pretty straightforward. You have one big thing, so you just tell the Internet, "I have this one big thing on the Internet. Make sure you send all your traffic to the big thing."
But when you have a bunch of different microservices, how do users actually access them? The solution is an API gateway. The API gateway consolidates all access to your microservices. Requests come in from the Internet, they go to your API gateway, and the API gateway looks at these requests and, based on their nature, routes them to the appropriate microservice. And because the API gateway centralizes access to all the microservices, it also really helps you simplify authentication, observability, routing, all these crosscutting concerns. Instead of implementing authentication in each of your microservices, which would be a maintenance nightmare and a security nightmare, you put all your authentication in your API gateway. So in this world of microservices, API gateways are a really important part of your infrastructure, which is really necessary, whereas pre-microservices, pre-Kubernetes, an API gateway was much more optional. That's one of the really big things to recognize with a microservices architecture: you really need to start thinking much more about an API gateway.

The other consideration with an API gateway is your management workflow, because, as I mentioned, each team is responsible for its own microservice, which also means each team needs to be able to independently manage the gateway. Team A, working on its microservice, needs to be able to tell the API gateway, "This is how I want you to route requests to my microservice," and the purple team needs to be able to say something different for how purple requests get routed to the purple microservice. So that's also a really important consideration as you think about API gateways and how they fit into your architecture, because it's not just about your architecture, it's also about your workflow.

So let me talk about API gateways on Kubernetes. I'm going to start by talking about ingress. Ingress is the process of getting traffic from the Internet to services inside the cluster. Kubernetes, from an architectural perspective, actually has a requirement that all the different pods in a Kubernetes cluster need to be able to communicate with each other. As a consequence, Kubernetes creates its own private network space for all these pods, and each pod gets its own IP address. This makes things very simple for inter-pod communication. Kubernetes, on the other hand, does not say very much about how traffic should actually get into the cluster. There's a lot of detail around how traffic gets routed around once it's inside the cluster, and it's very opinionated about how that works, but for getting traffic into the cluster there are a lot of different options and multiple strategies: there's pod IP, there's ingress, there's LoadBalancer resources, there's NodePort.

I'm not going to go into exhaustive detail on all these options; I'm going to just talk about the most common approach that most organizations take today. The most common strategy for routing is coupling an external load balancer with an ingress controller. An external load balancer can be a hardware load balancer, it could be a virtual machine, it could be a cloud load balancer.
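In the cloud case, that external load balancer is usually provisioned by declaring a Kubernetes Service of type LoadBalancer in front of the ingress controller's pods. Here's a minimal sketch, written as a Python dict and printed as the YAML you would kubectl apply; the name, namespace, labels, and ports are hypothetical, and the PyYAML package is assumed.

```python
import yaml  # PyYAML, assumed installed: pip install pyyaml

# A Service of type LoadBalancer asks the cloud provider for an external
# load balancer and points it at the ingress controller's pods, selected
# here by a hypothetical app label. Ports 80/443 forward to the controller.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "ingress-controller", "namespace": "ingress"},
    "spec": {
        "type": "LoadBalancer",
        "selector": {"app": "ingress-controller"},
        "ports": [
            {"name": "http", "port": 80, "targetPort": 8080},
            {"name": "https", "port": 443, "targetPort": 8443},
        ],
    },
}

print(yaml.safe_dump(service, sort_keys=False))
```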
But the key requirement for an external load balancer is to be able to attach a stable IP address, so that you can map a domain name in DNS to that external load balancer. The external load balancer, usually but not always, then passes that traffic straight through to your ingress controller, and your ingress controller takes that traffic and routes it internally inside Kubernetes to the various pods that are running your microservices. There are other approaches, but this is the most common one, and the reason is that the alternative approaches really require each of your microservices to be exposed outside the cluster, which causes a lot of challenges around management, deployment, and maintenance that you generally want to avoid.

So I've been talking about the ingress controller. What exactly is an ingress controller? An ingress controller is an application that can process rules according to the Kubernetes ingress specification. Strangely, Kubernetes does not actually ship with a built-in ingress controller. I say strangely because you'd think, well, getting traffic into a cluster is probably a pretty common requirement. And it is. It turns out that this is complex enough that there's no one-size-fits-all ingress controller. So there is a set of ingress rules, part of the Kubernetes ingress specification, that specify how traffic gets routed into the cluster, and then you need a proxy that can actually route this traffic to these different pods. An ingress controller really translates between the Kubernetes configuration and the proxy configuration.
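For a concrete picture of what such a rule looks like, here's a minimal sketch that builds an ingress resource and prints the YAML you would hand to kubectl. The host, path, and service names are hypothetical; it assumes the networking.k8s.io/v1 schema, which stabilized around the time of this talk, and again assumes PyYAML.

```python
import yaml  # PyYAML, assumed installed

def make_ingress(host: str, path: str, service: str, port: int) -> dict:
    """Build a minimal Kubernetes Ingress resource as a plain dict.

    The ingress spec only expresses simple host/path -> service routing;
    anything richer (timeouts, rate limits, gRPC) is left to
    controller-specific extensions, as discussed below.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": f"{service}-ingress"},
        "spec": {"rules": [{
            "host": host,
            "http": {"paths": [{
                "path": path,
                "pathType": "Prefix",
                "backend": {"service": {"name": service,
                                        "port": {"number": port}}},
            }]},
        }]},
    }

# Hypothetical rule: route example.com/orders to the orders service.
print(yaml.safe_dump(make_ingress("example.com", "/orders", "orders", 8080)))
```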
Common proxies for ingress controllers include HAProxy, Envoy Proxy, and NGINX, so let me talk a little bit more about these common proxies. There are many other proxies; I'm just highlighting what I consider to be probably the three most well established: HAProxy, NGINX, and Envoy. HAProxy is managed by HAProxy Technologies, started in 2001. The HAProxy organization actually creates an ingress controller, and before they created that ingress controller, there was an open source project called Voyager which built an ingress controller on HAProxy. NGINX, managed by NGINX, Inc., subsequently acquired by F5, is also open source and started a little bit later, the proxy in 2004. There's the NGINX ingress, which is a community project, and that's the most popular, as well as the NGINX Inc. Kubernetes ingress project, which is maintained by the company. This is a common source of confusion, because sometimes people will think they're using the NGINX ingress controller, and it's not clear whether they're using the commercially supported version or the open source version; although they have very similar names, they actually have different functionality. Finally, Envoy Proxy, the newest entrant to the proxy market, was originally developed by engineers at Lyft, the ride-sharing company, who subsequently donated it to the Cloud Native Computing Foundation. Envoy has become probably the most popular cloud native proxy. It's used by Ambassador, the API gateway; it's used in the Istio service mesh; it's used by VMware Contour; it's been used by Amazon in App Mesh. It's probably the most common proxy in the cloud native world.

So, as I mentioned, there are a lot of different options for ingress controllers. The most common is the NGINX ingress controller, not the one maintained by NGINX, Inc., but the one that's part of the Kubernetes project. Ambassador is the most popular Envoy-based option. Another common option is the Istio Gateway, which is directly integrated with the Istio mesh, and that's actually part of Docker Enterprise.

So with all these choices around ingress controllers, how do you actually decide? Well, the reality is that the ingress specification is very limited, and the reason is that getting traffic into the cluster involves a lot of nuance. It turns out to be very challenging to create a generic one-size-fits-all specification, because of the vast diversity of implementations and choices available to end users. So you don't see ingress specifying anything around resilience: if you want to specify a timeout or rate limiting, it's not possible. Ingress is really limited to support for HTTP, so if you're using gRPC or WebSockets, you can't use the ingress specification. Different ways of routing, authentication, the list goes on and on. So what happens is that different ingress controllers extend the core ingress specification to support these use cases in different ways.

NGINX ingress actually uses a combination of ConfigMaps and the ingress resource, plus custom annotations that extend the ingress to let you configure a lot of additional extensions, exposing the underlying NGINX capabilities. With Ambassador, we actually use custom resource definitions, different CRDs that extend Kubernetes itself, to configure Ambassador. One of the benefits of the CRD approach is that we can create a standard schema that's actually validated by Kubernetes. So when you do a kubectl apply of an Ambassador CRD, kubectl can immediately validate and tell you whether you're actually applying a valid schema and format for your Ambassador configuration. As I previously mentioned, Ambassador is built on Envoy Proxy.
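For contrast with the annotation approach, here's a sketch of the CRD style just described: an Ambassador Mapping that routes a URL prefix to a service. The fields follow the getambassador.io/v2 Mapping schema as I understand it from that era, the names are hypothetical, and as above, Python is only used to emit the YAML you would kubectl apply.

```python
import yaml  # PyYAML, assumed installed

def make_mapping(name: str, prefix: str, service: str) -> dict:
    """Build an Ambassador Mapping CRD as a plain dict.

    Because Mappings are CRDs with a registered schema, kubectl can
    validate them at apply time, and each team can own the Mapping
    for its own microservice.
    """
    return {
        "apiVersion": "getambassador.io/v2",
        "kind": "Mapping",
        "metadata": {"name": name},
        "spec": {
            "prefix": prefix,    # match requests under this URL prefix
            "service": service,  # Kubernetes service to route them to
        },
    }

# Hypothetical per-team route: the purple team claims /purple/.
print(yaml.safe_dump(make_mapping("purple-mapping", "/purple/", "purple-svc")))
```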
Istio Gateway also uses CRDs; it uses an extension of the service mesh CRDs as opposed to dedicated gateway CRDs, and again, Istio Gateway is built on Envoy Proxy.

So I've been talking a lot about ingress controllers, but the title of my talk was really about API gateways and ingress controllers and service mesh. What's the difference between an ingress controller and an API gateway? To recap, an ingress controller processes Kubernetes ingress routing rules, and an API gateway is a central point for managing all your traffic to Kubernetes services. It typically has additional functionality such as authentication, observability, a developer portal, and so forth. So what you find is that not all API gateways are ingress controllers, because some API gateways don't support Kubernetes at all, so they can't be ingress controllers. And not all ingress controllers support the functionality, such as authentication, observability, or a developer portal, that you would typically associate with an API gateway. Generally speaking, API gateways that run on Kubernetes should be considered a superset of an ingress controller, but if the API gateway doesn't run on Kubernetes, then it's an API gateway and not an ingress controller.

So what's the difference between a service mesh and an API gateway? An API gateway is really focused on traffic into and out of a cluster; the term for this is north-south traffic. A service mesh is focused on traffic between services in a cluster, east-west traffic. All service meshes need an API gateway: Istio includes a basic ingress, or API gateway, called the Istio Gateway, because a service mesh needs traffic from the Internet to be routed into the mesh before it can actually do anything. Envoy Proxy, as I mentioned, is the most common proxy for both meshes and gateways.

Docker Enterprise provides an Envoy-based solution out of the box, the Istio Gateway. The reason Docker does this is because, as I mentioned, Kubernetes doesn't come packaged with an ingress, and it makes sense for Docker Enterprise to provide something that's easy to get going with, no extra steps required, because with Docker Enterprise you can deploy it and get exposed on the Internet without any additional software. Docker Enterprise can also be easily upgraded to Ambassador, because they're both built on Envoy and ingress, with consistent routing semantics. With Ambassador, you also get greater security for single sign-on, a lot of security by default that's configured directly into Ambassador, and better control over TLS, things like that. And then, finally, there's commercial support available for Ambassador; Istio is an open source project that has a very broad community but no commercial support options.

So to recap: ingress controllers and API gateways are critical pieces of your cloud native stack, so make sure you choose something that works well for you. I think a lot of times organizations don't think critically enough about the API gateway until they're much further down the Kubernetes journey. Considerations around how to choose an API gateway include functionality, such as how it handles traffic management and observability and whether it supports the protocols you need, and also nonfunctional requirements, such as whether it integrates with your workflow and whether you can get commercial support for it. An API gateway is focused on north-south traffic, into and out of your Kubernetes cluster. A service mesh is focused on east-west traffic, between different services inside the same cluster. Docker Enterprise includes Istio Gateway out of the box, which is easy to use but can also be extended with Ambassador for enhanced functionality and security. So thank you for your time. I hope this was helpful in understanding the difference between API gateways, ingress controllers, and service meshes, and how you should be thinking about them in your Kubernetes deployment.
Jerome Hardaway, Vets Who Code | CUBE Conversation, July 2020
(soft music)

>> From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation.

>> Hi, I'm Stu Miniman, coming to you from our Boston area studio for a CUBE Conversation. I really like when we can dig in and help some of the nonprofits in our industry. We're going to be talking about training and about helping other people lift up their careers. Happy to welcome to the program first-time guest Jerome Hardaway. He's the founder of Vets Who Code, coming down from Nashville. Jerome, I seem to remember a time when I was able to travel. I did some lovely hiking, even saw a bear, last time I was down in Nashville. Thanks so much for joining us.

>> Roger that. Thank you. A funny story: I saw a cow on the loose while driving on the highway yesterday, so not much has changed. (Jerome laughs) Thank you guys for having me.

>> Yeah, it is a little bit of strange times here in the COVID era. I live in a kind of suburban Massachusetts area, and one of my neighbors did report a small bear in the area, so I'm definitely seeing more than just the usual wild turkeys and the like that we get up in New England. But let's talk about Vets Who Code. So you're the founder. The name doesn't leave much for us to guess about what you do, but tell us a little bit about the inspiration and the goals of your organization.

>> Roger that. Vets Who Code is the first veteran-founded, -operated, and -led remote 501(c)(3) that focuses on training veterans, regardless of where they are, in modern-day technologies. Our stack right now, I would say, is focused more towards front-end DevOps with a lot of serverless technologies built in. And that's pretty much exactly what we do well.

>> Well, awesome. I have been loving digging into the serverless ecosystem the last few years, definitely an exciting area. Help us understand a little bit: who comes and joins this? What skill set do they have to have coming in? And explain a little bit the programs that you offer that they can be part of.

>> Yeah, cool. So we run Vets Who Code like a mixture between a tech company and a tech nonprofit, I guess, using those practices while also using military practices as well. The people that come in are veterans and military spouses, and we try to use what we call a pattern-matching practice, showcasing, like, hey, these are the things you did in the military, and this is how they translate to the tech side. Like, our sitreps are what you guys would call standups, Kanban is what we would call systems checks, and frag orders, op orders, things like that, are our SOPs. So we train them, retrain them, so that they can understand the lingo, understand how you code, move, and communicate, and make sure that these guys and girls know how to work as JavaScript engineers in a serverless community. As of right now, we've helped 252 veterans in 37 states get jobs, and our socioeconomic impact, I think, is at 17.6 million right now. It's all from the comfort of their homes, and it's free. Those are like the coolest things that we've been able to do.

>> Wow, that's fascinating. Jerome, I heard something that you've talked about: leveraging military organizational styles. I'm just curious. In the coding world, a lot of times we talk about Conway's law, which is that the code will end up resembling the look of the organization.
And you talk about DevOps; DevOps is all about various organizations collaborating and working together. It seems a little bit different from what I would think of as traditional military command and control. Is that anything you've given thought to? Are there organizational pieces that you need to talk to people about when they're moving into these environments, compared to what they might have had in the military?

>> Negative. I think the biggest misconception is that when people are talking about how the military moves, they're thinking of the military of yesteryear, of 20, 30, 40 years ago. They're not thinking of Global War on Terrorism veterans and how we move and things like that. We understand distributed teams, because that's what we've done at CENTAF and CENTCOM in Iraq and Afghanistan. So honestly, we are already doing a lot of this stuff, we're just naming it differently. That's part of the advantage we have, because all the people who are educators here are veterans who learned how to code and have been working in industry, and they know. So when they're teaching, they know the entire process that a veteran's going to go through. That's how we focus on things. And so the organizational structure, for first-term to second-term veterans, is pretty normal if you're coming out within the last, heck, 10 years. (Jerome laughs)
We actually have a lunch and learn next week with Dr. Lee Johnson. And she's going to be talking, we open that to it by all juniors and entry level devs, developers, regardless of whether you're a veteran or not, we just throw it on Twitter and let them get in. And the focus will be on tech ethics. We all know, right now we've been leading the charge on trying to make sure people are supercharging their skills during this time frame. So that's what, it's been very positive. I've been working with magazine, front-end masters. It's been awesome. >> Well, that's wonderful. Wish everyone had the mindset coming into 2020, because it does seem that anything that could go wrong has, (both laugh) I'm curious, once people have skilled up and they've gone through the program, what connections do you have with industry? How do you help with job placement in that sort of activity? >> That is the most asked question, because that is the thing that people expect because of code schools, because of our educational program protocols. We don't really need that issue because our veterans are skilled enough to where to hiring managers know the quality that we produce. I live in Nashville and I've only been able to place one veteran that I've trained locally in the community because of fame companies have snatched up every other veteran I've ever trained in the community, so things like that, it's not a problem because no, a usually 80% of our veterans have jobs before they even graduate. So you're literally picking up, picking people who, they know they have the potential to get a bit companies if they put the work in and it's just as they come, we actually have people. I think a company reached out to me yesterday and I was like, I don't even have people for you. They already have jobs. (jerome laughs) Or I'm in a situation now where all my senior devs are looking for fame companies. Cause that's one of the things we do is that we support our veterans from reentry to retirement. So we're not like other code schools where they only focus on that 30 to 60 to 90 days, so that first job, our veterans, they keep coming back to re-skill, get more skills, come up to the lunch and learns, come to our Slack side chats to become better programmers. And once they're, so we've helped several of our programmers go from entry-level dev to senior dev, from absolutely zero experience. And so, I think that's the most rewarding thing. When you see a person who they came in knowing nothing. And three years later, like after the cohort safe they got their job and then they come back after they got the jobs, they want to get more skills and they get another job and then they come back. And the next thing, my favorite, one of my favorites Schuster, he starts at a local web shop, a web dev shop in Savannah, Georgia. And then next thing, oh, he's on Amazon, he's at Amazon three years later and you're like, Oh wow, we did that, that's awesome. So that's the path that we do is awesome. >> I'm curious, are there certain skill sets that you see in more need than other? And I'm also curious, do you recommend, or do you help people along with certain certifications? Thinking, the cloud certifications definitely have been on the rise, the last couple years. >> I feel like the cloud, the cloud certifications have been on the rise because it's expensive to like test for that stuff. If a person messes up, unless you have a very dedicated environment to where they can't mess up, they can cost you a lot of money, right? 
So you want that cert, right? But for us, we just focus on what we like to call front-end DevOps. We focus on Jamstack, which is JavaScript, APIs, and markup, along with a lot of serverless. So we're using AWS, and they're learning Lambda functions, all this stuff. We're using a query language called GraphQL; we're using Apollo with that query language; we're using some Node, React, Gatsby. And a lot of third-party APIs do a lot of the heavy lifting, 'cause we believe that the deeper dive a person has into a language, and the better they can manipulate and utilize APIs, the better they will be, right? It's the same way that colleges do it, but a more modern take. Colleges give you the most painful language to learn, which is usually like C, right, a very low-level language, and you go through this process of building, and because of that, other languages are easier, because you felt the pain points. We do the same thing, but with JavaScript, because it's the most accessible, painful language on earth. That's what I called it with Wired magazine last year, anyway. (Jerome laughs)

>> So Jerome, you've laid out how you're well organized, you're lean, and financially you're making sure that things are done responsibly. We want to give you the opportunity, though: what's the call to action? Vets Who Code, you're looking for more people to participate. Is it sponsorships? Working in the community? Who do you look to engage?

>> Roger that. We are looking for two things. One, we're always looking for people to help support us. We're open source; we're on GitHub Sponsors. And the people that do most of our tickets are the students themselves, so that's one of the best things about us: there is no better feeling than having something in production that works, right? It actually does something, like, oh, this actually helps people. So we have our veterans actually pull tickets and do things like that. We're also building out teams that they're on all the time as well. We have our new tutorials team, and those veterans literally build front-facing tutorials for people on the outside, so that they can learn little skills. We also have a podcast team, and they're always podcasting, always interviewing people in the community, from our mentors to our students to our alumni. So check out the Vets Who Code podcast on Spotify, and sponsor us on GitHub.

>> Wonderful, Jerome. We want to give you the final word. You're very passionate, you've got a lot you're interested in, and I loved hearing about some of the skill sets that you're helping others with. What's exciting you these days? What kind of things are you digging into beyond Vets Who Code?

>> Oh man, everything serverless, dude. As a person who was full stack and moved to front-end, this has never been a more exciting time to learn how to code, because there are so many serverless technologies, and it's leveling the playing field for front-end engineers. Just knowing a little bit of server-side code and having DevOps skills and being able to work in a CLI, you can do Jamstack. And look at the people that are using it: you have Nike, you have governments. It's just such an exciting time to be a front-end developer.
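As a rough illustration of the kind of serverless building block being described here, the following is a minimal AWS Lambda handler. Vets Who Code's curriculum is JavaScript; Python is used below only to keep the examples in this piece in one language, and the endpoint and response shape are made up. The handler(event, context) signature is the standard one for Python Lambdas.

```python
import json

def handler(event, context):
    """Minimal AWS Lambda handler sitting behind an API endpoint.

    A Jamstack front-end would fetch() this URL: the static site stays
    static, and the 'little bit of server-side code' lives here.
    'name' is a hypothetical query parameter, not a real API.
    """
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```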
So that's another thing that's really exciting for me. I've never been a person that was very highbrow when it came to talking about code; I felt like that was kind of boring. But seeing how code is actually helping normal, average, everyday people, and how the culture as a whole is starting to get more hip to how APIs are running the world and how tech is being leveraged, I'm on fire with these conversations. I try to contain it, because I don't want to scare anyone on TV, but we could talk for hours about that stuff. Love it. >> Well, Jerome, thank you so much for sharing with our community everything you're doing. Vets Who Code is a wonderful activity; a definite call-out to the community to check it out and support it if you can. And it ties in so much, Jerome: I've got a regular series I do called Cloud Native Insights that pokes at some of those areas you were talking about, serverless and some of the emerging areas. So Jerome, thanks so much for joining, pleasure having you on the program. >> Roger that, thank you for having me. >> All right. Be sure to check out thecube.net for all of the videos that we have, as well as SiliconANGLE.com for the news and the writeups that we do. I'm Stu Miniman, and thank you for watching theCUBE. (soft music)
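For readers who want a concrete picture of the serverless pattern Jerome describes, Jamstack pairs a static front end with small functions such as AWS Lambda behind an API. Below is a minimal sketch of such a handler. Vets Who Code teaches this pattern in JavaScript; the Java version here is only an illustration, assuming the aws-lambda-java-core library, and the class name, input field, and response shape are invented.

```java
// A minimal AWS Lambda handler of the kind described above: a small
// serverless function fronting an API for a Jamstack front end.
// Hypothetical names throughout; requires the aws-lambda-java-core dependency.
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

import java.util.Map;

public class HelloVetHandler implements RequestHandler<Map<String, String>, String> {
    @Override
    public String handleRequest(Map<String, String> event, Context context) {
        // The front end would call this through an API gateway; the "name"
        // key is an assumed input field for illustration only.
        String name = event.getOrDefault("name", "world");
        return "{\"message\": \"Hello, " + name + "!\"}";
    }
}
```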
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jerome | PERSON | 0.99+ |
Nashville | LOCATION | 0.99+ |
2014 | DATE | 0.99+ |
Stu Miniman | PERSON | 0.99+ |
30 | QUANTITY | 0.99+ |
Jerome Hardaway | PERSON | 0.99+ |
Boston | LOCATION | 0.99+ |
August | DATE | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
New England | LOCATION | 0.99+ |
July 2020 | DATE | 0.99+ |
80% | QUANTITY | 0.99+ |
CENTAF | ORGANIZATION | 0.99+ |
Iraq | LOCATION | 0.99+ |
17.6 million | QUANTITY | 0.99+ |
Roger | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
2020 | DATE | 0.99+ |
CENTCOM | ORGANIZATION | 0.99+ |
Lee Johnson | PERSON | 0.99+ |
Murphy | PERSON | 0.99+ |
Nike | ORGANIZATION | 0.99+ |
10 years | QUANTITY | 0.99+ |
yesterday | DATE | 0.99+ |
Afghanistan | LOCATION | 0.99+ |
last year | DATE | 0.99+ |
three years later | DATE | 0.99+ |
Massachusetts | LOCATION | 0.99+ |
jerome | PERSON | 0.99+ |
One | QUANTITY | 0.99+ |
next week | DATE | 0.99+ |
Savannah, Georgia | LOCATION | 0.99+ |
60 | QUANTITY | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
90 days | QUANTITY | 0.99+ |
JavaScript | TITLE | 0.99+ |
three years later | DATE | 0.98+ |
thecube.net | OTHER | 0.98+ |
two things | QUANTITY | 0.98+ |
one | QUANTITY | 0.98+ |
first term | QUANTITY | 0.98+ |
252 veterans | QUANTITY | 0.98+ |
Vets Who Code | ORGANIZATION | 0.98+ |
Cloud Native Insights | TITLE | 0.98+ |
second term | QUANTITY | 0.97+ |
Wire | TITLE | 0.97+ |
Covert | LOCATION | 0.97+ |
GitHub | ORGANIZATION | 0.97+ |
Lambda | TITLE | 0.97+ |
today | DATE | 0.96+ |
CUBE | ORGANIZATION | 0.95+ |
one veteran | QUANTITY | 0.95+ |
first time | QUANTITY | 0.94+ |
ORGANIZATION | 0.94+ | |
Spotify | ORGANIZATION | 0.93+ |
501 C three | OTHER | 0.92+ |
first job | QUANTITY | 0.91+ |
40 years ago | DATE | 0.91+ |
Jamstack | TITLE | 0.9+ |
Vets Who Code | TITLE | 0.9+ |
React | TITLE | 0.9+ |
first protocols | QUANTITY | 0.89+ |
30 | DATE | 0.88+ |
Dr. | PERSON | 0.88+ |
first veteran | QUANTITY | 0.88+ |
GraphQL | TITLE | 0.87+ |
last few years | DATE | 0.86+ |
theCUBE | ORGANIZATION | 0.85+ |
Schuster | PERSON | 0.84+ |
earth | LOCATION | 0.82+ |
37 States | QUANTITY | 0.82+ |
last couple years | DATE | 0.79+ |
two weeks | QUANTITY | 0.77+ |
DevOps | TITLE | 0.76+ |
Stephan Ewen, data Artisans | Flink Forward 2018
>> Narrator: Live from San Francisco, it's theCUBE, covering Flink Forward, brought to you by data Artisans. >> Hi, this is George Gilbert. We are at Flink Forward, the conference put on by data Artisans for the Apache Flink community. This is the second Flink Forward in San Francisco, and we are honored to have with us Stephan Ewen, co-founder and CTO of data Artisans and co-creator of Apache Flink. Stephan, welcome. >> Thank you, George. >> Okay, so with others we were talking about the use cases they were trying to solve, but you put all the pieces together in your head first and are building out something that ultimately gets broader and broader in its applicability. Help us, maybe from the bottom up, think through the problems you were trying to solve. Let's start with the ones that you saw first, and then how the platform grows so that you can solve a broader and broader range of problems. >> Yes, yeah, happy to do that. I think we have to take a few steps back and look at, let's say, the breadth of use cases that we're looking at. How did that influence some of the inherent decisions in how we've built Flink? How does that relate to what we presented earlier today, the stream processing platform, and so on? So, starting to work on Flink and stream processing: stream processing is an extremely general and broad paradigm, right? We've actually started to say what Flink is underneath the hood: it's an engine to do stateful computations over data streams. It's a system that can process data streams the way a batch processor processes bounded data. It can process data streams the way a real-time stream processor produces real-time streams of events. And it can handle data streams with sophisticated event-by-event, stateful, timely logic, the way many applications that are implemented as data-driven microservices implement their logic. The basic idea behind Flink's approach is to start with the basic ingredients that you need and try not to impose constraints around the use of them. So, when I give presentations, I very often say the basic building blocks for Flink are just flowing streams of data, streams being received from systems like Kafka, file systems, databases. You route them, you may want to repartition them, organize them by key, broadcast them, depending on what you need to do. You implement computation on these streams, computation that can keep state almost as if it were a standalone Java application. You don't necessarily think in terms of writing state to a database; you think more in terms of maintaining your own variables. Then there's sophisticated access to tracking time and the progress of data, the completeness of data; that's in some sense what's behind the event-time streaming notion, you're tracking the completeness of data as of a certain point in time. And then, to round this all up, we give this a really nice operational tool by introducing this concept of distributed consistent snapshots. Sticking with these basic primitives, you have streams that just flow, no transactional barriers between operations, no microbatches, just streams that flow, state variables that get updated, and then fault tolerance happening as an asynchronous background process.
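To make those building blocks concrete, here is a minimal sketch of such a stateful, keyed computation, assuming Flink's DataStream API; the job, key, and state names are invented for illustration and are not taken from the interview.

```java
// A stream flows in, is organized by key, and a computation keeps state
// per key almost like local variables in a standalone Java application.
// Flink checkpoints the state in the background for fault tolerance.
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class RunningCountJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // In a real deployment this would be a Kafka source, a file system,
        // a database changelog, etc.
        DataStream<String> events = env.fromElements("user-a", "user-b", "user-a");

        events
            .keyBy(value -> value)      // organize the stream by key
            .process(new CountPerKey()) // stateful, event-by-event logic
            .print();

        env.execute("running-count");
    }

    // One counter per key, updated as if it were a plain variable.
    static class CountPerKey extends KeyedProcessFunction<String, String, String> {
        private transient ValueState<Long> count;

        @Override
        public void open(Configuration parameters) {
            count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Long.class));
        }

        @Override
        public void processElement(String value, Context ctx, Collector<String> out)
                throws Exception {
            Long current = count.value();
            long next = (current == null ? 0L : current) + 1;
            count.update(next);
            out.collect(value + " -> " + next);
        }
    }
}
```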
Now, that is in some sense the core idea, and what helps Flink generalize from batch processing to real-time stream processing to event-driven applications. And what we saw today, in the presentation that I gave earlier, is how we use that to build a platform for stream processing and event-driven applications. That takes some of these things, in that case most prominently the fourth aspect, the ability to draw application snapshots at any point in time and use them as an extremely powerful operational tool. You can think of it as a tool to archive applications, migrate applications, fork applications, modify them independently. >> And these snapshots are essentially your individual snapshots at the node level, and then you're sort of organizing them into one big logical snapshot. >> Yeah, each node takes its own snapshot, but they're organized into a globally consistent snapshot, yes. That has a few very interesting and important implications. Just to give you one example where this makes things much easier: if you have an application that you want to upgrade and you don't have a mechanism like that, what is the default way that many folks do these updates today? They try to do a rolling upgrade of all the individual nodes. You replace one, then the next, then the next, but that creates this interesting situation where at some point in time there are actually two versions of the application running at the same time. >> And operating on the same sort of data stream. >> Potentially, yeah, or on some partitions of the data stream you have one version, and on some partitions you have another version. You may be at the point where you have to maintain two wire formats, where all the pieces of your logic have to be written to understand both versions, or you try to use a data format that makes this a little easier. It's just inherently a thing that you don't even have to worry about if you have these consistent distributed snapshots. It's just a way to switch from one application to the other as if nothing were shared or in flight at any point in time. It just gets many of these problems out of the way. >> Okay, and that snapshot applies to code and data? >> So in Flink's architecture itself, the snapshot applies first of all only to data. And that is very important. >> George: Yeah. >> Because what it actually allows you to do is decouple the snapshot from the code if you want to. >> George: Okay. >> That allows you to do things like we showed earlier this morning. If you have an earlier snapshot where the data is correct, and then you change the code but you introduce a bug, you can just say, "Okay, let me actually change the code and apply different code to a different snapshot." So you can actually roll back or roll forward different versions of code and different versions of state independently, or you can go and say, when I'm forking this application, I'm actually modifying it. That is a level of flexibility that, once you actually start to make use of it and practice it, is incredibly useful.
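For reference, the snapshot machinery Stephan describes is switched on in user code roughly like this. This is a hedged sketch assuming Flink's checkpointing API; exact method names vary a little between Flink versions.

```java
// Periodic checkpoints are drawn asynchronously in the background; retained
// ("externalized") checkpoints and savepoints are what enable the operational
// patterns above: stop a job, fork it, or restart new code from an old snapshot.
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SnapshotConfigSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Draw a consistent distributed snapshot every 10 seconds.
        env.enableCheckpointing(10_000);

        CheckpointConfig config = env.getCheckpointConfig();
        config.setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);

        // Keep the last snapshot around even when the job is cancelled,
        // so it can later be used to fork or upgrade the application.
        config.enableExternalizedCheckpoints(
            CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

        // ... build the streaming topology and call env.execute() here.
    }
}
```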
It's been one of the maybe least obvious things when you start to look into stream processing, but once you actually run stream processing in production, this operational flexibility is, I would say, very high up on the list for a lot of users when they said, "Okay, this is why we took Flink into streaming production and not others." The ability to do, for example, that. >> But this sounds then like, with some stream processors, the idea of unbundling the database: you have derived data at different sync points, and that derived data is for analysis, views, whatever. But it sounds like what you're doing is taking derived data of what the application is working on in progress and creating essentially a logically consistent view, not really derived data for some other application's use, but for operational use. >> Yeah, so. >> Is that a fair way to explain it? >> Yeah, let me try to rephrase it a bit. >> Okay. >> When you start to take this streaming-style approach to things, which has been called turning the database inside out, unbundling the database, your input sequence of events is arguably the ground truth, and what the stream processor computes is a view of the state of the world. While this sounds at first super easy, you know, views, you can always recompute a view, right? In practice this view of the world is not just a lightweight thing that's only derived from the sequence of events. It's actually the state of the world that you want to use. It might not be fully reproducible, either because the sequence of events has been truncated or because the sequence of events is just too long to feasibly recompute in a reasonable time. So having a way to work with this that complements this whole idea of event-driven, log-driven architecture very cleanly is what this snapshot tool also gives you. >> Okay, so that sounds like that was part of core Flink. >> That is part of core Flink's inherent design, yes. >> Okay, so then take us to the next level of abstraction: the scaffolding that you're building around it with the dA platform, and how that makes stream processing more accessible, how it empowers a whole other generation. >> Yeah, so there are different angles to what the dA platform does. One angle is just very pragmatically easing the rollout of applications by having one way to integrate the platform with your metrics, alerting, logging, the CI/CD pipeline, so that every application you deploy over there just inherits all of that, and the application developer doesn't have to worry about anything. They just say, this is my piece of code, I'm putting it there, and it's just going to be hooked in with everything else. That's not rocket science, but it's extremely valuable, because there are a lot of tedious bits here and there that otherwise eat up a significant amount of the development time. The technologically maybe more challenging part that this solves is where we're really integrating the application snapshots, the compute resources, the configuration management, and everything into this model where you don't think about, I'm running a Flink job here, that Flink job has created a snapshot that is sitting around here.
There's also a snapshot here, which may come from that Flink application that was running, and that's actually just a new version of that Flink application, which is, let's say, the testing or acceptance run for the version that we're about to deploy here. So it's tying all of these things together. >> So it's not just the artifacts from one program, it's how they all interrelate? >> It gives you exactly the idea of how they all interrelate, because an application over its lifetime will correspond to different configurations, different code versions, different deployments in production, A/B testing, and so on, and how all of these things work together, how they interplay. Like I said before, Flink deliberately couples checkpoints and code in a rather loose way, to allow you to evolve the code and still be able to match a previous snapshot to a newer code version. We make heavy use of that, and we can now give you a good way of, first of all, tracking all of these things together: how they relate, when which version was running, what code version that was. Having snapshots, we can always go back and reinstate earlier versions, and we have the ability to always move a deployment from here to there, fork it, drop it, and so on. That is one part of it. The other part is the tight integration with Kubernetes. Initially, the container's sweet spot was stateless compute, but the way stream processing works, the nodes are inherently not stateless; they have a view of the state of the world. This is always recoverable. You can also change the number of containers, and with Flink and other frameworks you have the ability to adjust this. >> Including repartitioning the-- >> Including repartitioning the state. But it's a thing where you often have to be quite careful how you do it, so that it all integrates with exact consistency: the right containers are running at the right point in time with exactly the right version, and there's no split-brain situation where some other partitions happen to still be running at the same time, or a container goes down and it's unclear whether this is a situation where you're supposed to recover or rescale. Figuring all of these things out together is what the idea of integrating these things in a very tight way gives you. So think of it the following way, right? Initially, you just start with Docker. Docker is a way to say, I'm packaging up everything that a process needs, all of its environment, to make sure that I can deploy it here and here and here and it just always works. It's not like, "Oh, I'm missing the correct version of the library here," or, "I'm interfering with that other process on a port."
On top of Docker, people added things like Kubernetes to orchestrate many containers together, forming an application. And then on top of Kubernetes there are things like Helm, or for certain frameworks there are Kubernetes Operators, which try to raise the abstraction to say, "Okay, we're taking care of the aspects this application needs in addition to container orchestration." We're doing exactly that: we're raising the abstraction one level up to say, okay, we're not just thinking about the containers, the compute, and maybe local persistent storage; we're looking at the entire stateful application, with its compute, with its state, with its archival storage, all of it together. >> Okay, let me peel off with a question about more conventionally trained developers and admins, who are used to databases for batch and request/response type jobs or applications. Do you see them becoming potential developers of continuous stream processing apps, or do you see it mainly for a new generation of developers? >> No, I would actually say that for a lot of the classic, call it request/response, or create, read, update, delete applications working against a database, there's huge potential for stream processing and that kind of event-driven architecture to help change this view. There's actually a fascinating talk here by the folks from (mumbles) who implemented an entire social network in this stream processing architecture, so not against a database but against a log and a stream processor instead. It comes with some really cool properties, like a very unique way of having the operational flexibility to, at the same time, test and evolve and run, and do very rapid iterations over your-- >> Because of the decoupling? >> Exactly, because of the decoupling, because you don't always have to worry about, okay, I'm experimenting here with something, let me first create a copy of the database, and then once I think this is working out well, okay, how do I either migrate those changes back or make sure that the copy of the database I made is brought up to speed with the production database again before I switch over to the new version. So many of these things, the pieces, just fall together easily in the streaming world. >> I think I asked this of Kostas, but if a business analyst wants to query the current state of what's in the cluster, do they go through some sort of head node that knows where the partitions lie, and then some sort of query optimizer figures out how to execute that with a cost model or something? In other words, if you wanted to do some sort of batch or interactive type... >> So there are different answers to that, I think. First of all, there's the ability to look into the state of Flink, as in, you have the individual nodes that maintain state as they're doing the computation, and you can look into this, but it's more of a lookup thing. >> So you're not running a query, as in a SQL query, against that particular state. If you would like to do something like that, what Flink gives you, as always, is the ability...
There's a wide variety of connectors, so you can, for example, say: I'm describing my streaming computation here, you can describe it in SQL, and say the result of this thing, I'm writing it to a nicely queryable data store, an in-memory database or so, and then you would actually run the dashboard-style exploratory queries against that particular database. So Flink's sweet spot at this point is not to run many small, fast, short-lived SQL queries against something that is running in Flink at the moment. That's not what it is yet built and optimized for. >> A more batch-oriented one would be the derived data that's in the form of a materialized view. >> Exactly, so these two sides play together very well, right? You have the more exploratory, batch-style queries that go against the view, and then you have the stream processor and streaming SQL used to continuously compute that view that you then explore. >> Do you see scenarios where you have traditional OLTP databases that are capturing business transactions, but now you want to inform those transactions, or potentially automate them, with machine learning? So you capture a transaction, and then there's sort of ambient data, whether it's about the user interaction or about the machine data flowing in, and maybe you don't capture the transaction right away, but you're capturing data for the transaction and the ambient data. From the ambient data you calculate some sort of analytic result, could be a model score, and that informs the transaction that's running at the front end of this pipeline. Is that a model that you see in the future? >> So that sounds like a fraud use case, and that has actually been done. It's not uncommon, yeah. In some sense, a model like that is behind many of the fraud detection applications. You have the transaction that you capture. You have a lot of contextual data that you receive, from which you either build a model in the stream processor, or you build a model offline and push it into the stream processor as, let's say, a stream of model updates. Then you're using that stream of model updates: you derive your classifiers, or your rule engines, or your predictor state from that set of updates and from the history of the previous transactions, and then you use that to attach a classification to the transaction. Once this is returned, this stream is fed back to the part of the computation that actually processes the transaction itself, to trigger the decision whether to, for example, hold it back or let it go forward. >> So this is an application where people who have built traditional architectures would add this capability on for low-latency analytics? >> Yeah, that's one way to look at it, yeah. >> As opposed to a rip and replace, like, we're going to take out our request/response and our batch and put in stream processing. >> Yeah, so that is definitely a way that stream processing is used: you basically capture a changelog of whatever is happening in a database, or you just immediately capture the events, the interactions from users and devices, and then you let the stream processor run side by side with the old infrastructure and compute exactly the additional information that even a mainframe database might in the end use to decide what to do with a certain transaction. So it's a way to complement legacy infrastructure with new infrastructure without having to break off or break away from the legacy infrastructure.
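A rough sketch of the fraud-scoring shape described above, a transaction stream connected with a broadcast stream of model updates, might look as follows in Flink's DataStream API. The Transaction and Model types and the thresholding logic are invented for illustration, and keeping the model in a plain field sidesteps fault tolerance; Flink's broadcast state exists for doing this properly.

```java
// Transactions are scored event by event against the latest model pushed in
// on a second stream, and the decision (hold back or let through) flows out.
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

public class FraudScoringSketch {

    public static class Transaction { public double amount; }
    public static class Model { public double threshold; }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // In production these would come from Kafka or a database changelog.
        DataStream<Transaction> transactions = env.fromElements(new Transaction());
        DataStream<Model> modelUpdates = env.fromElements(new Model());

        transactions
            .connect(modelUpdates.broadcast()) // every instance sees each model update
            .flatMap(new CoFlatMapFunction<Transaction, Model, String>() {
                private Model current; // latest model; not checkpointed in this sketch

                @Override
                public void flatMap1(Transaction tx, Collector<String> out) {
                    // Score the transaction with whatever model is current.
                    boolean suspicious = current != null && tx.amount > current.threshold;
                    out.collect(suspicious ? "HOLD" : "APPROVE");
                }

                @Override
                public void flatMap2(Model model, Collector<String> out) {
                    current = model; // swap in the new model
                }
            })
            .print();

        env.execute("fraud-scoring-sketch");
    }
}
```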
>> So let me ask in a different direction, more about the complexity that forms a tax for developers and administrators. Many of the open source community products slash projects solve narrow functions within a broader landscape, and there's a tax on developers and admins in trying to make those work together, because of the different security models, data models, all that. >> There is a zoo of systems and technologies out there, and also of different paradigms for doing things. Once systems have a similar paradigm or idea in mind, they usually work together well, but there are different philosophical takes-- >> Give me some examples of the different paradigms that don't fit together well. >> For example... Maybe one good example was initially, when streaming was a rather new thing. At that point in time, stream processors were very much thought of as a bit of an addition to, let's say, the batch stack or whatever other stack you currently had; you'd just look at it as an auxiliary piece to do some approximate computation. A big reason why that was the case is that the way these stream processors thought of state was with a different consistency model, and the way they thought of time was actually different from the batch processors or the databases, which use timestamp fields, and the early stream processors-- >> They couldn't handle event time. >> Exactly, they just used processing time. That's why you could maybe complement the stack with these things, but it didn't really go well together. You couldn't just say, okay, I can actually take this batch job and interpret it also as a streaming job. Once the stream processors got a better interpretation-- >> The OEM architecture. >> Exactly. So once the stream processors adopted a stronger consistency model, a time model that is more compatible with reprocessing and so on, all of these things all of a sudden fit together much better. >> Okay, so do you see vendors who are oriented around a single paradigm, or a unified paradigm, continuing to broaden their footprint so that they can essentially take some of the complexity off the developer and the admin, by providing one throat to choke, with the pieces designed to work together out of the box, unlike some of the zoo of the former Hadoop community? In other words, a lot of vendors seem to be trying to build a broader footprint so that it's something that's just simpler to develop to and to operate? >> There are a few good efforts happening in that space right now. One that I really like is the idea of standardizing on some APIs. APIs are hard to standardize on, but you can at least standardize on semantics, which is something that, for example, Flink and Beam have been very keen on: trying to have an open discussion and a roadmap that is very compatible in thinking about streaming semantics. This has been taken to the next level, I would say, with the whole streaming SQL design. Beam is adding streaming SQL and Flink is adding streaming SQL, both in collaboration with the Apache Calcite project, so you get very similar standardized semantics and SQL compliance. You start to get common interfaces, which is a very important first step, I would say. Standardizing on things like-- >> So SQL semantics are shared across products within a stream processing architecture?
>> Yes, and I think this will become really powerful once other vendors start to adopt the same interpretation of streaming SQL and think of it as, yes, a way to take a changing data table here and project a view of this changing data table, a changing materialized view, into another system, and then use this as a starting point to maybe compute another derived view, you see. You can actually start to think more high-level about things, think really relational queries, dynamic tables, across different pieces of infrastructure. Once you can do something like that, interplay in architectures becomes easier to handle, because even if at the runtime level things behave a bit differently, at least you start to establish a standardized model for thinking about how to compose your architecture. And even if you decide to change along the way, you frequently save yourself the problem of having to rip everything out and redesign everything because the next system that you bring in follows a completely different paradigm. >> Okay, this is helpful. To be continued offline or back online on theCUBE. This is George Gilbert. We have been having a very interesting and extended conversation with Stephan Ewen, CTO and co-founder of data Artisans and one of the creators of Apache Flink. And we are at Flink Forward in San Francisco. We will be back after this short break.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
George Gilbert | PERSON | 0.99+ |
Stephan Ewen | PERSON | 0.99+ |
George | PERSON | 0.99+ |
Stephan | PERSON | 0.99+ |
San Francisco | LOCATION | 0.99+ |
Flink | ORGANIZATION | 0.99+ |
one version | QUANTITY | 0.99+ |
both versions | QUANTITY | 0.99+ |
two sites | QUANTITY | 0.99+ |
Apache Flink | ORGANIZATION | 0.99+ |
two versions | QUANTITY | 0.99+ |
Flink Forward | ORGANIZATION | 0.99+ |
second | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
today | DATE | 0.98+ |
fourth aspect | QUANTITY | 0.98+ |
java | TITLE | 0.98+ |
Artisans | ORGANIZATION | 0.98+ |
one program | QUANTITY | 0.97+ |
one way | QUANTITY | 0.97+ |
both | QUANTITY | 0.97+ |
Kubernetes | TITLE | 0.97+ |
one angle | QUANTITY | 0.97+ |
Kafka | TITLE | 0.96+ |
one part | QUANTITY | 0.96+ |
first step | QUANTITY | 0.96+ |
two wire formats | QUANTITY | 0.96+ |
first | QUANTITY | 0.96+ |
First | QUANTITY | 0.94+ |
each node | QUANTITY | 0.94+ |
Beam | ORGANIZATION | 0.94+ |
one example | QUANTITY | 0.94+ |
CTO | PERSON | 0.93+ |
2018 | DATE | 0.93+ |
Docker | TITLE | 0.92+ |
Apache | ORGANIZATION | 0.91+ |
one good example | QUANTITY | 0.91+ |
single paradigm | QUANTITY | 0.9+ |
one application | QUANTITY | 0.89+ |
Flink | TITLE | 0.86+ |
node | TITLE | 0.79+ |
Kostas | ORGANIZATION | 0.76+ |
earlier this morning | DATE | 0.69+ |
CUBE | ORGANIZATION | 0.67+ |
SQL | TITLE | 0.64+ |
Helm | TITLE | 0.59+ |
CXF | TITLE | 0.59+ |