Stephan Ewen | Flink Forward 2017
(click) >> Welcome, everyone, we're back at the Flink Forward user conference sponsored by the data Artisans folks. This is the first U.S.-based Flink user conference, and we are on the ground at the Kabuki Hotel in San Francisco. We have a special guest, Stephan Ewen, who is one of the founders of data Artisans and one of the creators of Flink. He is CTO, and he is in a position to shed some unique light on the direction of the company and the product. Welcome, Stephan. >> Yeah, so you were asking how stream processing, or how Flink and data Artisans, can help enterprises that want to adopt these kinds of technologies actually do that, despite the fact that, if we look at what the big internet companies that first adopted these technologies had to do, they had to go through this big process of productionizing these things by integrating them with so many other systems, making sure everything fits together, everything kind of works as one piece. What can we do there? So I think there are a few interesting points to that. Let's maybe start with stream processing in general. Stream processing by itself actually has the potential to simplify many of these setups and infrastructures, per se. There are multiple dimensions to that. First of all, the ability to just more naturally fit what you're doing to what is actually happening. Let me qualify that a little bit. All these companies that are dealing with big data are dealing with data that is typically continuously produced, from sensors, from user devices, from server logs, from all these things, right? Which is quite naturally a stream. And processing this with systems that give you the abstraction of a stream is a much more natural fit, so you eliminate big chunks of the pipeline that, for example, do periodic ingestion, grooming that into files and data sets, and periodic processing of that; you can get rid of a lot of these things. You get a paradigm that unifies the processing of real-time data and also historic data. So this by itself is an interesting development that I think many have recognized, and that's why they're excited about stream processing, because it helps reduce a lot of that complexity. So that is one side to it. The other side is that there was always kind of an interplay between the processing of the data and then wanting to do something with these insights, right? You don't process the data just for the fun of processing. Usually the outcome informs something. Sometimes it's just a report, but sometimes it's something that immediately affects how certain services react. For example, how they apply their decisions in classifying transactions as fraud, how they send out alerts, how they trigger certain actions. The interesting thing, and we're going to see a little more of that later in this conference also, is that in this stream processing paradigm there's a very natural way for these online, live applications and the analytical applications to merge together, again reducing a bunch of this complexity. Another thing that is happening, which I think is very, very powerful and is helping (mumbles) in bringing these kinds of technologies to a broader ecosystem, is how the whole deployment stack is evolving. We see more and more users converging onto resource management infrastructures.
YARN was an interesting first step to make it really easy: once you've productionized that part, you've productionized those systems. But even beyond that, the uptake of Mesos, the uptake of container engines like (mumbles), and the ability to just get more functionality bundled together out of the box: you just pack what you need into a container, put it into a repository, and then various people can bring up these services without having to go through all of the setup and integration work. You get a way better, templated integration with systems with this kind of technology. So those seem to be helping a lot toward much broader adoption of these kinds of technologies: both stream processing as an easier paradigm with fewer moving parts, and developments in (mumbles) technologies. >> So let me see if I can repeat back just a summary version, which is: stream processing is more natural to how the data is generated, and so we want to match the processing to how it originates and flows. At the same time, if we do more of that, that becomes a workload or an application pattern that then becomes more familiar to more people who didn't grow up in a continuous processing environment. But also, it has a third capability of reducing the latency between originating or ingesting the data and getting an analysis that informs a decision, whether by a person or a machine. Would that be a... >> Yeah, you can even go one step further. It's not just about reducing the latency from the analysis to the decision. In many cases you can actually see that the part that does the analysis and the part that makes the decision just merge and become one thing, which means much fewer moving parts, less integration work, less, yeah, less maintenance and complexity. >> Okay, and this would be like, for example, how application databases are taking on the capabilities of analytic databases to some extent, or how stream processors can have machine learning, whether they're doing online learning or calling a model that they're going to score in real time, or even a pre-scored model. Is that another example? >> You can think of those as examples, yeah. A nice way to think about it is to look at what a lot of the analytical applications do versus, let's say, online services that match offers and trades, or that generate alerts. A lot of those are, in some sense, different ways of just reacting to events, right? You are receiving some real-time data and you want to process it, interact with some form of knowledge that you've accumulated over the past, or some form of knowledge that you've accumulated from some other inputs, and then react to that. That kind of paradigm, which is at the core of stream processing for (mumbles), is so generic that it covers many of these use cases. It covers building applications directly, as we have actually seen: we have seen users that directly build a social network on Flink, where the events that they receive are, you know, a user being created, a user joining a group and so on. And it also covers the analytics of just saying, you know, I have a stream of sensor data and on certain outliers I want to raise alerts. It's so similar, once you start thinking about both of them as just handling streams of events in this flexible fashion, that it helps to just bring together many things.
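To make the sensor-alerting example concrete, here is a minimal sketch of what such a job could look like with Flink's DataStream API. The SensorReading type, the threshold, and the bounded demo source are assumptions made for illustration, not details from the interview.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class SensorAlertJob {

    // Simple POJO for a sensor event; the fields are illustrative assumptions.
    public static class SensorReading {
        public String sensorId;
        public double value;
        public SensorReading() {}
        public SensorReading(String sensorId, double value) {
            this.sensorId = sensorId;
            this.value = value;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // In a real job this would be a Kafka or sensor-gateway source;
        // a bounded demo source keeps the sketch self-contained.
        DataStream<SensorReading> readings = env.fromElements(
            new SensorReading("pump-1", 42.0),
            new SensorReading("pump-1", 130.5),
            new SensorReading("pump-2", 17.3));

        readings
            .keyBy(r -> r.sensorId)                                        // one logical stream per sensor
            .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))     // evaluate once per minute
            .maxBy("value")                                                // highest reading in the window
            .filter(r -> r.value > 100.0)                                  // outlier threshold (assumed)
            .map(r -> "ALERT: sensor " + r.sensorId + " reported " + r.value)
            .print();                                                      // stand-in for a real alerting sink

        env.execute("sensor-alerts");
    }
}
```

The same building blocks, events, keyed state, and time, scale from a single alerting rule up to the event-driven application cases Stephan describes.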
>> So, that sounds like it would play into the notion of microservices, where the service is responsible for its own state and they communicate with each other asynchronously, so you have a cooperating collection of components. Now, there are a lot of people out there who grew up with databases sharing the state among modules of applications. What might drive the growth of this new pattern, the microservices, considering that there are millions of people who just know how to use databases to build apps? >> The interesting part that I think drives this new adoption is that it's such a natural fit for the microservice world. So how do you deploy microservices with state, right? You can have a central database with which you work, and every time you create a new service you have to make sure that it fits with the capacities and capabilities of the database, you have to make sure that the group that runs this database is okay with the additional load. Or you can go to the different model where each microservice comes up with its own database, but then, every time you deploy one, and that may be a new service or it may just be an experiment with a different variation of a service you're testing, you'd have to bring up a completely new thing. In this interesting world of stateful stream processing as it's done by Flink, state is embedded directly in the processing application. So you actually don't worry about this thing separately, you just deploy that one thing, and it brings both together, tightly integrated, and it's a natural fit, right: the working set of your application goes with your application. If it's deployed, if it's (mumbles), if you bring it down, these things go away. The central part in this picture is nothing more than, if you wish, a backup store which takes these snapshots of microservices and stores them in order to recover them from catastrophic failures, or to just have a historic version to look into if you figure out later that something happened ("was this introduced in the last week? let me look at what it looked like the week before"), or to just migrate it to a different cluster. >> So, we're going to have to cut things short in a moment, but I wanted to ask you one last question: if microservices are a sweet spot, and sort of near-real-time decisions are also a sweet spot, for Kafka, what might we expect to see in terms of a roadmap that either generalizes those cases or opens up new use cases? >> Yes, so, what we're immediately working on in Flink right now is definitely extending the support in this area for the ability to keep much larger state in these applications, so state that really goes into the multiple terabytes per service, and functionality that allows us to manage this, to evolve it even more easily, you know. If the application actually starts owning the state and it's not in a centralized database anymore, you start needing a little bit of tooling around this state, similar to the tooling you need in databases, a (mumbles) and all of that, so things that actually make that part easier.
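As a rough illustration of "the state is embedded in the application, and the central store is just a snapshot backend": a Flink job can keep its working set in keyed state and periodically checkpoint it to a durable filesystem. The checkpoint path, event source, and counter logic below are assumptions for the sketch, not anything Stephan specifies.

```java
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class StatefulServiceSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Snapshot the embedded state periodically; the "central" store is only a
        // backup location for recovery, migration, or looking at older versions.
        env.enableCheckpointing(60_000);                                          // every minute
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));     // assumed path

        env.socketTextStream("localhost", 9999)      // placeholder for the real event source
           .keyBy(userId -> userId)
           .flatMap(new ProfileUpdateCounter())
           .print();

        env.execute("stateful-microservice-sketch");
    }

    // Keyed state lives inside the service itself -- no external database lookup per event.
    static class ProfileUpdateCounter extends RichFlatMapFunction<String, String> {
        private transient ValueState<Long> updates;

        @Override
        public void open(Configuration parameters) {
            updates = getRuntimeContext().getState(
                new ValueStateDescriptor<>("updates", Long.class));
        }

        @Override
        public void flatMap(String userId, Collector<String> out) throws Exception {
            Long count = updates.value();
            count = (count == null) ? 1L : count + 1;
            updates.update(count);
            out.collect(userId + " has been updated " + count + " times");
        }
    }
}
```

Savepoints extend the same mechanism: the snapshot can restore the service on a different cluster or roll it back to last week's state, which is the "backup store" role described above.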
Handling (mumbles), and we're actually looking into what APIs users actually want in this area. So Flink has, I think, pretty stellar stream processing APIs, and as you've seen in the last release, we've actually started adding more low-level APIs, one could even think of them as APIs in which you don't think of streams as distributed collections and windows, but just think about the very basic ingredients: events, state, time, and snapshots. So more control and more flexibility, by taking the basic building blocks directly rather than the more high-level abstractions. I think you can expect more evolution on that layer, definitely in the near future. >> Alright, Stephan, we have to leave it at that, and hopefully we'll pick up the conversation not too long in the future. We are at the Flink Forward Conference at the Kabuki Hotel in San Francisco, and we will be back with more after a few moments. (funky music)
Jamie Grier | Flink Forward 2017
>> Welcome back, everyone, we're at the Flink Forward conference, the user conference for the Flink community, started and sponsored by Data Artisans. We're at the Kabuki Hotel in San Francisco and we have with us another special guest, Jamie Grier, who's Director of Applications Engineering at Data Artisans. Jamie, welcome. >> Thanks. >> So we've seen an incredible pace of innovation in the Apache open source community, and as soon as one technology achieves mainstream acceptance, it sort of gets blown away by another one, like MapReduce and Spark. There's an energy building around Flink; help us understand where it fits relative to, not necessarily things that it's replacing so much as things that it's complementing. >> Sure. Really what Flink is is a real stream processor, a stateful stream processor. The reason that I say it's a real stream processor is because the model, the computation model, the way the engine works, the semantics of the whole thing, are the continuous programming model, which means that, really, you just consume events one at a time, you can update any sort of data structures you want, which Flink manages, fault-tolerantly, at scale, and you can do flexible things with processing with regard to time: scheduling things to happen at different times, when certain amounts of data are complete, et cetera. So it's not oriented strictly towards analytics; a lot of the stream processing in the past has been oriented sort of towards analytics alone, or that's the real sweet spot, whereas Flink as a technology enables you to build much more complex event- and time-driven applications in a much more flexible way.
And these basic primitives really allow you to build, I tell people all the time, Flink allows you to do this consuming of events and updating data structures of your own choosing, does it fault-tolerantly and at scale; build whatever you want out of that. And what people are building are things that are truly not really expressible as analytics jobs. It's more just building applications. >> Okay, so let me drill down on that. >> Sure. >> Let's take an example app, whether it's, I'll let you pick it, but one where you have to assume that you can update state and you can do analytics and they're both in the same app, which is what we've come to expect from traditional apps, although they have their shared state in a database outside the application. >> So a good example is, I just got done doing a demo, literally just before this, and it's a trading application. So you build a trading engine, it's consuming position information from upstream systems and it's consuming quotes. Quotes are all the bids and all the offers to buy stock at a given price. We have our own positions we're holding within the firm, if we're a bank, and those positions, that's the state we're talking about. So it says I own a million shares of Apple, I own this many shares of Google, this is the price I paid, et cetera. Then we have some series of complex rules that say, hey, I've been holding this position for a certain period of time, I've been holding it for a day now, and so I want to more aggressively trade out of this position, and I do that by modifying my state, driven by time: more time has gone past, so I'm going to lower my ask price. Now trades are streaming into the system as well, and I'm trying to more aggressively make trades by lowering the price I'm willing to trade for. So these things are all just event-driven applications. The state is your positions in the market, and the time dimension is exactly that: as you've been holding the position longer, you start to change your price or change your trading strategy in order to liquidate a little bit more aggressively. You're using analytics along the way, but none of that is just what you'd think of as a typical analytics job or an analytics API. You need an API that allows you to build those sorts of flexible event-driven things. >> And the persistence part, or maybe the transactional part, is: I need to make a decision, as a human or the machine, and record that decision, and so that's why there's benefit to having the analytics and the database, whatever term we give it, in the same... >> Co-located. >> Co-located, yeah, in the same platform. >> Yeah, there's a bunch of reasons why that's good. That's one of them. Another reason is that when you do things at high scale and you have high throughput, say in that trading system, we're consuming the entire options chain's worth of all the bids and asks, right? It's a load of data, so you want to use a bunch of machines, but you don't want to have to look up your state in some database for every single message, when instead you can shard the input stream, and both input streams, by the same key, and you end up doing all of your lookup-join-type operations locally on one machine. So at high scale it's just a huge performance benefit. It also allows you to manage that state consistently, consistent with the input streams.
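A rough sketch of the pattern Jamie describes with the trading example, keyed state plus a time-driven rule inside a single Flink operator, might look like this. The Position and Quote types, the one-hour timer, and the re-pricing rule are invented for illustration; a real trading engine would obviously be far more involved.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

// Simple stand-in types for the sketch.
class Position { public String symbol; public double askPrice; }
class Quote    { public String symbol; public double price;
                 public Quote(String s, double p) { symbol = s; price = p; } }

// Applied on a keyed stream, e.g.: positions.keyBy(p -> p.symbol).process(new AgingPositionRepricer())
public class AgingPositionRepricer extends ProcessFunction<Position, Quote> {

    private transient ValueState<Position> held;   // the working set lives with the job, not in an external DB

    @Override
    public void open(Configuration parameters) {
        held = getRuntimeContext().getState(
            new ValueStateDescriptor<>("held-position", Position.class));
    }

    @Override
    public void processElement(Position p, Context ctx, Collector<Quote> out) throws Exception {
        held.update(p);
        // If we are still holding this position in an hour, revisit the price.
        ctx.timerService().registerProcessingTimeTimer(
            ctx.timerService().currentProcessingTime() + 60 * 60 * 1000);
        out.collect(new Quote(p.symbol, p.askPrice));
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Quote> out) throws Exception {
        Position p = held.value();
        if (p != null) {
            p.askPrice *= 0.99;                    // liquidate more aggressively over time (invented rule)
            held.update(p);
            out.collect(new Quote(p.symbol, p.askPrice));
        }
    }
}
```

Both the held position and the pending timer are part of Flink's checkpointed state, which is what lets the analysis and the decision live together, fault-tolerantly, in one place.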
If you have the data in an external database and a node fails, then you need to sort of back up in the input stream a little bit, replay a little bit of the data, and you also have to be able to back up your state to a point consistent with all of the inputs; and if you don't manage that state, you cannot do it. So that's one of the core reasons why stream processors need to have state, so they can provide strong guarantees about correctness. >> What about some of the other popular stream processors, when they chose perhaps not to manage state to the same integrated degree that you guys do? What was their thinking; what trade-off did they make? >> It was hard. So I've also worked on previous streaming systems in the past, for a long time actually, and managing all this state in a consistent way is difficult, and so the early-generation systems didn't do it for exactly that reason: let's just put it in the database. But the problem with that is exactly what I just mentioned, and in stream processing we tend to talk about exactly once and at least once; this is actually the source of the problem. If the database is storing your state, you can't really provide these exactly-once type guarantees, because when you replay some data, you back up in the input, you also have to back up the state, and that's not really a database operation that's normally available. So when you manage the state yourself in the stream processor, you can consistently manage the input and the state. So you can get exactly-once semantics in the face of failure. >> And what do you trade, what do you give up, in not having a shared database that has 40 years of maturity and scalability behind it, versus having these micro databases distributed around? Is it the shuffling of...? >> You give up a robust external query interface, for one thing. You give up some things you don't need, like the ability to have multiple writers and transactions and all that stuff; you don't need any of that, because in a stream processor, for any given key there's always one writer, and so you get a much simpler type of database you have to support. What else? Those are the main things you really give up, but I would like to also draw a distinction here between state and storage. Databases are still obviously important; Flink state is not storage, not long-term storage. It's there to hold the data that's currently sort of in flight and mutable, until it's no longer being mutated, and then the best practice would be to emit that as some sort of event, or to a sink, into a database, and then it's stored for the long term. So it's really good to start to think about the difference between what is state and what is storage. Does that make sense? >> I think so. >> So think of, you're counting, you're doing distributed counting, which is an analytics thing, you're counting by key. The count per key is your state until that window closes and it's not going to be mutated anymore; then it's headed into the database. >> Got it. >> Right? >> Yeah. >> But that internal, that sort of in-flight state is what you need to manage in the stream processor. >> Okay, so. >> So it's not a total replacement for a database, it's not that. >> No no no, but this opens up another thread that I don't think we've heard enough of. Jamie, we're going to pause it here. >> Okay.
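The counting example at the end of that exchange, state while the window is open, storage once it closes, might look roughly like this. The socket source and the print sink are placeholders; a real job would read from something like Kafka and write the finished counts to a database.

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class StateVsStorageSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.socketTextStream("localhost", 9999)                     // placeholder event source (one key per line)
           .map(key -> Tuple2.of(key, 1L))
           .returns(Types.TUPLE(Types.STRING, Types.LONG))
           .keyBy(t -> t.f0)
           .window(TumblingProcessingTimeWindows.of(Time.minutes(5)))
           .sum(1)             // the per-key count is Flink *state* while the window is open
           // Once the window fires, the count is final and becomes *storage*:
           // in practice this print() would be a database or message-queue sink.
           .print();

        env.execute("state-vs-storage-sketch");
    }
}
```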
>> 'Cause I hope to pick this thread up with you again. The big surprise from the last two interviews, really, is that Flink is not just about being able to do low-latency, per-event processing; it's that it's a new way of thinking about applications, beyond the traditional stream processors, where it manages state, or data that you want to keep that's not just transient, and that it becomes a new way of building microservices. >> Exactly, yeah. >> So on that note, we're going to sign off from the Data Artisans user conference, Flink Forward. We're here in San Francisco, on the ground at the Kabuki Hotel. (upbeat music)
Xiaowei Jiang | Flink Forward 2017
>> Welcome everyone, we're back at the first Flink Forward Conference in the U.S. It's the Flink User Conference sponsored by Data Artisans, the creators of Apache Flink. We're on the ground at the Kabuki Hotel, and we've heard some very high-impact customer presentations this morning, including Uber and Netflix. And we have the great honor to have Xiaowei Jiang from Alibaba with us. He's Senior Director of Research, and what's so special about having him as our guest is that they have the largest Flink cluster in operation in the world that we know of, and that the Flink folks know of as well. So welcome, Xiaowei. >> Thanks for having me. >> So we gather you have a 1,500-node cluster running Flink. Let's sort of unpack how you got there. What were some of the use cases that drove you in the direction of Flink and the complementary technologies to build with? >> Okay, I'll explain a few use cases. The first use case that prompted us to look into Flink is the classical search ETL case, where basically we need to process all the data that's necessary for the search service. So we looked into Flink about two years ago. The next case was the A/B testing framework, which is used to evaluate how your machine learning models work. Today we are using it for a few other very interesting cases: for example, we are using machine learning to adjust the ranking of search results, to personalize your search results in real time to deliver the best search results for our users. We are also using it to do real-time anti-fraud detection for ads. So these are the typical use cases we are doing. >> Okay, this is very interesting because with the ads, and the one before that, was it fraud? >> Ads is anti-fraud. Before that is machine learning, real-time machine learning. >> So for those, low latency is very important. Now, help unpack that. Are you doing the training for these models in a central location and then pushing the models out close to where they're going to be used for the near-real-time decisions? Or is that all run in the same cluster? >> Yeah, so basically we are doing two things. We use Flink to do real-time feature updates, which change the features in real time, within a few seconds. So for example, when a user buys a product, the inventory needs to be updated. Such features get reflected in the ranking of search results in real time. We also use it to do real-time training of the model itself. This becomes important in some special events. For example, on China Singles Day, which is the largest shopping holiday in China, it generates more revenue than Black Friday in the United States already. On such a day, because things go on sale for almost 50 percent off, the users' behavior changes a lot. So whatever model you trained before does not work reliably. So it's really nice to have a way to adjust the model in real time to deliver the best experience to our users. All of this is actually running in the same cluster. >> OK, that's really interesting. So, it's like you have a multi-tenant solution; that sounds like it's rather resource intensive. >> Yes. >> When you're changing a feature, or features, in the models, how do you go through the process of evaluating them and finding out their efficacy before you put them into production? >> Yeah, so this is exactly the A/B testing framework I just mentioned earlier. >> George: Okay. >> So, we also use Flink to track the metrics, the performance of these models, in real time.
Once these data are processed, we upload them into our OLAP system so we can see the performance of the models in real time. >> Okay. Very, very impressive. So, explain perhaps why Flink was appropriate for those use cases. Is it because you really needed super low latency, or that you wanted a less resource-intensive sort of streaming engine to support these? What made it fit that right sweet spot? >> Yeah, so Search has lots of different products. They have lots of different data processing needs, so when we looked into all these needs, we quickly realized we actually need a compute engine that can do both batch processing and streaming processing. And in terms of streaming processing, we have a few needs. For example, we really need super low latency. In some cases, for example, if a product is sold out and is still displayed in your search results, when users click and try to buy it, they cannot buy it. It's a bad experience. So, the sooner you can get the data processed, the better. So with- >> So near real-time for you means, how many milliseconds does the- >> It's usually like a second. One second, something like that. >> But that's one second end to end, talking to inventory. >> That's right. >> How much time would the model itself have to- >> Oh, it's very short. Yeah. >> George: In the single-digit milliseconds? >> It's probably around that. There are some scenarios that require single-digit milliseconds, like a security scenario; that's something we are currently looking into. So when you do transactions on our site, we need to detect if it's a fraudulent transaction. We want to be able to block such transactions in real time. For that to happen, we really need a latency that's below 10 milliseconds. So when we're looking at compute engines, this is also one of the requirements we think about. So we really need a compute engine which is able to deliver sub-second latency if necessary, and at the same time can also do batch efficiently. So we are looking for solutions that can cover all our computation needs. >> So one way of looking at it is that many vendors and customers talk about elasticity as in the size of the cluster, but you're talking about elasticity or scaling in terms of latency. >> Yes, latency and the way of doing computation. So you can view the security scenario as super strict on the latency requirement, and view batch as the most relaxed version of the latency requirement. We want the full spectrum; each is a part of the full spectrum. It's possible that you can use different engines for each scenario, but that means you are required to maintain more code bases, which can be a headache. And we believe it's possible to have a single solution that works for all these use cases. >> So, okay, last question. Help us understand, for mainstream customers who don't hire the top Ph.D.s out of the Chinese universities, but who have skilled data scientists, just not an unending supply, and aspire to build solutions like this: tell us some of the trade-offs they should consider, given that, you know, the skillset and the bench strength is very deep at Alibaba, and it's perhaps not as widely disseminated or dispersed within a mainstream enterprise. How should they think about the trade-offs in terms of the building blocks for this type of system? >> Yeah, that's a very good question. So we actually thought about this. Initially what we did is we were using the DataSet and DataStream APIs, which are relatively lower-level APIs.
So to develop an application with this is reasonable, but it still requires some skill. So we want a way to make it even simpler, for example, to make it possible for data scientists to do this. And so in the last half a year, we spent a lot of time working on the Table API and SQL support, which basically lets you describe your computation logic or data processing logic using SQL. SQL is used widely, so a lot of people have experience with it. So we are hoping that with this approach, it will greatly lower the threshold for people to use Flink. At the same time, SQL is also a nice way to unify streaming processing and batch processing. With SQL, you only need to write your processing logic once. You can run it in different modes. >> So, okay, this is interesting, because some of the Flink folks say, you know, structured streaming, which is a table construct with dataframes in Spark, is not a natural way to think about streaming. And yet, the Spark guys say, hey, that's what everyone's comfortable with. We'll live with probabilistic answers instead of deterministic answers, because we might have late arrivals in the data. But it sounds like there's a feeling in the Flink community that you really do want to work with tables despite their shortcomings, because so many people understand them. >> So ease of use is definitely one of the strengths of SQL, and the other strength of SQL is that it's very declarative. The user doesn't need to say exactly how to do the computation; it just says what I want to get. This gives the framework a lot of freedom in optimization. So users don't need to worry about hard details to optimize their code. It lets the system do its work. At the same time, I think deterministic results can be achieved in SQL. It just means the framework needs to handle such kinds of things correctly in the implementation of SQL. >> Okay. >> When using SQL, you are not really sacrificing such determinism. >> Okay. We'll have to save this for a follow-up conversation, because there's more to unpack there. But Xiaowei Jiang, thank you very much for joining us and imparting some of the wisdom from Alibaba. We are on the ground at Flink Forward, the Data Artisans conference for the Flink community, at the Kabuki Hotel in San Francisco; and we'll be right back.
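As an aside, the "write the logic once in SQL, run it over streams or over bounded data" idea Xiaowei describes looks roughly like this with Flink's Table API. The exact entry points have shifted between Flink releases, and the Clicks table, its fields, and the data are invented, so treat this purely as a sketch.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class SqlOnStreamsSketch {
    public static void main(String[] args) {
        // Switching inStreamingMode() to inBatchMode() runs the same query over bounded data.
        TableEnvironment tableEnv = TableEnvironment.create(
            EnvironmentSettings.newInstance().inStreamingMode().build());

        // An inline table of click events; in practice this would be a Kafka- or
        // filesystem-backed table defined with CREATE TABLE.
        tableEnv.executeSql(
            "CREATE TEMPORARY VIEW Clicks AS " +
            "SELECT * FROM (VALUES ('alice', '/cart'), ('alice', '/pay'), ('bob', '/home')) " +
            "AS t(userId, url)");

        // The processing logic is written once, declaratively.
        tableEnv.executeSql(
            "SELECT userId, COUNT(url) AS clicks FROM Clicks GROUP BY userId")
            .print();
    }
}
```

The appeal Xiaowei points to is that the GROUP BY query doesn't say how to maintain the counts; the planner decides, and the same statement can be run in batch mode over historical logs.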
Sean Hester | Flink Forward 2017
>> Welcome back. We're at Flink Forward, the user conference for the Flink community, put on by data Artisans, the creators of Flink. We're on the ground at the Kabuki Hotel in Pacific Heights in San Francisco. And we have another special guest from BetterCloud, which is a management company. We have Sean Hester, Director of Engineering. And Sean, why don't you tell us, what brings you to Flink Forward? Give us some context for that. >> Sure, sure. So a little over a year ago we kind of started restructuring our application. We had a spike in our vision, we wanted to go a little bit bigger. And at that point we had done some things that were suboptimal, let's say, as far as our approach to the way we were generating operational intelligence. So we wanted to move to a streaming platform. We looked at a few different options and, after pretty much a bake-off, Flink came out on top for us. And we've been using it ever since. It's been in production for us for about six months. We love it, we're big fans, we love their roadmap, so that's why we're here. >> Okay, so let's unpack that a little more. In the bake-off, what were the... So your use case is management. But within that bake-off, what were the criteria that surfaced as the highest priority? >> So for us, we knew we wanted to be working with something that was kind of the latest generation of streaming technology. Something that had basically addressed all of the Google MillWheel paper's big problems: things like managing backpressure, how do you manage checkpointing and restoring of state in a distributed streaming application? Things that we had no interest in writing ourselves after digging into the problem a little bit. So we wanted a solution that would solve those problems for us, and this seemed like it had a really solid community behind it. And again, Flink came out on top. >> Okay, so now, understanding sort of why you chose Flink, help us understand BetterCloud's service. What do you offer customers, and how do you see that evolving over time? >> Sure, sure. So you've been calling us a management company; we provide tooling for IT admins to manage their SaaS applications. So things like the Google Suite, or Zendesk, or Slack. And we give them kind of that single point of entry, the single pane of glass to see everything, see all their users in one place, what applications are provisioned to which users, et cetera. And so we literally go to the APIs of each of our partners that we provide support for, gather data, and from there it starts flowing through the stream as a set of change events, basically. Hey, this user's had a title update or a manager update. Is that meaningful for us in some way? Do we want to run a particular workflow based on that event, or is that something that we need to take into account for particular operational intelligence? >> Okay, so you dropped in there something really concrete. A change event for the role of an employee. That's a very application-specific piece of telemetry that's coming out of an app. Very different from saying, well, what's my CPU utilization, which'll be the same across all platforms. >> Correct. >> So how do you account for... applications that might have employees in one SaaS app and also employees in a completely different SaaS app, and they emit telemetry or events that mean different things? How do you bridge that? >> Exactly.
So we have a set of teams that's dedicated to just the role of getting data from the SaaS applications and emitting it into the overall BetterCloud system. After that there's another set of teams that's basically dedicated to providing that central, canonical view of a user or group or a... An asset, a document, et cetera. So all of those disparate models that might come in from any given SaaS app get normalized by that team into what we call our canonical model. And that's what flows downstream to teams that I lead, to have operational intelligence run on them. >> Okay, so just to be clear, for our mainstream customers who aren't rocket scientists like you-- (laughs) When they want to make sense of this, what you're telling them is they don't have to be locked into the management solution that comes from a cloud vendor, where they're going to harmonize all their telemetry and their management solutions to work seamlessly across their services and the third-party services that are on that platform. What you're saying is you're putting that commonality across apps that you support on different clouds. >> Yes, exactly. We provide kind of the glue, or the homogenization, necessary to make that possible. >> Now this may sound arcane, but being able to put in place that commonality implies that there is overlap, complete overlap, for that information, for how to take into account and manage an employee onboarding here and one over there. What happens when, in applications, unlike in the hardware where it's obviously the same no matter what you're doing, what happens in applications where you can't find a full overlap? >> Well, it's never a full overlap. But there is typically a very core set of properties for a user account, for example, that we can work with regardless of what SaaS application we might be integrating with. But we do have special areas, like metadata areas, within our events that are dedicated to the original data fresh from the SaaS application's API, and we can do one-off operations specifically on that SaaS app data. But yeah, in general there's a lot of commonality between the way people model a user account or a distribution group or a document. >> Okay, interesting. And so the role of streaming technology here is to get those events to you really quickly, and then for you to apply your rules to identify a root cause, or even to remediate, either by advising a person, an administrator, or automatically. >> Yes, exactly. >> And plans for adding machine learning to this going forward? >> Absolutely, yeah. So one of the big asks, as we started casting this vision in front of some of our core customers, was basically: I don't know what normal is. You figure out what normal is and then let me know when something abnormal happens. Which is a perfect use case for machine learning. So we definitely want to get there. >> Running steady state, learning the steady state, then finding anomalies. >> Exactly, exactly. >> Interesting, okay. >> Not there yet, but it's definitely on our roadmap. >> And then what about management companies that might say, we're just going to target workloads of this variety, like a big data workload, where we're going to take Kafka, Spark, Hive, and maybe something that predicts and serves, and we're just going to manage that. What trade-offs do they get to make that are different from what you get to make? >> I'm not sure I quite understand the question you're getting at.
>> It's where they can narrow the scope of the processes they're going to model, or the workloads they're going to model; where it's, say, just big data workloads, there's going to be some batch and interactive stuff, and they're only going to cover a certain number of products because those are the only ones that fit into that type of workload. >> Oh, I gotcha, gotcha. So we kind of designed our roadmap from the get-go knowing that one of our competitive advantages was going to be how quickly we can support additional SaaS applications. So we've actually baked into most of our architecture stuff that's very configuration-driven, let's say, versus hard-coded, and that allows us to very quickly onboard new SaaS apps. So I think the value of being able to manage, provision, and run workloads against the 20 different SaaS apps that an admin in a modern workplace might be working with is just so valuable that I think that's going to win the day eventually. >> Single pane of glass, not at the infrastructure level, but at the application level. >> Exactly, exactly. >> Okay. All right, we've been with Sean Hester of BetterCloud, and we will be right back. We're at the Flink Forward event, sponsored by data Artisans for the Flink user community, the first ever conference in the US for the Flink community. And we'll be back shortly. (electronic music)
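To ground the canonical-model idea from this conversation, a normalization step conceptually looks something like the sketch below. Every name here is invented for illustration; it is not BetterCloud's actual schema or code.

```java
import java.util.HashMap;
import java.util.Map;

// A canonical user-change event: the common core every connector must fill in,
// plus the untouched source payload kept as metadata for app-specific, one-off rules.
class CanonicalUserEvent {
    public String userId;
    public String email;
    public String title;
    public String sourceApp;                      // e.g. "zendesk", "slack"
    public Map<String, Object> rawSourcePayload;  // original SaaS-specific fields
}

public class ZendeskNormalizer {
    // Configuration-driven in a real system; hard-coded here to keep the sketch short.
    public static CanonicalUserEvent normalize(Map<String, Object> zendeskUser) {
        CanonicalUserEvent e = new CanonicalUserEvent();
        e.userId = String.valueOf(zendeskUser.get("id"));
        e.email = (String) zendeskUser.get("email");
        e.title = (String) zendeskUser.get("role");   // per-connector field mapping
        e.sourceApp = "zendesk";
        e.rawSourcePayload = new HashMap<>(zendeskUser);
        return e;
    }
}
```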
Chinmay Soman | Flink Forward 2017
>> Welcome back, everyone. We are on the ground at the data Artisans user conference for Flink. It's called Flink Forward. We are at the Kabuki Hotel in lower Pacific Heights in San Francisco. The conference kicked off this morning with some great talks by Uber and Netflix. We have the privilege of having with us Chinmay Soman from Uber. >> Yes. >> Welcome, Chinmay, it's good to have you. >> Thank you. >> You gave a really, really interesting presentation about the pipelines you're building and where Flink fits, but you've also said there's a large deployment of Spark. Help us understand how Flink became a mainstream technology for you, where it fits, and why you chose it. >> Sure. About one year back, when we were starting to evaluate what technology makes sense for the problem space that we are trying to solve, which is neural dynamics, we observed that Spark's stream processing is actually more resource intensive than some of the other technologies we benchmarked. More specifically, it was using more memory and CPU at that time. That's one. I actually came from the Apache Samza world; I was on the Samza team at LinkedIn before I came to Uber. We had in-house expertise on Samza, and I think reliability was the key motivation for choosing Samza. So we were building on top of Apache Samza for almost the last one and a half years. But then we hit the scale where Samza, we felt, was lacking. So with Samza, it's actually tied into Kafka a lot. You need to make sure your Kafka scales in order for the stream processing to scale. >> In other words, the topics and the partitions of those topics: you have to keep the physical layout of those in mind at the message queue level, in line with the stream processing. >> That's right. The parallelism is actually tied to the number of partitions in Kafka. Furthermore, if you have a multi-stage pipeline, where one stage processes data and sends output to another stage, all these intermediate stages, today, again go back to Kafka. So if you want to do a lot of these use cases, you actually end up creating a lot of Kafka topics, and the I/O overhead on a cluster shoots up exponentially. >> So when creating topics, or creating consumers that do something and then output to producers, if you do too many of those things, you defeat the purpose of low latency because you're storing everything. >> Yeah. The upside of it is, it is more robust, because if you suddenly get a spike in your traffic, your system is going to handle it because Kafka buffers that spike. It gives you a very reliable platform, but it's not cheap. So that's why we're looking at Flink. In Flink, you can actually build a multi-stage pipeline and have in-memory queues instead of writing back to Kafka, so it is fast and you don't have to create multiple topics per pipeline. >> So, let me unpack that just a little bit to be clearer. The in-memory queues give you, obviously, better I/O. >> Yes. >> And if I understand correctly, that can absorb some of the backpressure? >> Yeah, so backpressure is interesting. If you have everything in Kafka and no in-memory queues, there is no backpressure, because Kafka is a big buffer; it just keeps running. With in-memory queues, there is backpressure. Another question is, how do you handle this? Going back to Samza systems, they actually degrade and can't recover once they are in backpressure. But Flink, as we've seen, slows down consuming from Kafka, but once the spike is over, once you're over that hill, it actually recovers quickly.
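A sketch of the contrast Chinmay draws: in a Flink job, the hops between stages below are in-memory and network channels inside one job, rather than separate Kafka topics that every intermediate stage writes back to. The stage names, source, and transformations are invented for illustration.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MultiStagePipelineSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.socketTextStream("localhost", 9999)        // stand-in for the single Kafka input topic
           .map(line -> line.toLowerCase())            // stage 1: parse / normalize
           .filter(line -> !line.isEmpty())            // stage 2: drop bad records
           .keyBy(line -> line)
           .map(line -> line + " | enriched")          // stage 3: per-key enrichment / aggregation
           // Only the final result leaves the job; intermediate stages never
           // round-trip through Kafka, so no extra topics and no extra broker I/O.
           .print();                                   // stand-in for the single output sink

        env.execute("multi-stage-pipeline");
    }
}
```

If a downstream stage slows down, backpressure propagates through those internal channels and the source is throttled; once the spike passes, the job catches up, which is the recovery behavior described above.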
It is able to sustain heavy spikes. >> Okay, so this goes to your issues with keeping up with the growth of data... >> That's right. >> You know, the system, there are multiple levels of elasticity, and then resource intensity. Tell us about that end, and the desire to get as many jobs as possible out of a certain level of resources. >> So, today, we are a platform where people come in and say, "Here's my code." Or, "Here's my SQL that I want to run on your platform." In the old days, they were telling us, "Oh, I need 10 gigabytes for a container," and that they need this many CPUs, and that really limited how many use cases we onboarded and made our hardware footprint pretty expensive. So we need the pipeline, the infrastructure, to be really memory efficient. What we have seen is that memory is the bottleneck in our world, more so than CPU. A lot of applications consume from Kafka and actually buffer locally in each container, and they do that in local memory, in the JVM memory. So we need the memory component to be very efficient, and we can pack more jobs on the same cluster if everyone is using less memory. That's one motivation. The other thing, for example, that Flink does and Samza also does, is make use of a RocksDB store, which is a local persistent-- >> Oh, that's where it gets the state management. >> That's right, so you can offload from memory onto the disk-- >> Into a proper database. >> Into a proper database, and you don't have to cross a network to do that, because it's sitting locally. >> Just to elaborate on what might seem like an arcane topic: if it's residing locally, then anything it's going to join with has to also be residing locally. >> Yeah, that's a good point. You have to be able to partition your inputs and your state in the same way, otherwise there's no locality. >> Okay, and you'd have to shuffle stuff around the network. >> And more than that, you'd need to be able to recover if something happens, because there's no replication for this state. If the hard disk on that node crashes, you need to recreate that cache from somewhere. So either you go back and read from Kafka, or you store that cache somewhere. Flink actually supports this out of the box, and it snapshots the RocksDB state into HDFS. >> Got it, okay. It's more resilient-- >> Yes. >> And more resource efficient. So, let me ask one last question. Mainstream enterprises, or at least the very largest ones, have been trying to wrestle their arms around some open source projects. Very innovative, the pace of innovation is huge, but it demands a skillset that seems to be most resident in large consumer internet companies. What advice do you have for them when they aspire to use the same technologies that you're talking about to build new systems, but they might not have the skills? >> Right, that's a very good question. I'll try to answer in the way that I can. I think the first thing to do is understand your scale. Even if you're a big, large banking corporation, you need to understand where you fit in the industry ecosystem. If it turns out that your scale isn't that big and you're using it for internal analytics, then you can just pick the off-the-shelf pipelines and make them work. For example, if you don't care about multi-tenancy, if your hardware spend is not that much, actually almost anything might work. The real challenge is when you pick a technology and make it work for a large set of use cases and you want to optimize for cost.
That's where you need a huge engineering organization. So in simpler words, if the extent of your use cases is not that big, pick something that has a lot of support from the community. Most common things just work out of the box, and that's good enough. But if you're doing a lot of complicated things, like real-time machine learning, or your scale is in billions of messages per day, or terabytes of data per day, then you really need to make a choice: whether you invest in an engineering organization that can really understand these use cases, or you go to companies like Databricks. Get support from Databricks, or... >> Or maybe a cloud vendor? >> Or a cloud vendor, or things like Confluent, which provides Kafka support, things like that. I don't think there is one answer. For us, for example, the reason we chose to build an engineering organization around this is that our use cases are immensely complicated and not really seen before, so we had to invest in this technology. >> Alright, Chinmay, we're going to leave it at that, and hopefully keep the dialogue going-- >> Sure. >> offline. So, we'll be back shortly. We're at Flink Forward, the data Artisans user conference for Flink. We're on the ground at the Kabuki Hotel in downtown San Francisco, and we'll be right back.
Dean Wampler, Ph.D. | Flink Forward 2017
>> Welcome everyone to the first ever U.S. user conference of Apache Flink, sponsored by data Artisans, the creators of Flink. The conference kicked off this morning with some very high-profile customer use cases, including Netflix and Uber, which were quite impressive. We're on the ground at the Kabuki Hotel in San Francisco and our first guest is Dean Wampler, VP of fast data engineering at Lightbend. Welcome, Dean. >> Thank you. Good to see you again, George. >> So, big-picture context setting: Spark exploded on the scene, blew away the expectations even of its creators with its speed and deeply integrated libraries, and essentially replaced MapReduce really quickly. >> Yeah. >> So what is behind Flink's rapid adoption? >> Right, I think it's an interesting story, and if you'd asked me a year ago, I probably would've said, well, I'm not sure we really need Flink, Spark seems to meet all our needs. But I pretty quickly changed my mind as I got to know about Flink, because it is a broad ecosystem, there's a wide variety of problems people are trying to solve, and what Flink is doing very well is solving low-latency streaming, but still at scale, like Spark. Whereas Spark is still primarily a micro-batch model, so it has longer latency. And Flink has been on the cutting edge too, of embracing some of the more advanced streaming scenarios, like proper handling of late-arriving data, windowing semantics, things like this. So it's really filling an important niche, but a fairly broad niche that people have. And also, not everybody needs the full-featured capabilities of Spark like batch analytics or whatever, and so having one tool that's focused just on processing streams is often a good idea. >> So would that relate to a smaller surface area to learn and to administer? >> I think it's a big part of it, yeah. I mean, Spark is incredibly well engineered and it works very well, but it's a bigger system, so there's going to be more to run. And there is something very attractive about having a more focused tool that has, you know, fewer things to break, basically. >> You mention sort of lower latency and a few fewer bells and whistles. Can you give us some examples of use cases where you wouldn't need perhaps all of the integrated libraries of Spark, or the big footprint that gives you all that resilience and, you know, the functional programming that lets you sort of recreate lineage? Tell us sort of how a customer who's approaching this should pick the trade-offs. >> Right. Well, normally when you have a low-latency problem, it means you have less time to do work, so you tend to do simpler things in that time frame. But, just to give you a really interesting example, I was talking with a development team at a bank recently that does credit card authorizations. You click "buy" on a website and there's maybe a few hundred milliseconds when the user is expecting a reply, right. But it turns out there's so many things going on in that loop, from browser to servers and back, that they only have about ten milliseconds, when they get the data, to make a decision about whether this looks fraudulent or legit. So ten milliseconds is fairly narrow; that means you have to have your models already done and ready to go, and a quick way to actually apply them, you know, take this data, ask the model is this okay, and get a response.
So, a lot of it is kind of boiling down to that. It's either, I would say, one of two things: it's either I'm doing basic filtering and transforming of data, like raw data coming into my environment; or I have some maybe more sophisticated analytics that are running behind the scenes, and then in real time, so to speak, data is coming in and I'm asking questions against those models about this data, like authorizing credit cards. >> Okay, so to recap, the low latency means you have to have perhaps scored your models already. Okay, so trained and scored in the background, and then, with this low-latency solution you can look up, key-based lookup I guess, to an external store, okay. So how is Lightbend making it simple to put, what essentially has to be for any pipeline it appears, multiple products together seamlessly? >> That is the challenge. I mean, it would be great if you could just deploy Flink, and that was the only thing you needed, or Kafka, or pick any one of them. But of course, the reality is, we always have to integrate a bunch of tools together, and it's that integration that's usually the hard part. How do I know why this thing's misbehaving, when maybe it's something upstream that's misbehaving? That sort of thing. So, we've been surveying the landscape to understand, first of all, what are the tools that seem to be most mature, most vibrant as a community, that address the variety of scenarios people are trying to deal with, some of which we just discussed. And what are the kind of integration problems that you have to solve to make these reliable systems? So we've been building a platform, called the Fast Data Platform, that's approaching its first beta, that is designed to help solve a lot of those problems for you, so you can focus on your actual business problems.
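To make the scoring pattern described above concrete, here is a small, hypothetical sketch in Flink's Java API: the model is trained ahead of time and loaded once per task, so each event only pays for an in-memory score rather than a network call. The FraudModel class, the risk threshold, and the example amounts are all made up for illustration; in practice the model would be loaded from a model store or file.

```java
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ScoringSketch {

    // Hypothetical pre-trained model: anything with a cheap, in-memory score() call.
    static class FraudModel {
        double score(double amount) { return amount > 1_000 ? 0.9 : 0.1; }
    }

    // The model is loaded once when the task starts, not per event, so each
    // authorization stays within a tight latency budget.
    static class ScoreTransaction extends RichMapFunction<Double, String> {
        private transient FraudModel model;

        @Override
        public void open(Configuration parameters) {
            model = new FraudModel(); // in practice: load the trained model built offline
        }

        @Override
        public String map(Double amount) {
            double risk = model.score(amount);
            return risk > 0.5 ? "FLAG " + amount : "OK " + amount;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(12.50, 4_200.00, 87.10) // stand-in transaction amounts
           .map(new ScoreTransaction())
           .print();
        env.execute("fraud-scoring-sketch");
    }
}
```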
>> Yeah, that's a really interesting question, and there are some particular challenges, because a lot of companies will migrate to the Cloud in a piecemeal fashion, so they've got a sort of hybrid deployment scenario with things On-Premise and in the Cloud, and so forth. One of the things you mentioned that's pretty important is, I've got all this data coming in, how do I capture it reliably? So, tools like Kafka are really good for that, and Pravega, that Strachan from EMC mentioned, is sort of filling the same need: I need to capture stuff reliably, serve downstream consumers, and make it easy to do analytics over this stream, which looks a lot different than a traditional database, where it's kind of data at rest; it's not static, but it's not moving. So, that's one of the things you have to do well, and then figure out how to get that data to the right consumer, and account for all of the latencies. Like, if I needed that ten-millisecond credit card authorization, but I had data split over my On-Premise and my Cloud environment, you know, that would not work very well. So there's a lot of that kind of architecture of data flow that becomes really important. >> Do you see Lightbend offering that management solution that enforces SLAs, or do you see sourcing that technology from others and then integrating it tightly with the particular software building blocks that make up the pipeline? >> It's a little of both. We're sort of in the early stages of building services along those lines. Some of the technology we've had for a while; our Akka middleware system, and the streaming API on top of it, would be a really good basis for that kind of platform, where you can think about SLA requirements and trading off performance, or whatever, versus getting answers in a reasonable time, good recovery and error scenarios, stuff like that. So it's all early days, but we are thinking very hard about that problem, because ultimately, at the end of the day, that's what customers care about. They don't care about Kafka versus Spark, or whatever. They just care about, I've got data coming in, I need an answer in ten milliseconds or I lose money, and that's the kind of thing that they want you to solve for them, so that's really what we have to focus on. >> So, last question before we have to go. Do you see potentially a scenario where there's one type of technology on the edge, or many types, and then something more dominant in the Cloud, where basically you do more training, model training, and out on the edge you do the low-latency predictions or prescriptions? >> That's pretty much the architecture that has emerged. I'm going to talk a little bit about this today, in my talk, where, like we said earlier, I may have a very short window in which I have to make a decision, but it's based on a model that I have been building for a while and I can build in the background, where I have more tolerance for the time it takes. >> Up in the Cloud? >> Up in the Cloud. Actually, this is kind of independent of the deployment scenario, but it could be both like that, so you could have something that is closer to the consumer of the data, maybe in the Cloud, and deployed in Europe for European customers, but it might be working with systems back in the U.S.A. that are doing the heavy lifting of building these models and so forth. We live in such a world where you can put things where you want, you can move things around, you can glue things together, and a lot of times it's just knowing what's the right combination of stuff.
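As a rough illustration of that capture layer, a Flink job that drains a Kafka topic might look like the sketch below. Connector class and package names vary with the Flink and Kafka versions in use (this assumes the 0.10 connector available around the time of this conference); the broker address, topic name, and consumer group are placeholders.

```java
import java.util.Properties;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaIngestSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Kafka acts as the durable capture layer; Flink is one of possibly many
        // downstream consumers that can re-read the same stream independently.
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker-1:9092"); // placeholder address
        props.setProperty("group.id", "ingest-sketch");          // placeholder group

        env.addSource(new FlinkKafkaConsumer010<>("events", new SimpleStringSchema(), props))
           .map(line -> "ingested: " + line)
           .print();

        env.execute("kafka-ingest-sketch");
    }
}
```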
>> Alright, Dean, it was great to see you and to hear the story. It sounds compelling. >> Thank you very much. >> So, this is George Gilbert. We are on the ground at Flink Forward, the data Artisans user conference for the Flink product, and we will be back after this short break.