Itamar Ankorion, Attunity | BigData NYC 2017


 

>> Announcer: Live from Midtown Manhattan, it's theCUBE, covering Big Data New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors.

>> Okay, welcome back, everyone, to our live special CUBE coverage in New York City, in Manhattan. We're here in Hell's Kitchen for theCUBE's exclusive coverage of our Big Data NYC event and Strata Data, which used to be called Strata Hadoop, which used to be Hadoop World. Our event, Big Data NYC, is in its fifth year; we gather every year to see what's going on in the big data world and also produce all of our great research. I'm John Furrier, the co-host of theCUBE, with Peter Burris, head of research. Our next guest, Itamar Ankorion, is the Chief Marketing Officer at Attunity. Welcome back to theCUBE, good to see you.

>> Thank you very much. It's good to be back.

>> We've been covering Attunity for many, many years. We've had many conversations; you guys have had great success in big data, so congratulations on that. But the world is changing, and we're seeing data integration, as we've been calling it for multiple years, is not going away; people need to integrate more. But with cloud, there's been a real focus on accelerating the scale component, with an emphasis on ease of use, data sovereignty, data governance. All these things are coming together, and the cloud has amplified them. In the big data world, the mandate has pretty much been, listen, get moving or you're out of business, and a lot of people have been reacting. What's your response at Attunity these days, because you have successful piece parts with your product offering? What's the big update for you guys with respect to this big growth area?

>> Thank you. First of all, cloud data lakes have been a major force changing the data landscape and the data management landscape for enterprises. For the past few years, I've been working closely with some of the world's leading organizations across different industries as they deploy the first, and then the second and third, iterations of their data lake and big data architectures. One of the things we're all seeing, of course, is the move to cloud, whether enterprises move completely to the cloud and build their data lakes there, or have a hybrid environment where part of the data lake and data analytics environment is on prem and part of it is in the cloud. The other thing we're seeing is that enterprises are starting to mix more of the traditional data lake, the cloud as the platform, and streaming technologies as the way to enable all the modern data analytics they need. That's what we have been focusing on: enabling them to use data across all these different technologies, where and when they need it.

>> So the sum of the parts is worth more if it's integrated together seems to be the positioning, which is great; it's what customers want, make it easier. What is the hard news that you guys have, 'cause you have some big news? Let's get to the news real quick.

>> Thank you very much. Today we announced, and we're very excited about it, a new big release of our data integration platform. Our modern platform brings together Attunity Replicate, Attunity Compose for Hive, and Attunity Enterprise Manager, or AEM.
These are products that we've evolved significantly, and invested a lot in over the last few years, to enable organizations to make data available in real time across all these different platforms, and then turn this data into data that's ready for analytics, especially in Hive and Hadoop environments, on prem and now also in the cloud. Today we've announced a major release with a lot of enhancements across the entire product line.

>> Some people might know you guys for the Replicate piece. I know this announcement was 6.0, but you guys have the other piece parts to this; really it's about modernization of kind of old-school techniques. That's really been the driver of your success. What specifically in this announcement makes it really work well for people who move in real time and want good data access? What's the big aha for the customers out there with Attunity on this announcement?

>> That's a great question, thank you. First of all, we're bringing it all together. As you mentioned, over the past few years Attunity Replicate has emerged as the choice of many Fortune 100 and other companies who are building modern architectures and moving data across different platforms, to the cloud, to their lakes, and doing it in a very efficient way. One of the things we've seen is that they needed the flexibility to adapt as they go through their journey, to adopt different platforms, and what we gave them with Replicate was the flexibility to do so. We give them the flexibility, we give them the performance to get the data, and the efficiency to move only the changes to the data as they happen, and to do that in a real-time fashion. Now, that's all great, but once the data gets to the data lake, how do you then turn it into valuable information? That's when we introduced Compose for Hive, which we talked about in our last session a few months ago, and which basically takes the next stage in the pipeline: picking up the incremental, continuous data that is fed into the data lake and turning it into operational data stores, historical data stores, data that's basically ready for analytics. What we've done with this release, and what we're really excited about, is putting all of these together in a more integrated fashion, with Attunity Enterprise Manager on top to help manage larger-scale environments, so customers can move faster in deploying these solutions.

>> As you think about the role that Attunity's going to play over time, though, it's going to end up being part of a broader solution for how you handle your data. Imagine for a second the patterns that your customers are deploying. What is Attunity typically being deployed with?

>> That's a great question. First of all, we're definitely part of a large ecosystem for building the new data architecture, the new data management, with data integration being more than ever a key part of that bigger ecosystem, because what enterprises actually have today is more islands, more places where the data needs to go, and to your point, more patterns in which the data moves. One of those patterns that we've seen significantly increase in demand and deployment is streaming. Where data used to be batch, now we're all talking about streaming. Kafka has emerged as a very common platform, but not only Kafka. If you're on Amazon Web Services, you're using Kinesis. If you're in Azure, you're using Azure Event Hubs. You have different streaming technologies. That's part of how this has evolved.
>> How is that a challenge? 'Cause you just bring up a good point. I mean, the big trend is that customers want either the same code base on prem and in the hybrid, which means the gateway, if you will, to the public cloud, or they want to move workloads between different clouds. Multi-cloud seems to be the Holy Grail; we've identified it, and we are taking the position that multi-cloud will be the preferred architecture going forward. Not necessarily this year, but it's going to get there. But as a customer, I don't want to have to retrain employees and invest in skill development separately for Amazon, Azure, and Google; each one has its own different path, as you mentioned. How do you talk to customers about that? Because they might be like, whoa, I want it, but how do I work in that environment? You guys have a solution for that?

>> We do, and in fact, to your point, we've seen the adoption of multiple clouds, and even if that adoption is staged, we're seeing more and more customers actually referring to the term lock-in with respect to the cloud. Do we put all the eggs in one cloud, or do we allow ourselves the flexibility to move around, use different clouds, and also mitigate our risk in that respect? What we've done from that perspective is, first of all, when you use the Attunity platform, we take away all the development complexity. In the Attunity platform it is very easy to set up your data flows, your data pipelines, and it's all common and consistent. Whether you're working on prem, on Amazon Web Services, on Azure, on Google, or on other platforms, it all looks and feels the same. So first of all you solve the issue of diversity, but also the complexity, because one of the big things Attunity has focused on is reducing complexity, allowing you to configure these data pipelines without development effort and resources.

>> One of the challenges, or one of the things you typically do to take complexity out, is you do a better job of design up front. And I know that Attunity's got a tool set that starts to address some of these things. Take us a little bit through how your customers are starting to think in terms of designing flows, as opposed to just cobbling things together in a bespoke way. How is that starting to change as customers gain experience with large data sets, the need to aggregate them, and the ability to present them to developers in different ways?

>> That's a great point, and again, one of the things we've focused on is making the process of developing or configuring these different data flows easy and modular. First, in Attunity you can set up different flows in different patterns, and you can then make them available to others for consumption. Some create the data ingestion; some create the data ingestion and then a data transformation with Compose for Hive. And with Attunity Enterprise Manager we've now also introduced APIs that allow you to create your own microservices, consuming and using the services enabled by the platform, so we provide more flexibility to put all these different solutions together.
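To make that microservices-on-an-API idea concrete, here is a minimal sketch of a small operations service built against a management REST API of this kind, written in Python with the requests library. The conversation doesn't document AEM's actual endpoints, so every route, field name, and the auth scheme below are illustrative assumptions, not the real AEM API.

```python
# Hypothetical sketch of a microservice polling a data-integration
# manager's REST API. Routes, payload fields, and auth are assumed
# for illustration; they are not the documented AEM endpoints.
import requests

BASE_URL = "https://aem.example.com/api/v1"          # assumed endpoint
SESSION = requests.Session()
SESSION.headers["Authorization"] = "Bearer <token>"  # assumed auth scheme

def list_tasks(server: str) -> list:
    """Fetch the replication tasks defined on a server (assumed route)."""
    resp = SESSION.get(f"{BASE_URL}/servers/{server}/tasks", timeout=30)
    resp.raise_for_status()
    return resp.json()

def resume_stalled_tasks(server: str) -> None:
    """Resume any task that is not actively replicating (assumed fields)."""
    for task in list_tasks(server):
        if task.get("state") != "RUNNING":
            SESSION.put(f"{BASE_URL}/servers/{server}/tasks/"
                        f"{task['name']}?action=resume", timeout=30)

if __name__ == "__main__":
    resume_stalled_tasks("replicate-prod-01")  # hypothetical server name
```

The design point the sketch is meant to capture is the one made above: the platform does the data movement, and thin services compose its capabilities through the API instead of re-implementing pipeline logic.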
>> What's the biggest thing that you see from a customer standpoint, from a problem that you solve? If you had to lay it out, you know the classic: hey, what problem do you solve? 'Cause there are many, so take us through the key problem, and then any secondary issues you can address for customers, since that seems to be the way the conversation starts. What are the key problems that you solve?

>> I think one of the major problems we solve is scale. Our customers that are deploying data lakes are trying to deploy and use data that is coming not from five or 10 or even 50 data sources; we work at hundreds, going on thousands, of data sources now. That in itself represents a major challenge to our customers, and we're addressing it by dramatically simplifying the process of setting those up and making it very repeatable and very easy, and then providing the management facility, because when you have hundreds or thousands, management becomes a bigger issue as you operationalize it. We invested a lot in a management facility for those, from monitoring to control to security. How do you secure it? The data lake is used by many different groups, so how do we allow each group to see and work only on what belongs to that group? That's part of it, too. So again, scale is the major thing there. The other one is real-timeliness. We talked about the move to streaming, and a lot of it is in order to enable streaming analytics, real-time analytics. That's only as good as your data, so you need to capture data in real time. And that, of course, has been our claim to fame for a long time, being the leading independent provider of CDC, change data capture technology. What we've done now, and expanded significantly with the new release, version six, is creating universal database streaming.

>> What is that?

>> We take databases, all the enterprise databases, and we turn them into live streams. By the way, the most common way customers have brought data into the lake from a database has been Sqoop. And Sqoop is great, easy software to use from an open-source perspective, but it's scripting and batch. So you're building your new modern architecture with tools that are effectively scripting and batch. What we do with CDC is enable you to take a database, and instead of the database being something you come to periodically to read, we actually turn it into a live feed: as the data changes in the database, we stream the changes and make them available across all these different platforms.
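To make the contrast concrete, here is a toy Python sketch of the change-data-capture idea just described: the source emits an ordered feed of change events, and a consumer keeps a replica current by applying only those events, with no periodic full re-read. It is a generic illustration of CDC, not Attunity Replicate's internals.

```python
# A toy illustration of change data capture: instead of re-reading the
# whole table on a schedule (Sqoop-style batch), the source emits an
# ordered stream of change events and consumers apply only the deltas.
from dataclasses import dataclass
from typing import Dict, Iterator, Optional

@dataclass
class ChangeEvent:
    op: str              # "insert" | "update" | "delete"
    key: int             # primary key of the affected row
    row: Optional[dict]  # row image for insert/update, None for delete

def apply_changes(target: Dict[int, dict], feed: Iterator[ChangeEvent]) -> None:
    """Keep a replica in sync by applying only the changes, in order."""
    for ev in feed:
        if ev.op == "delete":
            target.pop(ev.key, None)
        else:                       # insert or update
            target[ev.key] = ev.row

# Three events keep the replica current without ever reloading the table.
replica: Dict[int, dict] = {}
apply_changes(replica, iter([
    ChangeEvent("insert", 1, {"order": "A-100", "status": "open"}),
    ChangeEvent("update", 1, {"order": "A-100", "status": "shipped"}),
    ChangeEvent("insert", 2, {"order": "A-101", "status": "open"}),
]))
print(replica)  # replica reflects the live source, delta by delta
```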
>> Changes the definition of what live streaming is. We're live streaming theCUBE, we're data. We're data streaming, and you get great data. So here's the question for you. This is a good topic, I love this topic; Pete and I talk about this all the time, and it's been addressed in the big data world, but you can see the pattern going mainstream in society, globally and geopolitically: batch processing versus data in motion, in real time. Streaming brings up this use case for the end customer. This is the way they've done it before: certainly they'll store things in data lakes, that's not going to go away, you're going to store stuff, but the real gain is in motion.

>> Itamar: Correct.

>> How do you describe that to a customer when you go out and say, hey, you know, you've been living in a batch world, but wake up to the real world called real time? How do you get them to align with it? Some people get it right away, I see that; some people don't. How do you talk about that? Because that seems to be a real cultural thing going on right now, or a question of operational readiness on the customer's side. Can you just talk through your feeling on that?

>> First of all, this often gets lost in translation. We see quite a few companies, and even IT departments, where when the business tells them we need real time, what they understand from it is: when you ask for the data, the response will be immediate. You get real-time access to the data, but the data is from last week. So you get real-time access, but to last week's data. What we try to do is say, wait a second, when you say real time, what does real time mean? And we start to work through the difference between using last week's data, or yesterday's data, and the real-time data, and that makes a big difference. We actually see that today the ability to act on the real-time data is the frontier of competitive differentiation. That's what makes a customer experience better, that's what makes the business more operationally efficient than the competition.

>> It's the data, not so much the process of what they used to do. Their version of real time is, I responded to you pretty quickly.

>> Exactly. The other thing that's interesting is that change data capture is becoming a critical component of the modern data architecture. Traditionally we used to talk about it as a different type of tool and technology; now CDC itself is becoming a critical part of the architecture, and the reason is that it answers a lot of fundamental needs that are now becoming critical. One is the need for real-time data. The other one is efficiency. If you're moving to the cloud, and we talked about this earlier, if your data lake is going to be in the cloud, there's no way you're going to reload all your data, because the bandwidth is going to get in the way. So you have to move only the delta. You need the ability to capture and move only the delta, so CDC becomes fundamental in enabling both the real-time and the efficient, low-impact data integration.
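A quick back-of-envelope calculation shows why moving only the delta matters once the lake sits on the other side of a WAN link. All of the numbers below are illustrative assumptions, not figures from the conversation.

```python
# Back-of-envelope: full reload vs. delta-only transfer over a WAN link.
# Every number here is an illustrative assumption.
table_size_gb = 2_000   # assumed total size of the source database
daily_change  = 0.02    # assume ~2% of the data changes per day
link_mbps     = 1_000   # assumed effective bandwidth to the cloud

def hours_to_move(gigabytes: float, mbps: float) -> float:
    """Transfer time in hours: GB -> megabits, divided by link rate."""
    return (gigabytes * 8 * 1024) / mbps / 3600

print(f"full reload : {hours_to_move(table_size_gb, link_mbps):5.1f} h/day")
print(f"CDC deltas  : {hours_to_move(table_size_gb * daily_change, link_mbps):5.1f} h/day")
# full reload :   4.6 h/day   vs.   CDC deltas :   0.1 h/day
```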
>> You guys have a lot of partners: technology partners, global SIs, resellers, a bunch of different partnership levels. The question I have for you, and I'd love to get your reaction and insight on it, is this: in the relationship to the customer who has the problem, what's in it for me? I want to move my business forward, I want to do digital business, I need to get at my real-time data as it's happening. Whether it's near real time or real time, that's evolution, but ultimately they have to move their developers down a certain path, and they'll usually hire a partner. The relationship between partners and you, the supplier to the customer, has changed recently.

>> That's correct.

>> How is that evolving?

>> First of all, it's evolving in several ways. We've invested on our part to make sure we're building Attunity as a leading vendor in the ecosystem of system integration and consulting companies. We work with pretty much all the major global system integrators, as well as regional and boutique ones that focus on the emerging technologies and the modern analytic platforms. We work a lot with many of them on major corporate, data-center-level migrations to the cloud. So again, the motivations are different, but we invest--

>> More specialized, are you seeing more specialty, what's the trend?

>> We've been a technology partner of choice to both Amazon and Microsoft for enabling and facilitating data migration to the cloud. They, of course, have their select or preferred groups of partners they work with, so we all come together to create these solutions.

>> Itamar, what are the goals for Attunity as we wrap up here? I'll give you the last word, as you guys have this big announcement and you're bringing it all together. Integrating is key; it's always been your ethos in the company. What's the next level, what's the next milestone for you guys? What do you see going forward?

>> First of all, we're going to continue to modernize. We're really excited about the new announcements we made today: Replicate six, AEM six, and a new version of Compose for Hive that now also supports more data lakes, Aldermore, Cloudera, EMR. A key point for us was expanding AEM to also enable analytics on the data we generate as data flows through it. The whole point is modernizing data integration, providing more intelligence in the process, reducing the complexity, and facilitating automation end to end. We're going to continue to solve--

>> Automation big, big time.

>> Automation is a big thing for us, and the point is, you need to scale. In order to scale, we want to generate things for you so you don't have to develop every piece. We automate the automation, okay? The whole point is to deliver the solution faster, and the way we're going to do it is to continue to enhance each one of the products in its own space, Replicate for replication across systems, Compose for Hive for transformations and pipeline automation, and AEM for management, but also to create integration between them. Again, for us it's about creating a platform where our customers get more than the sum of the parts; they get the unique capabilities that we bring together in this platform.

>> Itamar, thanks for coming on theCUBE, appreciate it, and congratulations to Attunity on bringing it all together.

>> Thank you very much.

>> This is theCUBE's live coverage, bringing it down here to New York City, Manhattan. I'm John Furrier, with Peter Burris. Be right back with more after this short break. (upbeat electronic music)

Published Date : Sep 27 2017



Itamar Ankorion, Attunity & Arvind Rajagopalan, Verizon - #DataWorks - #theCUBE


 

>> Narrator: Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2017. Brought to you by Hortonworks.

>> Hey, welcome back to theCUBE, live from the DataWorks Summit, day 2. We've been here for a day and a half talking with fantastic leaders and innovators, learning a lot about what's happening in the world of big data and its convergence with the Internet of Things, machine learning, artificial intelligence; I could go on and on. I'm Lisa Martin, my co-host is George Gilbert, and we are joined by a couple of guests, one a CUBE alumnus: Itamar Ankorion, CMO of Attunity. Welcome back to theCUBE.

>> Thank you very much, good to be here. Thank you, Lisa and George.

>> Lisa: Great to have you.

>> And Arvind Rajagopalan, the Director of Technology Services for Verizon, welcome to theCUBE.

>> Thank you.

>> So we were chatting before we went on, and Verizon, you're actually going to be presenting tomorrow at the DataWorks Summit. Tell us about building... the journey that Verizon has been on building a Data Lake.

>> Verizon, over the last 20 years, has been a large corporation made up of a lot of different acquisitions and mergers; that's how it was formed 20 years back. As we've gone through the journey of the mergers and acquisitions over the years, we had data from different companies come together and form a lot of different data silos. The reason we started looking at this is that our CFO started asking questions around being able to answer One Verizon questions; it's as simple as having Days Payable, or Working Capital Analysis, across all the lines of business. And since we have a three-major-ERP footprint, it is extremely hard to get that data out, and there were a lot of manual data prep activities going into bringing together those One Verizon views. So that's really what was the catalyst to get the journey started for us.

>> And it was driven by your CFO, you said?

>> Arvind: That's right.

>> Ah, very interesting, okay. So what are some of the things that people are going to hear tomorrow from your breakout session?

>> Arvind: I'm sorry, say that again?

>> Sorry, what are some of the things that the attendees of your breakout session are going to learn about the steps and the journey?

>> I'm going to primarily be talking about the challenges that we ran into, and share some thoughts around that, and also talk about some of the factors, such as the catalysts and what drew us toward moving in that direction, as well as getting into some architectural components from a high-level standpoint. I'll talk about certain partners that we work with, the choices we made from an architecture perspective and the tools, and close the loop on user adoption and what users are seeing in terms of business value as we start centralizing all of the data at Verizon from a back-office Finance and Supply Chain standpoint. That's what I'm looking at talking about tomorrow.

>> Arvind, it's interesting to hear you talk about collecting data from essentially back-office operational systems in a Data Lake. I assume that the data is more refined and easily structured than in the typical stories we hear about Data Lakes. Were there challenges in making it available for exploration and visualization, or were all the early use cases really just production reporting?
>> Standard reporting across the ERP systems is very mature, and those capabilities are there. But then you look across ERP systems, and we have three major ERP systems, one for each of the lines of business. When you want to combine all of the data, it's very hard. And to add to that, you pointed to self-service discovery and visualization across all three data sets; that's even more challenging, because it takes a lot of heavy lifting to normalize all of the data and bring it into one centralized platform. We started off the journey with Oracle, and then we had SAP HANA; we were trying to bring all the data together, but when we looked at bringing data from our non-SAP ERP systems into an SAP kind of footprint, for one, the cost was tremendously high, and there was also a lot of heavy lifting and challenges in terms of manually normalizing the data and bringing it into the same kind of data models. And even after all of that was done, it was not very self-service oriented for our users in Finance and Supply Chain.

>> Let me drill into two of those things. It sounds like the ETL process of converting the data into a consumable format was very complex, and then it sounds like the discoverability, where a tool perhaps like Alation might help, was missing; that category is still young. Is that what was missing, or why was the ETL process so much more heavyweight than with a traditional data warehouse?

>> There's a lot of heavy lifting involved in the ETL processes because of the proprietary data structures of the ERP systems. SAP especially: its data structures, and how the data is used across clustered and pooled tables, are very proprietary. And on top of that, you have the data formats and structures from a PeopleSoft ERP system supporting different lines of business, so there's a lot of customization in place; there are specific things we use in the ERPs, in terms of the modules and how the processes are modeled in each of the lines of business, and that complicates things a lot. Then you try to bring these three different ERPs, with the nuances they've built up over the years, together, and it actually makes it very complex.

>> So tell us then, help us understand how the Data Lake made that easier. Was it because you didn't have to do all the refinement before it got there? And tell us how Attunity helped make that possible.

>> Oh, absolutely. I think that's one of the big reasons why we picked Hortonworks as one of our key partners in terms of building out the Data Lake: with schema on read, you aren't necessarily worried about doing a whole lot of ETL before you bring the data in, and it also provides us with tools and technologies from a lot of other partners, which have a lot of maturity now, to provide self-service discovery capabilities for ad hoc analysis and reporting. This is helpful to the users, because now they don't have to wait for prolonged IT development cycles to model the data, do the ETL, and build reports for them to consume, which sometimes could take weeks and months.
Now, in a matter of days, they're able to see the data they're looking for and start the analysis, and once they start the analysis and the data is accessible, it's a matter of minutes and seconds: looking at the different tools, deciding how they want to view the data, how they want to model it. So it's actually been a huge value from the perspective of the users and what they're looking to do.
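For readers who want to see what that no-up-front-ETL, schema-on-read experience looks like in practice, here is a minimal PySpark sketch: raw landed files are queried as they arrived, with the schema inferred at read time. The landing path and field names are hypothetical, not Verizon's actual layout.

```python
# Minimal schema-on-read sketch with PySpark: query raw landed files
# with no up-front modeling. Path and field names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("adhoc-discovery").getOrCreate()

# The schema is inferred from the raw files at read time.
invoices = spark.read.json("hdfs:///landing/erp_a/invoices/")  # assumed path
invoices.createOrReplaceTempView("invoices")

# Ad hoc analysis in minutes: e.g. a rough Days-Payable-style rollup.
spark.sql("""
    SELECT vendor_id,
           AVG(DATEDIFF(paid_date, invoice_date)) AS avg_days_to_pay
    FROM invoices
    GROUP BY vendor_id
    ORDER BY avg_days_to_pay DESC
""").show(20)
```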
>> Speaking of value, one of the things that was kind of thematic yesterday: we see enterprises now embracing big data, embracing Hadoop; it's got to coexist within their ecosystem, and it's got to interoperate. But just putting data in a Data Lake or Hadoop, that's not where the value is; it's being able to analyze that data, in motion, at rest, structured, unstructured, and start being able to glean actionable insights. From your CFO's perspective, where are you now in answering some of the questions that he or she had, from an insights perspective, with the Data Lake that you have in place?

>> Yeah, before I address that, I wanted to quickly touch upon and wrap up George's question, if you don't mind, because one of the key challenges is where Attunity helped, and I was just about to answer that before we moved on, so I just want to close the loop on it a little bit. In terms of bringing the data in, the data acquisition or ingestion is a key aspect of it, and again, dealing with the proprietary data structures of the ERP systems is very complex, and normally involves a multi-step process to bring the data into a staging environment, put it in the swamp, and bring it into the Lake. What Attunity has been able to help us with is that it has the intelligence to understand the proprietary data structures of the ERPs, and it is able to bring all the data from the ERP source systems directly into Hadoop, without any stops or staging databases along the way. So it's been a huge value from that standpoint; I'll get into more details around that. And to answer your question around how it's helping from a CFO standpoint, and for the users in Finance: as I said, now all the data is available in one place, so it's very easy for them to consume the data and do ad hoc analysis. If somebody wants, like I said earlier, to look at and calculate Days Payable, as an example, or look at working capital, we are actually moving data using Attunity's Replicate CDC product, getting data in real time into the Data Lake. So now they're able to turn things around and do that kind of analysis in a matter of hours, versus overnight or in a matter of days, which was the previous environment.

>> And that was kind of one of the things this morning: it's really about speed, right? It's how fast can you move, and it sounds like together with Attunity, Verizon is not only making things simpler, as you talked about with this model spanning different ERP systems, but you're also really able to get information into the right hands much, much faster.

>> Absolutely, that's the beauty of the near-real-time CDC architecture. We're able to get data in very easily and quickly, and Attunity also provides a lot of visibility while the data is in flight; we're able to see what's happening in the source system, how many packets are flowing through, and so on. To a point, my developers are so excited to work with the product, because they don't have to worry about changes happening in the source systems: changes in terms of DDL are automatically understood by the product and pushed to the destination in Hadoop. That's been a game-changer, because we have not had any downtime; historically, when things changed on the source system side, we had to take downtime to change those configurations and scripts and publish them across environments, so that's been huge from that standpoint as well.

>> Absolutely.

>> Itamar, maybe help us understand where Attunity fits. It sounds like there's greatly reduced latency in the pipeline between the operational systems and the analytic systems, but it also sounds like you still need to essentially reformat the data so that it's consumable. So it sounds like there's an ETL pipeline that's just much, much faster, but at the same time, with Replicate, it sounds like that goes without transformations. Help us understand that nuance.

>> Yeah, that's a great question, George. Indeed, in the past few years customers have been focused predominantly on getting the data to the Lake. I actually think one of the changes in the theme we're hearing here at the show, and over the last few months, is how do we move to start using the data, creating applications on the data. So we're moving to the next step. In the last few years we focused a lot on innovating and creating the solutions that facilitate and accelerate the process of getting data to the Lake, from a large scope of systems, including complex ones like SAP, and also on making that process easier, providing real-time data that can feed both streaming architectures and batch ones. Once we got that covered, to your question, what happens next? One of the things we found, and I think Verizon is also looking at it now, and Arvind can comment on it later: when you bring data in and adopt a streaming, or continuous, incremental type of data ingestion process, you're inherently building an architecture that takes what was originally a database and, in a sense, breaks it apart into partitions as you load it over time. So when you land the data, and Arvind was referring to a swamp, or some customers refer to it as a landing zone, you bring the data into your Lake environment, but at that first stage the data is not structured, to your point, George, in a manner that's easily consumable. So the next step is, how do we facilitate the next step of the process, which today is still very manual-driven, requires custom development, and deals with complex structures?
So we actually are very excited that we've introduced here at the show a new product by Attunity, Compose for Hive, which extends our Data Lake solutions. What Compose for Hive is designed to do is address part of the problem you just described: when the data comes in and is partitioned, Compose for Hive reassembles those partitions and then creates analytic-ready data sets back in Hive. It can create operational data stores, it can create historical data stores, so the data becomes formatted in a manner that's more easily accessible for users who want to use analytic tools, BI tools, Tableau, Qlik, any type of tool that can easily access a database.
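As a rough mental model of that reassembly step, here is a small Python sketch that folds ordered change partitions into the two kinds of outputs just described: a current-state (operational) store and an append-only historical store. It is conceptual only, under assumed record fields, and not how Compose for Hive is actually implemented.

```python
# Conceptual sketch: fold landed change partitions into an operational
# data store (current state) and a historical data store (every change).
# Record fields ("pk", "op", "row") are assumed for illustration.
from typing import Dict, List, Tuple

def reassemble(partitions: List[List[dict]]) -> Tuple[Dict[int, dict], List[dict]]:
    """partitions: batches of change records, in the order they landed."""
    ods: Dict[int, dict] = {}   # operational store: current row by key
    history: List[dict] = []    # historical store: append-only log
    for batch in partitions:
        for rec in batch:
            history.append(rec)
            if rec["op"] == "delete":
                ods.pop(rec["pk"], None)
            else:                       # insert or update
                ods[rec["pk"]] = rec["row"]
    return ods, history

ods, history = reassemble([
    [{"pk": 7, "op": "insert", "row": {"status": "open"}}],
    [{"pk": 7, "op": "update", "row": {"status": "closed"}}],
])
print(ods)           # {7: {'status': 'closed'}}  <- analytic-ready current state
print(len(history))  # 2                          <- full change history retained
```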
>> Would there be, as a next step, whether led by Verizon's requirements or by Attunity's anticipation of broader customer requirements, something where there's an if-not-near-real-time then very-low-latency landing and transformation, so that data that is time-sensitive can join the historical data?

>> Absolutely, absolutely. What we've done is focus on real-time availability of data. When we feed the data into the Data Lake, we feed it in two ways: one is directly into Hive, but we can also go through a streaming architecture like Kafka, in the case of Hortonworks, and it also fits very well into HDF. Then the next step in the process is producing those analytic data sets, or data stores, out of it, which we enable, and we design that together with our partners and with our customers. When we worked on Replicate, and then on Compose, we worked very closely with Fortune companies trying to deal with these challenges, so we could design the product right. In the case of Compose for Hive, for example, we have done a lot of collaboration at a product engineering level with Hortonworks, to leverage the latest and greatest in Hive 2.2, Hive LLAP, to be able to push down transformations so they can be done faster, including in real time, so those data sets can be updated on a frequent basis.

>> You talked about customer requirements, either specific ones or not. Obviously we're talking to a telecommunications company; are you seeing, Itamar, from Attunity's perspective, more of this need to... alright, the data's in the Lake, or first it comes to the swamp, now it's in the Lake, to start partitioning it: are you seeing this need driven in specific industries, or is this really pretty horizontal?

>> That's a good question, and this is definitely a horizontal need; it's part of the infrastructure needs. Verizon is a great customer, and we've worked similarly in telecommunications, and we've been working with other customers in other industries, from manufacturing to retail to health care to automotive and others, and in all of those cases, at a foundation level, the architectural challenges are very similar. You need to ingest the data, you want to do it fast, you want to do it incrementally or continuously, even if you're loading directly into Hadoop. Naturally, when you're loading the data through a Kafka or streaming architecture, it's a continuous fashion, and then you partition the data. So the partitioning of the data is inherent to the architecture, and then you need to help deal with the data for the next step in the process. And we're doing that both with Compose for Hive and, for customers using streaming architectures like Kafka, by providing the mechanisms, from supporting or facilitating things like schema evolution and schema decoding, to facilitating the downstream process of processing those partitions of data, so we can make the data available. That works both for analytics and streaming analytics, as well as for scenarios like microservices, where the way in which you partition or deliver the data allows each microservice to pick up the data it needs from the relevant partition.
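To illustrate that last point, here is a minimal consumer sketch using the kafka-python client, where a single microservice reads only the topic carrying its slice of the change stream. The topic, broker, group names, and message fields are assumptions for illustration, not names from the conversation.

```python
# Minimal sketch: a microservice consuming only its slice of the change
# stream. Topic, broker, group names, and message fields are assumed.
import json
from kafka import KafkaConsumer

# Each microservice subscribes only to the topic (or partition) that
# carries the table/domain it owns, e.g. customer-address changes.
consumer = KafkaConsumer(
    "erp.customers.addresses",          # assumed one-topic-per-table layout
    bootstrap_servers="broker:9092",    # assumed broker address
    group_id="address-validation-svc",  # one consumer group per microservice
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

for msg in consumer:
    change = msg.value                  # one CDC event per message (assumed)
    if change.get("op") in ("insert", "update"):
        # React only to this service's slice of the data.
        print(f"revalidate address for customer {change.get('pk')}")
```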

Published Date : Jun 14 2017

