Tendü Yogurtçu | BigData SV 2017


 

>> Announcer: Live from San Jose, California, it's The Cube, covering Big Data Silicon Valley 2017. (upbeat electronic music)
>> California, Silicon Valley, at the heart of the big data world: this is The Cube's coverage of Big Data Silicon Valley in conjunction with Strata Hadoop. Of course we've been here for multiple years, covering Hadoop World for now our eighth year, now that's Strata Hadoop, but we do our own events, Big Data SV and Big Data NYC, in Silicon Valley and New York City. I'm John Furrier, with my cohost George Gilbert, analyst at Wikibon. Our next guest is Tendü Yogurtçu with Syncsort, general manager of the big data business. Did I get that right?
>> Yes, you got it right. It's always a pleasure to be at The Cube.
>> (laughs) I love your name. That's so hard for me to get, but I think I was close enough there. Welcome back.
>> Thank you.
>> Great to see you. You know, one of the things I'm excited about with Syncsort is we've been following you guys, we talk to you guys every year, and it just seems that every year more and more announcements happen. You guys are unstoppable. You're like what Amazon does, just more and more announcements, but the theme seems to be integration. Give us the latest update. You bought Trillium, you got a big deal with Hortonworks, you got integrated with Spark, you got big news here. What's the news this year?
>> Sure. Thank you for having me. Yes, it's very exciting times at Syncsort, and I probably say that every time I appear, because every time it's more exciting than the previous, which is great. We bought Trillium Software, and Trillium Software has been leading data quality for over a decade in many of the enterprises. It's very complementary to our data integration and data management portfolio, because we are helping our customers to access all of their enterprise data, not just the new emerging sources in the connected devices and mobile and streaming.
Also leveraging reference data, the mainframe legacy systems, and the legacy enterprise data warehouse. While we are doing that, accessing data, the data lake is now actually, in some cases, turning into a data swamp. That was a term Dave Vellante used a couple of years back in one of the crowd chats, and it's becoming real. So, data--
>> Real being the data swamps; data lakes are turning into swamps because they're not being leveraged properly?
>> Exactly, exactly. Because it's also about having access to the right data, and data quality is very complementary, because Trillium has delivered trusted, right data to enterprise customers in the traditional environments, so now we are looking forward to bringing that enterprise trust of data quality into the data lake. In terms of data integration, data integration has always been very critical to any organization. It's even more critical now that the data is shifting gravity, with the amount of data organizations have. What we have been delivering in very large enterprise production environments for the last three years, we are now hearing our competitors making announcements in very recently, which is a validation, because we are already running in very large production environments. We are offering value by saying "Create your applications for integrating your data," whether it's originating in the cloud or on the mainframes, whether it's on the legacy data warehouse, and you can deploy the same exact application, without any recompilations, without any changes, on your standalone Windows laptop, or in Hadoop MapReduce, or Spark in the cloud. So this design once, deploy anywhere is becoming more and more critical, with data originating in many different places, and cloud is definitely one of them. Our data warehouse optimization solution with Hortonworks and AtScale is a special package to accelerate this adoption.
It's basically helping organizations to offload the workload from the existing Teradata or Netezza data warehouse and deploy it in Hadoop. We provide a single button to automatically map the metadata, create the metadata in Hive on Hadoop, and also make the data accessible in the new environment, and AtScale provides fast BI on top of that.
>> Wow, that's amazing. I want to ask you a question, because this is a theme. I just did a tweetup just now, while you were talking, saying the theme this year is cleaning up the data lakes, or data swamps, AKA data lakes. The other theme is integration. Can you just lay out your premise on how enterprises should be looking at integration now, because it's the multi-vendor world, it's the multi-cloud world, the multi-data-type-and-source-with-metadata world. How do you advise customers that have this plethora of action coming at them? IOT, you've got cloud, you've got big data, I've got Hadoop here, I got Spark over here. What's the integration formula?
>> First thing is identify your business use cases. What's your business challenge, what are your business goals, because that should be the real driver. We see in some organizations that they start with the intention "we would like to create a data lake" without having a very clear understanding of what it is they're trying to solve with that data lake. Data as a service is really becoming a theme across multiple organizations, whether it's on the enterprise side or in some of the online retail organizations, for example. As part of that data as a service, organizations really need to adopt tools that are going to enable them to take advantage of the technology stack. The technology stack is evolving very rapidly. The skill sets are rare, and skill sets are rare because you need to keep making adjustments.
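The metadata-mapping step of the warehouse offload Tendü describes, taking an existing Teradata or Netezza table definition and recreating it in Hive, can be sketched in a few lines. This is a hypothetical illustration, not Syncsort's actual implementation; the type mapping, table name, and HDFS path are invented for the example.

```python
# Sketch: translate a relational table definition into a Hive DDL
# statement, the kind of metadata mapping a warehouse-offload tool
# performs. Illustrative only; real products cover far more types.

# Map common warehouse column types to Hive equivalents (assumed mapping).
TYPE_MAP = {
    "INTEGER": "INT",
    "BIGINT": "BIGINT",
    "VARCHAR": "STRING",
    "CHAR": "STRING",
    "DECIMAL": "DECIMAL(18,2)",
    "DATE": "DATE",
    "TIMESTAMP": "TIMESTAMP",
}

def to_hive_ddl(table, columns, location):
    """Build a CREATE EXTERNAL TABLE statement from (name, source_type) pairs."""
    cols = ",\n  ".join(
        f"{name} {TYPE_MAP.get(src_type.upper(), 'STRING')}"
        for name, src_type in columns
    )
    return (
        f"CREATE EXTERNAL TABLE {table} (\n  {cols}\n)\n"
        f"STORED AS PARQUET\nLOCATION '{location}';"
    )

# Hypothetical source table being offloaded from the warehouse.
ddl = to_hive_ddl(
    "sales_fact",
    [("order_id", "INTEGER"), ("customer", "VARCHAR"), ("amount", "DECIMAL")],
    "/data/warehouse/sales_fact",
)
print(ddl)
```

Once the DDL is applied, the offloaded data files become queryable in Hive, which is what lets a BI layer such as AtScale sit on top of the new environment.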
Am I hiring Ph.D. students who can program Scala in the most optimized way, or should I hire Java developers, or should I hire Python developers? The names of the tools in the stack, Spark 1 versus Spark 2 APIs, change. It's really evolving very rapidly.
>> It's hard to find Scala developers, I mean, if you go outside Silicon Valley.
>> Exactly. So as an organization, our advice is that you really need to find tools that are going to fit those business use cases and provide a single software environment. That data integration might be happening on premise now, with some of the legacy enterprise data warehouse, and it might happen in a hybrid, on-premise and cloud environment in the near future, and perhaps completely in the cloud.
>> So standard tools, tools that have some standard software behind them, so you don't get stuck in the personnel hiring problem, some unique domain expertise that's hard to hire.
>> Yes, skill set is one problem. The second problem is the fact that applications need to be recompiled because the stack is evolving and the APIs are not compatible with the previous version. That's the maintenance cost of keeping up with things, being able to catch up with the new versions of the stack. That's another area where the tools really help, because you want to be able to develop the application once and deploy it anywhere, on any compute platform.
>> So Tendü, if I hear you properly, what you're saying is integration sounds great on paper, it's important, but there are some hidden costs there, and that is the skill set, and then there's the stack recompiling, just making sure. Okay, that's awesome.
>> The tools help with that.
>> Take a step back and zoom out and talk about Syncsort's positioning, because you guys have been changing with the stacks as well. I mean, you guys have been doing very well with the announcements; you've been just coming on the market all the time. What is the current value proposition for Syncsort today?
>> The current value proposition is really that we help organizations to create the next-generation modern data architecture, by accessing and liberating all enterprise data and delivering that data at the right time and with the right quality. It's liberate, integrate, with integrity. That's our value proposition. How do we do that? We provide that single software environment. You can have batch legacy data and streaming data sources integrated in the same exact environment, and it enables you to adapt to Spark 2 or Flink or whichever compute framework is going to help. That has been our value proposition, and it is proven in many production deployments.
>> What's interesting too is the way you guys have approached the market. You've locked down the legacy, so you have, we talk about the mainframe, and well beyond that now, you guys have and understand the legacy, so you kind of lock that down, protect it, make it secure, security-wise, but also making sure it works, because there's still data there, because legacy systems are really critical in the hybrid.
>> The mainframe expertise and heritage that we have is a critical part of our offering. We will continue to focus on innovation on the mainframe side as well as on the distributed side. One of the announcements that we made since our last conversation was our partnership with Compuware: we now bring in more data types about application failures, the abend data, to Splunk for operational intelligence. We will continue to also support more delivery types. We have batch delivery, we have streaming delivery, and replication into Hadoop has been a challenge, so our focus is now replication from DB2 on mainframe and VSAM on mainframe to Hadoop environments. That's what we will continue to focus on, mainframe, because we have heritage there and it's also part of the big enterprise data lake.
You cannot make sense of the customer data that you are getting from mobile if you don't reference the critical data sets that are on the mainframe. With the Trillium acquisition, it's very exciting, because now we are at a kind of pivotal point in the market: we can bring the superior data validation, cleansing, and matching capabilities we have to the big data environments. One of the things--
>> So when you get into low latency, you guys do the whole low latency thing too? You bring it in fast?
>> Yes, we bring it; that's our current value proposition. And as we are accessing this data and integrating it as part of the data lake, we now have capabilities with Trillium where we can profile that data, get statistics, and start using machine learning to automate the data steward's job. Data stewards are still spending 75% of their time trying to clean the data. So if we can--
>> A lot of manual labor there, and modeling too, by the way; the cleaning and modeling kind of go hand in hand.
>> Exactly. If we can automate any of these steps, to derive the business rules automatically and provide the right data in the data lake, that would be very valuable. This is what we are hearing from our customers as well.
>> We've heard probably five years about the data lake as the center of gravity of big data, but we're hearing at least a bifurcation, maybe more, where now we want to take that data and apply it, operationalize it in making decisions with machine learning, predictive analytics, but at the same time we're trying to square this strange circle of data: the data lake where you didn't say up front what you wanted it to look like, but now we want ever richer metadata to make sense out of it, a layer that you're putting on it, the data prep layer, and others are trying to put different metadata on top of it. What do you see that metadata layer looking like over the next three to five years?
>> Governance is a very key topic, and for organizations who are ahead of the game in big data and who have already established that data lake, data governance and even analytics governance become important. What we are delivering here with Trillium, which we will have generally available by the end of Q1, is basically bringing the business rules to the data. Instead of bringing data to the business rules, we are taking the business rules and deploying them where the data exists. That will be key because of the data gravity you mentioned, because the data might be in the Hadoop environment, it might be in, like I said, the enterprise data warehouse, and it might be originating in the cloud, and you don't want to move the data to the business rules. You want to move the business rules to where the data exists. Cloud is an area where we see more and more of our customers moving forward. The two main use cases around our integration are, one, that the data is originating in cloud, and the second one is archiving data to cloud. We actually announced tighter integration with cloud, with our Cloud Director, earlier this week for this event. We have been in cloud deployments, and we have actually had an offering on Elastic MapReduce and on EC2 for a couple of years now, and also on Google Cloud Storage, but this announcement is primarily about making deployments even easier, by leveraging Cloud Director's elasticity for increasing and reducing the deployment. Now our customers will also take advantage of that elasticity for integration jobs.
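The data-profiling capability Tendü describes, computing statistics over a column so that a data steward's cleanup work can start from evidence rather than inspection, can be sketched in a few lines. This is an illustrative example, not Trillium's actual profiling engine; the function, its statistics, and the sample data are invented.

```python
# Sketch: basic column profiling of the kind a data-quality tool runs
# over a data lake: null rate, distinct count, and the most frequent
# value. Illustrative only; production profilers compute far more.
from collections import Counter

def profile_column(values):
    """Return simple data-quality statistics for one column of values."""
    total = len(values)
    # Treat None, empty strings, and literal "NULL" as missing (assumption).
    missing = {None, "", "NULL"}
    nulls = sum(1 for v in values if v in missing)
    counts = Counter(v for v in values if v not in missing)
    top = counts.most_common(1)
    return {
        "count": total,
        "null_rate": nulls / total if total else 0.0,
        "distinct": len(counts),
        "top_value": top[0][0] if top else None,
    }

# Hypothetical email column with duplicates and missing entries.
emails = ["a@x.com", "b@y.com", None, "a@x.com", ""]
stats = profile_column(emails)
print(stats)  # e.g. null_rate 0.4, distinct 2, top_value "a@x.com"
```

Statistics like these are also the natural inputs for the machine-learning step mentioned in the interview: a model can flag columns whose null rate or cardinality drifts, instead of a steward checking each one by hand.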
You're in the center of the action, so I want to get your expertise and insight into the enterprise readiness concept. We saw last week at Google Cloud 2017, you know, Google going down the path of being enterprise ready, or taking steps. I don't think they're fully ready, but they're certainly serious about the cloud in the enterprise, and that's clear from Diane Greene, who knows the enterprise. It sparked the conversation last week around what enterprise readiness means for cloud players, because there are so many details in between the lines, if you will, of what the products are: the integration, certification, SLAs. What's your take on the notion of cloud readiness, vis-à-vis Google and others that are bringing cloud compute, a lot of resources, with an IOT market that's now booming, big data evolving very, very fast, lots of realtime, lots of analytics, lots of innovation happening? What does the enterprise picture look like from a readiness standpoint? How do these guys get ready?
>> From a big picture, for the enterprise there are a couple of things that cannot be an afterthought: security, metadata lineage as part of data governance, and being able to have flexibility in the architecture, so that they will not be recreating the jobs that they might have already deployed in on-premise environments, right? To be able to have the same application running from on premise to cloud will be critical, because it gives flexibility for adaptation in the enterprise. An enterprise may have some MapReduce jobs running on premise alongside Spark jobs in the cloud, because they are really doing some predictive analytics, graph analytics on those. They want to be able to have that flexible architecture, where we hear this concept of a hybrid environment. You don't want to be deploying a completely different product in the cloud and redoing your jobs.
That flexibility of architecture, flexibility--
>> So having different code bases in the cloud versus on prem requires two jobs to do the same thing.
>> Two jobs for maintaining, two jobs for standardizing, and two different skill sets of people, potentially. So security, governance, and being able to access data easily and have applications move between environments will be very critical.
>> So seamless integration between cloud and on prem first, and then potentially multi-cloud. That's table stakes in your mind.
>> They are absolutely table stakes. A lot of vendors are trying to focus on that; definitely the Hadoop vendors are also focusing on that. Also, one of the things, when people talk about governance, is that the requirements are changing. We have been talking about single view and customer 360 for a while now, right? Do we have it right yet? Enrichment is becoming key. With Trillium we made the recent announcement around precise enrichment: it's not just the postal address that you want to validate and make sure is correct, it's also the email address and the phone number. Is it a mobile number, is it a landline? It's enriched data sets that we really have to be dealing with, and there's a lot of opportunity, and we are really excited, because data quality, discovery, and integration are coming together and we have a good--
>> Well Tendü, thank you for joining us, and congratulations as Syncsort broadens its scope to being a modern data platform solution provider. Congratulations.
>> Thank you.
>> Thanks for coming.
>> Thank you for having me.
>> This is The Cube, here live in Silicon Valley and San Jose. I'm John Furrier, with George Gilbert. You're watching our coverage of Big Data Silicon Valley, in conjunction with Strata Hadoop. This is SiliconANGLE's The Cube; we'll be right back with more live coverage. We've got two days of wall-to-wall coverage with experts and pros talking about big data and the transformations here inside The Cube.
We'll be right back. (upbeat electronic music)

Published Date : Mar 14 2017

