June 08, 2016
Plans have too many Scans? Toss the Table in the Can - RDBMS to Graph
As Gary Vaynerchuk wrote, "Content is King, but Context is God." Understanding the relationship between bits of data is the next frontier in application development. Ironically, relational databases are poor at modeling data relationships, and their performance doesn't scale as the number of JOINs grows and the data sizes explode.
Let's say we're building a recommendation system, master data management system, or attempting to do fraud detection. How do we architect an application where the performance of understanding data relationships is critical? In the past, we would have de-normalized the data so our queries could be performed without too many JOINs and accompanying index lookups. If duplication of data and issues with inconsistencies were a concern, we might move to a more scalable NoSQL solution. This requires tradeoffs though, often sacrificing ACID for performance.
Context and ACID are both gods I believe in, and neither deserves to be sacrificed.
Enter graph databases.
Through index-free adjacency, a native graph database enables traversing between nodes across relationships without doing expensive index lookups. This allows us to scale our applications to perform under much heavier loads. And, though graph databases are often considered a NoSQL technology, they can continue to provide the ACID guarantees our software requires.
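To make the idea of index-free adjacency concrete, here is a minimal Python sketch that contrasts a JOIN-style lookup with a direct traversal. It is only an illustration under stated assumptions: the names (`Node`, `friends_via_join`, `friends_via_adjacency`) and the sample data are hypothetical, and it does not show how any particular graph database stores data internally.

```python
class Node:
    """A graph node that stores direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.friends = []  # adjacency kept on the node itself


def friends_via_join(person_id, people_by_id, friendships):
    """Relational style: scan the join table, then look up each match by key."""
    return [
        people_by_id[friend_id]  # one keyed lookup per matching row
        for source_id, friend_id in friendships
        if source_id == person_id
    ]


def friends_via_adjacency(node):
    """Graph style: follow the references already stored on the node."""
    return [friend.name for friend in node.friends]


# Hypothetical data, expressed both ways.
people_by_id = {1: "Ann", 2: "Bob", 3: "Cy"}
friendships = [(1, 2), (1, 3), (2, 3)]  # (person_id, friend_id) rows

ann, bob, cy = Node("Ann"), Node("Bob"), Node("Cy")
ann.friends = [bob, cy]
bob.friends = [cy]

relational_result = friends_via_join(1, people_by_id, friendships)
graph_result = friends_via_adjacency(ann)
print(relational_result, graph_result)
```

Both functions return the same names. The difference is that the first consults a shared lookup structure for every hop, while the second only follows references held by the starting node, so the cost of a hop does not depend on the total size of the data set.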
This session will discuss the properties of the Property Graph model and how graph databases achieve performance, while also ensuring human efficiency through the intuitiveness of Cypher, a declarative query language optimized for graphs. We'll guide you through moving from a relational database ER diagram to a graph model. We'll use several different datasets to illustrate these concepts, including the Panama Papers data recently released by the ICIJ.
Ryan is an SF-based software engineer focused on helping developers understand the power of graph databases. Previously he was a product manager for architectural software, built applications and web hosting environments for higher education, and worked in developer relations for twenty products during his 8 years at Google. He enjoys cycling, sailing, skydiving, and many other adventures when not in front of his computer.
May 11, 2016
Getting To Data Truth -- Accurately and On Time
Finding the right data fast enough to make business decisions and respond to regulatory challenges seems to get harder every day. The technical teams charged with managing the data and the business people trying to use it are often in conflict because the data just doesn't seem to be credible. Often the root of the problems is data being created and collected manually, or passed from application to application to support different use cases. There's a lot of scope for mistrust, especially when the same people are looking at the "same" data from multiple applications. This session will describe an approach to helping business users to find and use the right data at the right time to meet their business decision making and compliance challenges.
Sue Habas, VP of Product Management for ASG's Enterprise Data Intelligence Suite, has over 18 years of experience working with metadata on the buy side as a customer and the sell side as a vendor, including implementation, showcasing, and program support. She has worked to structure and drive enterprise metadata/data governance programs. Additionally, Sue has extensive support experience to sustain these initiatives within an organization. She has supported a wide range of clients, including financial, insurance, healthcare, manufacturing and e-Commerce with a general need to provide data-driven business practices. Sue is responsible for launching and guiding ASG's Enterprise Data Intelligence solution's superior Metadata/Data Governance technology in fresh, modern offerings that deliver excellent value for today's challenging business demands.
Ozkucur, VP and Practice Lead for ASG's Enterprise Data Intelligence Suite, has been with the product for over 15 years. He has in-depth software and process experience as an Engineer and Architect as well as Enterprise-wide implementation best practices. He has designed and delivered many projects and POCs with a wide range of clients, including financial, insurance, healthcare, manufacturing and e-Commerce.
April 13, 2016
Application of Semantic Model (WYSIWYG) in Master Data Management
Although many enterprises are trying to introduce Master Data Management (MDM) to overcome data silos by integrating various systems to speak a common language, the variety of business requirements from IT and business stakeholders creates challenges in building a common data model across the enterprise. Every industry has a common language and common models. In this presentation, we showcase one such model as a basis for building MDM. Because the Financial Industry Business Ontology (FIBO) is a common language for the industry-wide terms, facts, and relationships associated with financial services, it is potentially an excellent basis for a data model covering overall enterprise requirements.
In this presentation, we introduce FIBO as a foundation for building an enterprise MDM model. We begin with the challenges of the current state and then cover our strategic approach to building and deploying enterprise Security, Client and Account masters. We also introduce a Proof of Concept implementation of the Security and Client masters and show how it can solve real-world problems in integrating, mastering and distributing data with complex relationships. This includes:
. Challenges & Transition Strategy
. Case Study for Real World Problems and Proof of Concept Implementation
. Possible Future Extensions
Mani Keeran has over 20 years of business consulting and information technology experience in the areas of strategy development, product management, business analysis, process model design, data architecture, data model design and application development. His experience spans financial services, high technology, manufacturing, telecommunication, and transportation industries. He has extensive experience in implementing rapid application development in software design, development and implementation. His recent research interests include Lean Software Development, Knowledge Management and ontology-based enterprise modeling for both structured and semi-structured data. Mani is currently managing the Information Architecture Group for Franklin Templeton Investments. Mani has been the President and Board Member for San Francisco Data Management Association since 2008. Mani has an MBA from SUNY Buffalo's Jacobs School of Management and holds the Chartered Financial Analyst (CFA) charter.
Gi Kim is an Information Architect at Franklin Templeton Investments. As his academic history and professional experience show, he has a strong background in both the information technology and business sides of financial services. Gi received his M.S. degree in Computer Science from Seoul National University, focusing on data management and modeling studies. He became a CFA charter holder after 3 years of experience in the Risk Management organization of Hyundai Securities. Combining his backgrounds in these two worlds, he served as a bridge between business and technology on several projects after he joined PwC. As a professional consultant, he advised financial services firms on improving their data management practices for efficient risk management and investment activities. He also designed and delivered successful database systems for his clients. At Franklin Templeton, he is currently working on Security and Product Master Data Management projects as the Information Architect.
Preeti Sharma is an Information Architect working for Franklin Templeton Investments. She was formerly a Database Lead Engineer for IGT. In her current role at Franklin Templeton Investments, she has worked on building data architecture for various systems including MDMs, DW/Data Marts and OLTPs. She has also designed and implemented a home-grown ontology-based solution for classification of semi-structured documents. She has a Bachelor's degree in Electronics from Delhi University, India, and a Master's degree in Computer Applications from MDU University, India. She is a Claritas Certificate holder from CFA Institute and is also a member of DAMA.
March 09, 2016
A Data-Centric Approach to Enterprise Architecture: Embedding Data Capabilities into the Operating Infrastructure
When data requirements are embedded in application requirements, the organization is forever playing catch-up. Data is "following" instead of "driving". That is, the business constantly creates work-around solutions and performs non-value-added work to deal with gaps caused by unique and stove-piped data requirements created for each application. These application based data silos create greater risk and prevent the business from rapidly responding to the ever changing market place.
This presentation will provide insights into how enterprise architecture can be leveraged to instantiate a data-centric operating environment. Different enterprise operating models will be presented to demonstrate how the enterprise architecture can apply a data-centric approach to embed data-driven capabilities within the operating environment. Use cases will include a global financial services company and a donor-supported healthcare company.
Peter Aiken, Ph.D., is widely acclaimed as one of the top ten data management authorities worldwide. As a practicing data consultant, author and researcher, he has been actively performing in and studying data management for more than 30 years. Throughout his career, he has held leadership positions and consulted with more than 50 organizations in 20 countries across numerous industries, including defense, banking, healthcare, telecommunications and manufacturing.
He is a highly sought-after keynote speaker and author of multiple publications, including his latest book "Monetizing Data Management".
Peter is the Founding Director of Data Blueprint, a data management consulting firm that puts organizations on the right path to leverage data for competitive advantage and operational efficiency.
He is also past President of the International Data Management Association (DAMA-I).
November 11, 2015
Why do Enterprise Data Warehouses take so long, cost so much and too often cannot answer my questions?
An Enterprise Data Warehouse (EDW) is a critical factor in getting a leg up on the competition. Organizations invest millions of dollars and thousands of person hours designing, implementing and maintaining their EDW. Far too often, those responsible for the EDW are disappointed that the EDW does not provide a competitive advantage and costs to maintain it keep climbing.
This presentation will explore the reasons why so many EDW implementations don't live up to expectations. We will also discuss some ways to have a high probability of success in implementing an EDW at a low initial cost and a low total cost of ownership (TCO) through Data Warehouse Automation (DWA) technology. We will discuss:
How DWA significantly reduces time and cost while increasing quality and user satisfaction
Who the industry leaders are
Why DWA by itself does not necessarily guarantee success
How Enterprise Database Design Patterns coupled with DWA help ensure success
We will close with a demo of how DWA technology works by building a data mart for Sales with slowly changing dimensions and incremental loading, generating the data warehouse relational tables as well as cubes, populating the data warehouse and cubes and running some reports.
Joe Oates is an internationally known speaker, author, thought-leader and consultant on data warehousing, database design and object-oriented (OO) development. He has more than 30 years of experience in the successful management and technical development of business, real-time, OO and data warehouse applications for industry and government clients. His successful data warehouses include the following industries: banking, health care, credit card, telephone, distribution and life insurance.
Tobias Eld, Vice President, TimeXtender, oversees the delivery of the company's data warehouse solutions for customers throughout the United States and Canada. He manages all efforts to assist partners and clients, and helps customers achieve a well-structured data warehouse, impactful analytics and accurate reporting.
August 12, 2015
Data Management at the Crossroad of Governance and Quality – Metadata
The emergence of the role of the Chief Data
Officer has resulted in greater visibility of Data Governance issues that
have confronted data management professionals for years. Enterprises are
now coming to grips with the increasing importance of the reliability of
data that is provided to internal and external consumers. The creation of
a data quality industry segment can be seen as a response to these issues.
Don will speak about the data management ecosystem, covering
traceability from the business glossary to databases and from
operational data collection to data distribution.
Don Soulsby, VP Architecture Strategies, Sandhill Consultants
Mr. Soulsby is Sandhill Consultants' Vice President of Architecture
Strategies. His practice areas include strategic and technical
architectures for data management, metadata management, and business
intelligence. Mr. Soulsby has held senior professional services and
product management positions with large multi-national corporations and
software development organizations.
He has an extensive background in enterprise architecture and data
modeling methodologies. He has over 30 years of experience in the
development of operational and decision support applications.
He is completing his qualification as an Enterprise Data Management
Expert (EDME) with the CMMI Institute. Mr. Soulsby is an excellent
communicator and has taught metadata, data modeling and data warehouse
courses through public offerings and onsite engagements to corporate
Mr. Soulsby is a recognized thought leader who speaks regularly at
international industry events, MIT CDO Conferences and DAMA functions.
June 10, 2015
Data Requirements Modeling
This session continues last month’s webinar. It opens with a quick review
of what was presented last month, followed by a short presentation on the
ramifications for DRM related to the Zachman Framework, Three-Tier
Architecture, data reverse engineering, and OOD (Object Oriented Design),
and closes with an interactive demo that walks through the steps of SDM,
CDM, DRM, LDM, and DR-LDM. The plan is to use ER/Studio DA as the modeling
tool, as was requested in the last meeting.
Yet data requirements modeling is rarely, if ever, done independently of
other types of data modeling nowadays. Trying to include it in conceptual
or logical modeling forces a solution to be designed at the same time as,
or even before, the problem space has been fully defined. The systematic
reconciliation of the data design with the data requirements, element by
element, is a manual exercise that is most often not done, which causes
the physical data design to go through cycles of readjustment.
Why aren't we building data models that
graphically represent and specifically address data requirements? Why
can't we automatically reconcile a data model representing the results of
the analysis of data requirements with the LDM that represents the
proposed logical data design solution? Because the data modeling tools we
have now do not facilitate these activities.
François will share the evolution of his thinking
on DRM, what he realized and what he proposes. He will share how he sees
it being planned and done, and by whom, what changes the modeling tools
must incorporate to allow the tool users to benefit fully from the DRM
activities, the characteristics he sees in the central model object of
DRM, the Logical View, and how this model object differs from the
François will spend some time showing how,
using specific workarounds, one may be able to create data models that
look like DRMs in two different modeling tools.
François Cartier has more than forty years of diversified experience in
Information Technology in a wide variety of commercial sectors, including
telecommunications, transportation, manufacturing, wholesale, government
agencies, insurance, and financial institutions. He has designed systems
marrying relational with object oriented technologies, built and
contributed to corporate data models, designed operational and decision
support databases under a variety of DBMSs.
He has managed data analysis, system development, application support and IT
change control teams. He has been using various modeling tools in the last
25 years. He has given classes at Golden Gate University, and made
technical and management level presentations at various forums in the USA
He has been a DAMA SF chapter member since 1985, is a past president,
and has served as treasurer for
the last 12 years. He has been working for e-Modelers since 2002 on
various consulting and teaching assignments with clients.
Implementing a Data-Centric Strategy & Roadmap – Focus on what Really Matters
Data is the lifeblood of
just about every organization and functional area today. As businesses
struggle to come to grips with the data tsunami, it is even more critical
to focus on data as an asset that directly supports business imperatives
as other organizational assets do. Organizations across most industries
attempt to address data opportunities (e.g. Big Data) and data challenges
(e.g. data quality) to enhance business unit performance. Unfortunately,
however, the results of these efforts frequently fall far below
expectations due to haphazard approaches. Overall, poor organizational
data management (OM) capabilities are the root cause of many of these
failures. This workshop will cover three lessons as illustrated in
examples, which will help you to establish realistic OM plans and
expectations, and help demonstrate the value of such actions to both
internal and external decision makers.
Among others, you'll walk away with three takeaways:
1. That organizational thinking must change: Value-added data management
practices must be considered and included as a vital part of your business
2. Walk before you run with data-focused initiatives: Understand and
implement necessary data
management prerequisites as a foundation, then build upon that foundation.
3. That there are no
silver bullets: Tools alone are not the answer. Specifying business
requirements, business practices and data governance are almost always
Peter Aiken, Founding Director, Data Blueprint
Peter Aiken, Ph.D., is widely acclaimed as one of the top ten data
management authorities worldwide. As a practicing data consultant, author
and researcher, he has been actively performing in and studying data
management for more than 30 years. Throughout his career, he has held
leadership positions and consulted with more than 50 organizations in 20
countries across numerous industries, including defense, banking,
healthcare, telecommunications and manufacturing.
He is a highly sought-after keynote speaker and author of multiple
publications, including his latest book “Monetizing Data Management”.
Peter is the Founding Director of Data Blueprint, a data management
consulting firm that puts organizations on the right path to leverage data
for competitive advantage and operational efficiency.
He is also past President of the International Data Management Association
(DAMA-I).
Lewis Broome, CEO, Data Blueprint
An innovative and
practiced thought-leader in data management, Lewis Broome has more than 20
years of experience successfully designing, managing, implementing and
leading global data management and information technology solutions. His
successful track record is marked by strong leadership coupled with a
passion for driving data and technology solutions from a clear
As an executive in the global financial industry,
Lewis led the development of globally integrated data solutions for two of
the largest banks in the world. He designed and delivered data solutions
(conceptual, logical and physical) and was able to drive standards and
deliver timely, cost-effective solutions that were aligned to business
In his current role as CEO, Lewis, in partnership with Peter Aiken, Ph.D.,
has developed a tier-1 consulting organization that effectively combines
data management, management consulting and technology into a unique
professional services offering.
Hadoop Data Lake Controversy: Can You Have Your Lake And Use It Too?
Hadoop provides an ideal platform for storing
many types of data that business users - data engineers, data scientists,
data analysts, and business analysts - can leverage for data science and
analytics. But Hadoop is a file system that lacks the automation to
catalog what data it contains, and has no native way for users to find and
understand the data they need for their data science and analytics
projects. The lack of automation is overlooked when a team conducts a
pilot since the data set is known; however, it becomes debilitating as
projects grow beyond a proof point or two. The end result is data anarchy
where the business has to scavenge for data and hoard what it can find,
while IT is desperately trying to manage the data to meet the needs of the
Using data in Hadoop is like scavenging at a
flea market. It is impossible to know upfront what data is there and it
would take too much time to browse through the entire market. In the case
of Hadoop, it is not practical to browse through all the files in the
cluster to find the right ones to wrangle or visualize.
The opposite of shopping at a flea market is
Amazon.com. From a user perspective, it is easy to search and find the
right product very quickly. A user doesn’t need to write code or browse
through an endless list of items. Amazon.com provides a catalog of products
with detailed information that anyone can use.
Waterline Data solves the challenges of
finding, understanding, and governing data in Hadoop. Waterline Data is
like Amazon.com for Hadoop data. Waterline helps anyone find and
understand data in Hadoop without writing code or wasting time browsing
through unintelligible files. In addition to providing the
self-service experience to find and understand the right data, Waterline
Data also automates building and maintaining a data inventory, securely
provisions data to users, and enables data governance throughout.
Founder and CEO, Waterline Data
Alex created Waterline Data to accelerate the adoption of Big Data and
data-driven decision-making at enterprises.
Prior to Waterline Data, Alex served as general manager of Informatica’s
Data Quality Business Unit, driving marketing, product management and R&D.
Also for Informatica, Alex managed a team of 400 engineers and
product managers as SVP of R&D for Core Technology, developing
Informatica’s platform and data integration technology.
Alex joined Informatica from IBM, where he was an IBM Distinguished
Engineer for the Information Integration team. IBM acquired Alex's second
startup, Exeros, which specialized in enterprise data discovery.
Previously, Alex was co-founder, CTO and VP of Engineering at Acta
Technology (acquired by Business Objects and now marketed as SAP Business
Objects Data Services).
Prior to founding Acta, Alex managed development of Replication Server at
Sybase and worked on Sybase’s strategy for enterprise application
integration (EAI). Earlier, he developed the database kernel for Amdahl’s
Design Automation group.
Alex holds a B.S. in Computer Science from Columbia University School of
Engineering and an M.S. in Computer Science from Stanford University.