Big Data Mythology
Big or small, structured or
unstructured, for data to be meaningful and usable in today’s massive and
complex enterprises, a means of assuring the data’s consistency and
interoperability is required. The increased size and complexity, along with
the reduced structural definition, of data associated with Big Data initiatives
aggravate the problem.
Much has been written and
discussed about the many V’s (Volume, Velocity, Variety…) of Big Data. The
discussions have taken on an almost mythological quality.
We will look at two of the Big Data V’s: Variety and
Value. Taking a business Value perspective, we will look at the Big Data
Varieties that are text-centric: Human Generated, Big Transactional Data,
and Web/Social Media. In this
session we will examine the “unclassified” rather than “unstructured”
nature of Big Data. We will introduce the Sandhill approach to practices
and technologies to classify and integrate Big Data into traditional data
management and data delivery processes.
Dare we say Myth…
Mr. Soulsby is an Enterprise
Architecture Practices Leader with Sandhill Consultants. His practice
areas include strategic and technical architectures for information
management, metadata management, and data warehousing. Mr. Soulsby has
held senior professional services and product management positions with
large multi-national corporations and software development organizations.
The Unsung Hero of Big Data
Big data has been the talk of the analytics
world for quite a while and shows no sign of slowing down. The promise of
analyzing all relevant information, regardless of source, is an incredibly
appealing idea. But despite all the buzz and interest, very few big data
implementations have moved beyond the core Hadoop use cases of Web
analytics, ad click analysis and failure prediction. This limited
perspective has a lot to do with the fact that all big data has typically
been lumped under the label of unstructured data, thus limiting its
Has this stereotype impacted adoption?
Yes. Putting all types of information - including that which is not
traditionally structured data - under a single label makes it difficult
for companies to take a more nuanced view of their data landscape. That view is
a core requirement for understanding which technologies to bring to bear
and, more importantly, which data sources will provide the
biggest bang for your big data buck.
When companies start to compare the business value of also investing in
big content versus just big data, they start to realize that big content
is, in many cases, the most important aspect from a value standpoint - an
"unsung hero," if you will.
Webinar ID: 132-155-715
VP, PaaS, Attivio
Reid Craig is Vice President of PaaS
(Platform-as-a-Service), responsible for product strategy, go-to-market
strategy, and customer success. He began as a software engineer in
Attivio’s R&D organization before moving over to the “Dark Side” to become
Attivio’s first sales engineer. Over the next few years Reid had the
opportunity to shape and grow what would become Attivio’s Solutions
Reid has more than 15 years of experience in software, having held
positions in all stages of the Software Development Life Cycle and all
possible client-facing technical roles – client services, business
analysis, QA, development, support, delivery, architecture, sales and
To date, most Big Data and Hadoop use cases have focused on ETL and batch
analytics. However, there is a new class of Big Data use cases emerging -
powering real-time applications and analytics.
What do we mean by real-time? We mean interactive queries on
data that is itself updated in real time. This is usually the realm of traditional
databases such as Oracle and MySQL, but they lack the ability to scale out
on commodity hardware, which is the hallmark of Big Data technologies.
Now RDBMS technology is being married to the scale-out technology of Hadoop.
This enables a whole new class of real-time Big Data use cases. We will
discuss real-world use cases in digital marketing, ad tech, and telecom
For instance, we will discuss a marketing services provider that uses IBM
Unica software to execute real-time campaigns for major retail brands such
as Sephora and PetSmart. They were experiencing cost and scaling issues
with one of their 20TB Oracle RAC databases. When they replaced their
Oracle RAC database with a Hadoop-based RDBMS, they saw queries speed up
by 3x-7x, while costs dropped by about 4x.
Wednesday, May 14, 2014 11:00 AM - 12:00 PM
Monte Zweben is the Co-Founder and Chief Executive Officer of Splice Machine.
A technology industry veteran, Monte’s early career was spent with the
NASA Ames Research Center as the Deputy Branch Chief of the Artificial
Intelligence Branch, where he won the prestigious Space Act Award for his
work on the Space Shuttle program. Monte then founded and was the Chairman
and CEO of Red Pepper Software, a leading supply chain optimization
company, which merged in 1996 with PeopleSoft, where he was VP and General
Manager, Manufacturing Business Unit.
In 1998, Monte was the founder and CEO of Blue Martini Software – the
leader in e-commerce and multi-channel systems for retailers. Blue Martini
went public on NASDAQ in one of the most successful IPOs of 2000, and is
now part of Red Prairie. Following Blue Martini, he was the chairman of
Non-Invasive Data Governance
A few points to consider regarding Non-Invasive Data Governance:
Many organizations view Data Governance as being over and above normal
work efforts and threatening to the existing work culture of the
organization. It does not have to be that way.
Many organizations have a difficult time getting people to adopt Data
Governance best practices because of a common belief that Data Governance
is about command-and-control. It does not have to be that way.
While Bob Seiner firmly states that “Data Governance
is the execution and enforcement of authority over the management of
data”, nowhere in that definition does it say that Data Governance has to
be invasive or threatening to the work, people and culture of the
organization. It does not have to be that way.
This SF DAMA webinar presentation on Non-Invasive
Data Governance™, the approach taught and implemented by Robert S. Seiner
of KIK Consulting & TDAN.com, focuses on formalizing existing
accountability for the management of data and improving formal
communications, value and quality efforts through effective
cross-organization stewarding of data resources.
Robert S. Seiner is the President and Principal
of KIK Consulting & Educational Services (KIKConsulting.com) and the
Publisher of The Data Administration Newsletter (TDAN.com). Bob was
recently awarded the DAMA Professional Award for significant and
demonstrable contributions to the data management industry. Bob
specializes in Non-Invasive Data Governance™, data stewardship, and
meta-data management solutions.
Data Integration Challenges With the
It’s no longer a question of if organizations
will be adopting cloud-based solutions, but when and how many. Gartner
predicts worldwide software-as-a-service (SaaS) application revenue will
reach $22.1 billion by 2015. With the proliferation of cloud-based
applications, platforms and infrastructure, the potential for data
fragmentation and disconnected data silos has grown exponentially. The
benefits of cloud computing have been well documented (scalability,
agility, security and cost are usually at the top of the list), but from the
outset, the push to adopt SaaS applications in the enterprise has come
from the business units or the higher ranks within an organization.
With the attraction of faster deployments, and the potential to not have
to get in line for limited IT resources, how are you going to ensure that
your data integration and cloud strategies are a lightning rod for
organizational improvement and not a data disaster waiting in the wings?
In this session we’ll discuss some of the specific dynamics and challenges
that the cloud brings to data integration and review some of the
architectural patterns that are starting to emerge. We’ll also review a
couple of customer case studies to see how some companies are embracing
this challenge head on.
Our speaker, Ron Lunasin, is Sr. Director of Cloud
Platform Adoption for Informatica Corp. and an 18-year veteran of the
company. He has held various roles at Informatica, from software engineer
to internal IT data warehouse manager to product manager for Informatica’s
flagship on-premises product, PowerCenter, before becoming one of the
founding members of Informatica’s Cloud business unit.