
    Think Like a Data Scientist

    How do data scientists utilize predictive and prescriptive analytics to create business value?

    Step 1

    The targeted business initiative should be:

    Critical to immediate-term performance

    Documented (communicated internally/publicly)

    Cross-Functional (involving multiple business functions)

    Championed by a senior business executive

    Measurable against clear financial goals

    Time-Bound (delivered within a well-defined time frame)

    Advantageous (deliver financial or competitive advantage)

    Make It Happen


    Key Technologies

    Step 2

    Develop Stakeholder Personas

    Identify the key business stakeholders who either impact or are impacted by the targeted business initiative.

    Learn more about building stakeholder personas


    Step 3

    Identify Strategic Nouns

    What are the key business entities that either impact or are impacted by the organization's key business initiative?

    Learn more about identifying strategic nouns

    Example: Wind Turbines

    Step 4

    Capture Business Decisions

    Document business stakeholder key decisions and write brief descriptions.

    Learn more about capturing business decisions

    How much stuff do I need?

    How many staff should be working?

    How much of product X should I stock?

    When is the best time to order more product?



    Step 5

    Brainstorm Business Questions

    This is perhaps the hardest part of the "thinking like a data scientist" exercise. It involves examining your strategic nouns from three perspectives:

    Learn more about how to brainstorm business questions

    Descriptive Analytics:

    Understanding what happened

    How many widgets did I sell last month?

    Predictive Analytics:

    Predicting what will happen

    How many widgets will I sell next month?

    Prescriptive Analytics:

    Recommending what to do next

    How much of component Z should I order?
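
    The three perspectives can be illustrated on a single strategic noun. The following is a minimal sketch under stated assumptions, not a prescribed method: the monthly sales figures, the moving-average forecast, and the rule that each widget consumes two units of component Z are all hypothetical.

```python
from statistics import mean

# Hypothetical monthly widget sales (oldest first); invented for illustration.
monthly_sales = [100, 110, 120, 130]


def descriptive(sales):
    """Descriptive: what happened? Widgets sold last month."""
    return sales[-1]


def predictive(sales, window=3):
    """Predictive: what will happen? Naive forecast = mean of the last `window` months."""
    return mean(sales[-window:])


def prescriptive(forecast, on_hand, units_per_widget=2):
    """Prescriptive: what to do next? Order enough component Z to cover the forecast.

    `units_per_widget` is an assumed bill-of-materials ratio.
    """
    needed = forecast * units_per_widget
    return max(0, needed - on_hand)


last_month = descriptive(monthly_sales)
next_month = predictive(monthly_sales)
order_qty = prescriptive(next_month, on_hand=50)
print(last_month, next_month, order_qty)
```

    Each function answers one of the three questions above. In practice the forecast would come from a fitted model rather than a moving average, but the hand-off from prediction to recommendation has the same shape.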


    Step 6

    Leverage "By" Analysis

    This exploratory technique examines a strategic entity by its data attributes. It can uncover:

    • Additional data sources
    • Additional dimensional entity characteristics
    • Additional areas for analytics exploration
    Learn more about "By" analysis

    "Show me Customer habits by..."

    • Category
    • Remodel Date
    • Store
    • Day of Week
    • Customer demographics
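
    A "By" analysis amounts to aggregating one measure over each candidate attribute in turn. The sketch below assumes a small set of hypothetical purchase records. The field names, values, and the `total_by` helper are invented for illustration.

```python
from collections import defaultdict

# Hypothetical purchase records; field names and values are invented.
purchases = [
    {"category": "Garden", "store": "North", "day_of_week": "Sat", "amount": 40.0},
    {"category": "Garden", "store": "South", "day_of_week": "Sun", "amount": 25.0},
    {"category": "Paint", "store": "North", "day_of_week": "Sat", "amount": 60.0},
    {"category": "Paint", "store": "North", "day_of_week": "Mon", "amount": 15.0},
]


def total_by(records, attribute, measure="amount"):
    """Sum `measure` for each distinct value of `attribute`."""
    totals = defaultdict(float)
    for record in records:
        totals[record[attribute]] += record[measure]
    return dict(totals)


# "Show me customer spend by..." each attribute in turn.
for attribute in ("category", "store", "day_of_week"):
    print(attribute, total_by(purchases, attribute))
```

    Running the same aggregation over every available attribute is a cheap way to spot which dimensions separate the data and therefore deserve deeper analysis.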


    Step 7

    Create Actionable Scores

    Look for groupings of strategic noun dimensions and attributes that can be combined to create a more predictive and actionable score.

    Learn more about scoring techniques

    FICO Score
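
    One simple way to combine several attributes into a single score is a weighted sum scaled to 0-100. The sketch below is illustrative only: the attribute names, the weights, and the assumption that each attribute is already normalized to the range 0 to 1 are hypothetical, and it does not reflect how a FICO Score is calculated.

```python
# Hypothetical weights for attributes that are each normalized to 0..1.
# None of this reflects how a FICO Score is actually computed.
WEIGHTS = {"recency": 0.5, "frequency": 0.3, "spend": 0.2}


def actionable_score(attributes, weights=WEIGHTS):
    """Combine normalized attributes into one 0-100 score via a weighted sum."""
    total_weight = sum(weights.values())
    weighted = sum(weights[name] * attributes[name] for name in weights)
    return round(100 * weighted / total_weight)


customer = {"recency": 0.8, "frequency": 0.5, "spend": 0.25}
score = actionable_score(customer)
print(score)
```

    A single number like this is easier for a business stakeholder to act on than the underlying attributes, provided the weights are validated against real outcomes.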



    Step 8

    Put Analytics Into Action

    Deliver analytics-driven scores and recommendations to the key business stakeholders.

    Learn more about putting analytics into action




    The traditional driving force behind a data lake strategy is the economics of Big Data storage and management in an environment of rapidly growing unstructured data. By consolidating data and eliminating expensive and inefficient storage silos, organizations can significantly reduce costs and streamline management. With a data lake, organizations can also provide more consistent levels of data protection and security to meet their specific governance and compliance requirements.

    Beyond these core benefits, leading organizations recognize that the real power of the data lake is to enable their data science teams to quickly and easily apply powerful Big Data analytics that can unlock the value of Big Data assets, gain new insight and accelerate the success of the organization. With the 'in-place analytics' capabilities of a data lake, data scientists can initiate data analytics projects immediately and without the expense of investing in a separate analytics infrastructure or the time-consuming need to copy and move large data sets.

    To realize the many advantages of a data lake, organizations need a Big Data storage infrastructure with multi-protocol capabilities, including native support for the Hadoop Distributed File System (HDFS), to enable the data lake to support a wide range of applications and workloads including Big Data analytics. Many organizations also need a flexible data lake infrastructure that can extend to enterprise edge locations including remote and branch offices as well as to the cloud.

    Big Data Infrastructure

    With Dell EMC, you can take your data lake strategy to the next level. A Dell EMC data lake allows you to store, manage, protect and analyze data while gaining breakthrough efficiency, scalability, and business agility, from edge-to-core-to-cloud. Key elements of a Dell EMC data lake solution are described below:


    Isilon Storage
    The industry-leading scale-out NAS platform, Isilon is ideal for Big Data storage and analytics. Isilon is simple to manage and scales easily to 68 PB in a single cluster. With native multi-protocol support, including HDFS, Isilon supports Big Data analytics and a wide range of other applications and workloads on a single platform. With Isilon CloudPools software, you can seamlessly integrate your on-premises Isilon storage with a choice of public or private cloud storage providers. IsilonSD Edge software-defined storage lets you easily integrate data from edge locations, such as remote and branch offices, into your core data center. In this way, Isilon enables you to extend your data lake from edge-to-core-to-cloud.


    VCE Vblock Systems
    VCE Vblock Systems simplify all aspects of IT and enable organizations to achieve better business outcomes faster with the world’s most advanced converged infrastructure. With flexible options like VCE technology extensions for Isilon, you can deploy a platform that advances development, QA and production lifecycles while modernizing and consolidating data center footprints. Harness the power of converged infrastructure to successfully deploy an enterprise data lake with built-in support for Hadoop and other Big Data analytics environments.


    Elastic Cloud Storage (ECS)
    Dell EMC Elastic Cloud Storage is a powerful hyperscale, geo-distributed object and HDFS storage platform for geo-scale analytics, with multi-cloud APIs that connect seamlessly to public clouds.

    VMware Big Data Extensions

    VMware Big Data Extensions is an extension of VMware vSphere that enables you to deploy, run, and manage a virtual Hadoop cluster. It provides a simple deployment toolkit, accessed through VMware vCenter Server, that can deploy a highly available Hadoop cluster on vSphere in minutes using the Big Data Extensions user interface.

    Big Data Analytics

    Pivotal Big Data Suite Integration

    Pivotal Big Data Suite is an integration of Pivotal technologies with unlimited use of Pivotal HD to store all your data, accelerate processing, and increase the amount of data being analyzed and operationalized.

    With a rich and compliant Structured Query Language (SQL) dialect, Pivotal HAWQ® supports application portability and a large ecosystem of data analysis and data visualization tools such as SAS, Tableau and more. Analytic applications written over HAWQ are easily portable to other SQL-compliant data engines, and vice versa. This prevents vendor lock-in for the enterprise and fosters innovation while containing business risk. Pivotal HAWQ provides strong support for low-latency analytic SQL queries, coupled with massively parallel machine learning capabilities.

    Pivotal Big Data Suite can be deployed as part of PaaS technologies, on-premises and in public clouds, in virtualized environments, on commodity hardware or delivered as an appliance.

    The Pivotal Big Data Suite portfolio is compatible with Open Data Platform (ODP) distributions of Hadoop. All components are distributions of open source projects or are in the process of becoming open source projects.

    Converged Infrastructure for Analytics

    Big Data Applications

    Pivotal Cloud Foundry is an industry-leading, enterprise platform-as-a-service solution, powered by Cloud Foundry. It delivers an always-available, turnkey experience for scaling and updating applications on the private cloud.


    Streamline application development, deployment and operation on a centrally-managed Platform-as-a-Service for public and private cloud. Streamline IT development with full visibility and control over your application lifecycle, provisioning, deployment, upgrades and security patches.

    Accelerate time-to-value through automated deployment of analytic systems on virtualized infrastructure that uses shared storage for immediate data access from all applications (i.e., no data copy operations to DAS). Dell EMC built an extensible platform that allows fast integration of new analytic applications and platform components, including ingest, indexing and data security applications. We support third-party and open source applications so your business can run analytics its own way.


    Big Data Business Model Maturity Index

    Bill Schmarzo developed a maturity model to help businesses understand where they are with Big Data proficiency. Businesses can use this to identify the transformational changes they need to make in order to gain Big Data capabilities, operationalize them, and use them to drive new types of value for IT and the lines of business.

    1. Business Monitoring is how organizations begin with Big Data: they deploy business intelligence tools to monitor current business performance. This stage is about reporting on the past to know what happened, such as how many widgets were sold last month or what the profit was last quarter.
    2. Business Insights is the next stage of maturity, in which organizations use analytics to predict what will happen and integrate those insights into existing reports and dashboards, such as how many widgets will sell next month or what profits are projected for next quarter.
    3. Business Optimization is when organizations embed predictive and prescriptive analytics into existing business processes to optimize select business operations. At this point the analytics provide guidance ("tell me what I should do"), such as how many widgets to order to cover next month's sales, or a recommendation to hire 4 new sales reps to cover expected seasonal demand.
    4. Data Monetization is reached when organizations create new revenue opportunities, such as 1) reselling data and analytics, 2) creating "intelligent" products, or 3) overhauling the customer engagement experience.
    5. Business Metamorphosis is achieved when organizations leverage customers' usage patterns, product performance behaviors, and market trends to create entirely new business models. Examples include Amazon transforming from an online bookstore into one of the world's largest retailers, GE Aviation selling thrust instead of jet engines, John Deere selling farming optimization, and Florida Power & Light selling energy optimization.

    Currently, many organizations find themselves within the first two stages, and they are generating business value there. Our mission is to help organizations advance so they can uncover and execute on the highest-value business opportunities that will transform their businesses. We do it by starting with your strategic initiatives and business outcomes in mind.


    Success starts with aligning IT and the business around a single strategic business initiative within a 9- to 12-month time frame. This helps us identify an analytics use case that will accelerate a current business goal or solve a current problem. You need to deliver the right analytic recommendations to the data science teams, the workhorses of your Big Data ecosystem, to help them surface insights that can drive business value.

    Big Data Vision Workshop

    We have a unique methodology to identify and prioritize a single analytics use case with the best combination of implementation feasibility and business value. We call this a Big Data Vision Workshop. It's a 3-week engagement that applies research, interviews, data science expertise and techniques to your business, culminating in a 1-day workshop to identify and agree on the best analytics use case and path forward to solving a business problem. This approach sets us apart from the "bring in a bunch of technology and see what it can do" approach that's pushed by many vendors. The Big Data Vision Workshop from EMC Global Services aligns business and IT goals around Big Data, identifies strategic opportunities for Big Data analytics, prioritizes key use cases by assessing feasibility and business benefits, demonstrates the potential value using data science techniques, and recommends the appropriate analytics engagement and deployment roadmap. Learn more here.


    Some organizations have already made progress implementing certain data and analytics use cases, and now the IT organization seeks to expand its capabilities and operationalize the processes to meet growing demands for better, faster data and analytics. But what often happens is that IT hits a technology wall, because the underlying infrastructure, tools and processes don't support the new demands of the business. Typical scenarios we see are gaps in the Big Data capabilities within the IT environment and long delays in fulfilling incoming requests for data and analytics. Uniquely, Dell EMC helps you understand your technology gaps in the context of your business goals.


    Our Big Data Technology Advisory service helps your IT organization quickly understand its technology gaps with respect to your Big Data requirements and provides a roadmap and plan to integrate the capabilities you need. Although we often recommend a data lake with Hadoop as a foundational component of a Big Data architecture, we avoid recommending technology simply because a customer wants to "do" Big Data. Instead, we help you consider the technical capabilities required for your unique data sources and strategic business objectives so you can make the right recommendations about your future-state architecture. Target capabilities could include data ingest, ETL offload, data discovery and profiling, rapid environment provisioning, or components for Data-as-a-Service. This process includes:

    • Identify target capabilities for Big Data and analytics
    • Assess current/desired state and obstacles
    • Perform gap analysis of capabilities and technology
    • Develop future state architecture
    • Deliver technology roadmap and implementation plan

    Once you know your technology gaps and future-state architecture, our Proof of Technology Service lets you pilot the recommended architecture to validate that the hardware, software, and integration work with your existing environment and deliver the capabilities you need to meet the requirements of the business. The value to you is a validated architecture customized to your environment and needs. We also have implementation services to put your optimal architecture into production. Learn more here.

    Reskill for Digital Transformation

    Dell EMC offers a range of education services to help business leaders, aspiring Big Data practitioners, and seasoned data scientists increase their effectiveness with Big Data. We offer a 90-minute course for business leaders to develop a baseline understanding of data science and Big Data to help them identify opportunities and integrate Big Data into their business strategies.

    For Big Data analytics practitioners and team leads, we have 1-day and 5-day courses that use industry-specific examples and hands-on labs to explore team development, data science concepts, analytic approaches, tools, and advanced methods. We also offer advanced-level 5-day courses on specific methods and tools, with labs and Dell EMC Proven Data Science Certification.

    Finally, we offer technology-focused training on the core elements of the Federation Business Data Lake, including the Isilon, Pivotal HD and ECS components.