Beyond Object Storage
A Unified Storage Architecture for IoT
Sasikanth Eda, Sandeep Patil
IBM

Agenda
- Internet of Things (IoT): Market Estimates and Forecasts
- IoT: Challenges
- IoT Platform Architectures
- IoT Platform: Storage Requirements
- Enhancements to Object Storage to support IoT workloads better

IoT: Evolution of Devices
According to IDC: “The Internet of Things (IoT) is a network of networks of uniquely identifiable endpoints (or "things") that communicate without human interaction using IP connectivity - whether locally or globally.”

IoT: Market Estimates And Forecasts (Industry Specific)
- More than 50% of manufacturers identify lower operational costs as the most significant driver of their organization’s Internet of Things (IoT) initiatives over the next 12–24 months (05/06/2015, IBM Institute for Business Value).
- The value of the Internet of Things in healthcare is forecast to reach 163.24 billion by 2020, with a CAGR of 38.1% for 2015-2020 (01/05/2016, eMarketer).
- By 2018, IoT is expected to drive development of over 200,000 new IoT apps and services (11/03/2015, IDC).

IoT: Market Estimates And Forecasts (Cloud Specific)
- 36% of hybrid leaders are using hybrid cloud for Internet of Things (02/09/2016, IBM Center for Applied Insights).
- 52% of IoT developers implement cognitive computing and artificial intelligence in their development work (11/09/2015, Evans Data).
- By 2019, 80% of new applications using the Internet of Things (IoT) will analyze data in motion as well as collect this information for analysis of data at rest (12/03/2015, Gartner).

IoT: Challenges (Technology)
- Cost of connectivity: solutions are expensive (high infrastructure and maintenance costs, service costs of middlemen).
- Internet after trust: in IoT networks, trust (privacy, anonymity) is either expensive or difficult to guarantee.
- Lack of functional value:
  - Simple connectivity enablement may or may not make a device smarter or better.
  - Connectivity with intelligence makes a better product and experience.

IoT: Challenges (Business)
- Not future-proof: the refresh cycle for basic pieces of IoT infrastructure (door locks, bulbs, etc.) is quite long (they may last for years, even decades).
- Broken business models:
  - Unlike PCs or smartphones, IoT devices (locks, bulbs, etc.) worked without apps and service contracts, which makes revenue expectations unrealistic.
  - End-to-end IoT solutions (a smart TV that speaks to the toaster, a smart washing machine that speaks to the detergent vendor) get cumbersome quickly.

IoT: Current State – Needs a Reboot
- The IoT landscape has been evolving at a rapid pace, yet there has been little innovation to address the IoT market from a storage infrastructure standpoint.
- Currently no storage vendor offers storage solutions specific to IoT deployments.
- There is no single “best” approach to data management in the context of IoT.
- There is a need for a common framework and for enhancements in performance and security.

IoT Platform: Centralized Architecture
- An IoT centralized architecture uses a hub (typically powered by the cloud) that controls the execution of nodes (smart devices).
- Uses a Platform-as-a-Service model.
- Current database architecture and data management strategy may not be sufficient to handle large-scale IoT networks.
- Big data architecture gets complex, as IoT devices produce hundreds of thousands of data points per second.
- Offered services: event processing, device discovery, device management, event notifications, real-time analytics.

IoT Platform: Decentralized Architecture
ADEPT platform – Autonomous Decentralized Peer-To-Peer Telemetry
An effort to prove the foundational concepts around a decentralized approach.
Key Objectives:
- Distributed transaction processing & applications
- Robust security
- Privacy by design and default
- Designed for commerce and marketplaces

IoT Platform: Decentralized Architecture
Key Solution Components:
- Trustless peer-to-peer messaging
- Secure distributed data sharing / file transfers
- Autonomous device coordination

IoT Platform: Storage Requirements
1. Scalable: Cost-effective scalability, and the ability to decouple capacity and performance.
2. Self-healing: Traditional RAID mechanisms may be ineffective in large-scale environments, so the platform needs a self-healing architecture with configurable data availability classes.
3. Data awareness: Ability to create a data-aware storage platform, which provides both context to the data and policy management, as well as visualization tools to harness the value of that data for better business insights.
* Source: Gartner, Reassess Storage Requirements for Successful IoT Implementations

Why Object Storage For IoT
Early predictions (why object storage was expected to suit IoT workloads best):
- Designed to support large volumes and velocity of data.
- Native HTTP / REST support.
- Linear scale-out architecture (scales in all directions, including IOPS and latency).
- Distributed architecture (performance, load distribution).
- No single point of failure.
- Support for heterogeneous data types and different data protection policies.
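
The "native HTTP / REST support" point can be illustrated with a minimal sketch of how a device reading might be uploaded as an object. It assumes an OpenStack Swift style URL layout (`/v1/<account>/<container>/<object>`) and token header. The endpoint, account, container, object name and token shown in the usage note are hypothetical placeholders, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request


def build_put_request(endpoint, account, container, name, payload, token):
    """Build (but do not send) an HTTP PUT that would store `payload` as an object."""
    # Percent-encode each path component; "/" inside the object name is kept.
    path = "/".join(
        urllib.parse.quote(part) for part in ("v1", account, container, name)
    )
    return urllib.request.Request(
        f"{endpoint}/{path}",
        data=payload,
        method="PUT",
        headers={"X-Auth-Token": token, "Content-Type": "application/json"},
    )
```

Passing the returned request to `urllib.request.urlopen` would perform the upload against a real endpoint, for example `build_put_request("https://swift.example.com", "AUTH_demo", "sensors", "lock-1/reading.json", b"{}", "token")`.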

Why Object Storage Is Falling Short?

Microservices of IoT    Type of Storage      Appropriate Storage Platforms
Batch Processing        Primary Storage      Distributed Filesystems
Real-Time Processing    Primary Storage      Memory-centric Filesystems, NoSQL Data Stores
Active Archiving        Secondary Storage    Distributed Filesystems, Object Storage
Cold Storage            Primary Storage      Object Storage with Spin-down capabilities, Cloud Storage, Tape Libraries
* Source: Gartner 2015

Based on a recent survey of IoT implementations, object storage is not considered for real-time processing of IoT data (however, the success of an IoT implementation depends heavily on the analytics platform).

Factors that help Object Storage to hold its promise
1: In-Place Analytics
2: IoT data collection and Privacy filters
3: Native Event, Cross Device Notification, Life Cycle Management

Issue-1: In-Place Analytics
[Figure: two panels. Left: data is ingested as objects (https), moved explicitly to Spark or Hadoop MapReduce, and results are returned. Right: data is ingested as objects into a unified fileset / device on a clustered filesystem, analyzed in place, and results are published as objects with no data movement.]
Analytics on Traditional Object Store:
1. Data has to be migrated from the object store to a dedicated analytics cluster.
2. The analysis is performed and results are copied back to the object store for publishing.
Analytics With Unified File and Object Access:
- Data ingested as objects is available as “files” on the same fileset.
- Analytics systems (Hadoop, Spark) can directly use this data for analytics.
- No data movement: immediate, in-place data analytics.
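
A minimal sketch of the in-place flow, assuming objects land as JSON files in a container directory on the unified fileset. The `temperature` field and the `summary.json` result name are hypothetical and chosen only for illustration. The analysis reads the files directly and writes its result next to them, where it would also be reachable as an object, so no copy to a separate analytics cluster is involved.

```python
import json
from pathlib import Path
from statistics import mean


def analyze_in_place(container_dir, result_name="summary.json"):
    """Aggregate readings directly from files in the container directory."""
    container = Path(container_dir)
    readings = []
    for path in sorted(container.glob("*.json")):
        if path.name == result_name:
            continue  # skip a result written by a previous run
        with path.open() as handle:
            # "temperature" is an assumed field name for this sketch.
            readings.append(json.load(handle)["temperature"])
    summary = {
        "count": len(readings),
        "mean_temperature": mean(readings) if readings else None,
    }
    # The result is written into the same fileset, not copied elsewhere.
    (container / result_name).write_text(json.dumps(summary))
    return summary
```

In a real deployment the same directory would more likely be handed to Spark or Hadoop as an input path; plain Python is used here only to keep the sketch self-contained.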

Swift-on-File for In-Place Analytics
An object that is stored with Swift here:
22b9285b05e665fd7b843bf79/1401254393.89313.data
is now stored with SwiftOnFile here:
/mnt/scaleoutFS/acct/cont/obj
For more details, refer to: OpenStack SwiftOnFile - User Identity for Cross Protocol Access Demystified
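
The difference between the two layouts can be sketched as follows. The SwiftOnFile mapping follows the `/mnt/scaleoutFS/acct/cont/obj` example from the slide. The hashed name is a deliberately simplified assumption: a real Swift deployment also mixes in a cluster-specific hash prefix/suffix and places the hash under device, partition and suffix directories, none of which is modelled here.

```python
import hashlib
import posixpath


def swiftonfile_path(mount_point, account, container, obj):
    """Human-readable path: the object name maps directly to a file path."""
    return posixpath.join(mount_point, account, container, obj)


def simplified_hashed_name(account, container, obj):
    """Simplified stand-in for Swift's hashed on-disk name (assumption).

    Only shows that the name is derived from a hash of the object path,
    which makes it opaque to file-based tools.
    """
    name = f"/{account}/{container}/{obj}"
    return hashlib.md5(name.encode("utf-8")).hexdigest()
```

Because the SwiftOnFile path keeps the account, container and object names, file-based analytics tools can locate the data without consulting the object store.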

Unified Identity Between Object And File
[Figure: object access is authenticated via Keystone; file access (NFS / CIFS, POSIX) uses file authentication; both reach a SwiftOnFile device on IBM Spectrum Scale backed by a common AD/LDAP.]
- A common set of object and file users share the same directory service (AD with RFC 2307, or LDAP).
- Objects created using the Swift API are owned by the user performing the object operation (PUT).
- Note that if the object already exists, its existing ownership is retained.
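
The ownership rule above can be modelled with a small in-memory sketch. `UnifiedStore` and its methods are hypothetical names used for illustration; they are not part of Swift or Spectrum Scale, and a real system would set ownership on the underlying file rather than in a dictionary.

```python
class UnifiedStore:
    """Toy model of object ownership under unified file and object access."""

    def __init__(self):
        self._objects = {}  # path -> {"owner": str, "data": bytes}

    def put(self, path, data, user):
        """Store data; a new object is owned by `user`, an existing one keeps its owner."""
        existing = self._objects.get(path)
        owner = existing["owner"] if existing else user
        self._objects[path] = {"owner": owner, "data": data}
        return owner

    def owner_of(self, path):
        return self._objects[path]["owner"]
```

With this model, a first PUT by one user followed by an overwrite from another user updates the data but leaves the original owner in place.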

Issue-2: IoT Data Collection and Privacy Filters
[Figure: App, Middleware (filter custom data), Object Server, IBM Spectrum Scale.]
- Data collection and maintaining privacy on IoT devices is a major concern.
- There is a lack of trust in the details collected by commodity IoT devices.
- Storing these details in the cloud makes the situation more uncomfortable.
- End-user-configurable middleware to filter out the privacy parameters per IoT device.
- End-user-configurable expiry of objects and metadata per IoT device.
- Automated placement of data per IoT device.
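
A minimal sketch of such a per-device privacy filter, written as generic WSGI middleware of the kind that can sit in front of an object server. The rule format, the `X-Object-Meta-Device-Id` header used to identify the device, and the field names are assumptions made for this example. `X-Delete-After` is the header Swift uses for object expiry.

```python
import io
import json


class PrivacyFilterMiddleware:
    """Drop configured fields from JSON uploads and set an expiry, per device."""

    def __init__(self, app, device_rules):
        self.app = app
        # Assumed shape: {device_id: {"drop_fields": [...], "expire_after": seconds}}
        self.device_rules = device_rules

    def __call__(self, environ, start_response):
        # The device id header is a hypothetical convention for this sketch.
        device = environ.get("HTTP_X_OBJECT_META_DEVICE_ID")
        rules = self.device_rules.get(device)
        if environ.get("REQUEST_METHOD") == "PUT" and rules:
            length = int(environ.get("CONTENT_LENGTH") or 0)
            raw = environ["wsgi.input"].read(length)
            try:
                payload = json.loads(raw)
            except ValueError:
                payload = None  # not JSON: pass the body through unchanged
            if isinstance(payload, dict):
                for key in rules.get("drop_fields", ()):
                    payload.pop(key, None)
                raw = json.dumps(payload).encode("utf-8")
            # Hand the (possibly filtered) body on to the next application.
            environ["wsgi.input"] = io.BytesIO(raw)
            environ["CONTENT_LENGTH"] = str(len(raw))
            if "expire_after" in rules:
                environ["HTTP_X_DELETE_AFTER"] = str(rules["expire_after"])
        return self.app(environ, start_response)
```

Filtering before the object server sees the request means the private fields are never written to storage, which is the point of placing the filter in middleware rather than in a later cleanup job.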

Issue-3: Event, Cross Device Notification, Life Cycle Management
[Figure: Object-x and Object-y pass through a Cross Device Notification Engine on IBM Spectrum Scale; Object-x is placed on Slow Disk or High Disk.]
- Avoids the need for database triggers or a performance-hampering analytic engine to act on a particular set of received IoT data.
- Policies and rules for cross-device notification can be applied based on received object content, type, device type, metadata details and generated event type.
- Auto-migration / placement of data based on its dependency for cross-device notification.
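
A rule-based notification engine of this kind could be sketched as below. The `Rule` fields mirror the criteria listed on the slide (device type, object type, event type, metadata); the class, the event dictionary layout and the device names are hypothetical and do not describe an existing Spectrum Scale or Swift interface.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    """Notify `notify` when a received object event matches every set criterion."""

    notify: str
    device_type: Optional[str] = None
    content_type: Optional[str] = None
    event_type: Optional[str] = None
    metadata: dict = field(default_factory=dict)

    def matches(self, event):
        for attr in ("device_type", "content_type", "event_type"):
            wanted = getattr(self, attr)
            if wanted is not None and event.get(attr) != wanted:
                return False
        event_meta = event.get("metadata", {})
        return all(event_meta.get(k) == v for k, v in self.metadata.items())


def targets_for(event, rules):
    """Return the devices to notify for one received object event, in rule order."""
    return [rule.notify for rule in rules if rule.matches(event)]
```

Evaluating the rules at ingest time, on the object's own metadata, is what removes the need for a database trigger or a separate analytics pass just to decide who should be notified.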

IBM Spectrum Scale
Data management at scale
[Figure: Spectrum Scale exposes HDFS, NFS, POSIX, SMB, Swift, Cinder, Glance and Manila interfaces over SSD, fast disk, slow disk and tape.]
- Avoid vendor lock-in with true Software Defined Storage and open standards.
- Seamless performance & capacity scaling.
- Automate data management at scale.
- Enable global collaboration.
OpenStack and Spectrum Scale help clients manage data at scale:
- Business: "I need virtually unlimited storage." An open & scalable cloud platform.
- Operations: "I need a flexible infrastructure that supports both object and file based storage." A single data plane that supports Cinder, Glance, Swift, Manila as well as NFS, et al.
- Operations: "I need to minimize the time it takes to perform common storage management tasks." A fully automated policy-based data placement and migration tool.
- Collaboration: "I need to share data between people, departments and sites with low latency." Sharing with a variety of WAN caching modes.
Results:
- Converge file and object based storage under one roof.
- Employ enterprise features to protect data, e.g. snapshots, backup, and disaster recovery.
- Support native file, block and object sharing to data.

THANK YOU
Acknowledgements: Dean Hildebrand, Bill Owen, Shyama VenuGopal
