Cloud Integrated Storage In OES 2018

Cloud Integrated Storage (CIS) is a highly available service that moves your cold data to an object store, either on-premises object storage or cloud storage, while users continue to access that data seamlessly. It provides a network-wide view of the overall data. CIS adaptively scans the data on an OES server and provides meaningful information that you can use to decide what data to migrate to the cloud. Based on your requirements, CIS helps you decide which policy to create and run on the OES server volume configured with the CIS server. The files that satisfy the policy are migrated to the cloud, and the metadata for the migrated data is stored in the CIS server. Data moved to cloud storage by CIS is highly secure, because the data is encrypted and the keys do not leave the premises.

CIS Server

The CIS server requires multiple services to perform the overall orchestration. These services are built as microservices, and the architecture is outlined in figure 1. Microservices are a suite of small, modular, independently deployable services; each runs a unique process and communicates through a well-defined, lightweight mechanism to serve a business goal. Each service runs in its own container. The microservices are:

  • Authentication (cis-auth) authenticates the agents and users. It also facilitates token creation.
  • Data (cis-data) handles data migration, recall, and communication with the target cloud.
  • Metadata (cis-metadata) provides the capability to migrate, recall, and maintain the metadata.
  • Policy (cis-policy) deals with policies, agents, jobs, tiers, and schedule operations.
  • Management (cis-mgmt) handles all management operations, such as CIS account configuration, policy creation, tier configuration, and assigning roles to other users.
  • Collector and Aggregator (cis-aggregator) obtains the metadata from OES servers and presents the overall data, classified as hot and cold, in a form the administrator can understand.
  • Collector and Aggregator for Reporting (cis-repcollector, cis-repaggregator) obtains information about files migrated to and recalled from the cloud and presents it in a meaningful form.
  • Gateway (cis-gateway) is the entry point to all CIS services. It receives requests from OES servers and users and redirects them to the respective services. By default it listens on port 8243 for server operations and on port 8344 for management operations; both ports are configurable.
Figure 1: The CIS architecture
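
As a rough illustration of the gateway's role as a single entry point, the sketch below maps a listening port and a request path to the name of the service that would handle it. The two ports and the service names come from the text above; the path prefixes and the routing table itself are hypothetical and do not describe the real CIS API.

```python
# Hypothetical sketch of gateway-style routing; not the real cis-gateway logic.
SERVER_PORT = 8243  # default port for server (agent) operations, per the article
MGMT_PORT = 8344    # default port for management operations, per the article

# Path prefixes are invented for illustration; service names are from the article.
ROUTES = {
    SERVER_PORT: {
        "/auth": "cis-auth",
        "/data": "cis-data",
        "/metadata": "cis-metadata",
        "/policy": "cis-policy",
    },
    MGMT_PORT: {
        "/auth": "cis-auth",
        "/mgmt": "cis-mgmt",
    },
}


def route(port: int, path: str) -> str:
    """Return the name of the service that should handle the request."""
    table = ROUTES.get(port)
    if table is None:
        raise ValueError(f"gateway is not listening on port {port}")
    for prefix, service in table.items():
        if path == prefix or path.startswith(prefix + "/"):
            return service
    raise LookupError(f"no service registered for {path!r} on port {port}")
```

With this table, a call such as route(8243, "/data/migrate") resolves to cis-data, while the same path on the management port is rejected.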

Agents

The OES server acts as an agent to the CIS server. Agents auto-discover the CIS server and connect to it. The CIS Agent (oes-cis-agent) performs the major operations, such as volume listing and tier configuration, and helps in migrating the data. Secondary volumes called Cloud Backed Volumes (CBV) contain the metadata of the files that are migrated to the cloud. By default, the agent communicates through port 8000.

The CIS Recall Agent (oes-cis-recall-agent) helps in recalling the data. When a request comes from a user for a specific file, the recall agent sends a request to the CIS server to retrieve the data from the cloud using the metadata information.

Reading only the metadata in the CBV does not recall the files from the object store. The CIS Scanner (oes-cis-scanner) scans the NSS volume metadata on the OES server and sends it to the CIS server.
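
The relationship between the primary volume, the CBV and the object store can be sketched with a toy in-memory model. Everything here is an assumption made for illustration: the class and method names are hypothetical, the object key scheme is invented, and the model removes the object from the store on recall, which may not match what CIS actually does.

```python
from dataclasses import dataclass, field


@dataclass
class TieredVolume:
    """Toy in-memory model of a primary volume, its CBV and an object store."""
    primary: dict = field(default_factory=dict)       # path -> file bytes
    cbv: dict = field(default_factory=dict)           # path -> metadata of migrated file
    object_store: dict = field(default_factory=dict)  # object key -> file bytes
    next_id: int = 0                                  # counter for hypothetical object keys

    def migrate(self, path: str) -> None:
        """Move file data to the object store and keep only metadata in the CBV."""
        data = self.primary.pop(path)
        key = f"obj-{self.next_id}"
        self.next_id += 1
        self.object_store[key] = data
        self.cbv[path] = {"key": key, "size": len(data)}

    def stat(self, path: str) -> int:
        """Return the file size; reading CBV metadata does not trigger a recall."""
        if path in self.primary:
            return len(self.primary[path])
        return self.cbv[path]["size"]

    def read(self, path: str) -> bytes:
        """Return file data, recalling it from the object store if needed."""
        if path not in self.primary:
            meta = self.cbv.pop(path)
            # Assumption: the object is removed from the store once recalled.
            self.primary[path] = self.object_store.pop(meta["key"])
        return self.primary[path]
```

In this model, calling stat() on a migrated file answers from the CBV metadata alone, whereas read() brings the data back to the primary volume first.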

Dependency

CIS uses other services that make it easier to deploy:

  • Database (MariaDB) stores OES server, cloud, and CIS service information, along with information about the migrated data.
  • Indexing (Elasticsearch) stores indexes and provides text search, enabling faster discovery and delivery of relevant data. It is also used to analyse and aggregate the metadata obtained from the respective OES servers, so that CIS can query the information faster.
  • Configuration Store (ZooKeeper) maintains the configuration information.
  • Messaging (Kafka) is a platform for large-scale message processing, used for asynchronous communication across services and for reporting events.

Benefits of Cloud Integrated Storage

  • Reduced Total Cost of Ownership: Active, frequently accessed data (hot data) is stored on fast, high-quality storage. Less frequently accessed data (cold data) is placed on cloud storage with relatively slower access. Cloud Integrated Storage policies help you partition files based on last accessed time, size, type, name, and so on. You can move cold data from higher-performance storage to lower-performance storage, reserving the expensive storage for hot data.
  • Transparent File Access for End Users: Users access files seamlessly through the CIFS protocol. The user maps to the same logical place and is not aware of the physical location of the file. This allows the administrator to manage the data without disrupting the user’s view of the files. Files that are moved to the cloud are shown with an offline (x) symbol in the Windows client, which indicates that they are in an offline state.
  • Data Availability on Access: After moving data to the cloud, you still have access to it. Access to data in cloud storage is handled by secondary volumes called Cloud Backed Volumes (CBV). The CBV contains the metadata of the data in the cloud. When the data is accessed, it is brought back to the OES server.
  • Policy-Based Migration: Administrators can set policies (figure 4) to migrate data from the primary volume to cloud storage depending on last accessed time, modified time, file type, size, and so on. A policy can be run manually or automatically on a schedule (daily, weekly, or monthly).
Figure 2: The CIS Summary page
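
To make the idea of policy-based selection concrete, here is a minimal sketch of how criteria such as last accessed time, size and file type could be combined to pick files for migration. The FileInfo and Policy classes are hypothetical stand-ins, not CIS data structures, and the rule that every criterion that is set must match is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class FileInfo:
    path: str
    size: int                 # bytes
    last_accessed: datetime


@dataclass
class Policy:
    """Hypothetical policy: every criterion that is set must match."""
    not_accessed_for: Optional[timedelta] = None
    min_size: Optional[int] = None
    extensions: Optional[tuple] = None   # e.g. (".iso", ".zip")

    def matches(self, f: FileInfo, now: datetime) -> bool:
        if self.not_accessed_for is not None and now - f.last_accessed < self.not_accessed_for:
            return False
        if self.min_size is not None and f.size < self.min_size:
            return False
        if self.extensions is not None and not f.path.lower().endswith(self.extensions):
            return False
        return True


def select_for_migration(files, policy, now):
    """Return the paths of files that satisfy the policy."""
    return [f.path for f in files if policy.matches(f, now)]
```

A policy such as Policy(not_accessed_for=timedelta(days=365), min_size=1_000_000) would then select only large files that have not been touched for a year.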

 

Deployment

CIS uses Docker containers to simplify deployment. The simplest deployment runs on a single server, although separate servers for each of the dependent services are recommended. Follow these steps to get CIS up and running:

  • Install and configure OES 2018.
  • Identify the eDirectory tree to be used.
  • Create certificates to be used by CIS, Elasticsearch and Kafka for secure communication.
  • Install and configure MariaDB,  Kafka,  ZooKeeper and Elasticsearch.
  • Install and configure CIS server.

Other deployment methods are detailed in the administration guide.
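
Before installing the CIS server, it can be useful to confirm that the dependent services are reachable. The sketch below is one possible pre-flight check, not part of CIS: it only tests whether a TCP connection can be opened. The ports listed are the usual upstream defaults for each product, which your installation may have changed, and the host names you pass in are your own.

```python
import socket

# Usual upstream default ports of the dependent services; adjust to your setup.
DEFAULT_PORTS = {
    "MariaDB": 3306,
    "Kafka": 9092,
    "ZooKeeper": 2181,
    "Elasticsearch": 9200,
}


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(hosts: dict) -> dict:
    """hosts maps a service name to its host; returns service name -> reachable."""
    return {name: is_reachable(host, DEFAULT_PORTS[name]) for name, host in hosts.items()}
```

For a single-server deployment you might call preflight({"MariaDB": "localhost", "Kafka": "localhost", "ZooKeeper": "localhost", "Elasticsearch": "localhost"}) and look for any service reported as False.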

Working with CIS

You can use any S3-compatible object store with CIS, such as SUSE Enterprise Storage, IBM Cloud Object Storage (CleverSafe), Amazon S3, or Minio, whether it runs in an on-premises private cloud or in a public cloud.

Configure a cloud account
CIS must be associated with the object store before it can move data there. Depending on the object store you have chosen, you provide CIS with the interface and the credentials it needs to connect to the object store (see figure 3).

Figure 3: Configuring the Cloud storage account
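
The kind of information a cloud account needs for an S3-compatible store can be sketched as a small record with basic validation. The field names below are assumptions based on what S3-compatible services generally require (an endpoint URL, an access key and secret key pair, and a bucket); they are not the actual field names of the CIS configuration page.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class CloudAccount:
    """Hypothetical record of what an S3-compatible account needs."""
    name: str
    endpoint: str     # URL of the S3-compatible interface
    access_key: str
    secret_key: str
    bucket: str

    def problems(self) -> list:
        """Return a list of human-readable problems; empty if the record looks usable."""
        issues = []
        parsed = urlparse(self.endpoint)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            issues.append("endpoint must be an http(s) URL")
        if not self.access_key or not self.secret_key:
            issues.append("access key and secret key are required")
        if not self.bucket:
            issues.append("bucket name is required")
        return issues
```

Checking a record with problems() before saving it catches the most common mistakes, such as an endpoint entered without its scheme.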
Figure 4: Creating a data policy

Gain insights into data and create policies
After installation, when the agents come up, they auto-discover the CIS server, communicate with it using certificate-based authentication, and then start the scanner. The scanner collects the data on each agent and pushes it to the Elasticsearch store.

The CIS Home page shows live data as it is updated in Elasticsearch. You can view the details of hot and cold data based on the criteria you choose. These criteria can be turned into a policy and stored on CIS.
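
The hot and cold breakdown described here amounts to bucketing scanned file metadata by a chosen criterion. As a minimal sketch, assuming the criterion is a last-accessed age threshold and that the scanner output can be reduced to (size, last accessed) pairs, the aggregation could look like this. The function name and input shape are hypothetical; CIS performs this aggregation in Elasticsearch.

```python
from datetime import datetime, timedelta


def summarize(files, cold_after: timedelta, now: datetime) -> dict:
    """Split files into hot and cold totals.

    files: iterable of (size_in_bytes, last_accessed) tuples.
    A file is cold if it was last accessed at least cold_after ago.
    """
    summary = {"hot": {"files": 0, "bytes": 0}, "cold": {"files": 0, "bytes": 0}}
    for size, last_accessed in files:
        bucket = "cold" if now - last_accessed >= cold_after else "hot"
        summary[bucket]["files"] += 1
        summary[bucket]["bytes"] += size
    return summary
```

Varying cold_after shows how much capacity a given threshold would free on the primary volume before you commit it to a policy.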

Create collections and/or tiers
A collection is a group of volumes that is associated with a policy and a cloud account, so that each volume in the collection is tiered. You can create individual tiers or collections, depending on the number of volumes that have to be associated with cloud storage. When a large number of volumes need to share a common policy and cloud account, use a collection.

Monitor and view dashboard
Insights into the data migrated and recalled from each of the volumes can be viewed on a dashboard, as shown in figure 5.

Figure 5: A CIS dashboard

The Future for CIS

Cloud Integrated Storage is introduced as a Technical Preview in OES 2018. You can play with it and evaluate it, but it’s not supported for production deployment at this time. The future of this technology depends on customer feedback and uptake.

If you are interested in this technology or have ideas around this, please get in touch with the OES Product Manager at pmadhan@microfocus.com or drop a note to OES@microfocus.com and we’ll get in touch to discuss your interests.

 

This article was first published in Open Horizons Magazine, Issue 39, 2017/4, p9-11.